Project Roadrunner: PostgreSQL Support for the Ed-Fi ODS / API
Overview
The most popular request at the 2018 Ed-Fi Summit's tech town hall was for an option to run the Ed-Fi ODS / API on an open source database solution. Historically, the Operational Data Store (ODS) database has been developed on Microsoft SQL Server, matching the preference of educational agencies that rely on heavily discounted licensing terms for on-premises operation of SQL Server. The advent of cloud-based hosting has changed that dynamic, especially since Microsoft ended the "bring your own license" practice. One implication of that change is that educational agencies wishing to use SQL Server must pay full price when using Amazon RDS. So, with the help of an Ed-Fi Special Interest Group, we narrowed the field to one alternative database platform. (For now…)
Enter Project Roadrunner: refactoring the Ed-Fi tech stack to support the increasingly popular PostgreSQL open source relational database management system. The final version number and product name for PostgreSQL support are yet to be determined, hence the use of a codename. The initial release will support Data Standard v3.1.
To follow the development progress in Tracker, see the version reports:
Project Scope
After careful study, the Ed-Fi Alliance has decided to create a new version of the Ed-Fi ODS / API, targeted primarily at local education agencies, with the following scope of features as a minimum viable product:
- API running in Windows
- PostgreSQL running anywhere (VM, Docker, cloud providers)
- Admin App in on-premises mode (which can still be in the cloud on a VM)
- Client-side bulk loader
- New tools for database deployment, which will support both SQL Server and PostgreSQL
- Populated and minimal template databases for quick-start and testing
- Binary deployment of the API
- Documentation on how to deploy and secure the ODS/API running on PostgreSQL.
What does this leave out of scope?
- Code Generation
- Bulk load utility
- Security Config / Key Retrieval
- Sandboxing
- Extensions
- Composites
- Profiles
- Change Queries
- Unique ID System Integration
- Deploy scripts for Azure, AWS
- Sample Data Generator
- Glendale / Apple Grove / Northridge sample databases.
The old XML bulk-load process is probably permanently out of scope, as is the security config tool, which is being phased out in favor of the Admin App that was created for the Cloud ODS and which now supports an on-premises mode. The other features could conceivably be supported in a future release, and may not require much more effort. Yet in targeting local educational agencies, with more constrained budgets than state agencies, the Alliance feels that it can get a useful release out to the community more quickly with those features disabled.
Code Generation and Binary Deployment
Currently, the code generation relies on SQL Server for a portion of its work. Since MetaEd and extensions are out of scope for the initial release, code generation will not be updated to support PostgreSQL. The source code for Roadrunner will continue to be freely available to all Ed-Fi licensees so they can download and recompile. Even if licensees intend to target PostgreSQL deployments, they will still require a SQL Server instance during the initdev process, which runs the code generation.
However, since extensions, composites, and profiles will not be supported, there should be no need for product users to recompile the code. Thus, the application will have a process for quickly and easily installing from binaries, either using PowerShell scripts or a wizard-based user interface.
Administrative Tools
The two traditional user interface tools will not be available: the Sandbox Admin web site and the pairing of the Security Config and Key Retrieval tools. Instead, users will need the Admin App running in on-premises mode. The Security Config tool is officially on a path of deprecation. The sandboxing concept could be supported in the future. In the meantime, users will be encouraged to simply setup multiple installations.

Technical Challenges and Solutions
Data Definition and Data Manipulation Language
T-SQL and PG-SQL are the respective SQL dialects for Microsoft SQL Server and PostgreSQL relational database management systems. Due to ANSI standards compliance they are largely the same. However, there are some critical differences with respect to data types, semantics, and syntax. In creating an ODS that will run on PostgreSQL, the SQL schema scripts and queries must be adapted by:
- Shortening any object names that are longer than 63 characters. For example, the table name
StudentSchoolFoodServiceProgramAssociationSchoolFoodServiceProgramServiceis too long for PostgreSQL.- The Alliance working on an algorithm for shortening names, which will probably combine abbreviating components such as "Education" to "Edu" and "Organization" to "Org", as well as a final truncation with suffix if the name is still too long (e.g.
EdOrgRestIsTruncated~1A)
- The Alliance working on an algorithm for shortening names, which will probably combine abbreviating components such as "Education" to "Edu" and "Organization" to "Org", as well as a final truncation with suffix if the name is still too long (e.g.
- Using appropriate data types (e.g.
uuidinstead ofuniqueidentifier). - Replace square bracket qualifiers (
[object]) with quotation marks ("object"). - Constraints, indexes, primary keys, and auto increments must be rewritten.
For a detailed analysis, see Converting to PostgreSQL: Analysis in the SIG space.
The MVP release is not predicated on MetaEd support, although MetaEd support is part of the long-term plan. The initial schema objects for PostgreSQL will be created with the help of a database migration utility with manual adjustments as needed.
Application Code
Most database queries in the ODS and associated tools are mediated through an ORM, either NHibernate (ODS) or Entity Framework (Admin App). Both of these are capable of writing queries against PostgreSQL databases with little additional effort. The primary concern will be adjusting the object-to-table mapping when object names must be shortened.
In some places, use of the ORM was previously ruled out because of threading or performance issues. In these cases ADO.NET was used directly. These ADO.NET calls contain raw SQL that is typically almost compatible with PostgreSQL. These ADO.NET calls will be modified:
- Using a special ORM instance where possible, or
- Moving complex queries into T-SQL and PG-SQL Views and writing generic ANSI-SQL compatible queries through ADO.NET (using the Npgsql library)
Database Deployment
The existing ODS / API v2.x and v3.x code bases use PowerShell scripts to deploy the three Ed-Fi databases: Admin, ODS, and Security. The PowerShell commands use SQL Server Management Objects (SMO) to install all of the relevant scripts into the ODS database, and Entity Framework migrations for setting up the Admin and Security databases. SMO will not work for PostgreSQL deployment and must be replaced. Although Entity Framework migrations will work for PostgreSQL, it is not an ideal solution.
Instead, one or more new console applications will created, using DbUp to manage database state transitions. This, we hope, will also provide a new path forward for easier upgrades of both PostgreSQL and SQL Server-based systems. DbUp is used in both the existing database migration utility and in the Analytics Middle Tier. It tracks installation of scripts on the server by name. The Admin database has 13 tables in it; there would be one named script for each table. Once DbUp has run a script, it never runs that same script again; it does this by creating a journal table that records which scripts have run and when. Any modification to one of those tables goes into a new script, rather than altering the old one. Thus each change is a state transition (e.g. ALTER TABLE) instead of a drop and recreate. This is what makes upgrades from one version to the next easier to manage.
It has not been decided whether all three databases will be handled by one console application or by separate ones. Whatever the case, the application will have the following characteristics:
- Written in .NET Core so that it can be run in Windows or Linux environments (although initial release testing will focus on Windows testing).
- Command line options to include:
- Connection string
- Connection timeout
- Database type flag
- Idempotency, meaning the application(s) can be run repeatedly with no surprises or side-effects.
Such a command-line program can easily be executed using continuous integration / deployment tools such as TeamCity, Octopus, or Jenkins. The command would look something like:
# PostgreSQL .\EdFi.ODS.Deploy.exe deploy -d ODS -c "Host=myserver;Username=mylogin;Password=mypass;Database=EdFi_Ods" -t 60 -p Postgres .\EdFi.Ods.Deploy.exe deploy -d Admin -c "Host=myserver;username=mylogin;Password=mypass;Database=EdFi_Admin" -t 60 -p Postgres # SQL Server .\EdFi.ODS.Deploy.exe deploy -d ODS -c "Server=myserver;user id=mylogin;Password=mypass;Database=EdFi_Ods" -t 60 -p SqlServer .\EdFi.Ods.Deploy.exe deploy -d Admin -c "Server=myserver;user id=mylogin;Password=mypass;Database=EdFi_Admin" -t 60 -p SqlServer
Deployment Scenarios
The ODS / API application will likely still be running on the .NET Framework (not .NET Core) at the time of release. Thus, while the database server could be running in Linux, the ODS / API is expected to be Windows-only. The same is true for the Admin App.
Planned Docker Support
From the start, the development team will be using PostgreSQL in Docker for much of its development and testing effort.
Hoped-for Docker Support
There is currently a set of scripts for running the ODS / API v2.3.1, v2.4.0, or v2.5.0 in Docker-on-Windows. We hope — but are not committed — to deliver updated Docker scripts for running the Roadrunner version of the ODS / API in Docker-on-Windows. Admin App would require changes to the authentication mechanism, and thus would be less likely to be supported in Docker than the API.
Cloud Services
In addition to Docker, we anticipate that the deployment utility will be able to target PostgreSQL managed services on AWS, Azure, Google Cloud, Heroku, and so forth. Testing on cloud-managed services will likely rely on help from partners and interested community members and thus is not necessarily in scope for the MVP release.