Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.


Info
titleIntroduction

This document provides a rough analysis of what will be needed if the Ed-Fi ODS/API were to be converted over to run on another relational database management system (RDBMS). PostgreSQL was used as an anchor; however, many of the findings will be generally applicable to other RDBMS's.

...

  1. Eliminate boilerplate at the top of the script - things related to SQL Cmd mode, connection properties (e.g. ansi_padding), etc.
  2. Replace all [ ] qualifiers with nothing. Postgres uses quotation marks as qualifiers as needed. However, when you use them to create a object, then you are stuck permanently using them. Best to create without the quotation marks.
  3. Create a dbo schema, belonging to the postgres user.
  4. Alter other schemas so that they belong to the postgres user instead of dbo.
  5. Fix data types

    T-SQL TypePG-SQL TypeNotes
    NVARCHARVARCHARpostgres implicitly converts handles all text as Unicode.
    DATETIMETIMESTAMPNot using TIMESTAMPTZ (with timezone) because the source data has no record of the time zone to begin with.
    DATETIME2TIMESTAMP
    UNIQUEIDENTIFIERCHAR(40)
    BITSMALLINT

    While postgres has a bit data type, it does not implicitly convert integer to bit. Instead it requires use a string '0' or '1' (or 'true', etc.). For maximum compatibility with existing code that uses the integer 0/1 instead of string '0'/'1', we should simply switch to smallint.


  6. Postgres's cluster concept is different and more analysis is needed to know if it is needed (explicitly) for performance benefits. Remove CLUSTERED and NONCLUSTERED keywords from all indexes and primary key constraints. Also remove PAD_INDEX from constraints. Safe to leave in FILLFACTOR.
  7. Remove WITH CHECK and NOCHECK from constraints.
  8. The pg-equivalent of INT IDENTITY is SERIAL. Reg ex: /INT\s+IDENTITY\s*\(1, 1\)/ replace with SERIAL
  9. Remove all PRINT statements and extended properties (table/column descriptions). Extended properties can be rewritten as COMMENTs but the effort was not worthwhile in this POC.
  10.  Views
    1. Remove WITH SCHEMABINDING
    2. Replace COUNT_BIG(*) with COUNT(*), as postgres's COUNT function already returns a big integer.
    3. Remove all indexes on views.

      Note

      In a real conversion, some views might need to materialized views for performance optimization.


    4. Ensure that all view definitions end in a semi-colon.
    5. In auth.LocalEducationAgencyIdToStaffUSI, cast null values as integer.
    6. UNPIVOT must be rewritten with UNION ALL statements.

      Expand


      T-SQLPG-SQL


      Code Block
      languagesql
      CREATE VIEW [auth].[EducationOrganizationIdToStateAgencyId]
      AS
          SELECT    DISTINCT StateEducationAgencyId, EdOrgId as EducationOrganizationId
          FROM
                  (SELECT    * FROM auth.EducationOrganizationIdentifiers) As Source
          UNPIVOT (EdOrgId FOR IdType IN (EducationServiceCenterId, LocalEducationAgencyId, SchoolId)) -- Only ESCs, LEAs and Schools are accessible to State-level claims
                  As UnpivotedSource
          UNION    
          -- State-level claims also can access the State
          SELECT    StateEducationAgencyId, StateEducationAgencyId AS EducationOrganizationId
          FROM    edfi.StateEducationAgency
      GO



      Code Block
      languagesql
      CREATE VIEW auth.EducationOrganizationIdToStateAgencyId
      AS
      
      
          SELECT 
              StateEducationAgencyId,
              EducationServiceCenterId as EducationOrganizationId
          FROM
              auth.EducationOrganizationIdentifiers
          WHERE
              EducationServiceCenterId IS NOT NULL
          UNION
          SELECT 
              StateEducationAgencyId,
              LocalEducationAgencyId as EducationOrganizationId
          FROM
              auth.EducationOrganizationIdentifiers
          WHERE
              LocalEducationAgencyId IS NOT NULL
          UNION
          SELECT 
              StateEducationAgencyId,
              SchoolId as EducationOrganizationId
          FROM
              auth.EducationOrganizationIdentifiers
          WHERE
              SchoolId IS NOT NULL
          UNION   
          -- State-level claims also can access the State
          SELECT  StateEducationAgencyId, StateEducationAgencyId AS EducationOrganizationId
          FROM    edfi.StateEducationAgency;




    7. Views that use common table expressions (CTE) recursively need to be modified by adding the keyword RECURSIVE immediately after WITH. Affected views: StudentAssessmentStudentObjectiveAssessmentH, ObjectiveAssessmentH_CTE, LocalEducationAgencyH, LearningStandardH, LearninbjectiveH, AssessmentFamilyH.
    8. Default values in postgres are not constraints, so the syntax needs to be altered.

      1. Substitute now() for getdate().

        FromTo
        /ALTER TABLE ([^\s]+)[\s\n]+(?:ADD CONSTRAINT (?:[^\s]+)) DEFAULT \(getdate\(\)\) FOR (\w+)/ALTER TABLE $1 ALTER COLUMN $2 SET DEFAULT now()


      2. Replace newid() with uuid_generate_v4(). This requires first configuring uuid support by running CREATE EXTENSION "uuid-ossp".

        FromTo
        /ALTER TABLE ([^\s]+)[\s\n]+(?:ADD CONSTRAINT (?:[^\s]+)) DEFAULT \(newid\(\)\) FOR (\w+)/ALTER TABLE $1 ALTER COLUMN $2 SET DEFAULT uuid_generate_v4()


    9. The maximum identifier length is 63. Longer identifiers are automatically truncated. This results in many foreign key names overlapping. To avoid this, use reasonable abbreviations to shorten these seven problematic beginnings to foreign keys:

      FromTo
      fk_educationorganizationnetworkassociation_educationorganizationfk_edorgnetworkassociation_edorg

      fk_learningstandardprerequisitelearningstandard_learningstandard

      fk_lrnstdprerequisitelrnstd_lrnstd

      fk_educationcontentderivativesourceeducationcontent_educationcontent

      fk_edconderivativesourceedcon_edcon

      fk_studentassessmentstudentobjectiveassessmentperformancelevel

      fk_studassessmentstudobjassessmentperflvl

      fk_postsecondaryeventpostsecondaryinstitutionidentificationcode

      fk_pstscdryeventpstscdryinstidcode

      fk_postsecondaryeventpostsecondaryinstitutionmediumofinstruction

      fk_pstscdryeventpstscdryinstmediumofnstrn

      fk_educationorganizationidentificationcode_educationorganization

      fk_edorgidcode_edorg


Other constructs used in Admin, Bulk, and Security databases

...

Entity Framework

placholder

Migrations

placholderThe EdFi_Admin database is managed through Entity Framework migrations. In general migrations seem to be supported with Npgsql.EntityFramework, and we should be able to modify EdFi.Ods.Admin.Models to target postgres. This has not been tested yet.

Work Required to Migrate

There is no simple solution for migrating to Postgres or another database. We can split this into two components: generation of the database schema, and runtime execution.

...

Generation

Raw SQL Scripts

SQL scripts to create the database

MetaEd

placeholder

EF Migrations

placeholder

Runtime Execution

NHibernate

placeholder

Entity Framework

placeholder

Bulk Load

placeholder