Migration Utility

This content is archived.

March 31, 2022: The Ed-Fi Alliance announced end-of-life (EOL) for the ODS Migration Utility. While the utility can still be used to upgrade to Ed-Fi ODS / API v5.3 on both SQL Server and PostgreSQL, there will be no new migration support or enhancements to this product.

Overview

The ODS Migration Utility is a command-line tool that upgrades the schema of an ODS instance to the latest version and migrates its data.

It currently supports data migration from Ed-Fi Data Standard v2.0 and Ed-Fi Data Standard v2.2 to Ed-Fi Data Standard v3.3. The utility has out-of-the-box support for migrating an as-shipped ODS to the latest version. With additional customized scripting, the Migration Utility can be adapted to migrate extended ODS instances. Shared-instance ODS deployments can take advantage of this utility; for year-specific instances, migration may not be a concern because a new ODS is created at the beginning of every school year.

Usage Scenarios

The following table summarizes the supported scenarios for the migration utility:

Database Type

Databases

Upgrade/Migration strategy

Core Databases 
Databases that surface the Ed-Fi model and store user data.

EdFi_Ods, EdFi_Ods_YYYY, EdFi_Ods_Sandbox

The current migration utility release supports an in-place upgrade for the following upgrade paths for SQL Server:

2.4 -> 5.3
2.5 -> 5.3
3.0 -> 5.3

and the following upgrade path for PostgreSQL:

3.4 -> 5.3

 

The migration utility also supports migrating extensions with additional custom scripts.

Support Databases
Databases that provide supporting functions and store user data.

EdFi_Admin, EdFi_Security, EdFi_Bulk

These databases can either be recreated, or the database deployment tool can be used to migrate them.

Transient Databases
Databases that either surface the Ed-Fi model or perform supporting functions, but do not have user data persistence requirements.

EdFi_ODS_Empty, EdFi_Ods_Minimal_Template, EdFi_Ods_Populated_Template

No upgrade supported.

 

Developer Quick Start

The basic steps are simple:

  • Restore a backup copy of the target ODS to your local SQL Server instance.

  • Make sure .NET Core 3.1 SDK is installed.

  • Choose one of the two options below to launch the migration utility:

Option 1: Use the released tool

Step 1. Install the Ed-Fi ODS Migration tool:

c:\>mkdir {YourInstallFolder}
c:\>dotnet tool install EdFi.Suite3.Ods.Utilities.Migration --tool-path {YourInstallFolder} --version 2.2.*

Note: As a one-time setup, you may need to add the Ed-Fi package source by running the following command in a PowerShell prompt:

if (-not [Net.ServicePointManager]::SecurityProtocol.HasFlag([Net.SecurityProtocolType]::Tls12)) {
    [Net.ServicePointManager]::SecurityProtocol += [Net.SecurityProtocolType]::Tls12
}
Register-PackageSource -Name Ed-FiAzureArtifacts -Location https://pkgs.dev.azure.com/ed-fi-alliance/Ed-Fi-Alliance-OSS/_packaging/EdFi/nuget/v3/index.json -ProviderName NuGet

Step 2. Open a console window and change to the directory containing the executable.

Step 3. Launch the upgrade tool from the command line.

Option 2: Build from source

Step 1. Clone the Git repository: https://github.com/Ed-Fi-Exchange-OSS/Ed-Fi-MigrationUtility

Step 2. Open the Visual Studio solution file, Migration.sln.

Step 3. Set up command line input for debugging in Visual Studio 2019:

  • Right click the EdFi.Ods.Utilities.Migration project.

  • Select properties.

  • Select debug.

  • Add command line arguments:

Step 4. Set the EdFi.Ods.Utilities.Migration project as your startup project.

Step 5. Launch in debug mode (F5).

Downloads

Ed-Fi ODS Migration Utility binaries: EdFi.Suite3.Ods.Utilities.Migration v2.2 (Prerequisite: .NET Core 3.1 SDK)

Ed-Fi ODS Migration Utility source code, hosted on Ed-Fi Alliance GitHub: https://github.com/Ed-Fi-Exchange-OSS/Ed-Fi-MigrationUtility

Example calendar configuration files: Sample Calendar Config

Development Overview: The Basics

The table below describes files and folders used by the Migration Utility along with a description and purpose for each resource.

Overview Item

Needed by Whom?

Brief Description & Purpose

Script Directory:
\Scripts

  • Users writing custom upgrade scripts

  • Maintainers of the upgrade utility

  • All database upgrade scripts go here, including:

    • Ed-Fi upgrade scripts.

    • Custom upgrade scripts (user extensions).

    • Compatibility checks.

    • Dynamic / SQL-based validation.

  • Subdirectories contain code for each supported ODS upgrade version.

Directory:
\Descriptors

  • Maintainers of the upgrade utility only

  • Ed-Fi-Standard XML files containing Descriptors for each ODS version.

  • These XML files are imported directly by the scripting in the Utilities\Migration\Scripts directory above.

Library/Console:
EdFi.Ods.Utilities.Migration

(console application created via dotnet publish)

  • Maintainers of the upgrade utility only

  • Small, reusable library making use of DbUp to drive the main upgrade.

  • Executes the SQL scripts contained in the Utilities\Migration\Scripts directory above.

  • Takes a configuration object as input, and chooses the appropriate scripts to execute based on ODS version and current conventions.

Test Project:
EdFi.Ods.Utilities.Migration.Tests

  • Maintainers of the upgrade utility only (does not contain test coverage for extensions)

  • Contains integration tests that perform a few test upgrades and assert that the output is as expected.

  • Like the console utility, makes direct use of the EdFi.Ods.Utilities.Migration library described above.

Development Troubleshooting

This section outlines general troubleshooting procedures.

Compatibility Errors

  • Before the schema is updated, the ODS data is checked for compatibility. If changes are required, the upgrade will stop and an exception will be thrown with instructions on how to proceed.
    An example error message follows:

Example compatibility error
  • After making the required changes (or writing custom scripts), simply launch the upgrade utility again. The upgrade will pick up where it left off and retry the last script that failed.

Other Exceptions During Development

  • Similar to compatibility error events, the upgrade will halt and an exception message will be generated during development if a problem is encountered.

  • After making updates to the script that failed, simply re-launch the utility. The upgrade will resume where it left off, starting with the last script that failed.

  • If you are testing a version that is not yet released, or if you need to re-execute scripts that were renamed or modified during active development: restore your database from backup and start over.

  • Similar to other database migration tools, a log of scripts successfully executed will be stored in the default journal table. DbUp's default location is [dbo].[SchemaVersions].
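For example, previously applied scripts can be reviewed by querying the journal table directly (the table and column names below are DbUp's defaults):

```sql
-- List scripts already applied by the Migration Utility, most recent first.
-- [dbo].[SchemaVersions] is DbUp's default journal table.
SELECT ScriptName, Applied
FROM [dbo].[SchemaVersions]
ORDER BY Applied DESC;
```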

  • A log file containing errors/warnings from the most recent run may be found by default in "{YourInstallFolder}\.store\edfi.suite3.ods.utilities.migration\{YourMigrationUtilityVersion}\edfi.suite3.ods.utilities.migration\{YourMigrationUtilityVersion}\tools\netcoreapp3.1\any\Ed-Fi-Migration.log".

Additional Troubleshooting

  • The step-by-step usage guide below contains runtime troubleshooting information.

Design/Convention Overview

The table below outlines some important conventions in the as-shipped Migration Utility code.

What

Why

Optional Notes

In-place upgrade

Database upgrades are performed in place rather than creating a new database copy

Extensions

  • Preserve all unknown data in extension tables

  • Ensure errors and exceptions are properly generated and brought to the upgrader's attention during migration if the upgrade conflicts with an extension or any other customization

As a secondary concern, this upgrade method was chosen to ease the upgrade process for a Cloud-based ODS (e.g., on Azure).

  • Users who have extension tables (or any other schema with foreign key dependencies on the edfi schema) will be notified, and must explicitly acknowledge it by adding the BypassExtensionValidationCheck option at the command line when upgrading. This ensures that the installer is aware that custom upgrade scripts may be required.

Sequence of events that occur during upgrade

Specifics differ for each version, but in general the upgrade sequence executes as follows:

  1. Validate user input at the command line

  2. Create tempdata (tables/stored procedures) that will be used for upgrade

  3. Check current ODS data for upgrade compatibility, and display action messages to the user if existing data requires changes

  4. Before modifying the edfi schema: calculate and store hash values for primary key data that is expected NOT to change during upgrade

  5. Drop views, constraints, stored procs

  6. Import descriptor data from XML

  7. Create all new tables for this version that did not previously exist

  8. Update data in existing tables

  9. Drop old tables

  10. Create views/constraints/stored procs for the new version

  11. Validation check: recalculate the hash codes generated previously, and make sure that all data that is not supposed to change was not mistakenly modified

  12. Drop all temporary migration data
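As a rough illustration of steps 4 and 11 above, a checksum over key data can be captured before the upgrade and recomputed afterwards for comparison. The actual utility performs this via its own stored procedures; the table and columns below are placeholders:

```sql
-- Sketch only: aggregate a checksum over primary key columns before upgrade,
-- then recompute after upgrade and compare the two values.
-- Table and column names here are placeholders, not the utility's actual logic.
SELECT CHECKSUM_AGG(CHECKSUM([StudentUSI], [SchoolId])) AS KeyHash
FROM [edfi].[StudentSchoolAssociation];
```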

Minimize the number of scripts with complex dependencies on other scripts in the same directory/upgrade step.

  • Compatibility checks are designed to run before any changes to the edfi schema have been made. This prevents the user from having to deal with a half-upgraded database while making updates

  • Initial hash codes used for data validation also must be generated before touching the edfi schema to ensure accuracy.

    • It is also better for performance to do this step while all of our indexes are still present

  • Dropping of constraints, views, etc. is taken care of before making any schema changes to prevent unexpected SQL exceptions

  • New descriptors are imported as an initial step before making changes to the core tables. This ensures that all new descriptor data is available in advance for reference during updates

  • After creation of descriptors, the sequence of the next steps is designed to ensure that all data sources exist unmodified on the old schema where we expect it to.

    1. Create tables that are brand new to the schema only (and populate them with existing data)

    2. Modify existing tables (add/drop columns, etc)

    3. Drop old tables no longer needed

  • Foreign keys, constraints, etc are all added back in once the new table structure is fully in place.

  • Once the edfi schema is fully upgraded and will receive no further changes, we can perform the final data validation check.

The suite 2 to suite 3 upgrade is a good example case to demonstrate the upgrade steps working together due to its larger scale:

  • For suite 2 to suite 3: All foreign keys and other constraints were dropped during this upgrade in order to adopt the new naming conventions

  • Also for suite 2 to suite 3: ODS types were replaced with new descriptors. This change impacted nearly every table on the existing schema

One script per table in each directory, where possible

Scripts are named in the format:

#### TableName [optional_tags].sql

This convention does not apply to operations that are performed dynamically

Troubleshooting, Timeout prevention

Custom, unknown extensions on the ODS are common. As part of the process of upgrading a highly customized ODS, an installer is likely to run into a SQL exception somewhere in the middle of the upgrade (usually caused by a foreign key dependency, schema-bound view, etc.).

In general, we do not want to attempt to modify an unknown/custom extension on an installer's behalf to try to prevent this from happening. It is important that an installer be aware of each and every change applied to their custom tables. Migration of custom extensions will be handled by the installer.

Considering the above, in the event an exception does occur during upgrade, we want to make the troubleshooting process as easy as possible. If an exception is thrown, an installer should immediately be able to tell:

  • Which table was being upgraded when it occurred (from the file name)

  • What were the major changes being applied (from the file tags)

  • What went wrong (from the exception message)

  • Where to find the code that caused it

Many issues may be fixable from the above information alone. If more detail is needed, the installer can view the code in the referenced script file. By separating script changes by table, we make an effort to ensure that there are only a few lines to look through (rather than hundreds).

In addition, each script is executed in a separate transaction. Operations such as index creation can take a long time on some tables in a large ODS. Splitting the code into separate transactions helps prevent unexpected timeout events.

The major downside of this approach is the large number of files it can produce. For example, the suite 2 to suite 3 upgrade modified all existing tables, generating a change script for every table in more than one directory.

With updates becoming more frequent, future versions should not be impacted as heavily.

Most change logic is held in SQL scripts (as of v3)

As of v3, most of the upgrade logic is performed from the SQL scripts rather than using the .NET-based upgrade utility to write database changes directly

 

 

As of v3, most upgrade tasks are simple enough that they can be executed straight from SQL (given a few stored procedures to get started).

Given this, effort was made to ensure that each part of the migration tool (console utility, library, integration tests) can be replaced individually as needed.

The current upgrade utility contains a library making use of DbUp to drive the upgrade process. In the future, if or when this tool no longer suits our needs, we should be able to port the existing scripting over to an alternative upgrade tool (such as RoundhousE), or even a custom-built tool if the need ever arises.

This convention could (and should) change in the future if upgrade requirements become too complex to execute from SQL scripting alone.

 

Two types of data validation/testing options

  • Dynamic, SQL-based

    • Ensures data that is expected to remain the same does not change

    • Can run on any ODS in the field even if the data is unknown

  • Integration tests

    • Runs on a known, given set of inputs

    • Used to test logic in areas where changes should occur

 

Prevent data loss

The first type of validation (dynamic, SQL-based) is executed on data that we know should never change during the upgrade.

  • The source and destination tables do not need to be the same. This validation type is most commonly used to verify that data was correctly moved to the expected destination table during upgrade

  • Can be executed on any field ODS to ensure that unknown datasets do not cause unexpected data loss during upgrade

  • The data in these tables does not need to be known

The second type of data validation, integration test based, is used to test the logic and transformations where we know the data should change:

  • For example, during the suite 2 to suite 3 upgrade, descriptor namespaces are converted from the suite 2 "http://EdOrg/Descriptor/Name.xml" format to the suite 3 "uri://EdOrg/Name" format. Integration tests are created to ensure that the upgrade logic is functioning correctly for several known inputs
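The namespace conversion itself can be sketched as a simple string transformation. This is a simplification for illustration; the real upgrade scripts are more involved:

```sql
-- Sketch: convert a suite 2 descriptor namespace such as
--   http://EdOrg/Descriptor/Name.xml
-- to the suite 3 form
--   uri://EdOrg/Name
SELECT REPLACE(REPLACE(REPLACE(
         'http://EdOrg/Descriptor/Name.xml',
         'http://', 'uri://'),
         '/Descriptor/', '/'),
         '.xml', '') AS ConvertedNamespace;
-- Result: uri://EdOrg/Name
```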

Together, the two validation types (validation of data that changes, and validation of data that does not change) can be used to create test coverage wherever it is needed for a given ODS upgrade.

The dynamic validation is performed via reusable stored procedures that are already created and available during upgrade.

See scripts in the "*Source Validation Check" and "*Destination Validation Check" directories for example usages.

Upgrade Issue Resolution Approach

It is common to encounter scenarios where data cannot be upgraded directly from one major version to another due to schema changes. A common example: a primary key change causes records allowed by the previous version's schema to be considered duplicates under the upgraded schema, and therefore disallowed.

The general approaches included here are a result of collaboration with the community on how to resolve these common situations, and are documented here as a reference for utility developers to apply in their own work.

  • Approach 1: Make non-breaking, safe changes to data on the user's behalf. This is the preferred option when practical and safe to do so.

    • Example: Duplicate session names in the suite 2 schema that will break the new suite 3 primary key will get a term name appended during upgrade to meet schema requirements.

    • Consider logging a warning for the user to review to inform them that a minor change has taken place, and mention the log file in documentation.

  • Approach 2: Throw a compatibility exception asking the user to intervene with updates. Used when we are unable to safely make assumptions on the user's behalf to perform the update for them

    • See the troubleshooting section of this document for an example of this

    • The returned user-facing message should explain what updates are required, and very briefly include the reason why if it is not immediately obvious.

    • Compatibility exceptions should be thrown before proceeding with core upgrade tasks. 

      • We want to make sure that we are not asking someone to make updates to a database in an intermediate upgrade state.

    • Each message class has a corresponding error code, mirrored in code by an enumeration

      • This is done so that specific compatibility scenarios can be covered by integration tests where needed

  • Approach 3: Back up deprecated objects that cannot be upgraded into another schema before dropping them. This is a last-resort option that should mainly be restricted to deprecated items that no longer have a place in the new model.

    • As a general approach, it is preferred to avoid dropping data items without a way to recover them in case of disaster. The user may choose to delete the backup if they desire.

    • Avoid this option for tables that exist in both models but simply cannot be upgraded. Should this (hopefully rare) situation occur, consider throwing a compatibility exception instead and asking the user to back up or empty the table before proceeding.

In general, the option requiring the least user effort while safely preserving all data has been chosen, to reduce user burden as much as possible.

 

Usage Walkthrough

This section explains how to upgrade an existing suite 2 ODS to a suite 3 ODS (v5.3).

Each upgrade step is outlined in detail below, along with a compatibility reference chart, a command-line parameter list, and a troubleshooting guide.

Step 1. Read the Ed-Fi ODS v5.3 Upgrade Overview

Before you get started, you should review and understand the information in this section.

Target Audience

These instructions have been created for technical professionals, including software developers and database administrators. The reader should be comfortable with performing the following types of tasks:

  • Creating and managing SQL Server database backups

  • Performing direct SQL commands and queries 

  • Execution of a command-line based tool that will perform direct database modifications

  • Creating a configuration file for upgrade (.csv format)

  • Writing custom database migration scripts (Extended ODS only)

General Notes

  • Your suite 2 Ed-Fi ODS / API will be checked for compatibility automatically during the migration process. If changes are needed, the migration utility will prompt you at the command line. A summary of commonly encountered compatibility conditions is included in this section for reference.

  • The new schema contains upgrades to the structure of primary keys on several tables. In most instances, these new uniqueness requirements will be resolved automatically for you with no action required.

  • There are some areas where new identities cannot be generated automatically on your behalf during upgrade. These tables will need to be updated manually.
     

 

Compatibility Conditions

This section describes compatibility conditions (i.e., requirements that may need intervention for the compatibility tool to function properly) and suggested remediation.

 

Table

Data Compatibility Requirement

[edfi].[Assessment]

All assessments must have a [Namespace] set. (This data may be found in [edfi].[Assessment] or [edfi].[AssessmentFamily]).

[edfi].[StudentProgramParticipation]
[edfi].[StudentCharacteristic]
[edfi].[StudentIndicator]
[edfi].[StudentLearningStyle]
[edfi].[StudentAddress]
[edfi].[StudentIdentificationCode]
[edfi].[StudentElectronicMail]
[edfi].[StudentInternationalAddress]
[edfi].[StudentLanguage]
[edfi].[StudentRace]
[edfi].[StudentDisability]
[edfi].[StudentTelephone]
[edfi].[PostSecondaryEventPostSecondaryInstitution]

The upgrade utility must be able to locate an [EducationOrganizationId] for every student with data in the listed tables to proceed.

The easiest way to meet this requirement is to ensure that every student has a corresponding record in [edfi].[StudentSchoolAssociation] or [edfi].[StudentEducationOrganizationAssociation].
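Offending students can be located ahead of time with a query along the following lines. This is only a sketch against one of the listed tables; the column names are assumed from the standard suite 2 schema and should be verified against your ODS:

```sql
-- Sketch: find students with program participation data but no
-- school association through which an EducationOrganizationId
-- could be resolved. Column names assumed from the suite 2 schema.
SELECT DISTINCT spp.StudentUSI
FROM edfi.StudentProgramParticipation spp
WHERE NOT EXISTS (
    SELECT 1
    FROM edfi.StudentSchoolAssociation ssa
    WHERE ssa.StudentUSI = spp.StudentUSI
);
```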

[edfi].[StaffCredential]

The column [StateOfIssueStateAbbreviationTypeId] must be non-null for all records.

This is the abbreviation for the name of the state (within the United States) or extra-state jurisdiction in which a license/credential was issued.

(Any extension table)

Additional steps are required when extensions are present. Please review the upgrade process detailed below for additional guidance.

Table

Data Compatibility Requirement

[edfi].[GradingPeriod]

  • There must be no duplicate [PeriodSequence] values for the same school during the same grading period.

  • If prompted by the upgrade utility, all [PeriodSequence] values must be non-null

Technical Details:

This compatibility requirement is a result of a primary key change between suite 2 and suite 3 (version 3.1)

  • Old 2.0 Primary Key: [GradingPeriodDescriptorId], [SchoolId], [BeginDate]

  • New 3.1 Primary Key: [GradingPeriodDescriptorId], [SchoolId], [PeriodSequence] (new), [SchoolYear] (new). ([BeginDate] is removed)
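Duplicate [PeriodSequence] values can be located before running the upgrade with a grouping query (a sketch against the suite 2 schema; verify column names against your ODS):

```sql
-- Sketch: find duplicate PeriodSequence values for the same school
-- and grading period, which would violate the new primary key.
SELECT GradingPeriodDescriptorId, SchoolId, PeriodSequence,
       COUNT(*) AS Occurrences
FROM edfi.GradingPeriod
GROUP BY GradingPeriodDescriptorId, SchoolId, PeriodSequence
HAVING COUNT(*) > 1;
```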

[edfi].[DisciplineActionDisciplineIncident]

The 3.1 schema no longer allows discipline action records for students that are not associated with the discipline incident.

Every record in [edfi].[DisciplineActionDisciplineIncident] must have a corresponding record in [edfi].[StudentDisciplineIncidentAssociation] with the same [StudentUSI], [SchoolId], and [IncidentIdentifier].
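Orphaned records can be identified ahead of time with a query such as the following (a sketch using the key columns named above; verify against your suite 2 schema):

```sql
-- Sketch: find DisciplineActionDisciplineIncident rows lacking a
-- matching StudentDisciplineIncidentAssociation record.
SELECT dadi.StudentUSI, dadi.SchoolId, dadi.IncidentIdentifier
FROM edfi.DisciplineActionDisciplineIncident dadi
WHERE NOT EXISTS (
    SELECT 1
    FROM edfi.StudentDisciplineIncidentAssociation sdia
    WHERE sdia.StudentUSI = dadi.StudentUSI
      AND sdia.SchoolId = dadi.SchoolId
      AND sdia.IncidentIdentifier = dadi.IncidentIdentifier
);
```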

 

[edfi].[RestraintEvent]

Ensure that there are no duplicate [RestraintEventIdentifier] values for the same student at the same school

[edfi].[OpenStaffPosition]

Ensure that there are no duplicate [RequisitionNumber] entries for the same education organization

[edfi].[AccountCode]

This table must be empty before upgrading. Due to a major schema change, data in this table cannot be preserved during the upgrade from suite 2 to suite 3.

Other Compatibility Conditions

There are several other less common items not included above. The migration utility will check for these items automatically and provide guidance messages as needed. For additional technical details, please consult the Troubleshooting Guide below.

Step 2. Install Required Tools