Support for Data Standard 3.3.1-b
Introduction
Data Standard 3.3.1-b introduces an important usage change that impacts the new StudentInternetAccessDim view, an experimental view that was added to support the online engagement starter kit (see DATASTD-1594).
Previously, there was a recommendation to store Digital Equity information in the StudentEducationOrganizationAssociationStudentIndicator table (ED-FI WORKING DRAFT 3 - DIGITAL EQUITY COLLECTION). This recommendation still holds for Data Standard 3.1, 3.2, and 3.3.0-a. However, Data Standard 3.3.1-b introduces a more formal pattern for storing these data by adding columns to the StudentEducationOrganizationAssociation table. Thus, for the upcoming release of the Ed-Fi ODS/API Tech Suite 3, version 5.3, the Analytics Middle Tier needs to change how it is getting the data.
For additional context, currently the Analytics Middle Tier treats Data Standard 3.3.0-a as if it were Data Standard 3.2, because there were no breaking changes between them that impacted the views.
There are no breaking changes for Analytics Middle Tier version 2.5.x at the schema level. This normative change is the only known impact of this revision to the data standard.
An Observation About the Data Model
When StudentInternetAccessDim was first envisioned, it was assumed that there would be a time period to it, and thus the relationship from the StudentSchoolDim view to StudentInternetAccessDim was assumed to be one-to-many. However, there was a mistake in the data modeling for the Student Indicator entity: the date was not part of the natural key. Therefore there could only be one record. Thus even with the initial release, it would have been possible to consolidate these columns onto the StudentSchoolDim view.
This perspective is reinforced with the new Data Standard, where the Digital Equity columns are directly on the StudentEducationOrganizationAssociation. The Analytics Middle Tier works best when the data model is kept as simple as possible (while remaining accurate). Therefore it would be better to eliminate the StudentInternetAccessDim in favor of adding new columns to the StudentSchoolDim.
Normally that would mean a breaking change and an update to the Major version number for the Analytics Middle Tier. However, with version 2.5.x, the view was labeled "experimental" and thus subject to change. In that sense, it is reasonable to consider this a feature update instead of a breaking change.
Default Solution
The procedure to date for handling breaking changes at the Data Standard level has been:
- Fix the automated tests
  - With help from the ODS Platform team, create a new dacpac file with an empty version of the database for the new Data Standard.
  - Create a new test harness that loads this dacpac.
  - Modify or add test scripts as necessary for loading sample data into the new test harness.
- Fix the source code
  - Create a new .NET project for the new Data Standard.
  - Copy the final state of all of the views from the prior Data Standard project to the new one. There is no reason to copy any prior-state migration files.
  - Adjust broken view files as required to make them work (correctly) in the new Data Standard.
  - Add code to the SqlServerMigrationStrategy.cs class to detect the new Data Standard.
- Run the automated tests to confirm that everything works.
Concerns About this Approach
This approach has generally been useful to date. However, it does mean that any bugs found in a view must be fixed in several places. Is it really worth creating that level of potential tech debt in order to support this one small change?
Proposed Alternative
- Continue treating Data Standards 3.2 and 3.3.1-b the same from a C# migration strategy perspective.
- Create new scripts to drop StudentInternetAccessDim.
- Add new columns (described below) to the StudentSchoolDim.
  - For "DS22", the new columns will have to have a null value, as there is no available source.
  - For "DS31", the new columns would be derived from StudentEducationOrganizationAssociationStudentIndicator.
  - The "DS32" script will have to detect whether it is really installing into Data Standard 3.3.1-b, with something like this:

    IF EXISTS (
        SELECT 1
        FROM INFORMATION_SCHEMA.COLUMNS
        WHERE TABLE_NAME = 'StudentEducationOrganizationAssociation'
          AND COLUMN_NAME = 'InternetAccessInResidence'
    )
    BEGIN
        -- Build the query based on columns in StudentEducationOrganizationAssociation
    END
    ELSE
    BEGIN
        -- Build the query based on columns in StudentEducationOrganizationAssociationStudentIndicator
    END
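One practical wrinkle with the detection approach above: in SQL Server, CREATE VIEW must be the first statement in its batch, so it cannot appear directly inside an IF block. A minimal sketch of one way around this is to build the statement as a string in each branch and run it with sp_executesql. The view name `analytics.DigitalEquitySketch`, the `edfi` schema filter, and the `IndicatorName` value are assumptions for illustration only, not the actual Analytics Middle Tier script.

```sql
-- Sketch only: the view name, schema names, and IndicatorName value are
-- hypothetical placeholders chosen for illustration.
DECLARE @sql NVARCHAR(MAX);

IF EXISTS (
    SELECT 1
    FROM INFORMATION_SCHEMA.COLUMNS
    WHERE TABLE_SCHEMA = 'edfi'
      AND TABLE_NAME = 'StudentEducationOrganizationAssociation'
      AND COLUMN_NAME = 'InternetAccessInResidence'
)
BEGIN
    -- Data Standard 3.3.1-b: read the column directly from the association table.
    SET @sql = N'
        CREATE VIEW analytics.DigitalEquitySketch AS
        SELECT seoa.StudentUSI,
               seoa.EducationOrganizationId,
               seoa.InternetAccessInResidence
        FROM edfi.StudentEducationOrganizationAssociation AS seoa;';
END
ELSE
BEGIN
    -- Earlier 3.x Data Standards: read the value from the student indicator table.
    SET @sql = N'
        CREATE VIEW analytics.DigitalEquitySketch AS
        SELECT ind.StudentUSI,
               ind.EducationOrganizationId,
               ind.Indicator AS InternetAccessInResidence
        FROM edfi.StudentEducationOrganizationAssociationStudentIndicator AS ind
        WHERE ind.IndicatorName = ''Internet Access In Residence'';';
END

-- The dynamic string runs as its own batch, so CREATE VIEW is legal here.
EXEC sys.sp_executesql @sql;
```

Note that the two branches return different data types for InternetAccessInResidence in this sketch (a bit column versus free-text indicator), so a real script would likely need to normalize them.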
New Columns
The columns in the StudentInternetAccessDim view are:
- InternetAccessInResidence
- InternetAccessTypeInResidence
- InternetPerformance
- DigitalDevice
- DeviceAccess
Data Standard 3.3.1-b uses more descriptive column names and adds two more columns. To improve the reader's understanding about the source of data in the view, it is proposed that we instead use the following columns in the modified StudentSchoolDim view.
- PrimaryLearningDeviceAwayFromSchool (string from a Descriptor) (previously "DigitalDevice")
- PrimaryLearningDeviceAccess (string from a Descriptor) (previously "DeviceAccess")
- PrimaryLearningDeviceProvider (string from a Descriptor) (new)
- InternetAccessInResidence (bool) (name unchanged)
- BarrierToInternetAccessInResidence (string from a Descriptor) (new)
- InternetAccessTypeInResidence (string from a Descriptor) (name unchanged)
- InternetPerformanceInResidence (string from a Descriptor) (previously "InternetPerformance")
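To make the proposed column list concrete, here is a rough sketch of how the Data Standard 3.3.1-b branch might project these columns. It assumes that each descriptor-backed value is stored on StudentEducationOrganizationAssociation as a column named with a `DescriptorId` suffix, that the tables live in the `edfi` schema, and that `CodeValue` is the string to expose; all three are assumptions to verify against the actual database, and the real StudentSchoolDim view contains many other columns and joins not shown here.

```sql
-- Sketch only: the *DescriptorId column names and the choice of CodeValue
-- are assumptions for illustration, not the shipped view definition.
SELECT
    seoa.StudentUSI,
    seoa.EducationOrganizationId,
    deviceAway.CodeValue     AS PrimaryLearningDeviceAwayFromSchool,
    deviceAccess.CodeValue   AS PrimaryLearningDeviceAccess,
    deviceProvider.CodeValue AS PrimaryLearningDeviceProvider,
    seoa.InternetAccessInResidence,
    barrier.CodeValue        AS BarrierToInternetAccessInResidence,
    accessType.CodeValue     AS InternetAccessTypeInResidence,
    performance.CodeValue    AS InternetPerformanceInResidence
FROM edfi.StudentEducationOrganizationAssociation AS seoa
-- LEFT joins keep students whose Digital Equity values are not populated.
LEFT OUTER JOIN edfi.Descriptor AS deviceAway
    ON seoa.PrimaryLearningDeviceAwayFromSchoolDescriptorId = deviceAway.DescriptorId
LEFT OUTER JOIN edfi.Descriptor AS deviceAccess
    ON seoa.PrimaryLearningDeviceAccessDescriptorId = deviceAccess.DescriptorId
LEFT OUTER JOIN edfi.Descriptor AS deviceProvider
    ON seoa.PrimaryLearningDeviceProviderDescriptorId = deviceProvider.DescriptorId
LEFT OUTER JOIN edfi.Descriptor AS barrier
    ON seoa.BarrierToInternetAccessInResidenceDescriptorId = barrier.DescriptorId
LEFT OUTER JOIN edfi.Descriptor AS accessType
    ON seoa.InternetAccessTypeInResidenceDescriptorId = accessType.DescriptorId
LEFT OUTER JOIN edfi.Descriptor AS performance
    ON seoa.InternetPerformanceInResidenceDescriptorId = performance.DescriptorId;
```

For "DS22", where no source exists, the same column names could be emitted as typed NULL placeholders (for example `CAST(NULL AS NVARCHAR(50)) AS PrimaryLearningDeviceAccess`) so that the view shape stays the same across Data Standards.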