TAG Meeting 2021-08-19
- Eric Jansson
- Ann Su
Attendees
First Name | Last Name | Organization |
Marcos | Alcozer | Ed-Fi Alliance |
Rohith | Chintamaneni | Arizona Department of Education |
David | Clements | Ed-Fi Alliance |
Patrick | Devanney | ClassLink |
Rosh | Dhanawade | Indiana University INsite |
Mindy | DuFault | Infinite Campus |
Stephen | Fuqua | Ed-Fi Alliance |
Jean-Francois | Guertin | EdWire |
David | Hefley | Nebraska Department of Education |
Eric | Jansson | Ed-Fi Alliance |
Jim | McKay | Instructure |
Chris | Moffatt | Ed-Fi Alliance |
Doug | Quinton | PowerSchool |
Andrew | Rice | Education Analytics |
Audrey | Shay | Wisconsin Department of Public Instruction |
Grishma | Shrestha | Infinite Campus |
Sayee | Srinivasan | Ed-Fi Alliance |
Patrick | Yoho | InnovateEDU Inc |
Agenda
- Expectations for APIs and API clients when an API holds multiple years of data (see some initial details below)
- Data out issues (3rd priority raised by the TAG)
- “Story workshop” format for this, to capture ideas
- See prioritization from TAG Meeting 2021-06-17
Background on Agenda Item #1: Multi-year API issues
This item has been reported via a couple of tickets (see https://tracker.ed-fi.org/browse/EIF-6 and https://tracker.ed-fi.org/browse/EDFI-977). The issue is whether the following is a valid use case that should be supported (it is not natively supported currently):
Use case: As a SIS vendor, I need to be able to read current year data from the API -- even when the API has multiple years of data -- so that I can synchronize (i.e., identify and add missing records, and delete extra records that are on the API but not in my system) my data with the data in the ODS. I need to do this because environmental conditions (e.g., system errors, bugs, network outages) lead to issues in synchronization over time.
Example case:
A SIS wants to reconcile its records for discipline actions with the records on an API (i.e., these two stores of data are no longer in sync, for whatever reason), and the API/ODS it is reconciling with has multiple years of school data.
So the SIS does a GET on the API, and receives back BOTH current year AND past year records on discipline actions. The SIS now has to filter past year records out manually. In addition, discipline actions have no school year element (SchoolYear) attached, but only a date, so the filtering must be done by using the date field.
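The filtering and reconciliation described in this example case can be sketched roughly as follows. This is only an illustration under stated assumptions, not how any SIS or the ODS/API actually implements it: the school-year boundaries, the flattened record shape, and the field names (studentUniqueId, disciplineActionIdentifier, disciplineDate) are all assumed for the sketch, and the records are taken to have been retrieved from the API already.

```python
from datetime import date

# Hypothetical school-year boundaries (assumption for this sketch);
# in practice these depend on local or state policy.
SCHOOL_YEAR_START = date(2021, 7, 1)
SCHOOL_YEAR_END = date(2022, 6, 30)


def in_current_year(record):
    """Return True if the record's date falls inside the assumed school year."""
    action_date = date.fromisoformat(record["disciplineDate"])
    return SCHOOL_YEAR_START <= action_date <= SCHOOL_YEAR_END


def record_key(record):
    """Build a comparison key from the assumed identifying fields."""
    return (
        record["studentUniqueId"],
        record["disciplineActionIdentifier"],
        record["disciplineDate"],
    )


def reconcile(sis_records, api_records):
    """Compare SIS records against current-year API records.

    Returns a pair of key sets: (missing from the API, extra on the API).
    Past-year API records are filtered out by date before comparing.
    """
    sis_keys = {record_key(r) for r in sis_records}
    api_keys = {record_key(r) for r in api_records if in_current_year(r)}
    return sis_keys - api_keys, api_keys - sis_keys
```

The first set would be added to the API and the second deleted from it. Note that the date filter is exactly the manual step the use case objects to: without it, every past year record on the API would show up as an "extra" record to delete.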
Materials
Notes
Multi-year / Shared Instance issues were the only topic discussed.
The TAG strongly recommended against use of multi-year / shared instance deployments. Those who were in a position to advise agencies noted that they advise clients against their use for all use cases.
Generally TAG members cited the inherent complexity of multi-year data management and the difficulty of mixing operational (current year) and historical data.
Some example reasons given were as follows. They are listed roughly in order of importance, based on how often each objection was raised, but there was no attempt to catalog all the complexities.
- The extra complexity of supporting multi-year versioning for entities in a current year model
- The difficulty of handling entities that evolve over time and need versioning (descriptor options, course codes, etc.) in multi-year models
- The difficulty with segmenting years in a multi-year store so that the data can be "certified" for future audits or otherwise "frozen"
- The difficulty of authorization for multi-year data, and lack of current ODS support for authorization patterns that factor in a time or school year dimension
- Potential performance problems with operational schemas, particularly as they are applied to historical data
- The extra complexity for data use, as there is no ability to segment off or transform the data to make it more comprehensible and focused for data analysts
- The complexity of migration of highly normalized and relational operational schemas
TAG members generally cited the need to separate operational and historical data as known best practice.
It was also mentioned that a system focused on the current year would logically expect the API it communicates with to be current-year as well. Attempts to derive a school year from dates were mentioned as problematic, as some entities may be classified by application of local or state policy.
In terms of the use case presented, this meant that solution #3, asking a SIS to deal with multi-year data received back from an API, received no support from any TAG member.
There was, however, some support for the notion of expanding the multi-year capabilities of the Ed-Fi data model, but considerably less than for the notion of moving off of multi-year deployments generally. Whether that should be done by "tagging" transactions to a school year or by introducing more dates into the model was not covered; the general notion was that it could be helpful to have a SchoolYear in the database. The use cases for that were not explored.
This discussion raised the issue of how agencies using Ed-Fi would access analytics that require multi-year data. They would be left without a clear solution. The notion of an open source data warehouse was raised as a possible solution.