TAG Meeting 2024-10-10
Agenda
New public documentation site
Tracking data lineage
Ed-Fi roadmap
Selectively hiding domains
Materials
Participants
Notes
These notes complement the slide deck above and make the most sense when read along with the deck.
Lineage
Who, in this call, is interested in using this lineage information?
Innive, South Carolina District Data Governance Group, Michigan Data Hub, and Region 4 all spoke up with interest.
Option 1
Clarification: last modified by would be the vendor application name, in Ed-Fi terms.
Yes, useful in the short term. With a data warehouse solution, one can compare versions of the data to see what changed (option 3 information), which complements the "who" that _lastModifiedBy provides.
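A minimal sketch of that warehouse-side comparison, assuming two snapshots of the same resource are available as plain dictionaries. The function names and sample records are hypothetical, and `_lastModifiedDate` and `_etag` are included in the ignore list only on the assumption that such metadata fields travel with the record; only `_lastModifiedBy` comes from the discussion above.

```python
from typing import Any

# Metadata fields that are expected to change on every write, so they are
# excluded from the "what changed" comparison. Assumption: only
# _lastModifiedBy is taken from the meeting discussion; the others are
# illustrative.
METADATA_FIELDS = {"_lastModifiedBy", "_lastModifiedDate", "_etag"}


def changed_fields(before: dict[str, Any], after: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {field: (old_value, new_value)} for every non-metadata field that differs."""
    keys = (before.keys() | after.keys()) - METADATA_FIELDS
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(keys)
        if before.get(key) != after.get(key)
    }


def describe_change(before: dict[str, Any], after: dict[str, Any]) -> dict[str, Any]:
    """Combine the "who" (from _lastModifiedBy) with the "what" (from the snapshot diff)."""
    return {
        "who": after.get("_lastModifiedBy"),
        "what": changed_fields(before, after),
    }


# Hypothetical snapshots of one student record, taken on consecutive days.
yesterday = {"studentUniqueId": "604822", "lastSurname": "Smith", "_lastModifiedBy": "Vendor A SIS"}
today = {"studentUniqueId": "604822", "lastSurname": "Smyth", "_lastModifiedBy": "Vendor B Assessment"}

print(describe_change(yesterday, today))
```

A diff like this only sees the state at each snapshot; changes made and reverted between snapshots would not appear, which is part of why option 3 is more complete.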
Option 2
Progress on the way to option 3
How much history are you holding onto? Could this become a data storage issue?
It should be easy to make this an opt-in feature for anyone worried about that.
Option 3
This is ideal for audit purposes.
Could also be expensive - in terms of development cost, and also storage and processing cost.
What about writing lineage information in POST and PUT requests?
Not currently envisioning this.
If we did, then it might be better to move this into the Data Standard instead of being an “extension” on the API Standard.
Do we need to distinguish between POST and PUT?
Both can modify a record; there does not seem to be value in distinguishing between the two.
The voiced consensus is that option 3 is the preference, though lacking this, the other options are also valuable.
Editor’s idea after the meeting: alternative to writing lineage information directly:
1. Read data from the API.
2. Perform an algorithm on this data.
3. Write the result back into the API.
At step 3, it might be nice to save lineage information reflecting the algorithm / process used to generate the new data.
Suggestion: create a new Application with its own client credentials (key and secret), naming the Application after the process. Thus, in all three options, the name of the new Application would provide the algorithm/process information, without having to submit the data in a POST or PUT request.
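A minimal sketch of the suggestion above, assuming the API uses the OAuth 2.0 client credentials flow with a token endpoint at `/oauth/token`. The base URL, process names, and key/secret values are hypothetical placeholders. The code only builds the token request and does not send it.

```python
import base64
import urllib.parse
import urllib.request

# Hypothetical registry: one Application (and therefore one key/secret pair)
# per process, so the Application name itself identifies the algorithm that
# wrote the data.
PROCESS_CLIENTS = {
    "attendance-rollup-v2": {"key": "EXAMPLE_KEY_1", "secret": "EXAMPLE_SECRET_1"},
    "risk-score-model": {"key": "EXAMPLE_KEY_2", "secret": "EXAMPLE_SECRET_2"},
}


def build_token_request(base_url: str, key: str, secret: str) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request.

    Assumption: the token endpoint lives at <base_url>/oauth/token and accepts
    HTTP Basic authentication with the client key and secret.
    """
    credentials = base64.b64encode(f"{key}:{secret}".encode()).decode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/oauth/token",
        data=body,
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def token_request_for_process(base_url: str, process_name: str) -> urllib.request.Request:
    """Pick the credentials that belong to the named process."""
    client = PROCESS_CLIENTS[process_name]
    return build_token_request(base_url, client["key"], client["secret"])


request = token_request_for_process("https://api.example.org", "risk-score-model")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would perform the actual call. Every write made with the resulting token would then be attributed to the process-specific Application, with no lineage fields needed in the POST or PUT body.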
Editor’s observation after the meeting: vendors who delete and re-post data (which is common) will be removing historical lineage information. Documentation will need to point this out. The certification process discourages delete/re-post, but it happens anyway.
Ed-Fi API Roadmap
The proposal is simply this: only ship Data Standard 6 with the new Data Management Service. Do not incorporate it into the ODS/API, due to the development and opportunity costs.
Clarification: ODS/API 7.x will continue to receive bug fixes and potentially minor Data Standard version updates (e.g. 4.1, 5.3).
What is the level of effort to include in ODS/API?
Cannot predict at this early stage.
As a breaking change year, it is likely to cause significant churn in tests and in some of the code.
Best educated guess: one to two developers for around a month.
Parallel year operation may be important, and organizations need to look closely at the timelines for the two applications.
We can include Data Standards 4 and 5 in the Data Management Service, at low cost.
Consider: some people may be skipping Data Standard 5 strategically, planning to move to Data Standard 6. Shipping Data Standard 6 only with the Data Management Service would also force them to adopt the new application.
Selectively Hiding Domains
The underlying concern seems to be more about the developer building integrations with the API, rather than the hosts who are running it.
From a security perspective, claimsets and profiles are adequate for preventing inappropriate data access.
Limiting the scope of the “Swagger documentation” (Open API specification) not only helps developers find the right integration points, it would also keep them from inquiring about endpoints that they should not be accessing.
The simplest approach, then, is to tailor the Open API specification that is made available to developers, rather than changing the application code to “physically” remove endpoints.
Probably a modification to MetaEd when it starts generating Open API specifications in the future.
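A minimal sketch of what tailoring the Open API specification could look like, assuming the specification is available as a parsed JSON document with a top-level `paths` object. The function name, the resource paths, and the allow-list are hypothetical. This is not how MetaEd does or will generate specifications, only an illustration of filtering the document rather than the application.

```python
import copy


def tailor_spec(spec: dict, allowed_resources: list[str]) -> dict:
    """Return a copy of the spec that keeps only paths under the allowed resources.

    A path is kept when it equals an allowed resource (e.g. "/ed-fi/students")
    or is nested beneath it (e.g. "/ed-fi/students/{id}"). The original spec
    is left unmodified.
    """
    def is_allowed(path: str) -> bool:
        return any(
            path == resource or path.startswith(resource + "/")
            for resource in allowed_resources
        )

    tailored = copy.deepcopy(spec)
    tailored["paths"] = {
        path: item
        for path, item in tailored.get("paths", {}).items()
        if is_allowed(path)
    }
    return tailored


# Hypothetical, heavily abbreviated specification.
full_spec = {
    "openapi": "3.0.1",
    "info": {"title": "Example API", "version": "1"},
    "paths": {
        "/ed-fi/students": {},
        "/ed-fi/students/{id}": {},
        "/ed-fi/studentSchoolAssociations": {},
        "/ed-fi/staffs": {},
        "/ed-fi/assessments": {},
    },
}

developer_spec = tailor_spec(full_spec, ["/ed-fi/students", "/ed-fi/assessments"])
print(sorted(developer_spec["paths"]))
```

This sketch leaves schema definitions that are no longer referenced in place. It also does nothing to restrict access: the hidden endpoints still exist, so claimsets and profiles remain the security mechanism.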
Next Meeting
Dec 12, 2024 2:00 - 3:15 pm CDT