TAG Meeting 2024-10-10

Agenda

  • New public documentation site

  • Tracking data lineage

  • Ed-Fi roadmap

  • Selectively hiding domains

Materials

 

Participants

First Name   Last Name    Organization
David        Clements     Ed-Fi Alliance
Dirk         Bradley      Kalamazoo RESA/Michigan Data Hub
Don          Dailey       Keen Logic
Eric         Jansson      Ed-Fi Alliance
Jay          Lindler      South Carolina District Data Governance
John         Parker       Innive
Jushua       Impson       Resultant
Katie        Favara       Texas Region 4
Matt         Hoffman      Aeries Software
Rosh         Dhanawade    Education Analytics
Stephen      Fuqua        Ed-Fi Alliance
Vinaya       Mayya        Ed-Fi Alliance
Wyatt        Cothran      South Carolina Department of Education

Support: Ann Su

Notes

These notes complement the slide deck above and make the most sense when read along with the deck.

Lineage

  • Who, in this call, is interested in using this lineage information?

    • Innive, South Carolina District Data Governance Group, Michigan Data Hub, and Region 4 all spoke up with interest.

    • Option 1

      • Clarification: the “last modified by” value would be the vendor application name, in Ed-Fi terms.

      • Yes, useful in the short term. An organization with a data warehouse can compare successive copies of the data to see what changed (the option 3 information), complementing the “who” that _lastModifiedBy provides.

    • Option 2

      • Progress on the way to option 3

      • How much history are you holding onto? Could this become a data storage issue?

        • It should be easy to make this an opt-in feature for anyone worried about that.

    • Option 3

      • This is ideal for audit purposes.

      • Could also be expensive, both in development cost and in storage and processing cost.

    • What about writing lineage information in POST and PUT requests?

      • Not currently envisioning this.

      • If we did, then it might be better to move this into the Data Standard instead of being an “extension” on the API Standard.

    • Do we need to distinguish between POST and PUT?

      • Both can modify a record, so there is no clear value in distinguishing between the two.

    • The voiced consensus is that option 3 is preferred, though if it is not available, the other options are still valuable.
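
The data warehouse comparison mentioned under option 1 could look roughly like the sketch below. It assumes the warehouse keeps two snapshots of the same resource as plain dictionaries; the function name, the sample field names and values, and the rule that keys starting with an underscore are metadata are all assumptions made for illustration, not part of any proposal discussed in the meeting.

```python
from typing import Any


def changed_fields(before: dict[str, Any], after: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {field: (old, new)} for every content field whose value differs.

    Keys starting with "_" (for example _lastModifiedBy) are treated as
    metadata and skipped; that convention is an assumption of this sketch.
    A field missing from one snapshot is reported with None on that side.
    """
    changes: dict[str, tuple[Any, Any]] = {}
    for name in sorted(set(before) | set(after)):
        if name.startswith("_"):
            continue
        old, new = before.get(name), after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical snapshots of one record, pulled on two different days.
yesterday = {"studentUniqueId": "604822", "lastSurname": "Smith", "_lastModifiedBy": "Vendor SIS"}
today = {"studentUniqueId": "604822", "lastSurname": "Smyth", "_lastModifiedBy": "Assessment Loader"}

# "Who" comes from the API metadata, "what" comes from the warehouse comparison.
report = {"who": today["_lastModifiedBy"], "what": changed_fields(yesterday, today)}
```

This only approximates option 3: changes made and reverted between two snapshots would not be visible.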

Editor’s idea after the meeting, as an alternative to writing lineage information directly. Consider a process that does the following:

  1. Read data from the API.

  2. Run an algorithm on this data.

  3. Write the result back into the API.

At step 3, it might be nice to save lineage information reflecting the algorithm / process used to generate the new data.

Suggestion: create a new Application with its own client credentials (key and secret), naming the Application after the process. Thus, in all three options, the name of the new Application would provide the algorithm/process information, without having to submit lineage data in a POST or PUT request.
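
A toy model of this suggestion, to make the mechanics concrete. The `ToyApi` class is an in-memory stand-in written for this note, not the ODS/API or the Data Management Service; the client keys, Application names, record identifiers, and the absence-rate calculation are all hypothetical. The only point is that the host derives `_lastModifiedBy` from the credentials used, so the payload itself carries no lineage fields.

```python
from dataclasses import dataclass, field


@dataclass
class ToyApi:
    """In-memory stand-in for an API host (illustration only)."""

    applications: dict[str, str]  # client key -> Application name
    records: dict[str, dict] = field(default_factory=dict)

    def upsert(self, client_key: str, record_id: str, body: dict) -> dict:
        # Lineage comes from the caller's credentials, never from the body.
        stored = {**body, "_lastModifiedBy": self.applications[client_key]}
        self.records[record_id] = stored
        return dict(stored)

    def get(self, record_id: str) -> dict:
        return dict(self.records[record_id])


api = ToyApi(applications={
    "key-sis": "Vendor SIS",
    "key-risk": "Attendance Risk Scoring v1",  # Application named after the process
})
api.upsert("key-sis", "student-1", {"daysAbsent": 12, "daysEnrolled": 60})

source = api.get("student-1")                                     # 1. read
rate = source["daysAbsent"] / source["daysEnrolled"]              # 2. run the algorithm
result = api.upsert("key-risk", "risk-1", {"absenceRate": rate})  # 3. write back
```

Putting a version in the Application name, as in the hypothetical "v1" here, would let a later reader tell which revision of the process produced a record, at the cost of issuing new credentials whenever the process changes.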

Editor’s observation after the meeting: vendors who delete and re-post data (which is common) will be removing historical lineage information. Documentation will need to point this out. The certification process discourages delete/re-post, but it happens anyway.

Ed-Fi API Roadmap

  • The proposal is simply this: only ship Data Standard 6 with the new Data Management Service. Do not incorporate it into the ODS/API, due to the development and opportunity costs.

  • Clarification: ODS/API 7.x will continue to receive bug fixes and potentially minor Data Standard version updates (e.g. 4.1, 5.3).

  • What is the level of effort to include in ODS/API?

    • Cannot predict at this early stage.

    • As a breaking change year, it is likely to cause significant churn in tests and in some of the code.

    • Best educated guess: one to two developers for around a month.

  • Parallel year operation may be important, and organizations need to look closely at the timelines for the two applications.

  • We can include Data Standards 4 and 5 in the Data Management Service, at low cost.

  • Consider: some people may be skipping Data Standard 5 strategically, planning to move to Data Standard 6. Under this proposal, they would also be forced to adopt the new application.

Selectively Hiding Domains

  • The underlying concern seems to be more about the developer building integrations with the API, rather than the hosts who are running it.

  • From a security perspective, claimsets and profiles are adequate for preventing inappropriate data access.

  • Limiting the scope of the “Swagger documentation” (OpenAPI specification) not only helps developers find the right integration points, it would also keep them from inquiring about endpoints that they should not be accessing.

  • The simplest approach, then, is to tailor the OpenAPI specification that is made available to developers, rather than changing the application code to “physically” remove endpoints.

    • Probably a modification to MetaEd when it starts generating OpenAPI specifications in the future.
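
Until something like that exists in MetaEd, tailoring could also be done as a post-processing step on a generated specification. The sketch below is one possible shape, under a strong assumption: that the operations to keep can be identified by their OpenAPI `tags`. How Ed-Fi domains would map onto tags was not discussed, and the function name, paths, and tag names here are hypothetical.

```python
import copy

HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}


def tailor_spec(spec: dict, allowed_tags: set[str]) -> dict:
    """Return a copy of an OpenAPI document limited to operations with an allowed tag."""
    tailored = copy.deepcopy(spec)  # leave the full specification untouched
    kept_paths = {}
    for path, item in tailored.get("paths", {}).items():
        # Keep non-operation keys (e.g. shared parameters) and allowed operations.
        kept_item = {
            key: value
            for key, value in item.items()
            if key not in HTTP_METHODS or allowed_tags & set(value.get("tags", []))
        }
        # Drop the path entirely when no operation survives.
        if any(key in HTTP_METHODS for key in kept_item):
            kept_paths[path] = kept_item
    tailored["paths"] = kept_paths
    if "tags" in tailored:
        tailored["tags"] = [tag for tag in tailored["tags"] if tag["name"] in allowed_tags]
    return tailored


# Hypothetical, heavily trimmed specification.
full_spec = {
    "openapi": "3.0.1",
    "info": {"title": "Example API", "version": "1"},
    "tags": [{"name": "students"}, {"name": "staffs"}],
    "paths": {
        "/ed-fi/students": {"get": {"tags": ["students"], "responses": {"200": {"description": "OK"}}}},
        "/ed-fi/staffs": {"get": {"tags": ["staffs"], "responses": {"200": {"description": "OK"}}}},
    },
}
developer_spec = tailor_spec(full_spec, {"students"})
```

This changes only what developers see; the endpoints remain reachable, so claimsets and profiles still do the actual access control. Unused entries under `components` are left in place for simplicity.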

Next Meeting

  • Dec 12, 2024 2:00 - 3:15 pm CT