TAG Meeting 2020-09-16 - Subgroup on Data Out 2

Participants

  1. Andrew Rice
  2. Erik Joranlien
  3. Jean-Francois G
  4. Patrick Yoho, InnovateEdu
  5. David Clements, Ed-Fi Alliance
  6. Vinaya Mayya, Ed-Fi Alliance
  7. Eric J

Materials/Agenda

Follow-up on the previous month's conversation:

  1. There is ambiguity about the right product and technical supports to allow agencies to derive value from their ODS.
    • Field practice suggests that local data marts are the place where value is being derived, not direct queries on the ODS
    • Given this, should tools like the AMT be seen as strategies for hydrating these data marts?
    • Are the former Data Out SIG recommendations, which pushed toward use-case-focused read-only APIs as a bridge to GraphQL on the ODS API, the right targets?
  2. There seems to be a growing need for ODS-to-ODS replication via API in the community.
    • The scope of that need is unclear.
    • The features to allow that replication appear to be in place for Suite 3, but missing from Suite 2.

Notes

  • Major points
    • There is a growing need for ODS-to-ODS replication via API in the community, but the scale of that pattern is not yet clear.
    • There is ambiguity about the best product supports to allow agencies to derive value from their ODS
      • The dominant opinion of the participants is that the ODS database is not an effective analytics store, and that in many, if not most, cases the data will leave the ODS before analysis. The roadmap emphasis should therefore be on bulk data movement and replication.
      • This advice is counter to the advice of the former Data Out SIG, which encouraged the view that the ODS database was the analytics store.
      • On the Analytics Middle Tier: seems useful as a way to support some basic analytics, but likely limited in utility for more mature implementations.
    • There is evidence of performance issues for medium to large LEA implementations for the Suite 2 API, but Suite 3 API has not been tested.
  • Prioritize development of regular (i.e., with each release) data-out benchmarking for the ODS API.
    • Default size should be at least a mid-sized school district, but comparison of performance at multiple scales is most desirable
    • Include a benchmark that is a comparison of API extraction time to extraction via SQL time
    • Focus on endpoints least likely to scale due to large natural keys, significant denormalization in the database ORM, or other similar issues
  • Action: TAG input captured in ODS-2947
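The benchmarking bullet above calls for comparing API extraction time to SQL extraction time. A minimal timing harness for that comparison might look like the sketch below; the resource name, page size, and session handling are illustrative assumptions, not tested ODS API configuration (the `offset`/`limit` paging parameters follow the general Ed-Fi ODS API convention, but should be verified against the target Suite version):

```python
import time
from typing import Callable, Iterable, Tuple


def time_extraction(fetch_pages: Callable[[], Iterable[list]]) -> Tuple[int, float]:
    """Run a full extraction and return (row_count, elapsed_seconds)."""
    start = time.perf_counter()
    rows = 0
    for page in fetch_pages():
        rows += len(page)
    return rows, time.perf_counter() - start


def api_pages(session, base_url: str, resource: str, page_size: int = 500):
    """Page through an ODS API resource using offset/limit paging.

    `session` is assumed to behave like a requests.Session with
    authentication already configured; resource name and page size
    are placeholders for illustration.
    """
    offset = 0
    while True:
        resp = session.get(f"{base_url}/{resource}",
                           params={"offset": offset, "limit": page_size})
        resp.raise_for_status()
        page = resp.json()
        if not page:
            return
        yield page
        offset += page_size
```

The same harness can wrap a SQL-side extractor (e.g., a cursor fetching rows in batches), so that both paths are timed identically and the API-vs-SQL ratio can be reported with each release at multiple district scales.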