TAG Meeting 2020-09-16 - Subgroup on Data Out 2
Participants
- Andrew Rice
- Erik Joranlien
- Jean-Francois G
- Patrick Yoho, InnovateEdu
- David Clements, Ed-Fi Alliance
- Vinaya Mayya, Ed-Fi Alliance
- Eric J
Materials/Agenda
Follow-up on the previous month's conversation:
- There is ambiguity about the best/right product/tech supports to allow agencies to derive value from their ODS.
- Field practice suggests that local data marts are the place where value is being derived, not direct queries on the ODS
- Given this, should tools like the AMT be seen as strategies for hydrating these data marts?
- Are the former recommendations of the Data Out SIG (pushing toward use-case-focused, read-only APIs as a bridge to GraphQL on the ODS API) the right targets?
- There seems to be a growing need for ODS-to-ODS replication via API in the community.
- the scope of that need is unclear
- the features to allow that replication seem to be in place for Suite 3, but missing for Suite 2
Notes
- Major points
- There is a growing need for ODS-to-ODS replication via API in the community, but the scale of that pattern is not yet clear.
- There is ambiguity about the best product supports to allow agencies to derive value from their ODS
- The dominant opinion of the participants is that the ODS database is not an effective analytics store, and that in many, if not most, cases the data will leave the ODS before analysis. The roadmap emphasis should therefore be on bulk data movement and replication.
- This advice is counter to the advice of the former Data Out SIG, which encouraged the view that the ODS database was the analytics store.
- On the Analytics Middle Tier: seems useful as a way to support some basic analytics, but likely limited in utility for more mature implementations.
- There is evidence of performance issues with the Suite 2 API for medium to large LEA implementations, but the Suite 3 API has not been tested.
- Prioritize development of regular (i.e. with each release) data out benchmarking for the ODS API.
- Default size should be at least a mid-sized school district, but comparison of performance at multiple scales is most desirable
- Include a benchmark that is a comparison of API extraction time to extraction via SQL time
- Focus on endpoints least likely to scale due to large natural keys, significant denormalization in the database ORM, or other similar issues
- Action: TAG input captured here: ODS-2947
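Illustrative sketch (not discussed in the meeting): the recommended benchmark comparing API extraction time to SQL extraction time could take roughly the following shape. This is a minimal harness under stated assumptions, not an Ed-Fi tool. The function names `time_extraction` and `compare_extractions` are hypothetical, and the two `fake_*` extractors are stand-ins for a paged ODS API client and a direct database query, which a real benchmark would supply.

```python
import time
from statistics import median
from typing import Callable, Iterable, Iterator


def time_extraction(extract: Callable[[], Iterable[dict]], repeats: int = 3) -> dict:
    """Drain an extractor `repeats` times; return row count and median seconds."""
    durations = []
    rows = 0
    for _ in range(repeats):
        start = time.perf_counter()
        rows = sum(1 for _ in extract())  # consume every record
        durations.append(time.perf_counter() - start)
    return {"rows": rows, "median_seconds": median(durations)}


def compare_extractions(api_extract, sql_extract, repeats: int = 3) -> dict:
    """Time both extraction paths and report the API-to-SQL time ratio."""
    api = time_extraction(api_extract, repeats)
    sql = time_extraction(sql_extract, repeats)
    if api["rows"] != sql["rows"]:
        # The timings are only comparable if both paths moved the same data.
        raise ValueError("extractors returned different row counts")
    sql_seconds = sql["median_seconds"]
    ratio = api["median_seconds"] / sql_seconds if sql_seconds > 0 else float("inf")
    return {
        "rows": api["rows"],
        "api_seconds": api["median_seconds"],
        "sql_seconds": sql_seconds,
        "api_to_sql_ratio": ratio,
    }


# Hypothetical stand-ins. A real run would page through an ODS API resource
# and run the equivalent SQL query against the ODS database instead.
def fake_api_extract(total: int = 500, page_size: int = 100) -> Iterator[dict]:
    for offset in range(0, total, page_size):
        time.sleep(0.001)  # simulated per-page request latency
        for i in range(offset, min(offset + page_size, total)):
            yield {"id": i}


def fake_sql_extract(total: int = 500) -> Iterator[dict]:
    for i in range(total):
        yield {"id": i}


if __name__ == "__main__":
    print(compare_extractions(fake_api_extract, fake_sql_extract))
```

Running the same comparison against data sets of several sizes, with a mid-sized district as the default, would cover the multi-scale comparison noted above.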