
July 25, 2024, 11:00 am - 11:45 am central. Contact Stephen Fuqua for the meeting invitation.

Agenda

  • Demonstration

  • Review the roadmap

  • Discuss / provide input on architecture

Demonstration

Rather than review the milestone v0.1.0 release, let’s look at what we have today, now that reference validation has been added.
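Reference validation here means rejecting writes that would break referential integrity. The sketch below illustrates the idea only: it uses SQLite foreign keys from the Python standard library as a stand-in, and the `document` and `reference` tables, their columns, and the `try_write` helper are hypothetical, not the actual DMS schema or code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK enforcement off by default
conn.executescript("""
    CREATE TABLE document (id INTEGER PRIMARY KEY, resource TEXT NOT NULL);
    CREATE TABLE reference (
        parent_id     INTEGER NOT NULL REFERENCES document(id),
        referenced_id INTEGER NOT NULL REFERENCES document(id)
    );
""")


def try_write(sql, params=()):
    """Run one write; return True if accepted, False if it would break a reference."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


school_ok = try_write("INSERT INTO document VALUES (1, 'School')")
assoc_ok = try_write("INSERT INTO document VALUES (2, 'StudentSchoolAssociation')")
link_ok = try_write("INSERT INTO reference VALUES (2, 1)")

# A reference to a document that does not exist is rejected (like a bad POST/PUT).
dangling_ok = try_write("INSERT INTO reference VALUES (2, 99)")

# Deleting a document that is still referenced is rejected (like a bad DELETE).
delete_ok = try_write("DELETE FROM document WHERE id = 1")

print(school_ok, assoc_ok, link_ok, dangling_ok, delete_ok)
```

In an API, a rejected write of this kind would typically be turned into a conflict-style error response rather than surfacing the database error directly.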

Roadmap

| Milestone | Functional Goals |
|---|---|
| 0.1 DONE | Compliant Discovery API, Descriptor API, and Resource API definition (except GET by query): able to run bulk upload, smoke test. Includes JSON validation based on API schema file. Fake OAuth (1). |
| 0.2 IN PROGRESS | Reference validation, Streaming, and Profiles: rejects POST, PUT, and DELETE requests that would violate referential integrity. Streaming data out. Build basic Profiles support (2). |
| 0.3 BY SUMMIT | GET by query and cascading updates: use search engine or relational DB to fulfill GET by query requests. Support cascading updates on allowed resources. |
| 0.4 NEED TO ACCELERATE | Namespace authorization: real OAuth; JWT inspection; duplicate ODS/API's namespace authorization. First release of the Configuration Service. |
| 0.5 | Data model flexibility and Concurrency: extensions, choosing between DS 4 and DS 5, swapping data standards at start up (not compile). Dynamic Discovery API definition, based on actual Data Standard/extensions. Full support for eTag-based concurrency. |
| 0.6 | Dynamic profiles and multitenancy: full-fledged support for XML-based dynamic profiles, and for ODS/API 7 style multi-tenant routing and database segmentation. |
| 0.7 tech congress? | Ed-org based authorization. (3) |
| 0.8 | Change queries. |

Meeting notes:

  • Swap “get by query” and “basic profiles” support between milestones 0.2 and 0.3.

  • Who can help?

    • JF: perhaps help with multitenancy / routing? Suggested writing up design notes first in this GitHub discussion.

    • Max: Lambda function as alternate front end interface. We can also look into other areas of interest in the roadmap.

Architecture

Database Design

Project-Tanager/docs/DMS/PRIMARY-DATA-STORAGE at main · Ed-Fi-Alliance-OSS/Project-Tanager (github.com)

image-20240725-014052.png
  • Each table has 16 partitions by default; the count can be configured higher without much difficulty.
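As a rough illustration of hash-style partitioning across 16 partitions, the sketch below maps a document UUID to a partition number. How DMS actually derives its partition key is not stated in these notes; taking the UUID's integer value modulo the partition count is an assumption, and `partition_key` is a hypothetical helper.

```python
import uuid

PARTITION_COUNT = 16  # default partition count from the notes


def partition_key(document_uuid: uuid.UUID, partitions: int = PARTITION_COUNT) -> int:
    """Map a document UUID to a partition number in [0, partitions).

    Assumption: simple modulo of the UUID's 128-bit integer value.
    """
    return document_uuid.int % partitions


# Random (version 4) UUIDs spread roughly evenly across the partitions.
keys = [partition_key(uuid.uuid4()) for _ in range(1000)]
print(sorted(set(keys)))
```

Because the key is computed from the identifier alone, a lookup by UUID can be routed to a single partition, and raising the partition count only changes the `partitions` argument.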

Bulk Load Performance

Grand Bend data set (“populated template”). Running in Docker containers on localhost.

| System | Data Set | Timing | Row count |
|---|---|---|---|
| DMS | Descriptors only | 0:19.9 minutes | 3,201 in document + 3,201 in alias |
| ODS/API 7.1 | Descriptors only | 6:45.6 minutes | 3,201 in descriptor + 3,201 across all of the descriptor tables |
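To put the two timings side by side, the arithmetic can be worked as below. This assumes the timings are written as minutes:seconds (so 0:19.9 is 19.9 seconds) and that 3,201 descriptors were loaded in each run; `to_seconds` is a helper introduced only for this illustration.

```python
def to_seconds(timing: str) -> float:
    """Convert a 'minutes:seconds' string such as '6:45.6' to seconds."""
    minutes, seconds = timing.split(":")
    return int(minutes) * 60 + float(seconds)


DESCRIPTORS = 3201

dms_seconds = to_seconds("0:19.9")   # 19.9 s
ods_seconds = to_seconds("6:45.6")   # 405.6 s

speedup = ods_seconds / dms_seconds
dms_rate = DESCRIPTORS / dms_seconds  # descriptors per second
ods_rate = DESCRIPTORS / ods_seconds

print(f"speedup: {speedup:.1f}x")
print(f"DMS: {dms_rate:.0f}/s, ODS/API 7.1: {ods_rate:.1f}/s")
```

Under those assumptions, the DMS run is roughly 20 times faster for this descriptor-only load, at about 161 descriptors per second versus about 8.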

Isolation Level

Read Committed vs. Repeatable Read vs. Snapshot vs. Serializable

Example: a read-before-write check when a delete fails because the item being deleted is referenced by something else. Under read committed, a concurrent transaction could change the “something else” between the read and the write, making it impossible to report the problem with the delete. Repeatable read solves this, but can force a rollback of another “concurrent” (but second-in) transaction. We think this conflict is unlikely to occur frequently. If using read committed, we would probably want to do more manual locking.
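One common way to live with the occasional rollback under repeatable read is to retry the losing transaction. The sketch below shows that pattern only; it is not DMS code. `SerializationFailure` is a hypothetical stand-in for the driver error a database raises in this case (PostgreSQL reports it as SQLSTATE 40001), and `run_with_retry` and `delete_document` are illustrative names.

```python
import time


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying SQLSTATE 40001."""


def run_with_retry(transaction, max_attempts=3, backoff_seconds=0.0):
    """Call transaction(); retry it if it was rolled back by a concurrent one."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and let the caller report the conflict
            time.sleep(backoff_seconds * attempt)  # simple linear backoff


# Simulated transaction: loses the race once, then succeeds on retry.
attempts = []


def delete_document():
    attempts.append(1)
    if len(attempts) == 1:
        raise SerializationFailure("concurrent update")
    return "deleted"


result = run_with_retry(delete_document)
print(result, "after", len(attempts), "attempts")
```

If the conflict really is rare, as expected above, the retry path should seldom run; under read committed the equivalent safety would have to come from explicit locks instead.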

Meeting notes:

  • No specific concerns expressed. Give people time to digest and come back to this in the future.

  • Configurability will be important.

Search Database

image-20240725-025521.png

Project-Tanager/docs/DMS/CDC-STREAMING.md at main · Ed-Fi-Alliance-OSS/Project-Tanager (github.com)

image-20240725-025604.png

CDC reads straight from dms.document; there is no outbox event table. We are open to suggestions or a design for an additional outbox table.

Meeting notes:

  • CDC from this single table may be all that we need for an “outbox” at this time.
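To make the "single table as outbox" idea concrete, the sketch below turns one change event on dms.document into an action for the search database. The notes do not specify the event format; this assumes a Debezium-style envelope with `op`, `before`, and `after` fields, and the `id` and `body` column names and the `to_search_action` helper are hypothetical.

```python
import json


def to_search_action(event: dict):
    """Translate one change event into (action, document id, document body)."""
    op = event["op"]
    if op in ("c", "u", "r"):  # create, update, or snapshot read
        row = event["after"]
        return ("index", row["id"], json.loads(row["body"]))
    if op == "d":  # delete: only the old row image is available
        return ("delete", event["before"]["id"], None)
    raise ValueError(f"unsupported operation: {op!r}")


created = {
    "op": "c",
    "before": None,
    "after": {"id": "abc-123", "body": '{"codeValue": "Present"}'},
}
deleted = {"op": "d", "before": {"id": "abc-123", "body": "{}"}, "after": None}

print(to_search_action(created))
print(to_search_action(deleted))
```

Since every insert, update, and delete on the table already produces an event, a separate outbox table would mainly be useful if consumers needed events that do not correspond to a row change.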

Alternative

Additional query tables in PostgreSQL / MSSQL with painful indexing.
