- Created by Stephen Fuqua, last modified on May 19, 2022
Choosing a Database Platform
Overview
Historically, the Ed-Fi ODS/API supported only Microsoft SQL Server; recently, PostgreSQL support was added. Both support the relational database modeling that has been popular for several decades now. That data model has proven effective and efficient in countless situations, including in the ODS/API. But there are limitations, some of which have been strongly felt in the ODS/API:
- Normalization of the Data Model results in a proliferation of tables that then must be re-joined in expensive read operations.
- Table schema evolution is slow, painstaking, and difficult to make backwards-compatible.
- Using natural keys as primary keys is an anti-pattern in a relational database, leading to larger disk space usage and more difficulty in writing accurate queries.
Is there another database engine or modeling paradigm that might be appropriate for a future architecture that supports the Ed-Fi Data Model?
Data Storage and Modeling Requirements for an Ed-Fi API
Key requirements for an Ed-Fi API that is modeled on the Unified Data Model:
- Must have full authentication and authorization support.
- Must be capable of authorizing partial access, directly in the engine or through query filtering.
- Must support queries by natural keys.
- Must have transport encryption.
- Must have audit logging.
- Should support queries on other fields.
- Should have paging.
- Should have horizontal scaling capability.
- Should have a reasonable query language.
- Nice to have optimistic concurrency support.
- Nice to have consistent reads (all queries at same time return same result).
- Nice to have schema flexibility for supporting changes to the Unified Data Model.
- Nice to have field selection.
Additional factors to consider for Project Meadowlark:
- Should use managed services to the extent available.
- Horizontal (preferably "elastic") database scalability to support large school districts and collaboratives.
- Using multiple storage technologies together is potentially viable and may open up additional features, such as relaxed resolution of references during initial write actions.
Out of scope for this analysis:
- Analytics support.
Utilizing Multiple Databases
In traditional application development, including the Ed-Fi ODS/API Platform, all Create-Read-Update-Delete (CRUD) operations are served by a single database instance. Project Meadowlark has instead adopted the strategy of choosing database engines that are a good fit-to-purpose. "NoSQL" databases are a good fit for online transaction processing (OLTP) because they enable storage of the raw API payloads (JSON) directly in the database. This improves both the write speed and the speed of retrieving a single object from the database, since there are no joins to perform between different tables. In the current implementation, AWS's native and proprietary DynamoDB was selected as the primary transaction database for the simple reason that its architecture was interesting to explore. There are other document and key-value storage systems that could easily be used instead of DynamoDB.
A key difference between this document storage approach and relational database modeling comes in the form of searchability. Many key-value and document databases can add "secondary indexes" that help find individual items by some global criteria. But these are limited and very different from the indexes found in a relational database, which can be tuned to identify items based on any column. In other words, when storing an entire document, most key-value and document databases fare poorly when trying to search by query terms (e.g. "get all students with last name Doe").
This is where OpenSearch shines. A fork of Elasticsearch, OpenSearch is also a NoSQL document store. The key difference is that it indexes everything in the document, and it has a powerful search engine across those indexes. OpenSearch is not designed to be a robust solution for high-performance write operations, so it does not make sense to write directly to it.
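To make the contrast concrete, a "get all students with last name Doe" request maps naturally onto the OpenSearch query DSL. The sketch below builds a paged query body; the index field name (`lastSurname`) and page size are illustrative assumptions, not Meadowlark's actual names.

```python
def build_match_query(field: str, value: str, size: int = 25, offset: int = 0) -> dict:
    """Build a paged OpenSearch-style query body matching one field.

    The field path is a hypothetical example; a real deployment would
    use whatever field names appear in the indexed API documents.
    """
    return {
        "query": {"match": {field: value}},
        "size": size,    # page size
        "from": offset,  # paging offset
    }

# "Get all students with last name Doe", first page of results
query = build_match_query("lastSurname", "Doe")
```

A query like this runs against the full-document index, which is exactly the capability the key-value store lacks.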
To streamline the Meadowlark API functionality, that code only writes to one database (DynamoDB). It then uses that database's change data capture stream to push new data out to another serverless function. That next function writes data to OpenSearch in a completely asynchronous, non-blocking process, and naturally deletes removed objects from OpenSearch.
Early on, the development team also experimented with writing the item out to blob storage (S3) in addition to OpenSearch. With JSON objects stored in S3, it was incredibly easy to build a simple analytics dashboard in Amazon QuickSight, with Amazon Athena sitting in the middle as the query engine.
The S3 work was removed for expediency after an upgrade to the AWS SDK broke the code, and it may be restored in the future. Additionally, it may be useful to explore having the "GET by ID" requests served from blob storage instead of from the transaction database to take advantage of lower-cost reads; this can also be combined with CDN caching (for example with descriptors) to further improve performance and potentially lower the per-transaction cost.
Eventual Consistency
Highly scalable databases such as DynamoDB and Cassandra store multiple copies of the data for resiliency and high availability, and only one of these copies receives the initial write operation. The service guarantees that all other copies will eventually come up to date with that initial write operation: the data will eventually be consistent. The tradeoff is in favor of connection reliability: queries are not blocked by write operations.
Many people who are used to thinking about transaction locking in relational databases find this disturbing at first. But the reality is less scary than it sounds.
Amazon states that it typically takes "one second or less" to bring all copies up to date. Let's compare the outcomes of the following three scenarios:
| Time | Scenario 1 | Scenario 2 | Scenario 3 |
|---|---|---|---|
| 10:01:01.000 AM | Client A reads a record | Client B writes an update to that record | Client B writes an update to that record |
| 10:01:01.500 AM (half second) | Client B writes an update to that record | Client A reads a record | All DynamoDB copies are up-to-date |
| 10:01:02.000 AM (full second) | All DynamoDB copies are up-to-date | All DynamoDB copies are up-to-date | Client A reads a record |
| Status | Client A has stale data | Client A might have stale data | Client A has current data |
In Scenario 1, Client A receives stale data because they requested it half a second before Client B wrote an update. This is no different than in a relational database.

In Scenario 2, Client B writes an update half a second before Client A sends a read. Client A might coincidentally be assigned to read from the first database node that received the record, or it might read from a node that is lagging by half a second. Thus it might get stale data, though this is not guaranteed.

Finally, in Scenario 3, Client A asks for a record a full second after Client B has written an update, and Client A is nearly guaranteed to get the current (not stale) data. Again, this is the same as with a standard relational database.
The practical difference between the guaranteed consistency of a relational database and the eventual consistency of a distributed database like DynamoDB is thus more a matter of happenstance than anything else. In either case, if Client A reads from the system a millisecond before Client B writes, then Client A will have stale data. If Client A reads after Client B writes, then the window of time for getting stale data goes up to perhaps a second. But if they do get stale data, they will never know that they weren't in scenario 1.
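The three scenarios above can be modeled with a toy replicated store: a write lands on one replica immediately and reaches the others only after a propagation delay (Amazon cites "one second or less" for DynamoDB). The class below is a simplification for illustration, not how DynamoDB is actually implemented.

```python
class ReplicatedStore:
    """A single record replicated across nodes; writes propagate after `lag` seconds."""

    def __init__(self, replicas: int, lag: float):
        self.lag = lag
        # per replica: history of (visible_at, value) pairs
        self.history = [[(0.0, "v1")] for _ in range(replicas)]

    def write(self, value: str, now: float) -> None:
        self.history[0].append((now, value))       # one replica sees it immediately
        for h in self.history[1:]:
            h.append((now + self.lag, value))      # the rest catch up after the lag

    def read(self, replica: int, now: float) -> str:
        # a lagging replica still serves the last value visible to it
        visible = [v for t, v in self.history[replica] if t <= now]
        return visible[-1]

store = ReplicatedStore(replicas=3, lag=1.0)
store.write("v2", now=1.0)                  # Client B updates at 10:01:01.000
stale = store.read(replica=2, now=1.5)      # Scenario 2: a lagging node may return "v1"
fresh = store.read(replica=2, now=2.0)      # Scenario 3: all copies caught up, returns "v2"
```

Reading from the primary replica at any time after the write always returns the new value; only reads routed to a lagging replica inside the propagation window see stale data.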
Eventual consistency is likely "good enough." But it does deserve further community consideration before using it in a production system.
DynamoDB + OpenSearch
Storage Design
Meadowlark uses the single-table design approach for storage in DynamoDB, with the following structure:
| Column Name | Purpose |
|---|---|
| info | Contains the JSON document for a resource |
| pk | Hash key (aka partition key), one half of the primary key |
| naturalKey | Plain-text version of the natural key |
| sk | Range key (aka sort key), the other half of the primary key |
There are also a couple of experimental columns and secondary indexes for exploring relationship-based authorization.
Meadowlark creates a unique resource ID by calculating a SHA-3 (cShake 128) hash value from the natural key. This value is stored as the sort key, sk. The partition key, pk, contains entity type information: schema, model version, and domain entity name.
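A sketch of this key construction follows. Python's standard library offers SHAKE-128 but not cSHAKE-128, so `hashlib.shake_128` stands in here for the cShake 128 hash described above, and the delimiter layout of the pk and natural-key strings is an illustrative assumption rather than Meadowlark's exact format.

```python
import hashlib

def partition_key(schema: str, version: str, entity: str) -> str:
    """pk encodes the entity type: schema, model version, and domain entity name.

    The "TYPE#" prefix and "#" delimiters are hypothetical formatting choices.
    """
    return f"TYPE#{schema}#{version}#{entity}"

def sort_key(natural_key: str) -> str:
    """sk is a fixed-length hash of the plain-text natural key.

    Stand-in: SHAKE-128 with a 16-byte (32 hex character) digest.
    """
    return hashlib.shake_128(natural_key.encode("utf-8")).hexdigest(16)

pk = partition_key("ed-fi", "3.3.1-b", "Student")
sk = sort_key("studentUniqueId=604824")
```

Because the sort key is a deterministic hash of the natural key, a "GET by natural key" request can recompute sk and fetch the item directly, while the plain-text naturalKey column preserves a human-readable copy.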
In DynamoDB, an "item" is analogous to a "record" in a relational database. Thus a single object being stored in a DynamoDB table is stored as "an item".
Streaming to OpenSearch
DynamoDB has native change data capture streaming. The change stream can trigger execution of a Lambda function. This function in turn can write data out to OpenSearch.
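The core of that Lambda function is a transform from a DynamoDB Streams record to an OpenSearch action: INSERT and MODIFY events index the document, while REMOVE events delete it. The sketch below follows the documented DynamoDB Streams event shape but stubs out the actual OpenSearch client call; using sk as the OpenSearch document ID is an assumption for illustration.

```python
import json

def to_opensearch_action(record: dict) -> dict:
    """Translate one DynamoDB stream record into an OpenSearch action.

    A real handler would loop over event["Records"] and send the
    resulting actions to an OpenSearch client.
    """
    event = record["eventName"]               # "INSERT" | "MODIFY" | "REMOVE"
    doc_id = record["dynamodb"]["Keys"]["sk"]["S"]  # resource id doubles as document id
    if event == "REMOVE":
        # deletions flow through the same stream, keeping OpenSearch in sync
        return {"op": "delete", "id": doc_id}
    # the new image carries the full item, including the JSON payload in "info"
    info = record["dynamodb"]["NewImage"]["info"]["S"]
    return {"op": "index", "id": doc_id, "body": json.loads(info)}
```

Because the stream delivers both writes and deletes in order, OpenSearch converges on the same set of documents as DynamoDB without the API code ever touching it directly.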
References
- Alex DeBrie's DynamoDB Guide and The DynamoDB Book
- Single-table design with DynamoDB (covers a fair amount of his book content)
- re:Invent 2019 - DynamoDB Deep Dive
- re:Invent 2020 - DynamoDB Advanced Design Patterns, part 1
- re:Invent 2020 - DynamoDB Advanced Design Patterns, part 2