Work-in-progress draft, May 2024.
The Ed-Fi Resources API, based on the Ed-Fi Data Standard, provides fine-grained access to educational data, modeled largely on the common denominators of the source systems that provide the data. For applications that consume data from an Ed-Fi API, this can result in a very “chatty” application integration: the consumer must make large numbers of calls over the network to retrieve the required data.
Furthermore, the authorization model in an Ed-Fi API is designed for client-server interactions, not for user interactions. Thus, the Ed-Fi API should not be used directly from a user interface.
In this article, we will explore design patterns and implementation concerns for building backend applications that address these problems.
Data Access Patterns
In the data access patterns below, user applications or other backend services call another service that sits between them and the Ed-Fi service host. That intermediary service, which is itself an Ed-Fi client application, could provide data to its callers through many different access patterns. The most common of these are described below under the heading Frontend API Design Patterns.
What about writes? These are written to deal with reads. Must revise to also treat the subject of writes.
Direct Database Interaction
How it works
A dedicated backend application interacts directly with the Ed-Fi database.
When to use
The Ed-Fi Alliance strongly discourages this pattern for the following reasons:
It bypasses the authorization security in the Ed-Fi API, potentially affecting both read and write operations.
This approach may put too much strain on the primary data storage, causing resource contention for other Ed-Fi client applications that are using the Ed-Fi API.
The Ed-Fi database is not a standard. Thus, different implementations or even different versions of the same implementation could have unstated breaking changes at the database layer. For example, the Ed-Fi ODS/API Platform and the Ed-Fi Data Management Service (unreleased at the time of writing) have very different backend database structures. An integration built on the ODS/API’s EdFi_ODS database would not be compatible with the Data Management Service’s database.
Implementation Notes
Although this pattern is no longer considered advisable, its performance benefits make it a tempting option, and many applications have been built on this model in the past. In such cases, it is advisable to limit the direct database interaction to read operations only, and to run against a read-only copy of the database. The copy could be a snapshot or a replica. Using a read-only copy mitigates the resource contention concern on the primary database. Limiting the integration to read operations eliminates the write half of the authorization security concern.
Also see Row-Level Authorization below.
Real-time API Interaction
How it works
A dedicated backend application interacts directly with the Ed-Fi API, translating each incoming coarse-grained request into many fine-grained requests that use the Ed-Fi Resources API or other API specifications implemented in the Ed-Fi service application.
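The translation described above can be sketched as follows. This is a minimal illustration, not a prescribed design: the function name get_student_summary, the injected fetch callable, and the choice of resource paths and query parameter are assumptions made for the example, and should be checked against the API specification of the target Ed-Fi service. A stand-in fetch function is used so that the sketch runs without a network connection.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Assumed fine-grained resource paths; verify against the target API.
RESOURCES = {
    "student": "/ed-fi/students",
    "enrollments": "/ed-fi/studentSchoolAssociations",
    "sections": "/ed-fi/studentSectionAssociations",
}


def get_student_summary(student_id: str, fetch: Callable[[str, dict], list]) -> dict:
    """Answer one coarse-grained request with several fine-grained calls.

    `fetch(path, params)` is an injected (hypothetical) function that performs
    an authenticated GET against the Ed-Fi API and returns the decoded JSON.
    """
    params = {"studentUniqueId": student_id}
    # Issue the calls concurrently so that total latency approaches that of
    # the slowest call rather than the sum of all calls.
    with ThreadPoolExecutor(max_workers=len(RESOURCES)) as pool:
        futures = {
            name: pool.submit(fetch, path, params)
            for name, path in RESOURCES.items()
        }
        return {name: future.result() for name, future in futures.items()}


def fake_fetch(path: str, params: dict) -> list:
    """Stand-in for a real HTTP call, used only to exercise the sketch."""
    return [{"path": path, "studentUniqueId": params["studentUniqueId"]}]


summary = get_student_summary("604822", fake_fetch)
```

Even with concurrent calls, the caller still waits for the slowest Ed-Fi API response, which is why this pattern suits cases with only a few downstream calls.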
When to use
Use when true “real-time” interaction with the Ed-Fi API is required. Note that many vendor integrations with an Ed-Fi API are not truly real-time integrations. If business requirements expect a literal real-time user interface, it may be worthwhile to first check on the actual timeliness of data landing in the Ed-Fi API before committing to this real-time pattern.
This pattern works best when only a small number of calls to the Ed-Fi API are needed or when the calling service can safely wait for an extended period of time. If many requests are required of the Ed-Fi API to fulfill the “user” application’s originating request, then there could be a substantial delay before responding. This might not be acceptable in a user interface application.
Implementation Notes
Ed-Fi API client credentials will need to be managed directly in the backend application. The client_id and client_secret should be secured as strongly as one would secure credentials to a backend database.
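One way to keep the credentials out of source code is to read them from the process environment (or a secrets manager) at runtime. The sketch below builds, but does not send, an OAuth 2.0 client credentials token request. The environment variable names EDFI_CLIENT_ID and EDFI_CLIENT_SECRET and the function name build_token_request are hypothetical, and the token URL must be taken from the target service's own documentation.

```python
import base64
import os
import urllib.parse
import urllib.request


def build_token_request(token_url: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client credentials token request."""
    # Hypothetical variable names; a secrets manager would serve equally well.
    client_id = os.environ["EDFI_CLIENT_ID"]
    client_secret = os.environ["EDFI_CLIENT_SECRET"]

    # Credentials travel in an HTTP Basic authorization header, never in the URL.
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()

    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

Passing the resulting request to urllib.request.urlopen would send it. In a standard OAuth 2.0 exchange the JSON response carries an access_token, which the backend then presents as a bearer token on subsequent API calls.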
Batch and Save
How it works
The backend application retrieves optimized data from a local data store, thus improving the response time on the originating request. A separate ETL process runs on a schedule to pull data from the Ed-Fi API, reshape it according to the backend application’s needs, and place it into that data store.
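A minimal sketch of the two halves of this pattern follows, assuming SQLite as the local data store and a hypothetical student_summary table. The extract callable stands in for the code that pages through the Ed-Fi API; the table layout and the function names reshape, run_batch, and lookup are illustrative only.

```python
import json
import sqlite3
from typing import Callable, Iterable, Optional


def reshape(student: dict) -> tuple:
    """Flatten a student resource into the row shape the frontend needs."""
    full_name = f"{student['firstName']} {student['lastSurname']}"
    return (student["studentUniqueId"], full_name, json.dumps(student))


def run_batch(extract: Callable[[], Iterable[dict]], db: sqlite3.Connection) -> int:
    """One scheduled ETL run: extract, reshape, then upsert into the local store."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS student_summary ("
        "student_id TEXT PRIMARY KEY, full_name TEXT, raw TEXT)"
    )
    rows = [reshape(student) for student in extract()]
    with db:  # commits on success, rolls back on error
        db.executemany(
            "INSERT OR REPLACE INTO student_summary VALUES (?, ?, ?)", rows
        )
    return len(rows)


def lookup(db: sqlite3.Connection, student_id: str) -> Optional[str]:
    """The backend's read path: one local query, no call to the Ed-Fi API."""
    row = db.execute(
        "SELECT full_name FROM student_summary WHERE student_id = ?",
        (student_id,),
    ).fetchone()
    return row[0] if row else None


db = sqlite3.connect(":memory:")
sample = [{"studentUniqueId": "604822", "firstName": "Ana", "lastSurname": "Lopez"}]
count = run_batch(lambda: sample, db)
```

The read path never touches the Ed-Fi API, so its response time depends only on the local store. The cost is that the data is only as fresh as the most recent batch run.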
When to use
Use when user interface responsiveness is more critical than data freshness, and/or when a single “front end” request would generate more than some small number (2? 3?) of synchronous calls to the Ed-Fi API.
Implementation Notes
Scheduling
To optimize the batch scheduling, it may be useful to analyze the arrival time of data in the Ed-Fi API, potentially using queries on the backend database if it is accessible. If that database is not accessible, then try having a conversation with the service host to see if they can provide insight on the frequency and time of day when modifications are made in the Ed-Fi API.
Change Queries
If satisfying the frontend requirements only requires storing hundreds to thousands of records, it may be feasible to perform a full refresh of the data on a schedule. As the number of records to retrieve increases, a full refresh will take longer and can put significant strain on the Ed-Fi API. In such cases, the Change Queries API can be used to detect deleted records and to retrieve only new or updated records.
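An incremental synchronization built on change versions might look like the sketch below. It assumes the endpoint paths and the minChangeVersion and maxChangeVersion query parameters used by the Ed-Fi ODS/API's Change Queries feature, and it assumes that each returned record, including each delete notice, carries an id field; all of these should be confirmed against the target service. The fetch callable and the in-memory store are hypothetical stand-ins, and paging is omitted for brevity.

```python
from typing import Callable


def sync_changes(fetch: Callable[[str, dict], object], store: dict, last_version: int) -> int:
    """Apply changes made after `last_version` to `store`; return the new watermark."""
    # Assumed endpoint: reports the newest change version currently available.
    available = fetch("/changeQueries/v1/availableChangeVersions", {})
    newest = available["newestChangeVersion"]
    if newest <= last_version:
        return last_version  # nothing has changed since the previous run

    # Bound the window on both sides so that records changing while the
    # sync is running are picked up by the next run instead of being missed.
    window = {"minChangeVersion": last_version + 1, "maxChangeVersion": newest}

    # New or updated records within the window.
    for record in fetch("/ed-fi/students", window):
        store[record["id"]] = record

    # Records deleted within the window.
    for deleted in fetch("/ed-fi/students/deletes", window):
        store.pop(deleted["id"], None)

    return newest


def fake_fetch(path: str, params: dict) -> object:
    """Stand-in for real HTTP calls, used only to exercise the sketch."""
    if path.endswith("availableChangeVersions"):
        return {"oldestChangeVersion": 0, "newestChangeVersion": 42}
    if path.endswith("/deletes"):
        return [{"id": "b"}]
    return [{"id": "a", "firstName": "Ana"}]


store = {"b": {"id": "b"}}
watermark = sync_changes(fake_fetch, store, last_version=10)
```

The returned watermark would be persisted and passed in as last_version on the next scheduled run.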
Streaming Data
How it works
Ed-Fi Resources are copied into a streaming platform, such as Kafka, generally in real time. Another application reads from the data streams, transforms the data to fit the frontend application requirements, and saves the results into a local data store.
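The consuming application's role can be sketched as follows. The message envelope (id, deleted, and document fields) is entirely hypothetical, since the actual format depends on how the Ed-Fi service publishes to the stream. A plain Python iterable stands in for the streaming platform's consumer loop so that the sketch runs without a broker.

```python
import json
from typing import Iterable


def apply_stream(messages: Iterable[bytes], store: dict) -> None:
    """Consume change messages and keep a frontend-shaped local store current."""
    for raw in messages:
        event = json.loads(raw)
        key = event["id"]
        if event.get("deleted"):
            # A delete event removes the record from the local store.
            store.pop(key, None)
            continue
        # Transform the resource into the shape the frontend application needs.
        doc = event["document"]
        store[key] = {
            "studentId": doc["studentUniqueId"],
            "name": f"{doc['firstName']} {doc['lastSurname']}",
        }


stream = [
    json.dumps({"id": "a", "document": {
        "studentUniqueId": "604822", "firstName": "Ana", "lastSurname": "Lopez"}}).encode(),
    json.dumps({"id": "b", "document": {
        "studentUniqueId": "604823", "firstName": "Ben", "lastSurname": "Ortiz"}}).encode(),
    json.dumps({"id": "b", "deleted": True}).encode(),
]
local_store: dict = {}
apply_stream(stream, local_store)
```

In a real deployment the loop would iterate over messages delivered by the streaming platform's client library, and the store would be a database rather than a dictionary.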
When to use
This pattern inverts the ETL process described in the Batch and Save pattern, by pushing changed records into the database instead of requiring a scheduled pull operation. It combines the “real-time” benefit of direct API integration with the data storage optimization of Batch and Save.
Caution: this pattern is not advisable when using the Ed-Fi ODS/API Platform. Although technically feasible, it would require running Change Data Capture on the Ed-Fi ODS database. The result would be a data stream that looks like the ODS database, rather than looking like the Ed-Fi Unifying Data Model (as surfaced in the Ed-Fi API). Thus, this is in essence a more complex version of the Direct Database Interaction anti-pattern described above.
Other Ed-Fi API applications, such as the forthcoming Ed-Fi Data Management Service, or applications developed by parties other than the Ed-Fi Alliance, may support this pattern.
Implementation Notes
Frontend API Design Patterns
Backend-for-frontend
How it works
This pattern starts from the needs of a specific user interface, creating a finely-tuned API specification that optimizes data transfer for that application.
The backend-for-frontend service (BFF) then handles translation of the custom specification into requests
When to use
There is only a single front-end application that needs access to the Ed-Fi resources.
Implementation Notes
Central Aggregating Gateway
How it works
When to use
There are multiple front-end applications with different use cases.
Implementation Notes
GraphQL
How it works
When to use
Implementation Notes
Cross-cutting Concerns
Row-Level Authorization
Caching
Ed-Fi API Client Credentials
Define the problem:
Coarse-grained access to fine-grained resources (“chattiness”)
Network latency
Getting rid of composites
Alternate security protocols
Solution: create a specialized backend API, which sits “close” to the Ed-Fi API (for low network latency).
Central Aggregating Gateway: more generic design than a BFF, can support multiple user interfaces
“A central-purpose aggregating gateway sits between external user interfaces and downstream microservices and performs call filtering and aggregation for all user interfaces. Without aggregation, a user interface may have to make multiple calls to fetch required information, often throwing away data that was retrieved but not needed.”
BFF: designed for a single front-end
“The main distinction between a BFF and a central aggregating gateway is that a BFF is single purpose in nature—it is developed for a specific user interface.”
Source: Sam Newman, Building Microservices, 2nd ed., ch. 14, O’Reilly Media, Inc. https://learning.oreilly.com/library/view/building-microservices-2nd/9781492034018/ch14.html
Solves the impedance mismatch between systems.
Similar to the Facade pattern in OO systems
Alternatives
GraphQL - why not? Well-defined need is easier to express and reason about. Create a simple single definition of an API.
Design
BEF app needs a key and secret to an ODS/API. Maybe to many of them.
Analyze your authorization strategy and needs.
Cache local data for higher performance.
Possible implication: ETL
Change Queries (ETL!)
Show basic interaction
Definitely(?) implies caching.
Storing multiple tenants' data? Decide on a multi-tenancy pattern for segregating the data in the caching layer.
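As one possible illustration of the caching and multi-tenancy notes above, the sketch below keys every cache entry by tenant so that one tenant's data can never be served to another. The class name TenantCache, the time-to-live policy, and the tenant identifiers are all hypothetical; a shared cache service or separate stores per tenant are equally valid segregation choices.

```python
import time
from typing import Any, Callable, Hashable


class TenantCache:
    """In-memory cache whose entries are segregated by tenant and expire after a TTL."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so that expiry can be tested
        self._entries: dict = {}

    def get(self, tenant: str, key: Hashable, load: Callable[[], Any]) -> Any:
        """Return the cached value for (tenant, key), calling `load` on a miss."""
        cache_key = (tenant, key)  # the tenant is always part of the key
        now = self._clock()
        hit = self._entries.get(cache_key)
        if hit is not None and hit[0] > now:
            return hit[1]
        value = load()
        self._entries[cache_key] = (now + self._ttl, value)
        return value


now = [0.0]
calls = []
cache = TenantCache(ttl_seconds=60, clock=lambda: now[0])


def loader(tenant: str) -> Callable[[], str]:
    """Build a loader that records each trip to the (simulated) data source."""
    def load() -> str:
        calls.append(tenant)
        return f"roster for {tenant}"
    return load


a1 = cache.get("district-a", "roster", loader("district-a"))
a2 = cache.get("district-a", "roster", loader("district-a"))  # served from cache
b1 = cache.get("district-b", "roster", loader("district-b"))  # separate tenant entry
```

Because the tenant is part of every key, the same logical key ("roster") resolves to different entries for different tenants, and a repeat request within the time-to-live does not trigger another load.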
Evolutionary diagrams:
FE to BFF to BEF to ODS/API
Add caching layer and ETL
Resiliency patterns and more