The Ed-Fi Resources API, based on the Ed-Fi Data Standard, provides fine-grained access to educational data, modeled largely on the common denominators of the source systems that provide the data. For applications that consume data from an Ed-Fi API, this can result in a very “chatty” application integration: the consumer must make large numbers of calls over the network to retrieve the required data.
Furthermore, the authorization model in an Ed-Fi API is designed for client-server interactions, not for user interactions. Thus, the Ed-Fi API should not be used directly from a user interface.
In this article, we will explore design patterns and implementation concerns for building backend applications that address these problems.
Data Access Patterns
In the data access patterns below, user applications or other backend services call another service that sits between them and the Ed-Fi service host. That intermediate service, which is an Ed-Fi client application, could in turn provide data to its callers through many different access patterns. The most common of these are described below under the heading of Frontend API Design Patterns.
Direct Database Interaction
This is an anti-pattern.
How it works
A dedicated backend application interacts directly with the Ed-Fi database.
When to use
The Ed-Fi Alliance strongly discourages this pattern for the following reasons:
It bypasses the authorization security in the Ed-Fi API, potentially affecting both read and write operations.
This approach may put too much strain on the primary data storage, causing resource contention for other Ed-Fi client applications that are using the Ed-Fi API.
The Ed-Fi database structure is not a standard. Thus, different implementations or even different versions of the same implementation could have unstated breaking changes at the database layer. For example, the Ed-Fi ODS/API Platform and the Ed-Fi Data Management Service (unreleased at the time of writing) have very different backend database structures. An integration built on the ODS/API’s EdFi_ODS database would not be compatible with the Data Management Service’s database.
Implementation Notes
Although this pattern is no longer considered advisable, its performance benefits make it a tempting option, and many applications have been built on this model in the past. In such cases, it is advisable to limit the direct database interaction to read operations only, and to run from a read-only database copy. The copy could be a snapshot or replica. Using a read-only copy mitigates the resource contention concern on the primary database. Limiting the integration to read operations eliminates the write half of the authorization security concern.
Also see Row-Level Authorization below.
Real-time API Interaction
How it works
A dedicated backend application interacts directly with the Ed-Fi API, translating the incoming coarse-grained request into many fine-grained requests that utilize the Ed-Fi Resources API or other API specifications implemented in the Ed-Fi service application.
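The coarse-to-fine translation could be sketched roughly as follows. This is only an illustration under stated assumptions: the `fetch` callable is a hypothetical wrapper around an authenticated HTTP GET, and the resource paths and the `studentUniqueId` filter are examples to be checked against the API specification of the actual Ed-Fi host. The fine-grained calls are issued concurrently so that the total wait is closer to the slowest call than to the sum of all calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

# Hypothetical signature: takes a resource path and query parameters and
# returns the decoded JSON array from the Ed-Fi API.
Fetch = Callable[[str, Dict[str, str]], List[Dict[str, Any]]]


def get_student_summary(fetch: Fetch, student_unique_id: str) -> Dict[str, Any]:
    """Answer one coarse-grained request with several fine-grained API calls."""
    # Illustrative resource paths; confirm them for your Ed-Fi host.
    paths = {
        "student": "/ed-fi/students",
        "enrollments": "/ed-fi/studentSchoolAssociations",
        "sections": "/ed-fi/studentSectionAssociations",
    }
    params = {"studentUniqueId": student_unique_id}

    # Run the fine-grained requests in parallel threads.
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        futures = {name: pool.submit(fetch, path, params) for name, path in paths.items()}
        results = {name: future.result() for name, future in futures.items()}

    students = results["student"]
    return {
        "student": students[0] if students else None,
        "enrollments": results["enrollments"],
        "sections": results["sections"],
    }
```

Injecting `fetch` keeps the aggregation logic independent of the HTTP client and makes it straightforward to test without a live Ed-Fi API.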
When to use
Use when true “real-time” interaction with the Ed-Fi API is required. This pattern works best when only a small number of calls to the Ed-Fi API are needed or when the calling service can safely wait for an extended period of time. If many requests are required of the Ed-Fi API to fulfill the “user” application’s originating request, then there could be a substantial delay before responding. This might not be acceptable in a user interface application.
Caution: If this integration is intended to access data from the Ed-Fi API that were sourced from a different system, then real-time integration might not be feasible. Check to see if the other source system(s) have real-time or batched integrations. If not, help the end-users adjust their expectations about data freshness.
Implementation Notes
Ed-Fi API client credentials will need to be managed directly in the backend application. The client_id and client_secret should be secured as strongly as one would secure credentials to a backend database.
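One way to keep the credentials out of source code is to read them from the environment (or from a secrets manager that populates the environment) at startup. The sketch below assumes hypothetical variable names `EDFI_CLIENT_ID` and `EDFI_CLIENT_SECRET`, and assumes the host accepts an OAuth 2.0 client credentials request with HTTP Basic authentication; the token URL must come from the service host. The function only builds the request and does not send it.

```python
import base64
import os
import urllib.parse
import urllib.request
from typing import Mapping, NamedTuple


class ApiCredentials(NamedTuple):
    client_id: str
    client_secret: str


def load_credentials(env: Mapping[str, str] = os.environ) -> ApiCredentials:
    """Read credentials from the environment so they never live in source control."""
    # EDFI_CLIENT_ID and EDFI_CLIENT_SECRET are hypothetical variable names.
    try:
        return ApiCredentials(env["EDFI_CLIENT_ID"], env["EDFI_CLIENT_SECRET"])
    except KeyError as missing:
        raise RuntimeError(f"Missing required setting: {missing.args[0]}") from None


def build_token_request(token_url: str, creds: ApiCredentials) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client credentials token request."""
    pair = f"{creds.client_id}:{creds.client_secret}".encode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    return urllib.request.Request(
        token_url,
        data=body,
        headers={
            "Authorization": "Basic " + base64.b64encode(pair).decode(),
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Failing fast when a setting is missing avoids a backend that starts successfully and then fails on the first user request.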
Batch and Save
How it works
The backend application retrieves optimized data from a local data store, thus improving the response time on the originating request. A separate ETL process runs on a schedule to pull data from the Ed-Fi API, reshape it according to the backend application’s needs, and place it into the shared data store.
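A minimal sketch of the two halves of this pattern is shown below, using SQLite as a stand-in for the shared data store. The `fetch_students` callable is a hypothetical function that pages through the Ed-Fi API and yields student records; the `student_roster` table and the display-name reshaping are invented for illustration only.

```python
import sqlite3
from typing import Any, Callable, Dict, Iterable, Optional, Tuple


def reshape(student: Dict[str, Any]) -> Tuple[str, str]:
    """Flatten an Ed-Fi student record into the row shape the frontend needs."""
    display_name = f'{student["lastSurname"]}, {student["firstName"]}'
    return (student["studentUniqueId"], display_name)


def run_batch(
    fetch_students: Callable[[], Iterable[Dict[str, Any]]],
    conn: sqlite3.Connection,
) -> int:
    """Scheduled ETL step: pull from the API, reshape, and save locally."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS student_roster ("
        "student_id TEXT PRIMARY KEY, display_name TEXT NOT NULL)"
    )
    rows = [reshape(student) for student in fetch_students()]
    with conn:  # one transaction, so readers never see a half-applied batch
        conn.executemany(
            "INSERT OR REPLACE INTO student_roster (student_id, display_name) VALUES (?, ?)",
            rows,
        )
    return len(rows)


def get_display_name(conn: sqlite3.Connection, student_id: str) -> Optional[str]:
    """Request-time read: served from the local store, with no Ed-Fi API call."""
    row = conn.execute(
        "SELECT display_name FROM student_roster WHERE student_id = ?", (student_id,)
    ).fetchone()
    return row[0] if row else None
```

The scheduler itself (cron, a workflow engine, and so on) is outside the scope of this sketch; it would simply invoke `run_batch` at the chosen interval.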
When to use
Use when user interface responsiveness is more critical than data freshness, or when a single “front end” request would generate more than a small number (roughly two or three) of synchronous calls to the Ed-Fi API.
Implementation Notes
API Credentials
Ed-Fi API client credentials will need to be managed in the ETL application. The client_id and client_secret should be secured as strongly as one would secure credentials to a backend database.
Scheduling
To optimize the batch scheduling, it may be useful to analyze the arrival time of data in the Ed-Fi API, potentially using queries on the backend database if it is accessible. If that database is not accessible, then ask the service host whether they can provide insight into the frequency and time of day when modifications are made in the Ed-Fi API.
Change Queries
If satisfying the frontend requirements only requires storing hundreds to thousands of records, it may be feasible to perform a full refresh of the data on a schedule. As the number of records to retrieve increases, a full refresh will take longer and can put significant strain on the Ed-Fi API. In such cases, the Change Queries API can be used to detect deleted records and to look for new or updated records.
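An incremental refresh built on change versions might look like the sketch below. It assumes the ETL process remembers a watermark (the last change version it fully processed) and that the newest available change version has already been read from the API. The `fetch_changed` and `fetch_deleted` callables are hypothetical wrappers; how they map to concrete endpoints and query parameters should be confirmed against the Change Queries documentation for the Ed-Fi host in use. The sketch also assumes that a delete is final within a window, so deletes are applied after upserts.

```python
from typing import Any, Callable, Dict, List

# Hypothetical wrappers: each takes an inclusive (min_version, max_version)
# window and returns the matching records from the Ed-Fi API.
WindowFetch = Callable[[int, int], List[Dict[str, Any]]]


def sync_changes(
    store: Dict[str, Dict[str, Any]],
    watermark: int,
    newest_version: int,
    fetch_changed: WindowFetch,
    fetch_deleted: WindowFetch,
) -> int:
    """Apply one window of changes to the local store; return the new watermark."""
    if newest_version <= watermark:
        return watermark  # nothing new since the last run

    low, high = watermark + 1, newest_version

    # New and updated records overwrite the local copy, keyed by resource id.
    for record in fetch_changed(low, high):
        store[record["id"]] = record

    # Deleted records are removed; ids that were never stored are ignored.
    for deleted in fetch_deleted(low, high):
        store.pop(deleted["id"], None)

    return high
```

Bounding the window with a fixed upper version means that records modified while the sync is running are picked up by the next run rather than partially captured by this one.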
Streaming Data
How it works
Ed-Fi Resources are copied into a streaming platform, such as Kafka, generally in real-time. Another application reads from the data streams, transforms data to fit the frontend application requirements, and saves the results into a local data store.
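The consuming side of this pattern could be sketched as below. To stay independent of any particular streaming client library, the stream is modeled as a plain iterable of messages. The message shape (a `key` plus a `value` that is `None` for a deletion, similar to a tombstone on a compacted Kafka topic) and the reshaping step are assumptions made for illustration; a real deployment would define its own message format.

```python
from typing import Any, Dict, Iterable, Optional

# Assumed message shape: {"key": <resource id>, "value": <resource dict or None>}.
Message = Dict[str, Any]


def transform(student: Dict[str, Any]) -> Dict[str, str]:
    """Reshape an Ed-Fi student resource to fit the frontend's needs."""
    return {
        "student_id": student["studentUniqueId"],
        "display_name": f'{student["lastSurname"]}, {student["firstName"]}',
    }


def apply_stream(messages: Iterable[Message], store: Dict[str, Dict[str, str]]) -> int:
    """Apply each streamed change to the local store; return the number processed."""
    processed = 0
    for message in messages:
        value: Optional[Dict[str, Any]] = message["value"]
        if value is None:
            # A null value is treated as a deletion of the keyed resource.
            store.pop(message["key"], None)
        else:
            store[message["key"]] = transform(value)
        processed += 1
    return processed
```

Here `store` is an in-memory dictionary standing in for the local data store; the same loop would apply to a database table keyed by resource identifier.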
When to use
This pattern inverts the ETL process described in the Batch and Save pattern, by pushing changed records into the database instead of requiring a scheduled pull operation. It combines the “real-time” benefit of direct API integration with the data storage optimization of Batch and Save.
This architecture would be most appropriate when the end-user application is managed by the same organization that is managing the Ed-Fi API. Otherwise, it may be difficult to overcome the network and authorization security challenges between two different parties.
Caution: This pattern is not advisable when using the Ed-Fi ODS/API Platform. Although technically feasible, it would require running Change Data Capture on the Ed-Fi ODS database. The resulting data stream would mirror the structure of the ODS database rather than the Ed-Fi Unifying Data Model (as surfaced in the Ed-Fi API). In essence, this is a more complex version of the Direct Database Interaction anti-pattern described above.
Other Ed-Fi API applications, such as the forthcoming Ed-Fi Data Management Service, or applications developed by parties other than the Ed-Fi Alliance, may support this pattern.
Implementation Notes
This is an emergent pattern that the Ed-Fi community has not widely used. It requires deployment of several additional components that are not present in other patterns (stream processor, change data capture, etc.).
Frontend API Design Patterns
Coming soon