The Ed-Fi Resources API, based on the Ed-Fi Data Standard, provides fine-grained access to educational data, modeled largely on the common denominators of the source systems that provide the data. For applications that consume data from an Ed-Fi API, this can result in a very “chatty” application integration: the consumer must make large numbers of calls over the network to retrieve the required data.

Furthermore, the authorization model in an Ed-Fi API is designed for client-server interactions, not for user interactions. Thus, the Ed-Fi API should not be used directly from a user interface.

In this article, we will explore design patterns and implementation concerns for building backend applications that address these problems.

Data Access Patterns

These patterns describe several different architectures for an application to access Ed-Fi data.

Direct Database Interaction

This is an anti-pattern.

How it works

A dedicated backend application interacts directly with the Ed-Fi database.

When to use

The Ed-Fi Alliance strongly discourages this pattern for the following reasons:

  1. It bypasses the authorization security in the Ed-Fi API, potentially affecting both read and write operations.

  2. This approach may put too much strain on the primary data storage, causing resource contention for other Ed-Fi client applications that are using the Ed-Fi API.

  3. The Ed-Fi database structure is not a standard. Thus, different implementations or even different versions of the same implementation could have unstated breaking changes at the database layer. For example, the Ed-Fi ODS/API Platform (https://edfi.atlassian.net/wiki/spaces/ODSAPIS3V72) and the Data Management Service Platform have very different backend database structures. An integration built on the ODS/API’s EdFi_ODS database would not be compatible with the Data Management Service’s database.

Implementation Notes

Although this pattern is no longer considered advisable, its performance benefits make it a tempting option, and many applications have been built on this model in the past. In such cases, it is advisable to limit the direct database interaction to read operations only, and to run from a read-only database copy. The copy could be a snapshot or replica. Using a read-only copy mitigates the resource contention concern on the primary database. Limiting the integration to read operations removes the write half of the authorization security concern; reads still bypass the authorization in the Ed-Fi API.

Also see Row-Level Security below.

Real-time API Interaction

How it works

A dedicated backend application interacts directly with the Ed-Fi API, translating the incoming coarse-grained request into many fine-grained requests that utilize the Ed-Fi Resources API or other API specifications implemented in the Ed-Fi service application.
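The translation of one coarse-grained request into several fine-grained ones can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `get_json` is a hypothetical callable standing in for an authenticated HTTP GET against the Ed-Fi API, and the resource paths and the `studentUniqueId` query parameter should be verified against the OpenAPI document of the target API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

# Hypothetical signature: takes a resource path and query parameters and
# returns the decoded JSON list from the Ed-Fi API.
GetJson = Callable[[str, dict[str, str]], list[dict[str, Any]]]

# Assumed resource paths; verify against the target API's OpenAPI document.
STUDENT_RESOURCES = {
    "student": "/ed-fi/students",
    "enrollments": "/ed-fi/studentSchoolAssociations",
    "sections": "/ed-fi/studentSectionAssociations",
}


def get_student_profile(get_json: GetJson, student_unique_id: str) -> dict[str, Any]:
    """Fan one coarse-grained request out into several fine-grained calls."""
    query = {"studentUniqueId": student_unique_id}
    # Issue the fine-grained calls concurrently to reduce total latency.
    with ThreadPoolExecutor(max_workers=len(STUDENT_RESOURCES)) as pool:
        futures = {
            key: pool.submit(get_json, path, query)
            for key, path in STUDENT_RESOURCES.items()
        }
        # Collect each result under the key the caller expects.
        return {key: future.result() for key, future in futures.items()}
```

Running the calls concurrently shortens the wait but does not reduce the load on the Ed-Fi API; each incoming request still produces one call per resource.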

When to use

Use when true “real-time” interaction with the Ed-Fi API is required. This pattern works best when only a small number of calls to the Ed-Fi API are needed or when the calling service can safely wait for an extended period of time. If many requests are required of the Ed-Fi API to fulfill the “user” application’s originating request, then there could be a substantial delay before responding. This might not be acceptable in a user interface application.

Caution: If this integration is intended to read data from the Ed-Fi API that were sourced from a different system, then real-time integration might not be feasible. Check to see if the other source system(s) have real-time or batched integrations. If batched, then help the end-users adjust their expectations about data freshness.

Implementation Notes

Ed-Fi API client credentials need to be managed directly in the backend application. The client_id and client_secret should be secured as strongly as one would secure credentials to a backend database.
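One way to keep the credentials out of source code is to read them from the process environment (or a secrets manager) when building the token request. The sketch below assumes an OAuth 2.0 client-credentials flow with HTTP Basic authentication; the environment variable names `EDFI_CLIENT_ID` and `EDFI_CLIENT_SECRET` are hypothetical, and the token URL must come from the API host.

```python
import base64
import os
import urllib.parse
import urllib.request


def build_token_request(token_url: str) -> urllib.request.Request:
    """Build (but do not send) an OAuth 2.0 client-credentials token request."""
    # Hypothetical variable names; a secrets manager would serve equally well.
    client_id = os.environ["EDFI_CLIENT_ID"]
    client_secret = os.environ["EDFI_CLIENT_SECRET"]
    # HTTP Basic authentication: base64 of "client_id:client_secret".
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    body = urllib.parse.urlencode({"grant_type": "client_credentials"}).encode()
    return urllib.request.Request(
        token_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )
```

The request would then be sent with `urllib.request.urlopen` over HTTPS, and the returned access token kept in memory rather than written to logs or disk.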

Application performance may be improved by caching some data from the Ed-Fi API if real-time updates are not required for those cached data.
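A simple time-to-live cache is one possible way to do this. The class below is an illustrative sketch, not part of any Ed-Fi library: `TtlCache` and `get_or_load` are hypothetical names, and the appropriate lifetime depends on how stale each kind of data may safely be.

```python
import time
from typing import Any, Callable


class TtlCache:
    """Cache values for a fixed number of seconds before reloading them."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # Injectable so that tests can control time.
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # Still fresh: skip the call to the Ed-Fi API.
        value = loader()  # Expired or missing: fetch and remember.
        self._entries[key] = (now, value)
        return value
```

Slowly changing data, such as descriptors or school lists, are natural candidates; data that users expect to see updated immediately are not.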

The Ed-Fi ODS/API Platform has a feature allowing API clients to use a read replica database. Using a read replica on GET requests can help reduce contention with systems that are actively writing to the API.

It may be useful to prepare for additional server load by monitoring resource consumption and performance and having a contingency plan for vertical scale-up (additional memory or CPU) and/or horizontal scale-out (additional nodes in clustered deployments).

Batch and Save

How it works

The backend application retrieves optimized data from a local data store, thus improving the response time on the originating request. A separate ETL process runs on a schedule to pull data from the Ed-Fi API, reshape it according to the backend application’s needs, and place it into the local data store.
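The scheduled ETL step might look like the following sketch, which uses SQLite as a stand-in for the local data store. The `fetch_students` callable is a hypothetical placeholder for paged GET requests against the Ed-Fi API, and the `student_summary` table and its columns are invented for illustration.

```python
import sqlite3
from typing import Any, Callable, Iterable


def run_etl(
    fetch_students: Callable[[], Iterable[dict[str, Any]]],
    conn: sqlite3.Connection,
) -> int:
    """Pull student records, reshape them, and save them locally."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS student_summary ("
        "student_unique_id TEXT PRIMARY KEY, full_name TEXT NOT NULL)"
    )
    # Transform: keep only what the frontend needs, in the shape it needs.
    rows = [
        (s["studentUniqueId"], f'{s["firstName"]} {s["lastSurname"]}')
        for s in fetch_students()
    ]
    # Load: replace existing rows so that repeated runs are idempotent.
    with conn:  # Commits on success, rolls back on error.
        conn.executemany(
            "INSERT OR REPLACE INTO student_summary "
            "(student_unique_id, full_name) VALUES (?, ?)",
            rows,
        )
    return len(rows)
```

The backend application then answers user requests from `student_summary` alone, without calling the Ed-Fi API on the request path.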

When to use

Use when user interface responsiveness is more critical than data freshness, and/or when a single “front end” request would generate more than a small number (roughly two or three) of synchronous calls to the Ed-Fi API.

This pattern is also advisable when further preparation is necessary before using the data – the “transform” portion of “ETL”.

Implementation Notes

API Credentials

Ed-Fi API client credentials will need to be managed in the ETL application. The client_id and client_secret should be secured as strongly as one would secure credentials to a backend database.

Scheduling

To optimize the batch scheduling, it may be useful to analyze the arrival time of data in the Ed-Fi API, potentially using queries on the backend database if it is accessible. If that database is not accessible, then try having a conversation with the service host to see if they can provide insight on the frequency and time of day when modifications are made in the Ed-Fi API.

Change Queries

If satisfying the frontend requirements only requires storing hundreds to thousands of records, it may be feasible to perform a full refresh of the data on a schedule. As the number of records to retrieve increases, a full refresh will take longer and can put significant strain on the Ed-Fi API. In such cases, the Change Queries API can be used to detect deleted records and to look for new or updated records.
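An incremental refresh based on change versions could be sketched as below. Treat the details as assumptions to verify against the Change Queries documentation for the API version in use: the `availableChangeVersions` path, the `minChangeVersion` and `maxChangeVersion` parameters, the `/deletes` sub-resource, and the `id` field name. `get_json` is a hypothetical authenticated GET helper, and paging is omitted for brevity.

```python
from typing import Any, Callable

# Hypothetical helper: performs an authenticated GET and returns decoded JSON.
GetJson = Callable[[str, dict[str, Any]], Any]


def sync_changes(
    get_json: GetJson,
    resource_path: str,
    last_synced_version: int,
    store: dict[str, dict[str, Any]],
) -> int:
    """Apply changes made since the last sync; return the new high-water mark."""
    available = get_json("/changeQueries/v1/availableChangeVersions", {})
    newest = available["newestChangeVersion"]
    if newest <= last_synced_version:
        return last_synced_version  # Nothing new since the previous run.
    # A closed window keeps the sync consistent while writes continue.
    window = {
        "minChangeVersion": last_synced_version + 1,
        "maxChangeVersion": newest,
    }
    for record in get_json(resource_path, window):  # New or updated records.
        store[record["id"]] = record
    for deleted in get_json(f"{resource_path}/deletes", window):  # Deletions.
        store.pop(deleted["id"], None)
    return newest
```

The returned value should be persisted and passed in as `last_synced_version` on the next scheduled run.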

Streaming Data

How it works

Ed-Fi Resources are copied into a streaming platform, such as Kafka, generally in real-time. Another application reads from the data streams, transforms data to fit the frontend application requirements, and saves the results into a local data store.
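The reading application's core loop can be sketched without committing to a particular streaming client. In the example below, `messages` stands in for whatever a Kafka consumer (or similar) would yield, and the event shape (`op`, `id`, `document`) is a hypothetical format chosen for illustration; a real deployment would follow the format emitted by its own stream producer.

```python
import json
from typing import Any, Iterable


def apply_events(messages: Iterable[bytes], store: dict[str, dict[str, Any]]) -> None:
    """Transform streamed resource events and apply them to a local store."""
    for raw in messages:
        event = json.loads(raw)
        if event["op"] == "delete":
            store.pop(event["id"], None)  # Remove the local copy, if present.
            continue
        doc = event["document"]
        # Reshape the Ed-Fi resource into what the frontend needs.
        store[event["id"]] = {
            "studentUniqueId": doc["studentUniqueId"],
            "displayName": f'{doc["lastSurname"]}, {doc["firstName"]}',
        }
```

Because each event is applied as an upsert or delete keyed by `id`, replaying the same messages after a consumer restart leaves the store in the same state.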

When to use

This pattern inverts the ETL process described in the Batch and Save pattern, by pushing changed records into the database instead of requiring a scheduled pull operation. It combines the “real-time” benefit of direct API integration with the data storage optimization of Batch and Save.

This architecture would be most appropriate when the end-user application is managed by the same organization that is managing the Ed-Fi API. Otherwise, it may be difficult to overcome the network and authorization security challenges between two different parties.

Caution: this pattern is not advisable when using the Ed-Fi ODS/API Platform. Although technically feasible, it would require running Change Data Capture on the Ed-Fi ODS database. The result would be a data stream that looks like the ODS database, rather than looking like the Ed-Fi Unifying Data Model (as surfaced in the Ed-Fi API). Thus, this is in essence a more complex version of the Direct Database Interaction anti-pattern described above.

Other Ed-Fi API applications, such as the forthcoming Ed-Fi Data Management Service, or applications developed by parties other than the Ed-Fi Alliance, may support this pattern.

Implementation Notes

This is an emergent pattern that the Ed-Fi community has not widely used. It requires deployment of several additional components that are not present in other patterns (stream processor, change data capture, etc.).

Frontend API Design Patterns

These patterns describe common approaches for building an end-user application that uses one of the Data Access patterns above to access the Ed-Fi data.

Backend-for-frontend

How it works

This pattern starts from the needs of a specific user interface, creating a finely tuned API specification that optimizes data transfer for that application. The backend-for-frontend service (BFF) then handles translation of the custom specification into requests for data from a local data store or from the Ed-Fi API, following one of the Data Access Patterns above.
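As an illustration of a specification tuned to one screen, the sketch below shapes a class roster response for a single hypothetical user interface. `RosterRow`, `class_roster_view`, and the `load_roster` callable are all invented for this example; `load_roster` would be implemented with one of the Data Access Patterns above.

```python
from dataclasses import asdict, dataclass
from typing import Any, Callable, Iterable


@dataclass(frozen=True)
class RosterRow:
    """Exactly the fields the roster screen displays, and no others."""

    student_unique_id: str
    display_name: str
    absences: int


def class_roster_view(
    section_id: str,
    load_roster: Callable[[str], Iterable[RosterRow]],
) -> dict[str, Any]:
    """Return one response containing everything the roster screen needs."""
    rows = sorted(load_roster(section_id), key=lambda row: row.display_name)
    return {
        "sectionId": section_id,
        "count": len(rows),
        "students": [asdict(row) for row in rows],
    }
```

The user interface makes one call and receives data already sorted and trimmed, rather than assembling it from several Ed-Fi resources.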

When to use

Use when only a single front-end application needs access to the Ed-Fi resources.

Implementation Notes

See Row-Level Security below.

Also see: Backends for Frontends pattern - Azure Architecture Center | Microsoft Learn

Central Aggregating Gateway

How it works

Like the BFF pattern, this pattern creates a custom API that is more appropriate to the use case than the Ed-Fi API, aggregating what would otherwise be multiple calls to fetch data into a single (or fewer, at least) call to the gateway service. Unlike the BFF, the aggregating gateway is generalized to support multiple use cases or applications. It may even present a GraphQL interface instead of a REST interface.

When to use

Use when multiple applications need access to Ed-Fi data, with strong overlap in the data required. If the required data sets are very different, then the optimization of a BFF service may be a better fit for purpose.

Implementation Notes

See Row-Level Security below.

Also see: Gateway Aggregation pattern - Azure Architecture Center | Microsoft Learn

Row-Level Security

The Family Educational Rights and Privacy Act (FERPA) outlines certain data privacy rights for students, plus the rules by which student data can be shared with anyone other than the student or parent. Systems that utilize Ed-Fi data must provide appropriate data security so that school officials, parents, and so forth are only authorized to view "need-to-know" records. What is appropriate may vary from state to state and district to district.

In K–12 scenarios, several common roles clearly require different degrees of authorization to view student data:

  • Superintendents see data for all students in their district.

  • Principals see data for all students in their school.

  • Teachers see data for all students in their classes.

  • Parents see data for all their children.

  • Students see only data for themselves.

Real-world usage might not map job titles to data authorization levels in such a simple manner. There may be district employees other than superintendents who need access to all students. An Assistant Principal might take the lead on checking an Early Warning system. Rather than speaking about roles, it may be more useful to speak of access scopes, such as:

  • District

  • School

  • Section

Each application will need to devise its own mechanism for determining the correct scope of access for a user.
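Once a user's scope has been determined, applying it can be as simple as the filter sketched below. All names here (`Scope`, `AccessGrant`, `StudentRecord`, `can_view`, `visible_students`) are hypothetical, and the sketch assumes each student record already carries its district, school, and section identifiers; how a grant is assigned to a user is left to the application.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    DISTRICT = "district"
    SCHOOL = "school"
    SECTION = "section"


@dataclass(frozen=True)
class AccessGrant:
    scope: Scope
    ids: frozenset[str]  # District, school, or section identifiers.


@dataclass(frozen=True)
class StudentRecord:
    student_unique_id: str
    district_id: str
    school_id: str
    section_ids: frozenset[str]


def can_view(grant: AccessGrant, student: StudentRecord) -> bool:
    """Return True when the student falls inside the user's access scope."""
    if grant.scope is Scope.DISTRICT:
        return student.district_id in grant.ids
    if grant.scope is Scope.SCHOOL:
        return student.school_id in grant.ids
    # Section scope: the user and student must share at least one section.
    return bool(grant.ids & student.section_ids)


def visible_students(grant: AccessGrant, students: list[StudentRecord]) -> list[StudentRecord]:
    """Filter records down to those the user is authorized to view."""
    return [s for s in students if can_view(grant, s)]
```

Applying the filter in the backend service, rather than in the user interface, keeps unauthorized records from ever leaving the server.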
