Work-in-progress draft, May 2024.

The Ed-Fi Resources API, based on the Ed-Fi Data Standard, provides fine-grained access to educational data, modeled largely on the common denominators of the source systems that provide the data. For applications that consume data from an Ed-Fi API, this can result in a very “chatty” application integration: the consumer must make large numbers of calls over the network to retrieve the required data.

Furthermore, the authorization model in an Ed-Fi API is designed for client-server interactions, not for user interactions. Thus, the Ed-Fi API should not be used directly from a user interface.

In this article, we will explore design patterns and implementation concerns for building backend applications that address these problems.

Data Access Patterns

In the data access patterns below, user applications or other backend services call an intermediary service that sits between them and the Ed-Fi service host. That intermediary, which is itself an Ed-Fi client application, could expose data to its callers in many different ways. The most common of these are described below under the heading Frontend API Design Patterns.

Direct Database Interaction

This is an anti-pattern.

How it works

A dedicated backend application interacts directly with the Ed-Fi database.

When to use

The Ed-Fi Alliance strongly discourages this pattern for the following reasons:

  1. It bypasses the authorization security in the Ed-Fi API, potentially affecting both read and write operations.

  2. This approach may put too much strain on the primary data storage, causing resource contention for other Ed-Fi client applications that are using the Ed-Fi API.

  3. The Ed-Fi database structure is not a standard. Thus, different implementations or even different versions of the same implementation could have unstated breaking changes at the database layer. For example, the Ed-Fi ODS/API Platform and the Ed-Fi Data Management Service (unreleased at the time of writing) have very different backend database structures. An integration built on the ODS/API’s EdFi_ODS database would not be compatible with the Data Management Service’s database.

Implementation Notes

Although this pattern is no longer considered advisable, its performance benefits make it a tempting option, and many applications have been built on this model in the past. In such cases, it is advisable to limit the direct database interaction to read operations only, and to run from a read-only database copy. The copy could be a snapshot or replica. Using a read-only copy mitigates the resource contention concern on the primary database. Limiting to read operations removes the write half of the authorization security concern; read operations still bypass the Ed-Fi API's authorization.

Also see Row-Level Authorization below.

Real-time API Interaction

How it works

A dedicated backend application interacts directly with the Ed-Fi API, translating the incoming coarse-grained request into many fine-grained requests that use the Ed-Fi Resources API or other API specifications implemented in the Ed-Fi service application.
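As a rough illustration of this translation, the sketch below answers one coarse-grained "student summary" request by issuing several fine-grained calls concurrently. It is a minimal sketch under stated assumptions, not a reference implementation: the `fetch` callable, the `get_student_summary` function, the shape of the summary, and the specific resource paths are all hypothetical choices for illustration. Confirm the actual paths and query parameters against the API metadata published by the Ed-Fi service host.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable

# Hypothetical signature: fetch(path, query_params) -> list of JSON records.
# A real implementation would wrap an authenticated HTTP client.
Fetch = Callable[[str, dict[str, str]], list[dict[str, Any]]]


def get_student_summary(fetch: Fetch, student_unique_id: str) -> dict[str, Any]:
    """Answer one coarse-grained request with several fine-grained calls."""
    params = {"studentUniqueId": student_unique_id}
    # Illustrative resource paths (assumed, not taken from the article).
    paths = {
        "student": "/ed-fi/students",
        "enrollments": "/ed-fi/studentSchoolAssociations",
        "sections": "/ed-fi/studentSectionAssociations",
    }
    # The calls are independent, so run them concurrently to limit latency.
    with ThreadPoolExecutor(max_workers=len(paths)) as pool:
        futures = {key: pool.submit(fetch, path, params) for key, path in paths.items()}
        results = {key: future.result() for key, future in futures.items()}

    student = results["student"][0] if results["student"] else None
    return {
        "student": student,
        "schoolIds": [e["schoolReference"]["schoolId"] for e in results["enrollments"]],
        "sectionCount": len(results["sections"]),
    }
```

Passing `fetch` in as a parameter keeps the aggregation logic independent of any particular HTTP library and makes it easy to exercise with canned responses.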

When to use

Use when true “real-time” interaction with the Ed-Fi API is required. This pattern works best when only a small number of calls to the Ed-Fi API are needed or when the calling service can safely wait for an extended period of time. If many requests are required of the Ed-Fi API to fulfill the “user” application’s originating request, then there could be a substantial delay before responding. This might not be acceptable in a user interface application.

Caution: If this integration is intended to access data from the Ed-Fi API that were sourced from a different system, then real-time integration might not be feasible. Check whether the other source systems write to the Ed-Fi API in real time or in batches. If they use batches, help the end-users adjust their expectations about data freshness.

Implementation Notes

Ed-Fi API client credentials will need to be managed directly in the backend application. The client_id and client_secret should be secured as strongly as one would secure credentials to a backend database.
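One possible way to handle the credentials is sketched below: the backend reads them from environment variables rather than source code or configuration files, and caches the access token until shortly before it expires. This is only a sketch under assumptions. The environment variable names `EDFI_CLIENT_ID` and `EDFI_CLIENT_SECRET`, the `TokenCache` class, and the `request_token` callable (which would perform the actual token request against the service host) are hypothetical names introduced for illustration.

```python
import os
import time
from typing import Callable, Optional

# Hypothetical callable: (client_id, client_secret) -> (access_token, expires_in_seconds).
TokenRequester = Callable[[str, str], tuple[str, float]]


class TokenCache:
    """Reuse an access token until shortly before it expires."""

    def __init__(
        self,
        request_token: TokenRequester,
        clock: Callable[[], float] = time.monotonic,
        margin_seconds: float = 30.0,
    ) -> None:
        self._request_token = request_token
        self._clock = clock
        self._margin = margin_seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            # Credentials come from the environment (assumed variable names),
            # never from source code.
            client_id = os.environ["EDFI_CLIENT_ID"]
            client_secret = os.environ["EDFI_CLIENT_SECRET"]
            token, expires_in = self._request_token(client_id, client_secret)
            self._token = token
            # Refresh a little early so in-flight requests do not fail.
            self._expires_at = now + expires_in - self._margin
        return self._token
```

In a production deployment, a secrets manager would usually be a stronger choice than plain environment variables; the caching logic would stay the same.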

Batch and Save

How it works

The backend application retrieves optimized data from a local data store, thus improving the response time on the originating request. A separate ETL process runs on a schedule to pull data from the Ed-Fi API, reshape it according to the backend application’s needs, and place it into that local data store.
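The two halves of this pattern might look roughly like the following sketch, which uses SQLite as a stand-in for the local data store. Everything here is an assumption made for illustration: the `student_summary` table, the `reshape`, `load`, and `get_student_name` functions, and the choice of which student fields the frontend needs. The extraction step (calling the Ed-Fi API on a schedule) is omitted; `students` is whatever list of records that step produced.

```python
import sqlite3
from typing import Any, Iterable, Optional


def reshape(student: dict[str, Any]) -> tuple[str, str]:
    """Flatten one student record into the row shape the frontend needs."""
    full_name = f"{student['firstName']} {student['lastSurname']}"
    return (student["studentUniqueId"], full_name)


def load(conn: sqlite3.Connection, students: Iterable[dict[str, Any]]) -> int:
    """ETL side: reshape records and upsert them into the local store."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS student_summary ("
        "student_unique_id TEXT PRIMARY KEY, full_name TEXT NOT NULL)"
    )
    rows = [reshape(s) for s in students]
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT OR REPLACE INTO student_summary VALUES (?, ?)", rows)
    return len(rows)


def get_student_name(conn: sqlite3.Connection, student_unique_id: str) -> Optional[str]:
    """Backend side: serve the request from the local store, not the Ed-Fi API."""
    row = conn.execute(
        "SELECT full_name FROM student_summary WHERE student_unique_id = ?",
        (student_unique_id,),
    ).fetchone()
    return row[0] if row else None
```

The backend's read path never touches the network, so its response time depends only on the local query.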

When to use

Use when user interface responsiveness is more critical than data freshness, and/or when a single “front end” request would generate more than some small number (2? 3?) of synchronous calls to the Ed-Fi API.

Implementation Notes

API Credentials

Ed-Fi API client credentials will need to be managed in the ETL application. The client_id and client_secret should be secured as strongly as one would secure credentials to a backend database.

Scheduling

To optimize the batch scheduling, it may be useful to analyze the arrival time of data in the Ed-Fi API, potentially using queries on the backend database if it is accessible. If that database is not accessible, then try having a conversation with the service host to see if they can provide insight on the frequency and time of day when modifications are made in the Ed-Fi API.

Change Queries

If satisfying the frontend requirements only requires storing hundreds to thousands of records, it may be feasible to perform a full refresh of the data on schedule. As the number of records to retrieve increases, a full refresh will take longer and can put significant strain on the Ed-Fi API. In such cases, the Change Queries API can be used to detect deleted records and to look for new or updated records.
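An incremental refresh along these lines could be sketched as follows. Treat the details as assumptions to verify against the service host's Change Queries documentation: the `availableChangeVersions` path, the `minChangeVersion` and `maxChangeVersion` parameters, the `/deletes` sub-resource, and the `id` field on each record reflect the author of this sketch's understanding of the Ed-Fi ODS/API and may differ by version. The `fetch` callable, the `sync_resource` function, and the dictionary used as a local store are hypothetical, and paging is omitted for brevity.

```python
from typing import Any, Callable

# Hypothetical signature: fetch(path, query_params) -> parsed JSON body.
Fetch = Callable[[str, dict[str, Any]], Any]


def sync_resource(
    fetch: Fetch,
    store: dict[str, dict[str, Any]],
    resource: str,
    last_version: int,
) -> int:
    """Apply changes made since last_version; return the new high-water mark."""
    versions = fetch("/changeQueries/v1/availableChangeVersions", {})
    newest = versions["newestChangeVersion"]
    if newest <= last_version:
        return last_version  # nothing has changed since the previous run

    # Bound the window on both sides so concurrent writes are picked up next run.
    window = {"minChangeVersion": last_version + 1, "maxChangeVersion": newest}

    # New and updated records overwrite the local copy.
    for record in fetch(resource, window):
        store[record["id"]] = record

    # Deleted records are removed from the local copy.
    for deleted in fetch(f"{resource}/deletes", window):
        store.pop(deleted["id"], None)

    return newest
```

The caller would persist the returned version and pass it back as `last_version` on the next scheduled run.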

Streaming Data

How it works

Ed-Fi Resources are copied into a streaming platform, such as Kafka, generally in real-time. Another application reads from the data streams, transforms data to fit the frontend application requirements, and saves the results into a local data store.
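The consuming side of this flow can be sketched without committing to a particular streaming client. The sketch below assumes each event arrives as a `(key, value)` pair and that a `None` value signals a deletion, which mirrors a common tombstone convention on Kafka topics but is not something the source specifies. The `apply_events` function, the `transform` callable, and the dictionary standing in for the local data store are hypothetical names for illustration.

```python
from typing import Any, Callable, Iterable, Optional

# Assumed event shape: (resource key, JSON body or None for a deletion).
Event = tuple[str, Optional[dict[str, Any]]]


def apply_events(
    events: Iterable[Event],
    store: dict[str, dict[str, Any]],
    transform: Callable[[dict[str, Any]], dict[str, Any]],
) -> int:
    """Read events in order, reshape them, and keep the local store current."""
    applied = 0
    for key, value in events:
        if value is None:
            # Tombstone: the resource was deleted upstream.
            store.pop(key, None)
        else:
            # Reshape the resource to fit the frontend application's needs.
            store[key] = transform(value)
        applied += 1
    return applied
```

In a real deployment, `events` would be an iterator over messages from the streaming platform's consumer client, and the store would be a database rather than an in-memory dictionary.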

When to use

This pattern inverts the ETL process described in the Batch and Save pattern, by pushing changed records into the database instead of requiring a scheduled pull operation. It combines the “real-time” benefit of direct API integration with the data storage optimization of Batch and Save.

This architecture would be most appropriate when the end-user application is managed by the same organization that is managing the Ed-Fi API. Otherwise, it may be difficult to overcome the network and authorization security challenges between two different parties.

Caution: This pattern is not advisable when using the Ed-Fi ODS/API Platform. Although technically feasible, it would require running Change Data Capture on the Ed-Fi ODS database. The result would be a data stream that looks like the ODS database, rather than looking like the Ed-Fi Unifying Data Model (as surfaced in the Ed-Fi API). Thus, this is in essence a more complex version of the Direct Database Interaction anti-pattern described above.

Other Ed-Fi API applications, such as the forthcoming Ed-Fi Data Management Service, or applications developed by parties other than the Ed-Fi Alliance, may support this pattern.

Implementation Notes

This is an emergent pattern that the Ed-Fi community has not widely used. It requires deployment of several additional components that are not present in other patterns (stream processor, change data capture, etc.).

Frontend API Design Patterns

Backend-for-frontend

How it works

This pattern starts from the needs of a specific user interface, creating a finely-tuned API specification that optimizes data transfer for that application.

The backend-for-frontend service (BFF) then handles translation of the custom specification into requests

When to use

There is only a single front-end application that needs access to the Ed-Fi resources.

Implementation Notes

Central Aggregating Gateway

How it works

When to use

There are multiple front-end applications with different use cases.

Implementation Notes

GraphQL

How it works

When to use

Implementation Notes

Use your Data Warehouse

Some existing standard, or create your own for bilateral agreement. Might evolve into a standard if useful for others.

Cross-cutting Concerns

Row-Level Authorization

Caching

Ed-Fi API Client Credentials


Define the problem:

  • Coarse-grained access to fine-grained resources (“chattiness”)

    • Network latency

  • Getting rid of composites

  • Alternate security protocols

Solution: create a specialized backend API, which sits “close” to the Ed-Fi API (for low network latency).

  • Central Aggregating Gateway: more generic design than a BFF, can support multiple user interfaces

  • “A central-purpose aggregating gateway sits between external user interfaces and downstream microservices and performs call filtering and aggregation for all user interfaces. Without aggregation, a user interface may have to make multiple calls to fetch required information, often throwing away data that was retrieved but not needed.”

  • BFF: designed for a single front-end

  • “The main distinction between a BFF and a central aggregating gateway is that a BFF is single purpose in nature—it is developed for a specific user interface.”

  • https://learning.oreilly.com/library/view/building-microservices-2nd/9781492034018/ch14.html by Sam Newman, ch 14, O’Reilly Media, Inc.

  • Solves the impedance mismatch between systems.

  • Similar to the Facade pattern in OO systems

Alternatives

  • GraphQL - why not? Well-defined need is easier to express and reason about. Create a simple single definition of an API.

Design

  • BEF app needs a key and secret to an ODS/API. Maybe to many of them.

  • Analyze your authorization strategy and needs.

  • Cache local data for higher performance.

    • Possible implication: ETL

  • Change Queries (ETL!)

    • Show basic interaction

    • Definitely(?) implies caching.

  • Storing multiple tenants' data? Decide on a multi-tenancy pattern for segregating the data in the caching layer.

  • Evolutionary diagrams:

    • FE to BFF to BEF to ODS/API

    • Add caching layer and ETL

  • Gateway Aggregation pattern

    • Resiliency patterns and more
