Student ID to Identification Code Translation (Ed-Fi ODS / API)

Overview

There are multiple student identifiers in the education data ecosystem. Several cases have emerged in the Ed-Fi Community where the student ID used in API transactions (i.e., the Ed-Fi studentUniqueId field) is not known to the client application and, as a result, a related transaction fails. This issue has been raised in tickets such as ODS-1824, ODS-2664, and ODS-2791. The topic was discussed in the TAG Meeting 2019-01-16.

Direction:

  • The proposed default solution is to facilitate, push, and drive rostering products to support configuration of different IDs for different agencies, and to store all roster IDs. (MasteryConnect, for example, has taken this approach.)
  • As a stopgap, the Ed-Fi ODS / API team has placed student identification code translation functionality (epics ODS-2827 and ODS-3083) on the 2019 development roadmap.

Actions:

  • The Ed-Fi ODS / API will provide a way to configure API clients so they can interact with the Ed-Fi ODS / API using their chosen student identifier in the StudentUniqueId properties everywhere they appear in requests and responses. This design is based on Ed-Fi's knowledge of how identification codes are currently being used in the ecosystem and is intended to fulfill a limited set of use cases.
  • Ed-Fi Certification has added a requirement that a SIS provide all 3 “priority” IDs at play in today’s ecosystem. See v2 and v3 certification tests for specifics.
  • The next data standard releases will surface definitions that clarify the 3 priority IDs. See, for example, DATASTD-1295.
  • This requirement was also added to the Assessment Outcomes API certification.

Introduction

After discussions with vendors working on API integrations, it is clear that they often need to match students using an identifier that may be different from the value used by the local education agency as the StudentUniqueId. An approach proposed in the TAG Meeting 2019-01-16, whereby vendors would perform a separate search request to identify the StudentUniqueId, was not well received.

As a result of this feedback, a different solution will be provided by the Ed-Fi ODS / API based on the concept of operational contexts already in use for Ed-Fi Descriptors. Essentially, an operational context is a context in which an API client interacts with the Ed-Fi ODS / API using the Descriptor values and identifiers (for education organizations and people) that are most convenient for that client. A longer-term possibility would be for this operational context to be easily portable to (or at least easily defined by) an API host, thereby facilitating much faster API integrations for vendors. While the metadata required to add operational context support for descriptors is significant, the metadata required for supporting students (or staff) is considerably smaller: simply the selection of a particular student identification system.

With the changes described below, API clients will be able to interact with the Ed-Fi ODS / API using their chosen student identifier in the StudentUniqueId properties wherever they appear in the requests and responses.

Overall Approach

In the current Ed-Fi ODS / API, client systems identify students by supplying the StudentUniqueId used by the host in their API requests. Since the ODS uses an integer-based surrogate id internally (StudentUSI), the API translates the StudentUniqueId to and from the corresponding internal-facing StudentUSI in the entity layer. This mapping is handled by the PersonUniqueIdToUsiCache class through the GetUniqueId and GetUsi methods.

The Admin database will be modified to record, for each API client, the Student Identification System descriptor that corresponds to the identifier values that client uses. The descriptor must be one that is used by the host API (and written by the SIS vendor in the Student's identification codes collection). This selection (if made) will be communicated to the API server alongside the other claims information during the initial bearer token validation, and captured in the API key context.

The PersonUniqueIdToUsiCache will be expanded to support mappings between StudentUSI and the identification codes, and will utilize the API key context to determine which transformation is appropriate for the StudentUniqueId value.


Admin UI and Database Changes

The Admin UI and Admin database will need to be augmented to allow an API administrator to specify which Student Identification System Descriptor value will be used by each API client when identifying students. This setting will be optional; when no value is specified, the API will simply use the StudentUniqueId stored in the ODS. The StudentIdentificationSystemDescriptor setting will be captured and stored in the dbo.ApiClients table of the EdFi_Admin database.

The dbo.AccessTokenIsValid stored procedure will be updated to include the chosen identification system. This value will then flow to the API during token authorization and be stored in a new StudentIdentificationSystemDescriptor property on the ApiKeyContext class and made accessible elsewhere through the IApiKeyContextProvider interface.
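As a rough illustration of where the new value would live, the sketch below shows simplified stand-ins for the two types named above. Only the names ApiKeyContext, IApiKeyContextProvider, and StudentIdentificationSystemDescriptor come from this design; the constructor, the ApiKey property, the GetApiKeyContext method, and the UsesIdentificationCodeTranslation helper are assumptions made for illustration and may not match the actual ODS / API classes.

```csharp
using System;

// Simplified stand-in for the ApiKeyContext class (assumed shape, not the real type).
public sealed class ApiKeyContext
{
    public ApiKeyContext(string apiKey, string studentIdentificationSystemDescriptor = null)
    {
        ApiKey = apiKey;
        StudentIdentificationSystemDescriptor = studentIdentificationSystemDescriptor;
    }

    public string ApiKey { get; }

    // New property proposed by this design. Null or empty means the client
    // has no identification system configured and uses the stored StudentUniqueId.
    public string StudentIdentificationSystemDescriptor { get; }

    // Hypothetical convenience member, not part of the design.
    public bool UsesIdentificationCodeTranslation =>
        !string.IsNullOrEmpty(StudentIdentificationSystemDescriptor);
}

// Simplified stand-in for the provider interface (assumed member name).
public interface IApiKeyContextProvider
{
    ApiKeyContext GetApiKeyContext();
}
```

Because the setting is optional, downstream code only needs to check whether the descriptor is present to decide between the existing StudentUniqueId behavior and identification code translation.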

PersonUniqueIdToUsiCache Changes

Currently, the StudentUniqueId transformation to StudentUSI is performed using the PersonUniqueIdToUsiCache in the generated entities.

Here are the accessors for the StudentUniqueId property that map the persisted StudentUSI to StudentUniqueId for outbound (GET) requests:

private int _studentUSI;
private string _studentUniqueId;

public virtual string StudentUniqueId
{
    get
    {
        // Lazily resolve the UniqueId from the persisted USI on first access.
        if (_studentUniqueId == null)
            _studentUniqueId = PersonUniqueIdToUsiCache.GetCache().GetUniqueId("Student", _studentUSI);

        return _studentUniqueId;
    }
    set
    {
        _studentUniqueId = value;
    }
}

Here are the accessors for the StudentUSI property that map the incoming StudentUniqueId to StudentUSI for inbound (PUT or POST) requests:

private int _studentUSI;
private string _studentUniqueId;

public virtual int StudentUSI
{
    get
    {
        // Lazily resolve the USI from the supplied UniqueId while it is still unset (0).
        if (_studentUSI == default(int))
            _studentUSI = PersonUniqueIdToUsiCache.GetCache().GetUsi("Student", _studentUniqueId);

        return _studentUSI;
    }
    set
    {
        _studentUSI = value;
    }
}

Since API clients will now use their chosen student identification system code values in place of the StudentUniqueIds in references sent in requests to the API, the PersonUniqueIdToUsiCache will be enhanced to support mappings between the persisted StudentUSI and the student identification codes.

The cache will load the ConcurrentDictionary instances for these new mappings in a fashion similar to the existing implementation for mapping between StudentUniqueIds and StudentUSIs.

For incoming PUT or POST requests from an API client, the GetUsi method of the PersonUniqueIdToUsiCache will be used to try to find a distinct match using a key built from the API client's student identification system Descriptor and the identification code value supplied in the StudentUniqueId:

  • Suggested v2.x key: 

    private struct StudentIdentificationCodeToUsiKey
    {
        public string StudentIdentificationSystemDescriptorUri;
        public string IdentificationCodeValue;
    }
  • Suggested v3.x key (adds EducationOrganizationId since the values are stored in the StudentEducationOrganizationAssociationStudentIdentificationCode table):

    private struct StudentIdentificationCodeToUsiKey
    {
        public string StudentIdentificationSystemDescriptorUri;
        public string IdentificationCodeValue;
        public int EducationOrganizationId;
    }
  • The value will be an array of StudentUSIs (int) because uniqueness is not enforced on Student Identification Codes and so there could be a single student identification code value associated with multiple students.

Lookup scenarios would be handled as follows:

  • If there is a match with a single StudentUSI, the resolved value will be used as the StudentUSI for persistence.
  • If there is no match found in the cache, a query will be performed to try to load it. If it is still not found, a 0 should be returned. (This mirrors the existing behavior of cache misses on StudentUniqueId.)
  • If there is a match with multiple StudentUSIs, the API client should receive a 400 response status with an error message of "Ambiguous student match on the supplied StudentUniqueId." (NOTE: The recommended approach is to create and throw a custom exception type of AmbiguousUniqueIdMatchException and add the exception type to the BadRequestExceptionTranslator implementation.)
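The inbound lookup scenarios above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual PersonUniqueIdToUsiCache implementation: the IdentificationCodeToUsiResolver class, its constructor, and the delegate standing in for the database query are hypothetical, and the key follows the suggested v2.x shape (no EducationOrganizationId). Only the AmbiguousUniqueIdMatchException name and the error message come from the text above.

```csharp
using System;
using System.Collections.Concurrent;

// Exception type recommended above for ambiguous matches.
public class AmbiguousUniqueIdMatchException : Exception
{
    public AmbiguousUniqueIdMatchException(string message) : base(message) { }
}

// Hypothetical helper illustrating the lookup rules (not the real cache class).
public class IdentificationCodeToUsiResolver
{
    // Key: (descriptor URI, identification code); value: every matching StudentUSI.
    private readonly ConcurrentDictionary<(string DescriptorUri, string Code), int[]> _usisByCode =
        new ConcurrentDictionary<(string DescriptorUri, string Code), int[]>();

    // Stand-in for the database query performed on a cache miss (assumption).
    private readonly Func<string, string, int[]> _loadFromDatabase;

    public IdentificationCodeToUsiResolver(Func<string, string, int[]> loadFromDatabase)
    {
        _loadFromDatabase = loadFromDatabase;
    }

    public int GetUsi(string descriptorUri, string identificationCode)
    {
        var key = (descriptorUri, identificationCode);

        if (!_usisByCode.TryGetValue(key, out int[] usis))
        {
            // Cache miss: try to load from the database and cache only real matches.
            usis = _loadFromDatabase(descriptorUri, identificationCode) ?? Array.Empty<int>();
            if (usis.Length > 0)
                _usisByCode[key] = usis;
        }

        if (usis.Length == 0)
            return 0; // no match: mirrors the existing cache-miss behavior

        if (usis.Length == 1)
            return usis[0]; // distinct match: use for persistence

        throw new AmbiguousUniqueIdMatchException(
            "Ambiguous student match on the supplied StudentUniqueId.");
    }
}
```

In this sketch the exception would still need to be registered with the BadRequestExceptionTranslator, as noted above, for the client to receive a 400 response.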

For outgoing GET requests for an API client, the GetUniqueId method would be used to try and find a match, again using a key built based on the API client's student identification system Descriptor and the StudentUSI found in the persistent entity.

  • Suggested v2.x key: 

    private struct StudentUSIToIdentificationCodeKey
    {
        public int StudentUSI;
        public string StudentIdentificationSystemDescriptorUri;
    }
  • Suggested v3.x key (adds EducationOrganizationId since the values are stored in the StudentEducationOrganizationAssociationStudentIdentificationCode table):

    private struct StudentUSIToIdentificationCodeKey
    {
        public int StudentUSI;
        public string StudentIdentificationSystemDescriptorUri;
        public int EducationOrganizationId;
    }
  • The value would ideally be the IdentificationCode value (string), but with the presence of the AssigningOrganizationIdentificationCode in the primary key, multiple identification code values could match. Thus, unless the Ed-Fi model changes to remove that column from the primary key, the value would need to be an array of structs, as follows:

    private struct StudentUSIToIdentificationCodeValue
    {
        public string AssigningOrganizationIdentificationCode;
        public string IdentificationCode;
    }

The lookup scenarios should be handled as follows:

  • If there is a unique match on the identification code value available for the StudentUSI, the value will be used as the StudentUniqueId in the response.

  • If there is no identification code, a database query will be performed to attempt to load it and augment the cache entry, if found.

    For Discussion:
    If there is still no match, what should the ODS / API do? Possible options include returning an error (what status and message?) or returning an empty or masked (e.g. "???") StudentUniqueId value. If an error is to be thrown, should the error really be thrown if the offending item is part of a multi-item response? Should the item be removed?
  • If there are multiple matching identification codes (from different assigning organizations), an error (or empty/masked response as decided based on the resolution of the above discussion) would result. In the case of an error response, the AssigningOrganizationIdentificationCodes would be surfaced in the AmbiguousUniqueIdMatchException message to provide information about how the host could ultimately resolve the situation. (NOTE: This scenario really should be a rare/theoretical edge case if SIS vendors are the ones primarily responsible for managing this data and are instructed to avoid creating this problem.)
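Since the handling of unmatched and ambiguous outbound values is still open for discussion, the following sketch only classifies the candidates for a StudentUSI and leaves the policy (error, empty, or masked value) to the caller. The IdentificationCodeMatch enum, the OutboundIdentificationCodeMatcher class, and the Classify method are hypothetical names. The sketch also assumes that more than one candidate entry always counts as ambiguous, which is one possible reading of the scenarios above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical outcome of an outbound lookup.
public enum IdentificationCodeMatch { None, Unique, Ambiguous }

// Same shape as the value struct suggested above (made public for the sketch).
public struct StudentUSIToIdentificationCodeValue
{
    public string AssigningOrganizationIdentificationCode;
    public string IdentificationCode;
}

public static class OutboundIdentificationCodeMatcher
{
    // Classifies the cached candidates for one StudentUSI and descriptor.
    // identificationCode is set only for a unique match; detail lists the
    // assigning organizations only for an ambiguous match.
    public static IdentificationCodeMatch Classify(
        IReadOnlyList<StudentUSIToIdentificationCodeValue> candidates,
        out string identificationCode,
        out string detail)
    {
        identificationCode = null;
        detail = null;

        if (candidates == null || candidates.Count == 0)
            return IdentificationCodeMatch.None;

        if (candidates.Count == 1)
        {
            identificationCode = candidates[0].IdentificationCode;
            return IdentificationCodeMatch.Unique;
        }

        // Surface the assigning organizations so the host can resolve the conflict.
        detail = "Assigning organizations: " + string.Join(
            ", ", candidates.Select(c => c.AssigningOrganizationIdentificationCode));
        return IdentificationCodeMatch.Ambiguous;
    }
}
```

Separating classification from policy keeps the open question contained: whichever option is chosen, only the code that reacts to None and Ambiguous would change.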

Student Entity Property Changes

For Discussion:

Support for transparent transformations of the StudentUniqueId values in references for affected API clients creates a technical challenge for supporting those clients with updates to the Student resource (where the raw StudentUniqueId value is stored).

GET requests for Students could be served in the same way as described above, but PUT or POST requests become more challenging. Given that, operationally, the API clients who are given an "operational context" for matching students based on different student identification codes would not also be the ones updating the Student resource (and specifically the StudentUniqueId property), this may be another rare/theoretical edge case.

We could rely on "correct" authorization (and Profile definitions/assignments) for these types of vendors to prevent updates (and unexpected/unfortunate results), or we could explicitly return a 400 (or possibly 403) status response to prevent updates to the Student resource. The latter approach is probably best, but should be discussed.
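To make the second option concrete, here is a minimal sketch of an explicit guard. The StudentWriteGuard class, the GetRejectionStatus method, the "students" resource name comparison, and the choice of 400 over 403 are all assumptions for illustration; where such a check would actually sit in the request pipeline is not decided here.

```csharp
using System;

// Hypothetical guard, not part of the current ODS / API.
public static class StudentWriteGuard
{
    // Returns null when the request may proceed, or the HTTP status code to
    // return when a client with a configured identification system attempts
    // to write the Student resource.
    public static int? GetRejectionStatus(
        string resourceName,
        string httpMethod,
        string studentIdentificationSystemDescriptor)
    {
        bool isWrite =
            string.Equals(httpMethod, "PUT", StringComparison.OrdinalIgnoreCase) ||
            string.Equals(httpMethod, "POST", StringComparison.OrdinalIgnoreCase);

        bool isStudentResource =
            string.Equals(resourceName, "students", StringComparison.OrdinalIgnoreCase);

        bool usesTranslation = !string.IsNullOrEmpty(studentIdentificationSystemDescriptor);

        // 400 is used here only as a placeholder; 403 is the other candidate.
        return isWrite && isStudentResource && usesTranslation ? 400 : (int?)null;
    }
}
```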

Considerations for API v3.x

The movement in v3.x of the identification codes under the StudentEducationOrganizationAssociation table introduces tremendous flexibility into the Ed-Fi model. As a result, it also creates challenges for matching student identification codes appropriately. The level to which we can expect certain behavior from SIS vendors needs to be discussed and defined. If we have no constraints or expectations as to where any identification codes will be written, and we just want the implementation to match any identification code for a student without regard to where it is defined, this would certainly simplify the implementation, but it could allow identification code matches to bleed through from one district to another. For example, two "Local" identification code values could be defined as the same value for different students in different districts, and such an implementation, being devoid of LEA-level context, would interpret the values as a collision.

So, the identification code mapping entries in the cache dictionaries (used by the PersonUniqueIdToUsiCache) should probably incorporate the EducationOrganizationId into the keys in v3.x, but what that EducationOrganizationId should be is an open design question. Would we expect the SIS vendors to populate the student identification codes on the student's association with the LEA? The model will support any education organization. If the codes are not on an LEA, would they be excluded from our student identification code matching logic?

When a school-level vendor is making a request for a student's data with an LEA-centric view of the identification codes, we can't realistically expect to use their associated SchoolId from the claims information directly for matching purposes. Instead, we should probably use the containing LEA's EducationOrganizationId (this can be obtained using the EducationOrganizationCache's GetEducationOrganizationIdentifiers method).
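A minimal sketch of LEA-scoped keys follows, assuming the identification codes are written on the student's association with the LEA (which is itself one of the open questions above). The LeaScopedIdentificationCodeKey class and its methods are hypothetical, and a plain school-to-LEA dictionary stands in for the EducationOrganizationCache lookup, whose actual signature is not shown in this document.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for building v3.x-style, LEA-scoped cache keys.
public static class LeaScopedIdentificationCodeKey
{
    // Maps the EducationOrganizationId from the client's claims to its
    // containing LEA. The dictionary is a stand-in for the real cache lookup.
    // If the id is not a known school, it is assumed to already be an LEA id.
    public static int ResolveLeaId(
        IReadOnlyDictionary<int, int> leaIdBySchoolId,
        int claimEducationOrganizationId)
    {
        return leaIdBySchoolId.TryGetValue(claimEducationOrganizationId, out int leaId)
            ? leaId
            : claimEducationOrganizationId;
    }

    // Builds a key with the same three parts as the suggested v3.x struct.
    public static (string DescriptorUri, string Code, int EducationOrganizationId) Build(
        string descriptorUri, string identificationCode, int leaId)
    {
        return (descriptorUri, identificationCode, leaId);
    }
}
```

With the LEA id in the key, the same "Local" code value assigned to different students in two districts produces two distinct entries rather than a collision.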

So, defining expectations as to where these identification codes will be located would certainly simplify the API implementation, but it might not match actual usage of the model and may ultimately produce unexpected results for use cases that have yet to be encountered.