Guidance
Specific Ed-Fi API specifications downstream of the Ed-Fi Data Standard may contain normative rules or guidance on descriptor usage, and if such guidance is present it should be followed.
These are the recommendations for descriptor usage beyond any such prescriptions:
...
Purpose and Scope
The term "descriptor" in the Ed-Fi Data Standard refers to data model elements that capture a controlled enumeration value. These are often referred to as "code sets" by those who manage the source systems from which Ed-Fi data originates. GradeLevel, AcademicSubject, and AttendanceEventCategory are all examples of Ed-Fi descriptors.
This document covers guidance for usage of descriptors in the Ed-Fi Data Standard. The Ed-Fi Data Standard provides for a conceptual model and standard REST API and other bindings for that conceptual model. However, Ed-Fi standards as a whole can and do include additional standards that provide additional normative rules and guidance, and those downstream rules and guidance can include additional prescriptions and guidance on the use of descriptors. If such downstream prescriptions exist, those must be followed. (See section below on "Specific API Prescriptions for Ed-Fi Values" for an example case).
This guidance covers descriptor usage beyond any such prescriptions. The guidelines are drawn from observations in production field work in which Ed-Fi-defined data flows between systems.
Code Values and Definitions
Descriptor code values and definitions should be those that are needed to enable the core use cases for the data exchange context.
If the context is an individual local education agency (LEA), the code values should generally be the values as they appear in the source system.
...
- If other code values are used, there should be a clear reason why these values are being used instead of local values. The use of non-local values will increase the loss of fidelity to local operational use cases, so it should be clear what value is created and why this other context for the data exists.
...
To provide an example, if the use case for exchanging data is to support school-district-level analytics, the guidance would be to capture the code values as they exist in the school district source systems, and assign the namespaces to the organization that governs those values, which is most likely the school district itself.
This is a significant change in direction from guidance in Data Standard v3 and prior versions. Please consult the section below on the background for these recommendations for more discussion on the learnings behind this change.
Understanding Descriptors
What are Descriptors?
Descriptors are the code sets or enumeration values that are critical in school operations and appear everywhere in school data. Descriptors capture important conceptual categories and classifications.
- Is the course a “math” course, a “social sciences” course, an “ELA” course, or other?
- Is the student in “first grade”, “second grade”, “third grade”, etc.?
- Is a student who doesn’t show up to school “absent”, “absent with medical documentation”, “homebound”, “absent for CTE program” or other?
- Is a school an “elementary”, “middle”, “high”, “credit recovery high” or other, or multiple of these?
In Ed-Fi, such attributes of entities are referred to as "descriptors" and a possible value for an attribute as a "descriptor value."
Parts of a Descriptor
In Ed-Fi, descriptors have these parts.
- code value: this is the "shorthand" code that appears in operational systems and in the data
- namespace: this defines the scope of the descriptor, as well as provides an indicator of the governance of the descriptor
- description: this is the definition of the descriptor value
When data is in transit, only the code value and namespace are included, and generally not the definition. A descriptor encoded in an Ed-Fi REST API looks as follows:
uri://grandbend.edu/AcademicSubjectDescriptor#PALG
This is the pattern followed:
[namespace]#[descriptor value]
In this case, the example is the code "PALG" and the namespace shows that the value is owned by Grand Bend (Grand Bend is the fictional school district that appears in Ed-Fi Data Standard sample data).
To help readers follow the examples below, we will include both the descriptor code and definition in a shorthand like this: "PALG"/"Pre-algebra"
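The `[namespace]#[descriptor value]` pattern can be split mechanically. The following is a minimal sketch, not part of any Ed-Fi library: the names `DescriptorRef`, `parse_descriptor`, and `format_descriptor` are hypothetical, and it assumes the namespace itself never contains a `#` character.

```python
from typing import NamedTuple


class DescriptorRef(NamedTuple):
    """Hypothetical holder for the two parts sent in transit."""
    namespace: str   # e.g. "uri://grandbend.edu/AcademicSubjectDescriptor"
    code_value: str  # e.g. "PALG"


def parse_descriptor(uri: str) -> DescriptorRef:
    """Split '[namespace]#[descriptor value]' at the first '#'.

    Assumption: the namespace contains no '#' of its own.
    """
    namespace, sep, code_value = uri.partition("#")
    if not sep or not namespace or not code_value:
        raise ValueError(f"not a descriptor reference: {uri!r}")
    return DescriptorRef(namespace, code_value)


def format_descriptor(ref: DescriptorRef) -> str:
    """Rebuild the in-transit string from its two parts."""
    return f"{ref.namespace}#{ref.code_value}"


ref = parse_descriptor("uri://grandbend.edu/AcademicSubjectDescriptor#PALG")
print(ref.namespace)   # uri://grandbend.edu/AcademicSubjectDescriptor
print(ref.code_value)  # PALG
```

Note that the definition ("Pre-algebra") is not recoverable from this string; it has to be looked up wherever the descriptor values are maintained.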
Descriptor Namespaces
The namespace is a string value that specifies the scope of the value, and usually who governs or defines the value. In the case above, you can see that the value "PALG" is defined by "grandbend.edu". In other words, the school district Grand Bend defines a value called "PALG"/"Pre-algebra".
If another agency had a different definition of mathematics, there could very well be a different descriptor value that looks like this in REST API JSON:
uri://greenhills.edu/AcademicSubjectDescriptor#PALG
This says that there is another code value "PALG" governed by "greenhills.edu". The definition of that descriptor value may be different from the one governed by Grand Bend.
By convention, namespaces usually follow a URL-like-pattern beginning with "uri://" and referencing an Internet domain owned by the organization that governs the descriptor value.
If the descriptor value is stored in a database, this namespace may be stored separately.
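To make the point concrete, here is a small sketch of a store that keeps the namespace separate from the code value, so that two "PALG" values never collide. The dictionary layout and the `lookup_definition` helper are assumptions for illustration, and the Green Hills definition is invented; only the Grand Bend definition comes from the example above.

```python
# Definitions keyed by (namespace, code value), similar to a table with a
# separate namespace column.
definitions = {
    ("uri://grandbend.edu/AcademicSubjectDescriptor", "PALG"): "Pre-algebra",
    # Invented definition, for illustration only.
    ("uri://greenhills.edu/AcademicSubjectDescriptor", "PALG"): "Hypothetical Green Hills definition",
}


def lookup_definition(descriptor_uri: str):
    """Return the stored definition for '[namespace]#[code value]', or None."""
    namespace, _, code_value = descriptor_uri.partition("#")
    return definitions.get((namespace, code_value))


# Same code value, different namespaces, different definitions.
print(lookup_definition("uri://grandbend.edu/AcademicSubjectDescriptor#PALG"))
print(lookup_definition("uri://greenhills.edu/AcademicSubjectDescriptor#PALG"))
```

Keying on the code value alone would silently merge the two districts' values, which is exactly what the namespace is there to prevent.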
Multiple Mappings and Descriptor Mappings as Data
Many operational systems contain multiple descriptor values for the same descriptor attribute on an entity. The most common case is the student information system, which commonly contains both a local, operational value and a state mapping of that value.
Attendance is an illustrative case. Often school districts have long lists of attendance codes, but the state needs one of a smaller set of values to be sent for compliance reporting. The SIS therefore contains a table that maps one value to another.
This occurs in other systems as well, such as assessment systems and other learning systems. In some cases, the vendor system itself is an important context, such as when a vendor maintains an internal list of academic subjects for the purpose of classifying curricular elements according to an internal taxonomy.
The Ed-Fi Data Standard contains an entity to capture descriptor mappings; it is called DescriptorMapping. It was introduced in Data Standard v4. The entity has these attributes:
DescriptorMapping
- Value: the value being mapped
- Namespace: the namespace of the value being mapped
- MappedValue: the value to map to
- MappedNamespace: the namespace of the value mapped to
- MappingScope: the scope of the mapping, i.e., which entities or resources it applies to
A sample usage in Ed-Fi REST API JSON would look like this:
{
"Value" : "ABS-MED",
"Namespace" : "uri://grandbend.edu/AttendanceEvent",
"MappedValue" : "ABSENT",
"MappedNamespace" : "uri://somestate.edu/AttendanceEvent",
"MappingScope" : []
}
This mapping would capture that the local attendance code "ABS-MED"/"Absent with medical note" maps to the state code "ABSENT"/"Absent".
An empty MappingScope means that the mapping applies to all elements in the current data exchange context. If the mapping only applied to a subset of the elements in the model, those element classes would be listed in that attribute collection.
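The lookup behavior described above, including the "empty scope applies everywhere" rule, can be sketched as follows. This is an illustration only: the `DescriptorMapping` dataclass mirrors the attribute names listed above, but `map_descriptor`, the scope entries, and the "ABS-CTE"/"EXCUSED" mapping are hypothetical, and the source does not specify how MappingScope entries are formatted.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class DescriptorMapping:
    value: str
    namespace: str
    mapped_value: str
    mapped_namespace: str
    mapping_scope: Tuple[str, ...] = ()  # empty: applies to all elements


def map_descriptor(
    mappings: Iterable[DescriptorMapping],
    namespace: str,
    value: str,
    target_namespace: str,
    element: Optional[str] = None,
) -> Optional[str]:
    """Return the mapped value in target_namespace, or None if no mapping applies."""
    for m in mappings:
        same_source = (m.namespace, m.value) == (namespace, value)
        same_target = m.mapped_namespace == target_namespace
        # An empty scope matches every element; otherwise the element must be listed.
        in_scope = not m.mapping_scope or element in m.mapping_scope
        if same_source and same_target and in_scope:
            return m.mapped_value
    return None


LOCAL = "uri://grandbend.edu/AttendanceEvent"
STATE = "uri://somestate.edu/AttendanceEvent"

mappings = [
    # The sample mapping from the text: empty scope, applies everywhere.
    DescriptorMapping("ABS-MED", LOCAL, "ABSENT", STATE),
    # Hypothetical scoped mapping; codes and scope entry are invented.
    DescriptorMapping("ABS-CTE", LOCAL, "EXCUSED", STATE,
                      ("StudentSectionAttendanceEvent",)),
]

print(map_descriptor(mappings, LOCAL, "ABS-MED", STATE))  # ABSENT
```

Because the local value is never overwritten, a consumer can keep "ABS-MED" for district analytics and derive "ABSENT" only when the state context requires it.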
Background of these Recommendations
Principles and Key Learnings of the Community
Over the course of many years, Ed-Fi standards and community worked very hard to define and coordinate specific sets of values for descriptors, and to ask community members to use those and only those values. These were values in the "ed-fi.org" namespace.
What the community learned was that this was impossible to do at such a broad level. Specifically, we learned:
- At the highest level of the K12 education ecosystem, there are too many use cases and needs to declare sets of "universal" values
- The conversion of existing school district values to Ed-Fi or other values often produced confusion, introduced delays, and increased costs, all while not clearly assisting with the outcomes the agency wants.
- To force conversion of existing values to Ed-Fi-defined values results in loss of data fidelity, and therefore less usable or valuable data. Mappings are lossy by their very nature, and that loss should produce some value; in most cases, we found it did not produce any value.
- This conversion forces school district staff to learn new code values, when they are familiar with other code values from local system usage. That slows down the work of generating analytics/reports and introduces confusion when staff see other code values with which they are not familiar from their daily work.
- This conversion creates a lot of low-value, front-loaded work, and therefore causes delays in setting up data exchanges. This work tends to fall at the beginning of projects, and therefore delays useful outcomes, often by weeks or months.
- When more narrow use cases are defined (e.g., learning tool rostering, state compliance reporting) it is possible to set defined values as part of a more focused specification.
The largest learning of the community over the course of many years is that it is impossible to have the K12 community agree on universal values for descriptors across all use cases. Even descriptors that look like they can be standardized, like grade level or academic subject, resist standardization.
Rather, required descriptor values are defined by the "operational context" of the data exchange. For example, if a district is doing state reporting, descriptor values are commonly set by the state. However, if a local district is sourcing their data from a student information system (SIS), the operational context is the local district context, and the descriptor values should remain as faithful as possible to local operational needs for use cases like local analytics.
As such, the general recommendation of the Alliance is that descriptor values should be sourced from the native source systems and placed in a namespace that accurately captures the governing organization.
This recommendation seems to be counter-intuitive: if the purpose of the Ed-Fi Data Standard is to standardize data exchange, shouldn't descriptors be standardized as well?
The answer is that they certainly can be standardized and, per the above discussion, there are good reasons to do this when the use case justifies such an approach. But many, probably most, use cases we have seen do not justify such an approach, and the community was spending lots of time mapping descriptor values for unclear purposes. Field work and experience have revealed that there are usually more reasons to allow values to be highly local, given that data insights have the most impact when analytics are able to access the richness of the same operational context as the source data.
Further, the descriptor mapping features covered above allow for a data exchange that both maximizes local semantics for descriptors and also allows for standardization of those values outside of the most local operational context.
The Purpose of Ed-Fi Data Standard Descriptor Values
Ed-Fi publishes a set of "default" values that are used in the Ed-Fi Data Standard sample data and are intended to help implementers of the Data Standard understand the semantics and common usage of each descriptor.
In some cases, these values may be required in a downstream data specification; per the discussion above, this would be a specification in service of a more specific use case identified as valuable by the community.
Additional Guidance
Most Commonly-Used Descriptors
There are over 160 descriptors in the core Ed-Fi data model and many more in domains under development or proposed for introduction into the core model. Not all of them will be relevant to your organization's use of Ed-Fi.
Generally, the descriptors that matter most to your organization initially will be those involved in data exchanges between systems. That list will generally depend on the APIs your organization is using.
A good first list to pay attention to is the set of descriptors that are required in an Ed-Fi API certification, as these must be implemented by compliant systems. By API, those descriptors are:
...
Core Student API for Suite 3
(covered by Ed-Fi Student Information Systems API v3 Certification)
...
Assessment Outcomes API
(covered by Ed-Fi Assessment Outcomes API for Suite 3 Certification)
...
These are the values that LEA staff are accustomed to; switching values is generally disruptive. As the intent is not for data to flow outside of this context, other standardized values provide little new value.
If the context is an education service agency (ESA) or similar collaboration of LEAs, it is common for that organization to adopt a hybrid approach. For some descriptors, the managing organization will develop the values needed to enable the data services provided for those LEAs; for other descriptors, the services may be able to consume and use local code values. The managing organization will make this determination. See also the section below on descriptor mapping as a newer construct in the Data Standard to support multiple contexts.
If the context is a state education agency (SEA) and the use case is limited to state data collections, states most often use the values they currently use in their data collections, in order to limit the change management burden for LEAs and vendors. However, modernization projects often lead to rethinking code sets, and that process may lead to development and use of new values.
If a state education agency is doing data collections and is also working to support local data interoperability for LEAs in the state, then the state should work to align its descriptors with the needs of the ESAs or other organizations that are managing those local data services.
In such a context, the goal is for both the state and the local education agency to share data specifications in order to limit the burden on vendors.
As part of that responsibility, the state should work with the organizations within the state to manage descriptor sets to ensure the clarity and coherence of those sets for vendors and source system operators who must use them (including removing duplicate values, ensuring clarity of definitions, and avoiding semantic overlap in definitions).
See also the section below on descriptor mapping as a newer construct in the Data Standard to support multiple contexts.
Note that best practice at all levels for code values and definitions is to use values that are as close to source-system values (e.g., less aggregation, code sets at similar granularity) as the context and use cases will permit. Values closer to the source-system context provide more definition around the individual student; that provides a richer context for decision-making, which can improve student performance.
Namespaces
Descriptor namespaces should clearly indicate the organization that governs the value; this is often the education agency for local operational code sets, but in some cases code sets may be governed by external organizations (vendors, ESAs, states, etc.).
Best practice is to use a URI-format that indicates a domain name owned by the organization, to facilitate the ability of users to discover and learn about the organization that governs the value.
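As a rough illustration of how the domain in a namespace points back to the governing organization, the sketch below extracts it with Python's standard `urllib.parse.urlsplit`. The `governing_domain` helper is hypothetical, and the check for the "uri" scheme reflects the convention described in this document rather than a rule the parser enforces.

```python
from urllib.parse import urlsplit


def governing_domain(namespace: str) -> str:
    """Return the Internet domain embedded in a 'uri://domain/...' namespace.

    Assumption: the namespace follows the conventional 'uri://' pattern.
    """
    parts = urlsplit(namespace)
    if parts.scheme != "uri" or not parts.hostname:
        raise ValueError(f"namespace does not follow the uri:// convention: {namespace!r}")
    return parts.hostname


print(governing_domain("uri://grandbend.edu/AcademicSubjectDescriptor"))  # grandbend.edu
print(governing_domain("uri://ed-fi.org/GradeLevelDescriptor"))           # ed-fi.org
```

A data consumer could use a check like this to group incoming descriptor values by governing organization, or to flag namespaces that do not identify one.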
Use of uri://ed-fi.org Descriptor Values
The default uri://ed-fi.org standards values can be useful as a means of populating descriptors that the Data Standard requires but that the agency does not depend on for its core use cases. When a use case relies on a specific descriptor, values in the uri://ed-fi.org namespace should generally be avoided.
Over many years, the Alliance has learned that code sets are nearly always context-dependent, and even when values can be successfully mapped to Ed-Fi governed values, this practice introduces confusion to data users who must learn code values they have not seen before.
Organizations must not edit or change code sets or definitions in the uri://ed-fi.org context. These values are governed by the Ed-Fi Alliance on behalf of the Ed-Fi community.
Specific API Prescriptions for uri://ed-fi.org Values
Please note that, per the guidance at top, specific Ed-Fi API specifications that are downstream of the Data Standard may add requirements that reflect community goals to better coordinate data exchange. These can include requirements to use uri://ed-fi.org descriptors. As an example, the proposed Ed-Fi Enrollment API requires the use of uri://ed-fi.org descriptors because the API envisions a K12-wide context for use of that API; see ED-FI RFC 19 - ENROLLMENT API FOR SUITE 3, under the section "Enumerations (Descriptors)."
Use of Provider (Vendor) Values
Note that there may be situations and use cases where organizations choose to use provider-defined values. For example, many assessment vendors work across the K12 landscape and have created enumeration values whose semantics are very specific, well-documented and well-understood. Such values are often used locally, even though they are vendor-defined. Use of these values is not uncommon in some domains.
As with uri://ed-fi.org values, values or definitions in the namespace of another organization should not be modified by anyone other than the organization that owns them.
When such values exist in REST API context, vendors are often permitted to write these values; this tends to simplify the setup of the data exchange and allow the vendor control over descriptor management.
Descriptor Mapping as Data
Data Standard v4.0 introduced a new feature: the ability to capture descriptor mappings as data (see What's New - v4.0#DescriptorMapping). This feature promises to allow data elements with descriptors to retain more local fidelity as the element moves between data exchange contexts.
Commonly in K12, source systems will contain descriptor mappings to multiple contexts. For example, a Student Information System typically has both local LEA descriptor values and mappings of those values to state values. This is true of other systems as well, such as assessment and intervention systems. Sometimes the mapping is between local/state contexts, but there are also mappings between vendor/local contexts, state/federal contexts, vendor/state contexts, and others.
Allowing these canonical mappings to be captured as data promises to allow a system to transmit data that retains important contextual information for multiple contexts and use cases.
The Alliance hopes that this feature can be useful to projects; we invite feedback on its utility for inclusion in future descriptor guidance.