This version of the Ed-Fi ODS / API is no longer supported. See the Ed-Fi Technology Version Index for a link to the latest version.
Guidance on Descriptor Sets for LEAs
In the Ed-Fi community today, local education agencies (LEAs) often confront the issue of how to handle code sets – referred to in Ed-Fi as “descriptors” – in their Ed-Fi ODS API implementations. Essentially, the issue boils down to this question:
In our implementation of the Ed-Fi platform, do we map to and use the default Ed-Fi descriptor values (adding to those as necessary), or do we use our own, current values (and ignore or remove the Ed-Fi values)?
An example can be helpful – consider student absences.
An Example: Student Absences
The Ed-Fi data model includes a set of “default” attendance event values that have been refined via field work, and these are included in the ODS API by default. Those values are:
In Attendance
Excused Absence
Unexcused Absence
Tardy
Early departure
Partial
(Technically, these are the default values for AttendanceEventCategory in Data Standard 3.1.)
However, we can easily imagine other categories that add more specificity to these, such as “Medically Excused Absence” or “Homebound” or even possibly “Service Day.” We can also imagine that an LEA may not observe some of the Ed-Fi default values: perhaps the LEA has no general concept of “Unexcused” (maybe they only have specific sub-classes such as “Medical”) or has no concept of “Early departure” (maybe they only have “Partial day”).
Options for LEAs
Enumerations are important classifiers of data and are therefore very important to analytics and operational use cases; the approach an LEA takes to this question matters, but many are confused as to best practice.
Note that there is no “right” answer to this question. This document summarizes the main answers we see in field work today and was written to help agencies choose the path that is right for them.
Note that this document also focuses on Student Information System data, where extensive localization of option sets is most common. See the “Q & A” section at the bottom for more information on areas where enumeration sets are more standardized.
Approach 1: Adopt and Extend
Some implementations take an adopt and extend approach. In this case, the LEA keeps the default Ed-Fi values but adds the additional descriptor values that are missing from the Ed-Fi set. If there are any Ed-Fi values that should not be used, these are excluded by external documentation and downstream validations; generally no Ed-Fi default values are removed.
Note that when values are added, they must always be added in the LEA namespace and should be given a definition.
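As a rough illustration of adopt and extend, the sketch below keeps the Ed-Fi default attendance values, adds two local values in the LEA namespace with definitions, and rejects an unused Ed-Fi value in a downstream check rather than deleting it. The namespace "mydistrict.edu", the definitions, the choice of excluded value, and all function names are assumptions made for this example; they are not part of the ODS / API.

```python
ED_FI_NAMESPACE = "ed-fi.org"
LEA_NAMESPACE = "mydistrict.edu"  # hypothetical LEA namespace

# Default AttendanceEventCategory values listed earlier in this document.
ED_FI_DEFAULTS = [
    "In Attendance",
    "Excused Absence",
    "Unexcused Absence",
    "Tardy",
    "Early departure",
    "Partial",
]

# Local additions as (code value, definition). Definitions are illustrative only.
LEA_ADDITIONS = [
    ("Medically Excused Absence", "Absence excused for a documented medical reason."),
    ("Homebound", "Student receives instruction at home."),
]

# Ed-Fi values this hypothetical LEA does not use. They stay in the set
# and are rejected by downstream validation instead of being removed.
EXCLUDED_BY_POLICY = {(ED_FI_NAMESPACE, "Early departure")}


def build_descriptor_set():
    """Return {(namespace, code value): definition} for the combined set.

    Ed-Fi definitions are omitted (None) in this sketch.
    """
    combined = {(ED_FI_NAMESPACE, code): None for code in ED_FI_DEFAULTS}
    for code, definition in LEA_ADDITIONS:
        if not definition:
            raise ValueError(f"Local value {code!r} needs a definition")
        combined[(LEA_NAMESPACE, code)] = definition
    return combined


def is_allowed(descriptor_set, namespace, code):
    """Downstream validation: the value must exist and not be excluded."""
    key = (namespace, code)
    return key in descriptor_set and key not in EXCLUDED_BY_POLICY
```

Keying on the pair of namespace and code value keeps a local value from being mistaken for an Ed-Fi value that happens to share its name.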
What are the Pros and Cons of Adopt and Extend?
PROs | CONs |
External parties will understand many of your descriptor values, which can enhance “plug and play” interoperability | Implementations can take longer to get started, as they need to do more data mapping at the outset |
The work to map local and Ed-Fi values can drive internal conversations about whether current values are needed or used | Mixing and matching sets of values often results in fuzzy or partial matches – minor sacrifices on data semantics and coherence |
| Ed-Fi values are not immediately obvious to local users – local staff are forced to learn new values |
Approach 2: Use Local Values
Some implementations elect to take a use local values approach. In this case, the agency adds all of its descriptor values natively and ignores all Ed-Fi values.
Descriptors added in this approach must always be added in the LEA namespace, to avoid confusion with values governed by the Ed-Fi efforts. (See the section “What are Descriptor Namespaces?” below for why this is important.)
Also, Ed-Fi’s default descriptor values are generally not removed (though this is possible). Having two sets of values is generally not a problem, because the descriptor namespace makes it easy to see all the local values and distinguish them from the unused Ed-Fi values.
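To show how the namespace separates the two sets, here is a minimal sketch that filters a list of descriptor records down to the values in one namespace. The record layout (keys "namespace" and "codeValue"), the sample records, and the "mydistrict.edu" namespace are assumptions for illustration only.

```python
LEA_NAMESPACE = "mydistrict.edu"  # hypothetical LEA namespace

# Sample records: unused Ed-Fi defaults sit alongside the local values.
descriptors = [
    {"namespace": "ed-fi.org", "codeValue": "Excused Absence"},
    {"namespace": "ed-fi.org", "codeValue": "Tardy"},
    {"namespace": "mydistrict.edu", "codeValue": "Medical"},
    {"namespace": "mydistrict.edu", "codeValue": "Partial day"},
]


def values_in_namespace(records, namespace):
    """Return the code values governed by the given namespace, in input order."""
    return [r["codeValue"] for r in records if r["namespace"] == namespace]


local_values = values_in_namespace(descriptors, LEA_NAMESPACE)
```

With the sample records above, `local_values` holds only "Medical" and "Partial day"; the Ed-Fi defaults remain in the list but are easy to set aside.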
What are the Pros and Cons of Use Local Values?
PROs | CONs |
Reduces time to start an implementation, as less data mapping is needed. | External parties are less likely to understand the values and semantics. “Plug and play” interoperability will require more work. |
Value sets may be more coherent. | Skips the internal conversations about values, and the norming against widely used values, that a mapping exercise would otherwise prompt. |
Internal users understand these values, so they can work with Ed-Fi data more easily. |
Approach 3: Hybrid Values (Approach 2 + State Descriptors)
Some implementations are choosing to use a mix of values, most commonly local values and state values. In this case, the LEA keeps the local values in its own namespace (for example "mydistrict.edu") but adds the descriptor values that are pertinent for state reporting in the state namespace (for example "mystate.edu").
Note
This is a newer pattern in the community and so is less well understood.
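One way to picture the hybrid approach is a translation table from local values to state values, each kept in its own namespace. The sketch below is illustrative only: the local and state code values, the mapping itself, and the function names are hypothetical. It also reports which state values absorb more than one local value, since that is where detail is lost in translation.

```python
from collections import Counter

LEA_NAMESPACE = "mydistrict.edu"   # example LEA namespace
STATE_NAMESPACE = "mystate.edu"    # example state namespace

# Hypothetical mapping from local code values to state code values.
LOCAL_TO_STATE = {
    "Medical": "Excused Absence",
    "Family Emergency": "Excused Absence",
    "Partial day": "Partial",
}


def to_state_descriptor(local_code):
    """Translate a local code value to a (namespace, code value) pair for state reporting."""
    if local_code not in LOCAL_TO_STATE:
        raise KeyError(f"No state mapping for local value {local_code!r}")
    return (STATE_NAMESPACE, LOCAL_TO_STATE[local_code])


def collapsed_state_values(mapping):
    """Return state code values that more than one local value maps to."""
    counts = Counter(mapping.values())
    return sorted(code for code, n in counts.items() if n > 1)
```

In this hypothetical mapping, "Medical" and "Family Emergency" both become "Excused Absence" for the state, so analytics built on the state values can no longer tell them apart.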
What are the Pros and Cons of using Hybrid Values?
PROs | CONs |
Reduces time to start an implementation, as less data mapping is needed, assuming the number of state descriptors is limited and most state mappings are already known. | External parties are less likely to understand the values and semantics. “Plug and play” interoperability will require more work. |
Value sets may be more coherent. | Skips the internal conversations about values, and the norming against widely used values, that a mapping exercise would otherwise prompt. |
Internal users generally understand these values, so they can work with Ed-Fi data more easily. | Translation of local definitions and values to state ones may result in some data loss, and therefore in lower quality analytics. |
Can enhance the LEA's ability to understand the impacts of data in state contexts. |
Q & A
Which approach is right for my agency?
Why doesn’t the Alliance recommend one approach or the other to ensure that the community is behaving consistently?
How can a technology standard leave open the questions about allowed enumeration values? Doesn’t that make a standard “non-standard”?
What is “operational context” and how does it relate to these questions?
What are Descriptor Namespaces?
For those not aware of descriptor namespaces, a descriptor has a few parts; among those are:
code value – what is the actual code that is transmitted?
definition – how is this value defined?
namespace – whose value is this?
The code value and definition should be self-evident.
However, the namespace is often less understood; the namespace is an indicator of whose value this is. Generally it is provided in URI format using a domain name under the control of the organization that governs this code, as in “mydistrict.edu”. For example, all Ed-Fi governed values are in the namespace “ed-fi.org”, indicating that these are governed by the Ed-Fi Alliance.
When you create your own descriptor values, it is imperative that those values be in your namespace. No one but the Ed-Fi Alliance should ever publish values in the “ed-fi.org” namespace.
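The three parts described above, and the rule about the “ed-fi.org” namespace, can be sketched as a small data structure with a check. This is a minimal illustration under stated assumptions: the class, field, and function names are hypothetical and do not mirror the ODS / API schema.

```python
from dataclasses import dataclass

ED_FI_NAMESPACE = "ed-fi.org"


@dataclass(frozen=True)
class Descriptor:
    code_value: str   # the actual code that is transmitted
    definition: str   # how this value is defined
    namespace: str    # whose value this is


def check_local_descriptor(descriptor, own_namespace):
    """Return the descriptor if it is in the organization's own namespace.

    Raises ValueError if the value claims the Ed-Fi namespace or any
    namespace other than the one the organization controls.
    """
    if descriptor.namespace == ED_FI_NAMESPACE:
        raise ValueError("Only the Ed-Fi Alliance publishes values in ed-fi.org")
    if descriptor.namespace != own_namespace:
        raise ValueError(
            f"Expected namespace {own_namespace!r}, got {descriptor.namespace!r}"
        )
    return descriptor
```

For example, a "Homebound" value created by a district would pass the check with the namespace "mydistrict.edu" and fail with "ed-fi.org".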