This version of the Ed-Fi Data Standard is no longer supported. See the Ed-Fi Technology Version Index for a link to the latest version.
XML Schema - Enumerations and Descriptors
- Ian Christopher
Enumerations and Ed-Fi Descriptors
Enumerations define a “controlled vocabulary” for the values of an attribute. They provide standard categorizations and tagging, which in turn enable standard reporting. In many cases, the controlled vocabulary is set by rules or policies at the state or federal level.
In XSD, an enumeration is a simple type restricted to a list of possible values. Single-valued enumerations are defined as restrictions of the base simple type xs:token, as illustrated below:
<xs:simpleType name="GraduationPlanTypeMapType">
  . . .
  <xs:restriction base="xs:token">
    <xs:enumeration value="Career and Technical Education"> . . .
    <xs:enumeration value="Distinguished"> . . .
    <xs:enumeration value="Minimum"> . . .
    <xs:enumeration value="Recommended"> . . .
    <xs:enumeration value="Standard"> . . .
  </xs:restriction>
</xs:simpleType>
This convention was chosen so that enumerations are descriptive rather than coded. Either approach is equally rigorous in XML. The rationale is to make the interchange more auditable and to reduce mistakes caused by similar, but different, code tables. For example, if the sender sends a code of “12”, the receiver will interpret it with the same meaning only if both are using identical code tables.
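As a rough illustration of how a receiver might check a value against such an enumeration, the sketch below reads the allowed values out of a schema fragment using Python's standard library. The fragment is a completed version of the elided sample above with annotations left out, and the helper name enumeration_values is an assumption made for this example, not part of any Ed-Fi tooling.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"

# Assumption: a completed form of the elided sample, with annotations omitted.
XSD_SNIPPET = """
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:simpleType name="GraduationPlanTypeMapType">
    <xs:restriction base="xs:token">
      <xs:enumeration value="Career and Technical Education"/>
      <xs:enumeration value="Distinguished"/>
      <xs:enumeration value="Minimum"/>
      <xs:enumeration value="Recommended"/>
      <xs:enumeration value="Standard"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>
"""


def enumeration_values(xsd_text, type_name):
    """Return the set of enumeration values declared for a named simple type."""
    root = ET.fromstring(xsd_text)
    for simple_type in root.iter(f"{{{XS}}}simpleType"):
        if simple_type.get("name") == type_name:
            return {
                enum.get("value")
                for enum in simple_type.iter(f"{{{XS}}}enumeration")
            }
    raise KeyError(type_name)


allowed = enumeration_values(XSD_SNIPPET, "GraduationPlanTypeMapType")
print("Minimum" in allowed)  # a descriptive value from the vocabulary
print("12" in allowed)       # a bare code is not part of the vocabulary
```

A descriptive value is self-explanatory when it fails validation, whereas a bare code such as “12” can only be checked against whichever code table the receiver happens to hold.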
Issues with Enumerations in Education Standards
Enumerations, particularly when codified in a standard XML schema, pose several issues:
- Different governing bodies use different enumeration vocabularies and have not agreed upon a common set. In many cases, these vocabularies are established or influenced by state laws or policies and cannot be made uniform. In the example above, values such as Minimum, Recommended, and Distinguished for GraduationPlanType are specific to Texas and do not match the types in other states.
- Enumeration vocabularies change as education policies and laws change over time. This makes standardizing on a set of enumerations difficult. This also poses problems for analyzing longitudinal data where the code values have changed year to year.
- Governing bodies have historically used a code-based approach for specifying enumerations. Each enumeration value typically has a numeric code (e.g., 1, 2, 3...). This approach was born from flat-file data collections, where the codes were more concise than their descriptive labels. This limitation does not exist in XML. However, after years of use, these codes are part of the vernacular of administrative personnel, who would like to keep specifying their enumerations using these codes. Most states publish their standard code and enumeration lists each year.
The Common Education Data Standards (CEDS) effort has sought to define the superset of values for the various enumerations. However, validation against a superset cannot restrict values to the ones actually used by a specific organization at a point in time. A superset also tends to make the enumeration lists large, and larger still over time.
The Ed-Fi Data Standard addresses this inherent limitation of enumerations in XML using a design pattern called Ed-Fi Descriptors.
Ed-Fi Descriptors
Descriptors provide a more malleable alternative to enumerations. Descriptors are an expanded feature, first introduced in Ed-Fi Data Standard v1.1, that is vital to accommodating the ways in which users of the Ed-Fi Data Standard need to refer to enumerated collections of values.
Descriptors are enumeration vocabularies that are not “fixed” within the XML schema, but are defined in XML files and linked to their source. Descriptors provide implementers with the flexibility to define their own enumerations. Key features of the Descriptor Pattern are:
- Descriptors minimally have a ShortDescription and CodeValue, and may also have a longer Description. Descriptors allow states and other implementers to continue to use the codes associated with their enumerations.
- To support changing enumerations or code sets, Descriptors have an EffectiveBeginDate and EffectiveEndDate that are typically aligned to school years.
- To better support longitudinal analysis, Descriptors may capture a PriorDescriptor, as appropriate, when codes may change for the same concept or category.
- While Descriptors allow ultimate flexibility for states and other implementers to determine their codes and enumerations, Descriptors may have “maps” back to a common reference terminology to support applications.
- For example, the AttendanceEventCategoryDescriptor allows states and other implementers to define their own attendance codes. The AttendanceEventCategoryMap crosswalks the codes back to a minimum set of In Attendance, Excused Absence, Unexcused Absence, Tardy, and Early Departure.
- Descriptors are linked to a “namespace” that defines their scope of use. Ideally, a state will publish an enumeration vocabulary or code list at a specific URL. The Namespace element of the Descriptor will contain this URL.
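The effective-date feature in the list above can be sketched as a simple date filter. Everything in this example is illustrative: the Descriptor class, the example.org namespace, the retired code “01”, and its date range are hypothetical, and treating EffectiveEndDate as inclusive is an assumption rather than something the standard states here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Hypothetical namespace used only for this illustration.
NAMESPACE = "http://example.org/graduation-plan-types"


@dataclass(frozen=True)
class Descriptor:
    code_value: str
    short_description: str
    namespace: str
    effective_begin: Optional[date] = None
    effective_end: Optional[date] = None


def is_effective(descriptor, on):
    """True if the descriptor is in effect on the given date.

    Assumption: both bounds are inclusive, and a missing bound is open-ended.
    """
    if descriptor.effective_begin is not None and on < descriptor.effective_begin:
        return False
    if descriptor.effective_end is not None and on > descriptor.effective_end:
        return False
    return True


def effective_codes(descriptors, on):
    """Code values of all descriptors in effect on the given date."""
    return [d.code_value for d in descriptors if is_effective(d, on)]


descriptors = [
    # Hypothetical retired code for the same concept.
    Descriptor("01", "Minimum", NAMESPACE, date(2000, 9, 1), date(2007, 9, 29)),
    # Open-ended code that replaces it.
    Descriptor("27", "Minimum", NAMESPACE, date(2007, 9, 30)),
]

print(effective_codes(descriptors, date(2005, 1, 15)))
print(effective_codes(descriptors, date(2008, 1, 15)))
```

A longitudinal query can use a filter like this to pick the code that applied in each school year, with PriorDescriptor linking the newer code back to the one it replaced.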
All Descriptors are an extension of the type DescriptorType, shown below:
<xs:complexType name="DescriptorType" abstract="true">
  . . .
  <xs:complexContent>
    <xs:extension base="ComplexObjectType">
      <xs:sequence>
        <xs:element name="CodeValue" type="CodeValue"> . . .
        <xs:element name="ShortDescription" type="ShortDescription"> . . .
        <xs:element name="Description" type="Description" minOccurs="0"> . . .
        <xs:element name="EffectiveBeginDate" type="xs:date" minOccurs="0"> . . .
        <xs:element name="EffectiveEndDate" type="xs:date" minOccurs="0"> . . .
        <xs:element name="PriorDescriptor" type="DescriptorReferenceType" minOccurs="0"> . . .
        <xs:element name="Namespace" type="URI"> . . .
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
Consider the example of PerformanceLevel for an Assessment. The performance levels are custom to each assessment (e.g., Met Standard, Commended, College Ready) and cannot be standardized. The PerformanceLevelDescriptor is shown below.
<xs:complexType name="PerformanceLevelDescriptor">
  . . .
  <xs:complexContent>
    <xs:extension base="DescriptorType">
      <xs:sequence>
        <xs:element name="PerformanceBaseConversion" type="PerformanceBaseConversionType" minOccurs="0"> . . .
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
The entity PerformanceLevelDescriptor holds the CodeValue and Description for each of the performance level enumerations specific to an assessment, as well as other important attributes.
All Descriptor references are an extension of DescriptorReferenceType, shown below.
<xs:complexType name="DescriptorReferenceType">
  . . .
  <xs:complexContent>
    <xs:extension base="ReferenceType">
      <xs:sequence>
        <xs:element name="CodeValue" type="CodeValue"> . . .
        <xs:element name="Namespace" type="URI" minOccurs="0"> . . .
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
The entity PerformanceLevelDescriptorReferenceType holds the CodeValue and Namespace for the specified performance level enumeration.
As an example, a state that wants to define its own graduation codes for GraduationPlanType would construct an XML file as follows:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<InterchangeDescriptors xmlns="http://ed-fi.org/0200"
    xmlns:ann="http://www.ed-fi.org/annotation"
    xsi:schemaLocation="http://ed-fi.org/0200 Interchange-Descriptors.xsd"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <GraduationPlanTypeDescriptor>
    <CodeValue>27</CodeValue>
    <ShortDescription>Minimum</ShortDescription>
    <Description>Minimum High School Program TAC Chapter 74, revised September 1, 2005; including TAC §89.1070(b)(2) for students receiving special education services, revised August 1, 2002. (for students who entered grade 9 in 2007-2008 and thereafter)</Description>
    <EffectiveBeginDate>2007-09-30</EffectiveBeginDate>
    <Namespace>http://ritter.tea.state.tx.us/weds/index.html</Namespace>
    <GraduationPlanTypeMap>Minimum</GraduationPlanTypeMap>
  </GraduationPlanTypeDescriptor>
  <GraduationPlanTypeDescriptor>
    <CodeValue>28</CodeValue>
    <ShortDescription>Recommended</ShortDescription>
    <Description>Recommended High School Program TAC Chapter 74, revised September 1, 2005; including TAC §89.1070(b)(2) for students receiving special education services, revised August 1, 2002. (for students who entered grade 9 in 2007-2008 and thereafter)</Description>
    <EffectiveBeginDate>2007-09-30</EffectiveBeginDate>
    <Namespace>http://ritter.tea.state.tx.us/weds/index.html</Namespace>
    <GraduationPlanTypeMap>Recommended</GraduationPlanTypeMap>
  </GraduationPlanTypeDescriptor>
  <GraduationPlanTypeDescriptor>
    <CodeValue>29</CodeValue>
    <ShortDescription>Distinguished</ShortDescription>
    <Description>Distinguished High School Program TAC Chapter 74, revised September 1, 2005; including TAC §89.1070(b)(2) for students receiving special education services, revised August 1, 2002. (for students who entered grade 9 in 2007-2008 and thereafter)</Description>
    <EffectiveBeginDate>2007-09-30</EffectiveBeginDate>
    <Namespace>http://ritter.tea.state.tx.us/weds/index.html</Namespace>
    <GraduationPlanTypeMap>Distinguished</GraduationPlanTypeMap>
  </GraduationPlanTypeDescriptor>
</InterchangeDescriptors>
The above example includes mappings to a GraduationPlanTypeMap enumeration.
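To show how a consumer might use those mappings, here is a minimal sketch that reads a descriptor file and builds a crosswalk from state codes to the GraduationPlanTypeMap values. The embedded sample is a trimmed copy of the file above (the optional Description elements and the XML declaration are left out), and the function name load_crosswalk is an assumption for this illustration.

```python
import xml.etree.ElementTree as ET

# Default namespace declared on the InterchangeDescriptors root element.
NS = {"edfi": "http://ed-fi.org/0200"}

# Trimmed copy of the example file; Description elements are omitted.
SAMPLE = """
<InterchangeDescriptors xmlns="http://ed-fi.org/0200">
  <GraduationPlanTypeDescriptor>
    <CodeValue>27</CodeValue>
    <ShortDescription>Minimum</ShortDescription>
    <EffectiveBeginDate>2007-09-30</EffectiveBeginDate>
    <Namespace>http://ritter.tea.state.tx.us/weds/index.html</Namespace>
    <GraduationPlanTypeMap>Minimum</GraduationPlanTypeMap>
  </GraduationPlanTypeDescriptor>
  <GraduationPlanTypeDescriptor>
    <CodeValue>28</CodeValue>
    <ShortDescription>Recommended</ShortDescription>
    <EffectiveBeginDate>2007-09-30</EffectiveBeginDate>
    <Namespace>http://ritter.tea.state.tx.us/weds/index.html</Namespace>
    <GraduationPlanTypeMap>Recommended</GraduationPlanTypeMap>
  </GraduationPlanTypeDescriptor>
  <GraduationPlanTypeDescriptor>
    <CodeValue>29</CodeValue>
    <ShortDescription>Distinguished</ShortDescription>
    <EffectiveBeginDate>2007-09-30</EffectiveBeginDate>
    <Namespace>http://ritter.tea.state.tx.us/weds/index.html</Namespace>
    <GraduationPlanTypeMap>Distinguished</GraduationPlanTypeMap>
  </GraduationPlanTypeDescriptor>
</InterchangeDescriptors>
"""


def load_crosswalk(xml_text):
    """Map (Namespace, CodeValue) pairs to their GraduationPlanTypeMap value."""
    root = ET.fromstring(xml_text)
    crosswalk = {}
    for descriptor in root.findall("edfi:GraduationPlanTypeDescriptor", NS):
        code = descriptor.findtext("edfi:CodeValue", namespaces=NS)
        namespace = descriptor.findtext("edfi:Namespace", namespaces=NS)
        mapped = descriptor.findtext("edfi:GraduationPlanTypeMap", namespaces=NS)
        crosswalk[(namespace, code)] = mapped
    return crosswalk


crosswalk = load_crosswalk(SAMPLE)
for (namespace, code), mapped in sorted(crosswalk.items()):
    print(code, "->", mapped)
```

Keying the crosswalk on the namespace and code together, rather than on the code alone, keeps identical codes from different publishers apart.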
Note that for Descriptors, the Namespace is required so the source for the Descriptor definition can be uniquely determined. An optional AsOfDate may be supplied to give temporal context to the Descriptor value reference.
Because each value/code for a Descriptor is defined with a namespace, the controlled vocabulary may be defined by combining values from more than one namespace. This allows a state to add codes to a federally defined set or to combine two vocabularies from different contexts. For example, two code lists might be combined for the DisabilityDescriptor: the Individuals with Disabilities Education Act (IDEA) set of disabilities and the Section 504 set of disabilities, with each set referencing a different namespace.
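The combination described above can be sketched as a lookup keyed by namespace and code. The namespaces, codes, and labels below are all placeholders invented for this illustration; they are not real IDEA or Section 504 code lists.

```python
# Hypothetical namespaces; a real publisher would use its own URLs.
IDEA_NS = "http://example.org/idea-disabilities"
SECTION_504_NS = "http://example.org/section-504-disabilities"

# Placeholder code lists, not actual published vocabularies.
idea_codes = {
    "01": "Placeholder IDEA category A",
    "02": "Placeholder IDEA category B",
}
section_504_codes = {
    "01": "Placeholder Section 504 category A",
}


def combine(*vocabularies):
    """Merge (namespace, codes) pairs into one vocabulary keyed by (namespace, code)."""
    combined = {}
    for namespace, codes in vocabularies:
        for code, label in codes.items():
            combined[(namespace, code)] = label
    return combined


disability_vocabulary = combine(
    (IDEA_NS, idea_codes),
    (SECTION_504_NS, section_504_codes),
)

# The code "01" appears in both lists but stays distinct per namespace.
print(disability_vocabulary[(IDEA_NS, "01")])
print(disability_vocabulary[(SECTION_504_NS, "01")])
```

Because the namespace is part of the key, two lists can reuse the same code values without colliding in the combined vocabulary.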