Person Model
- Steven Arnold
Person
This section describes the introduction of a Person entity into the Ed-Fi data model, provides background and rationale, and provides guidance for its application in the field.
Throughout this section we use the following terminology:
- "Person" refers to the Person entity. Other uses will be quoted and explained.
- The term "individual(s)" refers generally to human beings in the real world.
- "Person-role" will refer to specific Ed-Fi entities such as Student, Parent, and Staff.
Background
The initial Ed-Fi data model was heavily influenced by the use case of State Education Agencies (SEAs) requiring accountability reporting from Local Education Agencies (LEAs). In this use case, granular data is loaded from a Student Information System (SIS) into the Ed-Fi ODS, from which the accountability reports are derived. The objective was to reduce the burden on districts of compiling these reports through standardization.
The Accountability use case required a large amount of manual effort on the part of LEAs to cleanse their data because of poor data quality. This drove the Ed-Fi data model to be a natural key model where cardinality, data types and referential integrity are enforced upon entry into the ODS. Moreover, the data model was designed to be highly semantic so as to reflect the specialized terminology of education data in the field.
At the time, Student and Staff were the primary set of individuals, or actors, of interest in the domain, with Parent coming later. In their role as actors, Students and Staff have unique attributes, associations, and networks of entities that are specific to their person-roles. Moreover, the practice at the time was to have different systems of identification — Students would have one set of unique identifiers and Staff would have another. Student identifiers were most often assigned by SEAs. Staff identifiers were most often assigned by another human resources (HR) system, as depicted in the diagram below.
Figure 1. Typical early SEA configuration
At the time, there was some discussion whether or not to introduce a Person entity with Student and Staff inheriting the attributes and associations of Person. This was dismissed for several reasons:
- The relationship between Person and person-roles is NOT one of inheritance, because person-roles may not be mutually exclusive. For example, a staff member may also be a parent.
- Unlike the person-roles Student and Staff, Person did not have associations to other entities that belonged at that level in the use cases at that time. While it was possible for a K–12 Student to also be K–12 Staff, this was a rare occurrence and resolving that across two different identification systems was deemed "not worth the effort."
- While there are frequent cases where Staff are also Parent, the two person-roles were similarly distinct in the source systems with little operational or analytic need to link them.
- Early SEA partners indicated a preference for keeping Student and Staff separate without a Person abstraction, since this matched their operational processes and those of the "systems of record."
The early Ed-Fi model was developed to stay as close to the Common Education Data Standards (CEDS) as possible. CEDS defines supertypes for Person, Organization, and Roles. In the CEDS model, Persons have Roles associated with an Organization for a period of time. This structure solves two specific data modeling issues:
- It allows a unified Person to have multiple roles, where Ed-Fi segmented these roles into Student, Staff, and Parent.
- It unifies the Person-role entities from different domains (e.g., K12School and PSInstitution) as subtypes of a common supertype Person.
In the Ed-Fi data model, associations may be made to the superclass to denote that the association is valid for any subclass, or may be made specifically to a subclass, indicating that it is not valid for the other subclasses. In the CEDS model, all associations are made to the supertype Person. Subtypes are defined only to "reuse" the attributes of the supertype. This ran counter to Ed-Fi’s desire for a data model that enforced a higher level of data quality, for example, one that would not allow a Student to erroneously be associated as a Teacher for a section.
New Drivers for Person
The Ed-Fi data model of disconnected person-roles Student, Staff, and Parent has generally served the K–12 community well. However, there are new drivers to introduce a Person, as follows:
- SEAs are introducing person ID systems to assign unique IDs to an individual regardless of their role. These person ID systems are becoming more common, with both commercial and open source offerings available.
- New longitudinal analytic use cases are emerging to link an individual's data over long periods of time from birth through career, which will require a common resolution of their identity as a Person.
- To that end, the Ed-Fi model is being stretched and extended from its K–12 roots, to include early childhood, postsecondary, and career data.
- New use cases, specifically the Teacher Preparation Data Model (TPDM), have surfaced situations where a single individual can have multiple person-roles simultaneously (see TPDM discussion below).
- New integration scenarios have identified data sources that provide data related to individuals that can span many person-roles (e.g., Surveys). Rather than have many different associations with the several person-roles, a single association to a Person entity becomes attractive.
Figure 2. SEA configuration with a person ID system
TPDM Overview
The Teacher Preparation Data Model (TPDM) is an extension of the Ed-Fi Unifying Data Model created to fill the nationwide need for storing data on an educator's full career. The full-career lifecycle runs from pre-enrollment activities, enrollment in an educator preparation program, program activities and participation, through program participants' K–12 student outcomes. The TPDM use case is meant to enable Educator Preparation Programs (EPPs) to base program improvements on how their graduates perform in the classroom.
See /wiki/spaces/TPDMX/pages/19202146 for more information.
Specifically relevant to the Person discussion, TPDM has identified Candidate as another person-role of interest. A typical life cycle for an individual pursuing a teaching career is as follows:
- An individual is enrolled in a university degree program, becoming a Student of the university.
- After core coursework, the individual is accepted into an Educator Preparation Program, becoming a Candidate while remaining a Student of the university.
- After education instruction by the EPP, the individual is hired by a district as a student-teacher. Thus, the individual could simultaneously be a Student, Candidate, and Staff.
This is depicted in the diagram below:
Figure 3. Example TPDM scenario
Data associated with the individual as a university Student are linked to Student, and data associated with their other person-roles are likewise linked to Candidate and Staff. The structure of domain data associated with the various person-roles is relatively well-defined. However, the TPDM use case clearly requires these person-roles to be linked by a Person entity.
There are also activities (e.g., applications, surveys, professional development) that can span different person-roles (see Figure 4). Other entities, like Credential, are long-lived and relevant to the person-role Staff even though they may have been obtained in the person-role of Candidate. Both situations lend themselves to associations to Person rather than to an individual person-role.
In addition to having person-roles that an individual participates in simultaneously, the TPDM use case has a more diverse data set and a more diverse set of source systems, as depicted below. The source systems do not belong to the same organization and therefore are unlikely to have a common person ID system. In addition, a large amount of data comes from supporting applications that may use a variety of IDs.
Figure 4. Sample TPDM configuration
Ed-Fi Person Model
The Ed-Fi Person model in UML is shown below. A new Person entity is introduced, as follows:
- The Person entity has a key of a PersonID plus a SourceSystem (descriptor). This allows more than one person ID system to be used for different sets of individuals, if appropriate. Because SourceSystem is a descriptor, the set of permitted PersonID sources is controlled.
- Optional association references to Person are made from all person-roles: Staff, Student, and Parent, plus Candidate in TPDM.
Figure 5. Ed-Fi Person model
This model makes the use of Person entirely optional and supports backward compatibility with existing use of person-roles, whether Person is used or not.
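The shape of this model can be sketched in a few lines of Python. This is only an illustration of the composite key and the optional references, not the Ed-Fi schema itself: the class and field names (PersonKey, person, student_unique_id, staff_unique_id) and the source system values are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PersonKey:
    """Composite key: a PersonID plus the system that issued it."""
    person_id: str
    source_system: str  # stands in for the SourceSystem descriptor value


@dataclass
class Student:
    student_unique_id: str              # hypothetical field name
    person: Optional[PersonKey] = None  # optional reference to Person


@dataclass
class Staff:
    staff_unique_id: str                # hypothetical field name
    person: Optional[PersonKey] = None  # optional reference to Person


# The same PersonID issued by two different systems gives two distinct keys.
sea_key = PersonKey("1001", "StateIdSystem")
cert_key = PersonKey("1001", "CertificationSystem")

# One individual holding two person-roles at once, both linked to one Person.
student = Student("S-42", person=sea_key)
staff = Staff("T-7", person=sea_key)

# A person-role that never adopts Person remains valid (backward compatible).
legacy_student = Student("S-99")
```

Because the reference defaults to None, existing person-role data needs no change when Person is not in use.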
From the API, Person is an addressable resource (/persons) supported by CRUD operations. The API continues to support person-roles as before (/students, /staffs, /parents).
Note: The Ed-Fi model has no special support for Person and relies on API operations to create Person resources and link person-roles to it.
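As a rough sketch of what those API operations might carry, the snippet below builds (but does not send) two request bodies: one that creates a Person and one that creates a Student linked to it. The resource paths come from the text above. The JSON property names (personId, sourceSystemDescriptor, personReference, studentUniqueId) and the descriptor URI are assumptions for illustration and should be checked against the API metadata of the deployed ODS / API version.

```python
import json


def person_body(person_id, source_system_descriptor):
    """Build the body for creating a Person (property names are assumed)."""
    return {
        "personId": person_id,
        "sourceSystemDescriptor": source_system_descriptor,
    }


def with_person_reference(role_body, person):
    """Return a copy of a person-role body with an optional link to a Person."""
    linked = dict(role_body)
    linked["personReference"] = dict(person)  # assumed property name
    return linked


# Hypothetical descriptor URI and identifiers, for illustration only.
person = person_body("1001", "uri://example.org/SourceSystemDescriptor#StateIdSystem")
student = {"studentUniqueId": "S-42"}

requests_to_send = [
    ("POST", "/persons", person),
    ("POST", "/students", with_person_reference(student, person)),
]

for method, path, body in requests_to_send:
    print(method, path, json.dumps(body))
```

Creating the Person first and then referencing it from the person-role mirrors the note above: the linkage is established by the caller through ordinary API operations.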
Using the Person Model
The Person model is designed to work best with a person ID system (or systems), though such a system is not required. The use and orchestration of a person ID system is left up to the owner/administrator of the Ed-Fi ODS / API, who must accomplish the following activities:
- Create a new Person when a new unique individual becomes known to the system and assign a new unique PersonID. This requires querying the personally-identifiable information of existing Persons to determine if a Person is already known to the system — a function typically provided by person ID systems.
- Link person-roles (i.e., Student, Staff, Parent) to the correct Person.
Two possible scenarios of using a person ID system are as follows:
- Today, person ID systems are used by systems of record (or individuals using them) to obtain PersonIDs. Those PersonIDs are used by these systems (primarily SIS and HR) as the appropriate student or staff IDs and thus would be the IDs written by those systems into the Ed-Fi ODS / API as the IDs for the person-roles. The organization administering the Ed-Fi ODS must then determine how to create the Person and link it to the person-roles with the same ID, perhaps through a process that uses the API or the backend ODS. This approach would be easily applied by SEAs with existing person ID systems.
- The other alternative is to integrate a person ID system into the Ed-Fi ODS via a process that identifies newly created person-roles, queries the person ID system for a PersonID, creates a Person entity if needed, and links the person-role to Person. That process 1) could be written using the API change queries to identify new person-role instances, 2) could use database triggers and stored procedures, or 3) could be a batch process using database date-timestamps.
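The second scenario can be sketched as a small batch routine. Everything here is a simplified stand-in under stated assumptions: InMemoryPersonIdSystem is a hypothetical placeholder for a real person ID system (which would typically use far more robust matching than exact name and birth date), the records are made-up sample data, and how new person-roles are detected (change queries, triggers, or timestamps) is left outside the sketch.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

PersonKey = Tuple[str, str]  # (PersonID, SourceSystem)
SOURCE_SYSTEM = "ExamplePersonIdSystem"  # hypothetical descriptor value


@dataclass
class RoleRecord:
    role: str        # e.g. "Student", "Staff", "Parent"
    role_id: str
    name: str
    birth_date: str
    person_key: Optional[PersonKey] = None


class InMemoryPersonIdSystem:
    """Stand-in for a person ID system; matches on exact name and birth date."""

    def __init__(self) -> None:
        self._ids: Dict[Tuple[str, str], str] = {}

    def resolve(self, name: str, birth_date: str) -> str:
        """Return the existing ID for a known individual, or issue a new one."""
        match_key = (name.strip().lower(), birth_date)
        if match_key not in self._ids:
            self._ids[match_key] = f"P{len(self._ids) + 1:04d}"
        return self._ids[match_key]


def link_new_roles(new_roles: List[RoleRecord],
                   id_system: InMemoryPersonIdSystem,
                   persons: Set[PersonKey]) -> int:
    """Link each unlinked person-role to a Person; return the number of Persons created."""
    created = 0
    for record in new_roles:
        if record.person_key is not None:
            continue  # already linked, nothing to do
        key = (id_system.resolve(record.name, record.birth_date), SOURCE_SYSTEM)
        if key not in persons:
            persons.add(key)  # create the Person only when it is not yet known
            created += 1
        record.person_key = key  # link the person-role to the Person
    return created


# Made-up sample batch: the first two records describe the same individual.
persons: Set[PersonKey] = set()
batch = [
    RoleRecord("Student", "S-42", "Ana Lopez", "2001-05-17"),
    RoleRecord("Staff", "T-7", "Ana Lopez", "2001-05-17"),
    RoleRecord("Parent", "G-3", "Luis Ortiz", "1975-11-02"),
]
created = link_new_roles(batch, InMemoryPersonIdSystem(), persons)
print(created, sorted(persons))
```

The routine is idempotent for records that are already linked, which suits any of the three triggering mechanisms listed above, since each may present the same person-role more than once.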
It may also be possible to use Person without a person ID system if there is an appropriate surrogate ID that can effectively serve as a Person ID. For example, person IDs assigned by Certification systems have been identified as a possibility for the TPDM use case.
Extending the Person Model
As with any other entity in the Ed-Fi data model, the Person entity may be used in extensions as follows:
- Additional attributes may be added to Person as an extension, for example to better effect the integration of a person ID system.
- Associations to Person can be created from other entities, such as a Survey, that are best related to the generic Person rather than to a specific person-role.
- New person-roles (such as the TPDM Candidate) should follow the established pattern by including an optional association reference to Person.