Ed-Fi Validation API Design Rev1
Further Info
Note that this design is a revision to Ed-Fi Validation API Design.
Introduction
This document has been prepared as part of a larger initiative that is looking at scalable, economical, and reusable solutions for level 2 validations. For further context, architecture, and vocabulary, refer to the associated Ed-Fi Data Validation Architecture document. This work also builds heavily upon the work on Ed-Fi validations that was presented at the 2018 Ed-Fi Technical Congress by Vinaya Maya, Software Development Lead – Ed-Fi, and Britto Augustine, Chief Technology Officer – Arizona Dept. of Education, as seen in this presentation.
The purpose of this document is to define an approach that will enable systems that are currently submitting data to an Ed-Fi ODS/API to consume the level 2 validation results that are associated with the data provided. A common use case would be a scenario in which a Student Information System submitting one district's data to a statewide Ed-Fi implementation would be able to read the validation errors associated with that district's student information data. From here, it could display those errors back to the district users who are responsible for researching and correcting the data, without requiring the district user to interface with a different dashboard or data validation system.
This document proposes a data structure for the validation results and lists out some of the initial details for API functionality. The complete list of API requirements and detailed technical implementation details for the actual API are beyond the scope of this document.
Major Changes from Previous Version
- A ValidationRule endpoint was added to capture rules and to let API clients reference them via the API. This endpoint will de facto be required in order to allow an API client to understand the severity and categorization of issues.
- Normalized by moving Severity and Category from ValidationResult to ValidationRule.
- ValidationRule.Severity was changed from a string datatype to a descriptor reference.
- EducationOrganization identifiers were changed to be references; this is consistent with the main data API pattern, and is important to allowing the Validation API to be implemented as an API extension.
- Repeated use of the term "Validation" on API resource fields was dropped to shorten element naming.
- ValidationRun.RunStatus resource was changed from a string to a descriptor reference.
Data Validation Structure
The data validation results (as defined in the Ed-Fi Validation Architecture) will be composed of four resources: "rule collection", "validation rule", "validation rule run", and "validation result". The last three are defined in the tables below.
Validation Rule
DATA ELEMENT | DATA TYPE / OPTIONALITY | REVISION | DESCRIPTION |
---|---|---|---|
RuleIdentifier* | STRING MANDATORY (IDENTITY) | New | This is the unique Id for a validation rule. |
RuleSource | STRING MANDATORY (IDENTITY) | New | The source or origin of the rule. |
HelpUrl | STRING OPTIONAL | New | A link to more information about the rule and how to resolve it. |
ShortDescription | STRING OPTIONAL | New | This is non-structured ASCII text that will include the short details that were used in the evaluation of the validation rule. |
Description | STRING MANDATORY | New | This is non-structured ASCII text that will include the details that were used in the evaluation of the validation rule. |
RuleStatus | Restricted-list STRING MANDATORY | New | The current status of the rule. Examples are “Active”, “Under Analysis”, “Inactive”, “Deprecated”. |
Category | STRING OPTIONAL | New | This is a category for the type of validation rule. Examples might be 'Student Demographics', 'Special Education', or 'Attendance' |
Severity | DESCRIPTOR | New | This specifies whether the validation rule is a 'Warning', 'Minor Validation Error', 'Major Validation Error' or other value standardized by the API |
ExternalRuleId | STRING OPTIONAL | New | Refers back to a unique identifier for this rule in another system (such as a state-maintained repository of validation rules) |
ValidationLogicType | DESCRIPTOR OPTIONAL | New | Specifies the language that the validation logic is represented in, e.g., SQL or pseudo-code. |
ValidationLogic | STRING OPTIONAL | New | Contains the actual code or pseudo-code that is used to find validation errors. |
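
As an illustration only, the Validation Rule elements above could be modeled as follows. This is a minimal sketch, not part of the design: the Python field names, the choice to treat Severity as a required plain string (the table does not state its optionality), and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class ValidationRule:
    # Identity fields (marked IDENTITY in the table above).
    rule_identifier: str
    rule_source: str
    # Mandatory non-identity fields.
    description: str
    rule_status: str  # e.g. "Active", "Under Analysis", "Inactive", "Deprecated"
    # Descriptor in the design; modeled as a plain string here (assumption).
    severity: str
    # Optional fields.
    help_url: Optional[str] = None
    short_description: Optional[str] = None
    category: Optional[str] = None
    external_rule_id: Optional[str] = None
    validation_logic_type: Optional[str] = None
    validation_logic: Optional[str] = None

    def key(self) -> Tuple[str, str]:
        """Return the composite identity (RuleIdentifier, RuleSource)."""
        return (self.rule_identifier, self.rule_source)


# Hypothetical rule; every value below is invented for illustration.
example_rule = ValidationRule(
    rule_identifier="STU-001",
    rule_source="ExampleSEA",
    description="Student birth date must not be in the future.",
    rule_status="Active",
    severity="Warning",
    category="Student Demographics",
)
```

Keeping Severity and Category on the rule rather than on each result mirrors the normalization described under the major changes: a client looks them up once per rule instead of reading them on every result.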
Validation Rule Run
This element will track the runs of the validation rules. The expectation is that a record is written to this table, with a status of 'Running', before any validation results are produced.
DATA ELEMENT | DATA TYPE / OPTIONALITY | REVISION | DESCRIPTION |
---|---|---|---|
RunIdentifier* | STRING (IDENTITY) | Renamed | This is a unique Id for each run |
RunStartDateTime | DATETIME MANDATORY | Renamed | This is the time that the validation run was started. |
RunFinishDateTime | DATETIME OPTIONAL | Renamed | This is the time the validation run finished. |
RunStatus | DESCRIPTOR MANDATORY | Renamed | This will denote the status of the validation run. Possible values include 'Running', 'Finished', 'Stopped-manual', 'Stopped-Error'. |
Host | STRING OPTIONAL | New | The name of the Host or ODS that was evaluated in this run |
ValidationEngine | STRING OPTIONAL | New | A reference to the validation engine that was responsible for this run |
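
The run lifecycle described above (record the run as 'Running' first, then close it out) might look like the following sketch. The class and the `start_run` and `finish_run` helpers are hypothetical names introduced for this example, and RunStatus is modeled as a plain string rather than a descriptor reference.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

# Status values taken from the RunStatus row above.
RUN_STATUSES = {"Running", "Finished", "Stopped-manual", "Stopped-Error"}


@dataclass
class ValidationRuleRun:
    run_identifier: str
    run_start_date_time: datetime
    run_status: str = "Running"
    run_finish_date_time: Optional[datetime] = None
    host: Optional[str] = None
    validation_engine: Optional[str] = None


def start_run(run_identifier: str, host: Optional[str] = None,
              validation_engine: Optional[str] = None) -> ValidationRuleRun:
    """Create the run record before any results are produced."""
    return ValidationRuleRun(
        run_identifier=run_identifier,
        run_start_date_time=datetime.now(timezone.utc),
        host=host,
        validation_engine=validation_engine,
    )


def finish_run(run: ValidationRuleRun, status: str = "Finished") -> ValidationRuleRun:
    """Mark the run as ended with a terminal status and a finish time."""
    if status not in RUN_STATUSES or status == "Running":
        raise ValueError(f"not a terminal run status: {status!r}")
    run.run_status = status
    run.run_finish_date_time = datetime.now(timezone.utc)
    return run
```

A validation engine would call `start_run` once, write its results with a reference to that run, and then call `finish_run` with 'Finished' or one of the 'Stopped' values.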
Validation Results
These are the actual results from the validation rule.
DATA ELEMENT | DATA TYPE / OPTIONALITY | REVISION | DESCRIPTION |
---|---|---|---|
ResultIdentifier* | STRING MANDATORY | Renamed | This is a unique id. |
ValidationRuleRunReference | REFERENCE OPTIONAL | Renamed | This refers (foreign key) back up to the validation rule run. |
ValidationRuleReference | REFERENCE MANDATORY | Renamed | This is a unique id that points back to the validation rule that caused the result to be produced. If a validation rule caused multiple results (for example multiple students with the same condition) they would share this id. This is part of the validation result signature. |
ResourceId | Ed-Fi Resource Id OPTIONAL | Renamed | This is the unique identifier in the ODS that is used to reference a specific resource. Examples include StudentUniqueId or EducationOrganizationId. This is part of the validation result signature. |
ResourceType | Ed-Fi Resource OPTIONAL | Renamed | This is the resource associated with the validation rule. This is denormalized from the validation rule; every instance of a given RuleId will have the same Ed-Fi resource. |
EducationOrganizationReference | REFERENCE MANDATORY | Renamed and Datatype changed | Along with NameSpace, this is useful for limiting which systems can consume the validation results and for routing the validation results within the consuming system. As a reference, this JSON will follow this format: "educationOrganizationReference": { |
StudentReference | REFERENCE OPTIONAL | New | Reference back to an EdFi student object, when applicable for that validation |
StaffReference | REFERENCE OPTIONAL | New | Reference back to an EdFi staff object, when applicable for that validation |
NameSpace | STRING OPTIONAL | - | Along with EducationOrganization, this can be used for limiting what systems can consume the validation results and routing the validation results within the consuming system. |
AdditionalContext | Array of name/value pairs OPTIONAL | Renamed and Datatype changed | Includes the details that were used in the evaluation of the validation rule. |
Validation Result Signature
De-duplication by API Client
A validation result will include the RuleIdentifier, which references back to the rule that caused it, and the impacted resources, as identified by the EducationOrganization reference, student reference, staff reference, and the ResourceId, which will uniquely identify the Ed-Fi resource that is identified in the validation result. If the underlying data is not corrected, then the next time the validation rule runs, it will flag the same resource as having the same problem.
The system that is consuming the APIs may need to know that this is just an updated report on the same issue and not a new issue, especially if that issue has somehow been acknowledged or suppressed in that consuming system. The combination of the rule reference (RuleIdentifier) and ResourceId, as marked in the table above, forms the signature that can be used by that consuming system for implementing logic for deduplication and acknowledgement. The requirements of that logic are beyond the scope of this document.
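The signature idea can be sketched on the consumer side as follows. This is one possible approach under stated assumptions, not the required logic: results are represented as plain dictionaries, and the keys `ruleIdentifier` and `resourceId` are hypothetical stand-ins for however the API ultimately serializes the rule reference and ResourceId.

```python
from typing import Dict, Iterable, List, Optional, Set, Tuple

# (rule identifier, resource id); ResourceId is optional in the table above.
Signature = Tuple[str, Optional[str]]


def signature(result: Dict) -> Signature:
    """Build the result signature from the rule reference and ResourceId."""
    return (result["ruleIdentifier"], result.get("resourceId"))


def new_issues(results: Iterable[Dict],
               acknowledged: Set[Signature]) -> List[Dict]:
    """Drop results already acknowledged, and repeats within this batch."""
    seen: Set[Signature] = set()
    fresh: List[Dict] = []
    for result in results:
        sig = signature(result)
        if sig in acknowledged or sig in seen:
            continue
        seen.add(sig)
        fresh.append(result)
    return fresh


# Hypothetical data: the second run re-reports the same problem for S-100.
batch = [
    {"ruleIdentifier": "STU-001", "resourceId": "S-100"},
    {"ruleIdentifier": "STU-001", "resourceId": "S-100"},
    {"ruleIdentifier": "STU-001", "resourceId": "S-200"},
]
acknowledged = {("STU-001", "S-200")}
remaining = new_issues(batch, acknowledged)
```

Here `remaining` holds a single entry for S-100: the repeat is collapsed and the acknowledged issue for S-200 is suppressed.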
Validation Results API Requirements
The following are the proposed, initial requirements for the validation results API:
- There must be a repository for validation results.
- The validation results repository should account for all of the data elements, their associated data types, and their associated optionality as described in the data validation structure section above.
- The validation results repository should be stored in a way that it can intuitively be queried by administrators with the appropriate level of database access.
- The validation results repository should have a mechanism that will prevent partial reads of validation results. In other words, the timestamps on the validation results must be sequential, so that a validation result never becomes available in the API with an older timestamp than validation results that are already available via the API.
- Validation results must be made available via a web-based pull API similar to other Ed-Fi resources.
- Validation results must be available to be pulled by any system that conforms to a published API.
- The validation results must include either Namespace or EducationOrganizationId. This is to enable the consuming system to route the validation result to the correct end user (e.g., a specific school in a SIS).
- The consumer API should reuse the data-level security mechanism that is used via the existing Ed-Fi ODS (the details of this are unresolved, see the 'open issues' section of the architecture document for more discussion).
- Validation results must have the ability to be submitted via a bulk database API.
- The validation results submittal API should work with bulk SQL statements from a variety of validation rules engines.
- The validation engine must have a back-door method for viewing validation results (e.g., database access).
- The validation result API must have a mechanism for handling incoming validation errors from systems other than the validation rule engine.
- The validation result API must be able to handle validation results that are not directly related to Ed-Fi ODS data.
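
The partial-read requirement in the list above can be illustrated with a small in-memory sketch. It assumes results are stamped when they are published rather than when they are evaluated, and it uses a hypothetical integer `sequence` field in place of a timestamp; `ResultRepository`, `publish`, and `read_since` are names invented for this example. A real repository would also need to publish each batch atomically, which this single-threaded sketch does not address.

```python
import itertools
from typing import Dict, Iterable, List


class ResultRepository:
    """Toy repository that exposes results in strictly increasing order."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self._published: List[Dict] = []

    def publish(self, batch: Iterable[Dict]) -> None:
        # Stamp at publish time so a newly visible result can never carry
        # a marker older than one a client has already read.
        for result in batch:
            stamped = dict(result, sequence=next(self._counter))
            self._published.append(stamped)

    def read_since(self, watermark: int) -> List[Dict]:
        """Return results published after the client's last-seen marker."""
        return [r for r in self._published if r["sequence"] > watermark]


# Hypothetical polling client that remembers the highest marker it has seen.
repo = ResultRepository()
repo.publish([{"ruleIdentifier": "STU-001"}, {"ruleIdentifier": "STU-002"}])
first_page = repo.read_since(0)
watermark = max(r["sequence"] for r in first_page)
repo.publish([{"ruleIdentifier": "STU-003"}])
second_page = repo.read_since(watermark)
```

Because the marker only grows, the second poll returns just the newly published result and the client never has to re-scan earlier data to find results it missed.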
Validation Consumer Requirements
The following is a preliminary, incomplete list of functionality that would need to be enabled by the validation result consumer system (hereafter called 'consumer') responsible for pulling validation results from the above-mentioned API. The consumer is most likely to be a Student Information System (SIS).
- The consumer must be able to talk to the validation API.
- The consumer should recognize when validation results with the same unique signature as described above are duplicates and handle them according to the logic specific for that implementation.
- The consumer should have the ability to route validation results to the appropriate end user for resolution.
- The consumer should have a method for a user to 'acknowledge' a known issue so that future validation results with the same signature are suppressed. The detailed requirements for this are beyond the scope of this document.
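
The routing requirement in the list above could be approached as in the sketch below, which groups results into per-recipient queues. It is an illustration under assumptions: results are plain dictionaries, and the flat keys `educationOrganizationId` and `namespace` are hypothetical simplifications of the EducationOrganizationReference and NameSpace elements. The `"unrouted"` bucket is likewise an invented fallback.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def route_results(results: Iterable[Dict]) -> Dict[str, List[Dict]]:
    """Group results by education organization, falling back to namespace."""
    queues: Dict[str, List[Dict]] = defaultdict(list)
    for result in results:
        key = result.get("educationOrganizationId")
        if key is None:
            key = result.get("namespace")
        if key is None:
            key = "unrouted"  # left for an administrator to triage
        queues[str(key)].append(result)
    return dict(queues)


# Hypothetical results for two schools plus one identified only by namespace.
routed = route_results([
    {"ruleIdentifier": "STU-001", "educationOrganizationId": 1001},
    {"ruleIdentifier": "ATT-004", "educationOrganizationId": 1002},
    {"ruleIdentifier": "STU-001", "educationOrganizationId": 1001},
    {"ruleIdentifier": "SYS-009", "namespace": "uri://example.org"},
])
```

In a SIS, each key would map to the users responsible for that school or district, so they see only the validation results for their own data.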