...
Goals
The Data Import (DI) SIG had 3 goals:
Define pain points that Managed Providers experience in implementing Ed-Fi technology and making it available to community members as productized offerings.
Provide input into and feedback on the Alliance’s products and roadmap, especially as it relates to deployment, automation, security, management, and other priority concerns.
Discuss ideas and opportunities to unblock managed providers in ways that help them scale their offerings.
Understand and define pain points that implementations experience while using Data Import as an Extract-Transform-Load (ETL) solution for non-API based data.
Understand other solutions that may be available for reliable ETL pathways for non-API based data.
Inform Data Import future development and roadmap priorities from community blockers, issues and identified needs
Outcomes
The outcomes of the SIG from these goals are:
1.)
...
Managed Service Providers (MSPs) were consistent in endorsing automation technology and approaches as the way for their organizations to reach scale effectively. MSPs use automation across their core IT operations, including deployments, configuration, system health checks and security scans. Automation, paired with flexible, on-demand resources from cloud providers (Azure, Amazon Web Services and Google Cloud Platform), allows MSPs to accommodate demand in ways that traditional on-premises installations cannot.
Members of the MSP SIG prompted new updates to the Ed-Fi roadmap, such as the Admin API and Single-Sign On for Ed-Fi Tools technical designs, which were created and reviewed as part of the SIG. The Admin API is a new product on the Ed-Fi roadmap that will allow the Ed-Fi ODS/API platform to be managed through API interfaces, something that today can only be done through system scripting tasks or user interfaces like Admin App. The SIG confirmed that today's approaches are a burden when deploying Ed-Fi solutions and that API-based approaches are preferred. Single-Sign On requests led to the roadmap addition of support for OpenID Connect as an additional authentication strategy, which is preferred in cloud and MSP solutions because it integrates with the existing identity stores in those environments.
Also within this SIG, MSP members shared their best practices and internal projects for Ed-Fi automation. Reviewing these was helpful both for sharing practices across the MSP group and for confirming roadmap deliverables.
From these conversations in the SIG (and related in-person meetings like Tech Congress), Ed-Fi is well positioned to keep listening to the active MSP community, both on adoption of roadmap deliverables and on ideas for valuable future roadmap deliverables.
2.) Ed-Fi Can Improve on Security Outreach and Coordination with MSP Community
The MSP group provided a strong signal that they wanted to hear more about Ed-Fi and its activities on security. Security is a difficult, nuanced topic for any organization to manage in public, as some details need to be held tightly within trusted groups to prevent wider attacks from the public and/or bad actors. The Ed-Fi Tech Team presented a deeper look into the security work we do, including code reviews, automated security scans from GitHub and the annual security reviews we do on ODS/API and Admin App. The MSP group was pleased and reassured once we disclosed the steps Ed-Fi takes to deliver secure solutions for our stakeholders. However, we also learned that Ed-Fi can improve its outreach and coordination on security topics that are relevant to MSPs' customer work in the field.
3.) Ed-Fi Technology is Core to Managed Service Providers
From this MSP SIG, we've learned that Ed-Fi technology delivers on core values important to these partners' businesses and supports a greater mission of K-12 data interoperability. We've heard, directly and indirectly, enthusiasm and passion for Ed-Fi's work (its technology deliverables, its convening ability across numerous related communities and other aspects), and many identified Ed-Fi as a core component of partner businesses and the work they deliver.
4.) Other Product Roadmap Features
...
Data Import is reported as serving non-API ready data needs with an active community of users
The DI SIG shed light on many aspects of how Data Import is used in the field today to incorporate non-API ready data into Ed-Fi data infrastructure. With an active SIG of more than 40 members contributing to the forum, we've learned that the tool is serving needs for non-API ready data. From these conversations, we've learned that Data Import is in active use in the assessment, educator preparation program (EPP), finance and other data domains where API pathways do not exist. The conversations also recognized that Data Import carries a maintenance burden for the implementer, which is weighed against the need to import such data into Ed-Fi environments. It is likewise recognized that direct API connections from the products that produce education data are ideal and preferred, since they relieve the maintenance burden of running additional ETL solutions.
2.) Numerous viable open-source and low-cost alternatives exist for serving the ETL need
The DI SIG reviewed a number of alternatives to Data Import, as the education and general technology markets have tools and products to serve needs for extracting, transforming and loading data of many types. As a result of this review, the forum has discovered and discussed numerous viable alternatives which can also serve loading of non-API data into Ed-Fi environments.
Education Analytics
Education Analytics is an organization that serves education agencies with a range of analytics solutions aimed at improving student outcomes. The team uses Ed-Fi technology in many of its solutions and has deep knowledge of Ed-Fi's data model, ODS / API and other facets. They have built an open-source toolkit to transform and load data into the Ed-Fi API. The technology is Python-based and is reported to fit well within cloud environments.
Details of this toolkit are below, with links to GitHub for the source code and documentation of each component:
- Earthmover - CLI tool for transforming collections of tabular source data into a variety of text-based data formats via YAML configuration and Jinja templates.
- Lightbeam - CLI tool for validating and transmitting payloads from JSONL files into an Ed-Fi API.
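The two-stage flow described above (render tabular rows into JSONL payloads, then validate the payloads before sending them) can be sketched in plain Python. This is only an illustration of the general pattern, not Earthmover or Lightbeam code or configuration: it uses the standard library's `string.Template` as a stand-in for Jinja templates, the CSV column names and payload field names are hypothetical, and the transmission step to an Ed-Fi API is deliberately left out.

```python
import csv
import io
import json
from string import Template

# Hypothetical payload template; the field names are illustrative only.
STUDENT_TEMPLATE = Template(
    '{"studentUniqueId": $student_id, '
    '"firstName": $first_name, '
    '"lastSurname": $last_name}'
)

# Hypothetical set of fields every payload must carry.
REQUIRED_FIELDS = ("studentUniqueId", "firstName", "lastSurname")

# Hypothetical source data; the second row has an empty last name.
SAMPLE_CSV = (
    "student_id,first_name,last_name\n"
    "1001,Ada,Lovelace\n"
    "1002,Alan,\n"
)


def rows_to_jsonl(csv_text: str) -> list[str]:
    """Transform step: render each CSV row into one JSON line."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rendered = STUDENT_TEMPLATE.substitute(
            # json.dumps quotes and escapes each value for safe embedding.
            student_id=json.dumps(row["student_id"].strip()),
            first_name=json.dumps(row["first_name"].strip()),
            last_name=json.dumps(row["last_name"].strip()),
        )
        lines.append(rendered)
    return lines


def validate_jsonl(lines, required=REQUIRED_FIELDS):
    """Validate step: return (line_number, problem) for each bad payload."""
    problems = []
    for number, line in enumerate(lines, start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        missing = [key for key in required if not record.get(key)]
        if missing:
            problems.append((number, "missing: " + ", ".join(missing)))
    return problems


if __name__ == "__main__":
    payloads = rows_to_jsonl(SAMPLE_CSV)
    for number, problem in validate_jsonl(payloads):
        print(f"line {number}: {problem}")
```

Keeping the transform and validation steps separate mirrors the split between the two tools: bad payloads can be reported before any request is made to the API.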
Ed-Fi Educator Preparation Program Evaluation
The Ed-Fi Educator Preparation Program (EPP) works in the higher education space, using Ed-Fi technology to integrate data for evaluating the performance and growth of educator preparation programs. The majority of data within this domain comes from non-API ready systems, such as legacy databases and CSV source files. Data Import is known to serve this domain and aids the ETL process into Ed-Fi environments. Because EPP can involve high volumes of data and enterprise environments, the team evaluated open-source, low-cost and cloud-ready ETL tools. In this evaluation, each of these tools was shown to load data into Ed-Fi ODS / APIs, with detailed notes on performance, the process to install, map and load data with the tool, and the pros and cons of each tool used.
Below is a summary listing of the tools reviewed as part of this effort:
- Standalone Tools
- Cloud-Based Tools
From this evaluation, Data Import seems to work well for non-enterprise environments and their data needs and processes. The tools listed above work well for enterprise environments and require a level of knowledge and effort to maintain for ETL needs. Each ETL situation is different and should be evaluated against the list of tools to determine the right fit for the project. In the future, Ed-Fi may look at hybrid approaches that combine the domain knowledge and mapping capabilities of Data Import with pre-existing tools for transforming and loading non-API data.
The full Data Import alternative report is available in the EPP program section here: Loading Large Datasets - Alternatives to the Ed-Fi Data Import Tool
3.) Data Import users will prefer an open-source path ahead for the product
From numerous conversations involving Data Import and within the DI SIG, the Ed-Fi community made it known that an open-source license for Data Import is preferred. In response, Ed-Fi worked to move Data Import to an Apache 2.0 license. As of November 2022, Data Import 2.0 has been released under the Apache 2.0 license. In addition, the Template Sharing Service has been refactored into an Ed-Fi Exchange repository, which still allows data maps and preprocessor scripts to be shared, but in a manner more aligned with open-source practice using GitHub.
4.) Additional Data Import SIG requests led to these feature requests
As part of the DI SIG, the following issues and feature requests have been reported and registered below:
- DI-1253
- DI-1135
- DI-1254
- DI-1255
- DI-1256
- DI-1257
Next Steps
Ed-Fi will continue to work with the community to understand and use Data Import and other ETL solutions to accommodate non-API data within Ed-Fi environments. As further Data Import topics emerge from the findings above, we will use our support channels, other outlets and possibly future SIGs to help inform the product's future.