Goals
The Data Import (DI) SIG had 3 goals:
...
Details of this toolkit are below, with links to GitHub for the source code and documentation of each component:
- Earthmover - CLI tool for transforming collections of tabular source data into a variety of text-based data formats via YAML configuration and Jinja templates.
- Lightbeam - CLI tool for validating and transmitting payloads from JSONL files into an Ed-Fi API.
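As a rough illustration of the kind of work these tools automate, the sketch below turns tabular rows into JSONL payloads, one JSON object per line. It uses only the Python standard library and does not call Earthmover or Lightbeam or mirror their configuration. The column names, the payload field names, and the `rows_to_jsonl` helper are assumptions made for the example.

```python
import csv
import io
import json

# Hypothetical source data; the column names are illustrative only.
SOURCE_CSV = """student_id,first_name,last_name
1001,Ada,Lovelace
1002,Alan,Turing
"""


def rows_to_jsonl(csv_text: str) -> str:
    """Map each CSV row to one JSON object per line (JSONL)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Assumed target shape; a real mapping depends on the target API resource.
        payload = {
            "studentUniqueId": row["student_id"],
            "firstName": row["first_name"],
            "lastSurname": row["last_name"],
        }
        lines.append(json.dumps(payload))
    return "\n".join(lines)


if __name__ == "__main__":
    print(rows_to_jsonl(SOURCE_CSV))
```

In the toolkit itself, the mapping step is expressed through YAML configuration and Jinja templates rather than hand-written code, and the resulting JSONL files are what a transmission tool would validate and send to the API.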
Ed-Fi Educator Preparation Program Evaluation
The Ed-Fi Educator Preparation Program (EPP) works in the higher education space, using Ed-Fi technology to integrate data for evaluating the performance and growth of educator preparation programs. Most data in this domain comes from systems that are not API-ready, such as legacy databases and CSV source files. Data Import serves this domain and aids the ETL process into Ed-Fi environments. Because EPP can involve high data volumes and enterprise environments, the team evaluated open-source, low-cost, and cloud-ready ETL tools. In this evaluation, each of these tools was shown to load data into Ed-Fi ODS / APIs, and the team recorded detailed notes on performance, on the process to install, map, and load data with each tool, and on each tool's pros and cons.
...
From this evaluation, Data Import appears to work well for non-enterprise environments and their data needs and processes. The tools listed above work well for enterprise environments but require a level of knowledge and effort to maintain. Each ETL situation is different and should be evaluated against the list of tools to determine the right fit for the project. In the future, Ed-Fi may explore a hybrid approach that combines the domain knowledge and mapping capabilities of Data Import with pre-existing tools for transforming and loading non-API data.
The full Data Import alternative report is available in the EPP program section here: Loading Large Datasets - Alternatives to the Ed-Fi Data Import Tool
3.) Data Import users will prefer an open-source path ahead for the product
...
The following issues and feature requests were reported and registered as part of the DI SIG:
Ed-Fi Issue Tracker: DI-1253, DI-1135, DI-1254, DI-1255, DI-1256, DI-1257
...