Comment: link to DIT report

...

Goals

The Data Import (DI) SIG had three goals:

  1. Understand and define pain points that Ed-Fi implementations experience while using Data Import as an Extract-Transform-Load (ETL) solution for non-API based data.

  2. Understand other solutions that may be available for reliable ETL pathways for non-API based data.

  3. Inform future Data Import development and roadmap priorities based on community blockers, issues, and identified needs.

Outcomes

The outcomes of the SIG against these goals were:

...

Details of this toolkit are below, with links to GitHub for the source code and documentation of each component:

    • Earthmover - CLI tool for transforming collections of tabular source data into a variety of text-based data formats via YAML configuration and Jinja templates.
    • Lightbeam - CLI tool for validating and transmitting payloads from JSONL files into an Ed-Fi API.
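To make the two stages described above concrete, here is a minimal sketch in plain Python of the general pattern: tabular source rows are turned into JSONL payloads, which are then checked before they would be sent to an API. This is not how Earthmover or Lightbeam are implemented or configured. The column names, payload field names, and helper functions are all assumptions chosen for illustration.

```python
import csv
import io
import json

# Hypothetical source data; the column names are illustrative only.
SOURCE_CSV = """student_id,first_name,last_name
1001,Ana,Lopez
1002,Ben,Okafor
"""


def rows_to_jsonl(csv_text):
    """Transform each CSV row into one JSON object per line (JSONL)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    lines = []
    for row in reader:
        # Assumed mapping from source columns to payload fields.
        payload = {
            "studentUniqueId": row["student_id"],
            "firstName": row["first_name"],
            "lastSurname": row["last_name"],
        }
        lines.append(json.dumps(payload))
    return "\n".join(lines) + "\n"


def validate_jsonl(jsonl_text, required):
    """Return (line_number, problem) pairs for lines that fail basic checks."""
    problems = []
    for n, line in enumerate(jsonl_text.splitlines(), start=1):
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((n, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(record, dict):
            problems.append((n, "not a JSON object"))
            continue
        missing = [key for key in required if not record.get(key)]
        if missing:
            problems.append((n, "missing: " + ", ".join(missing)))
    return problems


if __name__ == "__main__":
    jsonl = rows_to_jsonl(SOURCE_CSV)
    print(jsonl, end="")
    print(validate_jsonl(jsonl, ["studentUniqueId", "firstName", "lastSurname"]))
```

Keeping transformation and validation as separate steps mirrors the split between the two tools: the intermediate JSONL file can be inspected or re-validated without re-running the transformation.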

Ed-Fi Educator Preparation Program Evaluation

The Ed-Fi Educator Preparation Program (EPP) works in the higher education space, using Ed-Fi technology to integrate data for evaluating the performance and growth of educator preparation programs.  The majority of data within this domain comes from non-API-ready systems, such as legacy databases and CSV source files.  Data Import serves this domain and aids the ETL process into Ed-Fi environments.  Because EPP can involve high data volumes and enterprise environments, the team evaluated open-source, low-cost, and cloud-ready ETL tools.  Each of these tools was shown in this evaluation to load data into Ed-Fi ODS / APIs.  The evaluation includes detailed notes on performance, the process to install, map, and load data with each tool, and the pros and cons of each.

...

From this evaluation, Data Import seems to work well for non-enterprise environments and their data needs and processes.  The tools listed above work well for enterprise environments, but they require a certain level of knowledge and effort to maintain.  Each ETL situation is different and should be evaluated against the list of tools to determine the right fit for the project.  In the future, Ed-Fi may look at a hybrid approach that combines the domain knowledge and mapping capabilities of Data Import with pre-existing tools for transforming and loading non-API data.

The full Data Import alternatives report is available in the EPP program section here: Loading Large Datasets - Alternatives to the Ed-Fi Data Import Tool

3.) Data Import users will prefer an open-source path ahead for the product

...

4.) Additional Data Import SIG requests led to these feature requests

...

As part of the DI SIG, the following issues and feature requests have been reported and registered in the Ed-Fi Issue Tracker:

  • DI-1253
  • DI-1135
  • DI-1254
  • DI-1255
  • DI-1256
  • DI-1257

Next Steps

Ed-Fi will continue to work with the community to understand and use Data Import and other ETL solutions to accommodate non-API data within Ed-Fi environments.  As new Data Import topics emerge from the findings above, we will use our support channels, outlets, and possibly future SIGs to help inform its future.