The Data Import (DI) SIG had 3 goals:

  1. Understand and define pain points that Ed-Fi implementations experience while using Data Import as an Extract-Transform-Load (ETL) solution for non-API based data.

  2. Understand other solutions that may be available for reliable ETL pathways for non-API based data.

  3. Inform future Data Import development and roadmap priorities based on community blockers, issues, and identified needs.

The outcomes of the SIG against these goals are:

1.) Data Import is reported as serving non-API ready data needs with an active community of users

  • Active SIG with more than 40 members contributing to the conversation
  • We're seeing usage in numerous areas to accommodate non-API ready data: assessments, EPP data, finance data and so on.
  • Recognized that Data Import carries a maintenance burden that falls on the implementer
  • A direct API path to data, without Data Import, is still the best way to go

2.) Numerous viable open-source and low-cost alternatives exist for serving the ETL need

Outcomes: Ed-Fi will continue with non-enterprise tools and ad-hoc integration. At some point, at scale, a stronger look at the alternatives above is warranted. There is no "perfect line" among the tool solutions.

Should we begin to look at how to work more closely with the tools above?

3.) Data Import users prefer an open-source path ahead for the product

  • Based on conversations heard in 2022, Ed-Fi moved to open-source Data Import
  • In November 2022, we released Data Import 2.0 to an open-source repo
  • We've moved the Template Sharing Service to a GitHub Exchange repo

4.) Additional Data Import SIG discussions led to these feature requests

  • John Bailey - has noticed that the first import of data seems faster than subsequent imports  (SF:  
  • Emilio and Rosh - Would like improved logging. It logs too much and they have to truncate the table often
  • Zurab - duplicated headers in files have posed issues; Emilio - the pre-processor could help with this issue
  • DI-1135 - Array Format in CSV
  • John Bailey - 1.3.2 included a Docker container, but it is unclear how to kick off a schedule
  • Zurab - Source code uses a library to work with FTP servers. The library does not work if the FTP server has a certain setting turned on. Works fine with SFTP, but not FTP.
  • Mike Werner - Documentation is lacking
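
The duplicated-header issue raised above is the kind of problem a pre-processing step could handle before a file reaches a data map. The following is a minimal sketch in Python, not Data Import's own preprocessor mechanism. The function names `dedupe_headers` and `preprocess_csv`, and the `_2`, `_3` suffix convention, are assumptions made for illustration.

```python
import csv
import io


def dedupe_headers(headers):
    """Return headers with repeats made unique: name, name_2, name_3, ...

    Hypothetical helper; the suffix convention is an assumption.
    """
    used = set()
    result = []
    for name in headers:
        candidate, n = name, 1
        # Keep bumping the suffix until the name is not already taken.
        while candidate in used:
            n += 1
            candidate = f"{name}_{n}"
        used.add(candidate)
        result.append(candidate)
    return result


def preprocess_csv(text):
    """Rewrite only the header row of CSV text; data rows are left unchanged."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return text
    rows[0] = dedupe_headers(rows[0])
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()
```

With this approach, a file whose header row is `StudentId,Score,Score` would be rewritten with the header `StudentId,Score,Score_2`, so that each column can be mapped by a unique name.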