Info

title	Audience

IT and programming staff who wish to use the LMS Toolkit

Overview

The following components are available in the 1.0 release:

Canvas Extractor
Google Classroom Extractor
Schoology Extractor
LMS Data Store Loader

Please see LMS Toolkit for more information about the purpose of these tools.

Note
The LMS Data Store Loader pushes CSV files, created by the extractors, into a SQL Server database. That database can be the same as an Ed-Fi ODS. However, all of the data are loaded into tables in the `lms` schema instead of the `edfi` schema.

Pre-Requisites

Python 3.9 or higher

Warning
Python 3.9.5 has a bug that causes the extractors to crash, and thus should not be used. The Alliance's testing has used 3.9.4.

Note

title	Note on Python Version

In practice, these tools have only been tested on Windows 10; however, these tools should work from any operating system that supports Python 3.9.

Running the Tools

The LMS Toolkit components can be installed into other Python scripts as dependencies, or they can run as stand-alone command line scripts from the source code.

Deck

id	running

Card

id	running-packages
label	From Packages

The following commands install all five tools into the active virtual environment; however, each tool is independent, and you can install only the tools you need.

Code Block

language	bash

pip install edfi-canvas-extractor
pip install edfi-google-classroom-extractor
pip install edfi-schoology-extractor
pip install edfi-lms-ds-loader
pip install edfi-lms-harmonizer

Tip
To install the most current pre-release version, add the `--pre` flag on each command.

We have developed sample Jupyter notebooks that demonstrate execution of each extractor paired with execution of the LMS Data Store Loader:

Canvas
Google Classroom (note the requirement for a service-account.json file)
Schoology

Card

id	running-source-code
label	From Source Code

The source code repository has detailed information on each tool. To get started, clone or download the repository and review the main readme file for instructions on how to configure and execute the extractors from the command line.

Extractors

Whether you run the extractors by incorporating them into an existing Python package or by using the stand-alone command line utility from the source repository, there are a number of required and optional arguments. When running the command line tool, provide the `--help` option to see the full set of options for each extractor.

Applies To	Argument	Required?	Purpose
All	Feature	No	Define which optional features are to be retrieved from the upstream system. Default: none. Available features: `Activities`: section activities and system activities. `Attendance` (experimental): attendance data; only applies to Schoology. `Assignments`: assignments and submissions. `Grades` (experimental): section-level grades (assignment grades are included on the submissions resource); only implemented for Canvas at this time. Note: Sections, Section Associations, and Users are always pulled from the source system.
	Log Level	No	Valid options are: DEBUG, INFO (default), WARNING, ERROR, CRITICAL
	Output Directory	No	The output directory for the generated CSV files. Defaults to: `./data`.
	Sync database directory	No	Directory for storing a SQLite database that is used in support of synchronizing the data between successive executions of the tool. Defaults to: `./data`.
Google Classroom	Classroom account	Yes	The email address of the Google Classroom admin account.
	Usage start date	No	Start date for usage data pull in YYYY-MM-DD format.
	Usage end date	No	End date for usage data pull in YYYY-MM-DD format.
Schoology	Client key	Yes	Schoology client key.
	Client secret	Yes	Schoology client secret.
	Page size	No	Page size for the paginated requests. Defaults to: 200. Max value: 200.
	Input directory	No	Input directory for usage CSV files.
Canvas	Base URL	Yes	The Canvas API base URL.
	Access token	Yes	The Canvas API access token.
	Start Date	Yes	Start date for the range of classes and events to include, in YYYY-MM-DD format.
	End Date	Yes	End date for the range of classes and events to include, in YYYY-MM-DD format.

Tip

To retrieve multiple features with one call to the command line interface, list them with either spaces or commas separating the values. Examples:

Code Block

# Two ways to get these three optional features:
poetry run python .\edfi_google_classroom_extractor -f activities, grades, assignments
poetry run python .\edfi_google_classroom_extractor -f activities grades assignments

# Retrieve only the "activities" data (in addition to the core data set).
# Note the use of the "long flag" instead of `-f`.
poetry run python .\edfi_google_classroom_extractor --feature activities
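The following is a minimal sketch of how a command line interface can accept feature names separated by spaces, commas, or both. It is not the extractors' actual implementation: `VALID_FEATURES`, `parse_features`, and `build_parser` are hypothetical names used only for illustration, and the feature names are taken from the table above.

```python
import argparse

# Feature names from the arguments table; the set itself is an assumption
# made for this sketch, not a constant exported by the toolkit.
VALID_FEATURES = {"activities", "attendance", "assignments", "grades"}


def parse_features(raw_values):
    """Flatten space- and comma-separated values into a list of feature names."""
    features = []
    for raw in raw_values:
        for name in raw.split(","):
            name = name.strip().lower()
            if not name:
                continue  # empty piece left behind by input such as "activities,"
            if name not in VALID_FEATURES:
                raise ValueError(f"Unknown feature: {name}")
            if name not in features:
                features.append(name)
    return features


def build_parser():
    """Hypothetical parser exposing only the -f / --feature option."""
    parser = argparse.ArgumentParser(description="Hypothetical extractor CLI")
    parser.add_argument("-f", "--feature", nargs="*", default=[])
    return parser


# The shell splits "-f activities, grades, assignments" into these tokens.
args = build_parser().parse_args(["-f", "activities,", "grades,", "assignments"])
print(parse_features(args.feature))
```

Because the shell splits on spaces before the program sees the arguments, handling both separators comes down to splitting each received token on commas and discarding empty pieces.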

LMS Data Store Loader

Please note that the Data Store Loader controls deployment of its own database schema. For performance optimization it also creates, drops, and renames tables during the upload process. Therefore the user account running this tool must have permission to modify the schema.

The following table lists the arguments for calling the loader utility.

Argument	Required	Purpose
DB server	yes	The destination database server/host name
DB port	no	Optional alternate port number (default: 1433)
DB name	yes	Name of the database to connect to on the host
Exceptions report directory	no	Optional directory for writing out CSV files with LMS records that could not be matched to SIS records
DB username	no	Optional database username (must either use username and password, or use integrated security)
DB password	no	Optional database password
Use integrated security	no	Optional flag to use integrated authentication when connecting to the database
Use encrypted connection	no	Enables an encrypted connection to the database
Trust server certificates	no	When encrypting the database connection, trust the server certificate. Primarily used for development; not intended for production use
Log Level	No	Valid options are: DEBUG, INFO (default), WARNING, ERROR, CRITICAL
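To illustrate how the arguments in the table relate to one another, here is a minimal sketch that assembles an ODBC-style SQL Server connection string from them. The `DbSettings` class and `build_connection_string` function are hypothetical and are not part of the loader; the loader's own connection handling may differ. The sketch enforces the rule from the table that you must supply either a username and password or the integrated security flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DbSettings:
    """Hypothetical container mirroring the arguments in the table above."""

    server: str
    db_name: str
    port: int = 1433
    username: Optional[str] = None
    password: Optional[str] = None
    integrated_security: bool = False
    encrypt: bool = False
    trust_certificate: bool = False


def build_connection_string(settings: DbSettings) -> str:
    """Build an ODBC-style connection string from the settings."""
    parts = [
        f"Server={settings.server},{settings.port}",
        f"Database={settings.db_name}",
    ]
    if settings.integrated_security:
        parts.append("Trusted_Connection=yes")
    elif settings.username and settings.password:
        parts.append(f"UID={settings.username}")
        parts.append(f"PWD={settings.password}")
    else:
        raise ValueError(
            "Provide a username and password, or use integrated security"
        )
    if settings.encrypt:
        parts.append("Encrypt=yes")
        # Trusting the server certificate only matters for encrypted connections.
        if settings.trust_certificate:
            parts.append("TrustServerCertificate=yes")
    return ";".join(parts)
```

For example, `build_connection_string(DbSettings(server="localhost", db_name="EdFi_ODS", integrated_security=True))` yields a string that uses the default port 1433 and Windows authentication; the database name here is only a placeholder.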

LMS Harmonizer

Database Tables

In addition to the tables created by the LMS Data Store Loader, in the lms schema, the LMS Harmonizer requires access to the Ed-Fi ODS database tables and to the lmsx extension tables. Currently the toolkit officially supports ODS/API Suite 3, version 5.2. It should work in other versions but has not been tested.

Install the lmsx schema tables through one of two options:
1. Initdev option for a fresh ODS database:
  1. Copy the EdFi.Ods.Extensions.LMSX folder from the LMS Toolkit source code into your Ed-Fi-ODS-Implementation/Application folder.
  2. In the WebAPI project, add a reference to the LMSX project
  3. Run initdev
2. Manual option for existing ODS databases:
  1. From source code, open folder extension\EdFi.Ods.Extensions.LMSX\Artifacts\MsSql\Structure\Ods
  2. Run each of the scripts there, in numeric order. If using change queries, run the scripts in the Changes folder as well.

The Harmonizer has several stored procedures and views, which currently are only installed manually:
1. From source code, open folder extension\EdFi.Ods.Extensions.LMSX\LMS-Harmonizer.
2. Run each of the scripts there, in numeric order.

Arguments

The following table lists the arguments for calling the harmonizer utility.

Argument	Required	Purpose
DB server	yes	The destination database server/host name
DB port	no	Optional alternate port number (default: 1433)
DB name	yes	Name of the database to connect to on the host
Exceptions report directory	no	Optional directory for writing out CSV files with LMS records that could not be matched to SIS records
DB username	no	Optional database username (must either use username and password, or use integrated security)
DB password	no	Optional database password
Use integrated security	no	Optional flag to use integrated authentication when connecting to the database
Use encrypted connection	no	Enables an encrypted connection to the database
Trust server certificates	no	When encrypting the database connection, trust the server certificate. Primarily used for development; not intended for production use
Log Level	No	Valid options are: DEBUG, INFO (default), WARNING, ERROR, CRITICAL

Analyzing Student Data Using Extractor Output

The LMS Data Store Loader pushes the extractor-created CSV files into a SQL Server database, where the data are available for use via standard SQL Server interfaces and tools. However, the CSV files can also be consumed directly to perform many interesting analyses. We have developed a set of Jupyter notebooks that demonstrate analytics tasks that can be performed in Python using the Pandas framework, reading the raw CSV files. Sample output from these notebooks is visible directly in GitHub, without needing to run the code locally:

Filesystem Tutorial / In Danger of Failing / Missing Assignment Submissions: how to use the LMS Toolkit scripts to understand and access output files created by the extractors. Also includes two analysis scenarios - looking for students who are in danger of failing, and looking for missing assignment submissions.
Record Counts: simply accesses all of the extracted files and provides summary count of records downloaded.
Student Logins: simple visualization showing frequency of student logins to the LMS.
Student Submissions: shows the count of assignments submitted per student, by status.
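As a rough illustration of the "Record Counts" idea, the sketch below walks an extractor output directory and counts the data rows in each CSV file. It is not taken from the notebooks, which use Pandas; this version uses only the standard library, and it assumes that every CSV file has a single header row. The function name `count_records` is hypothetical.

```python
import csv
from pathlib import Path


def count_records(output_directory):
    """Return {relative CSV path: number of data rows} for every CSV file found."""
    root = Path(output_directory)
    counts = {}
    for csv_path in sorted(root.rglob("*.csv")):
        with csv_path.open(newline="", encoding="utf-8") as handle:
            reader = csv.reader(handle)
            next(reader, None)  # skip the header row (assumed to be present)
            counts[csv_path.relative_to(root).as_posix()] = sum(1 for _ in reader)
    return counts
```

Calling `count_records("./data")` against the default output directory returns a dictionary that can be printed or compared between runs to spot unexpected drops in volume.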

Operational Concerns

Logging

Deck

history	false
id	logging

Card

default	true
id	logging-packages
label	From Packages
title	Logging configuration when installing from packages

When you incorporate the LMS Toolkit components as package dependencies in other Python scripts, then you need to pass the log-level to the main facade class and you need to define the logging format. For example:

Code Block

import logging
import sys
from edfi_schoology_extractor.helpers.arg_parser import MainArguments as s_args
from edfi_schoology_extractor import extract_facade

# Setup global logging
logging.basicConfig(stream=sys.stdout, level=logging.INFO)

# Prepare parameters. KEY, SECRET, OUTPUT_DIRECTORY, LOG_LEVEL, and
# SYNC_DATABASE_DIRECTORY are placeholders for values that you supply.
arguments = s_args(
    client_key=KEY,
    client_secret=SECRET,
    output_directory=OUTPUT_DIRECTORY,
    # ----------- Here is the log level setting -----------
    log_level=LOG_LEVEL,
    # -----------------------------------------------------
    page_size=200,
    input_directory=None,
    sync_database_directory=SYNC_DATABASE_DIRECTORY
)

# Run the Schoology extractor
extract_facade.run(arguments)

When running from source code, each extractor logs output to the console; these log messages can be captured by redirecting the output to a file.
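When the components are used as packages, the calling script also defines the logging format. The sketch below shows one way to do that with the standard `logging` module. The `configure_logging` helper and the particular format string are assumptions chosen for illustration; the toolkit does not prescribe them.

```python
import logging
import sys

# Illustrative format: timestamp, level, logger name, message.
LOG_FORMAT = "%(asctime)s - %(levelname)s - %(name)s - %(message)s"
VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def configure_logging(log_level="INFO"):
    """Send log records to standard output using a fixed format."""
    level_name = log_level.upper()
    if level_name not in VALID_LEVELS:
        raise ValueError(f"Invalid log level: {log_level}")
    logging.basicConfig(
        stream=sys.stdout,
        level=getattr(logging, level_name),
        format=LOG_FORMAT,
        force=True,  # replace any handlers configured earlier (Python 3.8+)
    )


configure_logging("INFO")
logging.getLogger("example").info("Logging is configured")
```

Pass the same level string to the extractor's arguments object so that the global configuration and the extractor agree.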

Card

id	logging-source-code
label	From Source Code
title	Logging configuration when running from source code

LMS Extractors, DS Loader, and Harmonizer Errors

The components of the LMS Starter Kit take a unified approach to error reporting. The LMS Extractors, DS Loader, and Harmonizer are all command line utilities that send log information to standard output. To capture the logs for later review, redirect the output to a file using the standard ">" redirect operator. For example, using the Canvas LMS extractor:

Code Block

language	powershell
title	Redirect to file

poetry run python edfi_canvas_extractor > output.log

All of the command line components take an optional "log level" parameter to adjust log output. For example, this can be set from the command-line as follows:

Code Block

language	powershell

poetry run python edfi_canvas_extractor --log-level DEBUG > 2021-05-02-canvas.log

The above example assumes that all configuration has been placed into a .env file or environment variables.

The log level defaults to INFO. You can lower the number of log messages by changing to WARNING, or get increased logging by changing to DEBUG. The log level can be set at the command line, in a .env file, or an environment variable (the exact environment variable name depends on the extractor; run the extractor with --help for more information).

Code Block

language	bash
title	Set level to WARNING

poetry run python edfi_canvas_extractor --log-level WARNING > output.log

If any errors occur during the script run, a final message is printed to standard error as an additional mechanism for calling attention to the error: "A fatal error occurred, please review the log output for more information." Additionally, the application exits with status code 1 if there were any log messages at the ERROR or CRITICAL level; otherwise it exits with status code 0.

The valid log level values are DEBUG, INFO (default), WARNING, ERROR, and CRITICAL. The log level may also be set via the LOG_LEVEL environment variable.
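The exit status described above makes it straightforward to wrap a component in a script that captures the log and reacts to failures. The following is a minimal sketch under stated assumptions: `run_and_capture` is a hypothetical helper, and the example command in the comment simply repeats the Canvas invocation shown earlier. Standard error is left attached to the console, matching the behavior of the ">" redirect.

```python
import subprocess
import sys
from pathlib import Path


def run_and_capture(command, log_path):
    """Run a command, write its standard output to log_path, return the exit code."""
    with Path(log_path).open("w", encoding="utf-8") as log_file:
        completed = subprocess.run(command, stdout=log_file)
    return completed.returncode


def main():
    # Example command only; configuration is assumed to come from a .env file
    # or environment variables.
    command = ["poetry", "run", "python", "edfi_canvas_extractor"]
    exit_code = run_and_capture(command, "output.log")
    if exit_code != 0:
        print("Extractor reported errors; review output.log", file=sys.stderr)
    return exit_code
```

A scheduler can call `main()` and use the returned code to decide whether to continue with the Data Store Loader.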

Harmonizer Data Exception Reporting

In addition to logging, the Harmonizer can be configured to provide reporting on LMS data that could not be matched with ODS data. To enable this, set the optional "exceptions report directory" parameter to a location for the Harmonizer to write files to. For example, this can be set from the command-line as follows:

Code Block

language	powershell

poetry run python edfi_lms_harmonizer --exceptions-report-directory C:\my-directory

The exceptions report directory may also be set via the EXCEPTIONS_REPORT_DIRECTORY environment variable.

Security

Upstream APIs

Each API has its own process for securing access. Please see the readme file for each extractor for more information.

Data Storage

Because the LMS Toolkit deals with student data, both the filesystem and the database (if uploading to SQL Server) are subject to all of the same access restrictions as the Ed-Fi ODS database.

Database Permissions

As noted in the LMS Data Store Loader section above, in addition to read and write permissions (db_datareader and db_datawriter roles), the database user running that tool must have permission to alter SQL schema, which is typically granted through membership in the db_ddladmin role.

The LMS Harmonizer can be run under an account that only has read and write permissions.

Scheduling

The APIs provided by these three learning management systems are well defined at a granular level. From a performance perspective, this means that the process of getting a complete set of data is very chatty and may take a long time. It is difficult to predict the exact impact, although generally the time will scale in proportion to the number of course sections. Some of the APIs also lack any mechanism for restricting the date range or looking for changed data, so each execution of the extractor re-pulls the entire data set.

If running on a daily basis, then we recommend running after normal school hours to minimize contention with network traffic to the source system. If running weekly, then it may be best to run over the weekend.

It should be trivial to call these programs from Windows Task Scheduler, Linux cron, or a workflow engine such as Apache Airflow.
