This version of the Ed-Fi ODS / API is no longer supported. See the Ed-Fi Technology Version Index for a link to the latest version.
ODS / API Bulk Load Performance Testing
- Ian Christopher
- Chris Moffatt (Deactivated)
Executive Summary
In December 2015, the Ed-Fi Alliance concluded an effort to test the full spectrum of the ODS / API capabilities under load.1 The testing covered both transactional operations to create, read, update, and delete entities, as well as bulk operations that supporting the import of large files. This article discusses the results of the bulk load operation testing. The transactional operation results are reported in a separate article entitled ODS / API Transactional Performance Testing.
A summary of testing results follows:
The bulk load dataset contained over 17 million records, including more than 100,000 student records. The tests simulated two different service configurations, one using the Console Bulk Loader and the other using Bulk Load Services.
The Console Bulk Loader, which loads data from trusted sources via a server-side file queue, averaged 837 records inserted / second.
Bulk Load Services, which allows client applications to load data directly via the API and validates permissions for each record loaded, was able to maintain 454 records inserted / second.
The Console Bulk Loader is almost 50% faster than Bulk Load Services, largely due to additional authorization calls required when loading via the API. As usual, simplicity comes at a cost.
In bulk loading tests, the Database Server rarely experienced a usage spike greater than 25% CPU. For this reason, testing focused on the performance characteristics of the Bulk Services Server.
The solution can easily be scaled up or scaled out to handle larger organizations or increased performance needs. The load simulated by these tests approximates a fairly high degree of activity at a mid-sized organization, using a mid-sized (and inexpensive) virtual machine. The intent in using this configuration was to provide a baseline for organizations to use in planning.
Project Detail
This document provides a summary of the performance test approach employed, and reports on the results of the test effort.
Project Objectives
The load testing objectives were:
Validate that the ODS / API is performant for large bulk load operations.
Ensure the system is scalable.
Report the results to assist implementers in planning for production deployments.
Scope
The bulk load performance tests covered all XML bulk load file types supported by the as-shipped configuration of the ODS / API. The Bulk tests were run on both the Console Bulk Loader and the Bulk Load Services. These services allow for the importing of file-based data into the ODS / API, but use different solution configurations.
The Console Bulk Loader loads files from disk. The Console Bulk Loader is designed to run behind a secure file repository and so has the benefit of running without any security responsibilities of its own.
The Bulk Load Services are set up so that they take on the Security context of the client application that uploaded the files -- which means all interactions can happen through the API, but comes with security responsibilities that slow down performance.
For bulk testing, performance indicators such as CPU, Memory, Disk, and Network were monitored to identify the location of bottlenecks for every configuration and stress level.
Testing Methodology
Testing Architecture
The Ed-Fi ODS / API can be deployed in a variety of configurations, from a single server (as in a development or test machine) to various load-balanced, multi-machine configurations. The architecture tested was a fairly simple configuration, typical of small- and mid-sized production deployments.
Software & Platform Information
Microsoft Message Queue
SQL Server 2012 Enterprise
Ed-Fi ODS / API v2.0 Public Release
Software Components
Bulk Services. The back-end bulk load services, Upload Commit, and Background Worker that handle the processing of large XML payloads of data.
ODS Database. The SQL Server installation hosting the ODS and its supporting databases.
Server Configuration
The Load Testing configuration had the components distributed on AWS in the following configuration:
Series | Model | vCPU | Mem (GB) | SSD Storage (GB) | Throughput (Mbps) | |
---|---|---|---|---|---|---|
Box 1. ODS DB | R3 | 2xlarge | 8 | 61 | 1x160 | N/A |
Box 2. Bulk Services | R3 | xlarge | 4 | 30.5 | 1x80 | N/A |
Test Results
Bulk Services running as a service runs slower than the Console Bulk Loader for the same data set. The increase in time is largely because of the Authorization layer that is in place for the Service mode, whereas security is not the responsibility of the Console Bulk Loader. The fastest interchange that loaded was Student Discipline. It ran in Console Mode and clocked in at a speed of 2357 records per second. The slowest was Student Grade running in Service mode, clocking in at 326 records per second. It should be noted that descriptors were not included in this calculation because they do not run long enough to be correlated with the other long running data.
Recommendations
In all tested configurations, bulk loading data using the Console Bulk Loader loading data from files outperformed the Bulk Load Services. Installations bulk loading very large XML files from a trusted source should consider a solution design that can leverage the Console Bulk Loader.
Test Result Detail
This section contains detail about the testing methods and result data from the testing.
Simulation Method
A test data set aligned to the Ed-Fi Data Standard v2.0 was used for bulk testing. A PowerShell script was created that performed the following steps:
Loop through all files in the array (these were the file names of either the manifest files for bulk, or the actual files for upload to the API).
Mark the date and time, and query the database for the sum of all records in the database for all non-system tables.
Start the operation and wait for completion.
Mark the date and time, then query the database for all the records in the database for all non-system tables.
Calculate the amount of time it took between start and finish, and calculate the number of new records that appeared in the database as result of the load. Use that to calculate speed in the form of number of records inserted per second.
The File Uploader takes a file or folder, creates an operation using the API, and then streams either the file or all files in a folder for that operation, commits the operation, and then waits for the operation to complete.
Bulk Testing Results
This section provides detail about the results from testing bulk operations. We’ll start by looking at the results from loading a dataset containing over 17 million records related to over 113,000 students.
Component | Total Time to Load | # Records Loaded | Records / Sec |
---|---|---|---|
Console | 6:02:23 | 17,084,674 | 785.8 |
Bulk Services | 10:26:21 | 17,084,232 | 454.6 |
It is helpful to look at the two graphs above together. Note that the horizontal axis lists each package. The vertical axis in the top chart measures the time to execute each package in hours, and the vertical axis in the bottom chart measures the number of records loaded.2
The slight differences in records loaded are a function of the different security models used by the loading mechanisms. Student Enrollment, Student Transcript, Student Attendance and Student Grade seem to demonstrate the biggest difference in time to load when comparing Console and Bulk Services. This is expected because security is not a responsibility of the bulk loader in Console Mode, but is active in Service mode.
This final chart demonstrates the difference in speed between the different modes across the different interchanges, by looking at records per second. Notice that Console is always faster than Service, but the performance characteristics of the specific files loaded are generally the same.
Regardless of whether the bulk loader was running in Service or Console mode, it kept the CPU pegged constantly and experienced periods where it would not recycle memory. On occasion, it was necessary to restart the service to completely free memory so that testing could continue. In contrast, the database server CPU usage never spiked over 25% during execution.
1 The tests were performed on the then-current v2.0 of the ODS / API. Field testing indicates that v2.1, the latest ODS / API, has the same performance profile and characteristics.
2 Counts of records loaded include records of all types in a particular interchange. For example, the record count for the Student interchange is 788,941, which included over 113,000 students along with related records like student addresses, student characteristics, and so forth.