This article describes the methodology and results of performance testing on the Ed-Fi ODS / API v3.2 (specifically, against release v3.2.0).
In brief, performance testing did not uncover any significant concerns in performance relative to versions 3.0.0 and 3.1.0.
Volume testing of v3.2.0 occurred in July and August 2019, using the Locust-based 3.x performance testing framework (an Exchange contribution available in GitHub). This volume test covers the resources and HTTP verbs described in the Ed-Fi SIS vendor certification process. It runs for 30 minutes, spawning 30 clients that run simultaneously to perform tens of thousands of operations.
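The framework itself lives in the Exchange GitHub repository. As a rough, stdlib-only sketch of the load pattern it generates (many concurrent clients looping over API operations until time expires), with hypothetical operation names rather than the actual certification tasks:

```python
import random
import threading
import time
from collections import Counter

def run_volume_test(num_clients=30, duration_s=1.0):
    """Mimic the *shape* of the Locust volume test: spawn N clients
    that perform operations concurrently until time runs out,
    tallying how many of each were issued.

    This is a stdlib stand-in, not the actual Locust framework; the
    real suite runs 30 clients for 30 minutes against the live API.
    """
    counts = Counter()
    lock = threading.Lock()
    deadline = time.monotonic() + duration_s

    def client_loop():
        while time.monotonic() < deadline:
            # Operation names are illustrative, not the certification suite's.
            op = random.choice(
                ["GET students", "POST grades", "DELETE sections/{id}"]
            )
            time.sleep(0.001)  # stand-in for the HTTP round trip
            with lock:
                counts[op] += 1

    threads = [threading.Thread(target=client_loop) for _ in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts
```

Scaled up to 30 clients, 30 minutes, and real HTTP calls, this loop structure is what produces request totals in the tens of thousands.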
The ODS was loaded with the Northridge-inspired v3 data set, which contains 21,628 students.
The test lab environment used a three-server setup: one each for the database, web applications, and the Locust performance testing. VM "Sizes" listed here, such as "DS11_v2", are Microsoft-defined names for the specs of an Azure VM. Key specs are listed beside these size names. These sizes were chosen as their specs are comparable to those of the Vendor Certification VMs but have SSD disks to more closely match a production environment.
Database server:
- Image: Free SQL Server License: SQL Server 2017 Developer on Windows Server 2016
- Size: DS11_v2 (Standard, 2 vCPUs, 14 GB RAM, 6,400 max IOPS, 28 GB local SSD)

Web server:
- Image: [smalldisk] Windows Server 2016 Datacenter
- Size: B2ms (Standard, 2 vCPUs, 8 GB RAM, 4,800 max IOPS, 16 GB local SSD)

Locust server:
- Image: [smalldisk] Windows Server 2016 Datacenter
- Size: B2ms (Standard, 2 vCPUs, 8 GB RAM, 4,800 max IOPS, 16 GB local SSD)
In addition to testing the out-of-the-box installations of 3.0.0, 3.1.0, and 3.2.0, a few experiments were run in order to better understand possible performance impacts. Two recent changes in the ODS/API affected the way it interacts with SQL Server; these changes were reverted for API 3.2.0, singly and in combination, to confirm that the updates were not harmful. A final test on 3.2.0 installed the Change Queries feature.
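The compatibility-mode variation amounts to a one-line database setting. As a sketch (the database name here is illustrative), reverting the ODS from the SQL Server 2016 compatibility level (130) back to the 2014 level (120) looks like:

```sql
-- Illustrative: revert the ODS database to SQL Server 2014
-- compatibility (level 120). The out-of-the-box 3.2.0 install
-- uses the SQL Server 2016 level (130).
ALTER DATABASE [EdFi_Ods] SET COMPATIBILITY_LEVEL = 120;
```

The dialect variation, similarly, swaps NHibernate's `MsSql2012Dialect` back to `MsSql2008Dialect` in the API's NHibernate configuration.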
| Version | Execution Date | Variation | # of Requests | Mean Response Time (ms) | Max Response Time (ms) |
| --- | --- | --- | --- | --- | --- |
| 3.0.0 | 7/31/2019 | | 124,985 | 150 | 4,911 |
| 3.1.0 | 7/31/2019 | | 128,717 | 135 | 2,968 |
| 3.2.0 | 7/29/2019 | | 128,789 | 136 | 2,388 |
| 3.2.0 | 7/29/2019 | Restore SQL 2014 compatibility mode | 130,942 | 142 | 17,444 (Delete section/{id}) |
| 3.2.0 | 7/29/2019 | SQL 2014 and restore NHibernate 2008 dialect | 124,316 | 155 | 2,619 |
| 3.2.0 | 7/30/2019 | Back to 2016 compatibility mode, with 2008 dialect | 119,078 | 179 | 12,513 (Delete gradingperiod/{id}) |
| 3.2.0 | 8/13/2019 | 3.2.0 with change queries enabled | 129,566 | 130 | 2,680 |
- No two executions of the same code and configuration produce exactly the same mean response time; there is a degree of randomness in the Locust-based clients. Thus the differences among 130 ms, 135 ms, and 136 ms are not significant.
- That said, the use of NHibernate's 2008 dialect instead of the (new default) 2012 dialect does appear to have a negative impact. In other words, changing the dialect to 2012 was a good thing.
- The Change Queries feature adds update and delete triggers to the ODS tables. It was hypothesized that these triggers would cause a noticeable slowdown. In fact, performance appears to be slightly better, though this is most likely random variation rather than a meaningful change. At these volumes, Change Queries does not appear to cause meaningful performance degradation; however, additional testing with higher volumes of updates and deletes could still turn up some impact.
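The hypothesized cost comes from each update or delete doing extra work: the trigger performs an additional insert into a tracking table. As a hypothetical sketch of the general pattern (table, schema, and column names are illustrative, not the actual Ed-Fi Change Queries schema):

```sql
-- Illustrative only: a tracking trigger of the general kind Change
-- Queries adds. Every DELETE now also writes a row to a tracking
-- table, which is the extra work the experiment measured.
CREATE TRIGGER edfi.TrackSectionDeletes
ON edfi.Section
AFTER DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- Record which rows were deleted so API clients can sync deletions.
    INSERT INTO tracked.Section_Deletes (SectionId, DeletedAt)
    SELECT d.SectionId, SYSUTCDATETIME()
    FROM deleted AS d;
END;
```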
Overall, the web and database server statistics do not show any serious concerns. Database memory was higher for v3.2.0 than for v3.0.0 and v3.1.0, despite reboots between test executions. However, given that SQL Server holds on to as much memory as it can, this is not immediately indicative of a problem.
In the statistics below, the web server CPU load for the v3.2.0 testing is surprising. Given that there was no clear impact in the test results, it is hard to say whether this is ultimately important.
The CPU load on the web server for the canonical v3.2.0 test execution appears to be an anomaly. The following chart compares all five runs on the v3.2.0 ODS/API.