August 2024 - Project Tanager Workgroup
August 28, 2024, 11:00 am - 11:45 am central. Contact @Stephen Fuqua for the meeting invitation.
Agenda
Demonstration of current work
Review the roadmap
Design Questions
Attendees: Max Paulson (Denver Public Schools), JF Guertin (EdGraph), Joshua Impson (Resultant), Erik Joranlien (Education Analytics), Doug Loyo (Edufied), Geoff McElhanon (Edufied), Adam Hopkins (Simpat Tech on behalf of Ed-Fi), Brad Banister (Doubleline Partners on behalf of Ed-Fi), Vinaya Mayya (Ed-Fi Alliance), Stephen Fuqua (Ed-Fi Alliance).
Demonstration
Meeting notes:
Working demonstration of:
POST multiple resources
See data flowing into Kafka and OpenSearch
GET ALL and GET by query examples, pulling from OpenSearch
We think this will work with Elasticsearch, but have not tested yet.
Recommendation: consider adding advanced query capabilities found in Elasticsearch/OpenSearch.
Answer: yes, we will look into this, though it will not likely be a priority before Tech Congress.
Roadmap
Milestone 0.4
Milestone | Functional Goals |
---|---|
0.1 done | Compliant Discovery API, Descriptor API, and Resource API definition (except GET by query): able to run bulk upload, smoke test. Includes JSON validation based on API schema file. Fake OAuth (1). |
0.2 DONE |
|
0.3 in progress |
|
0.4 need to accelerate |
|
0.5 |
|
0.6 |
|
0.7 |
|
Meeting notes:
Kubernetes: there is strong interest, but it does not make sense for the Ed-Fi development team to continue learning Kubernetes.
JF may pursue more about Kubernetes setup.
Recommends using strimzi cloud-native version of Kafka
Recommendation: place production deployment utilities in a separate repository.
Architecture
Storage of Metadata
This is a sample message in Kafka, created from Debezium:
_lastModifiedDate
,_etag
(and_lineage
in the future?)Store as metadata or add to the document itself? What should downstream consumers receive?
Already sending “API Standard” rather than “Data Standard”. Example: “SchoolYearType” instead of “SchoolYear”.
Tenancy Routing
GitHub Discussion for asynchronous conversation: Multitenancy / routing / instance management
There are the requirements gathered from the multitenancy / routing / instance management workgroup:
Allow ODS instances to reside on separate database servers.
Support various database segmentation strategies: ODS per year, ODS per district, ODS per district per year, and ODS per tenant. (Reference: Segmentation strategies in ODS API)
Enable API administrators to choose between implicit routes (segmentation information not visible in the routes) and explicit routes (Reference: Context-Based Routing).
Provide isolated configuration for client and security setup by tenants, making it easier to onboard new tenants or remove existing ones.
Authentication
Given that client credentials will ultimately be in a 3rd party OAuth provider… do we want to give API clients the 3rd party OAuth token URL, or put it behind a proxy? If behind a proxy, three possible options:
Data Management Service
DMS Configuration Service
or API gateway / load balancer
Logging
Showing output from docker logs
:
Initial Startup
dms-postgresql:5432 - no response
Waiting for PostgreSQL to start...
dms-postgresql:5432 - no response
Waiting for PostgreSQL to start...
dms-postgresql:5432 - accepting connections
PostgreSQL is ready.
Installing Data Management Service schema.
Master ConnectionString => Host=dms-postgresql;Port=5432;Username=postgres;Password=******;Database=postgres
Beginning database upgrade
Checking whether journal table exists..
Journal table does not exist
Executing Database Server script 'EdFi.DataManagementService.Backend.Postgresql.Deploy.Scripts.0000_Create_DMS_Schema.sql'
Checking whether journal table exists..
Creating the "schemaversions" table
The "schemaversions" table has been created
Executing Database Server script 'EdFi.DataManagementService.Backend.Postgresql.Deploy.Scripts.0001_Create_Document_Table.sql'
Executing Database Server script 'EdFi.DataManagementService.Backend.Postgresql.Deploy.Scripts.0002_Create_Alias_Table.sql'
Executing Database Server script 'EdFi.DataManagementService.Backend.Postgresql.Deploy.Scripts.0003_Create_Reference_Table.sql'
Executing Database Server script 'EdFi.DataManagementService.Backend.Postgresql.Deploy.Scripts.0004.1_Configure_Replication.sql'
Executing Database Server script 'EdFi.DataManagementService.Backend.Postgresql.Deploy.Scripts.0004.3_Configure_Replication.sql'
Executing Database Server script 'EdFi.DataManagementService.Backend.Postgresql.Deploy.Scripts.0010_Create_Insert_References_Procedure.sql'
Executing Database Server script 'EdFi.DataManagementService.Backend.Postgresql.Deploy.Scripts.1000_Create_Reference_Validation_FKs.sql'
Upgrade successful
2024-08-28 16:32:13.988 INF Injecting PostgreSQL as the primary backend datastore
2024-08-28 16:32:14.042 INF Injecting OpenSearch as the backend query handler
2024-08-28 16:32:14.133 DBG Hosting starting
2024-08-28 16:32:14.248 INF Now listening on: http://[::]:8080
2024-08-28 16:32:14.251 DBG Loaded hosting startup assembly EdFi.DataManagementService.Frontend.AspNetCore
2024-08-28 16:32:14.252 INF Application started. Press Ctrl+C to shut down.
2024-08-28 16:32:14.253 INF Hosting environment: Production
2024-08-28 16:32:14.253 INF Content root path: /app
2024-08-28 16:32:14.254 DBG Hosting started
First several lines are from a Bash script
Next block of lines, before the first one with a date/time stamp, is obviously from initial database setup.
any concern about these lines not having date/time stamp or a log level? If so, likely we can remedy that with a small user story. But if there is no concern, then it is not worth spending any time on it.
Happy Path Flow
2024-08-28 16:35:39.815 DBG Connection id "0HN67FTPTI90Q" accepted.
2024-08-28 16:35:39.816 DBG Connection id "0HN67FTPTI90Q" started.
2024-08-28 16:35:39.839 INF Request starting HTTP/1.1 POST http://localhost:8080/data/ed-fi/schoolYearTypes - null 103
2024-08-28 16:35:39.843 DBG Wildcard detected, all requests with hosts will be allowed.
2024-08-28 16:35:39.872 INF {"method":"POST","path":"/data/ed-fi/schoolYearTypes","traceId":"0HN67FTPTI90Q:00000001","clientId":{"Value":"localhost:8080","HasValue":true,"Host":"localhost","Port":8080}}
2024-08-28 16:35:39.915 DBG 1 candidate(s) found for the request path '/data/ed-fi/schoolYearTypes'
2024-08-28 16:35:39.917 DBG Endpoint 'HTTP: POST /data/{**dmsPath} => Upsert' with route pattern '/data/{**dmsPath}' is valid for the request path '/data/ed-fi/schoolYearTypes'
2024-08-28 16:35:39.918 DBG Request matched endpoint 'HTTP: POST /data/{**dmsPath} => Upsert'
2024-08-28 16:35:39.920 INF Executing endpoint 'HTTP: POST /data/{**dmsPath} => Upsert'
2024-08-28 16:35:40.046 DBG Connection id "0HN67FTPTI90Q", Request id "0HN67FTPTI90Q:00000001": started reading request body.
2024-08-28 16:35:40.046 DBG Connection id "0HN67FTPTI90Q", Request id "0HN67FTPTI90Q:00000001": done reading request body.
2024-08-28 16:35:40.056 DBG FrontendRequest: FrontendRequest { Path = /ed-fi/schoolYearTypes, Body = {
"schoolYear": 2022,
"schoolYearDescription": "2022-2023",
"currentSchoolYear": true
}, QueryParameters = System.Collections.Generic.Dictionary`2[System.String,System.String], TraceId = TraceId { Value = 0HN67FTPTI90Q:00000001 } }
2024-08-28 16:35:40.056 DBG Entering ApiSchemaValidationMiddleware- TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.058 DBG Entering ApiSchemaFileLoader
2024-08-28 16:35:40.147 DBG Entering ApiSchemaSchemaProvider
2024-08-28 16:35:40.269 DBG Entering ProvideApiSchemaMiddleware- TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.270 DBG Entering ParsePathMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.273 DBG Entering ParseBodyMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.274 DBG Entering DuplicatePropertiesMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.275 DBG Entering ValidateEndpointMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.282 DBG Entering RejectResourceIdentifierMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.284 DBG Entering CoerceFromStringsMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.288 DBG Entering ValidateDocumentMiddleware- TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.296 DBG Entering ValidateEqualityConstraintMiddleware- TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.298 DBG Entering BuildResourceInfoMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.301 DBG Entering ExtractDocumentInfoMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.302 DBG IdentityExtractor.ExtractDocumentIdentity
2024-08-28 16:35:40.303 DBG IdentityExtractor.DeriveSuperclassIdentityFrom
2024-08-28 16:35:40.304 DBG ReferenceExtractor.Extract
2024-08-28 16:35:40.305 DBG DescriptorExtractor.Extract
2024-08-28 16:35:40.314 DBG Entering DuplicateReferencesMiddleware - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.317 DBG Entering UpsertHandler - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.323 DBG Resilience pipeline executing. Source: 'unknownFailureCircuitBreaker/(null)', Operation Key: 'null'
2024-08-28 16:35:40.332 DBG Entering PostgresqlDocumentStoreRepository.UpsertDocument - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.403 DBG Entering UpsertDocument.Upsert - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.496 DBG Upsert success as insert - TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.507 DBG Resilience pipeline executed. Source: 'unknownFailureCircuitBreaker/(null)', Operation Key: 'null', Result: 'InsertSuccess { NewDocumentUuid = DocumentUuid { Value = b0544561-30a3-4814-90ea-9a7c3e4104d5 } }', Execution Time: 182.7581ms
2024-08-28 16:35:40.509 DBG Document store UpsertDocument returned EdFi.DataManagementService.Core.External.Backend.UpsertResult+InsertSuccess- TraceId { Value = 0HN67FTPTI90Q:00000001 }
2024-08-28 16:35:40.511 DBG FrontendResponse: FrontendResponse { StatusCode = 201, Body = , Headers = System.Collections.Generic.Dictionary`2[System.String,System.String], LocationHeaderPath = /ed-fi/schoolYearTypes/b0544561-30a3-4814-90ea-9a7c3e4104d5 }
2024-08-28 16:35:40.519 INF Setting HTTP status code 201.
2024-08-28 16:35:40.519 INF Write content with HTTP Response ContentType of application/json; charset=utf-8
2024-08-28 16:35:40.520 INF Executed endpoint 'HTTP: POST /data/{**dmsPath} => Upsert'
2024-08-28 16:35:40.522 INF Request finished HTTP/1.1 POST http://localhost:8080/data/ed-fi/schoolYearTypes - 201 0 application/json; charset=utf-8 683.6045ms
2024-08-28 16:35:40.523 DBG Connection id "0HN67FTPTI90Q" disconnecting.
2024-08-28 16:35:40.523 DBG Connection id "0HN67FTPTI90Q" stopped.
2024-08-28 16:35:40.524 DBG Connection id "0HN67FTPTI90Q" sending FIN because: "The Socket transport's send loop completed gracefully."
2024-08-28 16:35:41.834 DBG Connection id "0HN67FTPTI90R" accepted.
2024-08-28 16:35:41.835 DBG Connection id "0HN67FTPTI90R" started.
2024-08-28 16:35:41.837 INF Request starting HTTP/1.1 GET http://localhost:8080/data/ed-fi/schoolYearTypes - null null
2024-08-28 16:35:41.837 INF {"method":"GET","path":"/data/ed-fi/schoolYearTypes","traceId":"0HN67FTPTI90R:00000001","clientId":{"Value":"localhost:8080","HasValue":true,"Host":"localhost","Port":8080}}
2024-08-28 16:35:41.839 DBG 1 candidate(s) found for the request path '/data/ed-fi/schoolYearTypes'
2024-08-28 16:35:41.839 DBG Endpoint 'HTTP: GET /data/{**dmsPath} => Get' with route pattern '/data/{**dmsPath}' is valid for the request path '/data/ed-fi/schoolYearTypes'
2024-08-28 16:35:41.839 DBG Request matched endpoint 'HTTP: GET /data/{**dmsPath} => Get'
2024-08-28 16:35:41.840 INF Executing endpoint 'HTTP: GET /data/{**dmsPath} => Get'
2024-08-28 16:35:41.843 DBG FrontendRequest: FrontendRequest { Path = /ed-fi/schoolYearTypes, Body = , QueryParameters = System.Collections.Generic.Dictionary`2[System.String,System.String], TraceId = TraceId { Value = 0HN67FTPTI90R:00000001 } }
2024-08-28 16:35:41.843 DBG Entering ApiSchemaValidationMiddleware- TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:41.936 DBG Entering ProvideApiSchemaMiddleware- TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:41.936 DBG Entering ParsePathMiddleware - TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:41.936 DBG Entering ValidateEndpointMiddleware - TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:41.936 DBG Entering BuildResourceInfoMiddleware - TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:41.938 DBG Entering ValidateQueryMiddleware - TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:41.940 DBG Entering QueryRequestHandler - TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:41.940 DBG Resilience pipeline executing. Source: 'unknownFailureCircuitBreaker/(null)', Operation Key: 'null'
2024-08-28 16:35:41.942 DBG Entering OpenSearchQueryHandlerRepository.QueryDocuments - TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:41.944 DBG Entering QueryOpenSearch.Query - TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:42.030 DBG Resilience pipeline executed. Source: 'unknownFailureCircuitBreaker/(null)', Operation Key: 'null', Result: 'QuerySuccess { EdfiDocs = [], TotalCount = 0 }', Execution Time: 89.5347ms
2024-08-28 16:35:42.031 DBG QueryHandler returned EdFi.DataManagementService.Core.External.Backend.QueryResult+QuerySuccess- TraceId { Value = 0HN67FTPTI90R:00000001 }
2024-08-28 16:35:42.031 DBG FrontendResponse: FrontendResponse { StatusCode = 200, Body = [], Headers = System.Collections.Generic.Dictionary`2[System.String,System.String], LocationHeaderPath = }
2024-08-28 16:35:42.032 INF Setting HTTP status code 200.
2024-08-28 16:35:42.032 INF Write content with HTTP Response ContentType of application/json; charset=utf-8
2024-08-28 16:35:42.033 INF Executed endpoint 'HTTP: GET /data/{**dmsPath} => Get'
2024-08-28 16:35:42.033 INF Request finished HTTP/1.1 GET http://localhost:8080/data/ed-fi/schoolYearTypes - 200 2 application/json; charset=utf-8 196.1507ms
2024-08-28 16:35:42.033 DBG Connection id "0HN67FTPTI90R" disconnecting.
2024-08-28 16:35:42.033 DBG Connection id "0HN67FTPTI90R" stopped.
2024-08-28 16:35:42.033 DBG Connection id "0HN67FTPTI90R" sending FIN because: "The Socket transport's send loop completed gracefully."
Most of this is telemetry information. Should this be going into the regular console log, or should we move that off to another log sink? (possibly using Open Telemetry).
We have not been very careful yet with our logging messages and we should do a little cleanup to make these more useful. For example:
INF Request starting HTTP/1.1 GET http://localhost:8080/data/ed-fi/schoolYearTypes - null null INF {"method":"GET","path":"/data/ed-fi/schoolYearTypes","traceId":"0HN67FTPTI90R:00000001","clientId":{"Value":"localhost:8080","HasValue":true,"Host":"localhost","Port":8080}}
We ended up logging the GET request twice
We used structured logging in the second entry, but nowhere else.
Sidenote: the
traceId
that you see here was auto-generated by ASP.NET. However, we have already built in support for using a correlation / trace ID header if it is present.I don’t know what that
clientId
value is about. Certainly not showing anything useful. I would have expectedclientId
to be something we pull from an auth token claim, but there was no auth token in this case.structured logs or unstructured? If we write our logging code correctly, we may be able to control this purely through Serilog settings, letting you choose structured or unstructured.
The date/time and the level should probably be in the structured message.
Most of these debug messages show entry into a function. With the
TraceId
present, you can follow the performance for an individual request.The function entered is not consistently separated from
TraceId
with “space dash space”.Wouldn’t a structured message be easier to use here?
Client Error
Performed a POST operation with an invalid payload.
The debug messages include the raw input value with no data masking. In Meadowlark, we kept the structure while masking the values. Shall we do the same in the DMS?
Note that the client error is not treated as an application error (with
ERR
log level).
Application Error
Stopped the PostgreSQL container. Client received the following response:
Console log:
We deliberately left the stack trace in place.
Not “structured” today
bug - Some of the DBG messages are mistakenly missing the
TraceId
in their output.
Application Log - Circuit Breaker
We’ve implemented the circuit breaker pattern, with configurable settings. Sample log after “a bunch” of calls to the DMS while PostgreSQL is down:
Note the
WRN
,FTL
, andERR
logs.Probably better to try to avoid the final stack trace, on the circuit breaker, as that does not add useful information.
However, a search on
0HN67FTPTI917
reveals that the actual error that finally cause the circuit to open has not been logged.Should the
FTL
message have simply been anERR
? After all, the application is still running, and the database connection may restore itself after a time.
Restart PostgreSQL
We shouldn’t be logging the query result when the circuit successfully re-closes.