Automated data exchange can go wrong for a number of reasons. Some common reasons include:
- Systems or networks are down or unavailable
- An authorization credential was revoked or has gone stale
- An SSL certificate has expired, so a secure/encrypted transaction is unavailable
- Data quality problems: users have entered data that doesn't meet the API's requirements
- There is a bug ... somewhere!
...and the list goes on. The point is that things will go wrong. How your product handles API error conditions is critical. Below are some best practices in this area.
Retry transactions
Some HTTP status codes, such as 500, might be resolved once the remote system is reset or otherwise fixed. If it makes sense to retry after a period of time, do so. General best practice is to wait progressively longer between attempts, a technique commonly called exponential backoff. Many guides on this technique are available on the Web.
Don't over-retry transactions, though: most failure conditions won't be resolved over time, and you do not want your code to become a burden to the API. Still, out of respect for your users' time, retry in some cases before you report errors.
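As an illustration of retrying on a lengthening interval with a hard limit, here is a minimal Python sketch. The names `send_with_retry`, `backoff_delays`, and `RETRYABLE_STATUS`, the default limits, and the choice of which status codes count as transient are all assumptions made for this example; they are not part of any particular API.

```python
import random
import time

# Hypothetical set of status codes treated as transient; adjust per API.
RETRYABLE_STATUS = {500, 502, 503, 504}


def backoff_delays(max_retries=4, base=1.0, cap=30.0):
    """Return the wait (in seconds) before each retry, doubling up to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def send_with_retry(send, max_retries=4, base=1.0, cap=30.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or retries run out.

    send is any zero-argument callable that returns an HTTP status code.
    The last status code seen is returned so the caller can report it.
    """
    status = send()
    for delay in backoff_delays(max_retries, base, cap):
        if status not in RETRYABLE_STATUS:
            break  # success or a permanent failure: retrying will not help
        # A little random jitter keeps many clients from retrying in lockstep.
        sleep(delay + random.uniform(0, delay * 0.1))
        status = send()
    return status
```

With the defaults, a transaction is attempted at most five times (one initial call plus four retries), and a non-transient status such as 400 is returned immediately so the error can be reported without delay.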
Implement a failure threshold, and stop
Your implementation might run across cases in which many, many errors start appearing. In such cases, rather than continuing, it is often best to implement a threshold for errors and stop. Letting code continue to attempt transactions in such conditions risks burdening both your system and the API host system.
It is difficult to give blanket advice on when to stop: the right threshold depends on the use case and, above all, on why the repeated failures are occurring.
It is common for a missing dependency to cause chains of transaction failures when writing to an API. For example, if an API client is trying to register a student for a school, and the school does not exist, all registrations will fail, and continue to fail until the school is created on the API. Likewise, authorization issues commonly cause lots of errors.
You should consider whether your threshold should be by API resource or transaction type. For example, just because enrollments into one school are failing does not mean that transactions for other schools are as well.
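One way to keep a separate threshold per API resource or transaction type is sketched below in Python. The class name `FailureThreshold`, the limit of five consecutive failures, and the use of a school identifier as the key are assumptions chosen for illustration; the right key and limit depend on your use case.

```python
from collections import defaultdict


class FailureThreshold:
    """Track consecutive failures per key and signal when to stop."""

    def __init__(self, max_consecutive_failures=5):
        self.max_failures = max_consecutive_failures
        self._failures = defaultdict(int)

    def record(self, key, succeeded):
        """Record one transaction outcome for key (e.g., a school ID)."""
        if succeeded:
            self._failures[key] = 0  # a success resets the count
        else:
            self._failures[key] += 1

    def should_stop(self, key):
        """Return True once key has failed max_failures times in a row."""
        return self._failures[key] >= self.max_failures
```

Before each transaction the client would check `should_stop(key)` and skip the call if it returns `True`. Because counts are kept per key, repeated enrollment failures for one school halt only that school's transactions while others continue.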
Catch and surface errors to users, with context
This may seem obvious, but users need to be alerted when there are API transaction failures. Generally, the API provides an error message that should be shown to the user, but the details on the transaction might be helpful as well. For example, showing the user the data fields submitted or even the JSON that failed to load might be helpful.
Features that allow users to sort and filter errors by type, time, error message or data entity/API resources are also helpful. This allows users to discern patterns that your software cannot easily detect.
As with the best practices above, how best to implement this feature may depend on the use case, but the goal is to help the user identify and fix the problem.
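The following Python sketch shows one possible way to keep each failure together with its context and to filter and group failures for display. The `ApiError` fields and the helper names `filter_errors` and `count_by_message` are hypothetical, invented for this example rather than taken from any specific API.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ApiError:
    """One failed transaction, kept with enough context to show the user."""
    resource: str   # e.g., "enrollments" (hypothetical resource name)
    status: int     # HTTP status code returned by the API
    message: str    # error message returned by the API
    payload: dict   # the data that was submitted
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def filter_errors(errors, resource=None, status=None):
    """Return the errors matching every criterion that is not None."""
    return [
        e for e in errors
        if (resource is None or e.resource == resource)
        and (status is None or e.status == status)
    ]


def count_by_message(errors):
    """Count errors per message, most frequent first, to expose patterns."""
    return Counter(e.message for e in errors).most_common()
```

Grouping by message makes a pattern such as dozens of identical "school not found" failures visible at a glance, which points the user toward the missing dependency instead of the individual records.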
If you are an API implementer, use error messages that are precise, accurate, and actionable.