[RESOLVED] Delayed inbound SMS, voicemails and call events
Incident Report for Spruce Health
Postmortem

Summary

The messages were delayed because requests to our transcription provider timed out while transcribing voicemails. The timeout on uploading a recording to the provider was not correctly tuned, which led to a backlog of messages waiting to be processed by a set of application workers. Messages were still being processed throughout, but with a delay caused by the communication issues.
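
As a rough illustration of why the upload timeout matters, the sketch below models how long a pool of workers needs to clear a backlog when every message waits on a slow provider. It is a back-of-the-envelope model, not Spruce's actual implementation: the function name, the worker count, the latencies and the assumption that a worker moves on to the next message once the timeout expires are all hypothetical.

```python
def drain_time_s(num_messages: int, provider_latency_s: float,
                 timeout_s: float, workers: int) -> float:
    """Estimate the seconds needed to clear a backlog (hypothetical model).

    Each worker handles one message at a time and waits on the provider
    for at most `timeout_s` before moving on to the next message.
    """
    # A worker is blocked for the provider latency, capped by the timeout.
    per_message_s = min(provider_latency_s, timeout_s)
    # Number of rounds needed when `workers` messages are handled in parallel.
    rounds = -(-num_messages // workers)  # ceiling division
    return rounds * per_message_s


# Hypothetical numbers: 1000 queued messages, 10 workers, provider taking 60 s.
loose = drain_time_s(1000, provider_latency_s=60.0, timeout_s=120.0, workers=10)
tight = drain_time_s(1000, provider_latency_s=60.0, timeout_s=10.0, workers=10)
print(loose, tight)
```

Under these assumed numbers, a timeout longer than the provider's degraded latency lets every slow call block a worker for the full duration, while a tighter timeout bounds the time each message can hold a worker.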

Spruce was made aware of the issue through multiple customer complaints, and the engineering team started investigating as soon as the issue was escalated.

Action items to mitigate future impact

  • Add an alarm on the application worker responsible for processing transcriptions and SMS, so that we are notified as soon as possible should an issue like this arise again. Note that alarms were already in place for all but one of the workers.
  • Fine-tune the timeout for communication with the transcription provider to prevent a build-up in the event of future communication errors.
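
One common way to express the first action item is an alarm on the age of the oldest unprocessed message. The sketch below is illustrative only: the function name, the five-minute threshold and the idea of tracking the oldest enqueue time are assumptions, not a description of Spruce's monitoring.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def backlog_alarm(oldest_enqueued_at: Optional[datetime],
                  now: datetime,
                  max_age: timedelta = timedelta(minutes=5)) -> bool:
    """Return True when the oldest unprocessed message has waited longer
    than `max_age` (hypothetical threshold)."""
    if oldest_enqueued_at is None:
        # Empty queue: nothing is waiting, so there is nothing to alarm on.
        return False
    return now - oldest_enqueued_at > max_age


# Hypothetical check: a message has been waiting for ten minutes.
now = datetime(2022, 10, 18, 17, 15, tzinfo=timezone.utc)
print(backlog_alarm(now - timedelta(minutes=10), now))
```

Alarming on message age rather than queue length keeps the threshold meaningful regardless of traffic volume, since it measures the delay that customers actually see.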
Posted Oct 18, 2022 - 17:31 PDT

Resolved
The issue was resolved and the system returned to being fully functional at around 11:53 am PT.
Posted Oct 18, 2022 - 11:53 PDT
Identified
From 10:05 am PT to 11:53 am PT on October 18, 2022, voicemails, inbound SMS and inbound call events reached providers' Spruce inboxes with a delay. Each delayed event included an indication in the message itself of how long it had been delayed.

There was no impact to inbound calls, outbound calls, secure message exchanges, video calls, email or fax.

Spruce identified this issue in response to customer complaints rather than through the proactive monitoring in place for the system in general.
Posted Oct 18, 2022 - 10:05 PDT
This incident affected: Phone Call Routing and SMS Routing.