Spruce platform outage

Incident Report for Spruce Health

Resolved

The issue has been resolved, and the Spruce system should now be fully functional.

Our engineering team traced the outage to exhausted storage on the primary database, caused by a replication failure. A secondary database replica was unable to communicate with the primary database, so the binary logs used for replication accumulated over time until they exhausted the primary's storage.

As a short-term fix, we increased the storage on the database. For the long term, we have put alarms in place that notify us if storage usage or binary log size grows beyond a set threshold. We have also resolved the replication issue that caused the binary logs to grow in the first place.
Posted Feb 20, 2019 - 15:42 PST
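
For readers who want a concrete picture of the two alarms described in the resolution above, here is a minimal sketch in Python. It is not Spruce's actual monitoring code: the `StorageSample` type, the `check_alarms` function, the alarm names, and both default thresholds are hypothetical and chosen only for illustration. In a real deployment the sample values would come from the database or the hosting provider's metrics rather than being passed in by hand.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StorageSample:
    """One point-in-time reading of database storage (hypothetical shape)."""
    total_bytes: int    # provisioned storage on the primary
    used_bytes: int     # storage currently in use
    binlog_bytes: int   # combined size of retained binary logs


def check_alarms(
    sample: StorageSample,
    max_used_fraction: float = 0.80,          # assumed threshold, not from the report
    max_binlog_bytes: int = 50 * 1024 ** 3,   # assumed threshold (50 GiB)
) -> List[str]:
    """Return the names of the alarms that should fire for this sample."""
    if sample.total_bytes <= 0:
        raise ValueError("total_bytes must be positive")

    alarms = []

    # Alarm 1: overall storage usage has crossed the threshold.
    if sample.used_bytes / sample.total_bytes > max_used_fraction:
        alarms.append("storage_usage")

    # Alarm 2: binary logs are piling up, which can indicate that a
    # replica has stopped consuming them.
    if sample.binlog_bytes > max_binlog_bytes:
        alarms.append("binlog_size")

    return alarms
```

Checking binary log size separately from total usage is the useful part of this design: log growth can signal a stalled replica well before the disk itself is close to full.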

Update

We are continuing to monitor for any further issues.
Posted Feb 20, 2019 - 15:17 PST

Monitoring

Spruce should be back up and running. Inbound phone calls and SMS should also be coming through. We are monitoring the issue and will update here as soon as we have confirmed that the system is fully functional.
Posted Feb 20, 2019 - 15:07 PST

Update

We are continuing to work on a fix for this issue.
Posted Feb 20, 2019 - 14:59 PST

Identified

Customers are currently unable to access the web and mobile apps. Inbound phone calls and SMS routing are also impacted. Our engineering team has identified the issue at the database storage layer and is working on a fix.
Posted Feb 20, 2019 - 14:50 PST
This incident affected: Web App, Mobile Apps, Phone Call Routing, Fax, and SMS Routing.