On March 27 at 5:11am PT (12:11pm UTC), a faulty deployment process finished, leaving some of the Close.io app servers in an inconsistent state. This caused the majority of users' requests to fail for the next 29 minutes, until 5:40am PT (12:40pm UTC).
While investigating the issue, we realized two things:
Both of these issues resulted in our app servers responding to your requests with errors.
We identified the issues and redeployed, ensuring that all servers are in a consistent state and run non-conflicting versions of third-party packages.
We've already started working on fixing the issues that surfaced during this incident. Our main priority right now is to ensure that errors like this never happen again. We will achieve that goal by 1) introducing a more robust staging environment and 2) making our deployment process atomic and more resilient to external issues.
We sincerely apologize for the limited availability of Close.io during this incident.