On Friday, April 7, 2017, for approximately 30 minutes starting at 10:42 a.m. PDT, Close.io had an issue that prevented our app UI (web app and desktop apps) from loading properly. For users who already had the app (web or desktop) loaded on their screen, Close.io continued to function without issue. However, opening new windows or tabs of Close.io, or refreshing the page, did not work properly due to a JavaScript error. Our API was unaffected.
This incident started after a deployment of completely unrelated backend code went wrong. The deployment prior to that was a change to how JavaScript dependencies are loaded into Close.io, made to help with our goal of significantly reducing page load times. This prior change also introduced caching of webpack assets between builds to shorten developer wait times and deployment times.
This prior deployment was tested in both development and staging environments and had been running in production for 3 hours without issue before the incident began. However, due to a flaw in the new webpack static asset caching between builds, the next (unrelated) deployment shipped a broken static asset manifest.
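To illustrate the kind of failure described above, here is a minimal sketch of a pre-deploy check that compares a static asset manifest against the files actually present in the build output. It is not what Close.io runs, and this post does not say exactly how the manifest was broken. The sketch assumes a flat JSON manifest mapping logical asset names to built filenames relative to a static directory; the function name `find_missing_assets` and the file names in the usage note are hypothetical.

```python
import json
from pathlib import Path


def find_missing_assets(manifest_path, static_dir):
    """Return (logical_name, built_name) pairs whose built file is absent.

    Assumes the manifest is a flat JSON object such as
    {"app.js": "app.abc123.js"}; real manifest formats may differ.
    """
    manifest = json.loads(Path(manifest_path).read_text())
    root = Path(static_dir)
    missing = []
    for logical_name, built_name in sorted(manifest.items()):
        # An entry pointing at a file that was never written (or was
        # dropped from a stale cache) would break page loads after deploy.
        if not (root / built_name).is_file():
            missing.append((logical_name, built_name))
    return missing
```

A deploy script could call `find_missing_assets("build/manifest.json", "build/static")` and abort if the returned list is non-empty. A check like this only catches entries that point at missing files; it would not catch a manifest that points at stale but existing files.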
Once the broken manifest was deployed, we immediately spotted the issue and started working to roll back the problematic code. Within about 15 minutes, some new loads of Close.io began working properly. Within about 30 minutes, all servers had been fully reverted to non-broken code and the incident was resolved.
We're extremely sorry for this outage and are doing everything we can to make sure this type of problem doesn't occur again. Our action items include: