Between 1am UTC and 3am UTC on 11 January, we experienced a small-scale outage at Vero.
Thanks to our automated systems, our Operations team identified the issue immediately and promptly began work on a resolution.
In the last month we have added new caching stores to support our continued growth. This outage was a result of one of the critical servers in this cluster becoming unavailable. In the event of such an outage, the server should automatically fail over to a secondary instance. Our investigation has shown that this did not occur and we are currently working with our upstream provider to determine why.
There was no data loss during this time, as the affected server was only part of our caching layer.
We can confirm that only a subset of API and email processing was delayed during this time. Once service was restored, it took around 30 minutes for us to process the remaining backlog.
We will continue to monitor this closely for the next 24 hours.
If you have any questions, please send us an email via firstname.lastname@example.org. Have a great day wherever you are 🌏!
The Vero Infrastructure Team
Jan 12, 15:12 AEDT