Scripting Makes Mistakes Easier Than Ever | Voice of the DBA
source link: https://voiceofthedba.com/2022/04/22/scripting-makes-mistakes-easier-than-ever/
A number of you likely use Atlassian products like Jira, Confluence, or Opsgenie. You might have been affected by the large outage they had recently (post-incident blog, company Q&A, TechRepublic report), which lasted at least 9 days. I don’t know if all customers have their data back and are working, but according to a number of customer reports, this was a surprisingly poorly handled incident. There’s a great write-up from the outside that you might want to read.
The bottom line in this issue is that Atlassian looked to deactivate a legacy product with a script, but they apparently didn’t communicate well among their teams. The script ended up using the wrong customer IDs and also marked the sites for permanent removal, not temporary removal (soft delete). While they supposedly test their restore capabilities, they weren’t prepared for partial restores of subsites. I’m guessing this is a partial database restore, which many of us know is far more complex than a full database restore.
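To make the soft delete idea concrete, here is a minimal sketch of a deactivation script that guards against both failure modes described above. It is not Atlassian's code; the in-memory `sites` dictionary, the `deleted_at` field, and the function names are all hypothetical. The script checks the ID list before touching anything, defaults to a dry run, and only flags records so the change can be undone.

```python
from datetime import datetime, timezone


def soft_delete_sites(sites, ids_to_remove, expected_count, dry_run=True):
    """Flag sites as deleted without destroying their data.

    sites is a hypothetical in-memory store: {site_id: {"deleted_at": ...}}.
    expected_count is the number of sites the requester said would be affected.
    """
    ids = sorted(set(ids_to_remove))

    # Guard 1: every ID must refer to a site we actually know about.
    unknown = [i for i in ids if i not in sites]
    if unknown:
        raise ValueError(f"unknown site ids: {unknown}")

    # Guard 2: the list must match what the requesting team expected.
    if len(ids) != expected_count:
        raise ValueError(f"expected {expected_count} sites, got {len(ids)}")

    # Default to reporting what would change, not changing it.
    if dry_run:
        return ids

    now = datetime.now(timezone.utc)
    for site_id in ids:
        sites[site_id]["deleted_at"] = now  # soft delete: the row stays
    return ids


def restore_sites(sites, ids):
    """Undo a soft delete by clearing the flag."""
    for site_id in ids:
        sites[site_id]["deleted_at"] = None
```

A permanent delete would remove the records themselves, which is why the undo path here is a one-line flag reset and not a restore from backup.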
Leave aside the issue of a software-as-a-service (SaaS) company failing their customers, and the lack of communication with customers. The more interesting thing for me is the challenge of poor coding and communication internally. Clearly, the project to deactivate their legacy app wasn’t well planned or tested and the code used was probably executed at too wide a scale initially.
When we deploy code changes to a large number of items, we want to test them on a small number first. Whether we are deploying to multiple databases, against many customers, or to different systems, a standard method of making changes at scale involves working in rings. Azure DevOps describes this in docs, and they actually use rings to change the platform. We used the same pattern 20 years ago for software and database updates to many systems. We would internally deploy to a few users to look for issues. Then a week later we would deploy to a small number of systems to check for unexpected issues. The third ring typically covered most systems, with a fourth ring a week later to catch up stragglers that needed more time to prepare.
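The ring pattern can be sketched in a few lines. This is an illustration under assumptions, not how Azure DevOps implements it: `apply_change` and `health_check` are hypothetical callables you would supply, and a real rollout would wait between rings (a week, in the schedule above) before moving on.

```python
def deploy_in_rings(rings, apply_change, health_check):
    """Apply a change ring by ring, halting at the first unhealthy ring.

    rings is an ordered list of (ring_name, targets) pairs, smallest first.
    apply_change(target) makes the change; health_check(target) returns
    True when the target looks healthy afterwards. Both are hypothetical.
    """
    completed = []
    for name, targets in rings:
        for target in targets:
            apply_change(target)

        # In practice you would pause here to let problems surface.
        failed = [t for t in targets if not health_check(t)]
        if failed:
            # Stop before the problem reaches the larger rings.
            return {"completed": completed, "halted_at": name, "failed": failed}

        completed.append(name)

    return {"completed": completed, "halted_at": None, "failed": []}
```

The point of the structure is that a bad change only ever reaches the rings processed so far; the widest ring is never touched unless every smaller ring came back healthy.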
I find many customers, especially those with sharded/federated databases or many systems, unwilling to spread out deployments in this manner. Often they yield to pressure from business users to ensure everyone gets the same update at the same time. I would never recommend this approach, as we need to ensure we are looking at scripts in a controlled environment, or even two, before we deploy things widely. I’d be even more cautious about one-off administrative scripts that might make a change similar to the one Atlassian attempted. Those are often not tested seriously enough.
At the very least, any of us working with multiple customers in a single database or in multiple databases ought to ensure we can back up and restore a single customer. More importantly, can you restore a group of customers? If you make a mistake like Atlassian’s, which scripting allows us to do extremely rapidly, can you recover a partial set of data? Many of us don’t test this, but it’s something we ought to consider when we work with scripts that are designed to change only some data. Most of us don’t experience complete failures, but partial ones, usually because of human error. We ought to know how to deal with these situations.
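As a rough illustration of what a partial restore involves, the sketch below copies only the affected customers' rows from a backup into the live database, using Python's built-in `sqlite3` module. The `orders` table, its `customer_id` and `item` columns, and the function name are assumptions made for the example; a real multi-tenant schema would have many related tables to restore together and in the right order.

```python
import sqlite3


def restore_customers(live, backup, customer_ids):
    """Restore only the given customers' rows from backup into live.

    live and backup are sqlite3 connections that both contain a
    hypothetical orders(customer_id, item) table. Returns rows restored.
    """
    ids = list(customer_ids)
    if not ids:
        return 0

    placeholders = ",".join("?" for _ in ids)
    rows = backup.execute(
        "SELECT customer_id, item FROM orders "
        f"WHERE customer_id IN ({placeholders})",
        ids,
    ).fetchall()

    # One transaction: either every customer is restored or none is.
    with live:
        # Clear any partial leftovers for these customers only.
        live.execute(
            f"DELETE FROM orders WHERE customer_id IN ({placeholders})", ids
        )
        live.executemany(
            "INSERT INTO orders (customer_id, item) VALUES (?, ?)", rows
        )
    return len(rows)
```

Customers outside the list are left alone, so any changes they made after the backup was taken survive. That is the part a full database restore cannot give you, and the part worth rehearsing before a script goes wrong.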
Steve Jones
Listen to the podcast at Libsyn, Stitcher, Spotify, or iTunes.