The shift in mindset when developing software used by 10s vs 100,000s
date
Jan 28, 2025
tags
SWE
summary
My mindset shifted when I started developing software that is used by thousands of users.
A little story
When I was developing a feature for my team's existing product, I was eager to write the code, write some tests, deploy to non-prod and get it into prod. However, it was not so simple. I can't dive into the details of my work, but essentially my manager told me to take a step back and really think through how my change could impact the end users. He asked me some important questions that prompted me to look at the change from their side.
You see, my team's service is used by thousands of other engineers and build servers throughout the firm. Any issue with the service could cause a lot of problems for many other teams and engineers 😬. We really don't want that. He essentially wanted me to prove to him that my change would not break production. Only then could I release to prod.
How do I prove that?
Well, I knew it worked perfectly in non-prod. I had deployed it there and run a bunch of our internal test projects. But that is not enough. When you have thousands, and sometimes hundreds of thousands, of users, it is very difficult to think of all the different ways they could use the service. Someone might use it in a way that you did not predict. So how would I prove it to him?
You see, in proofs it is very easy to prove that something does not work. Simply provide one counterexample where it fails and boom, you can be sure that it doesn't work. But to prove that something works, you either have to show that it works for every individual case or prove it in some other way.
So, thinking hard about this, what better way to give my manager peace of mind than proving it with some cold hard data? Data on all the users using the service in all the different ways, captured from the logs. (It might be difficult to understand exactly what I'm talking about here without saying what I did, but I can't reveal that, sorry 😞.)
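To make the asymmetry concrete, here is a small sketch in Python. It is not what I actually did at work; the two functions and the inputs are made up purely for illustration. The idea is that a single input where the old and new behaviour differ is enough to show a change is unsafe, while "safe" only ever means "safe for the inputs I checked".

```python
from typing import Callable, Iterable, Optional


def find_counterexample(
    old: Callable[[str], str],
    new: Callable[[str], str],
    inputs: Iterable[str],
) -> Optional[str]:
    """Return the first input where old and new disagree, or None if they agree on all inputs."""
    for value in inputs:
        if old(value) != new(value):
            return value
    return None


# Hypothetical "before" and "after" behaviour of some feature.
def old_normalize(name: str) -> str:
    return name.strip().lower()


def new_normalize(name: str) -> str:
    return name.lower()  # the strip() was dropped in the change


# Inputs I thought of myself: no counterexample, so the change looks safe.
assert find_counterexample(old_normalize, new_normalize, ["abc", "ABC"]) is None

# One unexpected real-world input is enough to disprove "it works".
assert find_counterexample(old_normalize, new_normalize, ["abc", " abc"]) == " abc"
```

A result of None only covers the inputs that were passed in, which is why the choice of inputs matters so much.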
In some companies with user-facing products used by millions if not billions of people, there are entire dedicated teams that perform this type of analysis, and they don't release to production in one go. They might roll out first through A/B tests or some sort of canary deployment.
But for my team's service, I've got to do all that by myself, bud 😏. And that is okay. This is the ownership that comes with owning a feature or product. You have to own the feature from development all the way to deployment and take responsibility for any issues.
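For anyone unfamiliar with canary deployments, one common way to pick who gets the new version first is to hash each user into a stable bucket and enable the change for a small percentage. The sketch below is a generic illustration, not any particular company's system; the function name `in_canary` and the bucketing scheme are my own assumptions.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the first `percent` of 100 stable buckets."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer, then map it to a bucket 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return bucket < percent


# Hypothetical user IDs. The same user always lands in the same bucket,
# so raising the percentage only ever adds users to the canary group.
users = [f"user-{i}" for i in range(1000)]
canary_5 = {u for u in users if in_canary(u, 5)}
canary_25 = {u for u in users if in_canary(u, 25)}
assert canary_5 <= canary_25
```

Because the bucket depends only on the user ID, a rollout can go from 5% to 25% to 100% without anyone flipping back and forth between versions.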
So what next?
So I analysed all the usage logs, aggregated the data and showed how my change would affect the systems and why it wouldn't break production. As a result of my analysis, I discovered that I had to make one slight change before deploying, for extra safety 🦺.
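Since I can't share the real analysis, here is a made-up sketch of the general shape of it: group log lines by usage pattern, then measure what share of real traffic a change would touch. The log format (`<user> <command> <flags...>`), the sample lines and the helper names are all hypothetical.

```python
from collections import Counter

# Hypothetical usage log lines in the form "<user> <command> <flags...>".
SAMPLE_LOGS = [
    "alice build --cache",
    "bob build --cache",
    "ci-1 build --no-cache --verbose",
    "ci-2 build --verbose --no-cache",
    "carol publish",
    "malformed",
]


def usage_patterns(lines):
    """Count how often each (command, sorted flags) combination appears."""
    counts = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip lines that have no command
        command, flags = parts[1], tuple(sorted(parts[2:]))
        counts[(command, flags)] += 1
    return counts


def affected_share(counts, is_affected):
    """Fraction of all logged calls whose pattern the change would touch."""
    total = sum(counts.values())
    hit = sum(n for pattern, n in counts.items() if is_affected(pattern))
    return hit / total if total else 0.0


patterns = usage_patterns(SAMPLE_LOGS)
# Suppose the change only alters behaviour when --no-cache is passed.
share = affected_share(patterns, lambda p: "--no-cache" in p[1])
print(f"{share:.0%} of logged calls would be affected")
```

Sorting the flags means `--no-cache --verbose` and `--verbose --no-cache` count as the same pattern, which keeps the list of distinct cases short enough to reason about one by one.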
Takeaway
This episode really taught me not to think like I'm still in school or developing a personal project. With personal projects, I just code and deploy anything I want without giving a second thought to how it might affect the user, since there's little to no user traffic at all. I might even be able to get away with it when there are very few users, since I have most likely covered the obvious scenarios.
But when working on real software that is used by hundreds of thousands of users and has real business impact, I realized the need to be meticulous and rigorous in coding and releasing anything at all.
Of course, even with very few users, there could be scenarios that you didn't think of, but the risk becomes more significant when thousands or hundreds of thousands of users use your product.
I've learned to always keep the end users in mind and substantiate any of my suggestions or solutions with data. This might not always be possible but whenever it is, I do it. So a key learning for me has been this:
Always keep the user in mind when developing, and support my solutions with data, especially when deploying to prod.