June 21, 2026 | Engineering

How I Migrated Production To Kubernetes With Zero Downtime

The real story of moving a production app from Docker Compose to K3s Kubernetes without a single minute of downtime, and why running both environments in parallel is the only sane way to do it.
Kubernetes
K3s
DevOps
Zero Downtime Deployment
Docker
For a long time, deploying meant SSHing into a server, pulling the latest images, and running a Docker Compose command while watching the logs and hoping. Most of the time it was fine. The times it was not fine were memorable. A container would fail to come up clean, the old one was already gone, and the site was down while I scrambled. Traffic would spike and the single box would strain because there was nothing to spread the load across. Every deploy carried a small tax of dread.

Moving that app to Kubernetes is the thing that removed the dread. Here is how it actually went, and why the migration itself caused zero downtime.

Kubernetes gets talked about like it is only for companies with a hundred engineers. It is not. The features that matter for a small team are the boring ones. Rolling deploys mean a new version comes up and gets proven healthy before the old one goes away, so a version that fails its health checks never replaces the one that is working. Self-healing means a crashed container gets restarted automatically instead of waiting for a human to notice. Scaling means another copy of the app is a config change, not a manual server build.

I run a lightweight distribution called K3s for this. It is real, certified Kubernetes, but it runs comfortably on modest hardware without the operational weight of a big managed cluster. For an app that does not need to span data centers, it is the right amount of Kubernetes.

The single most important decision in a zero-downtime migration is to never cut over blind. The old environment keeps running and keeps serving real users the entire time the new one is being built and tested. Nothing changes for users until the new environment has proven it can do the job.

That sounds obvious written down. The reason people get burned is that they treat the migration as one big switch flipped at midnight, and when something they did not anticipate breaks, there is nothing to fall back to. Running both removes that whole category of disaster.
The new cluster is just a thing being validated in the background until the moment it is ready, and the rollback plan is simply leaving the old environment exactly where it is.

The app was already in Docker, which helped, but Compose hides a lot of assumptions that Kubernetes forces you to make explicit. Where does configuration come from? What happens to uploaded files when a container is replaced? How does the app know it is healthy? Kubernetes wants real answers to all of that, which is annoying for an afternoon and a gift forever, because those assumptions stop being invisible.

I moved configuration into proper config and secret objects so nothing sensitive lived in an image. I made sure anything that wrote files wrote them somewhere durable, not inside a container that disappears on the next deploy. And I added real health checks so the cluster could tell the difference between a container that is running and one that is actually ready to serve, which is the whole foundation of a safe rolling deploy.

App containers are stateless and easy to run in parallel. The database is neither. The safe move is to keep a single source of truth for data through the entire cutover and migrate only the stateless app traffic. The new app environment points at the same database the old one uses, so there is no moment where data is being copied while writes are still landing. You move the traffic, not the data, and the data never has two masters fighting over it. That single decision is what makes the rest calm.

Once the new cluster is happily talking to the real database and serving correct responses, the only remaining step is moving traffic. By the time I changed where traffic pointed, the new cluster had already been serving validation traffic and answering correctly. The switch itself was a routing change, and because the old environment was still standing and still connected, the worst case was moving traffic back in seconds. It was not needed. Users saw nothing.
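
As a rough illustration of the "never cut over blind" rule, the gate below only hands traffic to the new environment once a health probe has passed several times in a row, and otherwise leaves it on the old one. This is a sketch under stated assumptions, not the tooling used in this migration: the function names, the thresholds, and any URLs passed in are hypothetical.

```python
import http.client
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 2.0) -> Callable[[], bool]:
    """Build a probe that reports True only when `url` answers with HTTP 200."""
    def probe() -> bool:
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                return resp.status == 200
        except (OSError, http.client.HTTPException):
            # Connection errors, timeouts, and 4xx/5xx responses all count as failures.
            return False
    return probe


def ready_for_cutover(
    probe: Callable[[], bool],
    required_successes: int = 5,
    max_attempts: int = 20,
    delay_seconds: float = 1.0,
) -> bool:
    """Return True once `probe` has succeeded `required_successes` times in a row."""
    streak = 0
    for _ in range(max_attempts):
        if probe():
            streak += 1
            if streak >= required_successes:
                return True
        else:
            streak = 0  # a single failure resets the streak
        if delay_seconds > 0:
            time.sleep(delay_seconds)
    return False


def choose_upstream(old_url: str, new_url: str, probe: Callable[[], bool], **gate_options) -> str:
    """Pick where traffic should point. The old environment is the default and the rollback target."""
    return new_url if ready_for_cutover(probe, **gate_options) else old_url
```

In use, the probe would target whatever readiness endpoint the new cluster exposes, for example `choose_upstream(old, new, http_probe(new + "/readyz"))` with a hypothetical `/readyz` path. Rolling back is the same decision in reverse: point traffic at `old_url` again, which still works because the old environment was never taken down.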
There was no maintenance window, no banner, no apology email.

The numbers that came out the other side were the point. Backend uptime went from 98.3% to near 100%, because the new setup heals itself and deploys without dropping requests. The system I migrated this way serves more than 150 client organizations. The day-to-day change was just as valuable. Deploys stopped being an event. They became a push.

The reason to do this is not that Kubernetes is fashionable. It is that fragile hosting taxes you constantly, in held breaths at deploy time and in outages you cannot prevent. Moving to a cluster that heals itself and deploys without downtime turns the most stressful part of running an app into the most boring part, which is exactly where you want it.

If your app lives on a single server or a Compose file and every deploy is a gamble, this is the kind of work I do. The full breakdown is on the Cloud and Kubernetes Migration service page, and if slow or unreliable performance is the deeper problem, the API and Database Performance service is the companion to this one.

Frequently asked questions

Can you really migrate to Kubernetes with zero downtime?

Yes, if you do not treat it as a single switch. You stand the new environment up alongside the old one, point it at the same database, verify it is healthy while it serves validation traffic, and only then move user traffic across. The old environment stays ready to take traffic back instantly, so a problem is a rollback, not an outage.
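
"Healthy" in this answer means ready to serve, not merely running. The sketch below shows one way an app could expose that distinction over HTTP so that Kubernetes liveness and readiness probes have something to call. The `/healthz` and `/readyz` paths, the port, and the `AppState` helper are illustrative assumptions, not details taken from the migrated app.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class AppState:
    """Tracks whether startup work (such as connecting to the database) has finished."""

    def __init__(self) -> None:
        self._ready = threading.Event()

    def mark_ready(self) -> None:
        self._ready.set()

    def mark_not_ready(self) -> None:
        self._ready.clear()

    def is_ready(self) -> bool:
        return self._ready.is_set()


def probe_status(path: str, state: AppState) -> int:
    """Map a probe path to an HTTP status code."""
    route = path.split("?", 1)[0]
    if route == "/healthz":
        # Liveness: the process is up and able to answer at all.
        return 200
    if route == "/readyz":
        # Readiness: only accept traffic once startup work is done.
        return 200 if state.is_ready() else 503
    return 404


def make_handler(state: AppState):
    """Build a request handler class bound to the given state."""

    class ProbeHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            status = probe_status(self.path, state)
            body = b"ok\n" if status == 200 else b"unavailable\n"
            self.send_response(status)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args) -> None:
            pass  # keep frequent probe requests out of the logs

    return ProbeHandler


def serve_probes(state: AppState, port: int = 8080) -> None:
    """Serve the probe endpoints until the process exits (blocking call)."""
    HTTPServer(("0.0.0.0", port), make_handler(state)).serve_forever()
```

The design point is that a container can pass the liveness check while still failing readiness. During a rolling deploy, that gap is what keeps traffic on the old version until the new one calls `mark_ready()`.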

Do I need full Kubernetes or is K3s enough?

For most small to mid-sized apps, K3s is plenty. It is a lightweight, certified Kubernetes distribution that runs comfortably on modest servers and gives you the same rolling deploys, self-healing, and scaling without the operational weight of a large managed cluster. I run my own products on K3s.

What is the riskiest part of a migration?

The database, almost always. App containers are easy to run in parallel. State is not. The safe pattern is to keep one source of truth for data during the cutover and move only the stateless app traffic, so you are never trying to migrate live data and live traffic at the same moment.
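
One cheap safeguard for the single-source-of-truth rule is to confirm, before any traffic moves, that both environments are configured with the same database. The sketch below assumes each environment stores its connection string as a URL; the function names and example URLs are hypothetical.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def database_identity(url: str) -> Tuple[Optional[str], Optional[int], str]:
    """Reduce a connection URL to (host, port, database name), ignoring credentials."""
    parts = urlsplit(url)
    return (parts.hostname, parts.port, parts.path.lstrip("/"))


def shares_database(old_url: str, new_url: str) -> bool:
    """True when both environments point at the same host, port, and database."""
    return database_identity(old_url) == database_identity(new_url)


# Hypothetical connection strings for the old and new environments.
OLD_DB = "postgresql://app:secret@db.internal:5432/prod"
NEW_DB = "postgresql://k8s_app:other@db.internal:5432/prod"

if not shares_database(OLD_DB, NEW_DB):
    raise SystemExit("Environments point at different databases; do not cut over.")
```

The comparison is deliberately strict. Two URLs that reach the same server under different hostnames, or that differ only by an omitted default port, are reported as different, which errs on the side of blocking a cutover rather than allowing a split.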

Where can I get help with a migration like this?

This is one of the services I offer. There is a full writeup on the Cloud and Kubernetes Migration service page, and you can book a call from there if you want your own app moved off fragile hosting.
How I Migrated Production To Kubernetes With Zero Downtime | Kevin Gabeci