When it comes to running managed infrastructure in the cloud, I've been on both sides of the table. As a platform architect, I worked on microservice applications doing streaming IT automation. And as a solutions architect, I've worked with a lot of different companies, tech stacks, and problem spaces, picking the right tool for the job.
Here are five tips on when and why you might want to work with a cloud-based database service provider, and how you can go about picking the right one.
1. Have a strong opinion about vendor lock-in
Vendor lock-in is anything that keeps you on the platform longer than you want to be there. It may or may not be a concern, but you should have a strong opinion about it.
People usually think about proprietary features or sticky functionality, like machine learning out of the box, or a fancy control panel, or some automated deployment mechanism. Fair enough, but it may or may not be really needed. You should definitely know if it's just a nice-to-have feature or if it's required for your success as a company.
But the second part that people don't often think about is the refactor cost of moving away from the platform you chose. Is it a simple configuration change in one repository, or a more expensive configuration change across 300 microservices? Or does it mean refactoring your code entirely? Keep that question in mind when you're picking a technology vendor. Lock-in isn't always about features. Lock-in is anything that keeps you from moving away.
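To make the "simple configuration change" case concrete, here is a minimal sketch of keeping connection details in one place. It assumes the service reads a single environment variable, which I've called DATABASE_URL here; that name, the example hostnames, and the `database_settings` helper are all hypothetical, not something any particular vendor requires.

```python
import os
from urllib.parse import urlparse


def database_settings(env=None):
    """Parse connection settings from one variable (DATABASE_URL is an assumed name)."""
    env = os.environ if env is None else env
    url = urlparse(env["DATABASE_URL"])
    return {
        "scheme": url.scheme,
        "host": url.hostname,
        "port": url.port,
        "database": url.path.lstrip("/"),
    }


# Switching providers then means changing one value, not the application code.
# Both hostnames below are made-up examples.
before = database_settings({"DATABASE_URL": "postgres://app:secret@db.old-vendor.example:5432/orders"})
after = database_settings({"DATABASE_URL": "postgres://app:secret@db.new-vendor.example:5432/orders"})
print(before["host"], "->", after["host"])
```

If every one of your 300 microservices hard-codes its own hosts and credentials instead, the same move becomes 300 changes, which is the refactor cost worth estimating up front.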
2. Define acceptable downtime
You really need to start by defining what acceptable downtime is for your end users and then work backwards from there. What type of foundation do you need to build on to create an infrastructure that can provide your end users with the desired uptime SLA?
Uptime SLAs are usually measured in nines.
Three nines, or 99.9% uptime, means about 43 minutes of downtime per month. The next tier is usually 99.95%, which is about 22 minutes of downtime per month. And then four nines, 99.99%, is usually the highest industry standard, with about four minutes of downtime per month. Some services try to get a lot closer to five nines. However, you really need to figure out a good balance between engineering effort for uptime guarantees and building revenue-generating features.
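If you want to check these numbers for your own target, the arithmetic is short. This is a rough sketch that assumes an average month of about 30.44 days (365.25 / 12); a 30-day month gives slightly smaller budgets. The function name is my own.

```python
# Average month length in minutes: 365.25 days / 12 months * 24 h * 60 min.
MINUTES_PER_MONTH = 365.25 / 12 * 24 * 60


def downtime_minutes_per_month(uptime_percent: float) -> float:
    """Return the minutes of downtime per month that an uptime SLA allows."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_MONTH * (1 - uptime_percent / 100)


for sla in (99.9, 99.95, 99.99, 99.999):
    print(f"{sla}% uptime allows {downtime_minutes_per_month(sla):.1f} minutes of downtime per month")
```

Working backwards from what your end users will tolerate, you can plug in a candidate SLA and see whether the resulting budget is realistic for your team.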
When you're looking at databases as a service, read the fine print in the SLA. A lot of services are very upfront about what is included in and what is excluded from the uptime SLA.
For example, look at the maintenance windows. Make sure you know what the expected maintenance window is, whether there's any downtime associated with it, and whether that downtime counts against the SLA. Some cloud services provide an SLA that includes a 10-minute maintenance window, but they exclude that window from the downtime SLA measurement.
So think about the difference between 99.95% and 99.99%. Four nines of uptime allows about four minutes of downtime per month. If you add a 10-minute maintenance window that doesn't count against the SLA, you're at about 14 minutes, which is closer to 99.95% than to four nines. Keep that in mind when you're selecting databases and services.
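Here is a small sketch of that adjustment: it takes the advertised SLA and the excluded maintenance minutes, and returns the uptime your users would actually see if both were fully used. It assumes an average month of about 30.44 days, and the function name is hypothetical.

```python
# Average month length in minutes: 365.25 days / 12 months * 24 h * 60 min.
MINUTES_PER_MONTH = 365.25 / 12 * 24 * 60


def effective_uptime_percent(sla_percent: float, excluded_maintenance_minutes: float) -> float:
    """Uptime seen by users when excluded maintenance is added to the SLA downtime budget."""
    sla_downtime = MINUTES_PER_MONTH * (1 - sla_percent / 100)
    total_downtime = sla_downtime + excluded_maintenance_minutes
    return 100 * (1 - total_downtime / MINUTES_PER_MONTH)


# A 99.99% SLA with a 10-minute maintenance window excluded from measurement.
print(f"{effective_uptime_percent(99.99, 10):.3f}% effective uptime")
```

This is a worst-case view, since it assumes the vendor uses the whole SLA budget and the whole maintenance window every month, but it is the number to compare against what you defined as acceptable.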
3. Know your limitations
Whether you've been running Postgres in production for 10 years or you're starting brand new with Kafka, make sure that you have the right expertise in the equation to make yourself successful. And REALLY be honest with yourself.
It's always a shared responsibility matrix of who does what and who owns the DBA responsibilities. So there's a whole continuum from application level data schema design and query optimization all the way down to server patching and maintenance. Make sure that you split up the responsibilities. If you need help with the data schema design or query optimization, make sure that the vendor has professional services you can obtain to get that help, or bring in a third party.
But make sure that you don't build in anti-patterns when you're starting with the new technology. Four months down the road it's not that much fun to deal with refactoring and hot-deploying fixes to production.
4. Research hidden costs
If you're comparing a managed service provider against a DIY deployment, keep in mind that servers are usually really cheap in the cloud. People are going to be the most expensive component of that deployment. So make sure you factor in what it's going to cost to hire DBA expertise, to ensure that you have everything you need to be successful. Make sure you know what it's going to cost to have the ops team available at 3:00 AM on a Saturday, and factor that into the people cost.
If you're comparing a DIY deployment against a cloud managed service provider, always factor in the hidden costs of the cloud, like networking. For example, shipping data across regions becomes very expensive. Egress between regions for your services, or egress out of your cloud, is charged per gigabyte transferred. So if you're thinking that you're going to be operating at the terabyte or petabyte per month level, you should absolutely calculate that out upfront to know what it's going to cost.
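Calculating it out upfront can be as simple as the sketch below. The price per gigabyte is a made-up placeholder, not any provider's real rate, so substitute the figures from your own cloud's pricing page. It also assumes flat pricing with no volume tiers, and uses 1 TB = 1,000 GB.

```python
def monthly_egress_cost(gb_transferred: float, price_per_gb: float) -> float:
    """Estimate the monthly egress bill under flat per-gigabyte pricing (an assumption)."""
    return gb_transferred * price_per_gb


# Placeholder rate for illustration only; look up your provider's actual price.
HYPOTHETICAL_PRICE_PER_GB = 0.02

for label, gigabytes in (("1 TB", 1_000), ("100 TB", 100_000), ("1 PB", 1_000_000)):
    cost = monthly_egress_cost(gigabytes, HYPOTHETICAL_PRICE_PER_GB)
    print(f"{label} per month -> ${cost:,.2f}")
```

Even with a small per-gigabyte rate, the bill scales linearly with traffic, so a service that looks cheap at the terabyte level can be a major line item at the petabyte level.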
The other factor is the cost of enterprise features like security and compliance. If you need to upgrade a tier to get SSO login, compliance, or even VPC peering, that can significantly affect the overall cost.
5. Plan for growth together
Every company wants to grow its revenue by the end of the year, just like you do with your customers. You are a customer of the managed service provider, and they're going to want to grow your account by 10%, 20%, or maybe 50%. What's that going to look like at the end of the year?
If you're on GCP or AWS, maybe that just looks like adopting new services or just growing normally with scale. So you may be locked into a cloud deployment, or you may be locked into certain regions to get discounts, but you're not locked into a single service.
Alternatively, if you pick a vendor that specializes in a certain technology, like Elastic.co, Confluent for Kafka, or Redis Labs, you may be locked into that one service.
And when they try to grow your account, that's either increased usage or adopting new use cases into that technology. If you're pushing new use cases into a technology, just make sure that it's the right tool for the job.
For example, Kafka may or may not be the right choice to use as a database. I've used it like that in the past for smaller data sets where I needed low-latency changes distributed across a clustered system, but there are cases where Kafka definitely wasn't the right answer for a database. So make sure you pick the right tool for the job.
Above all, remember, it's a partnership. You're passionate about what you're building and what you're selling to your end users. Make sure that they are, too! They're there to support you and grow with you.
Not using Aiven services yet? Sign up now for your free trial at https://console.aiven.io/signup!