Considerations of Designing for Scale

FearZero

Who am I?

I've gone by a few usernames over the years, but I was very heavily involved in this specific community from before cpvr first started this site all the way up until 2008. Then I came back from 2009 to 2011. Since then I've been off doing life stuff. I don't really want to go too deep into who I am; I was pretty childish in the past. However, I did learn a lot starting in this community, and I've been able to achieve success with what I learned here.

My background includes the following:

  • Web development with HTML/CSS, JavaScript, PHP, and MySQL (been using since 2002).
  • C/C++, Ruby, Java, C#, Python, Visual Basic 6 + .NET, and various other languages (all learned between 2002 and 2008).
  • Linux Systems Administration with cPanel, Ubuntu, and RHEL. Bash scripting (been using since 2008).
  • Unreal Engine 4 (been learning and using this for VR game development since early 2017).
  • Cloud Computing Technologies (been using companies like Amazon, Rackspace, and Linode since 2010).
  • Currently I'm a DevOps Engineer (Development and Operations).

What Does it Mean to Design for Scale?

In the past, the way we would deploy a website would be to place an order with a company to have a physical server that hosted our services with them. Alternatively we would pay for a shared hosting account with a provider that uses something like cPanel. The process for getting a site up used to be super slow between sales and deployment, while today we can click a button and have a virtual private server running for $5 a month at some cloud hosting provider.

Having all of your code deployed to a single location can be fine starting off. However, it is a single point of failure. Back when I was first trying to make my own Neopets, I never considered the infrastructure implications. In fact, I didn't realize that what I was writing in PHP was going to affect the server's available resources. Even with the MySQL queries I ran, I had no true concept of the impact they had on the database as a whole, so I shoved every bit of information I could think of in a table somewhere.

Around 2008, something truly great began to happen. Cloud Hosting became a really big buzzword. When I first heard the word "cloud", it was misused a lot, and it became a meaningless trope that I came to hate, because everything internet related was "cloud". After working for some of the big companies I ended up at, though, I learned a lot about Cloud Computing and how to use such technologies.

The key idea is that if your site has a spike in traffic, you can provide more resources (more servers running your code) to ensure the site stays up. If you look at a big company like Netflix, their infrastructure is run on Amazon Web Services (AWS). This lets them write code that calls AWS over a RESTful API to spin up more servers with the right services, as needed, to prevent downtime. They have a tool, Chaos Monkey, that always runs and randomly breaks things to see if the service can handle such issues without immediate intervention. Their design philosophy is essentially that all of the services for the website should work, always, regardless of something breaking. No one part of their infrastructure should cause everything to be down. There is no single point of failure.
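To make the scaling idea a bit more concrete, here is a minimal Python sketch of the kind of decision an autoscaler makes. This is not how Netflix or AWS actually implement it; the function name, the 60% CPU target, and the minimum/maximum bounds are all assumptions picked for illustration.

```python
import math


def desired_server_count(current_servers, avg_cpu_percent,
                         target_cpu_percent=60.0,
                         min_servers=2, max_servers=20):
    """Return how many servers we would like to be running.

    Hypothetical rule: scale the fleet in proportion to how far the average
    CPU usage is from the target, then clamp the result to the bounds.
    """
    if current_servers < 1:
        raise ValueError("current_servers must be at least 1")
    wanted = math.ceil(current_servers * avg_cpu_percent / target_cpu_percent)
    # Keeping min_servers at 2 or more means a redundant copy is always
    # running, even when traffic is low.
    return max(min_servers, min(max_servers, wanted))
```

Real code would take this number and ask the hosting provider's API to add or remove servers until the running count matches it.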

So to design for scale simply means to write code and services so that you can run redundant copies in multiple places at once, without an outage in one location causing the site to go down.

How to Design for Scale?

This is easier said than done, because you not only need to know how to make your application, but you also need to learn how to deploy infrastructure. Unlike in years past, we now have companies like Amazon's AWS, Rackspace Cloud, Linode, Digital Ocean, RedHat OpenShift, OVH Cloud, and more. Some of them provide "Autoscale" solutions that you can look into, while with others you would need to implement a solution yourself.

Deploying infrastructure manually also isn't the correct way to do things anymore and in most cases you don't even want to write code to interact with their APIs directly. You can treat your infrastructure like code with tools like Terraform (https://terraform.io). 
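The idea behind treating infrastructure like code is that you describe what should exist and let a tool work out the changes. The Python sketch below only illustrates that idea. It is not Terraform's API or configuration language, and the `plan` function and server names are made up for the example.

```python
def plan(desired, actual):
    """Compare the servers we want with the servers that exist.

    Both arguments map a server name to its settings (a dict). Returns three
    sorted lists of names: to create, to update, and to destroy.
    """
    to_create = sorted(name for name in desired if name not in actual)
    to_destroy = sorted(name for name in actual if name not in desired)
    to_update = sorted(
        name for name in desired
        if name in actual and desired[name] != actual[name]
    )
    return to_create, to_update, to_destroy


# Hypothetical example: we want two web servers, but only one exists and an
# old test server is still running.
desired = {
    "web-1": {"size": "small", "region": "nyc"},
    "web-2": {"size": "small", "region": "nyc"},
}
actual = {
    "web-1": {"size": "small", "region": "nyc"},
    "test-1": {"size": "small", "region": "nyc"},
}
print(plan(desired, actual))  # (['web-2'], [], ['test-1'])
```

Because the description is just a file, it can live in version control next to your application code and be reviewed the same way.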

For newcomers who want to look into infrastructure on a budget, I recommend Digital Ocean, Linode, and OVH. They provide some scale at low cost. AWS has a "free tier" and very good tutorials for its services. In my opinion AWS is by far the best for infrastructure and deployment, but there is a very big learning curve, and the pricing model is hard to follow. However, the AWS getting started guides are very good resources, and the sheer number of services on offer is vast. They provide cloud servers, file storage, CDN services, DNS services, ... (list goes on), and machine learning services.

You will need to learn how to set up systems for whatever you're doing. Most of you who are coding in PHP will want to look into setting up Apache (or Nginx), PHP, and MySQL on cloud servers running Linux (CentOS, or Ubuntu [##.04 LTS]).

You could completely skip the common LAMP (Linux/Apache/MySQL/PHP) stack and use something like Go to make a binary that you can just run on Linux servers. Making microservices that your static website sends RESTful requests to is pretty much the future. This gives you a single endpoint that both your website and potential phone applications can interact with.
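As a rough idea of what such a microservice looks like, here is a small sketch using only Python's standard library (the post mentions Go; Python is used here purely to keep the example short). The `/pets` routes and the in-memory `PETS` data are hypothetical; a real service would read from a database.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data standing in for a database.
PETS = {"1": {"name": "Blaze", "species": "dragon"}}


def handle_get(path):
    """Map a request path to (status code, JSON-serializable body)."""
    if path == "/pets":
        return 200, list(PETS.values())
    if path.startswith("/pets/"):
        pet = PETS.get(path[len("/pets/"):])
        if pet is not None:
            return 200, pet
    return 404, {"error": "not found"}


class PetHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        status, body = handle_get(self.path)
        payload = json.dumps(body).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def make_server(host="127.0.0.1", port=8080):
    """Build the server; call .serve_forever() on the result to run it."""
    return HTTPServer((host, port), PetHandler)
```

Running `make_server().serve_forever()` and requesting `/pets/1` returns the pet as JSON. The website's JavaScript and a phone app can both call the same URL, which is the point of having one endpoint.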

Database management becomes incredibly different when you're designing for scale. Your application or microservice needs to know where the database is and how to fall back when the primary database is down. MySQL (or MariaDB these days [if we're going to be real]) can be configured to replicate information in various ways. One of the most common is Master/Slave replication. However, this puts you in a situation where one database is just sitting there passively. MariaDB has a solution called MaxScale which is looking very promising.
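Here is a minimal sketch of the "know where the database is and how to fall back" part, written in Python. It is an illustration under assumptions, not MaxScale or any driver's real behavior: the `DatabaseRouter` class and host names are invented, and `connect` stands in for whatever MySQL/MariaDB client library you use.

```python
class DatabaseRouter:
    """Pick a database host, falling back when the preferred one is down.

    `connect` is any function that takes a host name and returns a
    connection or raises ConnectionError. It is passed in so the sketch
    does not depend on a particular MySQL/MariaDB driver.
    """

    def __init__(self, primary, replicas, connect):
        self.primary = primary
        self.replicas = list(replicas)
        self.connect = connect

    def _first_available(self, hosts):
        errors = []
        for host in hosts:
            try:
                return self.connect(host)
            except ConnectionError as exc:
                errors.append(f"{host}: {exc}")
        raise ConnectionError("no database reachable (" + "; ".join(errors) + ")")

    def for_reads(self):
        # Reads can use a replica, so the replicas do useful work instead of
        # idling; the primary is the last resort.
        return self._first_available(self.replicas + [self.primary])

    def for_writes(self):
        # Writes must go to the primary. Promoting a replica is a separate
        # operational step and is deliberately not attempted here.
        return self._first_available([self.primary])
```

Sending reads to replicas also addresses the "passive database" complaint, with the caveat that a replica can lag slightly behind the primary.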

Shared storage is another consideration when you have the same application running on multiple servers. The correct way to do things is to have your website code stored locally on every server, while all dynamic content lives in a file service of some sort and is served over a CDN, like S3 at AWS, Cloud Files at Rackspace, Cloud Spaces at Digital Ocean, and so on.
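One way to picture this: every web server should be able to work out where an uploaded file lives without caring which server received the upload. The Python sketch below derives a storage key from the file's contents. The `cdn.example.com` host, the `uploads/` prefix, and both function names are placeholders, and the actual upload call to your provider is left out.

```python
import hashlib
import posixpath

# Hypothetical CDN host name; substitute whatever your provider gives you.
CDN_BASE_URL = "https://cdn.example.com"


def object_key(data, filename):
    """Build a storage key from the file's content hash and extension."""
    digest = hashlib.sha256(data).hexdigest()
    extension = posixpath.splitext(filename)[1].lower()
    # A two-character prefix spreads objects across many "folders".
    return f"uploads/{digest[:2]}/{digest}{extension}"


def public_url(key):
    """URL that every web server hands out for a stored object."""
    return f"{CDN_BASE_URL}/{key}"
```

Since the key depends only on the bytes and the extension, two servers that receive the same image produce the same key, and nothing needs to be stored on a web server's local disk.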

If you're doing things right, you shouldn't be editing code that is in production while people might be accessing it. There are very well thought out deployment cycles that you can read up on and put into practice. Tools like Jenkins and Jenkins X make this a ton easier. Docker has made shipping code a lot easier over the years as well.
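One common pattern for not editing live code is to upload each release into its own directory and then switch a `current` symlink over to it in one step. The Python sketch below assumes a Linux server and a made-up layout (`releases/<name>` plus a `current` link that Apache or Nginx would use as the document root); it is an illustration, not what Jenkins or Docker do internally.

```python
import os


def activate_release(root, release_name):
    """Point `<root>/current` at `<root>/releases/<release_name>`.

    The new symlink is created under a temporary name and then renamed over
    the old one, so visitors see either the old code or the new code, never
    a half-copied mix. Assumes a POSIX system such as Linux.
    """
    target = os.path.join("releases", release_name)
    if not os.path.isdir(os.path.join(root, target)):
        raise FileNotFoundError(f"release {release_name!r} has not been uploaded")
    current = os.path.join(root, "current")
    temporary = current + ".tmp"
    if os.path.lexists(temporary):
        os.remove(temporary)
    os.symlink(target, temporary)   # relative link, resolved inside root
    os.replace(temporary, current)  # rename is atomic on POSIX
    return os.path.realpath(current)
```

Rolling back is the same call with the previous release name, which is a big part of why this layout is popular.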

Why to Design for Scale

At the end of the day, maybe you don't make a big hit that earns mad dollar bills. What you do end up getting is practice at skills that make you employable. The most in-demand jobs these days are related to utilizing cloud services. Amazon and RedHat have entire courses that you can become certified in. Rackspace received government funding and got their Cloud Academy off the ground to teach Linux. RedHat is currently in a push to get as many Cloud Consultants as possible hired. [Meanwhile Digital Ocean is just doing what they always do, being that cool kid in NYC].

The ability to understand and work with Cloud Computing tech will not just make you a better developer; it is also the toolkit that wasn't around when many of us started.

A Final Word and Some Links

Some of this may be very hard to follow, because I just spat it out at the drop of a hat. However, here are some useful links to some of the things I talked about.

Digital Ocean - https://www.digitalocean.com
Tutorials - https://www.digitalocean.com/community/tutorials
Documentation - https://developers.digitalocean.com/documentation/

AWS - https://aws.amazon.com/
Price Calculator - https://calculator.s3.amazonaws.com/index.html
Getting Started - https://aws.amazon.com/getting-started/
Documentation - https://aws.amazon.com/documentation/

Linode - https://www.linode.com/linodes
Getting Started - https://www.linode.com/docs/getting-started/
Developer API - https://developers.linode.com/api/v4

Rackspace Cloud - https://www.rackspace.com/cloud/cloud-computing
Cloud Servers Getting Started - https://support.rackspace.com/how-to/getting-started-with-cloud-servers/
Developer API - https://developer.rackspace.com/docs/cloud-servers/v2/

RedHat OpenShift - https://www.openshift.com/
Documentation - https://docs.openshift.com/

OVH VPS (Budget) - https://www.ovh.com/world/vps/
The OVH API - https://www.ovh.com/world/vps/api-restful.xml

OVH Cloud (Not Budget) - https://www.ovh.com/world/public-cloud/instances/
OVH Cloud API (OpenStack) - https://docs.ovh.com/fr/

 
Damn, didn’t even know half of this stuff existed O.O

Thanks for all this info, it will be something to really consider in the future!

 
It's really awesome to know both the programming side and the infrastructure side of things. Back in the day, I would be trolling for free cPanel accounts just to have a way to test PHP before I found out I could just install XAMPP and WAMP.

These days I just set up testing environments inside VirtualBox with CentOS, in Windows Subsystem for Linux with Ubuntu, or straight up on my own personal Linode and Digital Ocean accounts, and I do work related things on AWS on my company's dime.

I'm really enjoying terraform to manage my environments lately. It makes cloud stuff a lot easier once you get the hang of it. If you need a server, you just add some configuration, then run `terraform apply`. When you're done you just remove those config lines and run it again. It's wicked awesome.

 
Dang, it does sound awesome! I’ll go do some in depth research straightaway!

 
Very good read, thank you! I had heard of all of that already, because one of our customers is using AWS. So I know how to deploy using terraform and stuff, but I never set up AWS by myself. It is a very complex topic but also a very interesting one. I really should look more into it. And maybe I can pick up one or two things when the Senior is talking about it :D

What I really would like to know is, what you think about the size of a project where "designing for scale" would be profitable? Because the costs of such an installation need to be earned each month.  Is this only something for the "big fishes" or would you recommend it to smaller projects, too?

Sorry for my English. Haven't used it for quite a while now...  I hope you still understand what I am talking about :D  

 
Thanks to companies like Digital Ocean, I would say that any project should be built with scale in mind. The ability to take your application from running on just one or two servers to hundreds of servers (to accommodate a growing user footprint) is a design constraint that makes an app affordable to maintain in the long run. It prevents having to redo large portions of code and infrastructure.

 