Joseph Mariadassou
6 min read · Jul 1, 2022

An Intelligent User’s Guide to Cloud Computing on Amazon

Amazon Web Services (AWS) is a cloud computing service provider that pioneered the public cloud in this millennium. Many otherwise savvy computer users are not aware of what it actually does. This blog is addressed to those users: by describing some of the services AWS provides, it should help them decide whether they really need cloud computing.

What is Cloud Computing?
Consider a computing service such as a web server that is accessed heavily during business hours and lightly outside them. Dedicating a computer that can handle the heavy daytime load is wasteful because it sits largely idle for at least 100 hours a week. The pattern can also be reversed: a video streaming service like Netflix is used less during business hours. Either way, we would like a system whose computing power can be increased or decreased with usage. This is the service that cloud computing provides: a way to increase or decrease the use of computing resources depending on need.
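
To make the waste concrete, here is a rough sketch in Python. The hourly load figures and the capacity of one server are made-up numbers chosen for illustration, not measurements, and the function names are my own.

```python
import math

# Hypothetical requests per hour over one weekday (24 values):
# light at night, heavy during business hours (09:00 to 17:00).
HOURLY_LOAD = [100] * 9 + [1000] * 8 + [100] * 7

# Assumed capacity of one server, in requests per hour.
SERVER_CAPACITY = 250


def servers_needed(load: int, capacity: int = SERVER_CAPACITY) -> int:
    """Smallest number of servers that can carry the load (at least one)."""
    return max(1, math.ceil(load / capacity))


def fixed_server_hours(hourly_load: list[int]) -> int:
    """Server-hours used when capacity is sized for the peak all day long."""
    return servers_needed(max(hourly_load)) * len(hourly_load)


def elastic_server_hours(hourly_load: list[int]) -> int:
    """Server-hours used when capacity follows the load hour by hour."""
    return sum(servers_needed(load) for load in hourly_load)
```

With these invented numbers, sizing for the peak uses 96 server-hours a day while following the load uses 48.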


The technology behind cloud computing is based on the concept of a Virtual Machine (VM), an idea that can be traced back at least to the seventies. A VM is a piece of software that looks like a complete computer from a remote user's perspective. A physical machine can host more than one VM. For example, if there are ten physical machines each hosting ten VMs, then from the remote user's perspective there are one hundred independent computing devices. Amazon pioneered the sharing of physical machines among disparate customers by assigning every customer their own VM. Two customers could be running VMs on the same hardware, blissfully unaware of each other's existence.

Each VM is allotted a certain number of CPUs and a certain amount of memory, limited by the underlying hardware. When the demand for computing resources increases, the customer can ask for more VMs with the same characteristics as the existing one. This is known as horizontal scaling. Vertical scaling refers to allocating more resources to the same VM, which may mean that the VM needs to be restarted.
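
The difference between the two kinds of scaling can be sketched in a few lines of Python. The `VM` class and both functions are hypothetical names invented for this illustration; they are not part of any AWS library.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VM:
    """A hypothetical description of a VM's size, not an AWS type."""
    cpus: int
    memory_gb: int


def scale_horizontally(fleet: list[VM], extra: int) -> list[VM]:
    """Add `extra` VMs with the same characteristics as the first one."""
    return fleet + [fleet[0]] * extra


def scale_vertically(vm: VM, factor: int) -> VM:
    """Return a bigger VM; in practice resizing may require a restart."""
    return replace(vm, cpus=vm.cpus * factor, memory_gb=vm.memory_gb * factor)
```

Horizontal scaling leaves each VM unchanged and grows the fleet, whereas vertical scaling keeps the fleet size and grows one VM.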

A private cloud is a cluster of VMs, often hosted in one or more data centres, that is used exclusively by a single party. A community cloud is one that a few organisations share exclusively. For example, Amazon runs a cloud service exclusively for US government departments. Amazon also provides private cloud computing services; its offering is called Outposts. VMware, Microsoft and Ubuntu have similar offerings. The main advantage of a private cloud is that when demand is low, the few VMs in use can be moved to a small number of physical machines and the rest can be turned off.
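
Moving VMs onto as few physical machines as possible is a packing problem. The sketch below uses the first-fit-decreasing heuristic, which is my choice for illustration and not necessarily what any cloud product does. The function name and the idea of measuring each VM in abstract "resource units" are assumptions.

```python
def consolidate(vm_loads: list[int], host_capacity: int) -> list[list[int]]:
    """Pack VMs onto hosts using the first-fit-decreasing heuristic.

    vm_loads: resource units each running VM needs (hypothetical unit).
    host_capacity: resource units one physical machine offers.
    Returns one list of VM loads per host that must stay powered on.
    """
    hosts: list[list[int]] = []
    for load in sorted(vm_loads, reverse=True):
        if load > host_capacity:
            raise ValueError(f"VM needing {load} units fits on no host")
        for host in hosts:
            # Place the VM on the first host that still has room.
            if sum(host) + load <= host_capacity:
                host.append(load)
                break
        else:
            # No existing host has room, so power on another one.
            hosts.append([load])
    return hosts
```

For example, `consolidate([2, 5, 4, 7, 1, 3], 10)` places the six VMs on three hosts, so any remaining machines could be switched off.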

That is the history but, as is usual with any good technology, cloud computing now has far more use cases. In the rest of the blog I'll discuss two of the main services that I think can be used by everyone. A good engineer does not ask: Of what use is it? The more pertinent question is: To what use can I put it? I hope you can come up with uses that the rest of the world has not thought of before.

Storage and Computing
Computers have always provided facilities for data storage and computing, whether through a word processor or a spreadsheet. The advantage of the cloud is that you can use them in small quantities. Let us say you have a collection of pictures or documents. Some people have hundreds of gigabytes of data but most of us have much less. Yet we all end up buying a terabyte or more of hard disk, and we have to label each external hard disk either by date or by topic. With online storage we don't have to buy all that disk space upfront; we can gradually rent more as we need it. Most of the time we don't access old files, although we would like to retain them should we need them. With online storage, such data can be kept on slow devices so that you pay mainly for access and very little for storage. The other advantage of online storage is reliability. Most businesses keep more than one backup, and online storage providers usually back up data in different physical locations.
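
The trade-off between paying for storage and paying for access can be sketched as a small calculation. The two tiers and their prices below are invented for illustration and are not AWS prices; the function name is also my own.

```python
def monthly_cost(stored_gb: float, retrieved_gb: float,
                 storage_price: float, retrieval_price: float) -> float:
    """Monthly bill: a charge per GB kept plus a charge per GB read back."""
    return stored_gb * storage_price + retrieved_gb * retrieval_price


# Made-up prices in dollars per GB, for illustration only.
FAST_TIER = {"storage_price": 0.020, "retrieval_price": 0.000}
SLOW_TIER = {"storage_price": 0.004, "retrieval_price": 0.030}

# An archive of 500 GB from which 10 GB is read back each month.
fast = monthly_cost(500, 10, **FAST_TIER)
slow = monthly_cost(500, 10, **SLOW_TIER)
```

With these invented prices the rarely read archive costs about 10 dollars a month on the fast tier and about 2.30 on the slow one. Read all 500 GB back every month and the slow tier becomes the dearer option.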

While the need for online storage is thus clear, what is the need for an online computing service from the perspective of a small business or a private individual? Interactive applications like a word processor or a spreadsheet are better run on the local machine, although Citrix came up with the idea of running MS Office on a server to optimise the number of licences required. It is not very popular though. However, service applications like payroll and inventory tracking are usually back-end processes that run periodically with little human interaction. It takes a detailed cost-benefit analysis to decide whether cloud computing would be viable in such cases. This is where the idea of server-less computing comes into play, as we shall see later.


Storage
Most people already store data as key-value pairs. A file has a name that is unique on its storage device, even if the name can be decomposed into folders and a file name; that name is the key. A file also has content, whether it is a text file or a spreadsheet; that content is the value. AWS provides two types of key-value stores. The first is called S3 (short for Simple Storage Service). The second is DynamoDB. S3 is used for storing large files, up to a few terabytes (currently 5 terabytes) in size. The key is limited to 1024 bytes when encoded in UTF-8. It is not possible to change a single byte of either the key or the value in a key-value store; you need to replace the whole key or value. The other key-value store, DynamoDB, is meant for storing a large number of small items. The size of key and value put together is limited to 400 KB. However, it is extremely fast. It is also quite cheap.
Amazon also provides relational databases such as Oracle, MSSQL, MySQL or Postgres through its Relational Database Service (RDS). These are meant for large database applications that need to handle complex queries. Most small applications, though, are better off using DynamoDB.
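
The key-value behaviour and the two size limits mentioned above can be sketched without touching AWS at all. `ToyObjectStore` and the helper functions below are stand-ins invented for this example, not the S3 or DynamoDB API. The S3 key limit is counted in UTF-8 bytes, the DynamoDB check is only a rough approximation of how item size is counted, and treating 400 KB as 400 × 1024 bytes is an assumption.

```python
MAX_S3_KEY_BYTES = 1024                 # S3 key limit, in UTF-8 bytes
MAX_DYNAMODB_ITEM_BYTES = 400 * 1024    # key and value together (assumed)


def valid_s3_key(key: str) -> bool:
    """True if the key is non-empty and within the S3 key length limit."""
    return 0 < len(key.encode("utf-8")) <= MAX_S3_KEY_BYTES


def fits_dynamodb_item(key: str, value: bytes) -> bool:
    """Rough size check: key plus value must stay within 400 KB."""
    return len(key.encode("utf-8")) + len(value) <= MAX_DYNAMODB_ITEM_BYTES


class ToyObjectStore:
    """In-memory stand-in for a key-value store; not the S3 API."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        """Store a value; an existing value is replaced as a whole."""
        if not valid_s3_key(key):
            raise ValueError("key must be 1 to 1024 bytes in UTF-8")
        self._objects[key] = bytes(value)

    def get(self, key: str) -> bytes:
        """Return the whole value stored under the key."""
        return self._objects[key]
```

Note that `put` never edits a value in place: writing to an existing key swaps in the new content wholesale, which mirrors the whole-value update described above.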

Computing
As mentioned earlier, cloud computing is used mostly because the demand for computing resources varies. There are two types of computing services. The first is the use of explicit VMs as described earlier. The other is server-less computing, where a single function is executed without the customer having to provision a VM; AWS provisions the VM on which the function runs.

Let us consider the use of a VM. Although it is possible to create a single VM, known as an EC2 instance, it is better to think of it as creating a node in a network of VMs. This network is known as a Virtual Private Cloud or VPC. Hence creating a bare EC2 instance essentially means adding a node to the default VPC. But for any practical Internet-facing application, Amazon advises dividing the VPC into at least two subnets, one private and the other public. A public subnet is one that can connect to the Internet via an internet gateway. The private subnet is the workhorse that talks to the public subnet. In the simplest case you would create an EC2 instance in the public subnet with a static IP address (an Elastic IP in AWS terms) that is accessible from anywhere on the Internet. This EC2 instance is usually no more than a request-forwarding machine. It ensures that all requests are vetted, and often reformatted, before they are handed over to the back end on the private subnet, which guards against many kinds of malicious attack. At worst the front end may go down, but all the data in the back end would be safe.
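
What "vetted and often reformatted" might mean on the front-end machine can be sketched as follows. The request format, the allowed actions and the function name are all hypothetical; a real front end would apply whatever rules its own back end requires.

```python
import json

ALLOWED_ACTIONS = {"get_balance", "list_orders"}  # hypothetical actions


def vet_request(raw_body: str) -> dict:
    """Validate a raw request and return the cleaned-up form for the back end.

    Raises ValueError for anything malformed or not explicitly allowed.
    """
    try:
        data = json.loads(raw_body)
    except json.JSONDecodeError as exc:
        raise ValueError("body is not valid JSON") from exc
    if not isinstance(data, dict):
        raise ValueError("body must be a JSON object")

    action = data.get("action")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ValueError("action is not allowed")

    customer_id = data.get("customer_id")
    # bool is a subclass of int in Python, so rule it out explicitly.
    if (isinstance(customer_id, bool) or not isinstance(customer_id, int)
            or customer_id <= 0):
        raise ValueError("customer_id must be a positive integer")

    # Forward only the fields the back end expects; drop everything else.
    return {"action": action, "customer_id": customer_id}
```

The design choice here is an allow-list: anything the front end does not recognise is rejected or dropped, so the back end only ever sees requests in one known shape.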

Server-less computing on AWS goes by the name Lambda, a term used in functional programming for an anonymous function. In AWS it refers to code that is not hosted on a customer-provisioned VM. The advantage is that, subject to limits, it scales without the user having to do anything explicit. A Lambda function can be written in Python, JavaScript, Java, C#, Go and other languages. The function can be triggered by an explicit HTTP call. It cannot access resources in a user's private VPC unless it is attached to that VPC. Once attached, it can reach S3 and DynamoDB through VPC endpoints without leaving the AWS network. Any single invocation of a Lambda function is restricted to fifteen minutes. If more time is required, then either the solution has to be re-architected or an on-demand EC2 instance needs to be used.
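
A Lambda function in Python is just a function that takes an event and a context. The minimal sketch below assumes the function is triggered over HTTP and that the event arrives in the API Gateway proxy format, with query parameters under `queryStringParameters`. The handler name `lambda_handler` is the conventional default but is configurable, and the greeting logic is purely illustrative.

```python
import json


def lambda_handler(event, context):
    """Entry point that AWS Lambda calls once per invocation.

    `event` is assumed to be an HTTP request in the API Gateway proxy
    format; `context` carries runtime information and is unused here.
    """
    # The key may be missing or None when no query string was sent.
    params = event.get("queryStringParameters") or {}
    name = params.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Because the handler is an ordinary function, it can be tried out locally by calling it with a hand-made event, for example `lambda_handler({"queryStringParameters": {"name": "Joseph"}}, None)`.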

Summary
Cloud computing services on AWS fall into two broad categories: storage and computing. Storage services include S3 and DynamoDB. There are two types of computing services: server-based and server-less. In addition to these, AWS also provides other services such as queues, notification services, and services for software development and machine learning.
S3 is the best service to get started with. A previous article showed how S3 can be used from the command line. The next article in this series will discuss how to use DynamoDB with S3 to store all your favourite songs along with the relevant blurbs.

Joseph Mariadassou

Software developer with interest in Politics, Philosophy and Economics