Compute: How to choose the right instance type from AWS, Azure or Google Cloud

Cloud providers offer a range of instance sizes so that you can host your workloads flexibly. Each size comes with a different combination of CPU, memory, storage, and networking capacity, and some add GPU (graphics processing unit) capabilities.

The instance size should be selected based on the needs of the use case, so that the required performance is met and the cost is optimized.

The most common questions to be addressed in this post are:

What are the instance families offered by cloud providers?
How do I choose the right instance size for my workload?
What are the instance series available under each family?
What will happen if I choose an instance randomly for hosting?

Imagine you want to opt for infrastructure as a service (IaaS) for compute. You need to know the pricing and the instance size or capability before you host your workloads. Use cases vary from scenario to scenario and from environment to environment, and there is always a trade-off between performance and cost.

For this reason, cloud providers offer instance families (or types), each of which includes one or more instance sizes, allowing you to scale your resources to the requirements of your target workload.

Broadly, there are five categories of instance families:

General Purpose
Compute Optimized
Memory Optimized
Storage Optimized
Accelerated Computing

General Purpose

These instances provide a balance of compute, memory, and networking resources. You can opt for this family for a wide variety of workloads. In simple words, it provides the best price-performance ratio for such workloads.

When we say balanced, the memory-to-vCPU ratio is about 4x, that is, roughly 4 GB of RAM per vCPU. For example, a 2 vCPU instance will have approximately 8 GB of RAM. Note that this is only a rule of thumb to help you remember; the actual ratio varies with the instance size offering.

AWS and Azure offer burstable instances. These burstable performance instances provide a baseline level of CPU performance with the ability to burst above the baseline.

General purpose instances are ideal for applications that use these resources in equal proportions, such as:

Web and App serving
Game serving
Small to medium databases
Back office apps
Microservices
Virtual desktops
Development, Test, and staging environments
Media/streaming
Development on CI/CD
Video and image encoding, transcoding, and processing
Build servers and code repositories
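
The burstable behaviour mentioned earlier in this section can be pictured with a toy credit model. The sketch below is only an illustration: the function name `simulate_burst`, the 20% baseline, and the credit cap are made-up assumptions, and no provider's real credit accounting works exactly this way. The idea is that CPU left unused below the baseline is banked as credits, and those credits are spent when demand rises above the baseline.

```python
def simulate_burst(demand_pct, baseline_pct=20, max_credits=500):
    """Toy model of a burstable instance (hypothetical numbers).

    demand_pct: requested CPU utilisation (0-100) for each time step.
    Returns the CPU utilisation actually delivered at each step.
    """
    credits = 0
    delivered = []
    for want in demand_pct:
        if want <= baseline_pct:
            # Running below the baseline banks the unused share as credits.
            credits = min(max_credits, credits + baseline_pct - want)
            delivered.append(want)
        else:
            # Bursting above the baseline spends banked credits, if any.
            extra = min(want - baseline_pct, credits)
            credits -= extra
            delivered.append(baseline_pct + extra)
    return delivered


# Five idle steps followed by three steps of full demand.
print(simulate_burst([0, 0, 0, 0, 0, 100, 100, 100]))
```

In this toy run the instance bursts to full speed once, then falls back towards its baseline as the banked credits run out, which is why burstable instances suit workloads that are idle most of the time.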

Compute Optimized

These instances are ideal for compute-bound applications that benefit from high-performance processors. They are optimized for compute-intensive workloads; Google Cloud, for example, describes its compute-optimized machines as offering the highest performance per core on Compute Engine.

By the same rule of thumb, the memory-to-vCPU ratio is mostly 2x. For example, a 2 vCPU machine will have 4 GB of RAM.

A few use cases of Compute-optimized instances are:

Compute-bound workloads
High-performance web servers
Gaming servers or multi-player gaming
Ad serving or server engines
Media transcoding
Scientific modeling
Batch processing
Distributed analytics
High-performance computing (HPC)
AI, Machine/deep learning inference
Video encoding

Memory Optimized

These instances are designed to deliver fast performance for workloads that process large data sets in memory. They are ideal for memory-intensive workloads, offering more memory per core. They are best suited for enterprise applications running large enterprise databases.

By the same rule of thumb, the memory-to-vCPU ratio is mostly 8x. For example, a 2 vCPU machine will have 16 GB of RAM.

Some of these instances are certified by SAP for running Business Suite on HANA, the next-generation Business Suite S/4HANA, Data Mart Solutions on HANA, Business Warehouse on HANA, and SAP BW/4HANA in production environments.

A few use cases of Memory Optimized instances are:

SAP enterprise Apps
In-memory databases like SAP HANA: S/4HANA, Data Mart Solutions on HANA, SAP BW/4HANA
SQL and NoSQL databases
Distributed in-memory caches – Redis or Memcached
Real-time big data analytics – Hadoop, Spark clusters
Open source databases
In-memory analytics workloads
Business Warehousing (BW) workloads
Genomics analysis
SQL analysis services

Trade-offs with General purpose, compute and memory optimized instances

Let us illustrate this with an example. Consider an instance with 4 vCPUs and see how the CPU and memory combination differs across the general purpose, compute optimized, and memory optimized families. Several instance sizes are available depending on the processor type (ARM or Intel), network bandwidth, and the internal storage or disks supported; these are not shown below.

Instance Type	vCPU	Memory in GB	Ratio	Comments
General Purpose	4	16	4x	Balanced cores and memory for any use case
Compute Optimized	4	8	2x	More CPU-focused, with less memory per core
Memory Optimized	4	32	8x	Offers more memory per core

Tab. Sample illustration of sizes across instance families
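
The rule of thumb behind the table can be written as a small calculation. This is a minimal sketch: the ratios are the approximate ones used in this post, the names `MEMORY_RATIO_GB_PER_VCPU` and `approximate_memory_gb` are illustrative, and real offerings differ by provider and instance size.

```python
# Approximate GB of RAM per vCPU for each family, as used in this post.
# These are rule-of-thumb values, not guarantees from any provider.
MEMORY_RATIO_GB_PER_VCPU = {
    "General Purpose": 4,
    "Compute Optimized": 2,
    "Memory Optimized": 8,
}


def approximate_memory_gb(family: str, vcpus: int) -> int:
    """Return the rule-of-thumb memory size for a family and vCPU count."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    return vcpus * MEMORY_RATIO_GB_PER_VCPU[family]


# Reproduce the 4 vCPU row of each family.
for family in MEMORY_RATIO_GB_PER_VCPU:
    print(f"{family}: 4 vCPU -> ~{approximate_memory_gb(family, 4)} GB")
```

Changing the vCPU count shows how memory scales linearly within a family while the ratio stays fixed.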

Apart from CPU and memory, network bandwidth and storage support also matter. The available network bandwidth of an instance generally depends on the number of vCPUs it has. The baseline bandwidth typically ranges from 1 Gbps to 25 Gbps depending on the instance size. Some cloud providers also use a network I/O credit mechanism that lets instances burst beyond their baseline bandwidth on a best-effort basis. For storage, you can look for SSD-based instance storage, which is temporary block-level storage that comes with the instance.

Let us represent our understanding by plotting CPU and memory on a graph, which should make the classification clearer.

***Fig. Classification of Instance Families***

Accelerated Computing

These instances are ideal for massively parallelized Compute Unified Device Architecture (CUDA) compute workloads, such as machine learning (ML) and high-performance computing (HPC). This family is the best option for workloads that require graphics processing units (GPUs).

These instances use hardware accelerators, or co-processors, to perform functions, such as floating point number calculations, graphics processing, or data pattern matching, more efficiently than is possible in software running on CPUs.

A few use cases that can be fulfilled are:

CUDA-enabled ML training and inference
Massively parallelized computation
BERT natural language processing
Deep learning recommendation model (DLRM)
Machine learning
High-performance computing
Computational fluid dynamics
Computational finance
Seismic analysis
Speech recognition
Autonomous vehicles
Drug discovery
Molecular modeling, genomics
Object detection, Image recognition, and rendering
Natural language processing
Forecasting and Recommendation engines
Image and video analysis
Advanced text analytics
3D visualizations and 3D rendering
Graphics-intensive remote workstation
Document analysis, voice, and conversational agents

Storage Optimized

These instances are designed for workloads that require high, sequential read and write access to very large data sets on local storage.

They are optimized to deliver tens of thousands of low-latency, random I/O operations per second (IOPS) to applications. The local disks are either Non-Volatile Memory Express (NVMe) SSD-backed instance storage or HDD-based.

A few use cases for Storage Optimized instances are:

Small to medium-scale NoSQL databases e.g. Cassandra, MongoDB, Aerospike
In-memory databases e.g. Redis
SQL and Scale-out transactional databases
Data warehousing
Elasticsearch
Analytics workloads
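
Taken together, the five families described so far suggest a rough order in which to narrow down a choice. The sketch below is one possible heuristic, not a provider recommendation: the `Workload` fields, the thresholds, and the function name `suggest_family` are all assumptions made for illustration, based on the ratios and use cases in this post.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of what a workload needs."""
    needs_gpu: bool = False            # e.g. CUDA-based ML training
    heavy_local_io: bool = False       # high IOPS on local NVMe/HDD storage
    memory_gb_per_vcpu: float = 4.0    # expected memory need per vCPU


def suggest_family(workload: Workload) -> str:
    """Return a starting-point instance family for the workload."""
    if workload.needs_gpu:
        return "Accelerated Computing"
    if workload.heavy_local_io:
        return "Storage Optimized"
    if workload.memory_gb_per_vcpu >= 8:
        return "Memory Optimized"
    if workload.memory_gb_per_vcpu <= 2:
        return "Compute Optimized"
    return "General Purpose"


print(suggest_family(Workload(memory_gb_per_vcpu=8)))  # in-memory cache
print(suggest_family(Workload(needs_gpu=True)))        # ML training
print(suggest_family(Workload()))                      # typical web app
```

The result is only a starting point; the actual choice should still be validated against pricing and measured performance.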

Summary

Let us conclude the post with the instance types available across AWS, Azure, and Google Cloud. The table below lists the instance series offered under each instance family or type.

Instance Family	Category	AWS	Azure	Google Cloud
General Purpose	Burstable instances	T-Series	B-Series	E-Series**
	Most common	M-Series	D-Series	N-Series**
	Other Instances	Mac and A1 series	A-Series	T-Series
Compute Optimized	Most common	C-Series	F-Series, FX-Series	C-Series**
	Other HPC instances	HPC	H-Series	C2D-Series
Memory Optimized	Most common	R-Series	E-Series	**
	Very large memory	X-series	M-Series	M-Series
	Other instances	High memory (U-series), z1d-series	DSv2-Series
Accelerated Computing	GPU enabled	P, G-Series	N-Series (NC, NV, ND)	A-Series
	FPGA enabled	F1	NP-Series
	Other instances	VT1 (U30 accelerators), Inf1 (Inferentia chips), DL1 (Gaudi accelerators), Trn1 (Trainium accelerators)
Storage Optimized	NVMe SSD-based	I-Series	Ls-Series
	HDD based	D-Series	G-Series

Tab. Compute Instance Comparison across AWS, Azure and Google Cloud

Note: Google Cloud categorizes E-series and N-series as general purpose but offers varied combinations within them. For example, the E-series comes with E2 standard (4x), E2 high-mem (8x), and E2 high-cpu (1x). Hence, a direct comparison is not shown in the table, and the affected entries are marked with **.
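
To see why a single series can span several families, the memory-per-vCPU ratio of a shape can be compared against the rule-of-thumb ratios from this post. This is a minimal sketch under stated assumptions: the 4 vCPU sizes are derived from the ratios in the note above, and `closest_family` is an illustrative helper, not part of any provider's tooling.

```python
# Rule-of-thumb GB-per-vCPU ratios used in this post.
FAMILY_RATIOS = {
    "Compute Optimized": 2.0,
    "General Purpose": 4.0,
    "Memory Optimized": 8.0,
}


def closest_family(vcpus: int, memory_gb: float) -> str:
    """Return the family whose rule-of-thumb ratio is nearest to this shape.

    On a tie, the family listed first in FAMILY_RATIOS is returned.
    """
    ratio = memory_gb / vcpus
    return min(FAMILY_RATIOS, key=lambda name: abs(FAMILY_RATIOS[name] - ratio))


# 4 vCPU shapes implied by the ratios in the note (4x, 8x, 1x).
e2_shapes = {
    "E2 standard": (4, 16),
    "E2 high-mem": (4, 32),
    "E2 high-cpu": (4, 4),
}

for name, (vcpus, memory_gb) in e2_shapes.items():
    print(f"{name}: {memory_gb / vcpus:g}x -> {closest_family(vcpus, memory_gb)}")
```

By this measure, the E2 high-mem shape behaves like a memory optimized instance and the E2 high-cpu shape sits closest to compute optimized, even though all three are listed under general purpose.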