Cloud providers offer various instance sizes to host your workloads flexibly. These come in varied combinations of CPU, memory, storage, and networking capacity, optionally extended with GPU (Graphics Processing Unit) capabilities.
The instance size should be selected based on the needs of the use case, so that the required performance is met while cost is optimized.
The most common questions this post addresses are:
- What are the instance families offered by cloud providers?
- How do I choose the right instance size for my workload?
- What is the instance series available under each family?
- What will happen if I choose an instance randomly for hosting?
Suppose you want to opt for Infrastructure-as-a-Service for computing. You need to know the pricing and the instance sizes or capabilities before you host your workloads. Use cases vary scenario by scenario and also by environment, and performance and cost always involve trade-offs.
Based on this, cloud providers offer instance families (or types), each of which includes one or more instance sizes, allowing you to scale your resources to the requirements of your target workload.
Broadly, there are five categories of instance families:
- General Purpose
- Compute Optimized
- Memory Optimized
- Storage Optimized
- Accelerated Computing
General Purpose
These instances provide a balance of compute, memory, and networking resources. You can opt for them for a variety of diverse workloads. In simple terms, they provide the best price-performance ratio for balanced workloads.
When we say balanced, the vCPU-to-memory ratio is roughly 1:4. For example, a 2 vCPU instance will have approximately 8 GB of RAM. Note that this is a rule of thumb to help you remember, and actual values vary based on the instance size offering.
AWS and Azure offer burstable instances. These Burstable Performance Instances provide a baseline level of CPU performance with the ability to burst above the baseline when accumulated CPU credits are available.
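A minimal sketch of how such a credit mechanism works, with hypothetical earn/spend rates (each provider publishes its own per-instance-type baseline and credit rates):

```python
# Illustrative model of burstable-instance CPU credits. The rates here are
# hypothetical, not any provider's actual accounting.
def simulate_credits(baseline_pct, usage_pct_per_hour, initial_credits=0.0):
    """Track a CPU-credit balance hour by hour.

    Credits accrue while usage sits below the baseline and are spent when
    usage exceeds it; once the balance hits zero, a real instance would be
    throttled back to its baseline performance.
    """
    credits = initial_credits
    history = []
    for usage in usage_pct_per_hour:
        # Earn for the baseline, spend for actual usage (credit-minutes/hour).
        credits += (baseline_pct - usage) * 60 / 100
        credits = max(credits, 0.0)  # cannot go negative: throttled instead
        history.append(round(credits, 1))
    return history

# A 20% baseline: two idle hours bank credits, one burst hour drains them.
print(simulate_credits(20, [5, 5, 60]))  # [9.0, 18.0, 0.0]
```

The idle hours are what make bursting affordable; a workload that runs hot continuously would be better served by a non-burstable series.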
These instances are ideal for applications that use these resources in roughly equal proportions, such as:
- Web and App serving
- Game serving
- Small to medium databases
- Back office apps
- Microservices
- Virtual desktops
- Development, Test, and staging environments
- Media/streaming
- CI/CD pipelines
- Video and image encoding, transcoding, and processing
- Build servers and code repositories
Compute Optimized
These instances are ideal for compute-bound applications that benefit from high-performance processors. They offer the highest performance per core and are optimized for compute-intensive workloads.
By our rule of thumb, the vCPU-to-memory ratio is typically 1:2. For example, a 2 vCPU machine will have about 4 GB of RAM.
A few use cases of Compute-optimized instances are:
- Compute-bound workloads
- High-performance web servers
- Gaming servers or multi-player gaming
- Ad serving or server engines
- Media transcoding
- Scientific modeling
- Batch processing
- Distributed analytics
- High-performance computing (HPC)
- AI, Machine/deep learning inference
- Video encoding
Memory Optimized
These instances are designed to deliver fast performance for workloads that process large data sets in memory. They are ideal for memory-intensive workloads, offering more memory per core, and are best suited for enterprise applications running large enterprise databases.
By our rule of thumb, the vCPU-to-memory ratio is typically 1:8. For example, a 2 vCPU machine will have about 16 GB of RAM.
A few use cases of Memory Optimized instances are:
- SAP enterprise apps
- In-memory databases such as SAP HANA, including production deployments certified by SAP for Business Suite on HANA, the next-generation Business Suite S/4HANA, Data Mart Solutions on HANA, Business Warehouse on HANA, and SAP BW/4HANA
- SQL and NoSQL databases
- Distributed in-memory caches such as Redis or Memcached
- Real-time Big data analytics – Hadoop, Spark clusters
- Open source databases
- In-memory analytics workloads
- Business Warehousing (BW) workloads
- Genomics analysis
- SQL analysis services
Trade-offs among General Purpose, Compute Optimized, and Memory Optimized instances
Let us illustrate this with an example. Consider a 4 vCPU instance and see how the CPU and memory combination compares across General Purpose, Compute Optimized, and Memory Optimized families. Several further instance sizes are available based on processor type (ARM/Intel), network bandwidth, and supported internal storage/disks, which are not shown below.
| Instance Type | vCPU | Memory (GB) | vCPU:Memory Ratio | Comments |
| --- | --- | --- | --- | --- |
| General Purpose | 4 | 16 | 1:4 | Balanced cores and memory for most use cases |
| Compute Optimized | 4 | 8 | 1:2 | More compute-focused (CPU) |
| Memory Optimized | 4 | 32 | 1:8 | Offers more memory per core |
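The rule-of-thumb ratios above can be captured in a small sketch. These are mnemonic values only; actual offerings vary by provider and series:

```python
# Rule-of-thumb memory-per-vCPU ratios (GB per vCPU). These are mnemonic
# approximations, not guaranteed specs of any provider's instance series.
RATIOS = {
    "general_purpose": 4,    # balanced cores and memory
    "compute_optimized": 2,  # more CPU per GB of memory
    "memory_optimized": 8,   # more memory per core
}

def estimated_memory_gb(family: str, vcpus: int) -> int:
    """Estimate RAM for an instance of the given family and vCPU count."""
    return vcpus * RATIOS[family]

# Reproduce the 4-vCPU comparison from the table above.
for family in RATIOS:
    print(family, estimated_memory_gb(family, 4))
```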
Apart from CPU and memory, network bandwidth and storage support also matter. The available network bandwidth of an instance depends on the number of vCPUs it has; baseline bandwidth typically ranges from 1 Gbps to 25 Gbps based on the instance size. Cloud providers also use a network I/O credit mechanism that lets instances burst beyond their baseline bandwidth on a best-effort basis. For storage, you can look for SSD-based instance storage, which acts as temporary block-level storage that comes with the instance.
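The vCPU-to-bandwidth relationship can be sketched as a tiered lookup. The tier boundaries below are purely illustrative; each provider publishes its own per-series figures:

```python
# Illustrative baseline-bandwidth tiers keyed by vCPU count. The boundaries
# and Gbps values are made up for demonstration; real tiers differ per
# provider and instance series.
_TIERS = [
    (2, 1),    # up to 2 vCPUs  -> 1 Gbps baseline
    (8, 5),    # up to 8 vCPUs  -> 5 Gbps
    (16, 10),  # up to 16 vCPUs -> 10 Gbps
    (48, 25),  # up to 48 vCPUs -> 25 Gbps
]

def baseline_bandwidth_gbps(vcpus: int) -> int:
    """Return the baseline bandwidth tier for a given vCPU count."""
    for max_vcpus, gbps in _TIERS:
        if vcpus <= max_vcpus:
            return gbps
    return 25  # largest tier in this sketch

print(baseline_bandwidth_gbps(4))   # 5
print(baseline_bandwidth_gbps(64))  # 25
```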
Plotting CPU against memory for these three families makes the trade-off easier to visualize.
Accelerated Computing
These instances are ideal for massively parallelized compute workloads, such as machine learning (ML), high-performance computing (HPC), and Compute Unified Device Architecture (CUDA) workloads. This family is the best option for workloads that require GPUs (Graphics Processing Units).
These instances use hardware accelerators, or co-processors, to perform functions, such as floating point number calculations, graphics processing, or data pattern matching, more efficiently than is possible in software running on CPUs.
A few use cases that can be fulfilled are:
- CUDA-enabled ML training and inference
- Massive parallelized computation
- BERT natural language processing
- Deep learning recommendation model (DLRM)
- Machine learning
- High-performance computing
- Computational fluid dynamics
- Computational finance
- Seismic analysis
- Speech recognition
- Autonomous vehicles
- Drug discovery
- Molecular modeling, genomics
- Object detection, Image recognition, and rendering
- Natural language processing
- Forecasting and Recommendation engines
- Image and video analysis
- Advanced text analytics
- 3D visualizations and 3D rendering
- Graphics-intensive remote workstation
- Document analysis, voice, and conversational agents
Storage Optimized
These instances are designed for workloads that require high, sequential read and write access to very large data sets on local storage.
They are optimized to deliver tens of thousands of low-latency, random I/O operations per second (IOPS) to applications. The disks are based on Non-Volatile Memory Express (NVMe) SSD-backed instance storage or HDD-based disks.
A few use cases for Storage Optimized instances are:
- Small to medium-scale NoSQL databases e.g. Cassandra, MongoDB, Aerospike
- In-memory databases e.g. Redis
- SQL and Scale-out transactional databases
- Data warehousing
- Elasticsearch
- Analytics workloads.
Summary
Let us conclude the post with the instance types available across AWS, Azure, and Google Cloud. The table below lists the instance series offered under each instance family or type.
| Instance Family | Category | AWS | Azure | Google Cloud |
| --- | --- | --- | --- | --- |
| General Purpose | Burstable instances | T-Series | B-Series | E-Series** |
| General Purpose | Most common | M-Series | D-Series | N-Series** |
| General Purpose | Other instances | Mac and A1 series | A-Series | T-Series |
| Compute Optimized | Most common | C-Series | F-Series, FX-Series | C-Series** |
| Compute Optimized | Other HPC instances | HPC series | H-Series | C2D-Series |
| Memory Optimized | Most common | R-Series | E-Series | ** |
| Memory Optimized | Very large memory | X-Series | M-Series | M-Series |
| Memory Optimized | Other instances | High memory (U-series), z1d-series | DSv2-Series | |
| Accelerated Computing | GPU enabled | P- and G-Series | N-Series (NC, NV, ND) | A-Series |
| Accelerated Computing | FPGA enabled | F1 | NP-Series | |
| Accelerated Computing | Other instances | VT1 (U30 accelerators), Inf1 (Inferentia), DL1 (Gaudi), Trn1 (Trainium) | | |
| Storage Optimized | NVMe SSD-based | I-Series | Ls-Series | |
| Storage Optimized | HDD-based | D-Series | G-Series | |
Note: Google Cloud categorizes the E-series and N-series as General Purpose but provides varied vCPU-to-memory combinations within each series. For example, the E-series comes in E2-standard (1:4), E2-highmem (1:8), and E2-highcpu (1:1) variants. Hence a direct comparison is not shown in the table.
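To tie the post together, the rule-of-thumb ratios suggest a simple way to shortlist a family from a workload's observed memory-per-vCPU demand. This is a hypothetical helper for illustration, not a real sizing tool; actual selection should also weigh bandwidth, storage, and cost:

```python
# Hypothetical helper: shortlist an instance family from coarse workload
# traits, using the rule-of-thumb vCPU:memory ratios from this post.
def suggest_family(required_memory_gb: float, vcpus: int,
                   needs_gpu: bool = False,
                   high_local_iops: bool = False) -> str:
    """Map coarse workload traits to one of the five instance families."""
    if needs_gpu:
        return "Accelerated Computing"
    if high_local_iops:
        return "Storage Optimized"
    ratio = required_memory_gb / vcpus  # GB of RAM per vCPU
    if ratio <= 2:
        return "Compute Optimized"   # CPU-heavy, little memory per core
    if ratio >= 8:
        return "Memory Optimized"    # memory-heavy, e.g. in-memory DBs
    return "General Purpose"         # balanced workloads

print(suggest_family(16, 4))                  # 1:4 -> General Purpose
print(suggest_family(64, 4))                  # 1:16 -> Memory Optimized
print(suggest_family(8, 4, needs_gpu=True))   # Accelerated Computing
```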