Background
In this post, I benchmark different types of AWS EC2 instances in detail to answer the following questions:
- How much does performance improve from c4/r4/m4 to c5/r5/m5? (c3/r3/m3 instances definitely need to be replaced.)
- How do AMD and Intel instances compare, e.g. m5 vs. m5a?
- Multiple small instances or one big instance, e.g. 4 * (4 CPU, 8GB Memory) or 1 * (16 CPU, 32GB Memory)?
To keep the text short, we define some terms here:
- C-Series or C: Compute Optimized Instances
- M-Series or M: General Purpose Instances
- R-Series or R: Memory Optimized Instances
- 5th-Gen/4th-Gen/3rd-Gen: 5th/4th/3rd generation instances
TL;DR
- Use 5th-Gen as much as possible
- 5th-Gen has better performance in CPU and memory: ~25% faster in CPU and up to 10 times faster in memory.
- 5th-Gen is ~15% cheaper than 4th-Gen: $0.231 per hour for c4.xlarge vs. $0.196 per hour for c5.xlarge in Singapore.
- Use C-Series for CPU-bound applications; C-Series CPU performance is ~20% better than M/R-Series.
- AMD instances are ~20% faster in CPU but ~25% slower in memory.
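The price difference quoted above is easy to check by hand. The snippet below is a minimal sketch; the function name price_saving_pct is my own and is not part of any AWS tooling. It takes the two hourly prices from this post and prints the saving as a percentage, which works out to roughly 15%.

```shell
# Percentage saved when moving from an old hourly price to a new one.
# price_saving_pct is a hypothetical helper name used only for this post.
price_saving_pct() {
  local old_price="$1" new_price="$2"
  awk -v o="$old_price" -v n="$new_price" \
    'BEGIN { printf "%.1f\n", (o - n) / o * 100 }'
}

# Singapore prices quoted above: c4.xlarge -> c5.xlarge
price_saving_pct 0.231 0.196
```

The same function can be reused for any other pair of instance prices in your own region.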
Conclusions
Before going through the long benchmark descriptions and charts, it is better to state the conclusions first.
After several benchmarks, we can draw these conclusions from the charts and data:
- Single-core CPU performance is almost the same across instance sizes within one series.
- The C/M/R-Series differ in more than the ratio of CPUs to memory (in GB), which is 1:2 for C-Series, 1:4 for M-Series and 1:8 for R-Series: C-Series single-core performance is also ~20% better than M/R-Series.
- C-Series (Compute Optimized) CPU performance is ~20% better than M-Series and R-Series.
- In the multi-core benchmark, per-core performance of the C/M/R-Series decays significantly once more than half of the cores are under test: running on all cores is ~20% slower per core than running on half of the cores or fewer. This may be caused by overselling of virtual machines, or by the fact that each vCPU on these instance types is a hyperthread, so two vCPUs share one physical core.
- Amazon Linux 2 CPU performance is 8% better than CentOS 7, Ubuntu Server 18.04 LTS and Ubuntu Server 16.04 LTS.
- 5th-Gen memory performance is 10 times faster in sequential read/write than 4th-Gen and 3rd-Gen.
- 5th-Gen memory performance is 2-3 times faster in random read/write than 4th-Gen and 3rd-Gen.
- C5-Series/R5-Series CPU is ~25% faster than C4-Series/R4-Series.
- M5-Series CPU is ~20% faster than M4-Series.
- C-Series memory performance is ~8% faster than R-Series/M-Series.
- M-Series with AMD is ~20% faster in CPU than M-Series with Intel.
- M-Series with AMD is ~25% slower in memory than M-Series with Intel.
Preparations
Benchmark Tools
We used Sysbench and Geekbench to run the benchmarks.
I have read a lot of benchmarks before, and they are sometimes hard to understand because authors usually put their emphasis on the results, not the procedures. I think the procedures are as important as the results, so in this section I list the commands that generated the benchmark results. Please feel free to skip it.
Sysbench
Sysbench is a scriptable multi-threaded benchmark tool based on LuaJIT. It is most frequently used for database benchmarks, but can also be used to create arbitrarily complex workloads that do not involve a database server.
We will test these aspects:
- cpu: a simple CPU benchmark
- memory: a memory access benchmark
CPU benchmark commands
sysbench cpu --threads=1 run
Parameters:
--threads: the total number of worker threads to create. On an otherwise idle instance, each thread will normally run on its own CPU core.
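To see how throughput changes with the number of threads, the command above can be repeated with increasing --threads values. The following is only a sketch of how such a sweep could be scripted, not necessarily how the results in this post were collected. The function names thread_counts and cpu_sweep are my own, and the grep pattern assumes sysbench 1.0 or later, which prints an "events per second" line for the CPU test.

```shell
# Print thread counts 1, 2, 4, ... and finally max_threads itself.
# thread_counts is a hypothetical helper name.
thread_counts() {
  local max_threads="$1" t=1
  while [ "$t" -lt "$max_threads" ]; do
    echo "$t"
    t=$((t * 2))
  done
  echo "$max_threads"
}

# Run the sysbench CPU test once per thread count and keep only the
# throughput line. Defaults to the number of CPUs reported by nproc.
# Assumes sysbench 1.0+ output format ("events per second").
cpu_sweep() {
  local max_threads="${1:-$(nproc)}" t
  for t in $(thread_counts "$max_threads"); do
    printf 'threads=%s ' "$t"
    sysbench cpu --threads="$t" run | grep 'events per second'
  done
}
```

Calling cpu_sweep without arguments on an instance with sysbench installed runs the sweep up to the full CPU count; cpu_sweep 4 stops at 4 threads.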
Memory benchmark commands
Memory tests measure read performance and write performance in seq (sequential) and rnd (random) mode.
sysbench memory --memory-oper=read --memory-access-mode=seq run
sysbench memory --memory-oper=write --memory-access-mode=seq run
sysbench memory --memory-oper=read --memory-access-mode=rnd run
sysbench memory --memory-oper=write --memory-access-mode=rnd run
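The four commands above are the combinations of two operations and two access modes, so they can also be generated by a small loop. This is a sketch under my own naming: memory_matrix is a hypothetical helper, and its optional first argument is a prefix command, so passing echo prints the commands instead of running them.

```shell
# Run (or, with "echo" as first argument, just print) the four sysbench
# memory tests: read/write in seq and rnd mode, in the order used above.
# memory_matrix is a hypothetical helper name.
memory_matrix() {
  local runner="${1:-}" oper mode
  for mode in seq rnd; do
    for oper in read write; do
      # $runner is intentionally unquoted: when empty it disappears
      # and sysbench is executed directly.
      $runner sysbench memory --memory-oper="$oper" --memory-access-mode="$mode" run
    done
  done
}
```

Running memory_matrix echo is a safe dry run on any machine; memory_matrix without arguments needs sysbench installed.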
Geekbench
Geekbench is a cross-platform benchmark that measures your system's performance. It runs several built-in tests in single-core and multi-core mode. The results are reported as one Single-Core Score and one Multi-Core Score.
Benchmark commands
cd Geekbench-5.1.0-Linux/
./geekbench5
Benchmark Plans
Single-core performance differences among different instance types
AWS states that “C5 and C5d instances feature either the 1st or 2nd generation Intel Xeon Platinum 8000 series processor (Skylake-SP or Cascade Lake) with a sustained all core Turbo CPU clock speed of up to 3.6 GHz.” We want to know which instance type has the best single-core performance.
We will test on:
- c5 family: c5.large, c5.xlarge, c5.2xlarge, c5.4xlarge
- m5 family: m5.large, m5.xlarge, m5.2xlarge, m5.4xlarge
- r5 family: r5.large, r5.xlarge, r5.2xlarge, r5.4xlarge
Whether multi-core performance is linear in the number of cores
EC2 instances usually have multiple cores, and pricing usually grows linearly with the number of cores. For example, a c5.large instance has 2 CPUs and costs $0.098 per hour; a c5.xlarge instance has 4 CPUs and costs $0.196 per hour.
In this test, we want to check whether performance also grows linearly with the number of cores, as pricing does. If it does, we only need to compare single-core performance across instance generations, which saves a lot of time and makes the whole benchmark easier to understand.
We will test on:
- c5.large
- c5.xlarge
- c5.2xlarge
- c5.4xlarge
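One way to make "linear" measurable in this test is a scaling efficiency: the multi-core score divided by the single-core score times the number of cores. A value of 100% means perfectly linear scaling. The snippet below is a sketch; the function name scaling_efficiency_pct is my own, and the numbers in the usage note are placeholders, not measured results.

```shell
# Scaling efficiency in percent:
#   multi_score / (single_score * cores) * 100
# scaling_efficiency_pct is a hypothetical helper name.
scaling_efficiency_pct() {
  local single="$1" multi="$2" cores="$3"
  awk -v s="$single" -v m="$multi" -v c="$cores" \
    'BEGIN { printf "%.1f\n", m / (s * c) * 100 }'
}
```

For example, with a placeholder single-core score of 1000 and a placeholder 4-core score of 3200, scaling_efficiency_pct 1000 3200 4 reports 80.0, meaning each core delivers 80% of what it delivers alone. The scores can come from sysbench events per second or from Geekbench.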
Performance of different operating systems
We will test single-core performance on different operating systems:
- Amazon Linux 2
- CentOS 7 (x86_64) - with Updates HVM
- Ubuntu Server 18.04 LTS
- Ubuntu Server 16.04 LTS
c5/c4/c3, r5/r4/r3, m5/m4/m3
This test is the main goal of this benchmark: we want to know how much performance improves across c5/c4/c3, r5/r4/r3 and m5/m4/m3.
We will test on:
- c5.large, c4.large, c3.large
- r5.large, r4.large, r3.large
- m5.large, m4.large, m3.large
m5 vs. m5a
The “a” in m5a means an AMD CPU; these instances are cheaper than their Intel counterparts. In this test we want to know the performance difference between AMD and Intel CPUs.
We will test on:
- m5.2xlarge (already tested in a previous test)
- m5a.2xlarge