
CPU/Memory benchmarks of AWS EC2 5/4/3/AMD series

Yao Ren Jie ・5 min read

Background

In this post, I benchmark different types of AWS EC2 instances in detail to answer the following questions:

  1. How much does performance improve from c4/r4/m4 to c5/r5/m5? (c3/r3/m3 instances definitely need to be replaced.)
  2. How do AMD and Intel instances compare? e.g. m5 vs. m5a
  3. Are multiple small instances or one big instance better? e.g. 4 * (4 CPU, 8GB Memory) or 1 * (16 CPU, 32GB Memory)

To keep things short, we define some terms here:

  • C-Series or C: Compute Optimized Instances
  • M-Series or M: General Purpose Instances
  • R-Series or R: Memory Optimized Instances
  • 5th-Gen/4th-Gen/3rd-Gen: 5th/4th/3rd generation instances

TL;DR

  • Use 5th-Gen as much as possible
    1. 5th-Gen has better performance in CPU and Memory. ~25% faster in CPU and 10 times faster in memory.
    2. 5th-Gen is 15% cheaper than 4th-Gen - $0.231 per hour for c4.xlarge and $0.196 per hour for c5.xlarge in Singapore.
  • Use C-Series for CPU-bound applications: C-Series CPU performance is ~20% better than M/R-Series.
  • AMD instances are ~20% faster in CPU but ~25% slower in memory than Intel instances.
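The "15% cheaper" figure follows directly from the two hourly prices quoted above. As a minimal sketch (the function name `pct_cheaper` is just an illustrative choice), the arithmetic can be reproduced like this:

```shell
# Percentage saved when moving from old_price to new_price (both in $/hour).
pct_cheaper() {
  awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) / old * 100 }'
}

# Hourly prices quoted above for Singapore: c4.xlarge vs. c5.xlarge.
pct_cheaper 0.231 0.196   # prints 15.2
```

Prices change over time and differ by region, so the inputs should be replaced with current figures before drawing conclusions.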

Conclusions

Before going through the long description of the benchmarks and the charts, I think it is better to state the conclusions first.

After several benchmark runs, the charts and data support the following conclusions:

  1. Single-core CPU performance is almost the same across instance sizes within one series.
  2. The C/M/R-Series differ in more than the ratio between CPUs and memory (1:2 for C-Series, 1:4 for M-Series, 1:8 for R-Series).
  3. C-Series (Compute Optimized) single-core CPU performance is ~20% better than that of M-Series and R-Series instances.
  4. In the multiple-cores benchmark, per-core performance of the C/M/R-Series decays significantly once more than half of the cores are in use - running on all cores is ~20% slower per core than running on half of the cores or fewer. This may be caused by overselling of virtual machines.
  5. Amazon Linux 2 CPU performance is 8% better than CentOS 7, Ubuntu Server 18.04 LTS and Ubuntu Server 16.04 LTS.
  6. 5th-Gen memory performance is 10 times faster in sequential read/write than 4th-Gen and 3rd-Gen.
  7. 5th-Gen memory performance is 2-3 times faster in random read/write than 4th-Gen and 3rd-Gen.
  8. C5-Series/R5-Series CPU is ~25% faster than C4-Series/R4-Series.
  9. M5-Series CPU is ~20% faster than M4-Series.
  10. C-Series memory performance is ~8% faster than R-Series/M-Series.
  11. AMD-based M-Series (m5a) CPU performance is ~20% faster than Intel-based M-Series (m5).
  12. AMD-based M-Series (m5a) memory performance is ~25% slower than Intel-based M-Series (m5).

Preparations

Benchmark Tools

We used Sysbench and Geekbench to run the benchmarks.

I have read a lot of benchmarks, and they are sometimes hard to understand because authors usually put their emphasis on the results, not the procedures. I think the procedures are as important as the results, so in this section I list the commands that generated the benchmark results. Please feel free to skip it.

Sysbench

Sysbench is a scriptable multi-threaded benchmark tool based on LuaJIT. It is most frequently used for database benchmarks, but can also be used to create arbitrarily complex workloads that do not involve a database server.

We will test on these aspects:

  • cpu: a simple CPU benchmark
  • memory: a memory access benchmark

CPU benchmark commands

sysbench cpu --threads=1 run

Parameters:
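--threads: the number of worker threads to create (default 1). With --threads=1 the benchmark exercises a single CPU core; with more threads, the OS scheduler normally spreads them across the available cores.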
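To repeat the CPU test at several thread counts, the command above can be wrapped in a small loop. This is only a sketch: `thread_steps` and `events_per_sec` are hypothetical helper names, and the parsing assumes the sysbench 1.0+ summary format, which prints an "events per second" line (older versions report results differently).

```shell
# Print thread counts to test: powers of two below max, then max itself.
thread_steps() {
  local max="$1" n=1
  while [ "$n" -lt "$max" ]; do
    printf '%s ' "$n"
    n=$((n * 2))
  done
  printf '%s\n' "$max"
}

# Extract the "events per second" figure from sysbench cpu output.
# Assumption: sysbench 1.0+ prints a line like "    events per second:   875.71".
events_per_sec() {
  awk -F: '/events per second/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Run the sweep only when sysbench is installed.
if command -v sysbench >/dev/null 2>&1; then
  for t in $(thread_steps "$(nproc)"); do
    printf '%s threads: ' "$t"
    sysbench cpu --threads="$t" run | events_per_sec
  done
fi
```

On a 16-vCPU instance this would test 1, 2, 4, 8 and 16 threads, which is enough to see where per-core throughput starts to drop.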

Memory benchmark commands

The memory tests measure read and write performance in seq (sequential) and rnd (random) access modes.

sysbench memory --memory-oper=read --memory-access-mode=seq run
sysbench memory --memory-oper=write --memory-access-mode=seq run
sysbench memory --memory-oper=read --memory-access-mode=rnd run
sysbench memory --memory-oper=write --memory-access-mode=rnd run
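The four commands above are the combinations of two operations and two access modes, so they can be generated rather than typed out. The sketch below does that and pulls the throughput out of each run; the helper names are hypothetical, and the parsing assumes a sysbench 1.0+ summary line of the form "... MiB transferred (... MiB/sec)".

```shell
# Print the four sysbench memory commands (read/write x seq/rnd), one per line.
memory_cmds() {
  local mode oper
  for mode in seq rnd; do
    for oper in read write; do
      echo "sysbench memory --memory-oper=$oper --memory-access-mode=$mode run"
    done
  done
}

# Extract throughput from sysbench memory output.
# Assumption: the summary contains e.g. "10240.00 MiB transferred (7421.51 MiB/sec)".
mib_per_sec() {
  sed -n 's/.*(\([0-9.]*\) MiB\/sec).*/\1/p'
}

# Execute each command only when sysbench is installed.
if command -v sysbench >/dev/null 2>&1; then
  memory_cmds | while read -r cmd; do
    # $cmd is intentionally unquoted so it splits into command and arguments.
    printf '%s -> %s MiB/sec\n' "$cmd" "$($cmd </dev/null | mib_per_sec)"
  done
fi
```

Calling `memory_cmds` on its own acts as a dry run and prints the exact command lines without executing them.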

Geekbench

Geekbench is a cross-platform benchmark that measures your system's performance. It runs several built-in tests in single-core and multi-core mode and reports one Single-Core Score and one Multi-Core Score, as in the example below:
[Image: Geekbench example result]

Benchmark commands

cd Geekbench-5.1.0-Linux/
./geekbench5

Benchmark Plans

Single-core performance differences among different instance types

According to AWS, “C5 and C5d instances feature either the 1st or 2nd generation Intel Xeon Platinum 8000 series processor (Skylake-SP or Cascade Lake) with a sustained all core Turbo CPU clock speed of up to 3.6 GHz.” We want to know which instance type has the best single-core performance.

We will test on:

  1. c5 family:
    1. c5.large
    2. c5.xlarge
    3. c5.2xlarge
    4. c5.4xlarge
  2. m5 family:
    1. m5.large
    2. m5.xlarge
    3. m5.2xlarge
    4. m5.4xlarge
  3. r5 family:
    1. r5.large
    2. r5.xlarge
    3. r5.2xlarge
    4. r5.4xlarge

Is multiple-cores performance linear in the number of cores?

EC2 instances usually have multiple cores, and the price usually grows linearly with the number of cores. For example, c5.large instances have 2 CPUs and cost $0.098 per hour; c5.xlarge instances have 4 CPUs and cost $0.196 per hour.

In this test, we want to check whether performance also grows linearly with the number of cores, as pricing does. If it does, we only need to test single-core performance across instance generations, which saves a lot of time and makes the whole benchmark easier to understand.

We will test on:

  1. c5.large
  2. c5.xlarge
  3. c5.2xlarge
  4. c5.4xlarge
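One way to put a number on "linear" is to compare the measured multi-thread throughput with the ideal of thread count times single-thread throughput. The sketch below shows that ratio together with the price-per-CPU check from the pricing example above; the function names are hypothetical and the throughput figures are made up for illustration, not measured results.

```shell
# Scaling efficiency: multi-thread throughput divided by the ideal
# (thread count x single-thread throughput). 1.00 means perfectly linear.
scaling_efficiency() {
  awk -v single="$1" -v threads="$2" -v multi="$3" \
    'BEGIN { printf "%.2f\n", multi / (threads * single) }'
}

# Hourly price per CPU, to check that pricing is linear in the CPU count.
price_per_vcpu() {
  awk -v price="$1" -v cpus="$2" 'BEGIN { printf "%.3f\n", price / cpus }'
}

# Hypothetical figures for illustration only:
# 1000 events/s on 1 thread, 3200 events/s on 4 threads.
scaling_efficiency 1000 4 3200   # prints 0.80

# Prices quoted above: c5.large (2 CPUs) and c5.xlarge (4 CPUs).
price_per_vcpu 0.098 2           # prints 0.049
price_per_vcpu 0.196 4           # prints 0.049
```

An efficiency clearly below 1.00 at high thread counts would mean that a larger instance delivers less per-core throughput than its price suggests.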

Performance of different Operating systems

We will test single-core performance on different operating systems:

  1. Amazon Linux 2
  2. CentOS 7 (x86_64) - with Updates HVM
  3. Ubuntu Server 18.04 LTS
  4. Ubuntu Server 16.04 LTS
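When results from several operating systems end up in one spreadsheet, it helps to label each run with the OS it came from. A minimal sketch, assuming the image ships an `/etc/os-release` file (the four distributions listed above do); `os_name` is a hypothetical helper name:

```shell
# Print the distribution name from an os-release file (default: /etc/os-release).
# The file is sourced in a subshell so its variables do not leak into the caller.
# Pass an absolute path when overriding the default.
os_name() {
  ( . "${1:-/etc/os-release}" && printf '%s\n' "$PRETTY_NAME" )
}

# Label a result line with the OS and kernel it was measured on.
if [ -r /etc/os-release ]; then
  printf 'os="%s" kernel="%s"\n' "$(os_name)" "$(uname -r)"
fi
```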

c5/c4/c3, r5/r4/r3, m5/m4/m3

This test is the main goal of this benchmark - we want to measure the performance improvement from generation to generation within c5/c4/c3, r5/r4/r3 and m5/m4/m3.

We will test on:

  1. c5.large, c4.large, c3.large
  2. r5.large, r4.large, r3.large
  3. m5.large, m4.large, m3.large
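Statements such as "~25% faster" compare each generation's score with an older one. The sketch below expresses every score relative to the first line of its input; `relative_to_baseline` is a hypothetical helper, and the scores are invented for illustration, not taken from the benchmark data.

```shell
# Read "instance score" lines (higher score = faster) and print each score
# as a percentage change relative to the first line, which is the baseline.
relative_to_baseline() {
  awk 'NR == 1 { base = $2 }
       { printf "%s %+.0f%%\n", $1, ($2 / base - 1) * 100 }'
}

# Hypothetical scores for illustration only.
printf 'c3.large 700\nc4.large 800\nc5.large 1000\n' | relative_to_baseline
```

For metrics where lower is better, such as total run time, the ratio would need to be inverted before reading it as a speedup.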

m5 vs. m5a

The “a” in m5a means the instance uses an AMD CPU; these instances are cheaper than their Intel counterparts. In this test we want to know the performance difference between the AMD and Intel CPUs.

We will test on:

  1. m5.2xlarge - already tested in the single-core test above
  2. m5a.2xlarge
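Before comparing the two instances, it is worth confirming which CPU vendor each one actually exposes. On x86 Linux this is recorded in `/proc/cpuinfo` as `GenuineIntel` or `AuthenticAMD`. A small sketch, with `cpu_vendor` as a hypothetical helper name:

```shell
# Print the CPU vendor string from a cpuinfo file (default: /proc/cpuinfo).
# On x86 Linux this is "GenuineIntel" or "AuthenticAMD".
cpu_vendor() {
  awk -F': *' '/^vendor_id/ { print $2; exit }' "${1:-/proc/cpuinfo}"
}

# Report the vendor of the current machine when /proc/cpuinfo is available.
if [ -r /proc/cpuinfo ]; then
  cpu_vendor
fi
```

Recording this value next to each benchmark result makes it harder to mix up m5 and m5a runs later.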

Results

Single-core performance

[Chart: Single-core performance]

Multiple-cores performance and its decay

[Chart: Multiple-cores performance]

OS Performance

[Chart: OS Performance of Sysbench]
[Chart: OS Performance of Geekbench]

C-Series Performance - c5/c4/c3

CPU

[Chart: C-Series CPU Performance]

Memory

[Chart: C-Series Memory Performance]

R-Series Performance - r5/r4/r3

CPU

[Chart: R-Series CPU Performance]

Memory

[Chart: R-Series Memory Performance]

M-Series Performance - m5/m4/m3

CPU

[Chart: M-Series CPU Performance]

Memory

[Chart: M-Series Memory Performance]

References

  1. Benchmark data in Google Sheets
