Lambda Store

Swifter Than DynamoDB: Lambda Store - Serverless Redis

Mattia Bianchi ・ 14 min read

As a Serverless Redis service, Lambda Store is an alternative to both DynamoDB and ElastiCache. In this post, I'll focus on one case where you should pick Lambda Store over DynamoDB: latency.

In our benchmarks, DynamoDB had higher latencies (mean, p99, p99.9, max) than Lambda Store. Yes, we know benchmarks are biased toward the specific use cases they were created for. Still, they are meaningful if you have similar use cases.

In this post, I'll explain our benchmark and show the results.

ℹ️ All benchmark code is open at my GitHub repo. I am open to all suggestions, reviews and even pull requests to the benchmark code.

Benchmark

In the benchmark, I used a single struct type with a bunch of fields, Product. At DynamoDB, fields of the struct are mapped to columns of a table. Similarly, at Lambda Store, they are mapped to fields of a Redis hash data structure.

type Product struct {
    Id           int     `json:"pid"`
    CreationTime int64   `json:"creation_time"`
    UpdateTime   int64   `json:"update_time"`
    Name         string  `json:"name"`
    Description  string  `json:"description"`
    Category     string  `json:"category"`
    Count        uint64  `json:"count"`
    Price        float64 `json:"price"`
    Seller       string  `json:"seller"`
    SellerId     int     `json:"seller_id"`
    Country      string  `json:"country"`
    City         string  `json:"city"`
}
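The mapping from struct fields to Redis hash fields can be sketched as follows. This is only an illustration, not the benchmark's actual code: `toHashFields` is a hypothetical helper, and reusing the struct's `json` tags as hash field names is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Product mirrors the struct used in the benchmark.
type Product struct {
	Id           int     `json:"pid"`
	CreationTime int64   `json:"creation_time"`
	UpdateTime   int64   `json:"update_time"`
	Name         string  `json:"name"`
	Description  string  `json:"description"`
	Category     string  `json:"category"`
	Count        uint64  `json:"count"`
	Price        float64 `json:"price"`
	Seller       string  `json:"seller"`
	SellerId     int     `json:"seller_id"`
	Country      string  `json:"country"`
	City         string  `json:"city"`
}

// toHashFields (hypothetical helper) flattens a Product into a
// field-name -> value map, using the json tags as field names.
// Numbers come back as float64 after the JSON round trip, so very large
// integers (above 2^53) would lose precision with this shortcut.
func toHashFields(p Product) (map[string]interface{}, error) {
	raw, err := json.Marshal(p)
	if err != nil {
		return nil, err
	}
	fields := make(map[string]interface{})
	if err := json.Unmarshal(raw, &fields); err != nil {
		return nil, err
	}
	return fields, nil
}

func main() {
	p := Product{Id: 1, Name: "kettle", Price: 19.9, Country: "IE"}
	fields, err := toHashFields(p)
	if err != nil {
		panic(err)
	}
	// One map entry per struct field; each entry would become one hash field.
	fmt.Println(len(fields), fields["name"], fields["price"])
}
```

A map like this is the shape most Redis clients accept for a multi-field hash write, which is why it is a convenient intermediate form.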

Scenarios

I ran four test scenarios in two test modes, with different parallelism levels and with TLS on and off.

Test scenarios are:

  • Read Test: Some number of Product objects are inserted initially. Then all fields of these objects are read.
  • Insertion Test: Unique Product objects are inserted to an initially empty database.
  • Overwrite Test: Some number of Product objects are inserted initially. Then all fields of these objects are overwritten.
  • Update Test: Some number of Product objects are inserted initially. Then only a few fields of these objects are updated.
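On the Redis side, these scenarios map naturally onto two hash commands. The sketch below only builds the command arguments, without any client library. It assumes `HGETALL` for reads and multi-field `HSET` (Redis 4.0 or later) for writes; the `product:<id>` key format and the helper names are hypothetical, and the benchmark itself may issue different commands.

```go
package main

import (
	"fmt"
	"sort"
)

// hsetArgs builds "HSET key field value [field value ...]".
// Passing every field models the insertion and overwrite tests;
// passing only a few fields models the update test.
func hsetArgs(key string, fields map[string]interface{}) []interface{} {
	names := make([]string, 0, len(fields))
	for name := range fields {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order, purely for readability

	args := make([]interface{}, 0, 2+2*len(fields))
	args = append(args, "HSET", key)
	for _, name := range names {
		args = append(args, name, fields[name])
	}
	return args
}

// hgetallArgs builds "HGETALL key", which reads all fields of a hash
// and so models the read test.
func hgetallArgs(key string) []interface{} {
	return []interface{}{"HGETALL", key}
}

func main() {
	key := "product:42" // hypothetical key naming scheme
	update := map[string]interface{}{
		"price":       9.99,
		"count":       3,
		"update_time": 1590000000,
	}
	fmt.Println(hgetallArgs(key)...)
	fmt.Println(hsetArgs(key, update)...)
}
```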

Test modes are:

  • Latency Test: Each parallel process runs the test with large pauses between executions. The goal of the latency test is to find the latency of the database for a single operation under very low load.

  • Throughput Test: Each parallel process runs the test method in a loop, without any delay between executions. The goal here is to measure the latency of the database under high load.

⚡ Throughput numbers are not strictly comparable, because we are testing different systems with different client implementations. Still, it makes sense to compare them, since there are significant differences in most of the scenarios.

Setup & Environment

  • AWS eu-west-1 Europe (Ireland) region was used for both DynamoDB and Lambda Store.
  • Machine type to run benchmark code was m6g.xlarge.
  • Operating system was Ubuntu Server 18.04 LTS 64-bit Arm version.
  • Go runtime was 1.14.3.
  • AWS Go SDK version was 1.30.29.
  • Go Redis client version was 7.2.0.
  • Benchmarks were run with 1, 10 and 100 parallel goroutines.

Results

ℹ️ You can find all of the benchmark histogram files under the results directory at my GitHub repo.

Throughput Tests

In the throughput tests, each parallel goroutine executes the benchmark method continuously in a loop for a minute. I recorded the latency of each request in a histogram. Latency is measured as the elapsed time between sending a request and receiving its response.

Read Benchmark

In the read benchmark, some number of Product objects are inserted initially. Then all fields of randomly selected objects are read continuously. At DynamoDB all columns of an existing row are read for each request. At Lambda Store all fields of a Redis hash are read.

Parallelism 1

With a single client, Lambda Store had significantly better latencies and throughput. p99, p99.9 and max latencies of DynamoDB were around 3.9ms, 22ms and 95ms respectively, but at Lambda Store they were just 0.7ms, 2ms and 8.7ms.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     665.599   0.50000000     45608            2246.655   0.50000000     13063
     671.743   0.75000000     67241            2377.727   0.75000000     19557
     679.423   0.87500000     78190            2459.647   0.87500000     22838
     690.687   0.93750000     83512            2547.711   0.93750000     24453
     769.535   0.99218750     88320            3969.023   0.99218750     25859
    2162.687   0.99902344     88927           22790.143   0.99902344     26037
    8773.631   1.00000000     89013           95027.199   1.00000000     26062

read Throughput - Parallelism-1

Parallelism 10

With 10 parallel clients, Lambda Store again provided significantly better latency and throughput values. Max latency of DynamoDB was around 275ms but at Lambda Store it was just 18ms.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     666.623   0.50000000    446875            2478.079   0.50000000    111546
     679.423   0.75000000    660320            2742.271   0.75000000    167502
     704.511   0.87500000    766362            2932.735   0.87500000    195293
     732.159   0.93750000    820150            3158.015   0.93750000    209072
    1004.031   0.99218750    867687           10420.223   0.99218750    221233
    3272.703   0.99902344    873664           26476.543   0.99902344    222756
   18382.847   1.00000000    874517          276561.919   1.00000000    222973

read Throughput - Parallelism-10

Parallelism 100

With 100 parallel clients, Lambda Store again had significantly better latency and throughput values. p99.9 and max latencies of DynamoDB were around 196ms and 726ms w/o TLS and 419ms and 978ms with TLS respectively, but at Lambda Store max latency was just 28ms w/o TLS and 35ms with TLS.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

    2791.423   0.50000000   1024844            7655.423   0.50000000    280890
    2983.935   0.75000000   1531801           10797.055   0.75000000    421528
    3319.807   0.87500000   1785964           14401.535   0.87500000    491501
    4046.847   0.93750000   1913014           21413.887   0.93750000    526584
    7958.527   0.99218750   2024460           56950.783   0.99218750    557294
   14376.959   0.99902344   2038406          196870.143   0.99902344    561133
   27918.335   1.00000000   2040388          726138.879   1.00000000    561681

read Throughput - Parallelism-100

Throughput Comparison

These are the numbers of completed read requests in a minute. In all tests, Lambda Store had better throughput values.

         Lambda Store   Lambda Store TLS   DynamoDB   DynamoDB TLS

P-1             89013              90159      26062          25574
P-10           874517             864270     222973         173521
P-100         2040388            1919849     561681         243913
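To make the table easier to read, the per-minute counts can be converted to operations per second and to a Lambda Store / DynamoDB ratio. The snippet below does this for the non-TLS read columns; the helper names are mine and the numbers are copied from the table.

```go
package main

import "fmt"

// row holds the non-TLS read counts (requests completed in one minute).
type row struct {
	label       string
	lambdaStore float64
	dynamoDB    float64
}

// opsPerSecond converts a per-minute count to operations per second.
func opsPerSecond(perMinute float64) float64 {
	return perMinute / 60
}

// speedup returns how many times more requests a completed than b.
func speedup(a, b float64) float64 {
	return a / b
}

func main() {
	rows := []row{
		{"P-1", 89013, 26062},
		{"P-10", 874517, 222973},
		{"P-100", 2040388, 561681},
	}
	for _, r := range rows {
		fmt.Printf("%-6s Lambda Store %9.0f ops/s   DynamoDB %8.0f ops/s   ratio %.2fx\n",
			r.label, opsPerSecond(r.lambdaStore), opsPerSecond(r.dynamoDB),
			speedup(r.lambdaStore, r.dynamoDB))
	}
}
```

For reads without TLS, the ratio stays between roughly 3.4x and 3.9x across all three parallelism levels.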

Insertion Benchmark

In the insertion benchmark, unique Product objects are inserted to an initially empty database. At DynamoDB a new row is inserted to a table for each request. At Lambda Store a new Redis hash is inserted to the Redis database.

Parallelism 1

With a single client, Lambda Store had significantly better latencies and throughput. In this case, the load on the system was very low, because most of the time the system was idle, waiting for network I/O.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     651.263   0.50000000     43162            3309.567   0.50000000      8481
     662.015   0.75000000     63507            3743.743   0.75000000     12684
     683.519   0.87500000     73582            3897.343   0.87500000     14808
     702.975   0.93750000     78690            3991.551   0.93750000     15848
    2533.375   0.99218750     83266            6406.143   0.99218750     16769
    6737.919   0.99902344     83840           28868.607   0.99902344     16885
   13000.703   1.00000000     83921          114491.391   1.00000000     16901

Insert Throughput - Parallelism-1

Parallelism 10

With 10 parallel clients inserting new objects, the result was a bit different. At lower percentiles, up to p99, which account for most of the throughput, Lambda Store had better latency values and, as a result, higher throughput. But about 1% of the requests had significantly worse latencies at Lambda Store than at DynamoDB.

ℹ️ We investigated this a bit further, and figured out that, while inserting new entries, there is an O(log n) cost which causes big delays under very high load. We are working on a solution to remove that cost and testing it on our dev environment.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     712.703   0.50000000    186312            3493.887   0.50000000     78875
     886.271   0.75000000    278444            3813.375   0.75000000    118126
    1225.727   0.87500000    324776            4083.711   0.87500000    137732
    1735.679   0.93750000    347975            4857.855   0.93750000    147550
   15884.287   0.99218750    368268           14155.775   0.99218750    156157
  166723.583   0.99902344    370802           29065.215   0.99902344    157231
  710934.527   1.00000000    371163          145489.919   1.00000000    157384

Insert Throughput - Parallelism-10

Parallelism 100

With 100 parallel clients inserting new objects, both systems struggled to provide acceptable latency values. At p99.9, both were close to 1s.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

    7446.527   0.50000000    185522            3962.879   0.50000000    119638
   10895.359   0.75000000    278315           24625.151   0.75000000    179388
   19431.423   0.87500000    324533           29409.279   0.87500000    209311
   32866.303   0.93750000    347686           33685.503   0.93750000    224282
  271581.183   0.99218750    367955          760741.887   0.99218750    237254
  811597.823   0.99902344    370489         1005584.383   0.99902344    238888
 2055208.959   1.00000000    370851        12750684.159   1.00000000    239121

Insert Throughput - Parallelism-100

Throughput Comparison

These are the numbers of completed insertion requests in a minute. In all tests, Lambda Store had better throughput values.

         Lambda Store   Lambda Store TLS   DynamoDB   DynamoDB TLS

P-1             83921              80460      16901          14862
P-10           371163             360779     157384         129753
P-100          370851             370100     239121         135363

Overwrite Benchmark

In the overwrite benchmark, some number of Product objects are inserted initially. Then all fields of randomly selected objects are overwritten continuously. At DynamoDB an existing row of a table is overwritten completely for each request. At Lambda Store all fields of a Redis hash are updated with the new values.

Parallelism 1

With a single client, Lambda Store had significantly better latencies and throughput. In this case, the load on the system was very low, because most of the time the system was idle, waiting for network I/O. The result was very similar to the insertion benchmark.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     663.039   0.50000000     43215            3258.367   0.50000000      9128
     673.279   0.75000000     62757            3389.439   0.75000000     13620
     691.199   0.87500000     73335            3479.551   0.87500000     15883
     710.655   0.93750000     78440            3559.423   0.93750000     17015
    1958.911   0.99218750     82981            4214.783   0.99218750     17994
    6111.231   0.99902344     83553           20430.847   0.99902344     18118
   19939.327   1.00000000     83634           47022.079   1.00000000     18135

Overwrite Throughput - Parallelism-1

Parallelism 10

With 10 parallel clients, Lambda Store again provided significantly better latency and throughput values. Max latency of DynamoDB was around 250ms but at Lambda Store it was just 24ms.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     695.807   0.50000000    333923            3543.039   0.50000000      76918
     766.463   0.75000000    496873            3872.767   0.75000000     115060
     953.343   0.87500000    579516            4098.047   0.87500000     134086
    1728.511   0.93750000    620861            5349.375   0.93750000     143618
    4009.983   0.99218750    657069           13189.119   0.99218750     151994
    8527.871   0.99902344    661596           28213.247   0.99902344     153040
   23969.791   1.00000000    662241          247857.151   1.00000000     153189

Overwrite Throughput - Parallelism-10

Parallelism 100

With 100 parallel clients, DynamoDB w/o TLS had better latency values at lower percentiles (up to about p93), but beyond that point its latencies were far higher than those of Lambda Store and of DynamoDB with TLS. Lambda Store had around 1/8 of the latency of DynamoDB with TLS, and significantly better throughput than DynamoDB.

I am planning to run this test again, because of the interesting behaviour of DynamoDB w/o TLS. I observed a very similar result in the update test with 100 parallel clients. See below...

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

    5906.431   0.50000000    425093            3614.719   0.50000000    120065
    7475.199   0.75000000    637311            3956.735   0.75000000    179872
    9355.263   0.87500000    743457            4202.495   0.87500000    210067
   14106.623   0.93750000    796279            4599.807   0.93750000    224821
   28049.407   0.99218750    842701          870842.367   0.99218750    237927
   39911.423   0.99902344    848507         2237661.183   0.99902344    239566
   78381.055   1.00000000    849334        12624855.039   1.00000000    239800

Overwrite Throughput - Parallelism-100

Throughput Comparison

These are the numbers of completed overwrite requests in a minute.

         Lambda Store   Lambda Store TLS   DynamoDB   DynamoDB TLS

P-1             83634              83292      18135          15649
P-10           662241             660878     153189         145962
P-100          849334             835029     239800         133787

Update Benchmark

In the update benchmark, some number of Product objects are inserted initially. Then a few fields of randomly selected objects are updated continuously. At DynamoDB a few columns of a row are updated for each request. At Lambda Store fields of a Redis hash are updated.

Parallelism 1

With a single client, Lambda Store had significantly better latencies and throughput. The result was very similar to the insertion and overwrite benchmarks.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     644.095   0.50000000     46631            3633.151   0.50000000      7855
     651.775   0.75000000     68043            4048.895   0.75000000     11733
     662.015   0.87500000     79058            4218.879   0.87500000     13701
     675.839   0.93750000     84660            4308.991   0.93750000     14653
     932.351   0.99218750     89565            5439.487   0.99218750     15506
    5570.559   0.99902344     90181           26017.791   0.99902344     15613
   25673.727   1.00000000     90269           39354.367   1.00000000     15628

update Throughput - Parallelism-1

Parallelism 10

With 10 parallel clients, Lambda Store again provided significantly better latency and throughput values. Max latency of DynamoDB was around 129ms w/o TLS and 245ms with TLS but at Lambda Store it was just 17ms.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     654.335   0.50000000    400301            3657.727   0.50000000     77352
     677.887   0.75000000    596148            3917.823   0.75000000    115828
     715.775   0.87500000    695659            4167.679   0.87500000    135029
     818.175   0.93750000    745117            4685.823   0.93750000    144697
    4120.575   0.99218750    788509           13344.767   0.99218750    153095
    7712.767   0.99902344    793934           28688.383   0.99902344    154148
   17399.807   1.00000000    794710          129040.383   1.00000000    154298

update Throughput - Parallelism-10

Parallelism 100

With 100 parallel clients, DynamoDB w/o TLS had similar latency values at lower percentiles (up to p75), but beyond that point its latencies were far higher than those of Lambda Store and of DynamoDB with TLS. Lambda Store had around 1/8 of the latency of DynamoDB with TLS, and significantly better throughput than DynamoDB.

This result was very similar to the overwrite test with 100 parallel clients.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count


    3885.055   0.50000000    668790            3788.799   0.50000000    120084
    5357.567   0.75000000   1003177            4304.895   0.75000000    179880
    6627.327   0.87500000   1170136           18497.535   0.87500000    209643
    8081.407   0.93750000   1253410           27934.719   0.93750000    224613
   20185.087   0.99218750   1326456          843055.103   0.99218750    237711
   30359.551   0.99902344   1335594         2222981.119   0.99902344    239346
   69468.159   1.00000000   1336898         6039797.759   1.00000000    239579

update Throughput - Parallelism-100

Throughput Comparison

These are the numbers of completed update requests in a minute.

         Lambda Store   Lambda Store TLS   DynamoDB   DynamoDB TLS

P-1             90269              87742      15628          17014
P-10           794710             790974     154298         139004
P-100         1336898            1298606     239579         136594

Latency Tests

In the latency tests, 10 parallel goroutines executed the benchmark method in a loop for three minutes, with a large pause between executions. In these tests the load was very low, and the purpose was to measure latency as if there were only a single request in the system. I recorded the latency of each request in a histogram. Latency is measured as the elapsed time between sending a request and receiving its response.

Read Benchmark

In the read benchmark, some number of Product objects are inserted initially. Then all fields of randomly selected objects are read continuously. At DynamoDB all columns of an existing row are read for each request. At Lambda Store all fields of a Redis hash are read.

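As in the throughput tests, Lambda Store had better latency values at every percentile shown except the max, where the two systems were close (around 56ms at Lambda Store and 52ms at DynamoDB).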

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     721.919   0.50000000      2423            2545.663   0.50000000      2381
     752.639   0.75000000      3619            2729.983   0.75000000      3556
     775.679   0.87500000      4226            2975.743   0.87500000      4148
     791.039   0.93750000      4520           10747.903   0.93750000      4442
     843.263   0.99218750      4782           14598.143   0.99218750      4701
    1726.463   0.99902344      4815           28147.711   0.99902344      4734
   55607.295   1.00000000      4819           52396.031   1.00000000      4738

read latency


Insertion Benchmark

In the insertion benchmark, unique Product objects are inserted to an initially empty database. At DynamoDB a new row is inserted to a table for each request. At Lambda Store a new Redis hash is inserted to the Redis database.

As in the throughput tests, Lambda Store had better latency values. p99, p99.9 and max latencies of DynamoDB were around 17ms, 30ms and 51ms w/o TLS and 5.4ms, 21ms and 37ms with TLS respectively, but at Lambda Store they were 2.7ms, 7.5ms and 18ms w/o TLS and 2.9ms, 8.3ms and 15.8ms with TLS.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     763.903   0.50000000      2410            3883.007   0.50000000      2374
     796.159   0.75000000      3601            4558.847   0.75000000      3561
     812.031   0.87500000      4213           13033.471   0.87500000      4156
     829.439   0.93750000      4500           14368.767   0.93750000      4453
    2772.991   0.99218750      4763           17858.559   0.99218750      4711
    7565.311   0.99902344      4796           29982.719   0.99902344      4744
   18431.999   1.00000000      4800           51118.079   1.00000000      4748

Insert Latency


Overwrite Benchmark

In the overwrite benchmark, some number of Product objects are inserted initially. Then all fields of randomly selected objects are overwritten continuously. At DynamoDB an existing row of a table is overwritten completely for each request. At Lambda Store all fields of a Redis hash are updated with the new values.

As in the throughput tests, Lambda Store had better latency values.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count


     768.511   0.50000000      2396            3948.543   0.50000000      2378
     796.671   0.75000000      3590            4415.487   0.75000000      3565
     813.055   0.87500000      4182           11567.103   0.87500000      4157
     852.991   0.93750000      4481           13320.191   0.93750000      4455
    6422.527   0.99218750      4742           16285.695   0.99218750      4714
    9854.975   0.99902344      4775           32407.551   0.99902344      4746
   15597.567   1.00000000      4779           43024.383   1.00000000      4750

Overwrite latency


Update Benchmark

In the update benchmark, some number of Product objects are inserted initially. Then a few fields of randomly selected objects are updated continuously. At DynamoDB a few columns of a row are updated for each request. At Lambda Store fields of a Redis hash are updated.

As in the throughput tests, Lambda Store had better latency values.

            =Lambda Store=                                =DynamoDB=

 Latency(μs)   Percentile     Count         Latency(μs)   Percentile     Count

     755.199   0.50000000      2405            3880.959   0.50000000      2363
     779.775   0.75000000      3608            4476.927   0.75000000      3549
     794.111   0.87500000      4188           12673.023   0.87500000      4138
     825.855   0.93750000      4485           14434.303   0.93750000      4433
    5087.231   0.99218750      4747           18382.847   0.99218750      4690
    9814.015   0.99902344      4780           32571.391   0.99902344      4722
   12918.783   1.00000000      4784           51019.775   1.00000000      4726

update latency


Conclusion

One of the main goals of Lambda Store is to provide a low-latency serverless Redis database. We worked very hard to build a serverless Redis service with a first-class persistence layer in addition to the in-memory database, without sacrificing latency. And we are still working to improve that further.

These benchmarks show that, for low- to medium-throughput workloads, Lambda Store has better latency figures than DynamoDB. For high-throughput workloads, Lambda Store has latency values comparable to DynamoDB's, with better latencies up to p99 and significantly higher throughput. As I mentioned in the insertion throughput tests section, we have already identified a bottleneck that appears as the database grows with new insertions, and we are testing an improvement. According to preliminary tests, that change will improve all write workloads as well.

Lambda Store

Lambda Store is the first `serverless Redis` service. In this blog, the Lambda Store engineering team shares their experiences with Cloud, AWS, Kubernetes, Redis and of course Lambda Store.

Discussion


Can you comment on the throughput? With Dynamo I can easily do 17K writes per second with OnDemand capacity. I see you have "reserved plans" which is the same as DynamoDB Reserved Capacity what will the comparison be on a spiky work load and continuous work load?

 

Exact throughput (even latency) numbers depend on many factors such as instance type/CPU, network settings & bandwidth, benchmark code etc. What matters is the comparison of the numbers measured on the same (or very similar) environment and test. That's why I haven't pointed out exact throughput or latency numbers but just compared the two systems.

I haven't tested DynamoDB reserved capacity yet. That's a good point. As you said this benchmark was to test spiky load. I can work on another post comparing DynamoDB's reserved capacity to Lambda Store's reserved plan.