DEV Community

Dylan Anthony
Dylan Anthony

Posted on • Updated on

Is a Redis ORM a Horrible Idea?

Help me Devs! I work on a web app which deals with a lot of data that moves really fast. Our primary database (MySQL) simply cannot keep up, so we store a lot of “current state” data in Redis.

Today we (de)serialize the objects with JSON and store the whole object. That leads to several issues, so we’re going to switch to storing each field individually.

The code for managing all these fields is going to get icky fast, so I’m thinking about building a general purpose ORM for Redis. Some of these already exist but don’t have all the features I need (e.g. transactions with pipelines), locks.

Is this a horrible idea? Am I just building a product that already exists elsewhere that I just don’t know about?

Thanks for your response!

EDIT: There have been a few responses suggesting that the problem has to do with JSON. To clarify, when we were storing this sort of data in MySQL it was normalized, not JSON. Using JSON is only a means to persist objects to Redis.

Discussion (33)

Collapse
ben profile image
Ben Halpern

I feel like Redis is a delightfully flexible tool and it’s in its nature to support all sorts of use cases.

In spirit I feel like ORMs are a perfectly nice fit.

A quick search seems to indicate that others have tried and succeeded at this

Collapse
david_j_eddy profile image
David J Eddy

"...Our primary database (MySQL) simply cannot keep up..." MySQL runs some of the biggest data sets in the world. Find a good DBA. Clustering, federation, caching.

Using Redis as a caching layer is a great idea, but as a persistent storage media; I would not recommend that. Sure, it can save to disk, has replication, complex object types, etc. But Redis is, by its nature, a cache, not persistent storage.

Right tool for the right job.

Collapse
dbanty profile image
Dylan Anthony Author

Thanks for the suggestion! I'll clarify the use case just a bit. The problem is not with read performance from MySQL, but with write performance to it. The data we're storing here is transient by its very nature, with each record changing several times per minute. It's also not data that we'd care much about losing (we persist as much as we need to in a down-sampled fashion to other storage mediums). So even if we could tune MySQL to accept writes fast enough (which would be expensive if we need to hire a new full time employee to figure it out!) we don't need most of the features it gives us.

Collapse
siy profile image
Sergiy Yevtushenko

You can save yourself from huge pain of maintaining data consistency between two different storages. Take a look at Apache Ignite data grid. Using it you can get several benefits:

  • transparent caching
  • access data as key-value store
  • access same data as relational DB with distributed SQL queries
  • full transaction support
  • close to linear horizontal scalability for both storage and performance
  • tunable redundancy and consistency characteristics for every "table" ("cache" in Ignite terms)

It also provides additional services like queues, pub-sub, distributed computing, built-in file system and many more useful things.

Collapse
dbanty profile image
Dylan Anthony Author

Very cool, I've never looked at that before! I'll certainly consider it as an option.

Collapse
vlasales profile image
Vlastimil Pospichal

There are three ways:

  1. Normalize database (break JSON and store atomic data to MySQL)
  2. still use JSON, but choose another database (MongoDB, DB4, Redis) - without ORM
  3. use a hybrid SQL database with JSON support (PosgreSQL)

I recommend the 3.

Collapse
dbanty profile image
Dylan Anthony Author

Thank you! I clarified a bit in an edit but JSON was not the goal, just a means to store data in Redis. In MySQL we had normal tables with columns. I hadn't been considering a NoSQL database before, but I'm certainly considering it now!

Collapse
rhymes profile image
rhymes

Oooh now I understand. Writes on normalized tables (with index rewrites and such) can be slow. I'm not 100% up to date on MySQL (I main use PostgreSQL) but I remember there's the possibility of having a JSON column for destructured data. Maybe you can test that before moving to a separate NoSQL DB.

I would test a JSON column, then if it's still too slow I would consider a separate DB. Maintaining one DB is still better than multiple DBs :D

Let us know about your findings, I'm very curious!

Collapse
brandinchiu profile image
Brandin Chiu

It sounds a lot like you're trying to shoehorn non-relational data into a relational system.

You lose all of the performance support of a relational system when your primary data collection is json. It's difficult to index and more complex for MySQL to store.

This is more of an issue of using a hammer like you would a screwdriver: you're simply not using the tool correctly.

Either maintain your data as JSON documents and move to a document-storage NoSQL solution like mongo (easier), or refactor your data management into a relational model to work with MySQL (harder).

Collapse
dbanty profile image
Dylan Anthony Author

Oops, I clearly made a mistake in describing the issue! I clarified a bit above. When we were trying to use MySQL for this, the data was not JSON. You do bring up a good point though in that we don't need the power of a relational database (e.g. joins) for this sort of data. I have been looking at DynamoDB to see if that will give us what we need. Though maybe Mongo would have better tooling.

Collapse
brandinchiu profile image
Brandin Chiu

Reading over some of your other comments implies that it's only a subset of your data that has this problem, not the entire dataset.

In that case, mixing storage engines is probably the preferred solution. If this "transient data" is short-lived between periods of long persistence, then redis makes a lot of sense.

Collapse
dmfay profile image
Dian Fay

Did you introduce Redis because you were having a performance issue specifically with JSON blobs being stored in MySQL? If that's the case I would look back before looking forward: if your data can be expanded into relations and indexed you might not need a second database at all (and even in MySQL you can index generated columns sourced from JSON, should you need to).

Outside that, you build the tooling you need! If someone else has gotten 80 or 90% of the way there, maybe don't bother starting from scratch. And since Redis is a key-value store rather than a relational database it won't, strictly speaking, be an O/RM -- but that's a good thing.

Collapse
dbanty profile image
Dylan Anthony Author

I clarified a bit in an edit since I clearly implied exactly what you were saying- JSON in MySQL. That isn't what we were doing, but that would have definitely been a problem! You're certainly right in that if we go forward with better Redis tooling we'll look at forking / expanding an existing package before writing one from scratch. Just depends on how much rewriting we'd have to do.

Collapse
rendlerdenis profile image
Denis Rendler • Edited on

I have worked on a project that had PostgreSQL. it was just amazing. Fast, SQL syntax and we could also query directly in JSON fields. Maybe it will help to have a look.

Collapse
adamcosi profile image
Adam

You raise a good point here, MySQL also supports native JSON type since 5.7.8, and if you're using MySQL 8.* the support for JSON is even more robust.

I'd certainly consider a separate data store (if you want to isolate this data from your primary stores) in either PostgreSQL or MySQL with native JSON support unless there is some particular business case that rules these out (for the OP).

Collapse
dbanty profile image
Dylan Anthony Author

Our primary database is pretty much stuck on MySQL 5.6 for the foreseeable future due to boring reasons. You're right though in that I don't need this data in the same database (clearly, because it's not) so I could stand up a new MySQL 8 or PostgreSQL instance. I've been wanting to play with both of those anyway.

Thanks for the suggestion!

Thread Thread
rhymes profile image
rhymes

That's also a good option. If the primary DB is frozen in time and if this data is mostly written and not read, you could even have a separate MySQL connection with the newest version that only contains the table to store this data that gets updated asynchronously (read "eventually consistent").

You'd recycle the same tooling, company expertise and learn new things that might help in the future to upgrade the main DB if you ever chose to do so.

Adopting a completely new storage server is a big investment (in terms of money, devops, time and so on), I'd prioritise a way to decrease that impact if it's technologically feasible

Collapse
gabiaxel profile image
Gabriel Axel

If you decide to go with Redis, and the built-in data structures (sets, hashes, etc') don't cover your requirements (and I do suggest you first try to make it fit - you should get into the "Redis mindset" and not always expect Redis to fit your code as it is), you may want to look into RedisJSON - it's an official Redis module for storing and editing JSON natively without (de)serialization. There are client libraries for several platforms.

Collapse
dbanty profile image
Dylan Anthony Author

Boy would I love to. Unfortunately we’re using Elasticache and it isn’t supported :(. Good suggestion though!

Collapse
pjmartorell profile image
Pere Joan Martorell • Edited on

You can store your data in Redis, no problem. I don't know what language are you using (I guess Python), but in Ruby there are several Redis ORMs. Maybe you can take a look and get some ideas:

I don't think Redis is only intended for caching, it can be used to calculate distances between coordinates, do unions/intersections between groups, create scoreboards, etc. There are background job processors like Sidekiq that also use it to store and manage their jobs in a persistent way.

Collapse
wlcdesigns profile image
wLc • Edited on

You might want to give CassandraDB a try, since it sounds like you need a non relational database instead of MySQL.

Collapse
dbanty profile image
Dylan Anthony Author

Thanks! I’ve never looked at that one before

Collapse
rhymes profile image
rhymes

It seems like you're in need of a document database. Have you considered the option?

Collapse
dbanty profile image
Dylan Anthony Author

Oh hey rhymes! I'm considering the option now due to all these excellent replies!

Collapse
stefanofago73 profile image
Stefano Fago

An interesting resource is redisson: github.com/redisson/redisson
If You work with Java it make your day... if not, in the source You'll find a lot of interesting idea...

Collapse
dbanty profile image
Dylan Anthony Author

Cool, I’ll take a look for sure! I’m working with Python which has a few options but none are quite all the way there. If I do end up going this route though I’ll check out this project for inspiration. Thanks!

Collapse
sebk69 profile image
sebk69

Hi Dylan,

I'm new in DEV and I'me fallen on you're post, just few days after implementing a redis connector to small-orm.

For now, it is implemented only on swoft (The connector was specific because of async specificities of swoole) but it can inspire your will : github.com/sebk69/small-orm-swoft and github.com/sebk69/small-orm-core

Collapse
dbanty profile image
Dylan Anthony Author

Thanks a bunch! Several people have suggested trying out a NoSQL database so that's probably the direction I'm going.

Collapse
dbanty profile image
Dylan Anthony Author

This keeps coming up, I'll definitely try MongoDB! Also considering DynamoDB because I'm on AWS.

Collapse
sleavely profile image
Joakim Hedlund

If you're on AWS I strongly suggest DynamoDB. It has a fairly low barrier to entry and is covered by their forever-free tier if you want to ptototype a bit with it. (PS. Use on-demand tables, not the auto-scaling throughput ones. On-demand is magic.)

Thread Thread
dbanty profile image
Dylan Anthony Author

Im definitely thinking about it. Unsure how pricing will scale, napkin calculations look a little scary 👻. It’s on the short list though, thank you!