DEV Community

Discussion on: What are the most suitable datastores for storing a huge number of articles and news?

Adrian B.G.

IF you have a good team of experienced SysAdmins/Data Engineers to maintain the clusters:

To do it right and for the long term, I would choose 3 solutions for 3 problems.
Long-term storage: something horizontally scalable with replication (Cassandra, or maybe Kafka with Streams?).
A nice alternative would be storing the documents in S3.

From this "source of truth" you can move data with NiFi or similar tools to other platforms, and those platforms can change over time. This is the trick.
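
To make the trick concrete, here is a minimal sketch in Python. Everything in it is hypothetical: `Article`, `ArticleLog`, and `KeywordIndex` are made-up names, the in-memory list only stands in for the durable store (Cassandra, Kafka, S3), `replay` stands in for whatever mover you use (NiFi or similar), and the toy index stands in for a real search platform. The point is only that the downstream view is derived, so it can be thrown away and rebuilt from the source of truth at any time.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass(frozen=True)
class Article:
    article_id: str
    title: str
    body: str


class ArticleLog:
    """Append-only store standing in for the durable source of truth."""

    def __init__(self) -> None:
        self._records: List[Article] = []

    def append(self, article: Article) -> None:
        self._records.append(article)

    def replay(self, sink: Callable[[Article], None]) -> None:
        # Feed every stored article to a downstream consumer, oldest first.
        for article in self._records:
            sink(article)


class KeywordIndex:
    """Toy derived view: lowercase word -> set of article ids."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = {}

    def ingest(self, article: Article) -> None:
        for word in f"{article.title} {article.body}".lower().split():
            self._postings.setdefault(word, set()).add(article.article_id)

    def search(self, word: str) -> Set[str]:
        return self._postings.get(word.lower(), set())


log = ArticleLog()
log.append(Article("a1", "Election results", "Votes were counted overnight"))
log.append(Article("a2", "Market update", "Stocks rallied after the election"))

# Build the derived view by replaying the log into it.
index = KeywordIndex()
log.replay(index.ingest)

# Swapping the search platform later is just another replay into a new sink.
rebuilt = KeywordIndex()
log.replay(rebuilt.ingest)
```

If the search platform is replaced next year, nothing is lost: you point a new consumer at the same log and replay.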

Elasticsearch is one option for text search.

Real-time analytics/aggregation: Apache Beam/Spark/Flink.

Once-a-month heavy-duty analytics and discovery: a dedicated database where you can load tons of data, extract the report and shut it down (BigQuery, AWS Athena, Aurora...)

ELSE, if you do not have a big team of SREs and DevOps engineers:
use managed solutions; I would suggest Google Cloud.

FaN000s

I think we are in a trap :) :) :)

We do not have a team of experienced SysAdmins/Data Engineers and I do not think storing the data outside our data center will be an acceptable choice :) :) :) :).

Adrian B.G.

I would suggest looking for another job, but hey, that's just my trivial opinion 😀

This situation is usually a sign of many more company-wide issues.