The company I work at does data research and scraping; the results are later aggregated and published to our clients. We also try to denormalize the data in order to provide faster lookups in our web applications.
Until now, we have used mechanisms within SQL Server to do these aggregations. Recently, however, this has become a bottleneck: the processes take too long to execute and run over into business hours.
What other tools are commonly used to perform aggregations and pre-calculation outside of a relational database? So far I have found:
- Apache Hadoop MapReduce
- Apache Pig
- Apache Spark
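
To make the workload concrete, here is a minimal sketch in plain Python of the kind of pre-aggregation I mean. The record fields (`client`, `category`, `value`) and the function name `pre_aggregate` are hypothetical and only for illustration; our real data and SQL Server jobs are more involved.

```python
from collections import defaultdict

# Hypothetical scraped records; the field names are illustrative only.
records = [
    {"client": "acme", "category": "price", "value": 10.0},
    {"client": "acme", "category": "price", "value": 14.0},
    {"client": "acme", "category": "stock", "value": 3.0},
    {"client": "globex", "category": "price", "value": 7.5},
]


def pre_aggregate(rows):
    """Group rows by (client, category) and compute count, sum, and mean."""
    totals = defaultdict(lambda: {"count": 0, "sum": 0.0})
    for row in rows:
        key = (row["client"], row["category"])
        totals[key]["count"] += 1
        totals[key]["sum"] += row["value"]
    # Denormalized lookup table: one flat, precomputed entry per key,
    # so the web application does not have to aggregate at query time.
    return {
        key: {
            "count": agg["count"],
            "sum": agg["sum"],
            "mean": agg["sum"] / agg["count"],
        }
        for key, agg in totals.items()
    }


summary = pre_aggregate(records)
```

A web application would then read `summary[("acme", "price")]` directly instead of scanning the raw rows. The tools listed above would presumably run this same group-and-reduce step, only distributed across many machines.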