This is great! I have been working with Apache Spark for years, and your notes are clear and to the point!
I love that you compared cluster computing vs grid computing. In my opinion, Apache Spark could also function as grid computing, but I don't think it would get the best performance, since physical distance between nodes matters.
If you look at grid computing as a distributed-systems concept - a way to use computers distributed over a network to solve a problem - then Hadoop is a subset of grid computing. And Apache Spark improves on the Hadoop MapReduce paradigm since it does most of its computation in memory.
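To make the in-memory point concrete, here is a toy plain-Python sketch (not actual Spark code) of why caching a parsed dataset in memory helps iterative workloads; the `parse` function and the data are made up for illustration:

```python
def parse(raw):
    """Simulates an expensive load/parse step (a disk read per iteration in MapReduce)."""
    return [int(x) for x in raw]

raw_data = ["1", "2", "3", "4"]

# MapReduce-style: re-parse the input on every iteration.
totals_mapreduce = [sum(parse(raw_data)) for _ in range(3)]

# Spark-style: parse once, keep it in memory (analogous to rdd.cache()),
# and reuse it across iterations.
cached = parse(raw_data)
totals_spark = [sum(cached) for _ in range(3)]

print(totals_mapreduce, totals_spark)
```

Both approaches give the same answer; the difference is that the Spark-style version pays the expensive load cost only once, which is exactly why iterative algorithms (like machine learning training loops) run so much faster on Spark.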
The problems Apache Spark is most often used for are those that are better solved when the computation is brought to where the data lives. Typical "Big Data" problems like machine learning, data mining, etc. fall into this category. I personally have never tried using Apache Spark for grid computing, but it could be an interesting experiment.
What are you planning next with Apache Spark?