Generations in the Software World

A few decades ago, RDBMS and OOP came into the picture. These two technologies were the starting point for processing and analyzing data. Oracle and MySQL are built on the RDBMS model and Java is built on OOP, and most companies adopted these technologies. This was the first generation of analysis systems. First generation systems can process only a limited amount of data, and only structured, schema-oriented data.

Second Generation Systems

Over the last 10 years, data volumes have grown steadily, and in the last couple of years organizations have been receiving huge amounts of unstructured and semi-structured data. It is difficult to process large volumes of data, and unstructured data in particular, with first generation systems such as Oracle. That is why second generation systems came into the picture.

Second generation systems such as Hadoop resolved these problems. Hadoop made it practical to store and process large amounts of data, whether structured or unstructured. In addition, tools such as Hive and Pig make that data easier to work with: Hive lets users run SQL-like queries, and Pig offers its own scripting language. As a result, over the last 5 years most companies have adopted these emerging technologies.

The problem with second generation systems is that they are slow and each one handles only a specific type of workload. For example, Hadoop processes batch (offline) data, while Storm processes only streaming data. The main bottleneck is that processing moves data from disk to RAM and from RAM back to disk between steps. Performance drops drastically as a result, and many organizations have struggled with this over the last couple of years. That is why third generation systems came into the picture.


Third Generation Systems

Over the last couple of years, Spark has become the hot cake of the market. Spark addresses the problems above: it handles different types of workload, and it processes them quickly. Whether the job is historical data (batch), live data (streaming), machine learning, or graph data, Spark can handle all of them on one platform. The power of Spark is that it processes data in memory wherever possible. Disk stores data permanently but is slow; RAM stores data only temporarily but is very fast. Because Spark keeps as much of the processing as possible in memory, most of the problems above are resolved. Spark also optimizes the performance of analysis jobs. As a result, many students are looking for Spark training, and many organizations offer internal or online Spark training.
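The difference between the two processing styles can be illustrated with a toy, single-machine sketch in plain Python. This is not Hadoop or Spark code; the stage functions and the intermediate file name are hypothetical and exist only to show where the disk round trip happens between steps.

```python
import json
import os
import tempfile


def stage_square(values):
    """First processing step: square every value."""
    return [v * v for v in values]


def stage_sum(values):
    """Second processing step: add up all values."""
    return sum(values)


def run_via_disk(values, workdir):
    """Second generation style: write the intermediate result to disk,
    then read it back before running the next step."""
    path = os.path.join(workdir, "intermediate.json")  # hypothetical file name
    with open(path, "w") as f:
        json.dump(stage_square(values), f)   # RAM to disk
    with open(path) as f:
        intermediate = json.load(f)          # disk to RAM
    return stage_sum(intermediate)


def run_in_memory(values):
    """Third generation style: hand the intermediate result
    straight to the next step without touching disk."""
    return stage_sum(stage_square(values))


if __name__ == "__main__":
    data = list(range(1000))
    with tempfile.TemporaryDirectory() as workdir:
        disk_result = run_via_disk(data, workdir)
    memory_result = run_in_memory(data)
    # Both paths compute the same answer; only the route differs.
    print(disk_result == memory_result)
```

Both functions return the same answer. The only difference is the extra write and read in the middle, and that cost is repeated for every step of a longer job.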

Future of IT

In the future everything will be fast, which means everything will be processed in memory. Spark is therefore a starting point for future analytical technologies. Spark is in very high demand: nowadays many companies are looking for Spark developers and offering them large salary packages. In addition, Spark with Hadoop is very cheap, so many projects, especially Teradata, mainframe, and data warehouse projects, are now switching to Spark.

Fourth generation systems such as Apache Ignite already keep everything in memory, meaning they both store and process data in memory. Flink is also a unified platform that processes data in memory. In the future, all technologies (both OLAP and OLTP) are likely to follow the in-memory approach to process data quickly.

Spark: Origin of future analytics
