Big data tools are applications that provide efficient distributed processing of very large datasets.
Redshift, Spark take care of: distribution, load processing, and ensuring completion of a task if a node fails by reassigning the same task to another node.
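The reassign-on-failure idea can be sketched in plain Python. This is only a toy model, not how Spark or Redshift is implemented: the node names, `NodeFailure`, `run_with_reassignment`, and `distributed_sum` are all hypothetical names made up for illustration.

```python
from typing import Callable, List, Set


class NodeFailure(Exception):
    """Raised by a (hypothetical) worker node that cannot finish its task."""


def run_with_reassignment(task: Callable[[str], int], nodes: List[str]) -> int:
    """Try the task on each node in order until one node completes it."""
    for node in nodes:
        try:
            return task(node)
        except NodeFailure:
            continue  # this node failed: hand the same task to the next node
    raise RuntimeError("task failed on every node")


def distributed_sum(data: List[int], nodes: List[str],
                    failing_nodes: Set[str], num_chunks: int = 4) -> int:
    """Split data into chunks, spread them over nodes, and sum the results."""
    # Distribution: every element lands in exactly one chunk.
    chunks = [data[i::num_chunks] for i in range(num_chunks)]
    total = 0
    for i, chunk in enumerate(chunks):
        def task(node: str, chunk: List[int] = chunk) -> int:
            if node in failing_nodes:  # simulated node crash
                raise NodeFailure(node)
            return sum(chunk)

        # Preferred node first, the remaining nodes act as fallbacks.
        start = i % len(nodes)
        order = nodes[start:] + nodes[:start]
        total += run_with_reassignment(task, order)
    return total


if __name__ == "__main__":
    # Node "n2" is down, yet every chunk still gets processed.
    print(distributed_sum(list(range(10)), ["n1", "n2", "n3"], {"n2"}))
```

The caller never sees the failure of `n2`; it only sees the final result, which is the point of letting the tool handle task completion.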
Ex: to combine data from different databases and different tables, Spark has connectors for different types of databases (SQL/NoSQL) and can sync the data.
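A rough PySpark sketch of that example, reading one table from each of two SQL databases over JDBC and joining them. The URLs, table names, credentials, the `customer_id` column, and the helper names `jdbc_options` and `combine_orders_with_customers` are assumptions for illustration; the matching JDBC driver jars must be available to Spark, and NoSQL stores generally need their own connector packages, which are not shown here.

```python
from typing import Dict


def jdbc_options(url: str, table: str, user: str, password: str) -> Dict[str, str]:
    """Build the option map for Spark's built-in JDBC data source."""
    return {"url": url, "dbtable": table, "user": user, "password": password}


def combine_orders_with_customers():
    """Load two tables from two different databases and join them in Spark."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("combine-databases").getOrCreate()

    # Hypothetical PostgreSQL database holding the orders table.
    orders = (spark.read.format("jdbc")
              .options(**jdbc_options("jdbc:postgresql://host-a:5432/sales",
                                      "orders", "reader", "secret"))
              .load())

    # Hypothetical MySQL database holding the customers table.
    customers = (spark.read.format("jdbc")
                 .options(**jdbc_options("jdbc:mysql://host-b:3306/crm",
                                         "customers", "reader", "secret"))
                 .load())

    # The join runs distributed across the cluster, not inside either database.
    return orders.join(customers, on="customer_id", how="inner")
```

Calling `combine_orders_with_customers()` returns a DataFrame that can be written to a third system with `DataFrame.write`, which is one way to do the "sync" step.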
Ex: Apache open-source big data tools:
- Plathora
- Pinot
- NiFi
- MapReduce