This course takes a detailed look at how to implement Big Data solutions using Apache Spark. It describes the problems that Big Data technologies are designed to solve, and explains how the Hadoop ecosystem addresses them via HDFS, YARN, and the Spark API.
We show plenty of examples to help you understand how to create and use RDDs from various data sources, such as flat files, NoSQL databases, and relational databases. We also explore the key Spark APIs layered on top of RDDs, including Spark Streaming via DataFrames, Spark SQL, Spark Machine Learning, and Spark Graph Processing.