NoSQL, Big Data and Spark Fundamentals

edX
$247.00
English
Certificate Available
14 weeks long, 2-3 hours a week

Overview

Data engineers and Big Data professionals are in high demand. NoSQL and Big Data technology skills, such as Apache Spark, are a must-have for modern data-driven decision-making. This three-course Professional Certificate from IBM opens the door to data engineering and Big Data careers.

The program starts with NoSQL Database Basics, which introduces you to NoSQL fundamentals, including the four key non-relational database categories. By the end of the course, you will have hands-on skills working with MongoDB, Cassandra, and IBM Cloudant NoSQL databases.

Crucial aspects of data engineering include the acquisition and management of Big Data, and the scalability and performance of Big Data analytics. When you enroll in Big Data, Hadoop, and Spark Basics, you'll discover the characteristics, features, benefits, limitations, and applications of some of the more popular Big Data processing tools. You'll explore the open-source ecosystem of Apache tools, including Apache Hadoop, Apache Hive, and Apache Spark, as well as Spark on Kubernetes. Discover how to leverage Spark to deliver reliable insights. You'll gain hands-on data analysis skills using PySpark and Spark SQL, create a streaming analytics application using Spark Streaming, and more.

Then enroll in Apache Spark for Data Engineering and Machine Learning to discover how data and machine learning engineers use Spark Structured Streaming, GraphFrames, regression, classification, and clustering. Learn how to apply the k-means clustering algorithm using Spark MLlib. Extract, transform, and load (ETL) is at the heart of data and machine learning engineering, and you'll gain skills using Spark to perform ETL tasks. This course culminates with a hands-on Spark project.

This Professional Certificate does not require any prior programming or data science skills; however, prior basic data literacy and SQL skills will prove valuable in completing this program.

Syllabus

Courses under this program:
Course 1: NoSQL Database Basics

This course introduces you to the fundamentals of NoSQL, including the four key non-relational database categories. By the end of the course, you will have hands-on skills for working with MongoDB, Cassandra, and IBM Cloudant NoSQL databases.



Course 2: Big Data, Hadoop, and Spark Basics

This course provides foundational big data practitioner knowledge and analytical skills using popular big data tools, including Hadoop and Spark. Learn and practice your big data skills hands-on.



Course 3: Apache Spark for Data Engineering and Machine Learning

This short course introduces you to the fundamentals of Data Engineering and Machine Learning with Apache Spark, including Spark Structured Streaming, ETL for Machine Learning (ML) Pipelines, and Spark ML. By the end of the course, you will have hands-on experience applying Spark skills to ETL and ML workflows.



Taught by

Aije Egwaikhide, Karthik Muthuraman, Romeo Kienzler, Rav Ahuja, Steve Ryan and Ramesh Sannareddy