Data Engineering

edX
$1,196.00
English
Certificate Available
57 weeks long, 3-4 hours a week

Overview

Organizations have more data at their disposal today than ever before. The vast amount of data that organizations are capturing, along with their desire to extract meaningful insights, is driving an urgent demand for Data Engineers.

Data Engineers play a fundamental role in harnessing data, enabling organizations to apply business intelligence to make informed decisions. Today’s Data Engineers require a broad set of skills to develop and optimize data systems and make data available to the organization for analysis.

This Professional Certificate provides you with the job-ready skills you will need to launch your career as an entry-level data engineer.

Upon completing this Professional Certificate, you will have extensive knowledge of, and practical experience with: cloud-based relational databases (RDBMS) and NoSQL data repositories; working with Python, Bash and SQL; processing big data with Apache Hadoop and Apache Spark; using ETL (extract, transform and load) tools; creating data pipelines; using Apache Kafka and Airflow; designing, populating, and querying data warehouses; and utilizing business intelligence tools.

Within each course, you’ll gain practical experience with hands-on labs and projects for building your portfolio. In the final Capstone project, you’ll apply the knowledge and skills attained throughout this program and demonstrate your ability to perform as a Data Engineer.

This program does not require any prior data engineering or programming experience.

Syllabus

Courses under this program:
Course 1: Data Engineering Basics for Everyone

Learn about data engineering concepts, ecosystem, and lifecycle. Also learn about the systems, processes, and tools you need as a Data Engineer to gather, transform, load, process, query, and manage data so that data consumers can leverage it for operations and decision-making.



Course 2: Python Basics for Data Science

This Python course provides a beginner-friendly introduction to Python for Data Science. Practice through lab exercises, and you'll be ready to create your first Python scripts on your own!



Course 3: Python for Data Engineering Project

An opportunity to apply your foundational Python skills via a project, using various techniques to collect and work with data.



Course 4: Relational Database Basics

This course teaches you the fundamental concepts of relational databases and Relational Database Management Systems (RDBMS) such as MySQL, PostgreSQL, and IBM Db2.



Course 5: SQL for Data Science

Learn how to use SQL to communicate with databases and extract data from them, a must for anyone working in the data science field.



Course 6: SQL Concepts for Data Engineers

In this short course, you will learn additional SQL concepts such as views, stored procedures, transactions and joins.



Course 7: Linux Commands & Shell Scripting

This mini-course describes shell commands and how to use the advanced features of the Bash shell to automate complicated database tasks. For those not familiar with shell scripting, this course provides an overview of common Linux Shell Commands and shell scripting basics.



Course 8: Relational Database Administration (DBA)

This course helps you develop the foundational skills required to perform the role of a Database Administrator (DBA), including designing, implementing, securing, maintaining, troubleshooting and automating databases such as MySQL, PostgreSQL and Db2.



Course 9: Building ETL and Data Pipelines with Bash, Airflow and Kafka

This course provides you with practical skills to build and manage data pipelines and Extract, Transform, Load (ETL) processes using shell scripts, Airflow and Kafka.



Course 10: Data Warehousing and BI Analytics

This course introduces you to designing, implementing and populating a data warehouse and analyzing its data using SQL & Business Intelligence (BI) tools.



Course 11: NoSQL Database Basics

This course introduces you to the fundamentals of NoSQL, including the four key non-relational database categories. By the end of the course, you will have hands-on skills for working with MongoDB, Cassandra and IBM Cloudant NoSQL databases.



Course 12: Big Data, Hadoop, and Spark Basics

This course provides foundational big data practitioner knowledge and analytical skills using popular big data tools, including Hadoop and Spark. Learn and practice your big data skills hands-on.



Course 13: Apache Spark for Data Engineering and Machine Learning

This short course introduces you to the fundamentals of Data Engineering and Machine Learning with Apache Spark, including Spark Structured Streaming, ETL for Machine Learning (ML) Pipelines, and Spark ML. By the end of the course, you will have hands-on experience applying Spark skills to ETL and ML workflows.



Course 14: Data Engineering Capstone Project

This Capstone Project is designed for you to apply and demonstrate your Data Engineering skills and knowledge in SQL, NoSQL, RDBMS, Bash, Python, ETL, Data Warehousing, BI tools and Big Data.



Taught by

Aije Egwaikhide, Karthik Muthuraman, Romeo Kienzler, Rav Ahuja, Jeff Grossman, Steve Ryan, Ramesh Sannareddy, Joseph Santarcangelo, Lin Joyner, Rose Malcolm and Yan Luo