SRE Fundamentals and Security

edX
Free Online Course (Audit)
English
$99.00 Certificate Available
5 weeks long, 2-3 hours a week
Self-paced

Overview

Site Reliability Engineers must have the right tools and strategies to perform in a technical, fast-paced environment. IBM Cloud SRE is guided by nine competency areas that lead to the successful practice of the discipline:

● Applying Site Reliability Engineering principles

● Operations

● Monitoring and incident management

● Security and compliance

● Compute infrastructure

● Networking

● Storage and data management

● Reliability and resiliency

● Deployment automation

In this first course of the three-part Professional Certificate in Site Reliability Engineering (SRE), you will focus on the first four SRE competencies:

● Applying Site Reliability Engineering principles

● Operations

● Monitoring and incident management

● Security and compliance

NOTE: The remaining five SRE competencies are covered in Course 2: SRE Infrastructure, Resiliency and Deployment Automation.

This course covers approximately 50% of the content required to help you prepare for the “IBM Certified Professional SRE - Cloud V2” certification exam.

If you are interested in pursuing the “IBM Certified Professional SRE - Cloud V2” certification, we recommend that you complete all three offerings of the Professional Certificate in Site Reliability Engineering (SRE) to ensure a successful certification exam experience.

Syllabus

Module 1: Welcome and Introduction

You will cover the following topics:

● An introduction to the IBM Professional SRE role

Module 2: SRE Fundamentals and Terminology

You will cover the following topics:

● A deeper dive into the SRE role

● SRE principles

● Managing trade-offs between change, velocity, and reliability

● Negotiating service level objectives, service level indicators, error budgets, and the user experience

● IBM Cloud tools and technology across the Software Development Life Cycle

● Applying software engineering principles to drive reliability

Module 3: Operations

You will cover the following topics:

● Performing operational readiness reviews (ORR) on IBM Cloud

● Creating an ORR checklist

● Employing cost-optimization strategies

● Managing backups and recoveries on IBM Cloud

Module 4: Monitoring

You will cover the following topics:

● Monitoring overview

● Creating and maintaining metrics, traces, and alerts on IBM Cloud

● Collecting, analyzing, and managing logs on IBM Cloud

● Identifying key metrics for service health on IBM Cloud

● Using performance and availability metrics to measure the health of services on IBM Cloud

Module 5: Incident Management

You will cover the following topics:

● Managing incidents on IBM Cloud

● Developing a balanced action plan to mitigate future incidents

● Performing the post-incident review

Module 6: Security and Compliance

You will cover the following topics:

● Monitoring and managing security threats on IBM Cloud

● Implementing and managing security policies on IBM Cloud

● Implementing encryption models

● Managing role-based access control on IBM Cloud

Taught by

Michele Jordan and Marissa Moore