Site Reliability Engineers must have the right tools and strategies to perform in a fast-paced technical environment. Nine competency areas guide the successful practice of IBM Cloud SREs:
● Applying Site Reliability Engineering principles
● Operations
● Monitoring and incident management
● Security and compliance
● Compute infrastructure
● Networking
● Storage and data management
● Reliability and resiliency
● Deployment automation
In this second course of the three-part Professional Certificate in Site Reliability Engineering (SRE), you will focus on the following five SRE competencies:
● Compute infrastructure
● Networking
● Storage and data management
● Reliability and resiliency
● Deployment automation
NOTE: The remaining four SRE competencies are covered in Course 1: SRE Fundamentals and Security.
This course covers approximately 50% of the content required to prepare for the “IBM Certified Professional SRE - Cloud V2” certification exam.
If you are interested in pursuing the “IBM Certified Professional SRE - Cloud V2” certification, we recommend that you complete all three courses of the Professional Certificate in Site Reliability Engineering (SRE) to improve your chances of passing the exam.
Module 1: Compute infrastructure
You will cover the following topics:
● IBM Cloud service models: IaaS, PaaS, and FaaS
● Troubleshooting VMs on IBM Cloud
● Troubleshooting clusters on IBM Cloud Kubernetes Service
● Troubleshooting clusters on Red Hat OpenShift on IBM Cloud
● Troubleshooting serverless services
Module 2: Networking
You will cover the following topics:
● Applying IBM Cloud networking features
● Implementing and managing virtual networks on IBM Cloud
● Configuring name resolution on IBM Cloud
● Managing performance on IBM Cloud
● Troubleshooting external connections on IBM Cloud
● Troubleshooting interservice connectivity on IBM Cloud
Module 3: Storage and data management
You will cover the following topics:
● Managing storage and data attributes
● Managing storage accounts
● Managing data on IBM Cloud
● Managing data replication and retention
Module 4: Reliability and resiliency
You will cover the following topics:
● Importance of reliability and resiliency for services
● Designing and improving reliability for systems and services
● Designing for failure and recovering from failure
Module 5: Deployment automation
You will cover the following topics:
● Deployment automation
● Implementing Infrastructure as Code
● SRE responsibilities for the CI/CD pipeline