Cloud Operations


What happens once your software is actually running in production? It has to stay up and running, with little or no downtime.

This module teaches you how to perform operations in your on-premises private cloud and in the public cloud.


This module covers installing, configuring, automating, monitoring, securing, maintaining, and troubleshooting the services, networks, and systems needed to support business applications.

It includes specific Kubernetes and AWS features and tools, along with best practices for using them.


By the end of this course, you'll be able to:

  • Identify the important aspects of being an Ops engineer.
  • Explain why availability, scalability, monitoring, logging, tracing, and security are important to Ops.
  • Integrate your code continuously and automate its deployment (DevOps).
  • Use Ansible in DevOps environments to deploy and manage software and configurations.
  • Build and expose Kubernetes applications.
  • Manage Kubernetes objects such as Ingress, storage, ConfigMaps, and Secrets.
  • Locate and analyze critical pod log files in your Kubernetes clusters, and create a centralized logging system for Kubernetes with an EFK (Elasticsearch, Fluentd, and Kibana) stack.
  • Monitor your Kubernetes clusters and objects with tools like Prometheus.
  • Add service meshes to a Kubernetes cluster to help solve the growing complexities of connectivity, security, and observability.
  • Explore hands-on service mesh examples with Istio.
  • Use most of the features in Helm, a package manager for Kubernetes.
  • Discover Kustomize, a new declarative approach to configuration customization.
  • Use the Git version control system to organize and manage your Kubernetes infrastructure and applications (GitOps).
  • Use the AWS Command Line Interface (AWS CLI) and understand related administration and development tools.
  • Manage, secure, and scale configurations, compute instances, storage and databases on AWS.
  • Monitor the health of your infrastructure with services such as Amazon CloudWatch, AWS CloudTrail, and AWS Config.
  • Manage resource consumption in an AWS account by using tags, Amazon CloudWatch, and AWS Trusted Advisor.
  • Create and configure automated and repeatable deployments with tools such as Amazon Machine Images (AMIs) and AWS CloudFormation.
  • Set up, manage, and tear down complex infrastructure environments with Terraform.
  • Understand the principles of site reliability engineering (SRE).
  • Explain the differences between the SRE and DevOps approaches.

Responsible person:
Prof. Metzger Laurent
Location (offered):
Recommended modules:
Standard module for Computer Science STD_14 (recommended semester: 4)
Standard module for Computer Science STD_21 (recommended semester: 6)
Standard module for Computer Science Retro STD_14_UG (recommended semester: 4)
Standard module for Cyber Security STD_14 (PF)

Courses in this module