Program Director, SRE

Posted on October 9, 2025

Job details

  • Profession: Uncatalogued Profession

  • Country of the Job: Ireland

  • State of the Job: IRL

  • City of the Job: Dublin

  • Job Application Deadline (Year): 2025

  • Type of job: Not specified. See Job Description

  • Job salary amount given (annually): Not specified. See Job Description

  • Hiring Company: IBM

  • Mode of Work: Onsite

  • Applier's country: Ireland

  • Benefits Included: Disability protection

Introduction

A career in IBM Software means you’ll be part of a team that transforms our customers’ challenges into solutions.

Seeking new possibilities and always staying curious, we are a team dedicated to creating the world’s leading AI-powered, cloud-native software solutions for our customers. Our renowned legacy creates endless global opportunities for our IBMers, so the door is always open for those who want to grow their career.

IBM’s product and technology landscape includes Research, Software, and Infrastructure. Entering this domain positions you at the heart of IBM, where growth and innovation thrive.

Your role and responsibilities

As a Site Reliability Engineering (SRE) Program Director, you will play a pivotal role in leading and driving the SRE program within our organization. You will be responsible for ensuring the reliability, scalability, and performance of the systems and applications that support IBM Software SaaS offerings. The successful candidate will have a strong technical background, exceptional leadership skills, and a proven track record of implementing and optimizing SRE best practices in SaaS environments.

Key Responsibilities:

  • Lead the SRE program strategy and execution across multiple SaaS offerings

  • Drive reliability engineering practices to ensure high availability and performance of services

  • Collaborate with engineering, product, and operations teams to embed SRE principles into the software development lifecycle

  • Oversee incident management processes, including root cause analysis and continuous improvement

  • Champion automation, observability, and proactive monitoring across systems

  • Guide the adoption of container orchestration and infrastructure-as-code practices

  • Mentor and grow a high-performing, globally distributed SRE team

Required technical and professional expertise

  • Proven experience in a leadership role within Site Reliability Engineering, with a focus on supporting SaaS and/or PaaS solutions

  • Proficient understanding of cloud computing platforms (e.g., IBM Cloud, AWS, Azure, GCP) and infrastructure as code

  • In-depth knowledge of system architecture, networking, and security principles

  • Strong experience with incident management, post-incident analysis, and root cause analysis in a multi-tenant SaaS context

  • Expertise in implementing and managing container orchestration platforms (e.g., Kubernetes) for multi-tenant environments

Preferred technical and professional experience

  • Certification in Site Reliability Engineering or related field

  • Excellent communication skills and the ability to collaborate effectively with cross-functional teams

  • Demonstrated success in leading SRE transformations within organizations, particularly in the context of SaaS platforms

IBM is committed to creating a diverse environment and is proud to be an equal-opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, gender, gender identity or expression, sexual orientation, national origin, caste, genetics, pregnancy, disability, neurodivergence, age, veteran status, or other characteristics. IBM is also committed to compliance with all fair employment practices regarding citizenship and immigration status.