Site Reliability Engineer (SRE)
- Software Engineering
- Professional
Introduction
At IBM, work is more than a job – it’s a calling: To build. To design. To code. To consult. To think along with clients and sell. To make markets. To invent. To collaborate. Not just to do something better, but to attempt things you’ve never thought possible. Are you ready to lead in this new era of technology and solve some of the world’s most challenging problems? If so, let’s talk.
Your Role and Responsibilities
- Troubleshoot, monitor, and support critical production systems.
- Perform root cause analysis and manage incidents to ensure timely resolution.
- Provision and deploy environments in a cloud infrastructure (preferably IBM Cloud).
- Handle initial intake for Salesforce-related customer cases, ensuring SLA commitments are met.
- Provide on-call support in a rotation shared with global team members (including Poland), with the goal of minimizing Mean Time to Recovery (MTTR).
- Manage workloads and resources to maintain commitments and prevent SLA breaches.
Required Technical and Professional Expertise
- Strong working knowledge of Kubernetes and cloud infrastructures, with a preference for IBM Cloud (1-3 years).
- Expertise in administration, configuration, and management of MS SQL Server 2022 (1-3 years).
- Expertise in automation platforms such as AWX.
- Proficiency in scripting languages like Python and related tools.
- Strong problem-solving skills and attention to detail.
Preferred Technical and Professional Expertise
- Proven experience providing on-call support for critical production systems, with a focus on root cause analysis (RCA).
- Familiarity with Salesforce infrastructure and case management processes.
- Experience with monitoring tools and incident management platforms.
- Ability to work efficiently in a global, distributed team environment.