Welcome to Red Hat! As a global leader in open source technology, we are constantly pushing the boundaries of innovation and driving industry-wide change. We are seeking a highly skilled and motivated Manager of Site Reliability Engineering to join our team and help us maintain and improve the reliability of our products and services. In this role, you will have the opportunity to lead and develop a team of talented engineers, while working closely with cross-functional teams to ensure our systems and applications are running at peak performance. If you have a passion for problem-solving, a strong technical background, and a desire to make a meaningful impact, we would love to hear from you.
- Manage a team of site reliability engineers and provide guidance, mentorship, and leadership to ensure team success.
- Oversee the day-to-day operations of our systems and applications, ensuring high availability, scalability, and reliability.
- Collaborate with cross-functional teams including developers, operations, and support to identify and resolve issues and improve overall system performance.
- Develop and implement strategies and processes to optimize system reliability and performance.
- Conduct regular performance reviews and provide feedback and coaching to team members to foster their professional growth and development.
- Proactively identify potential issues and risks, and develop contingency plans to mitigate them.
- Stay up-to-date with industry trends and best practices in site reliability engineering, and implement them to improve our processes and systems.
- Collaborate with other managers and leaders to drive company-wide initiatives and promote a culture of innovation and continuous improvement.
- Act as a liaison between the site reliability engineering team and other departments to ensure effective communication and alignment of goals and priorities.
- Promote a positive and inclusive work environment that values diversity, collaboration, and knowledge sharing.
Extensive experience in managing large-scale production environments, preferably in a cloud-based or distributed systems environment.
Strong knowledge of software development and automation tools, such as Ansible, Puppet, or Chef, as well as experience with version control systems like Git.
Proven track record of implementing and managing highly available and scalable systems, with a deep understanding of reliability and monitoring practices.
Excellent leadership and communication skills, with the ability to lead and mentor a team of engineers and collaborate effectively with cross-functional teams.
In-depth understanding of Linux and open source technologies, with a strong background in scripting and programming languages like Python, Java, or Go.
Project Management
Leadership
DevOps
Agile Methodology
Automation
Cloud Computing
Collaboration
Problem-Solving
Technical Expertise
Infrastructure as Code
Continuous Integration/Continuous Delivery (CI/CD)
Communication
Conflict Resolution
Time management
Attention to detail
Teamwork
Adaptability
Decision-making
According to JobzMall, the average salary range for a Manager, Site Reliability Engineering is $130,000 to $170,000 per year. The figure varies with the company, the location, and the candidate's experience level. Some companies also offer stock options or performance bonuses, which increase total compensation.
Red Hat, Inc. is an American multinational software company providing open-source software products to the enterprise community. Founded in 1993, Red Hat has its corporate headquarters in Raleigh, North Carolina, with other offices worldwide. It became a subsidiary of IBM on July 9, 2019.