
Senior SRE (Site Reliability Engineer)
Are you a highly skilled and experienced Site Reliability Engineer looking for a new challenge? Do you have a passion for solving complex problems and ensuring the reliability and scalability of large-scale systems? EPAM Systems is seeking a Senior SRE to join our dynamic and growing team. As a key member of our organization, you will play a vital role in designing and implementing efficient and resilient infrastructure and processes. If you are a proactive, self-motivated individual with a strong background in both software engineering and systems administration, we encourage you to apply for this exciting opportunity.
- Design, develop, and implement efficient and reliable systems and processes to ensure the scalability and availability of large-scale systems.
- Collaborate with cross-functional teams to identify and resolve complex technical issues related to system reliability.
- Proactively monitor and analyze system performance, identifying potential issues and implementing solutions to improve overall system reliability.
- Continuously improve and optimize systems and processes to increase efficiency and reduce downtime.
- Develop and maintain automation tools and scripts to streamline processes and improve system reliability.
- Stay updated on industry trends and best practices related to systems reliability and scalability.
- Communicate with key stakeholders to report on system performance and provide recommendations for improvement.
- Lead and mentor junior team members, sharing knowledge and best practices.
- Participate in on-call rotation and respond to critical incidents in a timely and efficient manner.
- Ensure compliance with security and regulatory standards related to system reliability.
- Participate in the design and implementation of disaster recovery and business continuity plans.
- Identify areas for improvement in existing systems and processes, and propose and implement solutions.
- Collaborate with cross-functional teams to develop and implement new features and enhancements.
- Conduct regular system reviews and audits to ensure compliance with established standards and protocols.
- Act as a subject matter expert and provide technical guidance to team members and other departments as needed.
Extensive experience in managing and maintaining large-scale production systems, including troubleshooting, monitoring, and performance tuning.
Expertise in automation and configuration management tools such as Ansible, Puppet, or Chef.
Proficiency in at least one programming language, such as Python, Java, or Go, for creating and maintaining scripts and tools.
Strong knowledge of cloud computing platforms, such as AWS, Azure, or GCP, and experience with containerization technologies like Docker and Kubernetes.
Proven track record of designing and implementing highly available and scalable systems, with a deep understanding of networking, security, and database concepts.
Change Management
Security
DevOps
Scripting
Continuous Integration
Automation
Cloud Computing
Performance optimization
Capacity planning
Incident response
Monitoring
Infrastructure as Code
Communication
Conflict Resolution
Emotional Intelligence
Leadership
Time management
Creativity
Critical thinking
Teamwork
Adaptability
Problem-Solving
According to JobzMall, the average salary for a Senior SRE (Site Reliability Engineer) in Bengaluru, Karnataka, India ranges from ₹1,500,000 to ₹2,500,000 per year, or approximately $21,000 to $35,000 USD per year. This range can vary with years of experience, the specific company, and skills.
EPAM Systems, Inc. is a US company that specializes in product development, digital platform engineering, and digital and product design.

