
Senior Staff Site Reliability Engineer
At NVIDIA, we're seeking a highly skilled and experienced Senior Staff Site Reliability Engineer to join our dynamic team. As a leader in the AI and technology industry, we are constantly pushing boundaries and innovating to create cutting-edge products. In this role, you will be central to ensuring the reliability and scalability of our systems, which affect millions of users worldwide. We are looking for someone with a strong technical background, exceptional problem-solving skills, and a passion for driving continuous improvement. If you thrive in a fast-paced, collaborative environment and are ready to take on new challenges, we want to hear from you!
- Design and implement highly available and fault-tolerant systems to support millions of users worldwide.
- Monitor and maintain the health and performance of our systems, proactively identifying and resolving any issues.
- Lead incident response and resolution, including root cause analysis and post-incident reviews.
- Collaborate with cross-functional teams to continuously improve system reliability, scalability, and performance.
- Develop and maintain automation tools for deployment, monitoring, and maintenance of systems.
- Stay updated with the latest technologies and industry best practices to drive innovation and improve efficiency.
- Provide technical guidance and mentorship to junior team members.
- Participate in on-call rotations and provide 24/7 support for critical systems.
- Ensure compliance with security standards and policies.
- Create and maintain documentation of systems, processes, and procedures.
- Identify opportunities for process improvement and implement changes to increase efficiency and reduce downtime.
- Communicate effectively with team members, stakeholders, and management to provide regular updates on system performance and reliability.
Extensive experience with infrastructure and software engineering: A Senior Staff Site Reliability Engineer at NVIDIA should have a strong background in both infrastructure and software engineering, with a deep understanding of how these two areas intersect and impact each other.
Proficiency in multiple programming languages: The ideal candidate should be proficient in multiple programming languages, such as Python, Java, and shell scripting, to build and maintain reliable systems and automation tools.
Expertise in cloud computing: As NVIDIA's products are primarily cloud-based, the Senior Staff Site Reliability Engineer should have a deep understanding of cloud computing platforms, such as AWS, Azure, and GCP, and be able to design and optimize infrastructure for these environments.
Strong troubleshooting and problem-solving skills: A Senior Staff Site Reliability Engineer should be able to quickly identify and troubleshoot complex issues in a production environment, using various tools and techniques to resolve them effectively.
Leadership and project management experience: In addition to technical skills, a Senior Staff Site Reliability Engineer should have experience leading and managing projects, as well as mentoring and training junior team members. They should also have excellent communication skills to collaborate with cross-functional teams and stakeholders.
Security
Virtualization
Networking
Scripting
Automation
Cloud Computing
Disaster recovery
Performance tuning
Containerization
Linux/UNIX
Configuration management
Monitoring
Communication
Conflict Resolution
Emotional Intelligence
Leadership
Problem Solving
Time management
Creativity
Attention to detail
Teamwork
Adaptability
According to JobzMall, the average salary for a Senior Staff Site Reliability Engineer in Santa Clara, CA, USA ranges from $170,000 to $200,000 per year. This range can vary based on factors such as experience, education, and specific job duties. Location and company size may also affect the salary range.
NVIDIA Corp. designs and manufactures computer graphics processors, chipsets, and related multimedia software. The company operates through two segments: Graphics Processing Unit and Tegra Processor. The Graphics Processing Unit segment includes sales of the company's GeForce discrete and chipset products that support desktop and notebook PCs, plus license fees from Intel and sales of memory products. The Tegra Processor segment provides processors that deliver a superior visual and multimedia experience on tablets, smartphones, and gaming devices while consuming minimal power.
