Software Engineering Manager II, Site Reliability Engineering Job in Dublin

Software Engineering Manager II, Site Reliability Engineering - Google Inc.
  • Dublin, Leinster, Ireland
  • via BeBee.com
Job Description


Job Summary: Lead a team of engineers to design, build, and maintain large-scale distributed systems, ensuring high availability, scalability, and performance.


As a Software Engineering Manager II, you will own the end-to-end availability and performance of key services and build automation to prevent problem recurrence. You will also lead by example, mentor the team, and establish credibility through quality technical execution.


Key Responsibilities:

  • Lead a team of Software/Systems Engineers on projects for users and be directly responsible for uptime
  • Own end-to-end availability and performance of key services and build automation to prevent problem recurrence
  • Automate response to all non-exceptional service conditions
  • Lead by example, mentor the team, and establish credibility through quality technical execution
  • Manage on-call rotations across continents, using a follow-the-sun model
  • Design, write, and deliver software to improve the availability, scalability, latency, and efficiency of Google's services

Requirements:

  • Bachelor's degree in Computer Science, a related field, or equivalent practical experience
  • 8 years of experience with data structures or algorithms
  • 5 years of experience with software development in one or more programming languages
  • 3 years of people management experience, and experience designing, analyzing, and troubleshooting distributed systems
  • Experience working in computing, distributed systems, storage, or networking
  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems
  • Ability to debug and optimize code and to automate routine tasks
  • Systematic problem-solving approach, coupled with effective communication skills

About the Company:


Google is a global company that combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems.

Our Site Reliability Engineering (SRE) team ensures that Google's services have reliability, uptime appropriate to users' needs, and a fast rate of improvement.

We promote a culture of diversity, intellectual curiosity, problem solving, and openness, and encourage collaboration, thinking big, and taking risks in a blame-free environment.
