Experienced Site Reliability Engineer (SRE) or DevOps engineer sought to shape the future of cloud platform services, joining a new team at an early stage and making a significant impact on systems and services offered to customers and partners in EMEA and beyond.
Key Responsibilities:
- Designing, implementing, and maintaining robust, scalable, high-quality software and systems within the SRE domain.
- Managing and supporting in-house development systems, CI/CD pipelines, tools, monitoring, and alerting, with a focus on automation to streamline activities and reduce toil.
- Collaborating closely with developers and architects to ensure designed solutions meet non-functional requirements such as availability, performance, security, and maintainability.
- Leading incident management, post-mortem reviews, and continuous improvement initiatives, contributing to the evolution of processes and systems within the organization.
- Defining key metrics and technical decisions driving products and their delivery, including SLOs and SLAs, architecture, best practices, and cost optimization.
- Taking responsibility for complex project tasks, striving for higher standards of individual and team performance.
- Building relationships external to the team and driving knowledge sharing across all products.
- Identifying personal development opportunities, setting goals, and delivering impactful contributions, while also mentoring and training other engineers throughout the company.
Requirements:
- 3-5 years of experience using the AWS platform and its services.
- Proficiency with common tools and technologies used within CI/CD and build pipelines, including Terraform, Jenkins, GitLab, Nexus, Ansible, Maven, Docker, and Helm.
- A BSc in an IT-related field or equivalent professional experience in cloud operations and/or cloud platforms in a DevOps engineering or SRE role.
- Familiarity with commonly used operating systems including Ubuntu and Windows, along with scripting experience in Bash, Python, or PowerShell.
- Experience troubleshooting issues in a cloud environment and working with multiple teams to facilitate orderly project and release plans.
- Familiarity with VMware vSphere (ESXi, vCenter) desirable.
Desirable Skills:
- Extensive experience across a wide range of AWS services, including compute, containerization, storage, database, networking, automation, IAM, security, monitoring, logging, backup, and configuration management.
- Previous experience in an SRE team and understanding of SRE principles.
- Experience in backup and restore processes, cloud-based multi-tenancy, emergency response, and on-call duties.
- Understanding of current best practices around security management, patching, branching strategy, release management, and Linux administration.
- Experience working in an agile development team using Scrum.