Job Description
Join CVS Health Enterprise Technology and put your engineering skills to use shaping the future of developer platforms for software engineers at Fortune 6 scale. We're seeking a Staff Cloud Engineer to architect and scale the data pipelines for billions of logs, metrics, and traces produced by thousands of workloads. This is a dynamic new initiative with a pioneering business that is transforming health care in the United States by making customer experiences more seamless, convenient, and personalized.
Responsibilities:
- Design and scale data pipelines for logs, metrics, and traces
- Develop custom software to drive the Observability Platform using technologies such as Java Spring Boot, Node.js, and Go
- Help engineers, who are the primary customers of the Observability Platform, troubleshoot issues with their Observability Platform instrumentation
- Implement OpenTelemetry (OTel) client libraries for technologies such as Java Spring Boot, Node.js, and Go
- Participate in the team's 24/7/365 on-call rotation to ensure the health and stability of the Observability Platform
- Manage CI/CD pipelines that deploy and manage Observability Platform infrastructure on Kubernetes
- Deliver an exceptional customer experience by engaging with platform customers as they reach out with support questions
- Create comprehensive documentation for observability tools and technologies
- Watch the watchers by building and managing instrumentation and alerting for the Observability Platform itself to deliver a highly available platform
- Work closely with the SRE team to understand application team challenges in the observability space and identify opportunities to improve the Observability Platform to meet these challenges
- Regularly collaborate with principal engineers, engineering leaders, product managers, and architects across Enterprise Technology to define and deliver on the Observability Platform roadmap
REQUIRED QUALIFICATIONS:
- 10+ years of experience in software engineering and/or site reliability engineering roles
- 10+ years of hands-on development experience with modern microservices using technologies such as Java Spring Boot, Go, and Node.js
- Strong exposure to cloud platforms such as GCP, AWS or Azure
- Strong familiarity with observability patterns and best practices including concepts like SLAs, SLOs, and SLIs
- Experience creating custom dashboards and views to understand system health and availability
- Extensive experience with modern infrastructure tooling like Docker, Kubernetes, Argo CD, Envoy/Istio
- Comfort using the Grafana Labs OSS stack: Loki, Grafana, Tempo, Mimir, and related tools
- Excellent technical communication skills
PREFERRED QUALIFICATIONS:
- Understanding of the OpenTelemetry ecosystem, OTLP, and OTel Semantic Conventions
- Experience designing and scaling distributed systems
- Background in building and operating high-traffic backend services
- Familiarity with popular data-oriented open source technologies like Kafka and Postgres
EDUCATION:
- Bachelor's degree or equivalent experience