Senior Site Reliability Engineer
Jobgether · Ireland
Job description
About the role
We are looking for a Senior Site Reliability Engineer to join a fast‑growing technology team building highly scalable, cloud‑native SaaS platforms. The role is remote‑first and offers a mix of hands‑on engineering, strategic impact, and mentorship in a modern, collaborative environment.
Key responsibilities
- Drive architecture and continuous improvement of cloud infrastructure and Kubernetes environments for high availability and scalability.
- Define and implement resilience strategies, including disaster recovery, rollback mechanisms, zero‑downtime deployments, and global scaling.
- Enhance observability frameworks and monitoring systems to ensure platform reliability and operational transparency.
- Improve Infrastructure as Code and self‑service platform capabilities to reduce operational overhead.
- Lead major incident management, coordinate post‑mortems, and implement long‑term reliability improvements.
- Mentor engineers within the Platform Squad through technical guidance and knowledge sharing.
- Collaborate with cross‑functional teams to shape platform roadmaps, architectural standards, and infrastructure strategies.
- Contribute to CI/CD pipeline enhancements, GitOps practices, and automation initiatives across the organization.
Required profile
- Minimum 5 years of hands‑on experience in SRE, Platform Engineering, DevOps, or Cloud Infrastructure.
- Proven experience building and operating high‑throughput, highly available production systems at scale.
- Deep expertise with Kubernetes environments on major cloud platforms.
- Strong experience with observability and monitoring tools such as Prometheus, Grafana, Loki, ELK, or Mimir.
- Solid programming skills in Go or Python, with a focus on infrastructure tooling and automation.
- Hands‑on experience with IaC tools such as Terraform, Pulumi, or OpenTofu, and with GitOps frameworks such as ArgoCD.
- Strong understanding of CI/CD pipelines, reliability engineering principles, SLIs, SLOs, and error‑budget methodologies.
Required skills
- Kubernetes
- Major cloud platforms (AWS, GCP, Azure)
- Prometheus
- Grafana
- Loki
- ELK
- Mimir
- Go
- Python
- Terraform
- Pulumi
- OpenTofu
- ArgoCD
- GitOps
- CI/CD pipelines
Published 1 week ago
Expires 1 month from now