Oliver Bernard
Site Reliability Engineer (SRE) | Spain - Remote | AWS, Kubernetes, Observability (Prometheus & Grafana), Terraform, Python/Go, CI/CD | €70-94K
Overall
A high‑growth global payments platform is looking for a Site Reliability Engineer to help design, build, and maintain centralised observability capabilities. You’ll work across mission‑critical, large‑scale systems used by major global brands, ensuring reliability, performance, and effective telemetry across the organisation.
Key Responsibilities
Design, implement, and maintain observability pipelines (logs, metrics, traces) using OpenTelemetry.
Build self‑service tooling and automation enabling engineering teams to instrument and monitor their services.
Contribute to incident management, owning processes, runbooks, and response automation.
Partner with product and engineering teams to define monitoring, alerting, and SLO/SLA requirements.
Use IaC to provision and manage observability infrastructure and alerting configurations.
Establish baseline observability standards for new and existing services.
Continuously improve alert quality, signal‑to‑noise ratio, and operational reliability.
Core Tech & Skills
Strong cloud experience with AWS
OpenTelemetry (collectors, pipelines, instrumentation)
Observability tooling: Grafana, Prometheus, Loki, Datadog, New Relic
Terraform + GitOps CI/CD (ArgoCD, GitHub Actions, similar)
Incident tooling: PagerDuty, Jira
Scripting: Python, Go (or similar)
Strong experience in SRE, DevOps, or observability‑focused engineering roles (4+ years)
Seniority Level
Mid‑Senior level
Employment type
Full‑time
Job Function
Engineering, Finance, and Information Technology
Industries
Financial Services, Staffing and Recruiting, and IT Services and IT Consulting