iXceed Solutions

Site Reliability Engineer

iXceed Solutions, Germantown, Ohio, United States


Platform Engineering & DevOps: Manage Kubernetes and container orchestration, including Helm chart configurations and CI/CD pipelines (Jenkins, ArgoCD). Develop automation scripts (Python, Bash, Go) and deploy Infrastructure-as-Code (IaC) solutions.

Observability, Monitoring & Visualisation: Maintain Prometheus solutions (scrape configurations, alert rules, PromQL queries) and administer Thanos and Grafana.

Elastic Stack Operations & Log Management: Configure and optimise Elasticsearch clusters, Logstash pipelines, and Kibana dashboards for secure, scalable log processing.

Incident Response, Troubleshooting & Collaboration: Participate in 24x7 on-call rotations for rapid incident response, troubleshoot platform, data, and performance issues, and engage in Major Incident Management (MIM).

Secure Operations & Compliance: Ensure system operations meet security and data protection requirements, maintain secure documentation, and manage access control policies.

Requirements: Strong grasp of Linux concepts, preferably in Kubernetes environments. Solid understanding of networking fundamentals and REST APIs. Proficiency in Python, Go, or Bash. Proficiency in Git-based configuration management workflows. Familiarity with CI/CD tools like Helm, Jenkins, or ArgoCD. Experience with Elasticsearch and/or OpenSearch. Willingness to work shift-based 24x7 on-call support, including weekends and holidays. Must reside in Germany and hold a German labor contract, or be ready to relocate to Germany.

Preferred Certifications: Elastic Certified Engineer, LPIC Level 2, Kubernetes Administrator.

Elastic Stack (Elasticsearch, Logstash, Kibana), Grafana

Seniority level: Mid-Senior level

Employment type: Full-time

Job function: Information Technology

Industries: Staffing and Recruiting
