SS&C Technologies

Senior Observability Platform Engineer

SS&C Technologies, Chicago, Illinois, United States, 60290

Overview

Senior Observability Platform Engineer at SS&C Technologies. Join a leading financial services and healthcare technology company with a global presence and a large engineering community. The role focuses on building and maintaining an observability stack to monitor system metrics, database performance, network health, and message queues across cloud, on-premises, and private cloud environments.

Responsibilities

Design, develop, implement, and maintain our comprehensive observability stack, including tracing, telemetry, logging, health monitoring, visualization, and dashboards to ensure reliability, performance, and operational efficiency.
Design and implement a robust observability framework using composable open source solutions such as Prometheus, Alertmanager, OpenTelemetry, Grafana, Loki, Promtail, Tempo, Thanos, ELK stack, Zabbix, and similar tools.
Develop and maintain health monitoring and alerting systems for compute platforms, databases, network infrastructure, and Kubernetes-based platforms (including GPU-supported environments).
Create and manage visualization dashboards to monitor system performance, resource utilization, and operational health.
Implement scalable, distributed logging and tracing solutions to diagnose, troubleshoot, and resolve system issues effectively.
Collaborate with development and operations teams to integrate observability practices into the development lifecycle.
Conduct performance analysis and optimization to ensure system reliability and efficiency.
Stay updated with the latest trends and technologies in observability and performance monitoring.
Collaborate with cross-functional teams (Cloud Engineering, Network, and DevOps/Solutions Engineering) to troubleshoot and resolve infrastructure issues.

Preferred Qualifications

Proven experience in observability, system and network monitoring, and system performance analysis in cloud or data center environments.
Expertise in implementing and managing observability tools and technologies, including Prometheus, Alertmanager, OpenTelemetry, Grafana, Loki, Promtail, Tempo, Thanos, ELK stack, Zabbix, and similar tools.
Hands-on experience with Kubernetes.
Experience with infrastructure-as-code and configuration management tools (e.g., Terraform, Consul, GitHub, SaltStack).
Proficiency in scripting and automation (Go, Python, Shell).
Excellent problem-solving skills and the ability to work independently or in a team.
Strong communication skills and the ability to work in a fast-paced, dynamic environment.

Educational Qualifications

Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.

Additional Information

SS&C Technologies is an Equal Employment Opportunity employer and does not discriminate on the basis of race, color, religious creed, gender, age, marital status, sexual orientation, national origin, disability, veteran status, or any other classification protected by applicable discrimination laws. SS&C offers health, dental, 401(k), and tuition and professional development reimbursement. Unsolicited resumes from headhunters, recruitment agencies, or fee-based recruitment services will not be accepted unless explicitly requested.
