The Judge Group
About the Role:
Our client is seeking a Site Reliability Engineer (SRE) with deep expertise in monitoring, debugging, and optimizing Azure App Services. This role is critical in ensuring our platforms remain reliable, performant, and scalable as we continue to grow.
You'll combine hands-on Azure experience with code-level debugging, observability best practices, and automation to prevent issues before they occur, drive down MTTD/MTTR, and deliver an exceptional experience for patients and providers. If you thrive at the intersection of infrastructure, development, and performance, this is the role for you.
What You'll Do:
Monitoring & Debugging
- Design, implement, and fine-tune monitoring systems for Azure-based applications.
- Build custom dashboards with Azure Application Insights, Azure Monitor, and related tools.
- Analyze logs, metrics, and traces to proactively troubleshoot performance and reliability issues.
- Apply proficiency in C#, .NET, Angular, and SQL for code-level debugging and issue resolution.
Azure App Service Expertise
- Optimize application performance through a deep understanding of Azure App Service architecture.
- Configure, manage, and scale App Service environments for multiple applications.
Azure Tooling & Automation
- Leverage Diagnose and Troubleshoot Tools, Kudu, and PowerShell scripting to resolve application and infrastructure issues.
- Automate monitoring, alerting, and remediation workflows to improve reliability and reduce toil.
Application Performance Monitoring
- Use tools like Grafana, Prometheus, or other APM platforms to optimize system health and application performance.
- Stay adaptable and quickly learn new monitoring tools and frameworks as needed.
Collaboration & Communication
- Partner closely with developers and operations to design effective monitoring solutions.
- Document and communicate findings, solutions, and RCA reports with clarity and impact.
What We're Looking For:
- Bachelor's degree in Computer Science, IT, or a related field.
- Microsoft Azure Fundamentals (AZ-900) certification required.
- Proven SRE experience with a focus on monitoring, debugging, and incident response.
- Extensive hands-on work with Azure App Services, Application Insights, and Azure Monitor.
- Skilled with Diagnose and Troubleshoot Tools, Kudu, and PowerShell scripting.
- Strong programming fundamentals with the ability to read and troubleshoot .NET/C# and Angular code.
- Experience in on-call operations, incident response, and RCA writing.
- Bonus: Experience with Grafana/Prometheus, DataDog/Dynatrace, Azure Front Door, CDN, Function Apps, WebJobs, Service Bus, or Event Hub.
- Excellent communication, collaboration, and problem-solving skills.
- Azure certifications are a strong plus.