TechDigital Group

Production Operations Support Engineer

TechDigital Group, Nashville, Tennessee, United States, 37247

Overview

The customer is seeking a Production Operations Support Engineer to provide end-to-end support for its Direct Sales Program. This role is critical to the smooth day-to-day production operation of the Field Rep App (FRApp) and its associated backend services, including monitoring, troubleshooting, and performance optimization across multiple integrated systems.

The ideal candidate will have strong debugging skills, experience with modern application stacks, and a proven ability to work across UI, middleware, and backend services in a production environment.

Key Responsibilities

Provide L2/L3 production support for the Direct Sales Program, ensuring high availability, reliability, and performance of FRApp and its integrated systems.

Troubleshoot and debug issues across multiple layers of the stack:

UI: React Native with TypeScript

Backend-for-Frontend (BFF): Spring Boot microservices

OMS Services: Spring Boot microservices deployed on Azure Kubernetes Service (AKS)

Sterling OMS: Request/response analysis, order flow troubleshooting

Monitor and analyze Quantum sessions, Splunk logs, and Dynatrace dashboards to proactively identify, diagnose, and resolve incidents.

Partner with development teams to perform root cause analysis (RCA), recommend fixes, and implement long-term solutions to recurring production issues.

Collaborate with infrastructure and DevOps teams to ensure proper monitoring, alerting, and scaling strategies are in place for production workloads.

Ensure SLAs for incident response and resolution are consistently met.

Provide clear and timely communication with stakeholders, including field representatives, IT leadership, and business users, regarding issue status and resolution.

Contribute to the creation of knowledge base articles, playbooks, and standard operating procedures for production support.

Required Skills & Experience

Strong experience in production support or application operations in a multi-system enterprise environment.

Hands-on experience in troubleshooting:

React Native (TypeScript) UIs

Spring Boot microservices (BFF and OMS services)

Sterling OMS request/response analysis and order flows

Familiarity with AKS (Azure Kubernetes Service) and containerized deployments.

Proficiency with monitoring/logging tools:

Dynatrace (APM monitoring and troubleshooting)

Splunk (log analysis and query creation)

Quantum (session analysis)

Strong analytical and problem-solving skills with the ability to diagnose issues across distributed systems.
