MANPOWER STAFFING SERVICES (SINGAPORE) PTE LTD
Lead Engineer – L2 Support & Infrastructure Operations
MANPOWER STAFFING SERVICES (SINGAPORE) PTE LTD, West Islip, New York, United States
Responsibilities:
Operational Support
Lead and coordinate Level 2 support operations for mission-critical applications and infrastructure
Provide troubleshooting and diagnostics for incidents escalated from Level 1
Ensure adherence to SLAs and system availability targets
Application Support
Lead the resolution of application incidents escalated from Level 1; perform root cause analysis and provide workarounds where possible
Lead the monitoring of application logs and integration points such as REST APIs, message queues and file-based transfers
Liaise with Level 3 to resolve complex application issues and escalate bugs or enhancement requests
Lead the support and maintenance of job schedulers, interface configurations and integration points
Lead the documentation of known issues, resolution procedures and rollback steps in the knowledge base
Incident & Problem Management
Act as incident manager for P1/P2 issues
Coordinate resolution and communications
Perform root cause analysis and recommend permanent fixes
Escalate unresolved issues that require code changes to Level 3 or engineering teams
Change Management
Perform operational impact assessment
Participate in the Change Advisory Board (CAB) to review and approve changes
Carry out pre-change preparation, such as reviewing the Change Request and Release Plan
Supervise post-change production verification
Documentation update and knowledge transfer
Post change review and feedback
Patch Management
Perform patch management readiness
Stakeholder coordination and team coordination
System Readiness and Post-Patch Validation
Documentation update and knowledge transfer
Compliance and audit readiness
Documentation and Compliance
Operational documentation: SOPs, incident response checklists, RCA, PIR, monitoring and alert guidebook
Configuration and infrastructure documentation: system configuration baselines, application dependency maps, and environment inventories such as hosts, services and accounts
Knowledge base articles for Level 2 enablement and faster resolution, e.g. known errors and fixes, frequent how-to guides, script repositories, lessons learned
Configuration Management
Validate the accuracy of configurations
Maintain readiness of operational documentation
Perform audits to confirm configuration compliance
CMDB asset verification
Change-linked configuration tracking
Ensure environment consistency across DEV, IVVQ, ISO-PROD, UAT and PROD
Testing and Verification
Ensure operational readiness testing is completed before production rollout
Coordinate post-change verification
Perform regression and sanity tests following patching or upgrades, in UAT and PROD
Participate in user acceptance testing
Knowledge Management
Documentation of resolution
Knowledge Base Contribution
Validation of knowledge
Subject Matter Expertise Sharing
Root Cause Analysis
Gather logs and system metrics from the time of failure
Reproduce issues in a controlled environment to understand the conditions under which they occur
Determine the scope and severity in terms of the systems affected, downtime duration and business impact
Narrow down the possible sources of the failure
Use diagnostic tools to analyse application behaviour
Correlate events to sequence the chain of events leading up to the failure and identify dependencies
Leadership
Supervise and guide Level 2 engineers on change requests and service requests
Lead and manage day-to-day operations of Level 2 support
Track and report Level 2 key performance indicators such as resolution rate, mean time to resolve and system availability
Drive process and quality improvement: document known issues, troubleshooting steps and standard operating procedures, and propose improvements to incident handling
Identify tools and systems to streamline Level 2 support operations
Requirements:
Education and Experience
Bachelor's degree in Information Technology, Computer Science, Engineering, or a closely related discipline
At least 5 years in Level 2 support for mission-critical 24x7 production systems, preferably in the public sector
At least 2 years in a team lead or supervisory role, coordinating tasks and mentoring junior engineers
Proven experience in handling P1/P2 incidents, managing post-incident reviews (PIRs) and root cause analysis
Certification in Red Hat Enterprise Linux or Kubernetes preferred
Knowledge/Skills
Operating Systems: RHEL (90%) and Windows Server (10%)
Networking Fundamentals
Middleware & Infrastructure (Web Server: Nginx; App Servers: Kubernetes with containers (Docker + Spring Boot))
Message Queues (IBM MQ, Kafka)
Java, C#, MQTT, Golang
Database (SQL Server, PostgreSQL)
ITIL/ITSM Process Knowledge
Security Awareness
DR and HA concepts
Technical Skills
Leadership & Coordination
Communication & Collaboration
Operational Governance