Convergenz

Operations Center Analyst (Tier 3)

Convergenz, Washington, District of Columbia, US, 20022

The Integrated Operation Center (IOC) Analyst is essential to the success of the newly established Operations Center. This role provides 24/7 monitoring, rapid incident response, and cross-functional coordination to ensure service stability and minimize downtime of IT-delivered services. With increasing demands on IT services, the position is critical for maintaining operational readiness, improving service delivery, and supporting the mission.

The contractor shall provide qualified personnel to perform work in the following areas:
• Monitor IT systems, infrastructure, and applications to detect issues
• Respond to incidents and escalate complex problems when necessary
• Use various tools to identify and diagnose performance problems
• Write and disseminate alert notifications to stakeholders
• Write and contribute to post-incident analysis and documentation
• Technical skills in monitoring and observability tools
• Strong problem-solving skills in various disciplines
• Ability to use business tools to create business workflow automation and reports
• Knowledge of ITIL and ITSM processes
• Ability to diagnose and troubleshoot network, server, application, and database issues at a Tier 3 level
• Ability to interact professionally with executive-level customers and management in resolving technical problems on an emergency basis

Specific Responsibilities include:

Incident Monitoring and Detection:

Identify opportunities to improve operational workflows and system monitoring effectiveness for the IOC.

Alert Management: Respond to alerts generated by monitoring tools (e.g., Nagios or Dynatrace) to assess and triage their severity, categorization, and impact on business operations.

Monitor System Health: Continuously monitor IT systems, infrastructure, and applications to identify performance issues, downtime, or anomalies that could lead to incidents.

Incident Response and Troubleshooting:

Initial Incident Response: For incidents that are detected or reported, respond promptly to Tier 1 customer-reported issues, assessing the problem to determine if escalation is required.

Incident Classification: Categorize and prioritize incidents according to their impact and urgency, in line with ITIL and ITSM guidelines and IOC policies and procedures.

Escalation Management: Follow established escalation protocols when incidents cannot be resolved quickly, ensuring that issues are forwarded to the right department or expertise.

Communication with Stakeholders: Notify relevant stakeholders (internal teams, management, customers) of ongoing incidents, potential impacts, and updates on resolution progress.

Escalation Documentation: Ensure that all escalations are documented correctly in the problem management system (e.g., BMC Helix) for tracking, analysis, and reporting.

Coordination with Tier 1 Technology Service Desk: Provide guidance and support to Tier 1 for proper incident prioritization, ensuring they escalate complex issues when necessary, and assist with troubleshooting and vetting widespread user incidents.

Knowledge Articles: Create, update, and revise knowledge articles pertaining to internal and external processes used for incident response and escalation.

Observability and Continuous Monitoring:

Implement Monitoring Tools: Utilize and configure observability dashboards for the Operations Center (e.g., Dynatrace) to gain deeper visibility into application, network, and infrastructure performance.

Proactive Monitoring: Monitor the environment proactively to detect potential performance bottlenecks or security vulnerabilities that could escalate into major incidents.

Documentation of Incidents: Maintain detailed records of incidents and resolutions to ensure knowledge is shared across teams and to improve future response times.

Reporting and Documentation:

Incident Reporting: Regularly report incident trends, recurring issues, and system health to management for continuous improvement.

Shift Handover: Document incident status, ongoing issues, and relevant information during shift handover to ensure continuity of operations.

Watch Commander: Assume duties serving as Watch Commander for HIROC activations at the discretion of the IOC Manager.

Required Minimum Qualifications:

• US Citizenship
• Excellent analytical, problem-solving, and communication skills
• Strong understanding of IT operations, infrastructure, and service delivery practices
• Familiarity with enterprise monitoring tools
• Ability to work in a 24/7 operational environment
• Experience with ticketing systems and incident management tools
• Familiarity with knowledge management
• Experience with Power BI and Power Automate required

Education Requirements

• Related IT certifications preferred but not required
• Related college degree preferred but not required
• ITIL qualification preferred
• CompTIA certification preferred but not required
• Formalized training in the technology being supported
• Certification in a technical domain is a plus:
  o Networking
  o Monitoring
  o Microsoft products

Technical Experience Requirements

Technical Skills: Proficiency in monitoring tools (e.g., Nagios, Dynatrace), observability platforms (e.g., Dynatrace), and incident management tools (e.g., ServiceNow, Jira, BMC Helix).

Knowledge of ITIL and ITSM Processes: Strong understanding of ITIL processes for incident, problem, change, and service level management.

Troubleshooting Skills: Ability to diagnose and troubleshoot network, server, application, and database issues at a Tier 3 level.

Communication Skills: Clear and concise communication, especially under pressure when liaising with other teams or stakeholders.

Incident Escalation and Prioritization: Ability to determine incident severity and impact, escalating efficiently based on predefined procedures.

Problem-Solving Skills: Strong analytical skills for root cause analysis and resolution of recurring issues.

Technical Competence: Creating reports, analyzing data, and using Microsoft Power Platform tools such as Power Automate for business process workflows and Power BI for reporting operational status and metrics.