Releady
OVERVIEW
This Principal Observability Architect role supports a major Washington-based client in the financial services industry. This institution operates mission‑critical systems across banking, payments, digital channels, and regulatory reporting—requiring extremely high reliability, auditability, and operational transparency.
The architect will lead an enterprise‑wide review of the current observability ecosystem, evaluate tooling effectiveness, identify coverage gaps, and produce a strategic roadmap and architectural recommendations for a unified, scalable observability platform. This includes ensuring alignment with financial industry requirements around uptime, risk controls, audit readiness, operational resiliency, and incident transparency.
A core mandate is to help define how AI/ML can improve incident detection, reduce noise, strengthen root cause analysis, and accelerate recovery in a regulated environment.
This role serves as the client’s principal advisor, guiding engineering, SRE, DevOps, cloud, risk, and operations teams on observability strategy, technology modernization, and enterprise standards.
Duration:
6+ months (not C2C eligible)
Location:
Remote (PST hours; NOT based in California)
Rate:
$90 – $120/hr DOE
Work Authorization:
Must be able to work on W2 without sponsorship
RESPONSIBILITIES
Build a comprehensive architecture assessment outlining gaps, duplication, cost inefficiencies, and modernization opportunities.
Design a target‑state observability architecture that integrates tools across the financial organization (Grafana, Sumo Logic, AppDynamics, New Relic, Prometheus, OpenTelemetry, etc.).
Build executive and operational dashboards in Grafana and related platforms.
Develop cross‑functional observability patterns connecting data sources across tools (Grafana, Sumo Logic, AppDynamics, New Relic, ThousandEyes, etc.).
Create standardized templates for alerting, logging structure, distributed tracing, and SLO/SLI frameworks.
Instrument applications and services into modern observability pipelines, including OpenTelemetry adoption.
Establish and maintain repeatable patterns for engineering team onboarding and telemetry consistency.
Build playbooks and automation for monitoring setup and configuration using tools such as Ansible or similar automation platforms.
Evaluate, recommend, and guide implementation of observability, APM, AIOps, and incident intelligence tooling.
Provide ongoing consulting to teams to ensure observability coverage meets financial‑industry reliability, audit, and risk requirements.
QUALIFICATIONS
Required Skills
10+ years in SRE, DevOps, Observability Engineering, Platform Engineering, or Systems Engineering roles.
3+ years architecting enterprise‑scale observability systems, preferably in regulated or financial environments.
Advanced Grafana expertise — complex dashboards, transformations, templating, SLO/SLI modeling, enterprise integrations.
Strong experience evaluating existing monitoring ecosystems, tooling, and platforms, and producing strategic recommendations.
Deep knowledge of SRE principles—service health modeling, SLIs/SLOs, error budgets, incident analysis, distributed systems.
Experience designing, integrating, and implementing AI/ML into observability or incident response workflows.
Ability to partner with engineering, operations, risk, and compliance teams to gather requirements and translate them into observability architecture.
Strong communication skills—able to clearly explain technical concepts, tradeoffs, and architectural choices to senior stakeholders and executive leadership.
Preferred
Experience with ThousandEyes, AppDynamics, New Relic, Sumo Logic, or similar APM/logging platforms.
Familiarity with OpenTelemetry, Kubernetes, cloud platforms, CI/CD pipelines, and modern software delivery practices.
Experience contributing to organization‑wide observability governance, tool standardization, or telemetry frameworks.
Background in high‑uptime, regulated industries such as banking, payments, fintech, telecom, or enterprise SaaS.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, disability status, or any other non‑merit factor. We are committed to creating a diverse and inclusive environment for all employees.