Yochana
Role: Cloud Infra SME (Terraform & GKE)
Location: Plano, TX-Remote
Role Summary: The Cloud Administrator SME (Onshore) is a customer-facing technical expert responsible for the 24x7 operational health, L2 incident management, and onsite support of our hybrid, multi-cloud infrastructure. That infrastructure supports mission-critical healthcare systems, such as EPIC Electronic Health Record (EHR) workloads, across Azure, GCP, and OCI. The role requires hands-on expertise in Infrastructure as Code (IaC) and container orchestration (GKE), strict compliance with HIPAA and other healthcare regulations, and a strong focus on high-touch coordination during major hospital events.
Key Responsibilities:
- Serve as the primary L2/L3 Onshore technical expert for all cloud infrastructure and EPIC workload incidents, ensuring rapid diagnosis and resolution to minimize impact on clinical operations.
- Lead communications during active incidents, providing clear, concise, and professional updates to hospital IT teams, clinical stakeholders, and internal leadership.
- Perform onsite, hands-on support during critical periods, including EPIC go-lives, major version upgrades, patching windows, and disaster recovery drills.
- Coordinate directly with EPIC technical staff, hospital IT teams (network, security, application), and vendors to resolve complex cross-functional dependencies.
- Perform daily administration, monitoring, and proactive remediation of IaaS/PaaS services across Azure, GCP, and Oracle Cloud Infrastructure (OCI) supporting EPIC environments.
- Manage the deployment, scaling, health, and operational troubleshooting of containerized EPIC components or supporting microservices on Google Kubernetes Engine (GKE).
- Perform resource provisioning, capacity monitoring, and infrastructure tagging for cost management and chargebacks across all multi-cloud tenants.
- Maintain and validate IAM, backup, and geo-redundant Disaster Recovery (DR)/Business Continuity Planning (BCP) mechanisms for EPIC and integrated hospital systems.
- Implement and enforce security baselines and compliance controls (e.g., HIPAA, SOC2, CIS, NIST) at the infrastructure layer across Azure, GCP, and OCI.
- Drive automation of routine operational tasks and provisioning using Terraform for IaC and scripting tools (PowerShell, Python) to ensure operational consistency and auditability.
- Manage log monitoring, correlation, and initial incident response for security and performance events within the EPIC cloud environments.
Tools: Terraform, GitHub, Ansible, Tanium, PowerShell, YAML, Bash, Python.
Qualifications & Technical Skills:
- Experience & Environment: 7-10 years in infrastructure operations, with at least 5 years dedicated to Cloud Administration and 5+ years supporting EPIC EHR or similar mission-critical clinical systems in a hybrid environment.
- Multi-Cloud Hands-on Expertise: Proven expertise in the administration, configuration, and operational support of Microsoft Azure, Google Cloud Platform (GCP), and Oracle Cloud Infrastructure (OCI).
- Containerization: Strong practical experience in the operational management, troubleshooting, and administration of Google Kubernetes Engine (GKE) clusters.
- EPIC Systems Knowledge: Solid understanding of EPIC infrastructure requirements, deployment topologies (e.g., Clarity, Chronicles), and operational dependencies.
- Automation & Scripting: High proficiency with Terraform (IaC) for multi-cloud deployments and with PowerShell, Python, YAML, and Bash for scripting.
- Compliance: Deep knowledge of HIPAA requirements and the operational procedures necessary to maintain compliance in a cloud environment.
- Communication & Coordination: Exceptional onsite communication, stakeholder management, and incident command skills to coordinate effectively during critical hospital operations.