Career Techniques

Staff/Senior Platform Engineer

Career Techniques, Dallas, Texas, United States, 75215


Overview

Hybrid - 3 days/week in-office • USC Preferred

Key Responsibilities

Lead design and operations of multi-cloud infrastructure (AWS, Azure, On-Premises) and Kubernetes environments, including high availability, autoscaling, and secure networking.
Architect and manage distributed compute systems that support large-scale, parallel workloads with efficient job scheduling and resource management.
Build resilient CI/CD pipelines using GitHub Actions and drive DevSecOps culture across teams.
Implement zero-trust networking and secure connectivity solutions using Tailscale.
Implement and maintain workspace automation (Coder Workspaces, Infrastructure as Code) to empower developers across projects.
Own end-to-end platform observability: performance tuning, cost optimization, alerting, and incident response using tools like Grafana and Prometheus.
Integrate and maintain secure authentication and identity management, leveraging OIDC, OAuth2, SSO, and RBAC.
Work closely with development teams to influence cloud-native architecture decisions.
Evaluate and adopt emerging technologies to continuously evolve platform capabilities.
Write, build, and push compact, optimized, high-quality container images to maximize workload performance at scale.

Tech Stack

Cloud Platforms: AWS (EKS, EC2, IAM, SSM, CloudWatch), Azure (AKS, AD, ARM)
Kubernetes: Helm, CRDs, Operators, k0s, HPA, Network Policies, Ingress Controllers
CI/CD & Automation: GitHub Actions, Terraform, Docker, ArgoCD
Networking & Security: Tailscale, WireGuard, SSO/OIDC, Vault, RBAC
Developer Tooling: Coder Workspaces, VSCode Remote, self-service portal frameworks
Distributed Systems: Argo Workflows, Apache Airflow, Dask/Spark (nice to have)
Observability: Prometheus, Grafana

Ideal Candidate Profile

Must-Have Qualifications

6+ years of experience building and operating cloud infrastructure in production environments.
Deep understanding of Kubernetes internals, custom controllers/operators, and containerized workflows.
Strong command of both AWS and Azure services and multi-cloud strategies.
Expertise in writing secure, reusable infrastructure-as-code and CI/CD pipelines.
Proven experience managing distributed job scheduling, autoscaling workloads, and optimizing resource allocation, priorities, and queues.
Strong grasp of network and authentication architectures, including VPN, mesh networking, and identity federation.

Preferred Experience

Experience with Tailscale, AKS, k0s, and lightweight K8s distributions.
Docker image building and CI/CD build-and-push automation.
Experience driving platform adoption and developer enablement at scale.
