Itcharm

DevOps Engineer (AWS)

Itcharm, Work from Home

About Us and the Role

Itcharm is a fast-growing software development company delivering innovative solutions to clients around the world, with a strong focus on the US market. Headquartered in sunny San Diego, California, we’ve built a team of over 300 talented professionals across multiple countries. Our success comes from a deep commitment to results and long‑term partnerships built on trust.

We’re launching a transformative partnership with a new US‑based client in the data quality and enrichment space who is embarking on a bold modernization initiative. This client is transitioning from a fragmented legacy ecosystem to a fully cloud‑native, scalable platform built on AWS. The transformation is not just technical – it’s strategic, touching every layer of infrastructure, product delivery, and operational efficiency.

As a DevOps Engineer, you’ll be at the center of this journey, designing and implementing the infrastructure and DevOps strategy that will support long‑term growth, automation, and innovation. This includes solving deep‑rooted challenges around scalability, release orchestration, cost optimization, and environment consistency – while laying the foundation for a secure, observable, and automated cloud environment. This is a hands‑on role with strategic influence, ideal for someone who thrives in high‑autonomy environments and enjoys solving complex infrastructure challenges.

Tech Stack Snapshot

Cloud Provider: AWS
Compute & Orchestration: ECS, Fargate, Elastic Beanstalk, Lambda, EC2 (legacy)
Storage & Data: S3, RDS (PostgreSQL), Aurora
IaC: Terraform (selective use, expanding)
Containers & Configuration: ECR, AppConfig
Messaging & Coordination: SNS, SQS
Monitoring & Logging: CloudWatch, RDS Console
CI/CD: GitHub Actions, manual deployments (transitioning to full automation)

Key Responsibilities

Infrastructure & Environment Management

Design and provision isolated environments for development, QA, staging, and production using AWS best practices.
Standardize infrastructure provisioning using Terraform, ensuring consistency and version control across services.
Improve IAM role management and automate access provisioning to support secure and flexible operations.
Define ownership and review protocols for infrastructure changes and environment templates.

CI/CD & Deployment Automation

Architect and implement unified CI/CD pipelines across a diverse service landscape using GitHub Actions.
Integrate automated testing, linting, security scanning, and deployment validation into every pipeline.
Formalize and enforce a branching strategy to improve release management, collaboration, and CI/CD stability.
Introduce rollback strategies (e.g., blue/green or canary deployments) to support safe and resilient releases.
Establish isolated QA environments to support pre‑production testing and reduce deployment risk.

Cloud Architecture & Scalability Engineering

Implement auto‑scaling policies based on real‑time load metrics using ECS, Fargate, and Lambda.
Conduct performance simulations to validate scaling behavior and forecast capacity needs.
Address architectural bottlenecks, particularly in the database layer (e.g., write throughput, replication latency).
Define and monitor non‑functional scalability requirements (NFRs) and continuously improve system responsiveness.

Monitoring, Observability & Incident Response

Introduce full‑stack observability tools (e.g., Datadog, AWS X‑Ray, New Relic) for distributed tracing and performance insights.
Implement centralized logging using ELK stack or CloudWatch Logs Insights.
Define and monitor SLOs/SLIs for critical services and set up alerting and dashboards for ingestion pipelines and APIs.
Participate in incident response and postmortems, driving continuous improvement in system reliability and recovery.

Security & Compliance

Enforce secure configuration management and secrets handling across environments.
Support SOC 2 and GDPR compliance efforts through infrastructure‑level controls and audit readiness.
Evaluate and implement network segmentation, VPC isolation, and firewall rules to enforce boundaries and reduce risk.

Cloud Cost Optimization & Governance

Audit AWS usage and apply cost optimization techniques including reserved instances, savings plans, and storage tiering.
Implement lifecycle policies for S3 and other storage services to manage data retention and cost.
Build dashboards to visualize resource usage, monitor cost trends, and align infrastructure decisions with budget targets.

Collaboration & Enablement

Work closely with engineering teams to unblock delivery and improve deployment confidence.
Support QA and product teams by stabilizing environments and improving release visibility.
Champion DevOps best practices across teams and foster a culture of automation, ownership, and continuous delivery.

Required Qualifications

5+ years of experience in DevOps, Site Reliability Engineering, or Cloud Infrastructure roles, ideally within high‑availability, distributed systems.
Deep expertise in AWS, including hands‑on experience with ECS, Lambda, RDS, S3, IAM, and messaging services like SNS/SQS.
Proficiency in Terraform for infrastructure provisioning and lifecycle management, with a strong understanding of version control and CI/CD integration.
Hands‑on experience with GitHub Actions, including pipeline design, deployment automation, and quality gate implementation.
Strong grasp of CI/CD principles, branching strategies, and deployment methodologies (e.g., blue/green, canary).
Solid understanding of containerization and orchestration using Docker, Fargate, and EKS, with experience managing container images via ECR.
Familiarity with monitoring and observability tools, such as CloudWatch, Datadog, ELK stack, and AWS X‑Ray, with the ability to define and track SLOs/SLIs.
Working knowledge of security best practices, including secrets management, IAM policies, and infrastructure controls aligned with SOC 2 and GDPR compliance.
Excellent communication and collaboration skills in English, with the ability to work cross‑functionally and support engineering, QA, and product teams.

Preferred Qualifications

Experience with PostgreSQL performance tuning, replication strategies, and backup automation.
Exposure to data pipelines and ETL workflows, especially in batch or event‑driven architectures.
Familiarity with cloud cost optimization techniques, including reserved instances, storage tiering, and usage forecasting.
Experience supporting QA automation, test environments, and release validation workflows.

Why choose us?

We’re a company where curiosity fuels everything – from new ideas to personal growth.
We believe that common sense sits at the heart of everything we do. It helps us stay focused, make smart decisions, and move fast without the noise.
We trust each other to follow through and own the outcome: no micromanagement, just real commitment.
You’ll be heard here, even when you challenge the status quo, because courage matters. Who dares wins.
