Merciv

Senior DevOps Engineer

Merciv, New York, NY 10261, US

This range is provided by Merciv. Your actual pay will be based on your skills and experience — talk with your recruiter to learn more.

Base Pay Range

$160,000.00/yr - $220,000.00/yr

About Merciv

Merciv is pioneering autonomous retail intelligence through EVA (Evolving Virtual Analyst), our agentic AI platform that's transforming how the world's largest retailers operate. We don't just provide insights; we enable AI agents to actively manage critical business functions, from consumer intelligence to inventory optimization to competitive analysis.

We've intentionally stayed under the radar while building something transformative. Our platform already powers retail intelligence for Fortune 500 leaders including Gap Inc. (and their portfolio brands like Old Navy, Banana Republic, and Athleta), Hain Celestial (Terra, Celestial Seasonings, etc.), and Boston Beer Company (Samuel Adams, Truly, etc.). Fresh off a $14M Series Seed, we’re scaling our team of innovators as we prepare for significant growth in 2025‑26.

We believe in making every feature intuitive and every component interactive, helping our users leverage the tools they recognize in a whole new light. Our lean, agile team thrives on innovation and autonomy. Here, you won’t just fill a role; you’ll shape the future of our product and company. We embrace a "many hats" approach, offering unparalleled opportunities for growth and impact. Our work environment thrives on creativity and cutting‑edge technology, built for those eager to shape the future of human‑AI collaboration.

If you’re excited about being a part of a fast‑paced, innovative startup culture and making a significant impact on client business, Merciv is the place for you. Join us as we expand the horizons of retail intelligence and deliver unparalleled value to our clients.

Current State

Multiple Fortune 500 enterprise customers
Strong distribution partnerships driving majority of revenue
Platform managing millions of data points across retail ecosystems
Small (Sub‑20) team of exceptional AI researchers, engineers, and operators

Where We’re Going

50,000+ users on our platform within 12 months
$20M+ ARR by end of 2026
Category leadership in autonomous retail operations
Series A fundraise in next 12 months

The Role

Merciv is transforming retail through autonomous AI that doesn't just analyze businesses - it helps run them. Our platform powers some of the largest retail organizations in the world, processing data and managing workflows across millions of data points. We’re building the infrastructure that enables AI agents to inform consumer‑facing strategies, make million‑dollar inventory decisions, optimize pricing in real time, and orchestrate complex retail operations 24/7. As we scale our enterprise platform, we need a Senior DevOps Engineer who can build and maintain the rock‑solid, secure infrastructure that autonomous commerce demands. This is your chance to architect the systems powering the future of retail AI.

You’ll own the infrastructure that supports AI agents making split‑second decisions for major retailers. Working closely with our ML and backend engineers, you’ll ensure our platform maintains 99.97%+ uptime while handling Black Friday‑level traffic every day. This is a hands‑on role where you’ll build the secure, scalable systems that enterprise retailers trust with their entire operations. In this role, you’ll be the guardian of infrastructure that must be enterprise‑grade (SOC 2, GDPR, ISO 27001 compliant) while maintaining startup agility. Your work directly impacts whether a retailer’s AI can respond to market changes in milliseconds or minutes - a difference measured in millions of dollars.

What You’ll Do

Scale AI Infrastructure: Architect and optimize infrastructure supporting high‑volume daily agentic decisions
Ensure Enterprise Reliability: Build systems that maintain 99.97% uptime for mission‑critical retail operations across Fortune 500 clients
Automate Everything: Develop robust CI/CD pipelines for rapid ML model deployment and infrastructure updates without downtime
Secure Sensitive Data: Implement and maintain SOC 2, GDPR, and ISO 27001 compliant infrastructure for enterprise retail data
Optimize AI/ML Workflows: Partner with engineers to streamline model training, deployment, and inference pipelines at scale
Champion GitOps: Implement infrastructure‑as‑code practices that let us scale from hundreds to thousands of AI agents seamlessly
Monitor Autonomous Systems: Build observability into distributed agent networks processing millions of retail data points
Enable Multi‑Tenancy: Design secure, isolated environments for enterprise clients while maintaining operational efficiency
Integrate Enterprise Systems: Support seamless connections with Shopify Plus, SAP, Oracle Retail, and other major platforms
Own Production Excellence: Lead incident response for a platform where minutes of downtime could mean millions in lost revenue

Core Requirements

6-10+ years of industry experience with at least 4 years in hands‑on DevOps roles
4+ years managing cloud infrastructure in production (AWS strongly preferred)
2+ years of production Kubernetes experience (EKS preferred)

Technical Skills

Cloud & Infrastructure

Expert‑level AWS knowledge (EC2, EKS, Lambda, S3, RDS, IAM, Secrets Manager, KMS)
Advanced Infrastructure‑as‑Code expertise with Terraform and Terragrunt
Strong GitOps experience and configuration management (Ansible)
Hands‑on experience with bare metal configuration and machine templates

Containers & Orchestration

Advanced Docker knowledge and container debugging skills
Production Kubernetes with Helm, FluxCD, and KEDA
Container‑based deployment strategies (blue‑green, canary, rolling)

Programming & Automation

Strong Python and Bash scripting for automation and CLI tool development
CI/CD pipeline design with GitHub Actions and other platforms
Ability to write robust, production‑ready automation

Monitoring & Reliability

Experience with observability stacks (New Relic preferred, CloudWatch, Prometheus/InfluxDB)
Distributed tracing, log aggregation, and alerting strategies
Root cause analysis and post‑mortem expertise

Security & Networking

Deep understanding of network security, load balancing, and DNS
IAM best practices, key management, and secret rotation
Compliance experience (SOC 2, GDPR) and zero‑trust architecture principles
Threat modeling capabilities with a proactive security mindset

Systems

Solid Linux administration and system debugging skills
Strong networking fundamentals and troubleshooting abilities

Nice‑to‑Have Skills

Backend or full‑stack development experience
AI/ML infrastructure experience (model serving, GPU clusters, training pipelines)
Experience with real‑time, high‑throughput data systems
Multi‑tenant SaaS platform expertise
Retail or e‑commerce domain knowledge
eBPF for advanced observability
Experience with Terraform Cloud at scale
Service mesh technologies
Multi‑region deployment expertise for global retail operations
SecOps experience at enterprise scale
Experience with event‑driven architectures
Knowledge of streaming platforms (Kafka, Kinesis)

What We’re Looking For

Ownership Mentality: Track record of owning critical infrastructure outcomes
Problem Solver: Strong debugging skills with the ability to work through ambiguous problems
Security‑First Mindset: Proactive approach to identifying and mitigating threats
Clear Communicator: Excellent written and verbal communication, comfortable with sync and async work
Documentation Champion: Creates clear runbooks, architecture diagrams, and knowledge bases
Collaborative Spirit: Motivated by helping others succeed and working cross‑functionally
Strategic Thinker: Ability to balance immediate needs with long‑term infrastructure vision
Growth Mindset: Continuous learner who stays current with DevOps best practices

Why Join Us

Revolutionary Impact: Build infrastructure for AI that's literally impacting retailers’ bottom lines autonomously
Cutting‑Edge Stack: Work with the latest in AI/ML infrastructure, distributed systems, and cloud‑native architectures
Enterprise Trust: Your work enables Fortune 500 retailers to trust AI with million‑dollar decisions
Rapid Growth: Join us as we expand from retail into new verticals, scaling our platform globally
Technical Excellence: Collaborate with world‑class engineers building autonomous AI agents that are redefining commerce
Ownership & Equity: Significant equity participation in a company transforming a $30 trillion industry
Innovation Freedom: Shape the infrastructure strategy for a platform processing billions in retail transactions
Customer Impact: See your work directly impact major brands
Professional Growth: Budget for conferences, certifications, and staying at the forefront of DevOps and AI infrastructure

Compensation

Compensation Range: $160k - $220k

Benefits

Health, Dental, Vision, Life, Commuter

Interview Process

Technical screening call (45‑60 minutes)
Technical deep dives covering infrastructure, architecture, and hands‑on coding
Team collaboration session (preferably in‑person)
Culture & vision discussion with leadership

Merciv is building the future of autonomous commerce. We’re committed to assembling a diverse team of builders who want to revolutionize how the world does business. We are an equal opportunity employer, and we do not discriminate on the basis of race, religion, color, national origin, sex, sexual orientation, age, veteran status, disability, genetic information, or other applicable legally protected characteristic.

Location: New York, NY

Seniority level: Mid‑Senior level

Employment type: Full‑time

Job function: Engineering and Information Technology
