ValidMind Inc

Infrastructure Engineer US - Remote

ValidMind Inc, Jackson, Mississippi, United States


ValidMind empowers financial services organizations to bring more trust and transparency to the world’s AI/ML/LLM models. With the rapid evolution of AI, increased regulatory scrutiny, and a lack of fit‑for‑purpose tooling, financial services’ Model Risk Management (MRM) and AI Governance functions are under enormous pressure to ensure compliance. We are passionate about helping these organizations seamlessly and confidently test, validate, and document their businesses’ AI models while ensuring compliance with domestic and international AI and model risk regulations.

Overview

We’re looking for a skilled Infrastructure Engineer to design, build, and maintain reliable, scalable infrastructure that supports our engineering teams and product delivery. You’ll be responsible for managing cloud environments, implementing infrastructure-as-code practices, and ensuring high availability and observability of our systems.

What You’ll Do & Your Impact:

Design, deploy, and manage infrastructure using Docker, Kubernetes, and Terraform to support production and development environments.

Manage cloud infrastructure on a major provider, preferably AWS (experience with GCP or Azure also considered).

Implement monitoring and observability solutions using tools such as Datadog, Splunk, Prometheus, or Grafana to ensure system reliability and performance.

Collaborate closely with backend and fullstack engineers to support continuous integration, delivery, and deployment pipelines.

Participate in the on‑call rotation, respond to incidents, and help drive post‑incident reviews and reliability improvements.

Automate operational tasks using scripting languages such as Bash and Python.

Maintain and improve security and compliance practices within infrastructure and deployment processes.

Document infrastructure designs, processes, and procedures to promote transparency and knowledge sharing across the team.

Who You Are & What Makes You Qualified:

3+ years of professional experience in infrastructure, DevOps, or SRE roles.

Strong experience with containerization (Docker) and orchestration (Kubernetes) in production environments.

Proven experience with Terraform or other infrastructure-as-code tools.

Hands‑on experience with AWS (EC2, ECS/EKS, S3, IAM, CloudWatch, etc.) or another major cloud platform.

Proficiency in monitoring and logging tools (e.g., Datadog, Splunk, Prometheus, ELK stack).

Comfortable writing automation scripts in Bash and Python.

Experience supporting CI/CD pipelines and deployment workflows.

Strong communication skills and the ability to collaborate effectively with cross‑functional teams.

Willingness to participate in an on‑call rotation and help improve system reliability and response processes.

Nice‑to‑Have(s):

Familiarity with service mesh or networking within Kubernetes.

Experience with security best practices in cloud and containerized environments.

Understanding of GitOps workflows (e.g., ArgoCD, Flux).

Knowledge of performance tuning, capacity planning, and cost optimization in cloud environments.

Why Join Us

Opportunity to have a direct impact on the stability and scalability of core systems.

Collaborative engineering culture with strong ownership and autonomy.

Exposure to a modern tech stack and opportunities for professional growth.

At ValidMind, we create the most efficient solution for organizations to automate testing, documentation, and risk management for AI and statistical models. Working here means being at the forefront of AI risk management, but it’s also more personal than that: we promote an inclusive culture where we value your ideas and creativity. We want you to have a sense of ownership over your work, to build mutual trust with your peers, and to feel supported in everything you do. As a VC‑backed company in the early stages of growth, we offer ample room to grow.

As set forth in ValidMind’s Equal Employment Opportunity policy, we do not discriminate on the basis of any protected group status under any applicable law.
