Kaizen Analytix

AI Engineer (Generative AI, LLMs, AIOps)

Kaizen Analytix, Atlanta, Georgia, United States, 30383

Kaizen Analytix LLC, an analytics products and services company that gives clients unmatched speed to value through AI/ML solutions and actionable business insights, is seeking qualified candidates for the AI Engineer role. AI Engineers are highly skilled, experienced professionals responsible for designing, developing, and maintaining complex AI projects and the managed data warehouses that host the large datasets used in the models. The ideal candidate will have a strong understanding of deep learning and vector embeddings, including the challenges of storing embeddings; command of advanced data engineering principles and best practices; and experience working with massive unstructured datasets (100 GB+) such as video, audio, images, and text. We seek candidates who can support AI projects with the requisite knowledge of deep learning and embeddings, and with the data engineering skills required for storing deep learning-based outcomes.

Responsibilities

Hands‑on Development & Implementation:

Design, build, fine‑tune, and deploy generative models and LLMs for various enterprise applications (e.g., content generation, chatbots, code assistance, data analysis).

Leverage deep knowledge of transformer architectures (e.g., GPT, BERT, T5) to fine‑tune, optimize, and deploy Large Language Models (LLMs) for specific tasks.

Implement state‑of‑the‑art techniques such as agentic AI, context engineering (e.g., Retrieval‑Augmented Generation, RAG), prompt engineering, and model quantization.

Build, train, and deploy autonomous and semi‑autonomous AI agents capable of complex reasoning, tool use, and decision‑making.

Leverage document extraction models for information retrieval using cloud services such as Document Intelligence or Textract.

Develop and implement AI/ML models for AIOps use cases, including anomaly detection, predictive monitoring, root cause analysis, and automated remediation.

Write clean, efficient, well‑documented, and production‑ready code (primarily Python).

Build and maintain data pipelines for training, evaluating, and serving AI models.

AI Architecture & Design:

Design end‑to‑end architectures for complex AI applications, considering scalability, reliability, security, maintainability, and cost‑effectiveness within an enterprise environment.

Evaluate and select appropriate AI/ML frameworks, models, platforms, and tools for specific projects.

Collaborate with data scientists, software engineers, DevOps engineers, product managers, and business stakeholders to define requirements and translate them into technical designs.

Develop and advocate for best practices in AI development, deployment (MLOps), and governance.

Deep Learning & Research:

Apply an understanding of fundamental deep learning concepts (e.g., CNNs, RNNs, LSTMs, Transformers, attention mechanisms) to solve complex problems.

Stay abreast of the latest advancements and research papers in Gen AI, LLMs, AIOps, and deep learning.

Experiment with new algorithms, techniques, and tools to drive innovation.

AIOps Integration:

Integrate AI capabilities into IT operations monitoring, logging, and management tools.

Analyze operational data (logs, metrics, traces) to identify opportunities for AI‑driven improvements.

Develop systems to automate operational tasks and improve system resilience using AI.

Required Qualifications

Bachelor's or Master's degree in Computer Science, Artificial Intelligence, Machine Learning, Data Science, or a related quantitative field. PhD is a plus.

Proven industry experience (typically 3‑5+ years) as an AI/ML Engineer, with hands‑on experience building and deploying machine learning models in production.

Demonstrable experience working with Generative AI and Large Language Models (e.g., GPT variants, Llama, Mistral, Gemini) including fine‑tuning, RAG, and prompt engineering.

Strong proficiency in Python and common AI/ML libraries/frameworks (e.g., TensorFlow, PyTorch, Keras, Scikit‑learn, Hugging Face Transformers, LangChain).

Solid understanding of core deep learning concepts, model architectures, training methodologies, and evaluation metrics.

Experience designing scalable, reliable AI/ML system architectures.

Familiarity with cloud platforms (AWS, Azure, or GCP) and their respective AI/ML services (e.g., SageMaker, Azure ML, Azure AI Foundry, GCP Vertex AI, GCP AI Studio).

Understanding of AIOps principles and common use cases (experience implementing AIOps solutions is a strong plus).

Solid software engineering fundamentals (data structures, algorithms, code quality, testing, version control with Git).

Excellent analytical, problem‑solving, and critical‑thinking skills.

Strong communication and collaboration skills, with the ability to explain complex technical concepts to diverse audiences.

Good‑to‑have Qualifications

Direct hands‑on experience developing and deploying AIOps solutions.

Experience with MLOps tools and practices (e.g., MLflow, Kubeflow, DVC, CI/CD for ML).

Enthusiasm for learning new tools and technologies and for solving ad hoc challenges.

Experience with containerization technologies (Docker, Kubernetes).

Experience with big data technologies (e.g., Spark, Hadoop, Kafka).

Contributions to open‑source AI/ML projects or relevant publications.

Experience working in an agile development environment.

This is a remote role.

Analysis and Design

Conducts fact‑gathering sessions with users.

Consults with Technical Managers and Business Owners to identify and analyze technological needs and problems.

Performs data flow diagramming and/or process modeling (code architecture).

Design, develop, and deploy machine learning models for a variety of tasks, such as classification, image recognition, natural language processing, and anomaly detection.

Evaluate the performance of machine learning models and tune them for optimal performance.

Design, develop, and implement efficient MLOps processes that are scalable and robust.

Collect, clean, and prepare data for machine learning models.

Collaborate with data scientists, engineers, and other stakeholders to define and prioritize machine learning projects.

Stay up to date on the latest advancements in machine learning research and best practices.

Work with stakeholders to gather requirements and define data models.

Troubleshoot data issues and performance problems.

Work with other engineers to develop and maintain the company's data infrastructure.

Stay up to date on the latest data engineering technologies and trends, particularly vector stores.

Strategy Alignment

Works with the other technical team members to continually improve implementation strategies, development standards, documentation, and other departmental processes.

Provides technical assistance and mentoring to subordinates.

Communicates plans, status, and issues to management regularly.

Adheres to department standards, policies, procedures, and industry best practices.
