ACL Digital
Overview
We are seeking a highly skilled and forward-thinking Data Engineer to drive the integration of Large Language Models (LLMs) and Generative AI systems into our data ecosystem. This role will focus on designing and operationalizing intelligent data pipelines and interfaces that enable seamless interaction between curated enterprise data and advanced AI models. You will play a key role in bridging data engineering and AI innovation, ensuring secure, scalable, and high-performance systems that power next-generation language-based applications.

Key Responsibilities
Design, build, and optimize data pipelines that serve as the backbone for LLM-powered systems and AI applications.
Integrate Generative AI and LLM technologies (e.g., OpenAI, Anthropic, Azure OpenAI, or open-source models like LLaMA or Mistral) with curated enterprise data.
Develop and maintain retrieval-augmented generation (RAG) pipelines to connect structured and unstructured data to model contexts.
Collaborate with data scientists, ML engineers, and AI researchers to ensure alignment between data readiness and model performance.
Implement agentic system architectures, including orchestration frameworks (e.g., LangChain, Semantic Kernel, or similar).
Enforce AI security, compliance, and data governance best practices to ensure responsible use of enterprise data in AI applications.
Automate LLM evaluation, model fine-tuning, and deployment workflows where applicable.
Monitor and troubleshoot AI data pipelines, ensuring high availability, scalability, and accuracy of responses.
Document design patterns, integration strategies, and operational playbooks for AI-driven data engineering.

Required Skills & Qualifications
Proven experience as a Data Engineer or ML Engineer with hands-on expertise in LLM or Generative AI system integrations.
Strong proficiency in Python, SQL, and distributed data frameworks (e.g., Spark, Databricks).
Practical understanding of RAG architectures, vector databases (e.g., Pinecone, Weaviate, Chroma, FAISS), and embedding pipelines.
Familiarity with LangChain, LlamaIndex, Semantic Kernel, or equivalent frameworks.
Experience implementing secure and compliant AI pipelines, with understanding of AI security, prompt injection defenses, and data privacy.
Solid understanding of cloud-based AI infrastructure, preferably Azure AI Services, Azure Databricks, and Azure OpenAI Service.
Excellent problem-solving skills and ability to work across data, infrastructure, and AI teams.
Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).

Preferred Qualifications
Experience fine-tuning or customizing LLMs for enterprise use cases.
Familiarity with MLflow, MLOps, and CI/CD for model deployment.
Knowledge of medallion data architecture and Delta Lake for AI-ready data management.
Experience with streaming data systems (e.g., Kafka, Event Hubs) for real-time AI applications.
Contributions to open-source AI frameworks or enterprise AI integrations.

Seniority level
Mid-Senior level

Employment type
Contract

Job function
Other

Industries
IT Services and IT Consulting

Location
Columbus, OH