C&G Consulting Services Inc

Generative AI Engineer. No 3rd parties

C&G Consulting Services Inc, Raritan, New Jersey, US, 08869


No third-party candidates. Candidates must apply directly. DO NOT RESPOND UNLESS YOU ARE A DIRECT CANDIDATE.

Key Responsibilities:

High-Throughput RAG Pipeline Development: Design and build scalable document processing pipelines to ingest and semantically chunk large batches of documents (PDF/DOCX) from sources like Azure Blob and AWS S3. Integrate embedding models and tune vector databases like Milvus for high-performance, sub-100 ms k-NN retrieval. Implement hybrid retrieval systems using BM25 and vector search, and continually track and improve retrieval performance using metrics like MRR and recall@k.

Model Fine-Tuning & Prompt Engineering: Apply large language models (LLMs) and NLP techniques to solve complex problems such as named-entity recognition, question answering, and summarization. Build fine-tuning pipelines using techniques and libraries like LoRA/PEFT and run hyperparameter sweeps in Azure ML. Author multi-step prompt chains, enforce structured JSON outputs, and use validation guards to reduce hallucinations and improve model consistency.

MLOps & Production Deployment: Develop and containerize agent-based microservices using frameworks like FastAPI or Azure Functions. Define Infrastructure as Code using Terraform/ARM and build CI/CD workflows in GitHub Actions for automated testing and canary rollouts. Implement robust monitoring and alerting for latency (p50/p95) and error rates using tools like Prometheus, Grafana, or Azure Monitor to ensure SLA compliance.

Performance, Cost & Standards: Profile API calls and implement cost-reduction strategies like batching, caching, and early-stop logits. Produce high-quality documentation, including architecture diagrams, sequence flows, and data schemas. Enforce security and compliance standards, including data encryption and PII redaction, to align with HIPAA/GxP requirements.

Required Qualifications:

BS/MS in Computer Science, AI/ML, or a related field.
3+ years of experience building end-to-end LLM/RAG systems in a production environment.
Deep Python experience, including libraries like FastAPI, pandas, and NumPy.
Hands-on experience with LLM orchestration frameworks (LangChain, LlamaIndex), NLP libraries (HuggingFace), and OpenAI/Azure SDKs.
Proven expertise in MLOps, including CI/CD (GitHub Actions/Azure DevOps) and containerization (Docker/Kubernetes).

Preferred Qualifications (Nice-to-Haves):

Experience working in a regulated industry such as pharmaceuticals or life sciences.
Hands-on experience with vector databases like Milvus or Pinecone.
Familiarity with chatbot frameworks like Rasa or Botpress.
Experience with data-centric AI tools for validation and monitoring, such as Great Expectations or Deepchecks.