C&G Consulting Services Inc
Generative AI Engineer. No 3rd parties
C&G Consulting Services Inc, Somerville, New Jersey, US, 08876
No third-party candidates. Candidates must apply directly; do not respond unless you are a direct candidate.
Key Responsibilities:
High-Throughput RAG Pipeline Development:
Design and build scalable document processing pipelines to ingest and semantically chunk large batches of documents (PDF/DOCX) from sources like Azure Blob and AWS S3.
Integrate embedding models and tune vector databases like Milvus for high-performance, sub-100 ms k-NN retrieval.
Implement hybrid retrieval systems using BM25 and vector search, and continually track and improve retrieval performance using metrics like MRR and recall@k.
Model Fine-Tuning & Prompt Engineering:
Apply large language models (LLMs) and NLP techniques to solve complex problems such as named-entity recognition, question answering, and summarization.
Build fine-tuning pipelines using parameter-efficient methods such as LoRA (for example, via the PEFT library) and run hyperparameter sweeps in Azure ML.
Author multi-step prompt chains, enforce structured JSON outputs, and use validation guards to reduce hallucinations and improve model consistency.
MLOps & Production Deployment:
Develop and containerize agent-based microservices using FastAPI or Azure Functions.
Define Infrastructure as Code using Terraform or ARM templates, and build CI/CD workflows in GitHub Actions for automated testing and canary rollouts.
Implement robust monitoring and alerting for latency (p50/p95) and error rates using tools like Prometheus, Grafana, or Azure Monitor to ensure SLA compliance.
Performance, Cost & Standards:
Profile API calls and implement cost-reduction strategies like batching, caching, and early stopping during generation.
Produce high-quality documentation, including architecture diagrams, sequence flows, and data schemas.
Enforce security and compliance standards, including data encryption and PII redaction, to align with HIPAA/GxP requirements.
Required Qualifications:
BS/MS in Computer Science, AI/ML, or a related field.
3+ years of experience building end-to-end LLM/RAG systems in a production environment.
Deep Python experience, including libraries like FastAPI, pandas, and NumPy.
Hands-on experience with LLM orchestration frameworks (LangChain, LlamaIndex), NLP libraries (Hugging Face), and the OpenAI and Azure SDKs.
Proven expertise in MLOps, including CI/CD (GitHub Actions or Azure DevOps) and containerization (Docker, Kubernetes).
Preferred Qualifications (Nice-to-Haves):
Experience working in a regulated industry such as pharmaceuticals or life sciences.
Hands-on experience with vector databases like Milvus or Pinecone.
Familiarity with chatbot frameworks like Rasa or Botpress.
Experience with data-centric AI tools for validation and monitoring, such as Great Expectations or Deepchecks.