CLOUDRAY

Senior Generative AI Engineer

CLOUDRAY, Jersey City, New Jersey, United States, 07390

Senior Generative AI Engineer (Remote)

Job Summary: We are looking for a Senior Generative AI Engineer to lead the development of Proofs of Concept (POCs) and transition them into robust, scalable production-grade solutions. The ideal candidate has strong expertise in LLMs, prompt engineering, RAG, and deploying GenAI-powered applications. You'll collaborate across product, data, and engineering teams to rapidly prototype ideas and deliver AI-first features that create business impact.

Key Responsibilities:
• Drive end-to-end development of POCs using Generative AI models (OpenAI, Claude, Gemini, Mistral, open-source LLMs).
• Translate business problems into AI-powered use cases and prototypes with clear outcomes.
• Architect and build production-ready systems from validated POCs.
• Implement Retrieval-Augmented Generation (RAG) pipelines, vector databases (e.g., Pinecone, FAISS, Weaviate), and embedding-based search.
• Optimize prompts, model selection, fine-tuning, and response pipelines for reliability and cost-efficiency.
• Build API services, microservices, or SDKs for GenAI functionalities and expose them to frontend or enterprise systems.
• Evaluate open-source and proprietary models and recommend fit-for-purpose solutions.
• Ensure secure, ethical, and responsible AI use in compliance with organizational and regulatory guidelines.
• Collaborate closely with product managers and software engineers to integrate GenAI into real-world applications.

Required Skills & Qualifications:
• Strong experience with LLMs, transformer-based architectures, and NLP pipelines.
• Proven track record of building and deploying GenAI-powered POCs or applications.
• Hands-on experience with OpenAI, Anthropic, Google Gemini, Hugging Face, Llama, etc.
• Experience with Python and orchestration frameworks such as LangChain, LlamaIndex, or similar.
• Working knowledge of vector databases, embedding models, and RAG architecture.
• Cloud experience (AWS/GCP/Azure), including AI/ML services, serverless architecture, and containerization.
• Familiarity with API design, backend development, and microservice architecture.
• Strong understanding of model safety, cost optimization, prompt chaining, token limits, and response streaming.

Preferred Skills:
• Experience with fine-tuning open-source models (e.g., Llama, Mistral, Falcon).
• Familiarity with agentic workflows (e.g., AutoGPT, CrewAI, LangGraph).
• Exposure to MLOps tools (MLflow, Kubeflow, SageMaker Pipelines).
• Ability to handle unstructured and semi-structured data (PDFs, audio, images, structured logs) and convert it into formats usable by GenAI systems.