Veracity

Data Scientist / AI Engineer / LLM Engineer / Machine Learning Engineer

Veracity, Charlotte, North Carolina, United States, 28245


Role: Data Scientist / AI Engineer / LLM Engineer / Machine Learning Engineer, Charlotte, NC (local candidates only). Video interview.

Looking for a candidate with prior Client (Banking & Financial domain) experience.

Project Details:
•Implemented a chatbot internally within the bank and built the interface; users can now interact with it.
•It is a RAG framework, so instead of tuning the model into the actual applications, you prompt it against pre-vectorized documents (a minimal retrieval sketch follows this list).
•Teams are able to feed it documents even if the system does not yet know what their team is or who they are. The chatbot should have 1,000 users by the end of the year and another 2,000 next year.
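For illustration only, the sketch below shows the general retrieval flow described above: embed the user query, rank pre-vectorized document chunks by similarity, and assemble the retrieved context into a prompt. The document chunks, embedding function, and vector dimension are placeholders for this sketch; they are not the bank's actual stack (which uses Redis as the vector store and vLLM for inference).

```python
# Minimal RAG retrieval sketch (illustrative only; not the team's implementation).
import numpy as np

# Pre-vectorized document chunks: in a real system these embeddings would
# already live in the vector database.
DOC_CHUNKS = [
    "How to request access to the chatbot.",
    "Supported document formats for ingestion.",
    "Escalation path for model errors.",
]
# Placeholder embeddings; a real deployment would use an embedding model.
rng = np.random.default_rng(0)
DOC_VECTORS = rng.normal(size=(len(DOC_CHUNKS), 384))

def embed(text: str) -> np.ndarray:
    """Stand-in embedding function (hash-seeded random vector)."""
    return np.random.default_rng(abs(hash(text)) % (2**32)).normal(size=384)

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k chunks closest to the query by cosine similarity."""
    q = embed(query)
    sims = DOC_VECTORS @ q / (np.linalg.norm(DOC_VECTORS, axis=1) * np.linalg.norm(q))
    return [DOC_CHUNKS[i] for i in np.argsort(sims)[::-1][:k]]

def build_prompt(query: str) -> str:
    """Assemble the retrieved context into a prompt for the LLM."""
    context = "\n".join(retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How do I upload my team's documents?"))
```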

•LLMs & Inference: •Experience with major LLMs, specifically Llama 3, Mistral, and possibly "Quinn." •Direct experience with VLLM (an inference engine) is a perfect match, as it's the core technology they are using to handle batched requests. •Experience with Nvidia Triton is a "big bonus" and a key part of their model serving infrastructure. •Core Development: •Python: A mandatory skill. They are using Python 3.12, but experience with 3.10 and above is sufficient. •Web Frameworks: Knowledge of Flask or FastAPI is required, as they are using a Python endpoint to host the LLM. •Java: A "secondary preferred" skill, used to create the REST service that interacts with the front-end UI. •Database & Data Management: •Vector Databases: Experience with Redis and other vector databases is essential for the RAG component. •SQL: Required. •RAG Skills: The candidate needs to understand how to handle the business-side parameters from the product team and push back if they are technically unfeasible. This shows they need to be a critical thinker, not just a code-jockey. •Infrastructure & Operations (MLOps) •Containers & Orchestration: Knowledge of containers and OpenShift (a Kubernetes platform) for CI/CD. •CI/CD Tools: Experience with XLR and Datical for pipeline deployments is required. •Hardware: A solid understanding of GPUs is necessary, as they are the most critical and challenging component of their infrastructure. •gile: The team uses Agile methodology.

Scaling: The project's growth is tied to hardware availability. The initial deployment will be capped at 1,000 users, and scaling will only happen with more budget. This shows the importance of efficient resource management.

Skills / Experience That Are A Plus ("nice to have", but not necessarily required):
•Interview is based on experience; expect the team to dig deep into past experience.
•General awareness of vector databases as opposed to relational and other databases.
•Pushing code to a controlled environment and seeing something go into production; better if it was an AI application.
•Any model experience they have had professionally, such as quantitative models in the past or having written white papers before (required at the Client).