Salt

Generative AI Engineer

Salt, Oakland, California, United States, 94616

We are seeking a GenAI Engineer with deep expertise in Large Language Models to drive the development, optimization, and deployment of advanced LLM capabilities across the organization. This role focuses on fine‑tuning foundation models, developing scalable prompt engineering frameworks, and building retrieval‑augmented generation (RAG) solutions tailored to the utility industry.

You will work closely with Data Engineering, MLOps, and Product teams to ensure our AI systems are accurate, efficient, governed, and performant in production.

This is a long‑term contract role, hybrid in Oakland, CA (on site three days per week).

Key Responsibilities

Implement and optimize model fine‑tuning approaches (LoRA, PEFT, QLoRA) to adapt foundation models to domain‑specific needs (a brief sketch follows this list).

Develop structured prompt engineering methodologies aligned with utility operations, regulatory requirements, and technical documentation workflows.

Create and maintain reusable prompt templates and shared prompt libraries for consistent usage across applications.

Build and maintain prompt testing frameworks to quantitatively evaluate and continuously improve prompt performance.

Define and enforce prompt versioning and governance standards to ensure high‑quality outputs across teams and models.

Apply model optimization techniques (knowledge distillation, quantization, pruning) to improve efficiency and reduce inference cost.

Address memory and compute constraints using strategies like sharded data parallelism, GPU offloading, and hybrid CPU+GPU execution.

Architect and deploy RAG pipelines using vector databases, embedding pipelines, and optimized chunking strategies for retrieval performance.

Design advanced prompting strategies such as chain‑of‑thought reasoning, agent orchestration, and multi‑step task decomposition.

Collaborate with MLOps to deploy, monitor, and retrain LLMs in production environments.
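
As a rough illustration of the fine‑tuning responsibility referenced above, the following is a minimal sketch of attaching a LoRA adapter to a foundation model with Hugging Face transformers and peft. The base checkpoint, target modules, and hyperparameters are illustrative assumptions, not details from this posting.

```python
# Minimal sketch: attaching a LoRA adapter to a causal LM with Hugging Face peft.
# The checkpoint, target modules, and hyperparameters are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, TaskType

base_model = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # placeholder; any causal LM checkpoint works
tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(base_model)

# LoRA trains small low-rank update matrices instead of the full weight tensors.
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,                                   # rank of the low-rank update
    lora_alpha=32,                         # scaling factor for the update
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # attention projections to adapt
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of base parameters are trainable
```

In practice the adapted model would then be trained and evaluated on curated, domain‑specific utility data before deployment.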

Expected Skillset

Deep Learning & NLP:

Strong proficiency with PyTorch/TensorFlow, Hugging Face Transformers, and modern LLM training workflows (e.g., LoRA, PEFT, QLoRA).

Prompt Engineering & RAG:

Experience designing structured prompts and implementing retrieval‑augmented pipelines with vector stores (a brief sketch follows this section).

GPU & Compute Optimization:

Hands‑on experience with multi‑GPU training, model parallelism, memory optimization, and handling large‑scale model workloads.

LLMOps:

Understanding of deploying and monitoring LLM‑based systems in production environments.

Research Adaptability:

Ability to interpret research papers and rapidly apply emerging model optimization techniques.

Domain Adaptation:

Experience preparing and curating domain‑specific datasets for fine‑tuning and evaluation.
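
As a companion to the RAG items referenced above, below is a minimal sketch of the retrieval step only: chunking documents, embedding the chunks, and returning the top‑k matches for a query. The embedding model, chunk sizes, and in‑memory similarity search are assumptions for illustration; a production system would typically use a vector database, as described in this posting.

```python
# Minimal sketch of retrieval for a RAG pipeline: chunk documents, embed chunks,
# and retrieve the top-k most similar chunks for a query.
import numpy as np
from sentence_transformers import SentenceTransformer

def chunk(text: str, size: int = 400, overlap: int = 50) -> list[str]:
    """Naive fixed-size character chunking with overlap (placeholder strategy)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

documents = ["... utility maintenance manual text ...", "... regulatory filing text ..."]
chunks = [c for doc in documents for c in chunk(doc)]
chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

def retrieve(query: str, k: int = 3) -> list[str]:
    """Return the k chunks with the highest cosine similarity to the query."""
    q = embedder.encode([query], normalize_embeddings=True)[0]
    scores = chunk_vecs @ q  # cosine similarity, since vectors are normalized
    return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

# The retrieved chunks would then be injected into the prompt sent to the LLM.
print(retrieve("What is the inspection interval for distribution transformers?"))
```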

If you enjoy solving hard problems in scalable AI systems and want to shape the next generation of enterprise LLM capabilities, we’d like to hear from you.

Seniority level

Associate

Employment type

Contract

Job function

Engineering and Information Technology

Industries

Energy Technology and Services for Renewable Energy
