Interface AI
Staff Software Engineer - LLM expert
Interface AI, San Francisco, California, United States, 94199
interface.ai is the industry‑leading specialized AI provider for banks and credit unions, serving over 100 financial institutions. The company's integrated AI platform offers a unified banking experience through voice, chat, and employee‑assisting solutions, enhanced by cutting‑edge proprietary Generative AI.
Our mission is clear: to transform the banking experience so every consumer enjoys hyper‑personalized, secure, and seamless interactions, while improving operational efficiencies and driving revenue growth.
interface.ai offers pre‑trained, domain‑specific AI solutions that are easy to integrate, scale, and manage, both in‑branch and online. Combining this with deep industry expertise, interface.ai is the AI solution for banks and credit unions that want to deliver exceptional experiences and stay at the forefront of AI innovation.
About the Role
We’re hiring a Staff Engineer – Core AI to design, experiment with, and scale the next generation of LLM‑powered multi‑agent systems that enable intelligent, secure, and compliant automation for financial institutions. This role goes beyond integrating third‑party APIs. It’s about building differentiated intelligence: training, tuning, and evolving models that reason, plan, and act autonomously in high‑stakes environments. You’ll work at the intersection of LLM research, applied reinforcement learning, and AI systems engineering, driving innovation in model fine‑tuning, prompt optimization, encryption for inference, and speech‑to‑speech AI.
Your mission: create the AI runtime layer that powers adaptive, explainable, and policy‑aligned agents at scale.
What You’ll Own
As the lead for LLM engineering, you’ll define how models learn, optimize, and safely interact with sensitive financial data. You’ll be responsible for:
Model Evolution:
Building fine‑tuning pipelines, exploring open‑weight models, and benchmarking their performance against proprietary LLMs.
Inference Optimization:
Driving high‑throughput, low‑latency inference strategies across GPUs, TPUs, and distributed inference clusters.
Safety & Guardrails:
Designing data‑safe pipelines with encryption for model I/O, and implementing automated PII detection and masking at both prompt and response layers.
RL‑Based Learning:
Applying reinforcement learning (RLHF/RLAIF), reward modeling, and policy optimization to continuously improve model performance.
Speech‑to‑Speech and Multimodal AI:
Exploring speech model architectures (ASR/TTS) and building adaptive pipelines for natural, real‑time conversational intelligence.
POCs & Experimentation:
Rapidly prototyping emerging models, toolchains, and optimization methods to maintain a competitive edge.
Framework Leadership:
Collaborating with research and backend teams to evolve our custom AI orchestration layer — combining multiple specialized models, memory systems, and evaluation tools.
What You’ll Do
Lead Fine‑Tuning and Experimentation:
Create fine‑tuning workflows using LoRA, PEFT, and instruction‑tuning pipelines; manage large‑scale training datasets.
Drive Auto‑Prompt Optimization:
Build self‑evolving prompt evaluation loops using reinforcement learning, reward modeling, and continuous evaluation frameworks.
Accelerate Inference Throughput:
Optimize model inference through quantization, batching, caching, and high‑performance serving strategies.
Implement Encrypted Inference:
Develop novel encryption and key management techniques for model‑level data protection during inference.
Design Guardrail Systems:
Implement policy layers that enforce safety, reduce hallucinations, and support compliance (SOC 2, GDPR).
Integrate Speech Models:
Develop and optimize speech‑to‑speech pipelines, managing end‑to‑end latency, transcription accuracy, and model adaptation.
Run Advanced Evals:
Establish evaluation harnesses that measure factual accuracy, latency, cost‑efficiency, and safety compliance in production environments.
Research and Publish:
Explore the latest advancements in open‑source LLMs and reinforcement learning for agents, driving our internal AI innovation roadmap.
What We’re Looking For
Required Qualifications
Strong LLM Expertise:
5–8 years of experience working directly with transformer architectures and LLM fine‑tuning (e.g., Llama, Mistral, GPT, Mixtral, Gemma, Falcon, Claude).
Applied Reinforcement Learning:
Hands‑on experience with RLHF/RLAIF, reward modeling, and multi‑objective optimization for generative models.
Prompt Optimization & Evaluation:
Deep knowledge of auto‑prompting, chain‑of‑thought evaluation, and self‑improving agent loops.
Inference Engineering:
Experience improving throughput, quantization, and token efficiency on GPUs or specialized inference hardware.
Data Security in AI:
Knowledge of PII masking, data encryption, and secure model pipelines in production settings.
Modern AI Tooling:
Experience with frameworks such as PyTorch, Transformers, DeepSpeed, Hugging Face, LangChain, or vLLM.
Preferred Qualifications
Experience with speech‑to‑speech or multimodal models (ASR, TTS, embeddings)
Understanding of AI evaluation frameworks (e.g., Evals, LlamaIndex benchmarks, or custom metrics)
Familiarity with financial data compliance and AI observability tools
Contributions to open‑source LLM or RL research projects
What Makes This Role Special?
You’ll shape the core AI that powers agentic intelligence for financial systems serving millions of users.
You’ll own a research‑meets‑engineering mandate, from exploring new models to bringing them to life in production.
You’ll define how autonomous AI systems learn, adapt, and remain safe in a regulated environment.
You’ll work with a team combining AI research, applied data science, and product engineering, moving fast with purpose and rigor.
Compensation
Compensation is expected to be between $200,000 and $240,000. Exact compensation may vary based on skills and location.
What We Offer
401(k) match & financial wellness perks
Discretionary PTO + paid parental leave
Mental health, wellness & family benefits
A mission‑driven team shaping the future of banking
At interface.ai, we are committed to providing an inclusive and welcoming environment for all employees and applicants. We celebrate diversity and believe it is critical to our success as a company. We do not discriminate on the basis of race, color, religion, national origin, age, sex, gender identity, gender expression, sexual orientation, marital status, veteran status, disability status, or any other legally protected status. All employment decisions at interface.ai are based on business needs, job requirements, and individual qualifications. We strive to create a culture that values and respects each person's unique perspective and contributions. We encourage all qualified individuals to apply for employment opportunities with interface.ai and are committed to ensuring that our hiring process is inclusive and accessible.