AI Fiesta

Senior Full-Stack Engineer

AI Fiesta, Los Angeles, California, United States

We are seeking a Senior Full-Stack Engineer to design and build an AI-driven conversational platform. The role involves architecting scalable systems that integrate multiple LLM providers, support real-time interactions, handle complex state management, and ensure performance, observability, and security across the stack. This position requires a deep understanding of modern web architectures, cloud-native deployments, and AI/LLM integration patterns.

Responsibilities

Implement advanced chat session management, including context persistence, token optimization, and retrieval-augmented generation (RAG).
Design and optimize high-throughput APIs (REST, GraphQL, and WebSockets/SSE) with rate limiting and fault tolerance.
Integrate token metering, analytics, and usage-based billing systems.
Develop a secure, multi-tenant user management system with granular authentication/authorization.
Leverage event-driven architectures (Kafka, Pub/Sub, or equivalent) for real-time processing and monitoring.
Optimize database schemas and queries (PostgreSQL, pgvector, Supabase) for low-latency chat history retrieval.
Implement vector search and RAG pipelines using Pinecone, Weaviate, or pgvector for knowledge grounding.
Ensure cloud-native scalability with Docker/Kubernetes, CI/CD pipelines, and IaC (Terraform, Pulumi).
Set up observability (distributed tracing, structured logging, metrics, error tracking) for debugging and performance monitoring.
Apply AI safety and guardrails (moderation APIs, prompt filtering, structured outputs).
Stay ahead of AI ecosystem developments and propose new integrations.

Requirements

5-7 years of professional experience in full-stack or platform engineering.
Proven experience delivering production-grade distributed systems.
Strong understanding of LLM APIs and AI/ML system integrations.
Bachelor's or Master's degree in Computer Science, or equivalent practical experience, from a strong academic or professional background.

Must-Have Skills

Backend Engineering: Node.js/Deno + TypeScript, event-driven design, API performance optimization.
Frontend Development: React/Next.js (SSR, streaming responses, optimistic UI updates).
Databases: PostgreSQL + vector extensions (pgvector), Redis/Valkey for caching and pub/sub.
Cloud and Infra: AWS/GCP/Azure, Kubernetes, serverless compute (Lambda/Cloud Functions), load balancing.
Real-time Systems: WebSockets, Server-Sent Events, or WebRTC for interactive chat.
Security: OAuth2, JWT, token expiration/refresh strategies, encryption at rest and in transit.
Testing and Quality: Unit/integration/e2e testing frameworks, contract testing for APIs.

Nice-to-Have Skills

Experience with LangChain, LlamaIndex, or custom orchestration engines.
Knowledge of embeddings, vector databases, and hybrid search techniques.
Familiarity with streaming LLM APIs and fine-tuning workflows.
Background in distributed systems, CAP theorem trade-offs, and scaling stateful apps.
Experience with observability stacks (OpenTelemetry, Grafana, Prometheus).
Understanding of payment and billing systems (Stripe, usage-based pricing).
Prior work on multi-tenant SaaS platforms.
