The Rundown AI, Inc.
Senior Backend Engineer, AI Data Platform
The Rundown AI, Inc., Snowflake, Arizona, United States, 85937
Role Overview
As a Backend Engineer, AI Data Platform at Labelbox, you will lead the design and development of our core data infrastructure, powering the seamless flow, storage, and processing of data for our AI platform. Your expertise will drive the evolution of scalable systems, anchored by high-performance databases, to support large-scale workflows, high-throughput data I/O, and streaming capabilities. You'll enable Labelbox customers to efficiently manage and stream data for training next-generation AI models. Owning critical components of our data infrastructure, including database architecture, you'll work end-to-end on projects from design to deployment. Collaborating cross-functionally with stakeholders, you'll transform ideas into robust, scalable solutions that enhance platform adoption and customer success.
Your Impact
Design and build scalable data infrastructure, integrating high-performance databases (relational, NoSQL, cloud-native) with distributed systems for data processing, storage, and streaming.
Optimize database systems for performance, reliability, and scalability, ensuring efficient data retrieval, indexing, and querying to support AI workflows.
Develop and maintain data pipelines using distributed queues, message brokers, and job management mechanisms to enable high-throughput import/export operations.
Collaborate with team members and stakeholders to align data infrastructure with platform goals and customer needs.
Participate in Sprint Planning, Standups, and related activities to drive data-focused initiatives forward.
Mentor and guide less experienced engineers, sharing expertise in data infrastructure and database optimization.
Support the team's area of ownership by working with the Support organization to resolve customer-facing data issues.
Stay abreast of industry trends in data infrastructure and database technologies, incorporating relevant innovations into our systems.
Contribute to technical documentation, research publications, blog posts, and presentations at conferences and forums.
Innovation in AI: Enhance data infrastructure capabilities for an AI platform used by leading AI labs to develop powerful multi-modal large language models (LLMs).
What You Bring
Bachelor's degree in Computer Science, Data Engineering, or a related field. Advanced degree preferred.
2+ years of work experience in a software or data-focused company, with significant expertise in data infrastructure and backend engineering.
Deep knowledge of designing and managing scalable database systems, including relational databases (e.g., PostgreSQL, MySQL), NoSQL stores (e.g., MongoDB, Cassandra), and cloud-native solutions (e.g., Google Spanner, AWS DynamoDB).
Strong experience with data infrastructure components such as data pipelines, streaming systems, and storage architectures (e.g., Cloud Buckets, Key-Value Stores).
Proficiency in optimizing databases for performance (e.g., schema design, indexing, query tuning) and integrating them with broader data workflows.
Previous experience with distributed systems tools (e.g., queues, message brokers like Kafka or RabbitMQ, job orchestration frameworks) for real-time data processing and other use cases.
Previous experience with search engines (e.g., Elasticsearch).
Knowledge of backend development using languages like Python, Java, or TypeScript; familiarity with NodeJS and NestJS is a plus.
Proficient in data structures, algorithms, and system design for large-scale data management.
Demonstrated ability to keep up with trends in data infrastructure and database technologies.
Excellent communication and collaboration skills.
Strong sense of ownership and ability to thrive in a fast-paced environment.
Comfortable with ambiguity and able to methodically break down high-level requirements into actionable data infrastructure tasks.
Resourceful problem-solver with attention to detail, eager to take initiative and deliver results.
High proficiency in leveraging AI tools for daily development (e.g., Cursor, GitHub Copilot).
Nice to Have
Familiarity with data warehousing solutions (e.g., Snowflake, BigQuery).
Experience with container orchestration systems (e.g., Kubernetes) for deploying data infrastructure components.
Experience with one or more public cloud platforms:
  Google Cloud Platform (GCP) (preferred)
  Amazon Web Services (AWS)
  Microsoft Azure
Understanding of the Data + AI ecosystem and its relevance to large-scale AI platforms.
Knowledge of memory management and optimization in data-intensive systems.
Experience with DevOps tools (e.g., ArgoCD, Datadog) for monitoring and managing data infrastructure.
Previous experience using LLM-backed AI services, such as those from OpenAI, Anthropic, or Google, to develop product features.
Engineering at Labelbox
At Labelbox Engineering, we're building a comprehensive platform that powers the future of AI development. Our team combines deep technical expertise with a passion for innovation, working at the intersection of AI infrastructure, data systems, and user experience. We believe in pushing technical boundaries while maintaining high standards of code quality and system reliability. Our engineering culture emphasizes autonomous decision-making, rapid iteration, and collaborative problem-solving. We've cultivated an environment where engineers can take ownership of significant challenges, experiment with cutting-edge technologies, and see their solutions directly impact how leading AI labs and enterprises build the next generation of AI systems.
Our Technology Stack
Our engineering team works with a modern tech stack designed for scalability, performance, and developer efficiency:
Frontend: React.js with Redux, TypeScript
Backend: Node.js, TypeScript, Python, some Java & Kotlin
APIs: GraphQL
Cloud & Infrastructure: Google Cloud Platform (GCP), Kubernetes
Databases: MySQL, Spanner, PostgreSQL
Queueing / Streaming: Kafka, PubSub