Meshy LLC.
Data Infrastructure Engineer, Bay Area
Meshy LLC., Berkeley, California, United States, 94709
About Meshy
Headquartered in Silicon Valley, Meshy is the leading 3D generative AI company, on a mission to Unleash 3D Creativity. Meshy makes it effortless for both professional artists and hobbyists to create unique 3D assets, turning text and images into stunning 3D models in just minutes. What once took weeks and $1,000 now takes 2 minutes and $1.
Our global team of top experts in computer graphics, AI, and art includes alumni from MIT, Stanford, and Berkeley, as well as veterans from Nvidia and Microsoft. With 3 million users (and growing), Meshy is trusted by top developers and backed by premier venture capital firms like Sequoia and GGV.
No. 1 in popularity among 3D AI tools, according to A16Z Games.
No. 1 in website traffic among 3D AI tools, according to SimilarWeb (2M monthly visits).
Leading 3D foundation model, with delightful textures and fine geometry.
$52M in funding from top VCs.
2.5M users and 20M models generated.
Ethan Yuanming Hu serves as the founder and CEO. Ethan earned his Ph.D. in graphics and AI from MIT, where he developed the Taichi GPU programming language (27K stars on GitHub, used by 300+ institutes). His Ph.D. thesis received an honorable mention for the SIGGRAPH 2022 Outstanding Doctoral Dissertation Award, and his research has been cited over 2,700 times. His favorite animal is the llama.
About the Role
We are seeking a Data Infrastructure Engineer to join our growing team. In this role, you will design, build, and operate distributed data systems that power large-scale ingestion, processing, and transformation of datasets used for AI model training. These datasets span traditional structured data as well as unstructured assets such as images and 3D models, which often require specialized preprocessing for pretraining and fine-tuning workflows.
This is a versatile role: you’ll own end-to-end pipelines (from ingestion to transformation), ensure data quality and scalability, and collaborate closely with ML researchers to prepare diverse datasets for cutting-edge model training. You’ll thrive in our fast-paced startup environment, where problem-solving, adaptability, and wearing multiple hats are the norm.

What You’ll Do

Core Data Pipelines
Design, implement, and maintain distributed ingestion pipelines for structured and unstructured data (images, 3D/2D assets, binaries).
Build scalable ETL/ELT workflows to transform, validate, and enrich datasets for AI/ML model training and analytics.

Pretraining Data Processing
Support preprocessing of unstructured assets (e.g., images, 3D/2D models, video) for training pipelines, including format conversion, normalization, augmentation, and metadata extraction.
Implement validation and quality checks to ensure datasets meet ML training requirements.
Collaborate with ML researchers to quickly adapt pipelines to evolving pretraining and evaluation needs.

Distributed Systems & Storage
Architect pipelines across cloud object storage (S3, GCS, Azure Blob), data lakes, and metadata catalogs.
Optimize large-scale processing with distributed frameworks (Spark, Dask, Ray, Flink, or equivalents).
Implement partitioning, sharding, caching strategies, and observability (monitoring, logging, alerting) for reliable pipelines.

Infrastructure & DevOps
Use infrastructure-as-code (Terraform, Kubernetes, etc.) to manage scalable and reproducible environments.

Data Governance & Collaboration
Maintain data lineage, reproducibility, and governance for datasets used in AI/ML pipelines.
Work cross-functionally with ML researchers, graphics/vision engineers, and platform teams.
Embrace versatility: switch between infrastructure-level challenges and asset/data-level problem solving.
Contribute to a culture of fast iteration, pragmatic trade-offs, and collaborative ownership.

Technical Background
5+ years of experience in data engineering, distributed systems, or similar.
Solid skills in SQL for analytics, transformations, and warehouse/lakehouse integration.
Proficiency with distributed frameworks (Spark, Dask, Ray, Flink).
Familiarity with cloud platforms (AWS/GCP/Azure) and storage systems (S3, Parquet, Delta Lake, etc.).
Experience with workflow orchestration tools (Airflow, Prefect, Dagster).

Domain Skills (Preferred)
Experience handling large-scale unstructured datasets (images, video, binaries, or 3D/2D assets).
Familiarity with AI/ML training data pipelines, including dataset versioning, augmentation, and sharding.
Exposure to computer graphics or 3D/2D data processing is strongly preferred.

Mindset
Comfortable in a startup environment: versatile, self-directed, pragmatic, and adaptive.
Strong problem solver who enjoys tackling ambiguous challenges.
Commitment to building robust, maintainable, and observable systems.

Kubernetes for distributed workloads and orchestration.
Data warehouses or lakehouse platforms (Snowflake, BigQuery, Databricks, Redshift).
Familiarity with GPU-accelerated computing and HPC clusters.
Experience with 3D/2D asset processing (geometry transformations, rendering pipelines, texture handling).
Open-source contributions in ML infrastructure, distributed systems, or data platforms.
Familiarity with secure data handling and compliance.

Brain: We value intelligence and the pursuit of knowledge. Our team is composed of some of the brightest minds in the industry.
Heart: We care deeply about our work, our users, and each other. Empathy and passion drive us forward.
Gut: We trust our instincts and are not afraid to take bold risks. Innovation requires courage.
Taste: We have a keen eye for quality and aesthetics. Our products are not just functional but also beautiful.

Why Join Meshy?
Competitive salary, equity (stock options), and benefits package.
Opportunity to work with a talented and passionate team at the forefront of AI and 3D technology.
Flexible work environment, with options for remote and on-site work.
Opportunities for fast professional growth and development.
An inclusive culture that values creativity, innovation, and collaboration.
Unlimited, flexible time off.
401(k) plan for employees.
Comprehensive health, dental, and vision insurance.
The latest and best office equipment.