Vnsilicon

Data Engineer Leader

Vnsilicon, Snowflake, Arizona, United States, 85937

At Vietnam Silicon, we are on a mission to innovate and create world-class technology solutions.

Responsibilities

- Lead the design and implementation of highly scalable, high-performance data pipelines for large-scale datasets supporting AI and analytics workloads.
- Architect and optimize advanced Extract, Transform, Load (ETL) processes using Python, SQL, and distributed computing frameworks.
- Oversee the development and maintenance of cloud-based data infrastructure on platforms such as AWS, Azure, or GCP, ensuring scalability, security, and cost-efficiency.
- Drive seamless integration of data pipelines with AI/ML models, enabling efficient data flow for embeddings, training, and inference.
- Establish and enforce data governance, quality, and compliance standards aligned with regional and industry regulations.
- Design and deploy real-time and streaming data pipelines using tools such as Kafka or Flink to support dynamic AI applications.
- Lead and mentor a team of 10-20 data engineers, fostering technical excellence and collaboration.
- Collaborate with business stakeholders and data scientists to translate complex requirements into robust data engineering solutions.
- Contribute to Vietnam Silicon's technical leadership in the region by driving innovative data engineering initiatives.
- Develop strategies for adopting MLOps practices to enhance data pipeline integration with AI model deployment and monitoring.
- Other tasks assigned by the Company or Line Manager.

Requirements

- Master's or PhD in Computer Science, Data Engineering, or a related technical field.
- 7+ years of professional experience in data engineering, with at least 3 years in a leadership role managing teams.
- Expert-level proficiency in Python, SQL, and advanced data pipeline tools (e.g., Apache Airflow, PySpark, dbt).
- Extensive experience with cloud platforms (e.g., AWS, Azure, GCP) and data storage solutions (e.g., Redshift, Postgres, Snowflake).
- Proven expertise in designing and optimizing ETL processes for complex, large-scale datasets.
- Strong experience integrating data pipelines with AI/ML workloads, including support for embedding and model training pipelines.
- Demonstrated ability to architect scalable, secure, and cost-efficient data infrastructure.
- Excellent leadership, problem-solving, and communication skills.
- Proven ability to collaborate with cross-functional teams and influence stakeholders.

Preferred

- Extensive experience with distributed computing frameworks (e.g., PySpark, Hadoop) and GPU-accelerated data processing.
- Deep expertise in MLOps practices, including data versioning, pipeline orchestration, and monitoring.
- Familiarity with Southeast Asian data privacy regulations and regional business contexts.
- Significant contributions to open-source data engineering or AI projects.
- Experience designing and deploying real-time or streaming data pipelines for enterprise AI applications.
- Proficiency in building advanced data visualization tools or dashboards (e.g., Streamlit, Tableau, Superset).

Recruitment Process

1. Application Review
2. Initial Conversation
3. Technical Interview
4. Final Discussion
5. Offer & Welcome