Tevogen Bio
Data Engineer

Work Location: Tevogen Corporate HQ, Warren, NJ
Reports to: Chief Information Officer
Position: Full-time, Exempt

Job Description:
As a Data Engineer at Tevogen Bio, you will work closely with the Data Science team to design, develop, and maintain the infrastructure and systems required for data storage, processing, and analysis. These datasets will be used to train machine learning and predictive AI models that bridge the fields of genomics and proteomics. Your work will play a crucial role in building and managing the data pipelines that enable efficient and reliable data integration, transformation, and delivery, expanding our understanding of T-cell receptor (TCR) interactions with proteins and ultimately aiding in the design of novel immunotherapies. You will also work with the Data Science team to create foundational models with several downstream applications.

Responsibilities:
- Work directly with the Head of Tevogen AI, the Chief Scientific Officer, and the Clinical Development Lead to design and develop data pipelines that extract data from various sources, transform it into the desired format, and load it into the appropriate data storage systems.
- Collaborate with data scientists and analysts to optimize models and algorithms for data quality, security, and governance.
- Use current tools to integrate data from different sources, including databases, data warehouses, APIs, and external systems.
- Ensure data consistency and integrity during the integration process, performing data validation and cleaning as needed.
- Prepare data for proprietary model development and assist with creating predictive models across the broader organization.
- Present findings and contribute to research publications and patents.
- Optimize data pipelines and data processing workflows for performance, scalability, and efficiency.
- Monitor and tune data systems, identify and resolve performance bottlenecks, and implement caching and indexing strategies to enhance query performance.
- Implement data quality checks and validations within data pipelines to ensure the accuracy, consistency, and completeness of data.
- Establish governance of the data and algorithms used for analysis, analytical applications, and automated decision making.

Education:
A bachelor's or master's degree in computer science, data science, software engineering, information systems, or a related quantitative field.

Experience:
- At least six years of work experience in data management disciplines, including data integration, modeling, optimization, and data quality, or in other areas directly relevant to data engineering responsibilities and tasks.
- Storytelling skills to guide cross-functional teams through findings and discoveries.
- Proven project experience developing and maintaining data warehouses in big data solutions (Azure/AWS/Databricks).

Skills:
- Expert knowledge of Apache technologies such as Kafka, Airflow, and Spark for building scalable and efficient data pipelines.
- Ability to design, build, and deploy data solutions that capture, explore, transform, and utilize data to support AI, ML, and BI.
- Strong ability in programming languages such as Java, Python, and C/C++.
- Ability in data science languages and tools such as SQL, R, SAS, or Excel.
- Proficiency in the design and implementation of modern data architectures and concepts, including cloud services (AWS, Azure, GCP) and modern data warehouse tools (Snowflake, Databricks).
- Experience with database technologies such as SQL, NoSQL, Oracle, Hadoop, or Teradata.
- Ability to collaborate within and across teams of differing technical knowledge to support delivery and educate end users on data products.
- Expert problem-solving and debugging skills, including the ability to trace issues to their source in unfamiliar code or systems and to recognize and solve recurring problems.
- Excellent business acumen and interpersonal skills; able to work across business lines at a senior level to influence and effect change toward common goals.
- Ability to describe business use cases and outcomes, data sources and management concepts, and analytical approaches and options.

Why join Tevogen Bio?
- Opportunity to work on groundbreaking research in T-cell therapy and computational biology.
- Collaborative, fast-paced, and innovative work environment.
- Competitive compensation and comprehensive benefits package.
- Professional growth opportunities in a rapidly evolving biotech landscape.

If you are passionate about the intersection of machine learning, bioinformatics, and immunotherapy, we encourage you to apply and be a part of our mission at Tevogen Bio!