SystImmune
AI Machine Learning and Data Augmentation, Senior Scientist
SystImmune, Redmond, Washington, United States, 98052
Overview
SystImmune is a leading, well-funded clinical-stage biopharmaceutical company located in Redmond, WA and Princeton, NJ. It specializes in developing innovative cancer treatments using its established drug development platforms, focusing on bispecific and multispecific antibodies and antibody-drug conjugates (ADCs). SystImmune has multiple assets in various stages of clinical trials for solid tumor and hematologic indications. We offer an opportunity for you to learn and grow while making significant contributions to the company's success.
Responsibilities
Llama 3.3 Implementation and Extension: Develop and fine-tune Llama 3.3 models for sequence-to-structure-to-activity relationship prediction, leveraging internal data from antibody discovery, protein engineering, and immunology oncology projects; integrate domain-specific knowledge and constraints into the Llama 3.3 framework to improve model performance and accuracy.
Data Generation and Processing: Design and implement data generation pipelines to produce high-quality training datasets for Llama 3.3 models, including sequence, structure, and activity data from internal projects; develop and optimize algorithms for data processing, feature extraction, and data augmentation.
Fine-Tuning of AI Model with Processed SI R&D Data: Fine-tune the Llama 3.3 model using processed SystImmune R&D data, including antibody discovery, protein engineering, and immunology oncology datasets; integrate additional features and constraints from internal data to improve model performance.
Embedding Language Models for New Data Structure Design: Use embedding models and supporting tools (e.g., Milvus, LangChain) to convert data into numerical representations and store them in a vector database for new data structure design, mining, and processing; utilize RAG (Retrieval-Augmented Generation) with MariaDB Vector DB to enhance data retrieval and generation capabilities.
Automatic Data Flow Management: Design and implement automatic data flow management from the current LIMS (Laboratory Information Management System) MariaDB to the AI Embedding DB, ensuring seamless data integration and synchronization; develop data pipelines to extract, transform, and load data from various sources into the AI Embedding DB.
Parallel Computing and Optimization: Implement parallel computing architectures using MPI, OpenMP, or distributed computing frameworks (e.g., Dask, Ray) to accelerate Llama 3.3 model training and inference; optimize code performance on CPUs, GPUs, and HPC clusters.
Software Development and Integration: Design, develop, and maintain software applications and tools for Llama 3.3 model training, data processing, and parallel computing, using Python, C++, or Julia; collaborate with the AIDD team to integrate Llama 3.3 workflows into production.
Data Security and Backup Management: Develop and implement robust data security measures to protect sensitive external data, including encryption, access controls, and authentication protocols; design and manage backup strategies for external data, ensuring data integrity, redundancy, and recoverability; collaborate with IT and cybersecurity teams to ensure compliance with GDPR and HIPAA; conduct regular security audits and risk assessments.
Data Loss Prevention and Incident Response: Develop and implement procedures for preventing data loss and responding to security incidents; establish a disaster recovery plan to ensure business continuity.
Qualifications
Education: Ph.D. or Master's degree in Computer Science, Artificial Intelligence, Bioinformatics, Computational Biology, or related field
Experience: 5+ years in AI/ML model development, with a focus on natural language processing and/or database management
Technical Skills: Experience in drug development or selecting drug targets; proficiency in Python, C++, or Julia; experience with PyTorch or TensorFlow; familiarity with parallel and distributed computing; data security and backup management
Domain Knowledge: Strong understanding of antibody discovery, protein engineering, and immunology oncology; experience working with internal data from these fields
Communication: Excellent communication and collaboration skills to work with cross-functional teams
Nice to Have
Experience with Llama 3.3 or other large language models
Experience with RAG and MariaDB Vector DB
Knowledge of process development principles and their application in biopharma manufacturing
Experience with Milvus, LangChain, or other vector databases
Compensation and Benefits
Base salary range: $150,000 - $250,000 annually. Actual compensation is based on qualifications, experience, and skills. Offers may be near the higher end of the range for exceptional candidates; typical offers may fall between the low end and the midpoint. SystImmune offers comprehensive benefits including 100% paid employee premiums for medical/dental/vision, STD, LTD, a 401(k) with 50% company match up to 3%, 5-year vesting, 15 PTO days, sick leave, 11 paid holidays, and more.
Equal Opportunity
SystImmune is an Equal Opportunity Employer. We welcome diverse talent and encourage all qualified applicants to apply.