ZipRecruiter

Data Scientist

ZipRecruiter, Hyattsville, Maryland, United States, 20780

Job Description: We are looking for a highly skilled NLP Data Scientist / Developer to design and implement natural language processing solutions for real-world problems. You will work on extracting insights from unstructured text data, building models, and deploying intelligent, real-world applications that understand and process human language. This role blends data science, machine learning, and software development, with Python and LLMs at the core.

Key Responsibilities:
Develop and implement NLP pipelines to process, analyze, and extract insights from structured and unstructured text data.
Build and fine-tune models for text classification, named entity recognition, summarization, sentiment analysis, topic modeling, etc.
Work with state-of-the-art models and tools (e.g., BERT/DeBERTa, spaCy, LLM APIs) and apply transfer learning techniques.
Clean, tokenize, and normalize large text corpora in various formats (PDFs, HTML, etc.).
Collaborate with cross-functional teams to integrate NLP features into software tools and customer-facing applications.
Create REST APIs or services to serve models in production using frameworks like FastAPI or Flask.
Optimize performance, accuracy, and scalability of NLP systems.
Document technical approaches, experiment results, and development procedures for internal and external stakeholders.

What We Offer:
Competitive salary and benefits package
Flexible remote work options
Access to GPU resources and cloud infrastructure
Opportunities to work on cutting-edge NLP problems
A collaborative, forward-thinking AI/ML team

Requirements:

Required Qualifications:
2+ years of experience with NLP development and Python packages.
Strong knowledge of NLP libraries such as spaCy and Transformers (Hugging Face).
Solid understanding of text preprocessing, vectorization (TF-IDF, word embeddings), and classification techniques.
Experience with machine learning libraries like TensorFlow/PyTorch.
Strong knowledge of hybrid models incorporating LLMs/genAI and traditional ML approaches.
Experience with PDF text extraction.
Must currently possess or be eligible to obtain a Public Trust clearance.

Preferred Qualifications:
Bachelor’s or Master’s degree in Data Science, Computational Linguistics, Machine Learning, Applied Mathematics, Statistics, Computer Science or a related field.
Experience with LLMs (Large Language Models) and prompt engineering.
Knowledge of data privacy, redaction, and PII detection in text.
Background in information retrieval or question-answering systems.
Prior work with government, legal, healthcare, or enterprise document processing is a plus.
Experience working with cloud platforms (AWS, Azure, GCP) and containerization (Docker).
Familiarity with REST APIs, FastAPI/Flask, and deploying models to production.
Proficiency with version control (Git) and collaborative development workflows.
