Grassroots Carbon
Environmental Data Engineer / Machine Learning Engineer
Grassroots Carbon, San Antonio, Texas, United States, 78208
Overview
Position: Environmental Data Engineer / Machine Learning Engineer
Position Type: Full-Time
Reports to: Jay Weeks, Director of Data & Soil Science
Role Overview
As a Data Engineer / Machine Learning Engineer, you will play a pivotal role in bridging the gap between experimental prototypes and scalable, production-ready systems. You'll spend approximately 50% of your time optimizing and deploying data pipelines to support long-term business needs, and the other 50% developing advanced machine learning models for environmental mapping and predicting changes in environmental metrics (e.g., soil organic carbon stocks) using time series data. This is a hands-on position in a fast-paced startup environment, where you'll collaborate with cross-functional teams to deliver impactful, reliable solutions.
Key Responsibilities
Production Pipeline Development (50% of time):
Evaluate and refactor prototype code from R&D phases into efficient, maintainable production pipelines
Design, implement, and maintain scalable data ingestion, processing, and ETL (Extract, Transform, Load) workflows using cloud-based infrastructure (e.g., AWS, GCP, or Azure)
Ensure pipelines are robust, fault-tolerant, and optimized for performance, security, and cost-efficiency
Integrate monitoring, logging, and alerting systems to support ongoing operations and quick issue resolution
Collaborate with software engineers, scientists, data scientists, and other stakeholders to align pipelines with business objectives, enabling long-term scalability and reliability
Model Development (50% of time):
Build, train, and deploy machine learning models for environmental quantification (e.g., digital soil mapping and predicting soil organic carbon stock changes)
Work with time series data from various sources (e.g., satellite imagery, sensor data, historical records) to develop predictive models using techniques like time-series forecasting, geospatial analysis, and deep learning
Perform feature engineering, model evaluation, hyperparameter tuning, and validation to ensure accuracy and generalizability
Integrate ML models into production environments, including API development for real-time predictions and batch processing
Stay abreast of advancements in ML for geospatial and environmental applications, experimenting with new algorithms and tools to improve model performance
General Duties:
Conduct code reviews, write documentation, and mentor junior team members on best practices in data engineering and ML
Troubleshoot and debug issues in both data pipelines and ML systems
Required Qualifications
Bachelor's or Master's degree in Computer Science, Data Science, Machine Learning, Environmental Science, or a related field
3+ years of experience in data engineering, with a proven track record of productionizing prototype code in a startup or fast-paced environment
Strong proficiency in programming languages such as Python (with libraries like Pandas, NumPy, Scikit-learn) and SQL
Experience with ML frameworks (e.g., TensorFlow, PyTorch, XGBoost) and time-series analysis (e.g., Prophet, LSTM networks, PINNs)
Hands-on experience with cloud platforms, containerization, and CI/CD pipelines
Familiarity with geospatial data processing and environmental modeling concepts, particularly in soil science or agriculture
Excellent problem-solving skills, with the ability to handle ambiguous requirements and deliver under tight deadlines
Preferred Skills
Experience in digital soil mapping, carbon stock prediction models, advanced statistics, and/or Bayesian model calibration / inference
Knowledge of big data technologies (e.g., Spark, Kafka) for handling large-scale time-series datasets
Background in DevOps practices and infrastructure as code (e.g., Terraform)
Passion for sustainability and environmental impact
Benefits
Health Insurance plan with $0 deductible and $0 co-pay
Dental and vision insurance plans
Flexible spending account option
Open Paid Time Off Policy plus 9 paid holidays per year as listed in our Company Handbook
Participation in our 401(k) savings plan
Company-paid Life and AD&D coverage
Educational materials and expenses to support continuing education opportunities
About Grassroots Carbon
Grassroots Carbon is the leading grasslands restoration and soil carbon storage company that partners with landowners to implement and scale regenerative land management practices. In addition to enhancing soil health, promoting biodiversity, and improving water quality, these regenerative practices have tremendous potential to combat climate change by drawing down large quantities of atmospheric CO2 into the soil. Grassroots Carbon is proud to have partnered with ranchers across 1.6 million acres in 21 states to implement practices that restore grasslands, improve bird habitats, build soil health, and drive nature-based soil organic carbon drawdown through healthy soils.
Built on a foundation of scientific rigor, quality, and transparency, Grassroots Carbon has built strong partnerships with Audubon Conservation Ranching, Texas Agricultural Land Trust, Understand Ag, and Colorado State University's Soil Carbon Solutions Center while generating high-quality soil carbon drawdown credits for leading corporations, including Nestle, Microsoft, Shopify, Marathon Oil, H-E-B, Olipop, and Urban Villages, to offset their carbon impact and reach their sustainability goals.
*Grassroots Carbon is proud to be a portfolio company of Soilworks Natural Capital*
About Soilworks Natural Capital
Grassroots Carbon is proud to be a portfolio company of Soilworks Natural Capital, which provides shared services to our fast-growing company. Soilworks is a private equity fund that invests in, incubates, and acquires companies to help accelerate the Regenerative Agriculture movement and is on a mission to prove Regenerative grazing is the most profitable way to ranch. Soilworks' principles include better and healthier food, restoring plant and animal diversity, regenerating soil to store water and carbon, and creating more profitable family farms. Soilworks was launched by the co-founders of Scaleworks, a technology venture equity fund based in San Antonio, TX.
We are proud to foster a workplace free from discrimination. We strongly believe that diversity of experience, perspectives, and background leads to a better environment for our employees and a better experience for our users and our customers. We are an equal-opportunity employer and do not discriminate on the basis of protected characteristics. All candidates will be given the same consideration.
*No visa sponsorship is available for this position*