Straddle
ML/Data Analytics Engineer | $120k – $155k base | Denver, CO
Straddle, Denver, Colorado, United States, 80285
We are seeking an ML/Data Analytics Engineer to join our engineering team and take ownership of the data pipelines and machine learning infrastructure that support our fintech platform. In this role, you will be the crucial link between raw data and actionable insights, ensuring that data flows smoothly from our products into analytics dashboards and fraud detection models. You'll build systems that handle everything from aggregating transaction data and customer information to deploying machine learning models that evaluate risk in real time. If you enjoy writing production-quality code as much as wrangling datasets and tuning models, this hybrid role at the intersection of software engineering and data science will be a great fit.
On any given day, you might be writing Python ETL jobs to extract and transform new data sources (for example, pulling in bank transaction logs or user activity events) and orchestrating those jobs with a tool like Apache Airflow or a managed cloud data pipeline service. You'll collaborate with the Data Science Lead to take prototypes of fraud detection or identity scoring models and build the robust, scalable systems needed to run them in production (such as setting up an API endpoint or microservice for real-time scoring). You will also create analytical queries or dashboards to help the team monitor key metrics such as payment success rates, model performance, and user growth trends. This role involves a mix of backend engineering, DevOps for data (managing databases and cloud services), and applied ML engineering.
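To make the orchestration piece concrete, here is a minimal sketch of a daily ETL job written against recent Apache Airflow's TaskFlow API. The DAG name, fields, and stubbed extract/load logic are hypothetical illustrations, not a description of Straddle's actual pipelines.

```python
# Minimal sketch of a daily ETL DAG (recent Airflow, TaskFlow API).
# All names, fields, and the stubbed logic below are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False, tags=["etl"])
def transaction_etl():
    @task
    def extract() -> list[dict]:
        # Pull the previous day's bank transaction logs from an upstream source (stubbed here).
        return [{"txn_id": "t_001", "amount_cents": 1250, "status": "settled"}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Keep only settled transactions; a real job would also normalize and enrich fields.
        return [r for r in rows if r.get("status") == "settled"]

    @task
    def load(rows: list[dict]) -> None:
        # Write the cleaned rows to the warehouse (e.g., via a Postgres or Snowflake hook).
        print(f"loading {len(rows)} rows")

    load(transform(extract()))


transaction_etl()
```

In practice each task would call shared extract/load helpers, and the same pattern extends to user activity events or any other new data source.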
Because we are a small, agile team, the ML/Data Analytics Engineer will have broad responsibilities and plenty of autonomy. You’ll be expected to uphold strong coding standards and deliver reliable systems even as requirements change rapidly. Your contributions will directly influence our ability to make data-driven decisions and deliver intelligent features to customers. This is a full-time position based in Denver, CO with flexibility for remote work. We offer a competitive base salary and equity package. For an engineer who loves data and wants to build something impactful from the ground up, this role provides the opportunity to shape the data foundation of a promising fintech startup.
Key Responsibilities
Design, build, and maintain robust data pipelines that collect and process data from various parts of our system (e.g., user onboarding data, transaction records, external banking data via APIs). Ensure data is ETL'd into appropriate storage (databases, data lakes/warehouses) in a reliable, repeatable way.
Collaborate with data scientists to productionize machine learning models. Rewrite or optimize model code for efficiency, set up REST/GraphQL endpoints or batch processes to serve model predictions (such as fraud risk scores) to the application, and integrate these into the transaction workflow (a minimal serving sketch follows this list).
Implement real-time or near-real-time data processing where required. Set up message queues or streaming systems to handle events like incoming payments or login attempts, feeding them into fraud detection algorithms with low latency.
Write complex SQL queries or use BI tools to enable reporting on key business and product metrics. Develop internal dashboards to surface insights (e.g., daily active users, number of payments processed, fraud alerts triggered) for team members and leadership.
Oversee our databases and data warehouse solutions. Tune database performance, manage schema migrations for new data needs, and ensure secure and compliant handling of sensitive information (encryption, access controls, data retention policies).
Partner with the Data Science Lead and Risk team to understand data requirements and ensure the pipeline meets their needs (e.g., delivering labeled datasets for model training or features for analytics). Work with software engineers to instrument the application code to emit important events and logs for analysis. Assist Customer Success or Product teams by pulling data when ad-hoc analysis is needed.
Implement monitoring for data pipeline jobs and ML services to quickly detect failures or anomalies. Set up alerting for data quality issues (like missing data or pipeline delays) and work to make the system self-healing where possible. Write unit and integration tests for your pipelines and model serving code to maintain a high reliability bar.
Keep up with the latest tools and best practices in data engineering and MLOps. Evaluate and introduce new technologies (like analytics platforms, feature stores, or ML workflow tools) that could enhance our capabilities. Continuously refactor and improve existing data systems for better performance and maintainability.
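As an illustrative sketch of the model-serving responsibility above, below is a minimal real-time fraud-scoring endpoint, assuming a FastAPI microservice wrapping a pickled scikit-learn-style classifier. The model path, feature names, and alert threshold are placeholders, not Straddle's actual API.

```python
# Hypothetical real-time fraud-scoring service: assumes FastAPI plus a pickled
# scikit-learn-style classifier. Path, features, and threshold are illustrative.
import pickle

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Load the serialized model once at startup ("fraud_model.pkl" is a placeholder path).
with open("fraud_model.pkl", "rb") as f:
    model = pickle.load(f)


class Transaction(BaseModel):
    amount_cents: int
    account_age_days: int
    is_new_device: bool


@app.post("/score")
def score(txn: Transaction) -> dict:
    # Build the feature vector in the order the (hypothetical) model was trained on.
    features = [[txn.amount_cents, txn.account_age_days, int(txn.is_new_device)]]
    risk = float(model.predict_proba(features)[0][1])
    # Anything above the illustrative threshold is surfaced as a fraud alert.
    return {"risk_score": risk, "flagged": risk > 0.8}
```

Served with, for example, `uvicorn scoring:app`, the transaction workflow could call `/score` synchronously before approving a payment; batch scoring would follow the same shape as a scheduled job rather than an endpoint.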
Required Qualifications
3+ years of experience as a data engineer, machine learning engineer, or similar role, with strong knowledge of building data pipelines and ETL processes in a production environment.
Proficiency in Python (for data scripts and ML integration) and SQL (for querying and managing data). Familiarity with at least one statically typed language (Java, Scala, Go, etc.) is a bonus. Writing clean, maintainable code is a must.
Experience with relational databases (PostgreSQL, MySQL) and writing complex SQL queries. Familiarity with data warehousing concepts and tools (Snowflake, BigQuery, Redshift) and, ideally, NoSQL data stores for unstructured data.
Solid understanding of how common machine learning models function and are deployed. Comfortable taking a trained model and handling tasks like serialization, versioning, and setting up an API to serve predictions.
Hands-on experience with cloud services (AWS, GCP, Azure), especially those related to data processing (Lambda, Kinesis, S3, Google Cloud Functions, Pub/Sub, BigQuery). Knowledge of containerization (Docker) and CI/CD pipelines for deploying data/ML services.
Ability to use or learn tools for creating dashboards or reports (Tableau, Looker, or Python visualization libraries) to help non-technical team members understand the data. Strong analytical thinking to interpret data trends.
Excellent debugging and problem-solving abilities, especially when dealing with messy data or system issues.
Diligence in verifying data accuracy and consistency, and an understanding of how clean data underpins analytics and model performance.
Good communication skills for working with a cross-functional team, including translating technical information for less-technical stakeholders. Comfortable working in an agile, iterative development process.
Preferred Qualifications
Experience with financial datasets or payment processing systems, including an understanding of transactions, ledgers, or fraud signals in banking data.
Familiarity with streaming frameworks or messaging systems (Kafka, Kinesis, RabbitMQ) and experience building stream processing jobs for real-time analytics.
Experience with MLOps: model deployment workflows, automated retraining, and continuous integration for ML. Familiarity with tools like MLflow, Kubeflow, or Vertex AI.
Knowledge of big data processing frameworks (Spark, Hadoop) and experience optimizing large-scale data jobs for performance and cost efficiency.
Understanding of data security best practices and compliance measures (GDPR, SOC 2), and experience implementing data compliance controls or working with encrypted data.
Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
Benefits
Hybrid and remote options for flexible work location.
Equity ownership in the form of options or RSUs.
Unlimited paid time off.
Comprehensive health, dental, and vision coverage.
Fully paid parental leave (including adoption and foster care).
Professional development budget.
Home office stipend.
Regular team retreats and offsites.