Gilder Search Group
Prediktive - Latin America (LATAM), United States
We are looking for a Senior Data Engineer based in Latin America to work on a long-term project for one of our clients, a nonprofit organization based in New York that offers free, 24/7, confidential mental health support via text message in English and Spanish. The person in this role should have extensive experience building, managing, and optimizing data pipelines, a strong command of Infrastructure as Code (IaC) methodologies, solid technical skills in SQL, Python, Spark, and PySpark, and hands-on experience with AWS and Databricks.

Responsibilities
- Collaborate with other engineering teams, data scientists, and business units to understand data requirements and improve data ingestion and integration processes.
- Collaborate with cross-functional teams to understand data needs and deliver solutions that support business objectives.
- Design, build, and maintain scalable and reliable data pipelines using Infrastructure as Code (IaC) principles and Databricks, along with Datadog.
- Design and implement scalable and secure integration solutions between data sources and our client's data pipeline infrastructure.
- Develop and optimize big data processing using Spark and PySpark within the Databricks environment.
- Write complex SQL queries for extract, transform, and load (ETL) processes.
- Implement automation and monitoring tools to ensure data accuracy and pipeline efficiency.
- Maintain comprehensive documentation for all data engineering processes, including pipeline architectures, codebase, and infrastructure configurations.
- Ensure that best practices and coding standards are followed and documented for future reference.
- Stay current with the latest technologies and methodologies in data engineering and propose improvements to existing processes.

Requirements
- Advanced level of English.
- 6+ years of experience as a Software Engineer, with a focus on Data Engineering.
- 4+ years of experience querying and modeling data using SQL.
- 4+ years of experience working with Python for data analysis, data science, and development (matplotlib, NumPy, pandas).
- Experience building data models for use in dashboarding tools (Looker, Sisense, Power BI, Tableau, or similar).
- Experience with infrastructure-as-code tools such as Terraform.
- Familiarity with distributed computing systems such as Spark (Scala, PySpark).
- Ability to influence strategy and align stakeholders to evolve data use based on research, data, and industry trends.
- Fluency with Git (preferred) or other version control systems and practices.
- Experience with pipeline orchestration tools such as Databricks Workflows or Airflow.
- Experience with dimensional modeling and creating data marts for a variety of stakeholders.
- Experience with monitoring, logging, and other observability practices for maintaining data pipelines in production.
- Familiarity with CI/CD systems.

Bonus Points
- Bachelor's Degree in Computer Science, Systems Engineering, or a related field.
- Knowledge of data governance and compliance (GDPR, HIPAA).
- Knowledge of NLP techniques and commonly used AI/ML models.

What We Offer
- Compensation in USD
- Paid time off
- Cool clients and products
- Work with great engineers