Selby Jennings
Requirements for This Role:
Experience: 7+ years working with modern data technologies and/or building data-intensive distributed systems.
Programming Skills: Expert-level proficiency in Java/Scala or Python, with a proven ability to write high-quality, maintainable code.
Database and Scripting: Strong knowledge of SQL and Bash.
Cloud Technologies: Experience leveraging and building cloud-native technologies for scalable data processing.
Data Systems: Hands-on experience with both batch and streaming systems, including an understanding of their limitations and challenges.
Data Processing Technologies: Familiarity with a range of technologies such as Flink, Spark, Polars, Dask, etc.
Data Storage Solutions: Knowledge of various storage technologies, including S3, RDBMS, NoSQL, Delta/Iceberg, Cassandra, ClickHouse, Kafka, etc.
Data Formats and Serialization: Experience with multiple data formats and serialization systems such as Arrow, Parquet, Protobuf/gRPC, Avro, Thrift, JSON, etc.
ETL Pipelines: Proven track record of managing complex ETL pipelines using tools like Kubernetes, Argo Workflows, Airflow, Prefect, Dagster, etc.
Schema Governance: Prior experience with schema governance and schema evolution.
Data Quality Control: Experience developing data quality control processes to detect and address data gaps or inaccuracies (see the sketch after this list).
Mentorship: A desire to mentor less experienced team members and promote best practices and high code-quality standards.
Problem-Solving: Strong technical problem-solving abilities.
Agile Environment: Proven ability to work in a fast-paced, agile environment, prioritize multiple tasks and projects, and handle the demands of a trading environment efficiently.
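As a purely illustrative sketch of the data-quality work described above (not part of the posting itself): a minimal pandas check that flags missing business days in a daily dataset. The function name and sample data are hypothetical.

```python
# Illustrative only: detect gaps in a daily time series with pandas.
# check_daily_gaps and the sample frame are hypothetical examples.
import pandas as pd


def check_daily_gaps(df: pd.DataFrame, date_col: str = "date") -> pd.DatetimeIndex:
    """Return the business days missing between the first and last observed date."""
    observed = pd.DatetimeIndex(df[date_col]).normalize()
    expected = pd.bdate_range(observed.min(), observed.max())
    return expected.difference(observed)


if __name__ == "__main__":
    sample = pd.DataFrame(
        {"date": pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-05"])}
    )
    # 2024-01-04 (a Thursday) is absent, so it is reported as a gap.
    print(check_daily_gaps(sample))
```

In practice a check like this would run as a task inside an orchestrated pipeline (Airflow, Dagster, etc.) and alert on any non-empty result rather than print it.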