Gotham Technology Group

Data Engineer (New York)

Gotham Technology Group, New York, New York, United States, 10261


Our client is seeking a Data Engineer with hands-on experience in web scraping technologies to help build and scale a new scraping capability within their Data Engineering team. The role works directly with Technology, Operations, and Compliance to source, structure, and deliver alternative data from websites, APIs, files, and internal systems. This is a unique opportunity to shape a new service offering and grow into a senior engineering role as the platform evolves.

Responsibilities

- Develop scalable web scraping solutions using AI-assisted tools, Python frameworks, and modern scraping libraries.
- Manage the full lifecycle of scraping requests, including intake, feasibility assessment, site access evaluation, extraction approach, data storage, validation, entitlement, and ongoing monitoring.
- Coordinate with Compliance to review Terms of Use, secure approvals, and ensure all scrapes adhere to regulatory and internal policy guidelines.
- Build and support AWS-based data pipelines using tools such as Cron, Glue, EventBridge, Lambda, Python ETL, and Redshift.
- Normalize and standardize raw, vendor, and internal datasets for consistent consumption across the firm.
- Implement data quality checks and monitoring to ensure the reliability, historical continuity, and operational stability of scraped datasets.
- Provide operational support: troubleshoot issues, respond to inquiries about scrape behavior or data anomalies, and maintain strong communication with users.
- Promote data engineering best practices, including automation, documentation, repeatable workflows, and scalable design patterns.

Required Qualifications

- Bachelor's degree in Computer Science, Engineering, Mathematics, or a related field.
- 2-5 years of experience in a similar Data Engineering or Web Scraping role.
- Capital markets knowledge with familiarity across asset classes and experience supporting trading systems.
- Strong hands-on experience with AWS services (S3, Lambda, EventBridge, Cron, Glue, Redshift).
- Proficiency with modern web scraping frameworks (Scrapy, BeautifulSoup, Selenium, Playwright).
- Strong Python programming skills and experience with SQL and NoSQL databases.
- Familiarity with market data and time series datasets (Bloomberg, Refinitiv) is a plus.
- Experience with DevOps/IaC tooling such as Terraform or CloudFormation is desirable.