CurvePoint

Data Engineer (IoT) (Pittsburgh)

CurvePoint, Pittsburgh, Pennsylvania, United States, 15289


Job Summary

As an IoT Data Engineer at CurvePoint, you will design, build, and optimize the data pipelines that power our Wi-AI sensing platform. Your work will focus on reliable, low-latency data acquisition from constrained on-prem IoT devices, efficient buffering and streaming, and scalable cloud-based storage and training workflows. You will own how raw sensor data (e.g., wireless CSI, video, metadata) moves from edge devices with limited disk and compute into durable, well-structured datasets used for model training, evaluation, and auditability. You will work closely with hardware, ML, and infrastructure teams to ensure our data systems are fast, resilient, and cost-efficient at scale.

Duties and Responsibilities

Edge & On-Prem Data Acquisition
- Design and improve data capture pipelines on constrained IoT devices and host servers (limited disk, intermittent connectivity, real-time constraints).
- Implement buffering, compression, batching, and backpressure strategies to prevent data loss.
- Optimize data transfer from edge to on-prem host to cloud.

Streaming & Ingestion Pipelines
- Build and maintain streaming or near-real-time ingestion pipelines for sensor data (e.g., CSI, video, logs, metadata).
- Ensure data integrity, ordering, and recoverability across failures.
- Design mechanisms for replay, partial re-ingestion, and audit trails.

Cloud Data Pipelines & Storage
- Own cloud-side ingestion, storage layout, and lifecycle policies for large time-series datasets.
- Balance cost, durability, and performance across hot, warm, and cold storage tiers.
- Implement data versioning and dataset lineage to support model training and reproducibility.

Training Data Enablement
- Structure datasets to support efficient downstream ML training, evaluation, and experimentation.
- Work closely with ML engineers to align data formats, schemas, and sampling strategies with training needs.
- Build tooling for dataset slicing, filtering, and validation.

Reliability & Observability
- Add monitoring, metrics, and alerts around data freshness, drop rates, and pipeline health.
- Debug pipeline failures across edge, on-prem, and cloud environments.
- Continuously improve system robustness under real-world operating conditions.

Cross-Functional Collaboration
- Partner with hardware engineers to understand sensor behavior and constraints.
- Collaborate with ML engineers to adapt pipelines as model and data requirements evolve.
- Contribute to architectural decisions as the platform scales from pilots to production deployments.

Must Haves

- Bachelor's degree in Computer Science, Electrical Engineering, or a related field (or equivalent experience).
- 3+ years of experience as a Data Engineer or Backend Engineer working with production data pipelines.
- Strong Python skills; experience building reliable data processing systems.
- Hands-on experience with streaming or near-real-time data ingestion (e.g., Kafka, Kinesis, MQTT, custom TCP/UDP pipelines).
- Experience working with on-prem systems or edge/IoT devices, including disk, bandwidth, or compute constraints.
- Familiarity with cloud storage and data lifecycle management (e.g., S3-like object stores).
- Strong debugging skills across distributed systems.
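The streaming-ingestion requirement above implies integrity checks such as ordering and gap detection, which drive the replay and re-ingestion mechanisms listed in the duties. A minimal, hypothetical sketch of such a check over sequence-numbered records:

```python
class GapDetector:
    """Tracks missing or out-of-order sequence numbers in a record
    stream, recording gaps so the missing ranges can later be
    replayed. Illustrative sketch only, not CurvePoint's code."""

    def __init__(self):
        self._next = 0
        self.gaps = []        # (first_missing, next_seen) pairs
        self.out_of_order = 0

    def observe(self, seq):
        if seq == self._next:
            self._next += 1
        elif seq > self._next:
            # Records were skipped; remember the range for replay.
            self.gaps.append((self._next, seq))
            self._next = seq + 1
        else:
            # Late arrival; count it for pipeline-health metrics.
            self.out_of_order += 1


d = GapDetector()
for s in [0, 1, 2, 5, 6, 4]:
    d.observe(s)
# d.gaps == [(3, 5)]; d.out_of_order == 1
```

Counters like `gaps` and `out_of_order` are exactly the kind of signal the observability duties suggest exporting as drop-rate and freshness metrics.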

Nice to Have

- Experience with IoT or sensor data (RF/CSI, video, audio, industrial telemetry).
- Familiarity with data compression, time-series formats, or binary data handling.
- Experience supporting ML training pipelines or large-scale dataset management.
- Exposure to containerized or GPU-enabled data processing environments.
- Knowledge of data governance, retention, or compliance requirements.

Location

Pittsburgh, PA (hybrid preferred; some on-site work with hardware teams)

Salary

$110,000 – $135,000 / year (depending on experience and depth in streaming + IoT systems)