RIT Solutions, Inc.
Position title: Big Data Developer
Location: Charlotte, NC
Onsite: 3 days a week
Contract: 6-24 months to perm
Must have: Hadoop, PySpark, and Kafka
Key Responsibilities:
- Design and implement scalable data ingestion and transformation pipelines using PySpark or Scala, Hadoop, Hive, and Dremio (see the PySpark pipeline sketch after this list).
- Build and manage Kafka batch pipelines for reliable data streaming and integration (see the Kafka batch sketch after this list).
- Work with on-prem Hadoop ecosystems (Cloudera, Hortonworks, MapR) or cloud-native big data platforms.
- Develop and maintain RESTful APIs using Python (FastAPI, Flask, or Django) to expose data and services (see the FastAPI sketch after this list).
- Collaborate with data scientists, ML engineers, and platform teams to ensure seamless data flow and system performance.
- Monitor, troubleshoot, and optimize production data pipelines and services.
- Ensure security, scalability, and reliability across all data engineering components.
- (Optional but valuable) Contribute to the design and deployment of AI-driven RAG systems for enterprise use cases.
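To make the first responsibility concrete, here is a minimal sketch of a batch ingestion-and-transformation job in PySpark; the HDFS path, column names, and the analytics.daily_events Hive table are hypothetical, chosen only for illustration:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = (
    SparkSession.builder
    .appName("daily-ingest")
    .enableHiveSupport()  # allows reading/writing Hive tables
    .getOrCreate()
)

# Ingest: raw JSON landed on HDFS (path is an assumption for illustration)
raw = spark.read.json("hdfs:///data/raw/events/2024-01-01/")

# Transform: deduplicate, derive a partition column, drop malformed rows
events = (
    raw.dropDuplicates(["event_id"])
       .withColumn("event_date", F.to_date("event_ts"))
       .filter(F.col("event_type").isNotNull())
)

# Load: write a partitioned Hive table for downstream consumers (e.g., Dremio)
(events.write
    .mode("overwrite")
    .partitionBy("event_date")
    .saveAsTable("analytics.daily_events"))
```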
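For the Kafka batch responsibility, one common pattern is Spark's bounded (non-streaming) Kafka read, where start and end offsets delimit the batch. A sketch, assuming a broker at localhost:9092, a topic named "orders", and the spark-sql-kafka integration package on the classpath:

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Requires the spark-sql-kafka-0-10 package matching your Spark version
spark = SparkSession.builder.appName("kafka-batch").getOrCreate()

batch = (
    spark.read.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
    .option("subscribe", "orders")                        # assumed topic
    .option("startingOffsets", "earliest")  # bounded batch, not a stream
    .option("endingOffsets", "latest")
    .load()
)

# Kafka delivers key/value as binary; cast to strings before processing
decoded = batch.select(
    F.col("key").cast("string").alias("key"),
    F.col("value").cast("string").alias("value"),
    "topic", "partition", "offset",
)

# Stage the decoded records for downstream pipelines (path is hypothetical)
decoded.write.mode("append").parquet("hdfs:///data/staging/orders/")
```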
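And for the REST API responsibility, a minimal FastAPI sketch of exposing data over HTTP; the /metrics endpoint and the in-memory dictionary are hypothetical stand-ins for a real query layer (e.g., Hive or Dremio behind a driver):

```python
from fastapi import FastAPI, HTTPException

app = FastAPI(title="data-service")

# Hypothetical stand-in for a real data backend
_METRICS = {"daily_events": 1204332, "active_users": 58911}

@app.get("/metrics/{name}")
def get_metric(name: str) -> dict:
    """Return a single named metric, or 404 if it does not exist."""
    if name not in _METRICS:
        raise HTTPException(status_code=404, detail="unknown metric")
    return {"metric": name, "value": _METRICS[name]}

# Run locally with: uvicorn main:app --reload
```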
Required Skills & Qualifications:
- Experience in Big Data Engineering.
- Strong hands-on experience with PySpark or Scala.
- Deep expertise in on-prem Hadoop distributions (Cloudera, Hortonworks, MapR) or cloud-based big data platforms.
- Proficiency in Kafka batch processing, Hive, and Dremio.
- Solid understanding of REST API development using Python frameworks.
- Familiarity with cloud platforms (GCP, AWS, or Azure).
- Experience or exposure to AI and RAG architectures is a plus.
- Excellent problem-solving, communication, and collaboration skills.