Smart IT Frame LLC
Responsibilities
Architect and implement a scalable data hub solution on AWS using best practices for data ingestion, transformation, storage, and access control.
Define data models, data lineage, and data quality standards for the DataHub.
Select appropriate AWS services (S3, Glue, Redshift, Athena, Lambda) based on data volume, access patterns, and performance requirements.
Design the solution so that it can accommodate AI/ML applications in the next phase.
Data Ingestion and Integration:
Design and build data pipelines to extract, transform, and load data from various sources (databases, APIs, flat files) into the DataHub using AWS Glue, AWS Batch, or custom ETL processes.
Implement data cleansing and normalization techniques to ensure data quality.
Manage data ingestion schedules and error handling mechanisms.
Data Governance and Access Control:
Establish data access controls and security policies to protect sensitive data within the DataHub using IAM roles and policies.
Develop data governance frameworks including data quality checks, data lineage tracking, and data retention policies.
Data Analytics Enablement:
Create data catalogs and metadata management systems to facilitate data discovery and understanding by business users and data analysts.
Design and implement data views and dashboards using Power BI to enable data exploration and visualization.
Create data warehouses and data marts to meet the needs of the business.
Monitoring and Optimization:
Monitor data pipeline performance, data quality, and system health to identify and resolve issues proactively.
Optimize data storage and processing costs by leveraging AWS cost optimization features.
Data Exchange:
Develop the required governance, security, monitoring, and guardrails to enable efficient data exchange between internal applications and their external vendors, partners, and SaaS providers.
Develop an intake process, SLAs, and usage rules for internal and external dataset producers and consumers.
Required Skills and Experience
AWS Expertise:
Deep understanding of AWS data services including S3, Glue, Redshift, Athena, Lake Formation, Step Functions, CloudWatch, and EventBridge.
Data Modeling:
Proficiency in designing dimensional and snowflake data models for data warehousing and data lakes.
Data Engineering Skills:
Experience with ETL/ELT processes, data cleansing, data transformation, and data quality checks. Experience with Informatica IICS and ICDQ is a plus.
Programming Languages:
Proficiency in Python and SQL, and ideally PySpark, for data processing and manipulation.
Data Governance:
Knowledge of data governance best practices including data classification, access control, and data lineage tracking.
Preferred Qualifications
Experience with data lakehouse architectures and the ability to leverage both structured and unstructured data.
Familiarity with data visualization tools like Tableau or Power BI.
Strong communication and collaboration skills to work with stakeholders across business and technical teams.
AWS certifications related to data analytics and architecture.
Seniority level:
Director
Employment type:
Contract
Job function:
Information Technology
Industries:
Software Development