Johnson Lambert
Johnson Lambert is a leading provider of audit, tax, and advisory services with a specialized focus on the insurance, nonprofit, and employee benefit plan sectors. For 35+ years, we've built a reputation for deep industry knowledge, exceptional client service, and a culture grounded in agility, respect, and trust. We're passionate about serving our clients, growing our firm, and developing our people.
About the Team & Role
You'll join our Business Automation team, a group dedicated to delivering business outcomes through data-centric solutions, as our first data-focused hire. You'll collaborate daily with highly skilled colleagues (process automation, analytics, and domain experts) who don't yet have formal data engineering experience. As a hands-on architect-builder, you'll set direction, establish standards, and deliver value from day one. As our firm continues to grow and our clients' data becomes more complex, we are making a strategic investment in a modern data foundation to unlock new efficiencies, enhance our service delivery, and lay the groundwork for a future where data quality, context, and availability are paramount.
Our mandate: design and implement our next-generation data foundation on AWS, applying modern data lake/lakehouse patterns and open approaches to data layout, governance, and reliability, while staying flexible to evaluate the best tools over time.
Note: Our current needs are batch-first; we are not building near real-time pipelines today.
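To make the layered, open-format layout concrete, here is a minimal Python sketch; the bucket name, dataset, and partition column are hypothetical, and a real pipeline would add cataloging, lineage, and error handling.

```python
# Minimal sketch: promote a batch extract from the raw layer to the
# standardized layer as partitioned Parquet (an open, interoperable format).
# Bucket name, dataset, and partition column are hypothetical; writing to
# s3:// paths additionally requires the s3fs package.
import pandas as pd

LAKE = "s3://jl-data-lake"                      # hypothetical bucket
RAW = f"{LAKE}/raw/claims/2024-06-30.csv"       # as-landed source extract
STANDARDIZED = f"{LAKE}/standardized/claims/"   # typed, partitioned layer

def promote_to_standardized() -> None:
    df = pd.read_csv(RAW, dtype_backend="pyarrow")
    df["load_date"] = "2024-06-30"
    # Partitioning by load date keeps batch reprocessing cheap and targeted.
    df.to_parquet(STANDARDIZED, partition_cols=["load_date"], engine="pyarrow")
```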
What You'll Do
- Own the modern data foundation on AWS: Design secure, scalable, and cost-aware lake/lakehouse patterns using open, interoperable formats and a layered architecture (raw → standardized → curated/analytics-ready).
- Build dependable batch pipelines: Implement ingestion, transformation, validation, and orchestration to move data from source systems to governed, analytics-ready datasets with clear SLAs/SLOs.
- Translate messy files into trusted data: Create robust, repeatable processes to extract and normalize data from Excel (multi-sheet, merged cells, header variations, hidden rows, cross-tab layouts) and PDF documents (including OCR and table extraction), mapping to standardized schemas; see the sketch after this list.
- Integrate key SaaS sources: Ingest data via APIs/exports from business apps (Salesforce, Slack, Tableau, and similar) and keep them in sync on reliable schedules.
- Structure data for AI/ML accessibility: Prepare datasets for analytics, ML, and LLM workloads, e.g., semantic/feature layers, curated text corpora, and vector indexes/databases for LLM retrieval (RAG), with appropriate metadata and access controls.
- Model for the business: Implement pragmatic dimensional/lakehouse models aligned to how our audit, tax, and advisory teams work, especially across insurance, nonprofit, and employee benefit plan domains.
- Raise data quality & trust: Embed tests, data contracts, schema checks, and observability; maintain lineage, documentation, and data dictionaries that non-engineers can use.
- Harden security & governance: Apply AWS identity, access controls, encryption, classification/tagging, and right-sized governance appropriate for client-serving environments, explicitly protecting client data used in AI/ML contexts.
- Automate and templatize: Use infrastructure-as-code and CI/CD to make environments reproducible; publish templates/patterns that teammates can reuse without deep data engineering expertise.
- Enable and mentor: Partner with analysts/automation engineers; run reviews, workshops, and coaching to uplevel the team and make data self-service where practical.
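As a taste of the Excel normalization work described above, here is a minimal Python sketch; the file layout, column names, and target schema are hypothetical, and a production pipeline would add logging, lineage, and quarantine of failing rows.

```python
# Minimal sketch: turn a messy Excel export (banner rows, inconsistent
# headers, merged-cell spill rows) into a standardized, validated dataset.
# File name, sheet name, and schema mapping are hypothetical.
import pandas as pd

COLUMN_MAP = {
    "Policy No": "policy_id",
    "Premium ($)": "premium_usd",
    "Eff Date": "effective_date",
}

def normalize_sheet(path: str, sheet: str) -> pd.DataFrame:
    raw = pd.read_excel(path, sheet_name=sheet, skiprows=2)  # skip banner rows
    df = raw.rename(columns=COLUMN_MAP)[list(COLUMN_MAP.values())]
    df["effective_date"] = pd.to_datetime(df["effective_date"], errors="coerce")
    df["premium_usd"] = pd.to_numeric(df["premium_usd"], errors="coerce")
    df = df.dropna(subset=["policy_id"])  # drop merged-cell spill rows
    # Lightweight data contract: fail loudly rather than ship bad data.
    if not df["premium_usd"].dropna().ge(0).all():
        raise ValueError("negative premium detected; rejecting batch")
    return df
```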
Required Qualifications
- 5-7 years in progressively complex data engineering/data architecture roles.
- Strong experience building on AWS (storage, compute/serverless, identity, orchestration, monitoring) and operating secure, production data workloads.
- Proven success designing and implementing modern lake/lakehouse architectures using open, interoperable approaches (transactional tables, partitioning, governance, performance optimization).
- Expert data wrangling in Python and SQL for structured and semi-structured data (CSV, JSON, Excel), plus practical experience with PDF extraction (OCR, layout detection, table parsing).
- Hands-on experience building and deploying data infrastructure using infrastructure-as-code (e.g., Terraform, AWS CDK), CI/CD practices, and modern data testing/observability tooling; a minimal CDK sketch follows this list.
- Practical experience implementing data governance solutions for cataloging, lineage, and documentation suitable for sensitive, client-service environments.
- Experience with ETL/ELT tools (e.g., Airflow, Spark) and data platforms such as Databricks or Snowflake; we prioritize open approaches and thoughtful tool selection.
- Ability to ingest from SaaS apps (e.g., Salesforce, Slack, Tableau) via APIs/exports and normalize these feeds into curated datasets.
- Comfort as a player-coach and first-of-its-kind hire: setting standards, making build-vs-buy decisions, and delivering under ambiguity.
- Excellent communication skills to translate business requirements into clear technical plans, and vice versa.
- Bachelor's degree in Computer Science or a related field preferred; AWS certifications a plus.
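For the infrastructure-as-code point above, here is a minimal sketch using AWS CDK v2 in Python; the stack and bucket names are hypothetical, and a real deployment would layer in IAM policies, tagging, and monitoring.

```python
# Minimal sketch, assuming AWS CDK v2 (Python): a reproducible, encrypted
# lake bucket defined as infrastructure-as-code. Names are hypothetical.
from aws_cdk import App, Stack, RemovalPolicy
from aws_cdk import aws_s3 as s3
from constructs import Construct

class LakeStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        s3.Bucket(
            self, "CuratedBucket",
            bucket_name="jl-curated-example",  # hypothetical name
            encryption=s3.BucketEncryption.S3_MANAGED,
            block_public_access=s3.BlockPublicAccess.BLOCK_ALL,
            versioned=True,
            removal_policy=RemovalPolicy.RETAIN,  # never auto-delete client data
        )

app = App()
LakeStack(app, "LakeStack")
app.synth()
```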
Nice to Have
- Familiarity with insurance/nonprofit/EBP data (e.g., policy, claims, loss registers; donor/grant; plan/participant).
- Big data technologies (e.g., Hadoop, Kafka), even though our current workloads are batch and not near real-time.
- Experience with LLM-assisted extraction or classification for document normalization (with governance/guardrails); a brief sketch follows this list.
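To illustrate the kind of guarded LLM-assisted extraction this refers to, here is a minimal Python sketch; call_llm is a hypothetical stand-in for whatever model client is chosen, and the schema check is the guardrail that keeps model output from bypassing validation.

```python
# Minimal sketch of LLM-assisted field extraction with a validation guardrail.
# call_llm() is a hypothetical placeholder for a real model client; the point
# is that model output is treated as untrusted until it passes schema checks.
import json

REQUIRED_FIELDS = {"policy_id": (str,), "premium_usd": (int, float)}

def call_llm(prompt: str) -> str:
    raise NotImplementedError("stand-in for your chosen model client")

def extract_fields(document_text: str) -> dict:
    prompt = (
        "Extract policy_id (string) and premium_usd (number) from the "
        "document below. Respond with JSON only.\n\n" + document_text
    )
    candidate = json.loads(call_llm(prompt))
    # Guardrail: reject anything that doesn't match the expected schema.
    for field, expected_types in REQUIRED_FIELDS.items():
        if not isinstance(candidate.get(field), expected_types):
            raise ValueError(f"LLM output failed schema check on {field!r}")
    return candidate
```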
How You'll Succeed (Outcomes & Measures)
First 90 days
- Stand up or harden a secure AWS baseline and an initial lake/lakehouse layout with CI/CD.
- Deliver a production batch pipeline converting one high-value Excel/PDF process into a standardized, validated dataset with documentation and lineage.
By 6 months
- Operationalize 2-3 priority SaaS integrations (e.g., Salesforce, Slack, Tableau) feeding curated layers on a dependable schedule; a sketch of one such pull follows this list.
- Reduce manual prep for target stakeholders by 30-50% through standardized schemas and self-service access.
By 12 months
- Publish reusable ingestion and document-processing templates; establish data quality SLAs/SLOs adopted by multiple teams.
- Demonstrate measurable improvements in reliability, freshness, and adoption across analytics use cases.
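As one illustration of a scheduled SaaS pull, here is a minimal Python sketch against the Salesforce REST query endpoint; the instance URL, token handling, and query are hypothetical, and a production integration would add pagination, retries, and incremental watermarks.

```python
# Minimal sketch: pull records from Salesforce's REST query API for landing
# in the raw layer. Instance URL, auth token, and SOQL query are hypothetical;
# production code would page through nextRecordsUrl, respect rate limits, and
# track an incremental watermark (e.g., on LastModifiedDate).
import os
import pandas as pd
import requests

INSTANCE = "https://example.my.salesforce.com"  # hypothetical instance
SOQL = "SELECT Id, Name, LastModifiedDate FROM Account"

def fetch_accounts() -> pd.DataFrame:
    resp = requests.get(
        f"{INSTANCE}/services/data/v59.0/query",
        params={"q": SOQL},
        headers={"Authorization": f"Bearer {os.environ['SF_TOKEN']}"},
        timeout=30,
    )
    resp.raise_for_status()
    records = resp.json()["records"]
    # Drop Salesforce's per-record metadata before landing as a flat table.
    return pd.DataFrame(records).drop(columns=["attributes"])
```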
How We Work
Our culture prizes agility, respect, and trust. We iterate in short cycles, document what we build, and keep stakeholders close. We choose modern, open, and maintainable solutions and believe governance should enable, not hinder, delivery. For more information on our benefits, please visit https://www.johnsonlambert.com/careers/why-jl/
Equity note: Research suggests that women and Black, Indigenous, and other persons of color are less likely than men or White job seekers to apply for positions unless they are confident they meet 100% of the qualifications. We strongly encourage interested individuals to apply and to allow us to evaluate the knowledge, skills, and abilities you demonstrate, using an internal equity lens.
Johnson Lambert prides itself on the hands-on approach and the relationships we build with future employees, employees, and clients. We believe each application holds the potential for a future relationship with JL. Therefore, a member of our HR team personally reviews every application submitted.
The pay range for this role is:
120,000 - 150,000 USD per year (USA)