Caterpillar Inc.
Overview
Principal AI Data Scientist / Engineer, Accounting Systems at Caterpillar Inc.
Location: Irving, TX or Peoria, IL. Travel: less than 10%. This position may require onsite work five days a week. Sponsorship is NOT available. Domestic relocation is available for those who qualify.
What You Will Do
- IT Architecture: Apply end-to-end architecture experience to a multi-sourced data platform, evaluating scalability, performance, and resilience.
- AI-Driven Entity Resolution: Develop and implement entity resolution strategies using Large Language Models (LLMs) and Graph Databases to map, reconcile, and standardize accounting data across sources.
- Advanced RAG Implementation: Architect and deploy production-grade Retrieval-Augmented Generation pipelines for data interpretation and standardization, including managing vector databases and optimizing prompt/context engineering for high accuracy.
- Performance Optimization: Understand performance SLAs; use columnar OLAP databases (e.g., DuckDB, ClickHouse) for rapid analytics and in-memory stores (e.g., Redis) for low-latency access.
- Cloud Infrastructure and Deployment: Collaborate on cloud deployment strategy (AWS/Azure), containerization (Docker), and orchestration (Kubernetes) for robust, scalable, observable deployments.
- Cross-Functional Collaboration: Work with Accounting, ERP knowledge owners, IT, MDM, and Data Quality teams to translate complex accounting requirements into scalable, automated technical solutions.
What You Will Need
- Self-starting, high-accountability, execution-focused mindset with strong initiative and communication skills.
- Core Engineering And Architecture: Proficiency in Python (AI/ML) and a high-performance language (Go, Java, or C#/.NET) for backend services; extensive AWS or Azure experience.
- Containerization & Orchestration: Docker and Kubernetes in production; experience with Kafka or similar messaging systems (Kinesis, RabbitMQ).
- Advanced AI/LLM Deployment: Experience deploying LLMs in production for data-centric tasks; building RAG pipelines; managing vector databases (e.g., Pinecone, Weaviate, pgvector); advanced prompt/context engineering; experience with PyTorch, TensorFlow, and Hugging Face; familiarity with agentic frameworks (e.g., LangGraph, AutoGen).
- Modern Data Stack And Databases: Entity resolution at scale; graph databases (Neo4j, Amazon Neptune, Azure Cosmos DB) for ER/MDM; Snowflake; fast analytics with DuckDB; Redis for in-memory caches.
- Domain Knowledge: Understanding of corporate accounting principles, consolidation processes, ERP data structures (e.g., SAP, Oracle), and accounting data nuances.
- Top Candidates Will Also Have: AWS/Azure architecture certifications; IaC tools (e.g., Terraform); track record in Fortune 500 environments.
What You Will Get
- Rewards package starting on day one, including medical, dental, and vision coverage, a 401(k), and a potential annual bonus, plus paid vacation and holidays.
- Inclusive statement: All qualified individuals including minorities, females, veterans and individuals with disabilities are encouraged to apply.
Additional Information
- Posting dates: September 3, 2025 – September 17, 2025.
- No direct reports.
- Domestic relocation is available for those who qualify.
- Sponsorship is NOT available: this employer is not currently hiring foreign national applicants requiring sponsorship.
- Caterpillar is an Equal Opportunity Employer, including Veterans and Individuals with Disabilities. Qualified applicants of any age are encouraged to apply.
About Caterpillar
Caterpillar Inc. is a leading manufacturer of construction and mining equipment, engines, turbines, and locomotives. We are committed to a reduced-carbon future and helping customers build a better world.