Caterpillar Inc.

Principal AI Data Scientist / Engineer

Caterpillar Inc., Irving, Texas, United States, 75084

Overview

Principal AI Data Scientist / Engineer, Accounting Systems at Caterpillar Inc. Location: Irving, TX or Peoria, IL. Travel: less than 10%. This position may require onsite work five days a week. Sponsorship is NOT available. Domestic relocation is available for those who qualify.

What You Will Do

IT Architecture Experience: Leverage end-to-end architecture for a multi-sourced data platform, evaluating scalability, performance, and resilience.

AI-Driven Entity Resolution: Develop and implement entity resolution strategies using Large Language Models (LLMs) and Graph Databases to map, reconcile, and standardize accounting data across sources.

Advanced RAG Implementation: Architect and deploy production-grade Retrieval-Augmented Generation pipelines for data interpretation and standardization, including managing vector databases and optimizing prompt/context engineering for high accuracy.

Performance Optimization: Understand performance SLAs; utilize OLAP databases (e.g., DuckDB, ClickHouse) and in-memory/column stores (e.g., Redis) for rapid analytics and low-latency access.

Cloud Infrastructure and Deployment: Collaborate on cloud deployment strategy (AWS/Azure), containerization (Docker), and orchestration (Kubernetes) for robust, scalable, observable deployments.

Cross-Functional Collaboration: Work with Accounting, ERP knowledge owners, IT, MDM, and Data Quality teams to translate complex accounting requirements into scalable, automated technical solutions.

What You Will Need

Self-starting, high-accountability, execution-focused mindset with strong initiative and communication skills.

Core Engineering and Architecture: Proficiency in Python (AI/ML) and a high-performance language (Go, Java, or C#/.NET) for backend services; extensive AWS or Azure experience.

Containerization and Orchestration: Docker and Kubernetes in production; experience with Kafka or similar messaging systems (Kinesis, RabbitMQ).

Advanced AI/LLM Deployment: Experience deploying LLMs in production for data-centric tasks; building RAG pipelines; managing vector databases (e.g., Pinecone, Weaviate, PGVector); advanced prompt/context engineering; experience with PyTorch, TensorFlow, and HuggingFace; familiarity with agentic frameworks (e.g., LangGraph, AutoGen).

Modern Data Stack and Databases: Entity resolution at scale; graph databases (Neo4j, AWS Neptune, CosmosDB) for ER/MDM; Snowflake; fast analytics with DuckDB; Redis for in-memory caches.

Domain Knowledge: Understanding of corporate accounting principles, consolidation processes, ERP data structures (e.g., SAP, Oracle), and accounting data nuances.

Top Candidates Will Also Have: AWS/Azure architecture certifications; experience with IaC tools (e.g., Terraform); a track record in Fortune 500 environments.

What You Will Get

Rewards package on day one, including medical, dental, vision, 401(k), and potential annual bonus; paid vacation and holidays.

Inclusive statement: All qualified individuals, including minorities, females, veterans, and individuals with disabilities, are encouraged to apply.

Additional Information

Posting dates: September 3, 2025 – September 17, 2025. This position is located in Irving, TX or Peoria, IL.

No direct reports. Domestic relocation is available for those who qualify. Sponsorship is NOT available; this employer is not currently hiring foreign national applicants requiring sponsorship. Caterpillar is an Equal Opportunity Employer, including Veterans and Individuals with Disabilities. Qualified applicants of any age are encouraged to apply.

About Caterpillar

Caterpillar Inc. is a leading manufacturer of construction and mining equipment, engines, turbines, and locomotives. We are committed to a reduced-carbon future and helping customers build a better world.
