Walmart

Distinguished, Software Engineer - Observability

Walmart, Sunnyvale, California, United States, 94087

Position Summary

As an Observability Distinguished Engineer, you will be a key researcher and lead technical expert in the architecture and development of cloud native observability designs, managed services, and real-time telemetry software systems. You will use your depth of engineering experience to create visionary software architectures and telemetry systems that make up an observability software product portfolio. Additionally, you will design, develop, and implement large-scale distributed systems that process large volumes of data, focusing on scalability, latency, and fault tolerance in every system built. You must be able to communicate effectively and build collaboration across all areas and levels of the business and engineering. An ideal candidate will be adept at architecting large-scale distributed systems and proficient in Java, with experience in socializing architectural designs and roadmaps with internal and external customers. To achieve software solutions and designs, you will utilize multiple telemetry technologies such as data models, metric libraries, data logging, distributed tracing, datalakes, data correlation, rule-based alerting engines, real-time data streaming pipelines, time series databases (TSDBs), and application performance management (APM). While working in a cloud infrastructure ecosystem consisting of VMs, Kubernetes, and containers, you will create metric software designs and solutions enabling real-time monitoring and alerting of system and application metrics. You will lead research initiatives for cloud native designs and implementation within public and private clouds. You will also utilize TSDBs and the correlation and fusion of multiple data types and heterogeneous data streams, coupled with artificial intelligence (AI) and learned behaviors, for anomaly detection and forward projections of expected system and application behaviors.
This role will involve collaboration with enterprise architects, product managers, data scientists, engineers, and business managers to bring telemetry R&D projects into production. To achieve this, you will use a combination of open-source and COTS technologies to solve real-time telemetry problems at an enterprise-wide scale. In parallel, you will lead the design of new systems and the redesign of existing systems to meet business requirements, changing needs, and the integration of state-of-the-art technology. You will be an evangelist for the Observability foundation, socializing technology designs and implementations with engineering and business customers.

Location:

Open to Sunnyvale, CA; Seattle, WA; and Bentonville, AR

Minimum Qualifications

- BS/MS in Computer Science, Engineering, or equivalent, with 15+ years in software engineering, design, and architecture
- Deep understanding of the Java language and associated frameworks, and previous development of Java applications, libraries, SDKs, or services
- Strong architecture leadership with demonstrated enterprise-level software implementations
- Demonstrated architectural leadership in research, evaluation, creation of software designs, and distributed software implementations in production
- Experience with technical leadership, software roadmaps, research and development, new software initiatives, and customer and engineering coordination and engagement
- Full-stack cloud software development experience
- Experience with the following:
  - API development, integration, and utilization
  - Cloud technologies and cloud native designs
  - Cloud infrastructures and technologies, such as OpenStack, Azure, GCP, or AWS
  - Large-scale distributed systems, including scalability and fault tolerance
  - TSDBs (InfluxDB, Kairos, Cortex, Thanos, Prometheus) or equivalent
  - Extract, transform, and load (ETL) processes
  - Real-time telemetry pipelines and publish/subscribe models (Kafka or equivalent)
  - Data warehousing, datalakes, processing, and data analytics
  - SQL (AzureSQL, Postgres, or equivalent)
  - Unix/Linux shell scripting or similar programming/scripting knowledge
  - Real-time monitoring and alerting: metric agents, real-time dashboards, alerting rules
- Excellent written and verbal communication skills for diverse audiences based on engineering subject matter
- Ability to document requirements, architectural designs, and analysis findings in both business and technical terminology
- Software development in an Agile, iterative CI/CD development environment
- Promote and support company policies, procedures, mission, values, and standards of ethics and integrity

Preferred Qualifications

- Knowledge and/or use of agentic AI: Model Context Protocol (MCP) servers, large language models (LLMs), retrieval-augmented generation (RAG), natural language processing (NLP)
- Fluency in Python, JavaScript, and advanced shell scripting
- Configuration management: Ansible, Chef, Puppet
- Experience with the following:
  - Application performance monitoring (APM) and/or distributed tracing
  - Deployment of Kubernetes, containers, service meshes, and microservices
  - Microservices architectures, Istio, and Micrometer
  - OpenTelemetry standards and protocols
  - Go development
  - Observability tools and system architectures
- Experience in creating and maintaining managed metric services
- NoSQL (Cassandra, CosmosDB, or equivalent)
- Storm, Spark, or similar real-time streaming software
- Knowledge of UI development (JavaScript, HTML, CSS) and experience with frameworks like React and AngularJS
- Involvement in and contribution to open-source software communities
- Demonstrated background in developing software systems

Primary Location

Location: 1345 Crossman Ave, Sunnyvale, CA 94089-1114, United States of America

At Walmart, we offer competitive pay as well as performance-based bonus awards and other great benefits for a happier mind, body, and wallet. Health benefits include medical, vision, and dental coverage. Financial benefits include 401(k), stock purchase, and company-paid life insurance. Paid time off benefits include PTO (including sick leave), parental leave, family care leave, bereavement, jury duty, and voting. Other benefits include short-term and long-term disability, company discounts, Military Leave Pay, adoption and surrogacy expense reimbursement, and more. Additional compensation includes annual or quarterly performance bonuses. Additional compensation for certain positions may also include stock. For information about PTO, see One.Walmart.

These statements are intended to describe the general nature and level of work being performed by employees assigned to this job. They are not intended to be construed as an exhaustive list of all responsibilities, duties, and qualifications.
