Washington Staffing

Site Reliability Engineer

Washington Staffing, Seattle, Washington, US, 98127

Site Reliability Engineer Opportunity

About DAT

DAT is an award-winning employer of choice and a next-generation SaaS technology company that has been at the leading edge of innovation in transportation supply chain logistics for 45 years. We continue to transform the industry year over year, by deploying a suite of software solutions to millions of customers every day - customers who depend on DAT for the most relevant data and most accurate insights to help them make smarter business decisions and run their companies more profitably. We operate the largest marketplace of its kind in North America, with 400 million freights posted in 2022, and a database of $150 billion of annual global shipment market transaction data. Our headquarters are in Denver, CO, and Beaverton, OR, with additional offices in Seattle, WA; Springfield, MO; and Bangalore, India.

The Opportunity

DAT is looking for a Site Reliability Engineer to join our SRE platform team. This position will work hybrid or remote in Seattle, WA.

Candidate Profile

DAT is seeking an experienced Site Reliability Engineer to help grow our SRE practices. In this role, you will be responsible for contributing to technical initiatives and enhancing your skills. You'll work closely with development teams and platform architects to achieve critical reliability goals and help scale our platform.

DAT is actively seeking a highly skilled and experienced Site Reliability Engineer (SRE) to play a pivotal role in the expansion and maturation of our SRE practices. In this critical position, the successful candidate will be instrumental in driving key technical initiatives, fostering a culture of continuous improvement, and significantly enhancing their own professional expertise. This role necessitates close collaboration with various stakeholders, including our dedicated development teams and platform architects. The primary objective of these partnerships is to collectively achieve ambitious reliability goals and strategically scale our platform to meet the evolving growth of the company.

What You'll Do

Contribute to the design, implementation, and maintenance of scalable and reliable systems.
Collaborate with engineering teams to ensure reliability targets are met.
Identify and troubleshoot complex issues across distributed systems, ensuring minimal downtime and optimal performance.
Advocate for and implement SRE best practices, including automation, monitoring, and incident response, to enhance system resilience.
Participate in capacity planning and performance tuning to proactively address potential bottlenecks and support future growth.
Leverage new AI tools to assist with coding and observability tasks.
Assist and respond to critical engineering incidents.
Improve your engineering skills within the SRE team.
Provide technical guidance and best practices for use of cloud infrastructure and tooling.
Contribute to Infrastructure-as-Code within the platform. We strive to automate all the things!
Contribute to reliability-focused initiatives and projects.
Help optimize our work to be customer-focused. Continually seek feedback from our customers on how we can improve.
Assist in migrating legacy systems to modern, scalable cloud environments.
Help develop and drive a culture of continuous improvement with the platform engineering and software engineering groups.
Participate in an on-call rotation.

The Skills and Experience You'll Bring

Strong collaboration skills