TikTok
Site Reliability Engineer, Cloud Infrastructure - USDS
TikTok, Seattle, Washington, US, 98127
Responsibilities
Team Intro: The Systems and Networking team ensures seamless operation of TikTok's US physical infrastructure. We provision physical servers, maintain the TikTok US physical network, and collaborate with vendors such as OCI and Akamai to manage physical hardware and networks and to uphold assurance and compliance objectives.
We also work closely with global colleagues to build and support platforms within our US region, including internal platforms that support daily operations. Our primary goal is to ensure the uninterrupted functionality of TikTok's US physical infrastructure, enabling other internal middleware teams (Product, e‑Commerce, Ads/Monetization, etc.) while strictly adhering to compliance standards.
Hybrid work schedule: employees work in the office 3 days a week or as directed by manager/department.
Drive infrastructure automation and tooling: Design, develop, and maintain solutions for efficient operation, optimization, and comprehensive monitoring of global infrastructure, minimizing manual intervention.
Collaborate on service lifecycle management: Partner with engineering teams to design, deploy, operate, and continuously improve robust and scalable systems and services, from inception to refinement.
Ensure service reliability and performance: Proactively monitor system health, conduct performance testing, and manage incidents to maximize uptime, availability, and adherence to defined SLAs/SLOs.
Execute core SRE practices: Perform on‑call duties and production operations, including change management, capacity planning, and disaster recovery, while contributing to documentation and process improvements across teams.
Qualifications
Minimum: Proficiency in one or more programming languages (Python, Go, Java, C++). Strong understanding of Linux and open‑source technologies. Experience in network architecture, troubleshooting, database modelling, cloud systems, and large‑scale distributed systems. Knowledge of monitoring tools and methodologies (Prometheus, Grafana), AIOps, APM, and disaster recovery. Experience designing, analysing, and building automation and tools for large‑scale systems. Experience building solutions with AWS, GCP, Azure, and other cloud services.
Preferred: Expertise in Kubernetes, Elasticsearch, ClickHouse, message queues, OpenTSDB, service mesh, MySQL, Redis, etc. Master’s degree in Computer Science, Engineering, or a related field.
As a condition of employment, all successful candidates must establish authorization to work in the United States. The Company does not provide sponsorship for immigration‑related benefits.
About USDS: TikTok is the leading destination for short‑form mobile video. Our mission is to inspire creativity and bring joy. U.S. Data Security (“USDS”) is a subsidiary of TikTok in the U.S. This new, security‑first division focuses on heightened governance of data protection policies and content assurance protocols to keep U.S. users safe. The teams within USDS that deliver on this commitment span Trust & Safety, Security & Privacy, Engineering, User & Product Ops, Corporate Functions, and more.
Why Join Us: Inspiring creativity is at the core of TikTok’s mission. Our innovative product helps people authentically express themselves, discover, and connect. Our global, diverse teams make that possible. We lead with curiosity, humility, and a desire to make an impact in a rapidly growing tech company. Every challenge is an opportunity to learn and innovate as one team. Join us.
Diversity & Inclusion: TikTok is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. We celebrate diverse voices and create an environment reflecting the communities we reach.
USDS Reasonable Accommodation: USDS is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs, or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out at https://tinyurl.com/USDS-RA.
Job Information: Compensation range: $129,960 – $246,240 annually. Benefits include medical, dental, and vision insurance, 401(k) with company match, paid parental leave, disability coverage, life insurance, wellbeing benefits, 10 paid holidays, 10 paid sick days, and 17 days of paid personal time (prorated upon hire).
Seniority Level: Mid‑Senior level
Employment Type: Full‑time
Job Function: Engineering and Information Technology
Industries: Software Development