Jobs via Dice
Site Reliability Engineer II - CTJ - Top Secret
Jobs via Dice, Redmond, Washington, United States, 98052
Summary
Microsoft is seeking a Software Engineer with software development and online services experience, and a focus on quality, to envision, design, and deliver Office 365 government cloud service offerings. Office 365 combines cloud versions of trusted collaboration products and services, with a strong emphasis on feature quality, security, reliability, availability, and performance for government customers.
Responsibilities
Independently creates, tests, and deploys changes through a safe deployment process (SDP) to enhance code quality and improve the observability, security, reliability, and operability of one or more platforms, systems, or products at scale.
Leverages cloud technology expertise and production telemetry data to suggest changes or automations that improve availability, security, quality, observability, reliability, efficiency, and performance of product components.
Participates in code/design reviews, on-call rotations, and incident responses, collaborating with product engineering teams to improve code and system designs for scalability and reliability.
Driving Operational Excellence
Uses telemetry and AI/ML insights to identify patterns and implement configuration and data changes for production platforms, systems, or products.
Writes code or scripts to automate scalable operations processes (monitoring, alerting, deployments) across components and features of products at scale.
Documents insights and best practices to improve development and operations, and participates in reviews, drills, and post-mortems.
Develops alerts and instrumentation to monitor capacity, security risk, and resource demands; analyzes telemetry to optimize code and resource usage.
Troubleshoots issues using existing tools/models, proposes solutions to prevent recurrence, and communicates resolutions to SRE and product teams.
Monitors performance and resource data to determine if code or resource changes are needed; models effects of changes and drives implementation.
Identifies opportunities to leverage SDP and automation to increase velocity of safe production changes; monitors effects across components.
Responds to incidents during on-call rotations, mitigates impact, deploys fixes, and communicates resolutions through post-mortems and reviews.
Designs, develops, and maintains telemetry pipelines and monitoring tools to track operations metrics such as availability, reliability, performance, and efficiency.
Technical Knowledge and Domain Expertise
Demonstrates expertise in distributed systems, cloud technology layers, and scalable code to improve security, quality, reliability, and operability of products.
Stays aware of industry trends and advances in cloud technology; contributes to new solutions within the team.
Develops domain-specific expertise to improve product availability, security, and performance; participates in onboarding, code/design reviews, and regular meetings.
Additional Responsibilities
Design, develop, and deliver software features to serve and protect Office 365 government clouds.
Identify and reduce issues through design, testing, and software-based solutions.
Collaborate with Engineering and Program Management to translate requirements into architectural designs.
Drive efficiencies through software improvements and root-cause analysis to improve service delivery and scalability.
Qualifications
Required/Minimum Qualifications
Master's Degree in Computer Science, Information Technology, or related field with 1+ year of technical experience in software engineering, network engineering, or systems administration, or Bachelor's Degree in Computer Science, Information Technology, or related field with 2+ years of technical experience in software engineering, network engineering, or systems administration, or equivalent experience.
Security Clearance Requirements
Active TS/SCI with polygraph required; willingness to upgrade to TS/SCI with polygraph as needed.
Ability to meet Microsoft, customer, and government screening requirements.
Clearance verification: the stated security clearance is verified prior to offer.
Microsoft Cloud Background Check: required at hire/transfer and every two years thereafter.
Citizenship verification: U.S. citizenship required due to government customer restrictions; verified via passport or other approved documents.
Additional/Preferred Qualifications
Master's or Bachelor's degree with extended technical experience in software/network/system administration, or equivalent.
2+ years of experience with large-scale cloud or distributed systems.
Base pay range for Site Reliability Engineer IC3 in the U.S. is USD $100,600 - $199,000 per year; location-specific ranges apply (e.g., San Francisco Bay Area / NYC).
Application deadline: Microsoft will accept applications until September 4, 2025.
Microsoft is an equal opportunity employer. All qualified applicants will receive consideration without regard to age, ancestry, color, family or medical care leave, gender identity or expression, genetic information, marital status, medical condition, national origin, disability, political affiliation, protected veteran status, race, religion, sex, or any characteristic protected by law. Reasonable accommodations are available on request.