Energy Jobline ZR
Site Reliability Engineer in Aurora
Energy Jobline ZR, Aurora, Colorado, United States, 80012
Energy Jobline is the largest and fastest growing global Energy Job Board and Energy Hub. We have an audience reach of over 7 million energy professionals, 400,000+ monthly advertised global energy and engineering jobs, and work with the leading energy companies worldwide.
We focus on the Oil & Gas, Renewables, Engineering, Power, and Nuclear markets as well as emerging technologies in EV, Battery, and Fusion. We are committed to ensuring that we offer the most exciting career opportunities from around the world for our jobseekers.
Site Reliability Engineer (Cleared)
$100,000 to $120,000 USD (with up to 10% bonus potential) + Paid Relocation
Denver Metro Area, Colorado
Security Clearance:
Active TS/SCI Clearance is REQUIRED.
Hybrid Remote (2-3 days on site) OR 9/80 work week available
Role Summary and Position Objectives
This critical role applies software engineering principles to operations to build and run large-scale, fault-tolerant systems. You will be responsible for the continuous availability, scalability, and performance of mission-critical platforms used to support security.
This position involves working with highly sensitive and classified information, requiring an Active Top Secret/SCI Security Clearance. A relocation package is available for this position.
Core Responsibilities
System Reliability & Resiliency:
Ensuring the survivability and 24/7 uptime of mission-critical systems through robust design, proactive monitoring, and disaster recovery planning.
Automation and Toil Reduction:
Designing, developing, and deploying automation tools and scripts to eliminate repetitive manual tasks (toil) across system administration, deployment, and configuration.
Infrastructure as Code (IaC):
Developing and maintaining infrastructure using declarative tools (e.g., Terraform, Ansible) to ensure consistency, repeatability, and version control across all environments.
Configuration Management:
Implementing and enforcing best practices for configuration using Policy as Code and Configuration as Code methodologies across large Linux environments.
Monitoring and Observability:
Implementing advanced monitoring, logging, and alerting solutions that detect and resolve system issues based on symptoms, not just outages, and defining key Service Level Indicators (SLIs).
Incident Management:
Serving as a technical leader during production incidents, conducting root cause analysis (RCA), and implementing preventative measures to drive continuous improvement.
Collaboration:
Working closely with Software Development, Cyber Security, and Mission Operations teams across the entire Software Development Lifecycle (SDLC) to ensure services are designed for scalability and reliability.
What Sets You Apart
Clearance & Experience:
An Active TS/SCI Clearance combined with 5+ years of experience in a mission-critical SRE, DevOps, or highly available Systems Engineering role.
Technical Depth:
Expert-level administration and troubleshooting of Linux systems and strong proficiency in scripting (e.g., Python, Bash).
Leadership:
Demonstrated success providing technical leadership, mentoring junior team members, and championing new ideas and SRE/DevOps best practices.
Communication:
Strong presentation, documentation, and communication skills, with proven experience in negotiating technical solutions to meet challenging customer requirements.
Proactive Mindset:
A commitment to ongoing learning and applying technology trends to solve operational challenges, always seeking win-win solutions.
Our Commitment to You
Work/Life Balance:
Flexible schedules, including the option for a 9/80 work schedule (every other Friday off).
Career Growth:
An exciting career path with continuous learning, development, and advanced training opportunities.
Benefits:
Competitive benefits, including 401(k) matching, flex time off, paid parental leave, comprehensive healthcare, health & wellness programs, and more.
If you are interested in applying for this job, please press the Apply button and follow the application process. Energy Jobline wishes you the very best of luck in your next career move.