Diversity Resource Staffing, Inc.
Senior Site Reliability Engineer
Diversity Resource Staffing, Inc., Sandy Springs, Georgia, United States
This is an exciting opportunity for a Senior Site Reliability Engineer on the Consumer SRE Team in the IMT division, providing secure, resilient, scalable and maintainable services for mortgage borrowers and lenders. IMT is a division of our client based in Atlanta, which operates numerous financial and commodity marketplaces and exchanges, including the New York Stock Exchange (NYSE).
Automation is a big part of what we do: we use infrastructure-as-code within our hybrid cloud to bring stability and scalability to Windows, Linux, Docker and Serverless applications in AWS, On-Prem and Azure environments. We reduce toil through scripting and automation of repetitive tasks. You will collaborate with Developers to deliver robust services, build actionable alerts that detect or prevent incidents and surface performance bottlenecks, and build automation to remediate issues.
Responsibilities
Employ deep troubleshooting skills to improve the availability, performance, and security of Ellie Mae Services
Ensure services are designed with 24/7 availability and operational readiness and rigor
Implement proactive monitoring, alerting, trend analysis and self-healing systems
Define and measure KPIs and SLOs
Build automated deployments, automated tests, and operational tools
Participate in on-call rotation for Production support
Collaborate with Product and Support teams to plan and deploy product releases
Partner with other SREs and lead by example
Knowledge and Experience
10+ years of Application/Systems engineering in 24x7 Production Services environments
BS in Computer Science, Computer Engineering, Math, or equivalent professional experience
Excellent troubleshooter, utilizing a systematic problem-solving approach
Demonstrated ability to lead Incident Response and root cause analysis (RCA)
Fluency with one or more current-generation scripting languages used by SRE/DevOps professionals (PowerShell, Python, Perl, PHP, Ruby), plus Java/.NET development
Experience running a SaaS application in a public cloud, on-prem or hybrid cloud environment
Additional credit for:
Proficiency in Windows and on-prem environments
Experience with Continuous Integration and Continuous Delivery concepts
Automation in RunDeck or Jenkins
Infrastructure-as-code or Configuration Management, utilizing tools like Terraform, CloudFormation or Chef/SaltStack/Puppet/DSC
Containers/Docker/Micro-Services