General Motors
Cloud Platform Software Engineer, Site Reliability Engineer
General Motors, Austin, Texas, US, 78716
The Role:
At General Motors, the Cloud Platform team is key to enabling our organizations to safely and efficiently adopt the cloud. Our team builds secure, reliable, and cost-effective services that enable our peers to confidently deliver products; we ensure that best practices are baked in and effortless. As Site Reliability Engineers, we also use our expertise to contribute to General Motors’ transition to a reliability‑first software organization by collaborating with our peers to deliver Production Readiness Reviews, SLOs, and Incident Reviews across teams. By joining us, you will have a rare opportunity to contribute to building a multi‑cloud Platform as a Service that enables millions of General Motors customers to interact with their vehicles and services.
This is an SRE‑flavored software engineering role, and we are looking for individuals who are passionate about building reliable, observable services and understanding best practices for running services in the cloud. The Cloud Platform team owns our platform end to end, delivering new features, supporting our peers, and ensuring we meet our SLAs.
As an engineer, you will participate in a scrum team to deliver high-quality software to production. You’ll design features, work with peer teams to help them use our services effectively, and take part in our 24/7 support process.
What You’ll Do:
Developing and maintaining automation tools and infrastructure to streamline software deployment, configuration management, and system monitoring.
Monitoring the performance and availability of software systems, identifying and resolving issues, and implementing proactive measures to prevent future incidents.
Responding to incidents, conducting root cause analysis, participating in post‑incident review, and implementing corrective actions to prevent similar incidents in the future.
Collaborating with software development teams to ensure that reliability and scalability considerations are incorporated into the software design and implementation.
Identifying opportunities for process improvement, implementing best practices, and driving initiatives to enhance the reliability and performance of software systems.
Your Skills & Abilities (Required Qualifications)
Bachelor’s degree in Computer Science or a related field, or equivalent work experience.
Proficiency in at least one programming language (e.g., Python, Go, Java) and familiarity with multiple language ecosystems.
Hands‑on experience with Cloud platforms.
Demonstrated ability to clearly communicate technical and non‑technical information verbally and in writing.
Ability to resolve issues and complete tasks effectively in a team environment.
What Will Give You A Competitive Edge (Preferred Qualifications)
Experience with Git/source code management, CI/CD development, and open‑source development.
Experience with event‑driven architectures or services such as Kafka.
Hands‑on experience with Infrastructure as Code tools like Terraform, Terragrunt, Azure Resource Manager (ARM) templates, YAML pipelines, or Bicep.
Working knowledge of AWS and Azure services such as Event Hubs or AKS/EKS.
Experience with observability using OpenTelemetry, Prometheus, or services such as Datadog.
Kubernetes experience, including app deployment, service meshes such as Istio or Consul, and networking.
Knowledge of Cloud networking (e.g., VPC, VNet, Subnet, DNS, Load Balancer), including troubleshooting and diagnostics.
Knowledge of Azure Security, Cryptography, Vault integration and TLS certificates.