TCS USAAvance Consulting

OpenShift Administrator (with Linux administration, Automation & DevOps Expertise)

TCS USAAvance Consulting, Dallas, Texas, United States, 75215

Skill: OpenShift Administrator (with Linux administration, Automation & DevOps Expertise)

Job Summary: We are seeking a skilled OpenShift Administrator with a strong background in Linux system administration, scripting, automation, and DevOps practices.

The ideal candidate will be responsible for maintaining and supporting Red Hat OpenShift clusters in production environments, providing 24x7 support, performing incident triage, and participating in release activities.

You will work closely with cross-functional teams to ensure system reliability, scalability, and performance.

5+ years of hands-on experience with Red Hat OpenShift (v4.x preferred).

Strong proficiency in Linux (Red Hat Linux, CoreOS) system administration.

Solid experience with shell scripting (Bash) and/or Python.

Knowledge of DevOps tools such as Jenkins, Git, Ansible, Helm, Docker, and GitOps workflows.

Understanding of container orchestration, pod scheduling, and cluster-level debugging.

Experience working in production support environments with on-call responsibilities.

Familiarity with monitoring and logging solutions such as Prometheus, Grafana, ELK, and Fluentd.

Good problem-solving skills and experience in incident triage and root cause analysis.

Nice to Have: Red Hat Certified Specialist in OpenShift Administration or equivalent.

Certified Kubernetes Administrator (CKA).

Knowledge of Kubernetes, Istio, or service mesh technologies.

Experience with the Azure cloud platform.

Experience with ArgoCD workflows.

Roles & Responsibilities: Administer and maintain Red Hat OpenShift clusters in production and non-production environments.

Upgrade clusters.

Perform Linux system administration, troubleshooting, and patching.

Monitor cluster health, manage node scalability and resource optimization, and ensure cluster availability (no downtime).

Troubleshoot and resolve issues related to infrastructure, network, pods, containers, and services.

Participate in 24x7 production support (on-call rotation), handle critical incidents, and ensure minimal platform downtime.

Develop and maintain shell/Python scripts for automation of routine tasks and monitoring.

Implement and support CI/CD pipelines using tools and practices such as Jenkins, GitOps, or ArgoCD.

Collaborate with development and release teams to support deployment and release activities.

Create and maintain documentation for systems, processes, and standard operating procedures (SOPs).

Ensure compliance with security, backup, and disaster recovery policies.

Monitor the platform using Dynatrace.