Jobgether
Linux Systems Administrator (Red Hat), AI/ML
Jobgether, New Bremen, Ohio, United States
We are looking for a senior Linux Systems Administrator (Red Hat) for an AI/ML role based in Germany. This position focuses on managing and optimizing large-scale Linux infrastructure in production and development environments, ensuring system reliability, security, and performance while collaborating closely with DevOps, network, and software teams. The role offers opportunities to implement automation, streamline processes, and support mission‑critical AI/ML workloads in a fully remote setting with flexible scheduling and overlap with U.S. IST hours.
Accountabilities
Manage, monitor, and maintain Ubuntu and Red Hat Linux servers across production and staging environments.
Perform system upgrades, kernel updates, patch management, and performance tuning to ensure optimal reliability.
Implement and enforce security policies, user access controls, and backup/recovery strategies.
Troubleshoot hardware, OS, and network‑related issues; ensure minimal service disruption.
Maintain configuration management and deployment pipelines using Ansible, Puppet, or similar tools.
Monitor system health, resource utilization, and AI/ML workloads to guarantee uptime and performance.
Collaborate with DevOps, Cloud, and Software teams for environment provisioning and infrastructure scaling (AWS, Azure, or on‑prem).
Participate in capacity planning, disaster recovery, and incident response activities.
Maintain detailed documentation, SOPs, and audit reports to support compliance and operational transparency.
Requirements
5+ years of hands‑on experience in Linux system administration (Ubuntu & Red Hat).
Strong expertise in Bash scripting and automation tools such as Ansible or Terraform, plus basic Python.
Experience with monitoring tools like Nagios, Zabbix, or Prometheus.
Solid understanding of networking fundamentals, including DNS, DHCP, NFS, SSH, and firewalls.
Knowledge of virtualization and containerization technologies (VMware, KVM, Docker, etc.).
Troubleshooting skills for system logs, kernel issues, and service failures.
Familiarity with version control systems (Git) and CI/CD pipeline environments.
Exposure to cloud platforms (AWS, GCP, Azure) is advantageous.
Red Hat Certified System Administrator (RHCSA) or Engineer (RHCE) preferred.
Experience with high‑availability clusters, load balancing, and RAID management is a plus.
Excellent communication, documentation, and coordination skills for supporting global teams.
Strong ownership, accountability, and attention to detail; able to maintain SLAs under pressure.
Benefits
Fully remote work with flexible scheduling, including part‑time or full‑time options.
Competitive compensation aligned with experience and market standards.
Opportunities to work on AI/ML infrastructure projects and cutting‑edge technologies.
Professional growth and skill development in Linux systems, automation, and cloud environments.
Collaborative, high‑performing team environment with global impact.