Tata Consultancy Services

System Administrator

Tata Consultancy Services, Jacksonville, Florida, United States, 32290

Cloud Platform Engineer

Roles & Responsibilities

Platform Engineering:
• Application infrastructure provisioning & support
• Building OS images (Golden Image) in cloud infrastructure
• Ensuring reliability, stability & recoverability of the overall IT infrastructure
• Capacity planning to optimize server & application performance
• Management of server space and remotely managed shared storage
• Hardware & software updates, patching & OS upgrades
• Validation & execution of firewall policies; troubleshooting errors
• Backup and restoration, along with automated reporting tasks
• Working on IaC (Infrastructure as Code) for server provisioning via Terraform & GitHub

Monitoring and Alerting:
• Continuously monitor the performance, availability, and health of cloud resources (virtual machines, containers, databases, load balancers, networks, applications).
• Set up and manage comprehensive monitoring tools (e.g., CloudWatch, Azure Monitor) or onboard cloud components to Dynatrace or the Enterprise Monitoring Tool.
• Define and implement alerting mechanisms to proactively identify issues or anomalies and notify the relevant teams.
• Analyze logs, metrics, and traces to gain insights into system behavior.

Incident Management and Troubleshooting:
• Act as the first line of defense for cloud-related incidents, quickly identifying, troubleshooting, and resolving issues.
• Participate in on-call rotations to ensure 24/7 coverage for critical systems.
• Collaborate with development, security, and other IT teams to diagnose and resolve complex problems.
• Document incident resolutions and contribute to post-incident analysis (PIRs/RCAs) to prevent recurrence.

Infrastructure Management:
• Manage the day-to-day operations of cloud infrastructure, including provisioning, deprovisioning, and scaling of resources.
• Perform regular maintenance tasks such as patching, updates, and backups.
• Implement and manage cloud configuration through Terraform to maintain consistency across environments.
• Optimize resource utilization and performance.

Automation and Scripting:
• Develop and implement automation scripts (e.g., Python, PowerShell, Bash) to streamline repetitive operational tasks.
• Automate provisioning, deployment, and scaling of cloud resources using Infrastructure as Code (IaC) tools like Terraform.
• Integrate automation into CI/CD pipelines for faster and more reliable deployments.

Security and Compliance:
• Implement and enforce security best practices within the cloud environment (e.g., IAM, network security groups, encryption).
• Monitor for security vulnerabilities and threats, and respond to security incidents.

Cost Optimization and Financial Management (FinOps):
• Monitor and analyze cloud spending to identify cost-saving opportunities.
• Implement strategies for cost optimization, such as rightsizing resources, utilizing reserved instances, and identifying idle or underutilized resources.
• Generate cost reports and provide recommendations to management for budget optimization.

Cloud Governance:
• Define and enforce cloud governance policies, standards, and procedures for resource usage, security, and cost.
• Develop and maintain documentation for cloud operations processes, runbooks, and best practices.

Collaboration and Communication:
• Work closely with development teams (DevOps), cloud architects, security teams, and other stakeholders to ensure alignment and effective operations.
• Communicate clearly and concisely with both technical and non-technical audiences regarding system status, incidents, and planned changes.
• Participate in cross-functional meetings and discussions to provide operational insights.

Capacity Planning:
• Monitor resource utilization trends and forecast future capacity needs to ensure scalability and avoid performance bottlenecks.
• Plan for scaling up or down based on anticipated demand or business requirements.

Continuous Improvement:
• Proactively identify areas for improvement in cloud operations processes, tools, and infrastructure.
• Implement automation and process enhancements to increase efficiency and reduce manual effort.
• Stay up-to-date with the latest cloud technologies and best practices.

Vendor Management:
• Work with cloud service providers (AWS, Azure, GCP) to resolve complex issues, stay informed about new services, and optimize service utilization.

Salary Range: $120,000-$145,000 a year
