InfoVista S.A
Senior Infrastructure & Cloud Platform Engineer
InfoVista S.A, Lowell, Massachusetts, United States, 01856
The Role & Team
The successful candidate will manage and oversee the InfoVista platform infrastructure, making sure it runs smoothly and efficiently. The Infrastructure Team is charged with the development, design, and implementation of new strategies to increase the performance and reliability of the cloud infrastructure.
Responsibilities
Monitor cloud and on-prem infrastructure for errors or problems and resolve them in a timely manner.
Work with Development and Product to design and implement strategies to increase performance, reliability, and scalability of the infrastructure.
Identify single points of failure in platform design and make cost-effective recommendations for remediation.
Stay up to date with the latest technologies and advancements in cloud computing and infrastructure operations to improve resiliency and security.
Develop, document, and enforce policies, standards, and procedures for cloud and infrastructure maintenance, change management, and security.
Participate in organizational Change Management activities, including risk assessments, change approval reviews, and post-change validation.
Ensure timely patching and lifecycle management across hardware, operating systems, virtualization platforms, and cloud resources, following security and compliance requirements.
Collaborate with Security and Compliance teams to remediate vulnerabilities outside normal patch cycles, including emergency fixes, configuration changes, and compensating controls.
Manage cloud-based data backup and disaster recovery procedures to ensure business continuity.
Maintain and support physical server infrastructure (Dell PowerEdge and related hardware), ensuring firmware, drivers, and hardware components are kept current.
Document and maintain infrastructure architecture diagrams, configuration details, SOPs, and runbooks.
Ensure all knowledge is captured in Atlassian (Jira/Confluence) and ITGlue.
Manage vendor relationships and contracts related to infrastructure and cloud services, ensuring SLA adherence and effective escalation.
Optimize the telephony carrier network for cost effectiveness, capacity, flexibility, and resiliency.
Create and manage budgets for cloud and infrastructure, recommending cost savings where appropriate.
Contribute to monitoring, alerting, and observability practices using tools such as Grafana, Zabbix, Site24x7, or similar, with the goal of reducing MTTD/MTTR.
Serve as a point of contact and provide technical support and guidance to other employees on infrastructure-related issues.
Actively participate in incident response, root cause analysis, and post-incident reviews; ensure lessons learned feed into continuous improvement.
Contribute to capacity planning and forecasting to anticipate future growth and resource needs.
Mentor and delegate work to IT staff; provide training on infrastructure tools, processes, and best practices.
Maintain overall accountability for the performance, availability, and security of the cloud and infrastructure platform.
Must be local to NH or MA.
Hands-on engineer who can also bring a strategic mindset.
Ability to prioritize effectively and recognize opportunities to delegate.
Collaborative mindset in working with product, engineering, IT, and security teams.
Requirements
Experience with Active Directory and Windows/Linux system administration.
Strong knowledge of VMware technologies (ESXi, vCenter, vSphere) for virtualization and datacenter management.
Experience with Nutanix Enterprise Cloud (AHV, Prism, cluster management) for virtualization and hyperconverged infrastructure.
Hands-on experience with Dell PowerEdge server hardware, including installation, firmware updates, lifecycle management, and integration with Nutanix/VMware environments.
Advanced understanding of Microsoft SQL.
Experience with automation technologies (Terraform, Ansible, or similar).
Hands-on scripting experience (Python, PowerShell, or Bash) to support automation and integration.
Network engineering experience (routing, switching, firewalls, VPN/IPsec).
Knowledge of cloud technologies (AWS, Azure) and hybrid integration.
Experience with telephony technologies (SIP, DID).
Experience with patch management tools and processes (WSUS, SCCM, Ansible, or equivalent) for both Windows and Linux environments.
Familiarity with vulnerability management tools and ability to work with security teams on remediation.
Experience with Atlassian tools (Jira, Confluence) and ITGlue or equivalent knowledge/documentation platforms.
Familiarity with monitoring and observability tools (Grafana, Zabbix, Site24x7, or similar).
Understanding of ITIL practices in Change, Incident, and Problem Management.
Exposure to compliance frameworks (ISO 27001, SOC 2, NIST) is a plus.
Excellent documentation and communication skills; ability to explain technical issues in business terms.
24/7 on-call availability.