ZipRecruiter
Job Description:
We are seeking an experienced Cloud Infrastructure Splunk Specialist to join our Datacenter Engineering team. You will play a key role in managing cloud infrastructure, security, automation, billing dashboards, and Splunk analytics. The role requires deep experience with Splunk (Splunk Enterprise and SPL, the Search Processing Language), cloud platforms (AWS, Azure, GCP, OCI), infrastructure-as-code (Terraform, CloudFormation), container orchestration (Kubernetes, Docker, AWS Fargate), scripting (Python, Bash, PowerShell), and monitoring/observability tools (Prometheus, Grafana, CloudWatch, Azure Monitor). Experience with security tooling (Chef InSpec/Automate, TrendMicro Deep Security, CyberArk, Tripwire), CI/CD, and cloud cost optimization (FinOps) is highly desirable.
Responsibilities:
Design, develop, and maintain Splunk dashboards for monitoring and reporting, including creating advanced SPL queries, saved searches, alerts, and visualizations.
Perform trend and data analysis on cloud resources, infrastructure, and billing systems to identify optimization and cost-savings opportunities.
Oversee data collection systems and cloud CLI scripting for automation (AWS CLI, Azure CLI, gcloud, OCI CLI).
Manage Splunk forwarders, indexers, clustering, and ingestion pipelines.
Manage and maintain cloud billing dashboards across AWS, Azure, Oracle Cloud, and Google Cloud, integrating billing APIs and usage reports.
Integrate and monitor billing APIs from multiple cloud providers and implement cost-allocation, tagging, and FinOps best practices.
Collaborate with security teams to ensure infrastructure integrity and compliance (IAM, vulnerability management, SOC2/PCI readiness).
Perform proactive monitoring, troubleshooting, and resolution of system issues using observability tooling (Prometheus, Grafana, ELK, Splunk).
Work closely with cross-functional teams to support cloud automation and containerized workloads, contributing to IaC templates (Terraform/CloudFormation), CI/CD pipelines (Jenkins/GitHub Actions/GitLab CI), and deployment automation.
Provide documentation, runbooks, best practices, and recommendations for infrastructure and monitoring improvements; ensure reproducible and auditable configurations (YAML, JSON).
Ensure continuous improvements in monitoring, automation, and reporting systems, including alert tuning, capacity planning, and performance optimization.
Requirements:
8–10 years of hands-on experience in Wintel administration and enterprise datacenter/cloud operations.
Strong understanding of host-based firewalls, intrusion protection, data integrity, and vulnerability scanners; hands-on experience with EDR/IDS/IPS tools and vulnerability remediation workflows.
Expertise in AWS and Azure security concepts, including IAM, KMS, VPC/networking, security groups, and cloud security best practices.
Hands-on experience with Chef InSpec/Automate, TrendMicro Deep Security, CyberArk, and Tripwire, and with integrating these tools into monitoring/alerting systems.
Working knowledge of containers (Kubernetes, AWS Fargate, Docker) and related ecosystem tools (Helm, Kustomize).
Strong Splunk skills, with proven ability to create searches, reports, and dashboards, manage indexing and clustering, and optimize ingest pipelines.
Ability to diagnose and resolve technical issues efficiently under pressure, including root cause analysis and incident response.
Excellent communication skills (verbal and written), with cross-team collaboration and documentation capabilities.
Familiarity with cloud billing systems and cost-optimization practices, including experience with billing APIs and cost reporting tools.
Self-starter with a proactive approach and thorough documentation skills; experience with Git-based workflows and ticketing systems.
Qualifications:
Experience across Azure, AWS, OCI, and GCP, with multi-cloud operational experience.
Familiarity with DevOps practices and automation workflows, IaC, and CI/CD integration.
Exposure to enterprise-scale datacenter engineering and hybrid environments.