IT Manager - MS Azure, VMware & Linux

10+ Years • IT & Infrastructure

Job Summary

Job Description

Blue Yonder is seeking an Observability and Automation Manager to build and manage enterprise-grade monitoring, observability, and automation frameworks. This role involves defining strategy, implementing tools, and driving adoption of observability and automation practices across infrastructure, applications, and business services. The manager will lead a team, collaborate with various IT and engineering teams, and ensure end-to-end visibility and automation for proactive issue detection and resolution, focusing on efficiency and reliability.
Must have:
  • Define and drive observability and automation strategy.
  • Lead a team of engineers and specialists in monitoring, observability, and automation initiatives.
  • Partner with application, infrastructure, and DevOps teams to ensure observability and automation standards are adopted enterprise-wide.
  • Own the design, implementation, and operations of enterprise observability platforms.
  • Ensure end-to-end visibility of applications, infrastructure, and cloud environments.
  • Define and implement IT automation and orchestration strategies.
  • Build and maintain automation frameworks (provisioning, remediation, workflows, runbooks, self-healing systems).
  • Manage the deployment and maintenance of Windows, Unix, Linux, and VMware systems infrastructure on-premises and in MS Azure.
  • Establish KPIs, SLIs, SLOs, and dashboards to measure and report system reliability and performance.
  • 10+ years of combined related work experience.
  • Minimum 5 years of experience in observability, monitoring, or automation leadership roles.
  • Strong hands-on knowledge of observability tools such as Datadog, Dynatrace, AppDynamics, Splunk, Elastic, Prometheus, Grafana.
  • Expertise in automation tools/frameworks: Ansible, Terraform, Puppet, ServiceNow Orchestration, scripting (Python, PowerShell, Shell).
  • Experience with cloud platforms, preferably Azure, and hybrid environments.
  • Strong understanding of DevOps, SRE practices, CI/CD, and ITIL processes.
  • Excellent leadership, communication, and global stakeholder management skills.
  • Knowledge of Unix/Linux or Windows operating systems, VMware, Network, Backup and Storage.
  • Experience in Microsoft technologies – MS Azure, Azure AD & O365.
Good to have:
  • Relevant certifications such as PMP, Six Sigma, and ITIL are preferred.
  • Knowledge of cloud technologies – private, public, hybrid, IaaS+, PaaS, SaaS.
  • Basic knowledge of Palo Alto SD-WAN & Prisma.
  • Good knowledge of ticketing tools like ServiceNow.

Job Details

Overview:

Blue Yonder is the proven leader in artificial intelligence and machine learning (AI/ML)-driven supply chain and retail solutions for 4,000 of the world’s leading retail, manufacturing, and logistics companies. Blue Yonder’s world-class client brands include 75 of the top 100 retailers, 77 of the top 100 consumer goods companies, and 8 of the top 10 global 3PLs. Running Blue Yonder, you can plan to deliver.

The Observability and Automation Manager is responsible for building and managing enterprise-grade monitoring, observability, and automation frameworks. The role covers defining strategy, implementing tools, and driving adoption of observability and automation practices across infrastructure, applications, and business services. It requires strong technical expertise in monitoring platforms, AIOps, and automation frameworks, and a proven ability to collaborate across engineering, operations, and business teams.

Scope:

  • Collaborate with product owners, Engineering, and internal IT teams to achieve business objectives.
  • Define and drive the observability and automation strategy aligned with business and IT objectives.
  • Lead a team of engineers and specialists responsible for monitoring, observability, and automation initiatives.
  • Partner with application, infrastructure, and DevOps teams to ensure observability and automation standards are adopted enterprise-wide.
  • Own the design, implementation, and operations of enterprise observability platforms (APM, log analytics, metrics, tracing, synthetic monitoring).
  • Ensure end-to-end visibility of applications, infrastructure, and cloud environments to proactively detect and resolve issues.
  • Define and implement IT automation and orchestration strategies across infrastructure and operations.
  • Build and maintain automation frameworks (provisioning, remediation, workflows, runbooks, self-healing systems).
  • Partner with ITSM and DevOps teams to automate incident, problem, and change management processes.
  • Continuously identify opportunities to reduce manual effort, improve efficiency, and enhance service reliability.
  • Work closely with business stakeholders to ensure observability and automation meet compliance and security standards.
  • Develop governance models and best practices for monitoring, alerting, and automation usage.
  • Provide executive-level reporting on system reliability, performance trends, and automation outcomes.
  • Manage the deployment and maintenance of Windows, Unix, Linux, and VMware systems infrastructure on-premises and in MS Azure.
  • Establish KPIs, SLIs, SLOs, and dashboards to measure and report system reliability and performance.
  • Work with senior leadership and the Architecture and Engineering team on the planning, development, and execution of short-term and long-term goals.
  • Develop processes to streamline work and drive the team to automate routine tasks.
  • Assist in writing technical documentation and work instructions.
  • Stay current with emerging technologies and trends in the IT industry and recommend innovative solutions to improve operational efficiency and effectiveness.

Our current technical environment:

  • Operating System: Windows & Linux
  • Hyperconverged environment: VMware
  • Programming languages: Python, PowerShell, and Shell scripting
  • Cloud Architecture: MS Azure (Terraform, ARM templates, AKS, Virtual Networks, Azure AD)
  • Configuration management tools: Ansible and Terraform
  • DevOps tools: Git, GitLab/GitHub, and Docker
  • Storage: NetApp

What we are looking for:

  • Bachelor’s degree in computer science, MIS, or an engineering-related field, or equivalent work experience.
  • 10+ years of combined related work experience, including a minimum of 5 years of experience in observability, monitoring, or automation leadership roles.
  • Strong hands-on knowledge of observability tools such as Datadog, Dynatrace, AppDynamics, Splunk, Elastic, Prometheus, Grafana.
  • Expertise in automation tools/frameworks: Ansible, Terraform, Puppet, ServiceNow Orchestration, scripting (Python, PowerShell, Shell).
  • Experience with cloud platforms, preferably Azure, and hybrid environments.
  • Strong understanding of DevOps, SRE practices, CI/CD, and ITIL processes.
  • Excellent leadership, communication, and global stakeholder management skills.
  • Knowledge of Unix/Linux or Windows operating systems, VMware, network, backup, and storage, with experience supporting and troubleshooting stability and performance issues.
  • Demonstrated problem-solving and decision-making capabilities to meet the organization’s developing needs and growth.
  • Demonstrate agility and responsiveness.
  • Ability to work in a fast-paced environment and meet tight periodic reporting deadlines.
  • Ability to work under strict deadlines to meet or exceed team goals.
  • Experience in Microsoft technologies – MS Azure, Azure AD & O365.
  • Experience working with virtual and remote team members and stakeholders.
  • Knowledge of Information Security regulations and compliance standards.
  • Basic knowledge of network switching, routing, firewalls, and MPLS circuits.
  • Strong focus on people development.
  • Strong technical experience with IT infrastructure, systems administration, platform sizing, capacity planning, and infrastructure cost reduction.
  • Good knowledge of ticketing tools like ServiceNow.
  • Relevant certifications such as PMP, Six Sigma, and ITIL are preferred.
  • Knowledge of cloud technologies – private, public, hybrid, IaaS+, PaaS, SaaS.
  • Basic knowledge of Palo Alto SD-WAN & Prisma.

About The Company

We are a proven, passionate bunch of disruptors. Our work is all about tapping into your potential so we can deliver the best solutions and customer experiences on the planet. Collaboration, respect, and a great work-life balance earned us the title of "Best Place to Work - Employees' Choice" by Glassdoor. Our people are smart, creative rock stars with over 400 patents and 10,000 people-years of domain expertise. Blue Yonder is the world leader in digital supply chain and omni-channel commerce fulfillment. Our intelligent, end-to-end platform enables retailers, manufacturers, and logistics providers to seamlessly predict, pivot, and fulfill customer demand. With Blue Yonder, you can make more automated, profitable business decisions that deliver greater growth and re-imagined customer experiences. Blue Yonder - Fulfill your Potential.™
