Job Summary: The IT Monitoring Administrator is responsible for the reliability and health of the Operational Support System (OSS), ensuring that incidents and events are managed within service level agreements (SLAs) and escalated appropriately. In addition, they will promote and contribute to continuous improvement initiatives and knowledge sharing within the team.
Essential Job Duties and Responsibilities:
- Monitor the Microsoft System Center suite, Prometheus, AppDynamics, SolarWinds Orion, Microsoft Azure platforms, and AWS.
- Scripting skills are an advantage (Bash, shell, PowerShell, Oracle and SQL queries).
- Should know how to configure container and API monitoring.
- Maintain and upgrade existing monitoring systems as required.
- Manage the incident and problem queues, ensuring resolution within agreed SLAs.
- Tune and improve the monitoring systems to allow for better incident management and problem resolution.
- Maintain the register of monitors and the system design documentation.
- Maintain IT documentation and work in accordance with internal procedures and industry best practices.
- Contribute to knowledge base articles.
- Assist with, and provide information for, periodic reports on performance and availability.
- Provide capacity management analysis to assist other resolver teams.
- Provide Windows Server support as required.
- Participate in the on-call rota and provide out-of-hours support as required.
- Work from other Cubic or customer sites or data centers as required.
- Comply with Cubic’s values and adhere to all company policies and procedures, including the code of conduct and the quality, security, and occupational health, safety, and environmental policies and procedures.
- In addition to the duties and responsibilities listed, the job holder is required to perform other duties assigned by their manager from time to time, as may reasonably be required of them.
Minimum Job Requirements:
Essential
- Degree in a technical or technical management discipline, or equivalent years of relevant experience.
- Two (2) or more years of experience with the Microsoft System Center suite, Prometheus, AppDynamics, SolarWinds Orion, and Microsoft Azure/AWS platforms.
- Experience supporting customers within a technical capacity.
- Formal training in a monitoring product.
- A self-starter who can drive through change at all levels, recognized by their peers as inspirational and the go-to person for solving organizational problems.
- Motivation to keep improving their skill set so that the team continues to challenge system design.
- Excellent attention to detail and an unfaltering desire to deliver exceptional service.
- Customer-focused and comfortable dealing with executive-level managers.
- Able to prioritize multiple tasks deftly and meet deadlines.
- Strong analytical and influencing skills to assess demand for change and ensure that the necessary controls are in place to deliver successfully.
Desirable
- Networking, Unix, or Microsoft qualifications.
- Experience using SCSM 2012 in a corporate environment.
- Experience scripting in PowerShell in a corporate environment.
- Experience working within an IT Helpdesk.
- Up-to-date understanding of cloud technologies (AWS/Azure).