The Service Operations Centre (SOC) provides 24x7 Site Reliability Operations support to globally distributed product teams. As part of this team, you will assist in monitoring systems, detecting potential issues, and supporting application teams in resolving them. This role focuses on learning, supporting day-to-day operations, and developing skills in troubleshooting and service improvement.
We are looking for a Graduate SOC Analyst who will contribute to operational efficiency, assist in service monitoring, and gain hands-on experience with advanced monitoring and troubleshooting techniques under the guidance of senior team members. You will collaborate with other teams to ensure high-quality service delivery while continuing to learn and grow in the field.
Key Responsibilities:
Service Monitoring and Troubleshooting:
- Learn and use basic monitoring tools such as SCOM, Zabbix, and Kibana to observe system and application performance.
- Assist in monitoring server, network, system, and storage environments, working with more experienced team members to identify and escalate issues.
- Support proactive monitoring efforts to help detect potential issues before they impact operations.
- Apply fundamental troubleshooting skills and tools to assist in resolving system and application issues.
- Contribute to the creation and maintenance of monitoring dashboards to aid in real-time system oversight.
- Learn to handle incidents under supervision, ensuring that they are documented and resolved in a timely manner.
Customer Service Support:
- Provide support for Customer Business Units (CBUs) and Product teams, escalating issues as necessary.
- Assist in the creation of Service Level Objectives (SLOs) with guidance from senior team members.
- Collaborate with internal teams to help ensure customer issues are addressed promptly.
Continuous Service Improvement (Learning & Development Focus):
- Contribute to maintaining the Knowledge Base by documenting common issues and resolutions.
- Learn from senior team members about enterprise applications and how to bridge knowledge gaps.
- Research application metrics and suggest new alerts or changes to existing ones.
- Work with more experienced team members to help identify recurring incidents and suggest improvements.
- Assist in maintaining processes and documentation for monitoring, troubleshooting, and escalation procedures.
Collaboration and Communication:
- Collaborate with other IT departments and teams under the guidance of senior SOC analysts to resolve issues and improve operations.
- Participate in discussions and meetings to provide input and suggestions as you develop your expertise.
- Provide basic updates and support to management as needed, learning how to communicate effectively in operational scenarios.
Learning and Development:
- Develop a strong understanding of monitoring and troubleshooting techniques with the support of experienced team members.
- Participate in on-the-job training and development opportunities to enhance technical skills and knowledge.
- Take initiative in learning new technologies and tools as needed to support the SOC’s operations.
- Actively engage in team activities, contributing to a collaborative work environment while growing in your role.
Other Tasks as Necessary:
- Demonstrate a professional attitude and develop positive relationships within the team.
- Follow instructions from senior analysts and team leads while gradually gaining more responsibility.
- Contribute to team problem-solving and improve your communication skills.
Knowledge, Skills, and Abilities:
- Familiarity with Agile, DevOps, ITIL, and SRE practices.
- Ability to work a variety of shifts, including weekends, holidays, and extra or extended shifts, and to take part in an on-call rotation.
- Excellent written and oral communication skills.
- Ability to read, write, speak, and understand the English language in a business environment.