About the job
The Cloud Operation Engineering business function is responsible for ensuring that Basware continues to provide its customers with industry-leading SaaS solutions to simplify operations and spend smarter. You will be responsible for executing the incident management process while ensuring system availability and performance. In this role, you will lead troubleshooting efforts, investigate complex operational issues, and contribute to the continuous improvement of processes. This role requires sound knowledge of cloud environments, computer networking, system administration, and security practices.
Key Responsibilities:
- Incident Management:
- Good understanding of the ServiceNow ITSM tool for processing catalogued service requests in our SaaS offerings.
- Problem Management:
- Take ownership of cloud incident response, including troubleshooting, root-cause analysis, and resolution of complex issues impacting service availability.
- Collaborate with cross-functional teams to drive resolution of service outages.
- Cloud Technology:
- Ensure high availability and scalability of cloud-based services by implementing best practices for monitoring, performance tuning, and incident response.
- Networking:
- Lead troubleshooting of complex networking issues affecting cloud performance.
- System Administration:
- Perform administration of Linux and Windows Server environments in cloud platforms.
- Database:
- Implement database change requests using SQL and Oracle.
- SFTP and Secure Transfers:
- Lead the troubleshooting of SFTP issues, ensuring secure and seamless file transfer processes.
- SSO and Authentication:
- Oversee the implementation and maintenance of Single Sign-On (SSO) systems, ensuring secure and efficient user access management across cloud environments.
- Monitoring and Optimization:
- Utilize Splunk and Dynatrace monitoring tools to ensure optimal performance, availability, and security of cloud systems.
- Proactively identify opportunities for performance improvements and cost optimization.
Requirements:
- Bachelor's or Master's degree in a technology field, with at least 3 years of experience in Cloud Operations supporting production environments and IT management.
- Experience handling production servers and working with Problem and Incident Management processes.
- Experience in troubleshooting platform and infrastructure issues.
- Experience in driving investigations in time-critical situations such as production service outages.
- Proven analytical and investigative skills.
- Self-sufficient, with a proactive attitude and the ability to prioritize and manage workload effectively.
- Expertise in networking (IP subnetting, routing, VPNs, DNS) in cloud environments.
- Proficiency in Linux and Windows Server administration and troubleshooting.
- Strong experience in SQL.
- Proficiency with SFTP, ensuring secure and reliable file transfers.
- Experience with cloud-native monitoring and logging tools (Splunk, Dynatrace, etc.).
- Experience working with and coordinating cross-functional teams operating across different regions and time zones.
- Proficient English language skills.
Technical competencies:
- Incident Management tools – ServiceNow
- Project Management tools – Jira
- Documentation tools – Confluence
- Cloud Platform – Good understanding of Amazon Web Services
- Operating systems – Windows Server, Linux/Unix
- Databases – MySQL, SQL and Oracle
- Cloud monitoring tools – Splunk or Dynatrace
- SSO and authentication – Extensive experience with ADFS and SSO implementation
Soft Skills:
- Good leadership abilities with the capacity to manage multiple priorities and guide junior engineers.
- Excellent communication and collaboration skills, with the ability to work with cross-functional teams and external stakeholders.
- Proactive, self-motivated attitude with a focus on continuous learning and improvement.