MLOps and Engineering

CyberArk

Job Summary

The AI, Data & Research unit at CyberArk is seeking a passionate MLOps and Engineering professional to build data-driven, ML-powered, and intelligent security solutions. This role involves designing, building, and maintaining a multi-tenant PaaS for ML pipelines and inference, ensuring scalability, reliability, and security. Responsibilities include infrastructure-as-code, Docker-based services, AWS solutions, CI/CD pipeline enhancement, security, observability, and collaboration with various teams.

Must Have

  • Design, build, and maintain infrastructure-as-code using Python and AWS services for deployment.
  • Architect, build, and manage Docker-based services.
  • Lead the design and implementation of solutions using AWS services such as SageMaker, Lambda, Step Functions, SageMaker Pipelines, Batch Transform, and Real-Time Endpoints.
  • Enhance and maintain CI/CD pipelines (Jenkins and shared libraries).
  • Ensure multi-tenant security and tenant isolation across the platform.
  • Define and implement observability and monitoring practices with Datadog and other tools.
  • Collaborate closely with Data Scientists, Data Engineers, MLEs, Product Managers, and other engineering teams to integrate ML workflows.
  • Mentor junior engineers and promote engineering best practices.
  • Bachelor’s degree in Computer Science, Software Engineering, or a related field.
  • 4+ years of hands-on development experience with Python and AWS.
  • Proven experience with infrastructure as code (preferably AWS CDK, Terraform, or CloudFormation).
  • Strong knowledge of AWS architecture and services, particularly in data/ML workloads.
  • Deep experience with CI/CD pipelines (Jenkins or similar).
  • Strong expertise in Docker and containerized applications.
  • Demonstrated knowledge of cloud security, scalability, and tenant isolation.
  • Hands-on experience with observability platforms (preferably Datadog).

Good to Have

  • Background in MLOps, Data Platforms, or Machine Learning workflows.
  • Experience with additional monitoring and logging tools (CloudWatch, Prometheus, ELK).
  • Leadership experience in scaling cloud-native platforms.
  • Experience in information security.
  • Understanding of identity & access management, secrets management, or zero-trust architecture.

Job Description

The AI, Data & Research unit is at the forefront of CyberArk’s innovation, building data-driven, ML-powered, and intelligent security solutions. We are looking for a passionate MLOps and Engineering professional to join our team of seasoned ML engineers.

You will play a critical role in building a multi-tenant PaaS for ML pipelines and inference, ensuring scalability, reliability, and security. You will take ownership of critical platform components, drive best practices, and mentor other engineers.

  • Design, build, and maintain infrastructure-as-code using Python and AWS services for deployment.
  • Architect, build, and manage Docker-based services.
  • Lead the design and implementation of solutions using AWS services such as SageMaker, Lambda, Step Functions, SageMaker Pipelines, Batch Transform, and Real-Time Endpoints.
  • Enhance and maintain CI/CD pipelines (Jenkins and shared libraries).
  • Ensure multi-tenant security and tenant isolation across the platform.
  • Define and implement observability and monitoring practices with Datadog and other tools.
  • Collaborate closely with Data Scientists, Data Engineers, MLEs, Product Managers, and other engineering teams to integrate ML workflows.
  • Mentor junior engineers and promote engineering best practices.

##### Qualifications

  • Bachelor’s degree in Computer Science, Software Engineering, or a related field.
  • 4+ years of hands-on development experience with Python and AWS.
  • Proven experience with infrastructure as code (preferably AWS CDK, Terraform, or CloudFormation).
  • Strong knowledge of AWS architecture and services, particularly in data/ML workloads.
  • Deep experience with CI/CD pipelines (Jenkins or similar).
  • Strong expertise in Docker and containerized applications.
  • Demonstrated knowledge of cloud security, scalability, and tenant isolation.
  • Hands-on experience with observability platforms (preferably Datadog).
  • Self-motivated and goal-oriented with a high work ethic.

##### Additional Information

  • Background in MLOps, Data Platforms, or Machine Learning workflows.
  • Experience with additional monitoring and logging tools (CloudWatch, Prometheus, ELK).
  • Leadership experience in scaling cloud-native platforms.
  • Experience in information security (an advantage).
  • Understanding of identity & access management, secrets management, or zero-trust architecture (a bonus).
