DevOps Professional

BOLD

Job Summary

BOLD is seeking a DevOps Professional to manage production ML environments and MLOps infrastructure within the infrastructure and data science teams. This role involves day-to-day support, cross-team projects, and architecting scalable MLOps pipelines using AWS services like SageMaker, EMR, and OpenSearch. The infrastructure team focuses on automation, observability, cloud architecture, CI/CD, and security to ensure efficient and reliable application releases while continuously learning and implementing new technologies.

Must Have

  • Own production ML environments and MLOps infrastructure.
  • Handle day-to-day support, ad-hoc requests, and cross-team projects.
  • Architect scalable MLOps pipelines using AWS services like SageMaker, EMR, and OpenSearch.
  • Productionize microservices; design and maintain DevOps pipelines.
  • Collaborate with data scientists on model serving and workflow optimization.
  • Implement monitoring, alerting, and observability to reduce MTTR.
  • Automate CI/CD for ML models and infrastructure with governance and security compliance.
  • Handle security patching, cost optimization, and 24x7 on-call rotations.
  • Coordinate cross-functionally with development, QA, ops, and data teams.
  • 5+ years (Sr Engineer) / 7+ years (Module Lead) in AWS and DevOps.
  • Expertise in CI/CD pipelines using Jenkins, Spinnaker, Groovy, CodePipeline, CodeBuild, Gradle, Artifactory, Docker/npm registries.
  • Deep containerization/orchestration with Kubernetes (EKS), ECS, Docker.
  • Infrastructure-as-Code via Terraform.
  • Python/Bash scripting for CI/CD, provisioning, and monitoring of web services and Linux servers.
  • Experience with AWS services (S3, DynamoDB, Lambda, Step Functions), MySQL, MongoDB.
  • Monitoring with CloudWatch, Prometheus, Grafana, ELK, Datadog, Splunk.
  • Strong Linux and networking fundamentals.

Job Description

ABOUT THIS TEAM

The Infrastructure team provides services including automation, observability, cloud/server/network architecture, CI/CD, infrastructure as code, database administration, incident management, vendor management, and security and compliance. These services improve efficiency, reduce errors, and ensure fast, reliable application releases while maintaining security and compliance. TechOps helps teams monitor applications and infrastructure, build resilient infrastructure, identify and resolve IT service issues, manage vendors, and ensure cloud security and compliance. The team also focuses on continuous learning and on implementing new technologies to provide better value to the organization.

WHAT YOU’LL DO

  • Productionize microservices and design and maintain DevOps pipelines, collaborating with data scientists on model serving and workflow optimization.
  • Implement monitoring, alerting, and observability to reduce MTTR and ensure production reliability.
  • Automate CI/CD for ML models and infrastructure with governance and security compliance.
  • Handle security patching, cost optimization, and 24x7 on-call rotations for critical services.
  • Coordinate cross-functionally with development, QA, ops, and data teams to innovate build/deployment processes.

WHAT YOU’LL NEED

  • 5+ years (Sr Engineer) / 7+ years (Module Lead) of hands-on experience in AWS and DevOps.
  • Expertise in CI/CD pipelines using Jenkins, Spinnaker, Groovy, CodePipeline, CodeBuild, Gradle, Artifactory, and Docker/npm registries.
  • Deep containerization/orchestration with Kubernetes (EKS), ECS, Docker; Infrastructure-as-Code via Terraform.
  • Python/Bash scripting for CI/CD, provisioning, and monitoring of FastAPI/Spring Boot web services and Linux servers (Solr/OpenSearch).
  • AWS services (S3, DynamoDB, Lambda, Step Functions) with cost control and reporting; databases (MySQL, MongoDB).
  • Monitoring with CloudWatch, Prometheus, Grafana, ELK, Datadog, Splunk for health/performance analytics.
  • Strong Linux and networking fundamentals.

EXPERIENCE

  • Engineer, DevOps - 4.5+ years
  • DevOps Lead, DevOps - 7+ years


Benefits

Outstanding Compensation

  • Competitive salary
  • Tax-friendly compensation structure
  • Bi-annual bonus
  • Annual appraisal
  • Equity in company

100% Full Health Benefits

  • Group Mediclaim benefit (including parents' coverage)
  • Practo Plus health membership for employees and family
  • Personal accident and term life insurance coverage

Flexible Time Away

  • 24 days of paid leave
  • Declared fixed holidays
  • Paternity and maternity leave
  • Compassionate and marriage leave
  • Covid leave (up to 7 days)

Additional Benefits

  • Internet and home office reimbursement
  • In-office catered lunch, meals, and snacks
  • Certification policy
  • Cab pick-up and drop-off facility
