Cloud Security Engineer (DevSecOps)

1 Month ago • 3-8 Years

Job Summary

Job Description

As a Site Reliability Engineer, you will play a key role in designing and refining cloud infrastructure with a focus on reliability, security, and scalability. You will manage live production environments, contribute to automation, and ensure operational resilience. Responsibilities include cloud infrastructure management, enhancing service reliability, driving automation, incident response, and collaboration with cross-functional teams. The role involves applying software engineering principles to solve operational challenges and requires a blend of technical expertise and proactive problem-solving skills.
Must have:
  • Expertise in cloud platforms (AWS, Azure, or GCP)
  • Proficiency in on-premises hosting and virtualization
  • Experience with containerization and orchestration (Kubernetes)
  • Proficiency in scripting languages (shell and Python)
  • Experience with Infrastructure as Code (IaC) tools
Good to have:
  • Experience with DevOps toolchain elements
  • Experience with database management, particularly MySQL and Hadoop
  • Knowledge of cloud cost management and optimization strategies
  • Understanding of cloud security best practices
  • Experience implementing disaster recovery and business continuity plans

Job Details

Service Reliability - Cloud Operations Engineering (SR)

Location: Hyderabad

Work Mode: Work from Office (5 days a week)

Shift Timings: 24/7 (Monthly Rotational)

Experience: 3-8 years

We are looking for a highly skilled and adaptable Site Reliability Engineer to become a key member of our Cloud Engineering team. In this crucial role, you will be instrumental in designing and refining our cloud infrastructure with a strong focus on reliability, security, and scalability. As an SRE, you'll apply software engineering principles to solve operational challenges, ensuring the overall operational resilience and continuous stability of our systems. This position requires a blend of managing live production environments and contributing to engineering efforts such as automation and system improvements.

Key Responsibilities:

  • Cloud Infrastructure Architecture and Management: Design, build, and maintain resilient cloud infrastructure solutions to support the development and deployment of scalable and reliable applications. This includes managing and optimizing cloud platforms for high availability, performance, and cost efficiency.
  • Enhancing Service Reliability: Lead reliability best practices by establishing and managing monitoring and alerting systems to proactively detect and respond to anomalies and performance issues. Utilize SLI, SLO, and SLA concepts to measure and improve reliability. Identify and resolve potential bottlenecks and areas for enhancement.
  • Driving Automation and Efficiency: Contribute to the automation, provisioning, and standardization of infrastructure resources and system configurations. Identify and implement automation for repetitive tasks to significantly reduce operational overhead. Develop Standard Operating Procedures (SOPs) and automate workflows using tools like Rundeck or Jenkins.
  • Incident Response and Resolution: Participate in and help resolve major incidents, conduct thorough root cause analyses, and implement permanent solutions. Effectively manage incidents within the production environment using a systematic problem-solving approach.
  • Collaboration and Innovation: Work closely with diverse stakeholders and cross-functional teams, including software engineers, to integrate cloud solutions, gather requirements, and execute Proof of Concepts (POCs). Foster strong collaboration and communication. Guide designs and processes with a focus on resilience and minimizing manual effort. Promote the adoption of common tooling and components, and implement software and tools to enhance resilience and automate operations. Be open to adopting new tools and approaches as needed.

Required Skills and Experience:

  • Cloud Platforms: Demonstrated expertise in at least one major cloud platform (AWS, Azure, or GCP).
  • Infrastructure Management: Proven proficiency in on-premises hosting and virtualization platforms (VMware, Hyper-V, or KVM). Solid understanding of storage internals (NAS, SAN, EFS, NFS) and protocols (FTP, SFTP, SMTP, NTP, DNS, DHCP). Experience with networking and firewall technologies. Strong hands-on experience with Linux internals and operating systems (RHEL, CentOS, Rocky Linux). Experience with Windows operating systems to support varied environments.
  • Containerization & Orchestration: Extensive experience with containerization (Docker) and orchestration (Kubernetes) technologies.
  • Automation & IaC: Proficiency in scripting languages (shell and Python). Experience with configuration management tools (Ansible or Puppet). Must have exposure to Infrastructure as Code (IaC) tools (Terraform or CloudFormation).
  • Monitoring & Observability: Experience setting up and configuring monitoring tools (Prometheus, Grafana, or the ELK stack). Hands-on experience implementing OpenTelemetry for observability. Familiarity with monitoring and logging tools for cloud-based applications.
  • Service Reliability Concepts: A strong understanding of SLI, SLO, SLA, and error budgeting.
  • Soft Skills & Mindset: Excellent communication and interpersonal skills for effective teamwork. We value proactive individuals who are eager to learn and adapt in a dynamic environment. Must possess a pragmatic and adaptable mindset, with a willingness to step outside comfort zones and acquire new skills. Ability to consider the broader system impact of your work. Must be a change advocate for reliability initiatives.

Desired/Bonus Skills:

  • Experience with DevOps toolchain elements like Git, Jenkins, Rundeck, ArgoCD, or Crossplane.
  • Experience with database management, particularly MySQL and Hadoop.
  • Knowledge of cloud cost management and optimization strategies.
  • Understanding of cloud security best practices, including data encryption, access controls, and identity management.
  • Experience implementing disaster recovery and business continuity plans.
  • Familiarity with ITIL (Information Technology Infrastructure Library) processes.

About The Company

We are Pragmatists. Today, there are three distinct types of companies: the Pretenders, the Fairytale Startups, and the Pragmatists. At our core, we embody the last of these. We prefer to keep it real. Unlike the Pretenders, we want our core values to guide decision-making and show up in the way people think, feel, and act on a daily basis. Instead of being a Fairytale Startup, we want our people to think of us as their work home away from home (not a theme park) and to feel that they are making a huge impact. Our employees use their creativity and talent to invent new solutions, meet demands, and offer the most effective services and products.
