Cloud Engineer III-Observability

6 Months ago • 4-6 Years • DevOps

Job Summary

Job Description

The Cloud Engineer III-Observability will be responsible for building and managing the telemetry and observability service used by all product teams on the Smarsh platform. This role involves developing automation, creating integrations for third-party tools, ensuring efficient resource utilization, and supporting delivered features by debugging production issues and writing root cause analyses (RCAs) for them. The engineer will also participate in an on-call rotation to provide 24/7 support for critical systems.
Must have:
  • 4-6 years of professional experience in DevOps or software engineering roles.
  • Proficiency in infrastructure as code (IaC) using Terraform or similar tools.
  • Experience with scripting and automation using Python.
  • Experience with CI/CD pipelines and automation tools.
  • Experience with observability and telemetry tools (Prometheus, Grafana, ELK stack).
  • Understanding of SRE principles, including monitoring, alerting, and automation.

Job Details

Who are we?


Smarsh empowers its customers to manage risk and unleash intelligence in their digital communications. Our growing community of over 6,500 organizations in regulated industries counts on Smarsh every day to help them spot compliance, legal, or reputational risks in 80+ communication channels before those risks become regulatory fines or headlines. Relentless innovation has fueled our journey to consistent leadership recognition from analysts like Gartner and Forrester, and our sustained, aggressive growth has landed Smarsh in the annual Inc. 5000 list of fastest-growing American companies since 2008.


About the team: The Observability team builds and manages the single telemetry and observability service used by all product teams on the Smarsh platform. It provides "as a service" telemetry, monitoring, and visualization capabilities that enable our product teams to operate, support, and triage the applications and services under their product portfolio.

 

We are seeking a rigorous, problem-solving, and curious Platform Engineer (who codes!) to join our Fabric Insight group. Fabric teams at Smarsh combine software and systems engineering to build and run products that equip our engineering teams with secure tools and infrastructure to do their best work. We are looking for someone who can build Observability systems that engineers love to work with. In this role, you will play a key part in shaping the future of our platform by developing tooling and providing hands-on technical expertise to design, deploy, and optimize our services in a compliant and cost-effective way in the cloud. The ideal candidate will have a programming background in a cloud environment, a strong understanding of cloud automation, Observability, and security best practices, as well as the ability to collaborate effectively with cross-functional teams.

Roles & Responsibilities

    • Develop and analyze various business and technical scenarios to drive the highest levels of executive decision-making around Observability resources. Drive consensus and decisions with stakeholders.
    • Develop and implement automation to provision, configure, deploy, and monitor Observability services.
    • Create reusable integrations for third-party tools (e.g., CI/CD systems, monitoring platforms, and container registries) to consolidate workflows.
    • Communicate risks and progress to the reporting supervisor in a timely manner.
    • Ensure efficient resource utilization and continuously improve processes through automation and internal tools, resulting in enhanced product delivery, maturity, and scalability.
    • Support delivered features by debugging production issues, writing root cause analyses (RCAs), and then working toward short-term and long-term fixes.
    • On-Call Rotation: Participate in an on-call rotation to provide 24/7 support for critical systems.

Required Experience/Skills

    • Professional degree in Computer Science from a reputed college, with a consistent academic record.
    • 4-6 years of professional experience in DevOps or software engineering roles, with a focus on configuring, deploying, and maintaining Kubernetes in AWS.
    • Strong proficiency in infrastructure as code (IaC) using Terraform, AWS CloudFormation, or similar tools.
    • Experience with scripting and automation using languages such as Python.
    • Experience with CI/CD pipelines and automation tools such as Concourse, Jenkins, or Ansible.
    • Experience on teams that have delivered observability and telemetry tools and practices, such as Prometheus, Grafana, the ELK stack, distributed tracing, and performance monitoring.
    • Experience with cloud-native tools such as Istio, Argo CD, External Secrets Operator, KEDA, Karpenter, etc.
    • Understanding of SRE principles, including monitoring, alerting, error budgets, fault analysis, and automation.
    • Familiarity with SLI, SLO, and SLA concepts, including the ability to define SLIs (Service Level Indicators), SLOs (Service Level Objectives), and error budgets.
    • Excellent problem-solving skills and attention to detail.

About our culture

Smarsh hires lifelong learners with a passion for innovating with purpose, humility and humor. Collaboration is at the heart of everything we do. We work closely with the most popular communications platforms and the world’s leading cloud infrastructure platforms. We use the latest in AI/ML technology to help our customers break new ground at scale. We are a global organization that values diversity, and we believe that providing opportunities for everyone to be their authentic self is key to our success. Smarsh leadership, culture, and commitment to developing our people have all garnered Comparably.com Best Places to Work Awards. Come join us and find out what the best work of your career looks like.
