Senior Technical Operations Engineer

3 Months ago • 10 Years + • $185,000 PA - $222,000 PA
Software Development & Engineering

Job Description

The IT Platform team at Zoox is seeking a Senior Technical Operations Engineer to join their expanding team. This role focuses on IT Technical Operations for commercial services, emphasizing real-time command center support, monitoring services, and Site Reliability Engineering (SRE) principles. Responsibilities include developing and supporting a 24/7 real-time technical team, overseeing operational observability, and implementing real-time monitoring solutions. The engineer will also provide strong technical support for deployed WAN, POP, and IoT solutions, collaborate with various technical teams, and champion SRE principles for enhanced reliability and continuous improvement.
Good To Have:
  • IoT and Cellular experience
  • EKS experience
Must Have:
  • 10+ years of IT experience
  • 5+ years in real-time operations
  • Expert-level real-time IT operations management
  • Proven success in critical real-time operations
  • Demonstrable knowledge of networking concepts and troubleshooting
  • Ability to lead observability initiatives and build dashboards
  • Full-stack knowledge of production IT environments
  • AWS production environments experience
Perks:
  • Paid time off
  • Zoox Stock Appreciation Rights
  • Amazon RSUs
  • Health insurance
  • Long-term care insurance
  • Long-term and short-term disability insurance
  • Life insurance

The IT Platform team at Zoox is expanding to include IT Technical Operations for our commercial service, with a focus on real-time command center support, including monitoring services, and embracing Site Reliability Engineering (SRE) principles. As a Senior Technical Operations Engineer, you will join a new Technical Operations Engineering team, while integrating with existing operations teams, to ensure the stability and success of live robot missions. This role is an opportunity to shape the future of Zoox's real-time operations, drive strategic initiatives, and implement innovative solutions that enhance reliability and performance.

In this role, you will:

    • Real-Time Command Center Operations & Strategy: Develop and support a 24/7 real-time technical team, serving as the primary point of contact for IT’s mission-critical operations in our Fusion (Operations) Centers. Execute the strategic vision for IT TechOps, ensuring alignment with business objectives, while implementing scalable processes for real-time troubleshooting and operational support to maintain seamless live operations. The initial scope will focus heavily on WAN, POP, and IoT support. Integration with existing operational teams will be key.
    • Real-Time Monitoring & Incident Response: Oversee and optimize operational observability for the IT dependencies of Zoox’s active robot fleet. Prioritizing proactive issue detection and rapid incident response, you will implement and manage real-time monitoring solutions that ingest multiple data sources, work with stakeholders on observability best practices, enhance automation, and improve operational efficiency.
    • Technical Strategy: Deliver strong technical support for deployed WAN, POP, and IoT solutions, later expanding support into other areas such as AWS EKS environments. Ensure system reliability and performance. Leverage expertise in networking, cloud computing, and troubleshooting to maintain operational efficiency and resolve issues effectively. Triage and troubleshoot issues across desktop, datacenter, network, provider, cloud, and software.
    • Collaboration: Coordinate with multiple technical and support teams to integrate a seamless real-time IT operations framework, enhancing capabilities through cross-team alignment. Proactively discover, document, and optimize IT technical operations processes, ensuring workflows are well-defined, refined, and automated for efficiency and scalability. Apply strong communication skills and lead escalation calls to resolution.
    • Site Reliability & Continuous Improvement: Champion SRE principles to enhance system reliability, scalability, and performance by implementing self-healing mechanisms, automated incident response, and continuous improvement initiatives that adapt to evolving business and technical needs. Maintain closed-loop communication with the engineering and SRE teams that own the broader solutions (e.g., WAN, internal apps).

Qualifications:

    • 10+ years of IT experience, including 5+ years in real-time operations
    • Expert-level experience in building and managing real-time IT operations and processes
    • Proven track record of success in critical real-time operations (e.g., life-safety, financial, transportation)
    • Demonstrable knowledge of networking concepts and troubleshooting, including the OSI model, TCP/IP, and network security; IoT and cellular experience is highly desirable
    • Ability to lead observability initiatives, including building of real-time dashboards
    • Full-stack knowledge: understanding of production IT environments and end-to-end service delivery
    • Exposure to operating production environments in AWS; EKS also beneficial

Base Salary Range: $185,000 - $222,000 a year

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.
