Senior Technical Operations Engineer

2 Hours ago • 10 Years + • Software Development & Engineering • $185,000 PA - $222,000 PA

Job Summary

Job Description

Zoox is seeking a Senior Technical Operations Engineer to join their expanding IT Platform team. This role focuses on real-time command center support, monitoring services, and implementing Site Reliability Engineering (SRE) principles to ensure the stability of live robot missions. You will be responsible for developing and supporting a 24/7 technical team, serving as the primary contact for critical IT operations, and driving strategic initiatives for IT TechOps. Key responsibilities include overseeing operational observability and incident response, managing real-time monitoring solutions, and enhancing automation. You will also provide strong technical support for WAN, POP, and IoT solutions, eventually expanding to AWS EKS environments, and collaborate with various technical teams to create a seamless IT operations framework. Championing SRE principles for improved reliability, scalability, and performance is crucial.
Must have:
  • 10+ years of IT experience
  • 5+ years in real-time operations
  • Expert experience in building/managing real-time IT operations
  • Success in critical real-time operations
  • Networking concepts & troubleshooting
  • Full-stack knowledge
  • Ability to lead observability initiatives
Good to have:
  • IoT and Cellular experience
  • AWS EKS experience
Perks:
  • Paid time off
  • Zoox Stock Appreciation Rights
  • Amazon RSUs
  • Health insurance
  • Long-term care insurance
  • Disability insurance
  • Life insurance
  • Sign-on bonus may be offered

Job Details

The IT Platform team at Zoox is expanding to include IT Technical Operations for our commercial service, with a focus on real-time command center support, including monitoring services, and embracing Site Reliability Engineering (SRE) principles. As a Senior Technical Operations Engineer, you will join a new Technical Operations Engineering team, while integrating with existing operations teams, to ensure the stability and success of live robot missions. This role is an opportunity to shape the future of Zoox's real-time operations, drive strategic initiatives, and implement innovative solutions that enhance reliability and performance.

In this role, you will:

    • Real-Time Command Center Operations & Strategy: Develop and support a 24/7 real-time technical team, serving as the primary point of contact for IT’s mission-critical operations in our Fusion (Operations) Centers. Execute the strategic vision for IT TechOps, ensuring alignment with business objectives, while implementing scalable processes for real-time troubleshooting and operational support to maintain seamless live operations. Initial scope will be focused heavily on WAN, POP, and IoT support. Integration with existing operational teams will be key.
    • Real-Time Monitoring & Incident Response: Oversee and optimize operational observability for the IT dependencies of Zoox’s active robot fleet. Prioritizing proactive issue detection and rapid incident response, you will implement and manage real-time monitoring solutions that ingest multiple data sources, align with stakeholders on observability best practices, enhance automation, and improve operational efficiency.
    • Technical Strategy: Deliver strong technical support for deployed WAN, POP, and IoT solutions, later expanding support into other areas such as AWS EKS environments. Ensure system reliability and performance. Leverage expertise in networking, cloud computing, and troubleshooting to maintain operational efficiency and resolve issues effectively. Triage and troubleshoot issues spanning desktop, datacenter, network, provider, cloud, and software.
    • Collaboration: Coordinate with multiple technical and support teams to integrate a seamless real-time IT operations framework, enhancing capabilities through cross-team alignment. Proactively discover, document, and optimize IT technical operations processes, ensuring workflows are well-defined, refined, and automated for efficiency and scalability. Communicate clearly and lead escalation calls to resolution.
    • Site Reliability & Continuous Improvement: Champion SRE principles to enhance system reliability, scalability, and performance by implementing self-healing mechanisms, automated incident response, and continuous improvement initiatives that adapt to evolving business and technical needs. Maintain closed-loop communication with the engineering and SRE teams who own the broader solutions (e.g. WAN, internal apps).

Qualifications:

    • 10+ years of IT experience, including 5+ years in real-time operations
    • Expert-level experience in building and managing real-time IT operations and processes
    • Proven track record of success in critical real-time operations (e.g., life-safety, financial, transportation)
    • Demonstrable knowledge of networking concepts and troubleshooting, including the OSI model, TCP/IP, and network security; IoT and cellular experience is highly desirable
    • Ability to lead observability initiatives, including building real-time dashboards
    • Full-stack knowledge: understanding of production IT environments and end-to-end service delivery
    • Exposure to operating production environments in AWS; EKS also beneficial

Base Salary Range

$185,000 - $222,000 a year

There are three major components to compensation for this position: salary, Amazon Restricted Stock Units (RSUs), and Zoox Stock Appreciation Rights. A sign-on bonus may be offered as part of the compensation package. The listed range applies only to the base salary. Compensation will vary based on geographic location and level. Leveling, as well as positioning within a level, is determined by a range of factors, including, but not limited to, a candidate's relevant years of experience, domain knowledge, and interview performance. The salary range listed in this posting is representative of the range of levels Zoox is considering for this position.

Zoox also offers a comprehensive package of benefits, including paid time off (e.g. sick leave, vacation, bereavement), unpaid time off, Zoox Stock Appreciation Rights, Amazon RSUs, health insurance, long-term care insurance, long-term and short-term disability insurance, and life insurance.
