Senior Site Reliability Engineer - Applied Machine Learning

4 Months ago • 4-8 Years • DevOps • $194,000 PA - $410,000 PA

Job Summary

Job Description

The Senior Site Reliability Engineer (SRE) will join ByteDance's Applied Machine Learning (AML) team, supporting and advancing next-generation recommendation algorithms and platforms. Responsibilities include ensuring high availability of crucial machine learning services, developing highly automated systems and pipelines, and contributing to hardware/capacity decision-making. The role requires expertise in analyzing and troubleshooting distributed systems, proficiency in coding (Python, C/C++, or Go), and a strong background in algorithms and data structures. The SRE will collaborate closely with the AML team to build and maintain massively distributed AI/recommendation systems worldwide, enhancing performance and scalability.
Must have:
  • Expertise in distributed systems analysis and troubleshooting
  • Proficiency in Python, C/C++, or Go
  • Strong algorithms and data structures background
  • Experience with large-scale system design and maintenance
  • Experience with code optimization and automation
Good to have:
  • SRE experience on large-scale distributed systems
Perks:
  • Medical, dental, and vision insurance
  • 401(k) savings plan with company match
  • Paid parental leave
  • Short-term and long-term disability coverage
  • Life insurance
  • Wellbeing benefits
  • Paid holidays, sick days, and personal time

Job Details

Responsibilities
Founded in 2012, ByteDance's mission is to inspire creativity and enrich life. With a suite of more than a dozen products, including TikTok as well as platforms specific to the China market, including Toutiao, Douyin, and Xigua, ByteDance has made it easier and more fun for people to connect with, consume, and create content.

Why Join Us
Creation is the core of ByteDance's purpose. Our products are built to help imaginations thrive. This is doubly true of the teams that make our innovations possible. Together, we inspire creativity and enrich life - a mission we aim towards achieving every day. To us, every challenge, no matter how ambiguous, is an opportunity: to learn, to innovate, and to grow as one team. Status quo? Never. Courage? Always. At ByteDance, we create together and grow together. That's how we drive impact for ourselves, our company, and the users we serve. Join us.

The mission of our AML team is to advance next-generation recommendation algorithms and platforms for the company. We also drive substantial impact for the company's core businesses. We are currently looking for Site Reliability Engineers to join our team to support and advance that mission.

What You'll Do
Site Reliability Engineering (SRE) on the Applied Machine Learning (AML) team combines systems engineering and the art of machine learning to develop and run massively distributed AI/recommendation systems around the world. On the SRE team, you'll have the opportunity to sharpen your expertise in coding, performance analysis, and large-system operation, and to get heavily involved in hardware/capacity decision-making. SRE ensures that the most central machine learning services at ByteDance have the highest level of availability, and creates highly automated systems and pipelines.
Qualifications
Minimum Qualifications:
1. Expertise in analyzing and troubleshooting distributed systems.
2. Bachelor's or Master's degree in Computer Science or a related technical field involving software development or systems engineering.
3. Experience programming in at least one of the following languages: Python, C/C++, or Go.
4. Solid background in algorithms and data structures.

Preferred Qualifications:
1. Ability to design and maintain large-scale systems.
2. Strong understanding of code optimization and automation of routine tasks.
3. SRE experience on large-scale distributed systems.

ByteDance is committed to creating an inclusive space where employees are valued for their skills, experiences, and unique perspectives. Our platform connects people from across the globe and so does our workplace. At ByteDance, our mission is to inspire creativity and enrich life. To achieve that goal, we are committed to celebrating our diverse voices and to creating an environment that reflects the many communities we reach. We are passionate about this and hope you are too.

ByteDance Inc. is committed to providing reasonable accommodations in our recruitment processes for candidates with disabilities, pregnancy, sincerely held religious beliefs or other reasons protected by applicable laws. If you need assistance or a reasonable accommodation, please reach out to us at https://shorturl.at/cdpT2
Job Information
【For Pay Transparency】Compensation Description (Annually)

The base salary range for this position in the selected city is $194,000 - $410,000 annually.

Compensation may vary outside of this range depending on a number of factors, including a candidate’s qualifications, skills, competencies and experience, and location. Base pay is one part of the Total Package that is provided to compensate and recognize employees for their work, and this role may be eligible for additional discretionary bonuses/incentives, and restricted stock units.

Benefits may vary depending on the nature of employment and the country work location. Employees have day one access to medical, dental, and vision insurance, a 401(k) savings plan with company match, paid parental leave, short-term and long-term disability coverage, life insurance, wellbeing benefits, among others. Employees also receive 10 paid holidays per year, 10 paid sick days per year and 17 days of Paid Personal Time (prorated upon hire with increasing accruals by tenure).

The Company reserves the right to modify or change these benefits programs at any time, with or without notice.

For Los Angeles County (unincorporated) Candidates:

Qualified applicants with arrest or conviction records will be considered for employment in accordance with all federal, state, and local laws including the Los Angeles County Fair Chance Ordinance for Employers and the California Fair Chance Act. Our company believes that criminal history may have a direct, adverse and negative relationship on the following job duties, potentially resulting in the withdrawal of the conditional offer of employment:

1. Interacting and occasionally having unsupervised contact with internal/external clients and/or colleagues;

2. Appropriately handling and managing confidential information including proprietary and trade secret information and access to information technology systems; and

3. Exercising sound judgment.
