AI Infrastructure Engineer

4 Months ago • All levels • $118,657 PA - $177,000 PA
DevOps

Job Description

The AI Infrastructure Engineer will join the TIGE CDN Platform Dev team to develop a high-performing global multi-cloud CDN platform. The role involves integrating AI solutions to enhance platform automation, configuration, and incident management. Responsibilities include designing and developing AI-powered solutions for log analysis, root cause analysis, and automated troubleshooting. The engineer will also contribute to creating AI-driven configuration assistants, optimize AI model performance, and collaborate with various teams to apply AI technologies to solve business problems. This role demands strong analytical and communication skills, along with the ability to work in a dynamic, multi-cloud environment.
Good To Have:
  • Internship experience with AI solutions
  • Familiarity with log analytics platforms
  • Experience with infrastructure-as-code tools
  • Coursework in MLOps practices
Must Have:
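  • Bachelor's or Master's degree in Computer Science, Electronics, Communication, Artificial Intelligence, or a related field
  • Proficient in Python and familiar with at least one deep learning framework (PyTorch, TensorFlow, or JAX)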
  • Experience with AI model deployment and optimization
  • Basic understanding of distributed systems and cloud technologies
  • Excellent analytical and problem-solving skills
  • Strong communication and teamwork abilities

Team Description
The TIGE CDN Platform Dev team provides a highly available, cost-efficient, and top-performing global multi-cloud CDN platform for ByteDance’s internal customers by integrating both self-built and commercial CDNs. The team continuously evolves the platform to include more cloud services beyond CDN, delivering a unified, secure, reliable, and high-performance multi-cloud PaaS solution. We are now expanding our capabilities by integrating AI-driven solutions to significantly enhance platform automation, configuration, and intelligent incident management.

Job Description
1. Participate in exploring, designing, and developing AI-powered solutions for intelligent log analysis, root cause analysis (RCA), and automated troubleshooting to enhance platform reliability and reduce MTTR.
2. Contribute to designing and implementing AI-driven multi-cloud configuration assistants, enabling intuitive and automated interfaces for platform configuration and customer self-service scenarios.
3. Work closely with senior engineers, product managers, and operations teams to identify business pain points and apply AI technologies to deliver rapid, tangible improvements.
4. Assist in optimizing AI model inference performance and deployment efficiency within a cloud-native, edge computing environment.

Qualifications
Minimum Qualifications:
1. Bachelor’s or Master’s degree in Computer Science, Electronics, Communication, Artificial Intelligence, or related fields.
2. Strong programming skills, proficient in Python, and familiarity with at least one deep learning framework such as PyTorch, TensorFlow, or JAX.
3. Academic or project-based experience with AI model deployment and optimization, especially large language models (LLMs), vector databases, RAG techniques, or prompt engineering.
4. Basic understanding of distributed systems, cloud-native technologies, and Kubernetes-based infrastructure.
5. Excellent analytical and problem-solving skills, with coursework or projects demonstrating the application of data-driven AI solutions.
6. Strong communication and teamwork abilities, comfortable collaborating across diverse teams.

Preferred Qualifications:
1. Internship experience or academic projects related to AI solutions within CDN, edge computing, or multi-cloud platforms.
2. Familiarity with log analytics platforms (e.g., Prometheus, ClickHouse, ELK) or automated incident management systems.
3. Experience with infrastructure-as-code tools (Terraform, OpenAPI) or automation frameworks.
4. Coursework or projects in MLOps practices or using deployment tools such as Kubeflow, Ray, or BentoML.
