Broadcom is looking for a Kubernetes Platform Engineer to join VMware Cloud Foundation’s (VCF) AI and Advanced Services team. This position is key to building a best-in-class private cloud AI platform. You will have a high impact by playing a critical role in designing and implementing scalable solutions alongside a team of talented and enthusiastic engineers.
This role is part of the Private AI Services Control Plane team, which builds the Kubernetes-based control plane that automates the lifecycle of AI Services: Model Gallery, Model Runtime and ML API Gateway, Data Indexing and Retrieval, and Agent Builder. The successful candidate must have experience building Kubernetes-based services. Experience contributing to upstream Kubernetes and related CNCF projects is a major plus.
The AI & Advanced Services team is responsible for building AI platform capabilities into the VMware Cloud Foundation product, giving our enterprise customers all of the AI platform features they need to build, deploy, test, manage, and scale their AI infrastructure and workloads.
Responsibilities
- Collaborate with cross-functional teams to design and deliver expanded capabilities of Kubernetes-based platform services for AI
- Own the AI platform’s end-user, in-product CLI across all components, and guide other teams in delivering a CLI-based experience for the platform
- Decompose vague problems into detailed requirements, and develop solutions that meet the needs of our customers
- Develop and maintain automated tests to ensure the quality and reliability of the Private AI feature set
- Participate in code reviews and ensure that the code is aligned with VMware's coding standards and best practices
- Troubleshoot and resolve complex issues related to Private AI services and how those services interface with other components of the stack, such as storage and networking
Requirements
- Ability to succeed on a take-home distributed systems assignment and in an in-person technical interview that includes coding and debugging
- 5+ years of experience building scalable distributed systems in Go or C++
- 5+ years of hands-on experience with container technologies (Docker and Kubernetes)
- Hands-on experience deploying and maintaining Kubernetes Operators is a big plus
- Proven knowledge of systems design
- Strong analytical and diagnostic skills, with the ability to work independently
- Excellent communication and collaboration skills, with the ability to work with cross-functional teams
- Experience with agile development methodologies and version control systems, such as Git
- BS in Computer Science or a related technical field and 8+ years of related experience in the software industry, or MS in Computer Science or a related technical field and 6+ years of related experience in the software industry
- Candidate should not require sponsorship