About the job
Summary by Outscal
Tenstorrent seeks a Deep Learning Library Developer with 10+ years of experience in kernel development, low-level optimizations, and tensor optimization. Proficiency in C/C++, machine learning frameworks, and performance profiling is essential. You'll lead kernel development, optimize code, and collaborate with ML engineers to ensure high-performance software.
Tenstorrent is leading the industry on cutting-edge AI technology, revolutionizing performance expectations, ease of use, and cost efficiency. With AI redefining the computing paradigm, solutions must evolve to unify innovations in software models, compilers, platforms, networking, and semiconductors. Our diverse team of technologists has developed a high-performance RISC-V CPU from scratch and shares a passion for AI and a deep desire to build the best AI platform possible. We value collaboration, curiosity, and a commitment to solving hard problems. We are growing our team and looking for contributors of all seniorities.
As a Kernel Developer at Tenstorrent, you will play a crucial role in optimizing low-level workloads, kernel development, and enhancing our software's performance for machine learning applications. You will work closely with a team of highly skilled engineers to ensure that our software operates at peak efficiency, delivering high-quality results to our clients and users.
Responsibilities
- Kernel Development: Lead and participate in the design, development, and maintenance of kernel-level software components for our applications. Develop and optimize kernels and kernel libraries for efficient machine learning and HPC applications.
- Implement tensor compute and tensor data movement optimization kernels.
- Heavy focus on optimizations.
- Low-Level Optimization: Analyze and optimize low-level code to improve the performance and efficiency of our software, with a strong emphasis on tensor optimization.
- Machine Learning Integration: Collaborate with machine learning engineers and data scientists to integrate optimized kernels and low-level routines into machine learning frameworks and pipelines.
- Performance Profiling: Identify performance bottlenecks, conduct performance profiling, and develop strategies to address and resolve them.
- Testing and Debugging: Write comprehensive unit tests, conduct thorough debugging, and ensure the stability and reliability of kernel-level code.
- Documentation: Create clear and concise documentation for code, APIs, and best practices to facilitate collaboration within the team.
- Research and Innovation: Stay up to date with the latest developments in kernel development, tensor optimization, and machine learning to propose innovative solutions and improvements.
Experience & Qualifications
- Bachelor's degree in Computer Science, Software Engineering, or a related field.
- 10+ years of hands-on experience.
- Proven experience in kernel development, with a strong focus on low-level optimizations and tensor optimization.
- Proficiency in C/C++ programming languages.
- Familiarity with machine learning frameworks and concepts.
- Strong problem-solving skills and the ability to analyze and debug complex issues.
- Experience with performance profiling and optimization tools.
- Excellent communication and teamwork skills.
- Self-motivated, detail-oriented, and able to work independently as well as in a team.
- Experience with GPU programming (CUDA, OpenCL) is a plus.
- Knowledge of operating system internals is a plus.
Tenstorrent offers a highly competitive compensation package and benefits, and we are an equal opportunity employer.