This is a critical role with a wide range of responsibilities, including:
● Analyze and improve system design to reduce failure modes and promote self-healing systems
● Establish and maintain robust systems that facilitate observability, encompassing logging, monitoring, distributed tracing, alerting, and offline test tools
● Work with development partners to shape the architecture, design, and implementation of new and existing systems to enhance their reliability, performance, efficiency, and scalability
● Work both independently and as part of a geographically dispersed yet integrated team
● Collaborate with service engineers to establish Service Level Agreements (SLAs) and Service Level Objectives (SLOs) for backend services
● Identify the indicators that demonstrate how well an application is performing, and know how to improve or repair its performance
● Assess options and suggest solutions when information is limited or unclear. This position requires comfort and confidence in dealing with uncertain situations
● Work seamlessly within a team as well as manage individual tasks
● Respond to emerging incidents, solve critical issues, and follow through with a plan for resolution or future mitigation
● Act as an SME on the Engineering Operations team, partnering with backend services teams and application teams to overcome challenges across all the platforms where we stream our service

Qualities / Experience We’re Seeking
We believe the right individual will have the following skills and experience to be successful in the role:
● 5+ years of experience in software development
● Degree in Computer Science or a related field, or equivalent work experience
● Solid engineering and coding skills, data structure knowledge, and the ability to write high-performance, production-quality code
● Experience building service-oriented APIs and cloud services (preferably on AWS)
● Experience designing, implementing, and deploying microservices
● Extensive hands-on technical experience with server software
● Proficient in Golang and JavaScript, and quick to learn new languages
● Experience in the Linux environment and a good understanding of its fundamentals and internals: filesystems, modern memory management, threads and processes, the user/kernel-space divide, etc.
● A good understanding of large-scale distributed systems in practice, including multi-tier architectures, application security, monitoring, and storage systems
● Working knowledge of the TCP/IP stack, internet routing, and load balancing
● Grit, drive, and a deep feeling of ownership

Bonus Points for Experience with the following:
● Golang
● TypeScript
● Kubernetes
● Terraform
● OpenTelemetry
● Istio
● Datadog
● Helm Charts
● HLS video transcoding, distribution & playback
● Experience designing, implementing, and running services in high-demand, high-traffic environments
● Experience with high-availability services
LTIMindtree is a global technology consulting and digital solutions company that enables enterprises across industries to reimagine business models, accelerate innovation, and maximize growth by harnessing digital technologies. As a digital transformation partner to more than 700 clients, LTIMindtree brings extensive domain and technology expertise to help drive superior competitive differentiation, customer experiences, and business outcomes in a converging world. Powered by nearly 90,000 talented and entrepreneurial professionals across 30+ countries, LTIMindtree, a Larsen & Toubro Group company, combines the industry-acclaimed strengths of the former Larsen and Toubro Infotech and Mindtree to solve the most complex business challenges and deliver transformation at scale. For more info, please visit www.ltimindtree.com