L2 Production Support Engineer (Gen AI/LLM-Based Applications)

P99Soft

5-8 Years | Hyderabad, Telangana, India (On Site) | Full Time | 4 months ago

Job Summary

This L2 Production Support Engineer role involves providing operational support for AI-driven and enterprise software applications. Key responsibilities include incident triage, root cause analysis, monitoring application health, supporting CI/CD deployments, and troubleshooting issues on cloud platforms. The role requires strong problem-solving skills and the ability to ensure high availability, reliability, and performance of production systems, including Docker/Kubernetes environments and various integration patterns. Participation in an on-call rotation is also required.

Must Have

Provide L2 support for production applications
Perform incident triage, root cause analysis, and issue resolution
Monitor application health using logs, alerts, and dashboards
Support CI/CD pipeline deployments and environment stability
Troubleshoot issues with deployed services on cloud platforms (AWS, Azure, GCP)
Collaborate with L3 engineering teams for complex problem resolution
Ensure adherence to security protocols (OAuth, SSO, Entra ID, Okta)
Maintain and troubleshoot Docker/Kubernetes-based deployments
Support varied integration patterns (REST, SOAP, gRPC, WebSockets, Batch, Webhooks)
Perform performance tuning, load testing analysis, and optimization
Maintain documentation of issues, resolutions, runbooks, and support procedures
Participate in an on-call rotation for 24x7 support
Experience with React, Next.js, Java, .NET, Python
Experience with AWS, Azure, GCP
Experience with Git, branching/merging, pipelines
Experience with Docker, Kubernetes
Experience with APIs (REST/SOAP), gRPC, WebSockets, batch jobs

Good to Have

Experience with Generative AI/LLM-based applications and platforms (Azure AI Studio, AWS Bedrock, Hugging Face)
Exposure to RAG pipelines, data ingestion, cleansing, and evaluation
Knowledge of IaC tools such as Terraform/Ansible for environment setup
Experience supporting large-scale, data-driven AI/ML applications

Job Description

We are looking for an experienced L2 Production Support Engineer to provide operational support for AI-driven and enterprise software applications. The role requires strong problem-solving skills, a deep understanding of modern application stacks (frontend, backend, cloud, and integrations), and the ability to ensure high availability, reliability, and performance of production systems.

Key Responsibilities

Provide L2 support for production applications, ensuring minimal downtime and quick resolution of incidents.
Perform incident triage, root cause analysis, and issue resolution for application, infrastructure, and integration-related problems.
Monitor application health using logs, alerts, and dashboards, and proactively prevent potential failures.
Support CI/CD pipeline deployments, rollback handling, and environment stability.
Work with cloud platforms (AWS, Azure, GCP) to troubleshoot issues with deployed services.
Collaborate with L3 engineering teams for complex problem resolution and permanent fixes.
Ensure adherence to security protocols (OAuth, SSO, Entra ID, Okta) in production environments.
Maintain and troubleshoot Docker/Kubernetes-based deployments.
Support varied integration patterns (REST, SOAP, gRPC, WebSockets, Batch, Webhooks).
Perform performance tuning, load testing analysis, and optimization of applications.
Maintain documentation of issues, resolutions, runbooks, and support procedures.
Participate in an on-call rotation to provide 24x7 support for critical applications.

Required Experience & Skills

Bachelor’s/Master’s degree in Computer Science, Engineering, or a related field.
5–8 years of experience in Production Support or Application Support (L2 role).
Strong hands-on experience with:
Frontend & backend stacks: React, Next.js, Java, .NET, Python
Cloud platforms: AWS, Azure, GCP
CI/CD tools & VCS: Git, branching/merging, pipelines
Containers & Orchestration: Docker, Kubernetes
Integration troubleshooting: APIs (REST/SOAP), gRPC, WebSockets, batch jobs

Beneficial / Nice-to-Have

Experience with Generative AI/LLM-based applications and related platforms (Azure AI Studio, AWS Bedrock, Hugging Face).
Exposure to RAG pipelines, data ingestion, cleansing, and evaluation.
Knowledge of IaC tools such as Terraform/Ansible for environment setup.
Experience supporting large-scale, data-driven AI/ML applications.

E-mail resume to srinivas.adepu@p99soft.com
