L2 Production Support Engineer (Gen AI/LLM-Based Applications)

5-8 Years • Product Management

Job Summary

Job Description

This L2 Production Support Engineer role involves providing operational support for AI-driven and enterprise software applications. Key responsibilities include incident triage, root cause analysis, monitoring application health, supporting CI/CD deployments, and troubleshooting issues on cloud platforms. The role requires strong problem-solving skills and the ability to ensure high availability, reliability, and performance of production systems, including Docker/Kubernetes environments and various integration patterns. Participation in an on-call rotation is also required.
Must have:
  • Provide L2 support for production applications
  • Perform incident triage, root cause analysis, and issue resolution
  • Monitor application health using logs, alerts, and dashboards
  • Support CI/CD pipeline deployments and environment stability
  • Troubleshoot issues with deployed services on cloud platforms (AWS, Azure, GCP)
  • Collaborate with L3 engineering teams for complex problem resolution
  • Ensure adherence to security protocols (OAuth, SSO, Entra ID, Okta)
  • Maintain and troubleshoot Docker/Kubernetes-based deployments
  • Support varied integration patterns (REST, SOAP, gRPC, WebSockets, Batch, Webhooks)
  • Perform performance tuning, load testing analysis, and optimization
  • Maintain documentation of issues, resolutions, runbooks, and support procedures
  • Participate in an on-call rotation for 24x7 support
  • Experience with React JS, Next JS, Java, .NET, Python
  • Experience with AWS, Azure, GCP
  • Experience with Git, branching/merging, pipelines
  • Experience with Docker, Kubernetes
  • Experience with APIs (REST/SOAP), gRPC, WebSockets, batch jobs
Good to have:
  • Experience with Generative AI/LLM-based applications and platforms (Azure AI Studio, AWS Bedrock, Hugging Face)
  • Exposure to RAG pipelines, data ingestion, cleansing, and evaluation
  • Knowledge of IaC tools such as Terraform/Ansible for environment setup
  • Experience supporting large-scale, data-driven AI/ML applications

Job Details

We are looking for an experienced L2 Production Support Engineer to provide operational support for AI-driven and enterprise software applications. The role requires strong problem-solving skills, a deep understanding of modern application stacks (frontend, backend, cloud, and integrations), and the ability to ensure high availability, reliability, and performance of production systems.

Key Responsibilities

  • Provide L2 support for production applications, ensuring minimal downtime and quick resolution of incidents.
  • Perform incident triage, root cause analysis, and issue resolution for application, infrastructure, and integration-related problems.
  • Monitor application health using logs, alerts, and dashboards, and proactively prevent potential failures.
  • Support CI/CD pipeline deployments, rollback handling, and environment stability.
  • Work with cloud platforms (AWS, Azure, GCP) to troubleshoot issues with deployed services.
  • Collaborate with L3 engineering teams for complex problem resolution and permanent fixes.
  • Ensure adherence to security protocols (OAuth, SSO, Entra ID, Okta) in production environments.
  • Maintain and troubleshoot Docker/Kubernetes-based deployments.
  • Support varied integration patterns (REST, SOAP, gRPC, WebSockets, Batch, Webhooks).
  • Perform performance tuning, load testing analysis, and optimization of applications.
  • Maintain documentation of issues, resolutions, runbooks, and support procedures.
  • Participate in an on-call rotation to provide 24x7 support for critical applications.

Required Experience & Skills

  • Bachelor’s/Master’s degree in Computer Science, Engineering, or a related field.
  • 5–8 years of experience in Production Support or Application Support (L2 role).
  • Strong hands-on experience with:
      • Frontend & backend stacks: React JS, Next JS, Java, .NET, Python
      • Cloud platforms: AWS, Azure, GCP
      • CI/CD tools & VCS: Git, branching/merging, pipelines
      • Containers & Orchestration: Docker, Kubernetes
      • Integration troubleshooting: APIs (REST/SOAP), gRPC, WebSockets, batch jobs

Beneficial / Nice-to-Have

  • Experience with Generative AI/LLM-based applications and related platforms (Azure AI Studio, AWS Bedrock, Hugging Face).
  • Exposure to RAG pipelines, data ingestion, cleansing, and evaluation.
  • Knowledge of IaC tools such as Terraform/Ansible for environment setup.
  • Experience supporting large-scale, data-driven AI/ML applications.

E-mail resume to srinivas.adepu@p99soft.com
