Production Support Engineer

29 Minutes ago • 2 Years +
Software Development & Engineering

Job Description

As a Production Support Engineer, you will be responsible for supporting, monitoring, and maintaining the high availability of Tala's platform across all markets. This role involves global collaboration with engineering, CX, product, and program teams for incident response and post-mortem processes. You will also work with CX and Collections to identify product improvement areas, continuously review and enhance monitoring and alerting systems, and contribute to a robust financial infrastructure.
Good To Have:
  • Speak Vietnamese
Must Have:
  • Ownership of risk event process for PH & East Asia timezone
  • Coordinate teams responding to incidents
  • Communicate effectively during incidents
  • Oversee post-mortem and monitor follow-up actions
  • Ownership of escalations from in-country CXCL guild
  • Debugging and identifying problems
  • Resolving issues, escalating when necessary
  • Tracking and reporting on pending issues
  • Continuous improvement of monitoring dashboards and alerts
  • Identify patterns in customer and product issues
  • Propose improvements based on customer/product issues
  • Share product learning and knowledge globally
  • Identify and communicate repeating themes around risk events
  • Propose improvements to prevent recurrence of issues
  • Keep track of metrics related to production performance
  • Continuously improve documentation library
  • 2+ years experience in technology with microservices architecture
  • 1+ years experience in incident response or similar role
  • Experience working with a global remote team
  • Knowledge of AWS CloudWatch, SumoLogic, APM monitoring (NewRelic, Instana), mobile (Crashlytics data), BI (Looker, Snowflake)
  • Knowledge of relational databases, BI querying languages
  • Experience with Postman or scripting API queries
  • Excellent debugging and documentation skills
  • Ability to coordinate incident response and communicate effectively with stakeholders

About Tala

At Tala, we’re applying advanced technology and human creativity to solve what legacy institutions can’t or won't. We are a global financial infrastructure company on a mission to unleash the economic power of the global majority, recognizing that today’s financial infrastructure doesn’t work for most of the world’s population.

We’re the first and only platform to combine the intelligence of a credit bureau, the payments execution of a fintech, and the relationship expertise of a bank into one vertically integrated solution. Our platform is powered by an expansive moat of proprietary data and AI/ML decisioning technology, enabling us to deliver instant, reliable liquidity personalized to every customer's needs. Through our flagship credit app, we’ve disbursed over $7 billion in credit to more than 12 million customers across Latin America, Southeast Asia, and East Africa. These customers have leveraged Tala products to start and expand small businesses, manage day-to-day needs, and pursue their financial goals.

Our pioneering work and proven impact have earned us consistent recognition, including being named to:

  • CNBC’s Disruptor 50 for five years.
  • CNBC’s World's Top Fintech Companies for two consecutive years.
  • Forbes’ Fintech 50 list for nine consecutive years.

Visionary investors, persuaded by the economic power of the global majority, have committed half a billion dollars in equity and debt to Tala's success.

Given the global nature of our team, we operate on a remote-first approach with office hubs in Santa Monica, CA (HQ); Nairobi, Kenya; Mexico City, Mexico; Manila, the Philippines; and Bangalore, India.

Most Talazens join us because they connect with our mission. If you are energized by the impact you can make at Tala, we’d love to hear from you!

About the Role

As a Production Support Engineer, you will support, monitor, and maintain the high availability of our platform across all of our markets, in collaboration with team members globally.

You’ll work closely with engineering, CX, product and program teams globally for production incident response and post-mortem processes.

You’ll work closely with the CX and Collections teams to discover areas of improvement for our product based on their feedback and on customer communications.

You’ll continuously review and improve our existing monitoring and alerting systems.

What You'll Do

  • Ownership of the risk event process for the PH & East Asia timezone: you’ll help coordinate teams responding to an incident, communicate effectively, oversee the post-mortem, and confirm that follow-up action items are completed
  • Ownership of escalations from the in-country CXCL guild: debugging and identifying problems, resolving when possible and escalating to appropriate teams when necessary.
  • Tracking and reporting on pending issues, and regular updates on open items
  • Continuous improvement of our monitoring dashboards and alerts
  • In collaboration with the CX team, identify patterns in customer and product issues and propose improvements
  • In collaboration with Production Support Engineers globally, share product learnings and knowledge, and exchange ideas
  • Identify and communicate repeating themes around risk events and propose improvements to prevent recurrence of the same issues
  • Keep track of metrics related to production performance and identify areas of improvement
  • Continuous improvement of our documentation library to allow faster onboarding of new team members and more efficient response times

What You'll Need

  • 2+ years of experience working in a technology environment, including experience with microservices architecture
  • 1+ years of experience in incident response or a similar role
  • Experience working with a remote team in a global environment
  • Knowledge of various monitoring platforms such as AWS CloudWatch, SumoLogic, APM monitoring (NewRelic, Instana), mobile (Crashlytics data), BI (Looker, Snowflake)
  • Knowledge of relational databases and BI query languages, sufficient to construct queries during investigations
  • Experience working with tools like Postman, or scripting API queries
  • Excellent debugging and documentation skills
  • Ability to coordinate incident response and communicate effectively with stakeholders from a variety of teams across different timezones
  • Ability to remain calm under pressure while resolving a production incident
  • Candidates with a QA, SRE, or similar background are encouraged to apply
  • Big plus if you speak Vietnamese

Our vision is to build a new financial ecosystem where everyone can participate on equal footing and access the tools they need to be financially healthy. We strongly believe that inclusion fosters innovation and we’re proud to have a diverse global team that represents a multitude of backgrounds, cultures, and experience. We hire talented people regardless of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, or disability status.

We may use artificial intelligence (AI) tools to support parts of the hiring process, such as reviewing applications, analyzing resumes, or assessing responses. These tools assist our recruitment team but do not replace human judgment. Final hiring decisions are ultimately made by humans. If you would like more information about how your data is processed, please contact us.
