Senior AI/NLP Engineer

1 Month ago • 4-5 Years • Research Development

Job Details

Project description

We are looking for a skilled Document AI / NLP Engineer to develop intelligent systems that extract meaningful data from documents such as PDFs, scanned images, and forms. In this role, you will build document processing pipelines using OCR and NLP technologies, fine-tune ML models for tasks like entity extraction and classification, and integrate those solutions into scalable cloud-based applications. You will collaborate with cross-functional teams to deliver high-performance, production-ready pipelines and stay up to date with advancements in the document understanding and machine learning space.

Responsibilities

  • Design, build, and optimize document parsing pipelines using tools like Amazon Textract, Azure Form Recognizer, or Google Document AI.
  • Perform data preprocessing, labeling, and annotation for training machine learning and NLP models.
  • Fine-tune or train models for tasks such as Named Entity Recognition (NER), text classification, and layout understanding using PyTorch, TensorFlow, or Hugging Face Transformers.
  • Integrate document intelligence capabilities into larger workflows and applications using REST APIs, microservices, and cloud components (e.g., AWS Lambda, S3, SageMaker).
  • Evaluate model and OCR accuracy, applying post-processing techniques or heuristics to improve precision and recall.
  • Collaborate with data engineers, DevOps, and product teams to ensure solutions are robust, scalable, and meet business KPIs.
  • Monitor, debug, and continuously enhance deployed document AI solutions.
  • Maintain up-to-date knowledge of industry trends in OCR, Document AI, NLP, and machine learning.
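
As an illustration of the evaluation work listed above, here is a minimal sketch of field-level precision and recall for extracted document data. The function name, the field names, and the normalization rule (collapse whitespace, lowercase) are assumptions made for this example; they are not taken from the posting or from any specific OCR service.

```python
from collections import Counter


def field_precision_recall(predicted, expected):
    """Compare extracted (field, value) pairs with labeled ground truth.

    Both arguments are lists of (field_name, value) tuples. Values are
    normalized (whitespace collapsed, lowercased) before comparison, a
    simple post-processing heuristic that forgives harmless OCR spacing
    and casing differences. This rule is an assumption for illustration.
    """
    def normalize(pairs):
        return Counter(
            (name, " ".join(value.split()).lower()) for name, value in pairs
        )

    pred, gold = normalize(predicted), normalize(expected)
    # Multiset intersection counts pairs present in both prediction and truth.
    true_positives = sum((pred & gold).values())
    precision = true_positives / sum(pred.values()) if pred else 0.0
    recall = true_positives / sum(gold.values()) if gold else 0.0
    return precision, recall


# Hypothetical extraction output and labels for a single invoice.
predicted = [("invoice_number", "INV-001"), ("total", " 120.00 "), ("date", "2024-01-05")]
expected = [("invoice_number", "inv-001"), ("total", "120.00"), ("vendor", "Acme")]
precision, recall = field_precision_recall(predicted, expected)
```

In this made-up case two of the three predicted fields match the labels and two of the three labeled fields are recovered, so both metrics come out at two thirds. A production evaluation would likely need field-specific normalization (dates, currency) instead of one generic rule.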

Skills

Must have

  • 4-5 years of hands-on experience in machine learning, document AI, or NLP-focused roles.
  • Strong expertise in OCR and document-understanding tools and frameworks, especially Amazon Textract, Azure Form Recognizer, Google Document AI, or open-source options such as Tesseract, PaddleOCR, or LayoutLM.
  • Solid programming skills in Python and familiarity with ML/NLP libraries: scikit-learn, spaCy, transformers, PyTorch, TensorFlow, etc.
  • Experience working with structured and unstructured data formats, including PDF, images, JSON, and XML.
  • Hands-on experience with REST APIs, microservices, and integrating ML models into production pipelines.
  • Working knowledge of cloud platforms, especially AWS (S3, Lambda, SageMaker) or their equivalents.
  • Understanding of NLP techniques such as NER, text classification, and language modeling.
  • Strong debugging, problem-solving, and analytical skills.
  • Clear verbal and written communication skills for technical and cross-functional collaboration.

Nice to have

  • N/A

Other

  • Languages: English (B2, Upper Intermediate)
  • Seniority: Senior

About The Company

Empower your future with Luxoft: Innovate, thrive and grow in a software-defined world.

