About the role
Role Overview
You will own the architecture, development, and continuous improvement of Sully.ai’s NLP models powering the Receptionist and Assistant agents. Working cross‑functionally with Product, Clinical, and Reliability Engineering, you’ll translate clinical workflows into robust conversational AI solutions that meet HIPAA security and compliance requirements.
What You’ll Do
- Architect NLP Pipelines. Design end‑to‑end pipelines for intent detection, entity extraction, and dialogue management using Hugging Face Transformers.
- Fine‑Tune Transformer Models. Fine‑tune state‑of‑the‑art architectures (e.g., BERT, GPT) on domain‑specific data to optimize receptionist and assistant workflows, complementing them with prompt engineering and retrieval‑augmented generation (RAG).
- Define Evaluation Frameworks. Establish metrics and benchmarks (e.g., F1 score, ROUGE, MMLU) and A/B test protocols to measure dialogue accuracy, user satisfaction, and model latency.
- Deploy & Scale. Containerize models with Docker, serve them via FastAPI, and orchestrate them on Kubernetes for high availability and observability.
- Lead MLOps Best Practices. Build CI/CD pipelines for model training, testing, and versioning; integrate monitoring and alerting for data drift and performance regressions.
- Collaborate & Mentor. Partner with Product and Clinical teams to curate training data, refine user flows, and onboard new engineers into our NLP practice.
What You’ll Bring
- 5+ years of software engineering experience, with 3+ years focused on ML/NLP in production settings.
- Proficiency in Python and deep learning frameworks (PyTorch or TensorFlow), with hands‑on experience using Hugging Face Transformers.
- Demonstrated success in fine‑tuning and deploying transformer‑based models for conversational AI or related NLP applications.
- Experience building and scaling RESTful services with FastAPI, containerizing with Docker, and managing Kubernetes deployments.
- Strong analytical skills and familiarity with evaluation metrics and benchmarks for NLP systems (e.g., F1, ROUGE, MMLU).
- Excellent communication and collaboration skills in fast‑paced, cross‑functional teams.
Tech Stack
- Languages & Frameworks: Python, PyTorch/TensorFlow, Hugging Face Transformers.
- APIs & Services: FastAPI, Docker, Kubernetes, CI/CD (GitHub Actions, Jenkins).
- Cloud & Data: AWS/GCP/Azure, SQL/NoSQL databases.
- AI & MLOps: Prompt engineering, RAG, model versioning, monitoring & alerting.