Description
About PulsePoint:
PulsePoint is a fast-growing healthcare technology company (with adtech roots) using real-time data to transform healthcare. We help brands and agencies interpret the hard-to-read signals across the health journey and unify these digital determinants of health with real-world data to produce the most dimensional view of the customer. Our award-winning advertising platforms use machine learning and programmatic automation to seamlessly activate this data, making marketing, predictive analytics, and decision support easy and instantaneous.
Our Data & Analytics team is at the very heart of what makes PulsePoint an innovative, fast-paced, and market-changing company.
Our path forward is through data, and this team is in the driver's seat for the journey.
The Big Picture:
You will build, deliver & continually innovate on PulsePoint's insightful data-driven solutions. Your efforts help alleviate friction points and streamline processes that enable internal teams to provide exceptional service, powering the decisions of our customers.
As a Data Engineer, ML/Data Science, you will use your data science and statistics expertise to contribute to R&D projects for DTC, new Data Products, and Bespoke Segments expansion. You can work fully remotely in India, and we will provide you with a company-issued laptop. This is a full-time (FTE) role.
In short, you will be the conduit through which we will revolutionize health decisions through real-time data.
Key Responsibilities:
Write robust, modular, production-ready code in Python and SQL, following best practices in OOP, version control (Git), and software design principles.
Collaborate with other data scientists to design and productionize ML models and integrate them into end-to-end data systems.
Build tools, frameworks and ETL/ELT pipelines to enable efficient data access, processing, and model deployment.
Apply a working knowledge of common ML algorithms (classification, regression, clustering, etc.) to support experimentation and solution design.
Here are some projects you can help with:
Data science & stats-related projects
Work on R&D projects for DTC
Help build new Data Products
Contribute to Bespoke Segments expansion
Help us design and define the methodology for our measurement products and user identification
Continuously improve the quality of HCP onboarding, targeting, and measurement
Audience IQ/DTC product development, Identity graph/Data IQ
Collaborate with internal teams to delight our customers with timely and accurate data reporting that meets all requirements
Research & implement new data products or capabilities
Automate data visualization and reporting capabilities that empower users (both internal and external) to access data on their own, thereby improving quality, accuracy, and speed
Synthesize raw data into actionable insights to drive business results, identify key trends and opportunities for business teams, and report the findings in a simple, compelling way
Evaluate and approve additional data partners or data assets to be utilized for identity resolution, targeting, or measurement
Enhance PulsePoint's data reporting and insights generation capability by publishing internal reports about Health data
Act as the “Subject Matter Expert” to help internal teams understand the capabilities of our platforms and how to implement and troubleshoot them
Requirements
Required qualifications:
2-6 years of hands-on experience as a Data Science Engineer, ML Engineer, or similar role
4-5+ years of relevant experience in:
Strong SQL skills for querying and managing structured datasets on cloud platforms and query engines such as GCP, AWS, and Trino
High proficiency in Excel (pivot tables, VLOOKUP, formulas, functions)
Data analysis & manipulation
Solid programming experience in Python, especially in production environments (modular design, data validation, error handling, testing)
At least a Bachelor’s degree in Business Intelligence and Analytics or closely related field
Practical experience with:
Distributed systems and cluster-computing frameworks such as Apache Spark, for large-scale data processing and machine learning with PySpark ML
Google Cloud Architecture covering BigQuery, Cloud Storage (GCS), Compute Engine VMs, Dataproc clusters
ML Pipeline Orchestration
Deploying and managing ML models, with working knowledge of bagging and boosting techniques, model performance metrics, hyperparameter tuning, etc.
MLOps practices, exposure to MLflow, Vertex AI, or other MLOps tools
Experience with Containerization (Docker) and Kubernetes
Knowledge of Airflow, Dagster, or similar orchestration tools
Proven experience in experimentation methods and statistical modeling in support of product development and optimization
Willing and able to work 3:30pm-12:30am IST; the role is fully remote
Preferred qualifications:
Experience with LookML & DBT
Understanding of Frontend Dev Tools
And one of:
Able to organize large data sets to answer critical questions, extrapolate trends, and tell a story
Experience in Programmatic/Adtech
Familiarity with health-related data sets
Watch this video to learn more about our culture and get a sense of what it’s like to work at PulsePoint!
What are ‘red flags’ for us:
Candidates won’t succeed here if they haven’t worked closely with data sets or have simply translated requirements created by others into SQL without a deeper understanding of how the data impacts our business and, in turn, our clients’ success metrics.
Selection Process (the order of these sessions may be subject to change):
SQL test (40 mins)
Intro call with recruiter (15 mins)
Hiring manager call (45 mins)
Call with Sr. Data Scientist (30 mins)
Calls with SVP of Data and VP of Data (30 mins each)