As an MLOps Data Engineer, you will be responsible for building high-quality ML datasets at scale, used to train ML models that power AI-enabled solutions. You will be responsible for establishing and executing the strategy for our organization's ML Data Pipelines, with an initial focus on agile ML Data Ops.
Must have:
Data Pipeline Design
ETL Pipeline Building
Python, Java, Bash
Airflow, KubeFlow
Good to have:
Generative AI
Snowflake, Spark
Public Cloud
Data Governance
Perks:
Global Presence
Unique Culture
About the job
Syncron is a leading SaaS company with over 20 years of experience, specializing in aftermarket solutions. Our Connected Service Experience (CSX) platform offers domain-fit solutions for:
Supply Chain optimization,
Pricing strategy,
Service Lifecycle Management (e.g. warranty management, field service management, service parts management, knowledge management).
Our company has a global presence, with offices in the US, UK, Germany, France, Italy, Japan, Poland and India, and group headquarters in Sweden. We build upon the belief that our greatest strength is our People. Our unique company culture has been appreciated by our Employees. With this we are winning the hearts and minds of world-leading organizations, such as JCB, Kubota, Electrolux, Toyota, Renault and Hitachi.
About The Role
You will join a team of talented and friendly Data Scientists in Machine Learning Operations (MLOps) and AI (Artificial Intelligence) as a Data Science Squad member. The team develops state-of-the-art Machine Learning-powered services for automated Supply Chain optimisation, Pricing strategy improvements, and Service Lifecycle Management, including Generative AI-powered Knowledge management, warranty claim fraud detection and more. The team performs full MLOps cycles, from customer pain discovery to production.
What would you do?
As an MLOps Data Engineer, you will be responsible for building high-quality ML datasets at scale, used to train ML models that power AI-enabled solutions.
Build foundational tools and data pipelines to ingest, normalize and clean the valuable data that is fundamental for our Data Scientists and ML engineers at Syncron.
You will be responsible for establishing and executing the strategy for our organization’s ML Data Pipelines, with an initial focus on agile ML Data Ops.
Identify the infrastructure components and data stack to be used; design and implement pipelines between data systems and teams, automation workflows, data enrichment and monitoring tools, all for AI models.
Code and contribute to the stack.
Dive into our dataset and design, implement and scale data pre- and post-processing pipelines for ML models.
Work on applied ML solutions in the areas of data mining, cleaning, normalizing and modelling.
Build data processing streams for cleaning and modelling text data for LLMs.
Work with the Privacy and Security team on data governance, risk and compliance initiatives.
Work on initiatives to ensure stability, performance and reliability of our data infrastructure.
Who you are?
An exceptional data engineer who is passionate about data for AI and the value it can bring to Syncron, who loves working with data ops at scale, and who is committed to the hard work necessary to continuously improve our ML data pipelines.
Bachelor's degree in Computer Science, Mathematics, Physics, or a related field.
Experience in statistical analysis and visualization on datasets using Pandas.
Experience designing and building highly available, distributed systems for data extraction, ingestion, normalization and processing of large data sets, in real time as well as in batch, for use across engineering teams, using orchestration frameworks like Airflow, KubeFlow or other pipeline tools.
Demonstrated prior experience in creating data pipelines for text data sets for NLP, large language models and generative AI.
Ability to produce well-engineered software, including appropriate automated test suites, technical documentation, and operational data strategy.
Excellent coding skills in Python, Java, Bash and SQL, and expertise with Git version control.
Experience using big data technologies (Snowflake, Airflow, Kubernetes, Docker, Helm, Spark, PySpark).
Experience with any public cloud environment (e.g. AWS).
Significant experience with relational databases and query authoring (SQL), as well as NoSQL databases such as DynamoDB.
Experience building and maintaining ETL (managing high-quality, reliable ETL pipelines).
Unsure if you meet all the job requirements but passionate about the role? Apply anyway! Syncron values diversity and welcomes all Candidates, even those with non-traditional backgrounds. We believe in transferable skills and a shared passion for success!
#Mid-Snr #Full-Time
The world is changing. Manufacturing companies are shifting from selling products to delivering services. And we are driving this transformation together with our Customers, by helping them reduce costs and manual processes. We are guiding them on their journey towards a fully connected service experience and making their brand stronger. Our goal: to make the complex simple. Visit syncron.com to get to know us better!
If you encounter any case of potential ethical or legal violations, you may submit a report to a dedicated Syncron Whistleblowing Platform here. You may request the Syncron Whistleblowing Procedure via the "ask a question" tab available here.