Company Description
Bosch Global Software Technologies Private Limited is a 100% owned subsidiary of Robert Bosch GmbH, one of the world's leading global suppliers of technology and services, offering end-to-end Engineering, IT and Business Solutions. With over 27,000 associates, it is Bosch's largest software development center outside Germany, making it the Technology Powerhouse of Bosch in India, with a global footprint and presence in the US, Europe and the Asia Pacific region.
Job Description
Roles & Responsibilities:
Data Architecture & Engineering
- Design and implement end-to-end data pipelines for ingestion, transformation, and storage of structured, semi-structured, and time-series data.
- Build both real-time and batch processing frameworks using Databricks, supporting scalable analytics and AI workloads (a minimal sketch follows this list).
- Develop and maintain ETL/ELT workflows using Python and SQL, ensuring reusability and maintainability.
- Architect and optimize data lakes/lakehouses (Azure Synapse, Delta Lake, BigQuery, or Snowflake) for efficient querying and cost control.
- Design and manage NoSQL databases (MongoDB) and time-series databases (InfluxDB, TimescaleDB, Azure Data Explorer) for sensor and operational data.
- Enable AI/ML readiness by developing feature pipelines, managing datasets, and integrating with model inference systems.
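To give a concrete flavour of this work, below is a minimal batch-ETL sketch in PySpark on Databricks that lands curated aggregates in Delta Lake. The paths, column names, and aggregation logic are illustrative assumptions, not prescribed by the role.

```python
from pyspark.sql import SparkSession, functions as F

# Assumes a Databricks-style environment where Delta Lake support is available;
# paths, columns, and metrics below are illustrative placeholders.
spark = SparkSession.builder.appName("sensor-etl").getOrCreate()

raw = (
    spark.read.json("/mnt/landing/sensor_events/")   # hypothetical landing zone
    .withColumn("event_ts", F.to_timestamp("event_ts"))
    .dropDuplicates(["device_id", "event_ts"])
)

daily = (
    raw.groupBy("device_id", F.to_date("event_ts").alias("event_date"))
    .agg(
        F.avg("temperature").alias("avg_temperature"),
        F.max("vibration").alias("max_vibration"),
    )
)

# Write curated aggregates as a Delta table, partitioned for efficient querying.
(
    daily.write.format("delta")
    .mode("overwrite")
    .partitionBy("event_date")
    .save("/mnt/curated/device_daily_metrics")
)
```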
Cloud & Integration
- Orchestrate and monitor data pipelines using Azure Data Factory, Azure Functions, and Event Hub for real-time ingestion and transformation.
- Build serverless, event-driven applications using Azure Functions (Python-based), AWS Lambda, or GCP Cloud Functions (see the sketch after this list).
- Implement hybrid data integration between edge, on-prem, and cloud using secure APIs, message queues, and connectors.
- Integrate data from IoT devices, ERP, MES, PLM, and simulation tools to enable enterprise-wide digital twin insights.
- Develop containerized microservices using Docker and Kubernetes to support portable, cloud-agnostic deployments across Azure, AWS, and GCP.
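As an illustration of the serverless, event-driven pattern above, here is a minimal sketch using the Azure Functions Python v2 programming model with an Event Hub trigger. The hub name, connection-setting name, and payload shape are assumptions.

```python
import json
import logging

import azure.functions as func

app = func.FunctionApp()

# Event hub name and connection-setting name are illustrative placeholders.
@app.event_hub_message_trigger(
    arg_name="event",
    event_hub_name="sensor-telemetry",
    connection="EVENTHUB_CONNECTION",
)
def ingest_telemetry(event: func.EventHubEvent) -> None:
    # Parse one telemetry message and log a summary; a real pipeline would
    # validate the payload and forward it to a time-series store.
    payload = json.loads(event.get_body().decode("utf-8"))
    logging.info(
        "Device %s reported %s readings",
        payload.get("deviceId"),
        len(payload.get("readings", [])),
    )
```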
Performance, Security & Governance
- Implement frameworks for data quality, lineage, and observability (Great Expectations, Azure Purview, OpenMetadata); an illustrative sketch follows this list.
- Enforce data governance, privacy, and compliance with standards such as GDPR, ISO 27001, and industry regulations.
- Optimize resource utilization and cost across compute, storage, and database layers.
- Establish data retention, access control, and lifecycle policies across multi-tenant environments.
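The data-quality item above is illustrated by the framework-agnostic sketch below: declarative, named expectations evaluated against a DataFrame. In practice this role would express such rules in a tool like Great Expectations; the column names and thresholds here are hypothetical.

```python
import pandas as pd

# Framework-agnostic sketch of declarative data-quality rules; production work
# would typically encode these in a dedicated tool such as Great Expectations.
RULES = {
    "device_id must be non-null":
        lambda df: df["device_id"].notna().all(),
    "temperature within plausible range":
        lambda df: df["temperature"].between(-40, 150).all(),
    "no duplicate (device_id, event_ts) pairs":
        lambda df: not df.duplicated(["device_id", "event_ts"]).any(),
}

def run_quality_checks(df: pd.DataFrame) -> dict[str, bool]:
    # Evaluate every rule and return a name -> pass/fail map for observability.
    return {name: bool(check(df)) for name, check in RULES.items()}

if __name__ == "__main__":
    sample = pd.DataFrame({
        "device_id": ["a1", "a2", None],
        "event_ts": pd.to_datetime(["2024-01-01", "2024-01-01", "2024-01-02"]),
        "temperature": [21.5, 300.0, 19.0],
    })
    print(run_quality_checks(sample))  # two of the three rules fail here
```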
Collaboration & Strategy
- Collaborate with cloud architects, AI/ML engineers, and domain experts to align data architecture with Industry 4.0 and Digital Twin goals.
- Evaluate and introduce emerging technologies such as vector databases, streaming analytics, and data mesh frameworks.
- Mentor junior engineers and promote best practices in Pythonic coding, DevOps, and GitOps workflows.
- Develop and maintain data engineering accelerators and reusable frameworks for internal adoption, as sketched below.
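As a toy example of such an accelerator, the sketch below standardises logging and timing for pipeline steps behind a single decorator. It is purely illustrative; any real internal framework would differ.

```python
import logging
import time
from functools import wraps
from typing import Callable

# Hypothetical internal accelerator: one decorator gives every pipeline step
# consistent logging, timing, and error reporting.
def pipeline_step(name: str) -> Callable:
    def decorator(fn: Callable) -> Callable:
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            logging.info("step %s started", name)
            try:
                result = fn(*args, **kwargs)
                logging.info("step %s finished in %.2fs",
                             name, time.perf_counter() - start)
                return result
            except Exception:
                logging.exception("step %s failed", name)
                raise
        return wrapper
    return decorator

@pipeline_step("normalise-readings")
def normalise(readings: list[dict]) -> list[dict]:
    # Example step: convert Fahrenheit readings to Celsius.
    return [{**r, "temperature_c": (r["temperature_f"] - 32) * 5 / 9}
            for r in readings]
```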
Qualifications
Required qualifications:
- Bachelor’s or Master’s degree in Computer Science, Data Engineering, or a related field.
- 8+ years of experience in data engineering, analytics, or big data systems.
Mandatory skills:
- Strong programming skills in Python and SQL for data transformation, orchestration, and automation.
- Expertise in Azure data services (Synapse, Data Factory, Event Hub, Azure Functions, Databricks).
- Hands-on experience with MongoDB, Cosmos DB, and time-series databases such as InfluxDB, TimescaleDB, or Azure Data Explorer (ADX).
- Proven experience with streaming frameworks (Kafka, Event Hub, Kinesis) and workflow orchestrators (Airflow, Argo, or Prefect); see the DAG sketch after this list.
- Proficiency in Docker and Kubernetes for containerization and scalable deployment.
- Familiarity with data lake/lakehouse architectures, NoSQL models, and cloud-agnostic patterns.
- Knowledge of CI/CD pipelines and infrastructure-as-code tools (Terraform, Bicep, ARM templates).
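For orientation, a minimal Airflow DAG of the kind this skill set implies is sketched below; the DAG id, schedule, and task bodies are placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

# DAG id, schedule, and task bodies are illustrative placeholders.
def extract(**context):
    print("pull raw events from the source system")

def transform(**context):
    print("clean and aggregate the extracted events")

with DAG(
    dag_id="sensor_daily_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",   # 'schedule_interval' on Airflow versions before 2.4
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    transform_task = PythonOperator(task_id="transform", python_callable=transform)
    extract_task >> transform_task   # transform runs only after extract succeeds
```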
Preferred skills:
- Experience with industrial IoT, Digital Twin data models, and protocols such as OPC-UA and MQTT (see the subscriber sketch after this list).
- Exposure to edge-to-cloud data flows and predictive maintenance or anomaly detection solutions.
- Knowledge of data quality, governance, and metadata management tools.
- Strong communication and analytical skills to align data solutions with business and operational KPIs.
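The MQTT item above might look like the following in practice: a small paho-mqtt subscriber for plant-floor telemetry. The broker address, topic layout, and payload shape are assumptions, and the snippet targets paho-mqtt 1.x.

```python
import json

import paho.mqtt.client as mqtt

# Broker address and topic layout are illustrative; assumes paho-mqtt 1.x
# (paho-mqtt 2.x additionally requires a CallbackAPIVersion argument to Client()).
def on_message(client, userdata, msg):
    reading = json.loads(msg.payload)
    print(f"{msg.topic}: temperature={reading.get('temperature')}")

client = mqtt.Client()
client.on_message = on_message
client.connect("broker.example.local", 1883)   # hypothetical plant-floor broker
client.subscribe("factory/+/telemetry")        # one topic per machine
client.loop_forever()
```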
Additional Information
Position Overview
We are seeking a highly skilled Senior Data Engineer to design, build, and optimize large-scale, cloud-native data platforms that power Digital Twin and Industrial AI solutions. This role focuses on developing high-performance data ingestion and transformation pipelines that unify IoT, enterprise, and AI/ML data, enabling real-time insights, scalability, and interoperability across hybrid environments.