Overview
We are seeking an experienced Data Architect to design and implement modern data architectures. The role calls for strong software engineering principles, hands-on coding ability, and experience building data engineering frameworks. The ideal candidate will have a proven track record of implementing Databricks-based solutions in the healthcare industry, along with expertise in data catalog implementation and governance frameworks.
About the Role
As a Data Architect, you will be responsible for designing and implementing scalable, secure, and efficient data architectures on the Databricks platform. You will lead the technical design of data migration initiatives from legacy systems to a modern Lakehouse architecture, ensuring alignment with business requirements, industry best practices, and regulatory compliance.
Key Responsibilities
Design and implement modern data architectures using the Databricks Lakehouse platform
Lead the technical design of Data Warehouse/Data Lake migration initiatives from legacy systems
Develop data engineering frameworks and reusable components to accelerate delivery (see the illustrative sketch after this list)
Establish CI/CD pipelines and infrastructure-as-code practices for data solutions
Implement data catalog solutions and governance frameworks
Create technical specifications and architecture documentation
Provide technical leadership to data engineering teams
Collaborate with cross-functional teams to keep data solutions aligned with business requirements
Evaluate and recommend technologies, tools, and approaches for data initiatives
Ensure data architectures meet security, compliance, and performance requirements
Mentor junior team members on data architecture best practices
Stay current with emerging technologies and industry trends
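To give candidates a concrete sense of the framework-building work described above, here is a minimal, purely illustrative sketch of a reusable bronze-layer ingestion helper in PySpark. All names, paths, and tables (ingest_to_bronze, /mnt/raw/claims/, bronze.claims_raw) are hypothetical; the sketch assumes a Databricks runtime (or any Spark session with Delta Lake configured) and is not a specification of our internal tooling.

```python
# Illustrative only: a minimal reusable bronze-layer ingestion helper.
# All names and paths are hypothetical; assumes a Databricks runtime
# (or any Spark session with Delta Lake configured).
from pyspark.sql import SparkSession, DataFrame
from pyspark.sql import functions as F


def ingest_to_bronze(spark: SparkSession,
                     source_path: str,
                     source_format: str,
                     target_table: str) -> DataFrame:
    """Read raw files, stamp ingestion metadata, and append to a Delta table."""
    df = (spark.read.format(source_format).load(source_path)
          .withColumn("_ingested_at", F.current_timestamp())
          .withColumn("_source_file", F.input_file_name()))
    (df.write.format("delta")
       .mode("append")
       .option("mergeSchema", "true")  # tolerate additive schema drift
       .saveAsTable(target_table))
    return df


# Example call (hypothetical paths and table names):
# spark = SparkSession.builder.getOrCreate()
# ingest_to_bronze(spark, "/mnt/raw/claims/", "json", "bronze.claims_raw")
```

In practice such a helper would sit inside a larger framework with configuration, logging, and data quality checks; the point here is only the shape of the abstraction.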
Qualifications
Extensive experience in data architecture design and implementation
Strong software engineering background with expertise in Python or Scala
Proven experience building data engineering frameworks and reusable components
Experience implementing CI/CD pipelines for data solutions (an illustrative test sketch follows this list)
Expertise in infrastructure-as-code and automation
Experience implementing data catalog solutions and governance frameworks
Deep understanding of Databricks platform and Lakehouse architecture
Experience migrating workloads from legacy systems to modern data platforms
Strong knowledge of healthcare data requirements and regulations
Experience with cloud platforms (AWS, Azure, GCP) and their data services
Bachelor's degree in Computer Science, Information Systems, or related field; advanced degree preferred
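The CI/CD expectation above typically means data code ships with automated tests. Below is a hedged sketch of the kind of unit test such a pipeline might run on every commit; the transformation under test and all names are hypothetical, and it assumes pytest and a local PySpark session are available.

```python
# Illustrative only: the kind of unit test a CI pipeline for data code
# might run on every commit. Assumes pytest and pyspark are installed;
# the transformation and all names are hypothetical.
import pytest
from pyspark.sql import SparkSession
from pyspark.sql import functions as F


@pytest.fixture(scope="session")
def spark():
    # Small local session so tests run anywhere, not just on a cluster.
    return (SparkSession.builder.master("local[1]")
            .appName("ci-unit-tests").getOrCreate())


def add_row_hash(df):
    """Toy transformation under test: tag each row with a hash of its columns."""
    return df.withColumn("_row_hash", F.sha2(F.concat_ws("|", *df.columns), 256))


def test_add_row_hash_is_deterministic(spark):
    df = spark.createDataFrame([("a", 1), ("b", 2)], ["k", "v"])
    out1 = add_row_hash(df).select("_row_hash").collect()
    out2 = add_row_hash(df).select("_row_hash").collect()
    assert out1 == out2
```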
Technical Skills
Programming languages: Python and/or Scala (required)
Data processing frameworks: Apache Spark, Delta Lake
CI/CD tools: Jenkins, GitHub Actions, Azure DevOps
Infrastructure-as-code tools: Terraform, CloudFormation, or Pulumi
Data catalog tools: Databricks Unity Catalog, Collibra, Alation (a Unity Catalog sketch follows this list)
Data governance frameworks and methodologies
Data modeling and design patterns
API design and development
Cloud platforms: AWS, Azure, GCP
Container technologies: Docker, Kubernetes
Version control systems: Git
SQL and NoSQL databases
Data quality and testing frameworks
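As an illustration of the data catalog and governance items above, the sketch below creates Unity Catalog objects and applies least-privilege grants from PySpark. The catalog, schema, and group names (healthcare, clinical, data_analysts) are hypothetical, and the snippet assumes a Unity Catalog-enabled Databricks workspace and a principal with sufficient privileges.

```python
# Illustrative only: managing Unity Catalog objects and grants from PySpark.
# Catalog/schema/group names are hypothetical; assumes a Unity
# Catalog-enabled Databricks workspace and sufficient privileges.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Carve out a governed namespace for curated clinical data.
spark.sql("CREATE CATALOG IF NOT EXISTS healthcare")
spark.sql("CREATE SCHEMA IF NOT EXISTS healthcare.clinical")

# Least-privilege access: analysts can browse and read, nothing more.
spark.sql("GRANT USE CATALOG ON CATALOG healthcare TO `data_analysts`")
spark.sql("GRANT USE SCHEMA ON SCHEMA healthcare.clinical TO `data_analysts`")
spark.sql("GRANT SELECT ON SCHEMA healthcare.clinical TO `data_analysts`")
```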
Healthcare Industry Knowledge (Optional)
Healthcare data standards (HL7, FHIR, etc.; an illustrative FHIR sketch follows this section)
Clinical and operational data models
Healthcare interoperability requirements
Healthcare analytics use cases
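For context on the FHIR item above, the sketch below flattens a FHIR R4 Patient resource into a flat record of the kind that would land in a tabular Lakehouse layer. The sample payload is fabricated, and the chosen fields are only a small subset of the Patient resource.

```python
# Illustrative only: flattening a FHIR R4 Patient resource into a flat
# record suitable for a tabular Lakehouse layer. Field paths follow the
# FHIR R4 Patient spec; the sample payload is fabricated.
import json

sample_patient = json.loads("""
{
  "resourceType": "Patient",
  "id": "example-123",
  "name": [{"family": "Doe", "given": ["Jane"]}],
  "gender": "female",
  "birthDate": "1980-04-02"
}
""")


def flatten_patient(resource: dict) -> dict:
    """Pick a stable subset of Patient fields; tolerate missing elements."""
    name = (resource.get("name") or [{}])[0]
    return {
        "patient_id": resource.get("id"),
        "family_name": name.get("family"),
        "given_name": " ".join(name.get("given", [])),
        "gender": resource.get("gender"),
        "birth_date": resource.get("birthDate"),
    }


print(flatten_patient(sample_patient))
# {'patient_id': 'example-123', 'family_name': 'Doe', ...}
```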