POSITION SUMMARY
As a Sr. Data Engineer at Oportun, you will be a key member of our team, responsible for designing, developing, and maintaining sophisticated software and data platforms in support of the engineering group's charter. Your mastery of a technical domain enables you to take on business problems and solve them with technical solutions. With your depth of expertise and leadership abilities, you will actively contribute to architectural decisions, mentor junior engineers, and collaborate closely with cross-functional teams to deliver high-quality, scalable software solutions that advance our impact in the market. In this role you will have the opportunity to lead the technology effort for large initiatives (cross-functional, multi-month projects), from technical requirements gathering through final, successful delivery of the product.
RESPONSIBILITIES
- Database Design & Architecture
  - Design, implement, and maintain optimal database schemas for relational (MariaDB) and NoSQL (MongoDB) databases.
  - Participate in data modeling efforts to support analytics in Databricks.
- Performance Monitoring & Tuning
  - Monitor and tune all three database platforms to sustain optimal performance.
  - Use profiling tools (e.g., EXPLAIN, query plans, system logs) to identify and resolve bottlenecks, as in the sketch below.
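
To make the tuning workflow concrete, here is a minimal, illustrative sketch of running EXPLAIN from Python against MariaDB. The connection details and the `orders` table are hypothetical, not part of any actual schema.

```python
# Minimal sketch of query profiling against MariaDB. The credentials and
# the `orders` table are placeholders for illustration only.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="dba", password="secret", database="appdb"
)
cur = conn.cursor()

# EXPLAIN shows the optimizer's plan; a full table scan (type=ALL) on a
# large table usually signals a missing index.
cur.execute("EXPLAIN SELECT * FROM orders WHERE customer_id = %s", (42,))
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```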
- Security & Compliance
  - Implement access controls, encryption, and database hardening techniques.
  - Manage user roles and privileges across MariaDB, MongoDB, and Databricks (see the provisioning sketch below).
  - Ensure compliance with data governance policies (e.g., GDPR, HIPAA).
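
As an illustration of privilege management on MariaDB, the sketch below provisions a least-privilege, read-only user; the user name, host pattern, and schema name are placeholders.

```python
# Illustrative sketch: provisioning a least-privilege reporting user in
# MariaDB. The user, host pattern, and schema names are hypothetical.
import mysql.connector

conn = mysql.connector.connect(host="localhost", user="admin", password="secret")
cur = conn.cursor()

# Grant read-only access to a single schema rather than broad privileges.
cur.execute("CREATE USER IF NOT EXISTS 'report_ro'@'10.0.%' IDENTIFIED BY 'changeme'")
cur.execute("GRANT SELECT ON analytics.* TO 'report_ro'@'10.0.%'")
cur.execute("FLUSH PRIVILEGES")

cur.close()
conn.close()
```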
- Backup & Recovery
  - Implement and maintain backup/recovery solutions for all database platforms.
  - Periodically test restore procedures to ensure business continuity, as sketched below.
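
A hedged sketch of a logical backup plus a restore test, driven from Python with mysqldump; the paths, credentials, and database names are placeholders, and a production setup would also encrypt and ship the dump off-host.

```python
# Illustrative backup-and-restore-test flow. All paths, credentials, and
# database names are hypothetical.
import subprocess
from datetime import date

dump_file = f"/backups/appdb-{date.today()}.sql"

# --single-transaction takes a consistent snapshot of InnoDB tables
# without locking writers.
with open(dump_file, "w") as out:
    subprocess.run(
        ["mysqldump", "--single-transaction", "-u", "backup_user",
         "-psecret", "appdb"],
        stdout=out, check=True,
    )

# Restore into a scratch database to prove the dump is actually usable.
subprocess.run(
    f"mysql -u backup_user -psecret restore_test < {dump_file}",
    shell=True, check=True,
)
```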
- Data Integration & ETL Support
  - Support and optimize ETL pipelines between MongoDB, MariaDB, and Databricks (one hop is sketched below).
  - Work with data engineers to integrate data sources for analytics.
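
For illustration, one hop of such a pipeline might look like the PySpark sketch below, reading from MariaDB over JDBC and landing a Delta table; the JDBC URL, table name, and target path are all hypothetical.

```python
# Minimal PySpark sketch of one ETL hop: MariaDB over JDBC into a Delta
# table for analytics. URL, credentials, and paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("mariadb_to_delta").getOrCreate()

orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mysql://mariadb-host:3306/appdb")
    .option("dbtable", "orders")
    .option("user", "etl_user")
    .option("password", "secret")
    .load()
)

# Land the data in Delta format so downstream analytics can query it.
(orders.write.format("delta")
    .mode("overwrite")
    .save("s3://data-lake/bronze/orders"))
```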
- Monitoring & Incident Response
  - Set up and monitor database alerts.
  - Troubleshoot incidents, resolve outages, and perform root cause analysis; a minimal health probe is sketched below.
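
A minimal, illustrative health probe of a MongoDB replica set using PyMongo; the host and alerting hook are placeholders, and a real deployment would page an on-call channel rather than print.

```python
# Simplified replica-set health check. Hosts are hypothetical; the state
# check is deliberately coarse (e.g., arbiters would need whitelisting).
from pymongo import MongoClient

client = MongoClient("mongodb://mongo-0:27017/?replicaSet=rs0")
status = client.admin.command("replSetGetStatus")

for member in status["members"]:
    if member["stateStr"] not in ("PRIMARY", "SECONDARY"):
        # In production this would page via PagerDuty/Slack, not print.
        print(f"ALERT: {member['name']} is {member['stateStr']}")
```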
- MariaDB-Specific Responsibilities
  - Administer MariaDB instances (standalone, replication, Galera Cluster).
  - Optimize SQL queries and indexing strategies.
  - Maintain stored procedures, functions, and triggers.
  - Manage schema migrations and upgrades with minimal downtime.
  - Ensure ACID compliance and transaction management (see the sketch below).
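
To illustrate transaction management, the sketch below wraps two dependent writes in a single MariaDB transaction so they commit or roll back together; the `accounts` table is hypothetical.

```python
# Sketch of explicit transaction management: either both writes commit or
# neither does. Table and column names are hypothetical.
import mysql.connector

conn = mysql.connector.connect(
    host="localhost", user="app", password="secret", database="appdb"
)
try:
    cur = conn.cursor()
    cur.execute("UPDATE accounts SET balance = balance - 100 WHERE id = %s", (1,))
    cur.execute("UPDATE accounts SET balance = balance + 100 WHERE id = %s", (2,))
    conn.commit()    # atomic: both changes become visible together
except mysql.connector.Error:
    conn.rollback()  # on any failure, leave the data untouched
    raise
finally:
    conn.close()
```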
- MongoDB-Specific Responsibilities
  - Manage replica sets and sharded clusters.
  - Perform capacity planning for large document collections.
  - Tune document models and access patterns for performance, as in the sketch below.
  - Set up and monitor MongoDB Ops Manager / Atlas (if used).
  - Automate backup and archival strategies for NoSQL data.
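
As a sketch of access-pattern tuning, the snippet below adds a compound index matching a hot query and verifies it with explain(); the collection and field names are hypothetical.

```python
# Hedged sketch of access-pattern tuning in MongoDB: add a compound index
# for a hot query, then confirm it is used. Names are hypothetical.
from pymongo import MongoClient, ASCENDING, DESCENDING

client = MongoClient("mongodb://localhost:27017")
events = client.appdb.events

# Index designed for "recent events per user" queries.
events.create_index([("user_id", ASCENDING), ("created_at", DESCENDING)])

plan = events.find({"user_id": 42}).sort("created_at", -1).explain()
# IXSCAN in the winning plan means the index is used; COLLSCAN means not.
print(plan["queryPlanner"]["winningPlan"])
```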
- Databricks-Specific Responsibilities
  - Manage Databricks workspace permissions and clusters.
  - Collaborate with data engineers to optimize Spark jobs and Delta Lake usage (see the maintenance sketch below).
  - Ensure proper data ingestion, storage, and transformation in Databricks.
  - Support CI/CD deployment of notebooks and jobs.
  - Integrate Databricks with external data sources (MariaDB, MongoDB, S3, ADLS).
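
An illustrative Delta Lake maintenance snippet is shown below; the table and column names are hypothetical, and `spark` is the session Databricks provides in notebooks and jobs.

```python
# Illustrative Databricks maintenance: compact a Delta table and co-locate
# data for a common filter column. Table/column names are hypothetical;
# `spark` is supplied by the Databricks runtime.
spark.sql("OPTIMIZE analytics.orders ZORDER BY (customer_id)")

# Remove files no longer referenced by the table's transaction log.
spark.sql("VACUUM analytics.orders RETAIN 168 HOURS")
```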
- Collaboration & Documentation
  - Collaborate with developers, data scientists, and DevOps engineers.
  - Maintain up-to-date documentation on data architecture, procedures, and standards.
  - Provide training or onboarding support for other teams on database tools.
REQUIREMENTS
- Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
- 5+ years of experience in data engineering, with a focus on data architecture, ETL, and database management.
- Proficiency in programming languages such as Python/PySpark and Java or Scala.
- Expertise in big data technologies such as Hadoop, Spark, Kafka, etc.
- In-depth knowledge of SQL and experience with a range of database technologies (e.g., PostgreSQL, MariaDB/MySQL, NoSQL databases).
- Expertise in building complex end-to-end data pipelines.
- Experience designing job schedules with orchestration and CI/CD tools such as Airflow, Jenkins, or Databricks Workflows (see the Airflow sketch after this list).
- Ability to lead ETL migrations from Talend to Databricks PySpark.
- Demonstrated ability to build reusable capabilities, utilities, and tools that accelerate complex business processes.
- Ability to work in an Agile environment (Scrum, Lean, Kanban, etc.).
- Ability to mentor junior team members.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP) and their data services (e.g., AWS Redshift, S3, Azure SQL Data Warehouse).
- Strong leadership, problem-solving, and decision-making skills.
- Excellent communication and collaboration abilities.
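
As referenced above, here is a hedged sketch of the kind of job schedule this role would design: an Airflow DAG submitting a nightly Databricks run. The connection id, cluster spec, and notebook path are placeholders, and the `schedule` argument assumes Airflow 2.4+ (older versions use `schedule_interval`).

```python
# Illustrative orchestration sketch: nightly Databricks run from Airflow.
# Job settings, connection id, and notebook path are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.databricks.operators.databricks import (
    DatabricksSubmitRunOperator,
)

with DAG(
    dag_id="nightly_orders_etl",
    start_date=datetime(2024, 1, 1),
    schedule="0 2 * * *",   # 02:00 daily
    catchup=False,
) as dag:
    run_etl = DatabricksSubmitRunOperator(
        task_id="run_orders_etl",
        databricks_conn_id="databricks_default",
        new_cluster={
            "spark_version": "13.3.x-scala2.12",
            "node_type_id": "i3.xlarge",
            "num_workers": 2,
        },
        notebook_task={"notebook_path": "/ETL/orders_nightly"},
    )
```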
PREFERRED SKILLS AND TOOLS
- MariaDB Tools: mysqldump, mysqladmin, Percona Toolkit
- MongoDB Tools: mongodump, mongotop, mongoexport, Atlas UI
- Databricks Tools: Jobs UI, Databricks CLI, REST API, SQL Analytics
- Scripting: Bash, Python, PowerShell
- Monitoring: Prometheus, Grafana, CloudWatch, DataDog
- Version Control & CI/CD: Git, Jenkins, Terraform (for infrastructure-as-code)
- Preferred cloud provider: AWS