About the job
Role: Senior Software Engineer – Spark and Database
HMH Software Engineering provides cutting-edge, individualized learning experiences to millions of students across the United States. We are as driven by this mission as we are by continuously improving ourselves and the way we work. Our offices are high-energy, collaborative beehives of activity where work is centered on small, autonomous teams that build great software. We trust each other, hold ourselves and our teammates accountable for results, and improve student outcomes with each release.
At HMH we constantly experiment with new approaches and novel ways of solving problems. We often succeed and sometimes stumble — either way we learn and move forward with more confidence than we had the day before. We are as passionate about new technologies and engineering craftsmanship as we are about transforming the EdTech industry itself.
If this sounds like you, let’s talk.
The Opportunity – Senior Spark and Database Developer for the HMH Reporting Platform
Senior Software Engineers personify the notion of constant improvement as they work with their team to build software that delivers on our mission to improve student outcomes. You’re not afraid to try new things, even if they don’t work out as expected. You are independent, self-directed, and high-energy, as eager to contribute to your team as you are to progress on your own path to software craftsmanship. You’ll thrive working in a fast-paced, low-friction environment where you are exposed to a wide range of cutting-edge technologies.
Reporting Platform
You will be working on the Reporting Platform, part of the HMH Educational Online/Digital Learning Platform. The Reporting team builds a highly scalable and highly available platform on a microservices architecture: Java backend microservices, a React JavaScript frontend, REST APIs, AWS RDS Postgres, Kafka and Kinesis on AWS, Spark with Scala, Kubernetes or Mesos orchestration, the Apache Airflow scheduler, DataDog for logging, monitoring, and alerting, and Concourse CI or Jenkins with Maven for builds.
Responsibilities
- Implement complex queries and stored procedures to support REST APIs and batch rollups of report data for customer organizations.
- Design, write, test, implement, and maintain database applications and procedures using SQL or other database programming languages.
- Resolve performance issues by tuning database systems, queries, and indexes.
- Use the Apache Airflow scheduler to set up database jobs that run automatically.
- Support streaming event processing using the Spark framework with Scala (see the sketch after this list).
- Create and manage data import and export (ETL) processes for the databases, and create and manage data integration scripts using file transfers, API calls, or other methods.
- Develop solutions using AWS database technologies like RDS Postgres and Aurora Postgres.
- Provide support for the systems architecture of the Reporting Platform.
- Set up monitoring dashboards and alerts in DataDog to catch issues proactively.
- Diagnose and troubleshoot database errors.
- Create automation for recurring database tasks.
- Automate deployments using Jenkins or Concourse.
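As a rough illustration of the streaming responsibility above, here is a minimal Spark Structured Streaming sketch in Scala. The broker address, topic name, window size, and console sink are illustrative assumptions, not HMH’s actual configuration; a production job would parse each event against a defined schema and write rollups to a sink such as Postgres or S3.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.window

object ReportingEventStream {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("reporting-event-stream")
      .getOrCreate()
    import spark.implicits._

    // Read raw events from Kafka; broker and topic are hypothetical placeholders.
    // Requires the spark-sql-kafka connector on the classpath.
    val events = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "broker:9092")
      .option("subscribe", "student-events")
      .load()
      .select($"timestamp", $"value".cast("string").as("payload"))

    // Count events per one-minute window, tolerating five minutes of lateness.
    val rollup = events
      .withWatermark("timestamp", "5 minutes")
      .groupBy(window($"timestamp", "1 minute"))
      .count()

    // Console sink for illustration only; a real job would use a database sink.
    rollup.writeStream
      .outputMode("update")
      .format("console")
      .start()
      .awaitTermination()
  }
}
```

The watermark bounds the state kept for the windowed aggregation; in practice the window size and sink would be chosen to match the cadence of the reporting rollups.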
Skills & Experience
Successful candidates must demonstrate an appropriate combination of:
- 6+ years of experience as a DB Developer, preferably with Postgres, creating and supporting commercial data warehouses and data marts.
- 2+ years of experience working with Apache Spark and Scala development.
- 1+ years of experience working with the Airflow scheduler and Python development.
- Strong hands-on working knowledge of managing databases on AWS, including RDS and Aurora.
- Strong command of SQL, SQL server tools, and ETL jobs, including stored procedures.
- Database technologies such as SQL, Aurora, and Redshift, and migration tools such as Liquibase or Flyway.
- Cloud technologies such as AWS and Azure.
- Data center operating technologies such as Apache Mesos, Apache Aurora, and Terraform, and container services such as Docker and Kubernetes.
- Advanced knowledge of database security and performance monitoring standards.
- Understanding of relational and dimensional data modeling.
- Shell scripting skills.
- Knowledge of DataDog for setting up monitoring and alerting dashboards.
- Working knowledge of Jenkins or Concourse tool for CI/CD.
- Ability to work independently and in a group to provide sound design and technology leadership.
- Self-starter attitude with initiative & creativity.
- Attention to detail, with the ability to handle interruptions and changing timelines and priorities.
- Ability to communicate and work effectively with all levels of the company.
- A related AWS DBA certification is preferred.
- Knowledge of AWS Database Migration Service and Lambda is a plus.