Senior Platform Engineer
Dash0
Job Summary
Dash0 is building an easy-to-use, OpenTelemetry-native observability platform, aiming to make observability accessible and delightful for every developer. As a Senior Platform Engineer, you will design, build, and maintain the foundational infrastructure and tools supporting our software products. You will collaborate with cross-functional teams to ensure reliability, scalability, and security, enabling efficient delivery of high-quality software. This role involves architectural design, automation, troubleshooting, and continuous improvement.
Must Have
- Design, build, and maintain foundational infrastructure and tools.
- Collaborate with cross-functional teams to ensure reliability, scalability, and security.
- Develop and influence platform architecture, making high-level design decisions.
- Develop and maintain CI/CD pipelines for continuous integration and delivery.
- Conduct code reviews to ensure adherence to coding standards and best practices.
- Monitor and troubleshoot platform issues, implementing solutions for high availability.
- Ensure compliance with security best practices and regulatory requirements.
- Provide technical support and on-call coverage for applications, systems, and infrastructure.
- Continuously evaluate and implement new technologies and tools.
- Identify and mitigate technical risks impacting project timelines or quality.
- Create and maintain technical documentation.
- Oversee codebase maintenance and evolution, including refactoring.
- Stay updated on industry trends, emerging technologies, and best practices.
- Address complex technical challenges and provide innovative solutions.
- Design, deploy, and maintain cloud-based infrastructure using DevOps practices.
- Automate deployment, configuration, and monitoring processes.
- Proficient in diagnosing and resolving complex technical issues.
- Excellent verbal and written communication skills for effective collaboration.
- Ability to adapt to changing technologies, project requirements, and business priorities.
- Strong collaboration skills for working towards common goals.
Good to Have
- Knowledge of GitOps tools like ArgoCD.
- Understanding of software-defined networking (SDN) and virtual networking concepts.
- Master’s Degree or Advanced Courses in software engineering, system architecture, cloud computing, or data science.
- Certifications relevant to specific technical skills (e.g., AWS Certified Solutions Architect, Certified ScrumMaster).
Perks & Benefits
- Competitive salary & equity package.
- Fully remote company and a flexible work environment.
- Quarterly come-togethers in Europe for fun, team-building and discussions.
- Collaborative and supportive team culture.
Job Description
Senior Platform Engineer
Location
East Coast - remote
Employment Type
Full time
Department
Engineering / R&D
About Dash0
Imagine a world where observability is... well, easy. That’s what we’re building at Dash0. We’re not just another observability company; we’re a team of passionate experts obsessed with making observability accessible and delightful for every developer.
We are OpenTelemetry-native, ensuring seamless interoperability within modern observability ecosystems. Our “welcome present” to the OpenTelemetry community, OTelBin, is a free editing, visualization, and validation tool for OpenTelemetry Collector configurations, and it has been extremely well received.
Our mission is to make Dash0 the best observability experience in the world, and we’re seeking driven talent to shape that journey and grow alongside us.
The Opportunity
As a Senior Platform Engineer, you will be responsible for designing, building, and maintaining the foundational infrastructure and tools that support our software products and services. You will collaborate with cross-functional teams to ensure reliability, scalability, and security across our platform, enabling our development teams to deliver high-quality software efficiently.
What You’ll Do
- Architectural Design: Develop and influence the platform architecture, making high-level design decisions and ensuring that technical solutions align with business goals.
- CI/CD Pipelines: Develop and maintain CI/CD pipelines to enable continuous integration and delivery of software applications.
- Cross-Team Collaboration: Collaborate with development teams to design and implement scalable and resilient architecture solutions.
- Code Review and Quality Assurance: Conduct code reviews to ensure adherence to coding standards, best practices, and maintainable, efficient code. Ensure the overall quality of the codebase.
- Troubleshooting: Monitor and troubleshoot platform issues, and implement solutions to ensure high availability and performance.
- Security: Ensure compliance with security best practices and regulatory requirements across the platform.
- Support: Provide technical support and on-call coverage for applications, systems, and infrastructure as needed.
- Continuous Improvement: Continuously evaluate and implement new technologies and tools to improve platform reliability and efficiency.
- Secure Development: Ensure that the software we develop is secure and compliant with relevant security standards and practices.
- Risk Management: Identify and mitigate technical risks that could impact project timelines or quality.
- Documentation: Create and maintain technical documentation, including architecture diagrams, design documents, and coding standards.
- Codebase Maintenance: Oversee the maintenance and evolution of the codebase, including refactoring efforts as needed to improve code quality.
- Technical Research: Stay updated on industry trends, emerging technologies, and best practices in software and platform engineering.
- Problem Solving: Address complex technical challenges, troubleshoot issues, and provide innovative solutions to problems that arise during development.
- DevOps: Design, deploy, and maintain cloud-based infrastructure using modern DevOps practices and tools.
- Automation: Automate deployment, configuration, and monitoring processes to streamline operations and improve efficiency.
Who You Are
Soft Skills
- Problem-Solving: Proficient in diagnosing and resolving complex technical issues, making decisions under pressure, and innovating solutions to meet business needs.
- Communication: Excellent verbal and written communication skills for effective collaboration with cross-functional teams, articulating technical concepts to non-technical stakeholders, and mentoring junior engineers.
- Adaptability: Ability to adapt to changing technologies, project requirements, and business priorities, maintaining a proactive approach to learning and development.
- Collaboration: Strong collaboration skills for working together towards common goals and resolving issues efficiently.
Hard Skills
- Cloud Computing Platforms: Proficiency in one or more cloud computing platforms such as AWS (Amazon Web Services) or Google Cloud Platform (GCP) is crucial. This includes understanding core services, networking, security, and resource management within the chosen cloud environment.
- Infrastructure as Code (IaC): Experience with tools like Terraform to automate the provisioning and management of infrastructure resources. Knowledge of GitOps tools like ArgoCD is also beneficial.
- Containerization and Orchestration: Expertise in containerization technologies such as Docker for packaging applications, Kubernetes for orchestrating containerized workloads, and Helm for deploying Kubernetes workloads. Understanding concepts like pod deployment, service discovery, and scaling in Kubernetes is important.
- DevOps and Continuous Integration/Continuous Deployment (CI/CD): Experience with DevOps practices and proficiency in setting up and maintaining CI/CD pipelines using tools like GitHub Actions, GitLab CI, or Jenkins to automate software builds, testing, and deployment.
- Programming Proficiency: Good knowledge of programming languages relevant to the company's technology stack (such as Go, Java, and/or Python), including best practices in coding, testing, and security.
- Security Best Practices: Awareness of security principles and how to implement secure coding practices, encryption, authentication, and authorization mechanisms.
- Networking: Understanding of networking fundamentals including TCP/IP, DNS, HTTP/HTTPS, load balancing, firewalls, and VPNs. Knowledge of software-defined networking (SDN) and virtual networking concepts is beneficial.
- Monitoring and Logging: Experience with monitoring and logging tools such as Prometheus, Grafana or OpenTelemetry for tracking system performance, analyzing logs, and troubleshooting issues.
Education
- Bachelor’s Degree in Computer Science, Engineering, or a related field: Fundamental education providing a strong foundation in software development principles, algorithms, and data structures.
- Master’s Degree or Advanced Courses (optional but beneficial): Specialization in areas such as software engineering, system architecture, cloud computing, or data science can enhance expertise and strategic insight.
- Certifications (optional): Certifications relevant to specific technical skills or technologies (e.g., AWS Certified Solutions Architect, Certified ScrumMaster) can demonstrate expertise and commitment to staying current with technological advancements.
Why You'll Love Working at Dash0
- Competitive salary & equity package.
- Fully remote company and a flexible work environment.
- Quarterly come-togethers in Europe for fun, team-building and discussions.
- Collaborative and supportive team culture.