The Role
As a Senior DevOps Engineer on eBay’s Developer Platform & Experience team, you will design and evolve the Continuous Delivery foundations that power thousands of applications and support 3,000+ developers every day. You’ll build paved‑path tooling, automate at scale, and adopt modern cloud infrastructure so teams can develop, test, and ship high‑quality, secure, performant software—rapidly and reliably.
Why this role exists
Our engineers ship code to millions of users. Your mission is to shorten the path from idea to production by building the platforms, tooling, and AI-assisted workflows that make development fast, safe, and delightful. You’ll own core pieces of our internal developer platform (IDP)—from hermetic builds and ephemeral test environments to continuous delivery and quality automation—and lead the adoption of AI to accelerate everyday engineering tasks.
We draw inspiration from top tech companies that operate at massive scale—e.g., modern build systems like Bazel and Buck2, proven CD patterns like Spinnaker, and internal developer portals such as Backstage—so our developers spend less time on toil and more time delivering customer value.
What you’ll do
- Own the software build and test platform and frameworks. Design and evolve fast, reproducible, and cache-efficient builds; reduce CI build times with creative solutions; improve correctness and remove flakiness in infrastructure; maintain scalable artifact storage and dependency management.
- Platform modernization. Our platform is powered by Jenkins, with Maven for Java builds and npm for Node.js builds. You’ll evaluate state-of-the-art CI platforms (e.g., Tekton) and pave the way for modernizing our build and test stack.
- Spin up ephemeral, production-like test environments. Standardize on sandboxed, on-demand environments (e.g., per-PR) to enable reliable integration/e2e testing and preview deployments.
- Harden & simplify deployments. Advance our CD/GitOps workflows (progressive delivery, automated rollbacks, canaries), with golden paths and strong guardrails.
- Build the internal developer portal. Curate paved roads for services, data, and infra via templates, scorecards, and software catalogs to improve discoverability and self-service.
- Introduce AI-assisted engineering.
  - Ship secure, private AI copilots for code authoring, refactoring, and code review.
  - Use LLMs for test generation, flaky-test triage, log summarization, debug suggestions, error classification, selective test execution, and AIOps across CI/CD.
  - Build evaluation harnesses, prompt libraries, RAG over internal docs, and policy controls for IP, PII, and secrets.
- Champion reliability, security & compliance. Bake in supply-chain security (SBOMs, provenance, signing), policy-as-code, and infra guardrails; patch CVEs in accordance with policy.
- Instrument, measure, improve. Track DORA and DevEx metrics (lead time, deployment frequency, change-failure rate, MTTR) and drive continuous improvement via experiments.
- Partner widely. Work with product teams, Cloud, Frameworks, Security, and Data/ML to understand friction points and design paved-road solutions that scale.
- Collaborate globally. Our platform teams and our developer community are spread across the globe. Provide top-notch developer platforms that satisfy the needs of geographically distributed teams.
Example problems you might own in your first 6–12 months
- Use data-driven analysis to cut average CI build times by 40% via incremental builds, dependency management, and smarter caching; bring down the slowest test suites with parallelization, selective test execution, profiling, and flake-busting.
- Launch ephemeral “PR environments” with seeded data and synthetic traffic; integrate with feature flags for safe, progressive rollouts.
- Stand up an internal developer portal (service templates, scorecards, docs search) and migrate golden paths there.
- Deliver an AI DevEx toolkit: repo-aware chat, code-review assistant, flaky-test explainer, and CI log summarizer—with evaluation dashboards and privacy controls.
- Pilot remote dev pods to standardize the edit-build-run loop for large repos and speed onboarding.
Qualifications
Must-haves
- 7+ years building platforms/tools for large engineering orgs; deep expertise in one or more of: build systems (preferably for Java and Node.js stacks), CI orchestration (preferably Jenkins), test infra, deployment/CD pipelines, or internal developer portals.
- Self-starter with a proactive mindset and strong sense of ownership.
- Proven ability to manage communications effectively with partner teams across global regions, with hands-on experience working in private cloud environments.
- Skilled at designing robust solutions while proactively anticipating potential issues to ensure reliability and efficiency.
- Strong systems design for high-scale developer workflows (monorepos/multirepos, artifact caching, remote execution, hermetic builds).
- Experienced in providing timely solutions to developer challenges in Jenkins environments, ensuring smooth CI/CD workflows.
- DevOps fundamentals: Containers, Kubernetes, service mesh, IaC (Terraform), GitOps, observability (metrics/logs/traces), SLOs/SLIs.
- Practical AI skills & design patterns: Prompt engineering, hands-on RAG, LLM evaluations, API orchestration, privacy/guardrails; ability to ship AI-backed tools that measurably reduce toil.
- System design & design patterns: strong grasp of distributed systems, API design, resiliency, and object-oriented/functional patterns; ability to create clear, scalable architectures and ADRs.
- Agentic/MCP architectures: practical experience designing agent loops (planner/executor/critic), tool abstractions, memory, and MCP-style tool/resource servers for enterprise integration.
- Proficiency in at least two languages (e.g., Java, Kotlin, Python, Go). Full-stack development experience is a plus.
- Fluency in Linux/Ubuntu commands to get to the bottom of system-level issues.
- Database literacy: working knowledge of NoSQL and modern relational databases.
- Observability dashboards: ability to build with Prometheus, Grafana, ELK, Splunk, New Relic, or Nagios.
- Performance sleuthing: able to diagnose system and web-service performance issues end-to-end.
Nice-to-haves
- Experience with modern CI/CD platforms (e.g., Tekton/Spinnaker/Argo CD/Flux) and progressive delivery.
- Prior work with Backstage or other IDPs; plugin development and service catalog design.
- Background in developer analytics and productivity research; familiarity with DORA and DX frameworks.
- Experience with remote dev environments (e.g., DevPods/Codespaces-style) at scale.
- Experience with microservices architecture and related DevOps practices.
How we measure success
- Lead time for changes trends down; deployment frequency trends up without increasing risk.
- CI stability & speed improve (p95 build/test time, flake rate, queueing).
- Change failure rate & MTTR drop via safer releases and better rollback automation.
- Developer NPS/DevEx survey scores and onboarding time improve; IDP adoption grows. (Benchmarked using DORA-style measures.)
Our stack
Kubernetes, GitHub, Jenkins in a cloud environment, Maven/Bazel/Gradle/Node.js Buck-like builds, JFrog Artifactory, static and security code analysis tools, code coverage, image builders, container registry, in-house CD platforms like Tekton/Spinnaker/Argo, Backstage IDP, OpenTelemetry, OpenSearch, feature flags, SBOM/provenance, LLM APIs for AIOps.