About Sixtyfour
We build AI research agents that can discover, link, and reason over everything about people and companies. The platform turns that intelligence into automated research workflows for sales, recruiting, and marketing.
About the role
What you’ll do
- Design and ship agentic systems (tool calling, multi-agent workflows, structured outputs) that reliably fetch, extract, and normalize data across the web and APIs.
- Own robust web scraping: directory crawling, CAPTCHA handling, headless browsers, rotating proxies, anti-bot evasion, and backoff/retry policies.
- Develop backend services in Python + FastAPI with clean contracts and strong observability.
- Scale workloads on AWS + Docker (batch/queue workers, autoscaling, fault tolerance, cost control).
- Parallelize external API requests safely (rate limits, idempotency, circuit breakers, retries, dedupe).
- Integrate third-party APIs for enrichment and search; model and cache responses; manage schema evolution.
- Transform and analyze data using Pandas (or similar) for normalization, QA, and reporting.
- Pitch in across the stack: billing (Stripe) and occasional front-end changes to ship end-to-end features.
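The "parallelize safely" responsibility above can be sketched in a few lines of asyncio: a semaphore caps in-flight requests (rate-limit awareness) and transient failures get exponential backoff with jitter. The `fetch` callable, URLs, and delay parameters are hypothetical placeholders, not part of our codebase — a minimal illustration of the pattern, not a production client.

```python
import asyncio
import random

async def fetch_with_retry(url, sem, fetch, max_retries=3, base_delay=1.0):
    """Fetch one URL under a shared semaphore, retrying transient failures."""
    async with sem:  # cap concurrent in-flight requests
        for attempt in range(max_retries + 1):
            try:
                return await fetch(url)
            except Exception:
                if attempt == max_retries:
                    raise
                # exponential backoff with jitter before the next attempt
                await asyncio.sleep(base_delay * 2 ** attempt + random.random() * base_delay)

async def fetch_all(urls, fetch, concurrency=10, **kwargs):
    """Run fetches in parallel with bounded concurrency; results keep input order."""
    sem = asyncio.Semaphore(concurrency)
    return await asyncio.gather(*(fetch_with_retry(u, sem, fetch, **kwargs) for u in urls))
```

Real services would add per-host limits, idempotency keys, and a circuit breaker on top of this skeleton.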
Minimum requirements
- Hands-on experience with agentic architectures (tool calling, structured outputs/JSON, planning/execution loops) and prompt engineering.
- Proven web scraping expertise: solving CAPTCHAs, session/auth flows, proxy rotation, stealth techniques, and legal/ethical constraints.
- AWS + Docker in production (at least two of: ECS/EKS, Lambda, SQS/SNS, Batch, Step Functions, CloudWatch).
- Building high-throughput data/IO pipelines with concurrency (asyncio/multiprocessing), resilient retries, and rate-limit aware scheduling.
- Integrating diverse external APIs (auth patterns, pagination, webhooks); designing stable interfaces and backfills.
- Strong data wrangling with Pandas or equivalent; comfort with large CSV/Parquet workflows and memory/perf tuning.
- Excellent ownership, product sense, and pragmatic debugging.
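The data-wrangling requirement above is the kind of normalization-plus-dedupe work meant: clean a messy key, then keep the freshest record per entity. The column names and sample rows below are invented for illustration only — a toy Pandas sketch, not our schema.

```python
import pandas as pd

# Hypothetical raw enrichment rows with inconsistent company spellings.
raw = pd.DataFrame({
    "company": ["Acme Inc.", "ACME INC", "Globex LLC"],
    "domain": ["acme.com", "acme.com", "globex.com"],
    "seen_at": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-01-15"]),
})

# Normalize the name into a matching key: lowercase, strip punctuation/whitespace.
raw["company_norm"] = (
    raw["company"].str.lower().str.replace(r"[^\w\s]", "", regex=True).str.strip()
)

# Dedupe on (normalized name, domain), keeping the most recently seen record.
deduped = (
    raw.sort_values("seen_at")
       .drop_duplicates(subset=["company_norm", "domain"], keep="last")
)
```

At scale the same idea shows up as blocking keys for entity resolution and chunked Parquet reads instead of a single in-memory frame.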
Nice to have
- Entity resolution/record linkage at scale (probabilistic matching, blocking, deduping).
- Experience with Langfuse, OpenTelemetry, or similar for tracing/evals; task queues (Celery/RQ), Redis, Postgres.
- Search relevance (BM25/vector/hybrid), embeddings, and retrieval pipelines.
- Playwright/Selenium, stealth browsers, anti-bot frameworks, CAPTCHA providers.
- CI/CD, infrastructure as code (Terraform), and cost/perf observability.
- Security & compliance basics for data handling and PII.
Technology
Language Models, OpenSearch/Elasticsearch, Next.js (TypeScript), Python, FastAPI, AWS, Docker, Celery workers, Playwright, Supabase, Stripe