Job Description
Who you are

- 8+ years building platform infrastructure, data infrastructure, data platforms, or backend systems with significant data components. You have built and operated pipelines, data access layers, or ETL/ELT systems in production.
- Strong proficiency in Python. Our stack is Python-heavy across Prefect, FastAPI, dbt, and the SDK layer.
- Hands-on experience with SQL and at least two of: Snowflake, Redshift, Postgres. You understand the performance characteristics of each and can write queries that don't bring down production.
- Experience with AWS: S3, RDS, EKS, EventBridge, IAM. Comfortable working in a Terraform-managed environment.
- Experience with Kubernetes. Our workloads run on EKS, and you will deploy, debug, and scale services on K8s.
- Familiarity with data orchestration tools (Prefect, Airflow, or Dagster) and transformation frameworks (dbt).
- Understanding of data governance concepts: RBAC, PII handling, audit logging, data lineage.
- Fluency with AI-assisted development tools (Claude Code, Cursor, or similar). This is a hard requirement: the team uses these tools daily, and we expect engineers to leverage them for code generation, debugging, and investigation.
- Experience building shared libraries or SDKs consumed by multiple teams: versioning, backwards compatibility, migration support.
- Experience with event-driven architectures: CDC, event buses, schema registries, at-least-once delivery semantics.
- Experience with OpenTelemetry, ClickHouse, or similar observability infrastructure.
- Prior work in regulated environments (SOC 2, FedRAMP, HIPAA) where compliance requirements shaped system design.
- Experience with Ray for distributed compute workloads.

What the job involves

The Core Services team within Snorkel AI's Infrastructure organization will own the data platform that powers everything at Snorkel: the pipelines, access layers, event systems, governance, and compute infrastructure that every product team and customer deployment depends on. We are a small team with a large surface area, and we are in the middle of a foundational architecture shift: moving from a single-database data path to a multi-source, event-driven platform with dedicated stores for different workloads (transactional data in Postgres/RDS, analytical data in Snowflake, bulk storage in S3, and a metrics platform). The decisions being made now will define how data flows at Snorkel for years. You will be making them.

You’ll also shape our AI-native development workflow, contribute to modernizing our CI/CD workflows (Buildkite, GitHub Actions), and integrate AI SRE tooling. Your work will directly accelerate developer velocity, reliability, and product quality across the company.

- Build and maintain the shared data access library and SDKs that the Platform, Packaging, and Dataset API teams use to read from and write to multiple data sources (Snowflake, S3, RDS). Design interfaces that abstract source-level complexity while providing built-in auth, RBAC enforcement, pagination, and query governance.
- Design and implement event-driven data flows using event brokers, CDC connectors, a schema registry, event routing, and dead letter queues. Make sure events flow reliably and failures are visible and recoverable.
- Build the systems that track how data moves through the platform (lineage), enforce who can access what (governance and RBAC), and log what happened (auditing). This includes PII handling, retention policy enforcement, and audit infrastructure for enterprise and federal compliance.
- Instrument the data platform with OpenTelemetry, define and monitor SLOs for query latency and pipeline success rates, and build alerting that catches issues before they become incidents. You will be on-call for the systems you build.
- Contribute to infrastructure cost visibility and optimization: query cost estimation, workload right-sizing, and routing data to the most cost-effective storage tier for its access pattern.

Benefits

- Health: Snorkelers and their dependents are covered by comprehensive medical, dental, and vision plans.
- Wellness: All Snorkelers get a yearly wellness stipend to use on anything related to health and well-being.
- PTO and rest days: We provide unlimited personal time off so you can find your work-life balance. Plus, two company-wide rest days per quarter in addition to company holidays.
- One global team: From our headquarters in Redwood City, CA, to regional offices in San Francisco and New York, we work together as one global team.
- Events: Get to know your fellow Snorkelers outsi