Senior AI Engineer

Fieldguide
Full-time · Remote · US · San Francisco Bay Area, US · USD 15833–21250 / month · Posted: 2026-05-11 · Until: 2026-07-10
Job Description
Who you are

You’re a strong software engineer who’s built your skills for an AI-native world. These principles resonate with you:

- Bias to building: You move fast and resolve uncertainty by shipping
- AI-native instincts: You treat LLMs, agents, and automation as core building blocks
- Strong product judgment: You decide what matters and why, not just how to implement it
- Learning velocity: You learn quickly from feedback and adjust based on data
- Grounded optimism: You improve what's broken today and push toward what's possible next
- End-to-end ownership: You understand production systems and own outcomes

We care more about capability and trajectory than years on a resume, but most strong candidates have:

- 3–6+ years shipping production software in complex, real-world systems
- Strong command of TypeScript, Python, and Postgres
- Shipped LLM-powered features serving real production traffic
- Built retrieval pipelines and agent orchestration systems
- Implemented evaluation frameworks for model outputs and agent behavior
- Worked with vector databases, embedding models, and RAG architectures
- Hands-on experience with modern LLM APIs (OpenAI, Gemini, Anthropic) and agent frameworks
- Comfortable operating in ambiguity and taking responsibility for outcomes

What the job involves

Fieldguide is building AI agents for the most complex audit and advisory workflows.
As a Senior AI Engineer, you'll own meaningful product areas end-to-end: designing agentic architectures, building evaluation systems, and shipping agents that professionals trust with mission-critical work. This role is for engineers who have shipped LLM-powered features in production and are ready to take the lead on complex systems while mentoring those around them.

- Design and build agentic systems that automate complex audit workflows end-to-end
- Translate customer problems into concrete agent behaviors and orchestration logic
- Orchestrate LLMs, tools, retrieval, and business logic into reliable, production-grade agent experiences
- Own agents across their lifecycle: delivery, reliability, performance, and observability
- Use AI to accelerate design, build, test, and iteration cycles
- Prototype quickly, then harden systems for enterprise-grade reliability
- Build evaluation frameworks, feedback loops, and guardrails to improve agents over time
- Design prompts, retrieval pipelines, and orchestration logic that perform at scale
- Make clear trade-offs on what to build, cut, or skip based on customer value
- Partner with Product and Design to define capabilities that deliver real outcomes
- Stay close to customer workflows and optimize for highest-impact problems
- Identify capability gaps and unblock team progress proactively
- Raise the quality bar through code review, design feedback, and pairing
- Create reusable abstractions, patterns, and tooling that increase team velocity
- Share learnings across the team and establish engineering best practices

- Enterprise-grade reliability: Building systems professionals depend on
- Human-in-the-loop design: Knowing when to automate vs. when to surface decisions
- Nuanced evaluation: Audits require judgment, so feedback structures matter
- Explainability: Making AI outputs and reasoning transparent and trustworthy
- Complex domains: Navigating compliance and enterprise rigor while moving fast
- Shipping daily value: Delivering agent experiences customers use every day

Benefits

- Health
- Dental
- PTO