Job Description
Who you are

- AI & ML Enablement: Experience designing data workflows, feature pipelines, or infrastructure that specifically supports AI/ML training, inference, experimentation, and monitoring
- Data Product Engineering: Proven experience building large-scale, production-grade data products and analytical systems
- Modern Data Systems: Strong expertise in SQL, distributed processing frameworks (e.g., Spark), cloud data platforms, and high-scale compute systems using Python and/or Rust
- Architecture & Systems Design: Demonstrated ability to design scalable data models, platform components, and distributed system architectures
- Analytical Rigor: Strong statistical and analytical skills; able to validate data quality and product outcomes through rigorous analysis
- Technical Communication: Clear communicator who can explain technical decisions, trade-offs, and risks to engineers, product partners, and leadership
- Healthcare Data Experience: Direct experience with claims, clinical, RWE, provider, patient, or life sciences data, including familiarity with coding systems (ICD-10, CPT, NDC, NPI)
- Data Product Delivery: Experience building and operating data products designed for external consumption by customers, via APIs, or through serving layers
- High-Scale Data Architecture: Proven experience optimizing systems for performance, cost efficiency, versioning, and large-volume data productization
- Applied AI / Agentic Workflows: Experience applying AI/agentic workflows to engineering, data quality, or delivery processes
- Fast-Growth Execution: Experience operating successfully in high-growth or ambiguous environments where leaders must balance architecture, delivery speed, and quality

What the job involves

Join Komodo Health's Data Foundations team and play a critical role in shaping the core data products that fuel our Healthcare Map. This role is essential for transforming massive, complex healthcare datasets into performant, trustworthy, and usable data assets that directly power both customer-facing applications and internal product innovation. By building and scaling our foundational data systems, you will directly enable the transparency and efficiency required to drive better health outcomes across the industry.

The Senior Data Engineer will be responsible for designing, implementing, and scaling the foundational data systems and pipelines that power Komodo Health's platform and analytics products. This role owns the end-to-end data processing lifecycle, from ingestion and modeling to serving, at massive scale, using cloud infrastructure and advanced distributed computing techniques. This platform also serves as the foundation for AI/ML and agentic applications across Komodo Health. You will set technical direction, ensure data quality, and be a technical leader in tackling the most complex data problems in healthcare.

- Design and implement high-performance data processing and serving patterns across large-scale healthcare datasets using Spark, Python, Rust, and SQL, supporting both analytics and AI/ML workloads
- Lead complex cross-functional initiatives, balancing trade-offs across scalability, reliability, performance, cost, and delivery speed
- Build and maintain robust data quality, validation, observability, lineage, monitoring, and alerting frameworks
- Architect scalable Healthcare Map data products that power APIs, analytics, internal tooling, and customer-facing applications
- Partner with Product, Data Science, and Platform teams to translate healthcare use cases into scalable technical solutions
- Drive engineering excellence across system design, code quality, testing, documentation, and CI/CD practices
- Mentor engineers through design reviews, technical coaching, and architectural guidance
- Enable AI/ML and agentic applications by delivering high-quality, feature-ready datasets, curated data products, and reliable serving layers for training, inference, and evaluation

Looking back on your first 12 months at Komodo Health, you will have…

- Architectural Advancement: Delivered high-impact technical initiatives that improve pipeline performance, scalability, and system efficiency
- Platform Hardening: Improved the reliability, observability, and cost-efficiency of core Data Foundations systems
- Healthcare Data Innovation: Developed deep domain expertise and contributed novel approaches to challenges such as patient journey mapping and identity resolution
- Cross-Functional Delivery: Partnered with Data Product and Engineering