Job Description
Scientific Data Architect - Indianapolis, IN

Indianapolis, IN | Full-Time | Hybrid

The Role

We're looking for a product-minded, outcome-obsessed Scientific Data Architect to join a high-impact team at the intersection of life sciences R&D and AI. You'll work directly with scientific and technical stakeholders onsite a few days per week in the Indianapolis area, translating complex scientific data challenges into scalable, AI-ready solutions.

What You'll Do

- Design and implement extensible, reusable data models (tabular and JSON) that capture and organize scientific data at scale
- Develop Python-based parsers to programmatically interrogate proprietary instrument output files
- Integrate lab software (ELN/LIMS) via APIs and build data visualization apps using Streamlit, Plotly, and similar frameworks
- Collaborate with scientists, engineers, and product managers to develop and deploy ML, AI, and statistical models
- Rapidly prototype and demo solutions directly with end users to accelerate adoption
- Contribute to product roadmap by translating customer pain points into actionable priorities
- Travel to client sites in the Indianapolis, St. Louis, and Chicago regions as needed

What We're Looking For

- PhD with 4+ years or MS with 8+ years of industry experience in life sciences
- Deep domain knowledge in drug discovery, preclinical development, CMC, or product quality testing
- Proven track record designing and implementing AI/ML-driven use cases in cloud environments
- Hands-on Python development experience including data modeling, parsing, and app development
- Experience integrating ELN/LIMS systems via APIs
- Strong communication and storytelling skills; comfortable engaging scientists and executive stakeholders alike
- Self-starter mentality with a bias toward prototyping and action

Bonus Points

- Experience with Streamlit, Plotly, Holoviews, or similar data app frameworks
- Familiarity with AWS or other cloud-native environments
- Background in scientific consulting or customer-facing roles
- Experience with exploratory data analysis across complex biopharma datasets