Senior Machine Learning Engineer - Healthcare

MD Anderson Cancer Center

Full-time · Remote · US · Houston, TX, Harris, US · USD 12208–18292 / month · Posted: 2026-05-11 · Until: 2026-07-10

Job Description

The University of Texas MD Anderson Cancer Center is seeking a Senior Machine Learning Operations Engineer to support enterprise-wide artificial intelligence initiatives within Data Impact & Governance. The Senior Machine Learning Operations Engineer will join a multidisciplinary environment that integrates multidimensional data, advanced analytics, and machine learning to drive sustainable, responsible AI solutions that improve cancer care outcomes. Within this mission-driven environment, the Senior Machine Learning Operations Engineer plays a critical role in building, deploying, and sustaining production-quality machine learning systems. The Senior Machine Learning Operations Engineer partners closely with data scientists, engineers, clinicians, and business stakeholders to ensure AI solutions are scalable, secure, reliable, and aligned with responsible AI principles across UT MD Anderson.

The ideal candidate is a seasoned machine learning or software engineering professional with a strong foundation in MLOps, cloud and on-premises AI platforms, and healthcare-focused AI lifecycle management. This individual typically holds a Bachelor's degree in a relevant technical discipline, with a Master's degree preferred, and brings significant hands-on experience developing, deploying, and maintaining machine learning systems in production environments. Experience leading or designing shared ML services, evaluating third-party AI solutions, and applying responsible AI practices within regulated or clinical settings is highly valued.

Minimum $146,500 - Midpoint $183,000 - Maximum $219,500 based on a 40-hour work week.

Work Location: Remote within Texas only.

Why Us?

This role offers the opportunity to directly influence how artificial intelligence is responsibly scaled across UT MD Anderson, contributing to meaningful, long-lasting improvements in cancer care while working alongside experts in data science, engineering, and clinical innovation.
The Senior Machine Learning Operations Engineer is supported by an environment that values continuous learning, technical excellence, and sustainable work practices while enabling professional growth and enterprise-level impact.

- Employer-paid medical coverage starting day one for employees working 30+ hours/week, plus optional group dental, vision, life, AD&D, and disability insurance.
- Accruals for PTO and Extended Illness Bank, plus paid holidays, wellness, childcare, and other leave options.
- Tuition Assistance Program after six months of service and access to extensive wellness, fitness, and employee resource groups.
- Defined-benefit pension through the Teachers Retirement System, voluntary retirement plans, and employer-paid life and reduced salary protection programs.

Responsibilities

AI Model Lifecycle & MLOps

- Oversee end-to-end AI model lifecycles including training, evaluation, deployment, monitoring, and maintenance of production-quality machine learning models
- Design and implement CI/CD pipelines for model training, deployment, monitoring, and retraining with a focus on security, scalability, reliability, reproducibility, and performance
- Implement rigorous testing, versioning, and documentation practices to support reproducibility, risk mitigation, and measurable impact
- Maintain comprehensive experiment tracking, data lineage, model lineage, and model scorecards
- Design fallback, rollback, and decommissioning strategies to ensure operational continuity of AI solutions

Responsible AI & Governance

- Promote responsible AI practices by minimizing bias, enhancing fairness, and maximizing transparency in machine learning models
- Ensure AI lifecycle management aligns with institutional standards and best practices
- Support assessment, validation, and onboarding of external machine learning models and AI-driven products to minimize organizational risk and maximize value

Platform, Infrastructure & Tooling

- Develop and maintain scalable data pipelines, feature stores, and artifact management systems
- Deploy and operate ML workloads across cloud and on-premises environments including Azure, AWS, or GCP
- Utilize containerization and orchestration technologies such as Docker, Kubernetes, and DAG-based tools
- Apply DevOps and MLOps tools including Azure DevOps, GitHub Actions, and version control systems

Stakeholder Engagement & Enablement

- Collaborate with stakeholders to gather requirements, translate AI concepts into understandable terms, and incorporate feedback
- Partner with data scientists, ML engineers, and software engine