Publication

Reinforcement learning with action-derived rewards for chemotherapy and clinical trial dosing regimen selection

Aug. 17, 2018

People

Pratik Shah

Former Principal Research Scientist

Projects

Self-Learning AI Model Learns from Patient Data to Design Novel Clinical Trials

Yauney G, Shah P. Proceedings of the 3rd Machine Learning for Healthcare Conference (2018). PMLR 85:161-226

Abstract

Unstructured learning problems without well-defined rewards are unsuitable for current reinforcement learning (RL) approaches. Action-derived rewards can allow RL agents to fully explore state and action trade-offs in scenarios that require specific outcomes yet are unstructured by external reward. Clinical trial dosing choice is an example of such a problem. We report the successful formulation of clinical trial dosing choice as an RL problem using action-based rewards and learning of dosing regimens to reduce mean tumor diameters (MTD) in patients undergoing simulated temozolomide (TMZ) and procarbazine, 1-(2-chloroethyl)-3-cyclohexyl-1-nitrosourea, and vincristine (PCV) chemo- and radiotherapy clinical trials. The use of action-derived rewards as partial proxies for outcomes is described for the first time. Novel dosing regimens learned by an RL agent in the presence of action-derived rewards achieve significant reduction in MTD for cohorts and individual patients in simulated TMZ and PCV clinical trials while reducing treatment cycle administrations and dosage concentrations compared to human-expert dosing regimens. Our approach can be easily adapted for other learning tasks where outcome-based learning is not practical.

via Journal of Machine Learning Research

MLHC paper-Pratik Shah.pdf

Artificial intelligence model “learns” from patient data to make cancer treatment less toxic

Pratik Shah invited to speak at 2019 American Association for Cancer Research Annual Meeting

Congratulations to Pratik Shah on being selected as an AAAS-Lemelson Invention Ambassador

MAS.S72. How to Write Academic Grant Proposals and Research Manuscripts
