MDAgents: Adaptive Collaboration Strategy for LLMs in Medical Decision Making

Yubin Kim

Foundation models have become invaluable in advancing the medical field. Despite their promise, how to deploy LLMs strategically so that they are effective in complex medical tasks remains an open question. Our novel framework, Medical Decision-making Agents (MDAgents), aims to address this gap by automatically assigning an effective collaboration structure to LLMs. The assigned solo or group collaboration structure is tailored to the complexity of the medical task at hand, emulating real-world medical decision-making processes. We evaluate our framework and baseline methods with state-of-the-art LLMs across a suite of challenging medical benchmarks: MedQA, MedMCQA, PubMedQA, DDXPlus, PMC-VQA, PathVQA, and MedVidQA, achieving the best performance in 5 out of 7 benchmarks that require an understanding of multi-modal medical reasoning. Ablation studies reveal that MDAgents excels at adapting the number of collaborating agents to optimize efficiency and accuracy, showcasing its robustness in diverse scenarios. We also explore the dynamics of group consensus, offering insights into how collaborative agents could behave in complex clinical team dynamics. Our code is available at https://github.com/mitmedialab/MDAgents

MDAgents: Medical Decision-making Agents

The design of MDAgents incorporates four stages:

Medical Complexity Check: The system evaluates the medical query, categorizing it as low, moderate, or high complexity based on clinical decision-making techniques.
Expert Recruitment: Based on complexity, the framework activates a single Primary Care Clinician (PCC; referred to as the PCP agent in later sections) for low-complexity queries, a Multi-disciplinary Team (MDT) for moderate complexity, or an Integrated Care Team (ICT) for high complexity.
Analysis and Synthesis: Solo queries use prompting techniques like Chain-of-Thought (CoT) and Self-Consistency (SC). MDTs involve multiple LLM agents forming a consensus, while ICTs synthesize information for the most complex cases.
Decision-making: The final stage synthesizes all inputs to provide a well-informed answer to the medical query.
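
The four stages above can be read as a simple pipeline. The following Python sketch is illustrative only and is not taken from the MDAgents repository: the function name `run_pipeline`, the `Complexity` enum, and the four callables are hypothetical stand-ins for the LLM calls that each stage would make.

```python
from enum import Enum
from typing import Callable, Dict


class Complexity(Enum):
    LOW = "low"
    MODERATE = "moderate"
    HIGH = "high"


# Collaboration structure named in the text for each complexity level.
STRUCTURE: Dict[Complexity, str] = {
    Complexity.LOW: "PCC",       # single Primary Care Clinician
    Complexity.MODERATE: "MDT",  # Multi-disciplinary Team
    Complexity.HIGH: "ICT",      # Integrated Care Team
}


def run_pipeline(
    query: str,
    check_complexity: Callable[[str], Complexity],
    recruit: Callable[[str, str], list],
    analyze: Callable[[str, str, list], list],
    decide: Callable[[str, list], str],
) -> str:
    """Chain the four stages; every callable is a hypothetical stand-in for an LLM call."""
    level = check_complexity(query)                # 1. Medical Complexity Check
    structure = STRUCTURE[level]
    experts = recruit(query, structure)            # 2. Expert Recruitment
    findings = analyze(query, structure, experts)  # 3. Analysis and Synthesis
    return decide(query, findings)                 # 4. Decision-making
```

Passing the stages in as callables keeps the control flow separate from any particular model or prompt, so each stage can be replaced with a stub when testing.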

Agent Roles

Moderator: The moderator agent functions as a general practitioner (GP) or emergency department doctor who first triages the medical query. This agent assesses the complexity of the problem and determines whether it should be handled by a single agent, an MDT, or an ICT. The moderator ensures the appropriate pathway is selected based on the query’s complexity and oversees the entire decision-making process.
Recruiter: The recruiter agent is responsible for assembling the appropriate team of specialist agents based on the moderator's complexity assessment. The recruiter may assign a single PCP agent for low-complexity cases, while an MDT or ICT with relevant expertise is formed for moderate- and high-complexity cases.
General Doctor/Specialist: These agents are domain-specific or general physicians recruited by the recruiter agent. Depending on the complexity of the case, they may work independently or as part of a team. General physicians handle less complex, routine cases, whereas specialists are recruited for their specific expertise in more complex scenarios. These agents engage in the collaborative decision-making process, contributing their specialized knowledge to reach a consensus or provide detailed reports for high-complexity cases.

Medical Complexity Classification

The first step in the MDAgents framework is for the moderator LLM, which functions as a general practitioner (GP), to determine the complexity of a given medical query. The moderator acts as a classifier that returns the complexity level of the query. It is provided with information on how medical complexity should be defined and is instructed to classify the query into one of three complexity levels:

Low: Simple, well-defined medical issues that can be resolved by a single PCP agent. These typically include common, acute illnesses or stable chronic conditions where the medical needs are predictable and require minimal interdisciplinary coordination.
Moderate: The medical issues involve multiple interacting factors, necessitating a collaborative approach among an MDT. These scenarios require the integration of diverse medical knowledge areas and coordination between specialists through consultation to develop effective care strategies.
High: Highly complex medical scenarios demand extensive coordination and combined expertise from an ICT. These cases often involve multiple chronic conditions, complicated surgical or trauma cases, and complex decision-making that integrates various healthcare departments and multiple specialists.
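
One way to picture the moderator acting as a classifier is a prompt that states the three definitions, followed by a strict parse of the reply. This is a minimal sketch under stated assumptions: the prompt wording, `PROMPT_TEMPLATE`, `parse_level`, and `classify` are all hypothetical, and `llm` is any callable that maps a prompt string to a text reply.

```python
import re
from typing import Callable

# Hypothetical prompt wording; the definitions paraphrase the three levels above.
PROMPT_TEMPLATE = """You are a general practitioner triaging a medical query.
Classify its complexity as exactly one of: low, moderate, high.
- low: simple, well-defined issue that a single PCP agent can resolve.
- moderate: multiple interacting factors that need a multi-disciplinary team (MDT).
- high: highly complex scenario that needs an integrated care team (ICT).

Query: {query}
Complexity:"""


def parse_level(reply: str) -> str:
    """Return the first complexity label found in the reply; raise if none is present."""
    match = re.search(r"\b(low|moderate|high)\b", reply.lower())
    if match is None:
        raise ValueError(f"no complexity label in reply: {reply!r}")
    return match.group(1)


def classify(query: str, llm: Callable[[str], str]) -> str:
    """Ask the moderator model (any prompt -> text callable) for a complexity label."""
    return parse_level(llm(PROMPT_TEMPLATE.format(query=query)))
```

Raising on an unparseable reply, rather than defaulting to a level, makes a failed classification visible instead of silently routing the query to the wrong collaboration structure.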

Expert Recruitment

Given a medical query, the goal of the recruiter LLM is to enlist domain experts as individuals, in groups, or as multiple teams, based on the complexity levels determined by the moderator LLM. Specifically, we assign medical expertise and roles to multiple LLMs, instructing them to either act independently as solo medical agents or collaborate with other medical experts in a team.
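
The distinction between individuals, groups, and multiple teams can be sketched as a small data structure. The example below is an assumption-laden illustration, not the authors' implementation: in the framework the recruiter LLM chooses the experts, whereas here the specialists are passed in, and the names `Agent`, `Team`, `recruit`, and the `team_size` parameter are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Agent:
    role: str       # e.g. "Cardiologist"; role names are illustrative
    expertise: str  # short description placed in the agent's system prompt

    def system_prompt(self) -> str:
        return f"You are a {self.role}. Your expertise: {self.expertise}."


@dataclass
class Team:
    name: str
    members: List[Agent] = field(default_factory=list)


def recruit(level: str, specialists: List[Agent], team_size: int = 3) -> List[Team]:
    """Group agents into the structure that matches the complexity level.

    low -> one solo generalist; moderate -> one MDT; high -> several ICT teams.
    """
    if team_size < 1:
        raise ValueError("team_size must be at least 1")
    if level == "low":
        return [Team("solo", [Agent("Primary Care Clinician", "general medicine")])]
    if level == "moderate":
        return [Team("MDT", list(specialists))]
    if level == "high":
        chunks = [specialists[i:i + team_size] for i in range(0, len(specialists), team_size)]
        return [Team(f"ICT team {n}", chunk) for n, chunk in enumerate(chunks, start=1)]
    raise ValueError(f"unknown complexity level: {level!r}")
```

Splitting the high-complexity specialists into fixed-size chunks is only a placeholder for however the recruiter would actually compose each team.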

Medical Collaboration and Refinement

The initial assessment protocol of our decision-making framework categorizes query complexity into low, moderate, and high. This categorization is grounded in established medical constructs such as acuity for straightforward cases, comorbidity and case management complexity for intermediate and multi-disciplinary care requirements, and severity of illness for high-complexity cases requiring comprehensive management. We outline the specific refinement approach:

Low: Straightforward cases. For queries classified under low complexity, characterized by straightforward clinical decision pathways, a single PCP agent is deployed. The domain expert, who is recruited by the recruiter LLM, applies few-shot prompting to the problem. The output answer is directly obtained from the agent’s response to the query without the need for iterative refinement.
Moderate: Intermediate complexity cases. In addressing more complex queries, the utilization of an MDT approach has been increasingly recognized for its effectiveness in producing comprehensive and nuanced solutions. The MDT framework leverages the collective expertise of professionals from diverse disciplines, facilitating a holistic examination of the query at hand. This collaborative method is particularly advantageous in scenarios where the complexity of a problem transcends the scope of a single domain, necessitating a fusion of insights from various specialties. The MDT approach not only enhances decision-making quality through the integration of multidimensional perspectives but also significantly improves the adaptability and efficiency of the problem-solving process.
Building upon this foundation, our framework specifically addresses queries of moderate complexity through a structured, multi-tiered collaborative approach. An MDT recruited by the recruiter LLM starts an iterative discussion process aimed at reaching a consensus with at most R rounds. For every round, consensus within the MDT is determined by parsing and comparing their opinions. In the event of a disagreement, the moderator agent reviews the MDT’s discourse and formulates feedback for each agent.
High: Complex care cases. In contrast to the MDT approach, the ICT paradigm addresses the highest tier of query complexity in healthcare. The query passes through a structured sequence of teams, beginning with the Initial Assessment Team, moving through various diagnostic teams, and culminating with the Final Review & Decision Team, so that the analysis at each stage is specialized and focused. Each team builds upon the foundation laid by the previous one, aligning specialist insights into a cohesive narrative that informs the ultimate decision. This phased approach, supported by evidence from recent healthcare studies, has been shown to enhance the precision of clinical decision-making. The resulting reports reflect both comprehensive medical evaluations and the systematic, layered analysis that is critical in managing intricate health scenarios.
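
The moderate and high pathways described above can be sketched as two small control loops. This is a hedged illustration rather than the repository's code: agents and teams are modelled as plain callables, `max_rounds` plays the role of R, and treating consensus as exact agreement between normalized answers is an assumption (the text only says opinions are parsed and compared). The names `run_mdt` and `run_ict` are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# An agent is modelled as a callable: (query, feedback) -> answer string.
AgentFn = Callable[[str, str], str]
# A team is modelled as a callable: (query, earlier reports) -> report string.
TeamFn = Callable[[str, List[str]], str]


def run_mdt(
    query: str,
    agents: Dict[str, AgentFn],
    moderator_feedback: Callable[[str, Dict[str, str]], Dict[str, str]],
    max_rounds: int = 3,
) -> Tuple[Dict[str, str], bool, List[Dict[str, str]]]:
    """Discuss for at most max_rounds rounds.

    Returns (last opinions, whether consensus was reached, per-round history).
    """
    feedback: Dict[str, str] = {}
    opinions: Dict[str, str] = {}
    history: List[Dict[str, str]] = []
    for _ in range(max_rounds):
        opinions = {name: agent(query, feedback.get(name, "")) for name, agent in agents.items()}
        history.append(opinions)
        # Assumed consensus rule: all normalized answers are identical.
        if len({answer.strip().lower() for answer in opinions.values()}) == 1:
            return opinions, True, history
        # On disagreement the moderator writes feedback for each agent.
        feedback = moderator_feedback(query, opinions)
    return opinions, False, history


def run_ict(query: str, teams: List[Tuple[str, TeamFn]]) -> List[str]:
    """Run teams in order; each team sees the reports written before it."""
    reports: List[str] = []
    for name, team in teams:
        reports.append(f"{name}: {team(query, list(reports))}")
    return reports
```

Returning the full history from `run_mdt` matters because the final decision stage for moderate cases draws on the conversation between agents, not only on their last answers.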

Decision-making

In the final stage of our framework, the decision-maker LLM agent synthesizes the diverse inputs generated throughout the decision-making process to arrive at a well-informed final answer to the medical query. This synthesis involves several components depending on the complexity level of the query:

Low: Directly utilizes the initial response from the primary decision-making agent.
Moderate: Incorporates the conversation history between the recruited agents to understand the nuances and disagreements in their responses.
High: Considers detailed reports generated by the agents, which include comprehensive analyses and justifications for their diagnostic suggestions.

The final answer is determined by integrating the outputs of the analysis and synthesis step according to the query's complexity level. This integration employs ensemble techniques, such as temperature ensembles, to ensure that the decision is robust and, when applicable, reflects a consensus among the models.
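
As a rough illustration of a temperature ensemble, the sketch below samples one answer per temperature and takes a majority vote. Reading "temperature ensemble" as sampling at several temperatures and voting is an assumption, as are the function name `temperature_ensemble`, the default temperature values, and the `sample` callable, which stands in for any function that maps a prompt and a temperature to an answer string.

```python
from collections import Counter
from typing import Callable, Sequence, Tuple


def temperature_ensemble(
    prompt: str,
    sample: Callable[[str, float], str],
    temperatures: Sequence[float] = (0.0, 0.5, 1.0),
) -> Tuple[str, float]:
    """Query the model once per temperature and majority-vote the answers.

    Returns (winning answer, fraction of samples that agreed with it).
    """
    if not temperatures:
        raise ValueError("at least one temperature is required")
    # Normalize answers so that "b" and " B " count as the same vote.
    answers = [sample(prompt, t).strip().upper() for t in temperatures]
    # On a tie, Counter.most_common keeps the answer that was seen first.
    winner, votes = Counter(answers).most_common(1)[0]
    return winner, votes / len(answers)
```

The agreement fraction gives a crude signal of how stable the answer is across temperatures; a low value could be used to flag a decision for closer review.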

Research Topics

#artificial intelligence #healthcare