Introduction

The pervasive integration of AI in cardiovascular imaging raises pertinent legal and ethical considerations. The increasing complexity of AI models, particularly those involving machine learning (ML) and deep learning (DL), introduces the challenge of the "black box"—a term that describes the opacity of AI decision-making processes [1]. This challenge raises concerns about the explainability and transparency of AI systems, which are essential for their ethical deployment and clinical acceptance [2].

Despite numerous advancements in AI-driven cardiovascular imaging, existing reviews often focus predominantly on technical enhancements and clinical outcomes, with less emphasis on how these AI models make decisions or on the mechanisms underlying their outputs [3]. This gap underscores a vital need for comprehensive reviews that not only explore these advanced technologies but also delve into the ethical and practical implications of their opaque nature.

This manuscript is structured to first outline the basic concepts and technologies underpinning AI in cardiovascular imaging, followed by an exploration of the "black box" phenomenon. Subsequent sections discuss legal, ethical, and practical challenges, culminating in a discussion on future directions that bridge gaps between technical capabilities and clinical needs. Our goal is to furnish clinicians, researchers, and policymakers with a deeper understanding of AI's potential and limitations in cardiovascular healthcare.

An overview of AI

AI involves developing computer programs that perform complex tasks mimicking human cognition. A key component of AI, machine learning (ML), enables algorithms to learn from data, improve performance, and make predictions [4]. Advances in computational power and big data have propelled ML's application in healthcare [5]. The rise of smart devices and electronic medical records has expanded data availability, enhancing ML algorithm performance despite data complexity [6].

ML training may be either “supervised” or “unsupervised.” In supervised training, an ML model is trained on a range of inputs paired with known outcomes; supervision is provided either by an objective classification metric or by a domain expert. In contrast, unsupervised training refers to the development of a model that explores patterns or clusters that are not well defined within a dataset. In this form, the model is provided only with unlabeled input data and does not learn to fit the data to an outcome [7].
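To make this distinction concrete, the following minimal sketch (a non-authoritative illustration assuming scikit-learn and synthetic data) contrasts a supervised classifier trained on labeled outcomes with an unsupervised clustering model that receives only unlabeled inputs.

```python
# Minimal sketch contrasting supervised and unsupervised training
# (scikit-learn, synthetic data; illustrative only).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Synthetic features and a known outcome label.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Supervised: the model learns to map inputs to the known outcome.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("Supervised accuracy on training data:", clf.score(X, y))

# Unsupervised: no labels are provided; the model looks for structure.
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("Cluster assignments for first five cases:", clusters[:5])
```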

Deep learning (DL), a subset of ML, is another crucial concept in AI. DL processes data with large artificial neural networks composed of multiple processing layers, loosely resembling the working of biological neurons [8]. It has achieved impressive results in complex tasks involving very high-dimensional data, ranging from speech and image recognition to self-driving cars [9, 10]. Deep learning models utilize numerous layers of hidden neurons to generate increasingly abstract and nonlinear representations of the underlying data. This process, known as "representation learning," constitutes a pivotal aspect of deep neural networks. Once these representations are learned, the final output nodes are frequently used as inputs to logistic regression models or support vector machines (SVMs) to perform the ultimate regression or classification task. Convolutional neural networks (CNNs) and recurrent neural networks (RNNs) represent two prominent forms of deep learning models for supervised learning. The primary distinction between CNNs and RNNs lies in their respective layer designs. Beyond these methods, there exists a diverse array of deep neural network architectures.

CNNs resemble fully connected neural networks, comprising neurons with adjustable weights and biases. Their potency stems from their capacity to establish local connectivity across images or signals. These localized connections incorporate nonlinear activation functions, transforming representations into progressively more abstract forms. Furthermore, shared weights across layers, layer pooling, and the integration of numerous hidden layers enable the learning of highly intricate functions. In contrast, RNNs excel at processing sequential data such as speech and language. Equipped with an additional hidden state vector, RNNs retain "memory" of previous observations, rendering them well suited to tasks involving sequential information [11, 12].
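As a rough illustration of these two layer designs, the sketch below (PyTorch, arbitrary toy dimensions, not a clinical model) builds a small CNN with locally connected, weight-shared filters and pooling, alongside a small recurrent network whose hidden state carries memory across a sequence.

```python
# Minimal PyTorch sketch of the two architectures described above
# (layer sizes are arbitrary and purely illustrative).
import torch
import torch.nn as nn

# CNN: local connectivity, shared weights, nonlinear activation, pooling.
cnn = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1),  # local, weight-shared filters
    nn.ReLU(),                                  # nonlinear activation
    nn.MaxPool2d(2),                            # layer pooling
    nn.Flatten(),
    nn.Linear(8 * 32 * 32, 2),                  # final classification head
)
image = torch.randn(1, 1, 64, 64)               # one single-channel toy image
print(cnn(image).shape)                         # -> torch.Size([1, 2])

# RNN: a hidden state vector carries "memory" across a sequence.
rnn = nn.GRU(input_size=12, hidden_size=32, batch_first=True)
sequence = torch.randn(1, 50, 12)               # 50 time steps, 12 features each
outputs, hidden = rnn(sequence)
print(hidden.shape)                             # -> torch.Size([1, 1, 32])
```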

In recent years, generative AI (GAI), a subtype of AI, rose to prominence with the introduction of new language and image models that showed unprecedented capabilities. GAI models can now create images or even videos from text input, edit images from text prompts, and generate text that carries on complete conversations. These models have also become openly available, adding further fuel to the surge of GAI's popularity among the general public and allowing non-technical users to experiment with use cases in various domains and specialties. The rise of GAI can be traced back to the advancement of two types of networks in particular: transformers, which build on and overcome the limitations of RNNs for sequence modeling, and GANs, which pair two neural networks (often CNNs) and train them together in an adversarial manner.

While RNNs excel at handling sequential data by maintaining a hidden state that captures the essence of previous inputs, they struggle with long-range dependencies and parallel processing. The transformer overcomes these limitations by introducing self-attention mechanisms, which allow the model to weigh the importance of each word in a sequence relative to all other words, rather than relying solely on the sequential processing inherent in RNNs. The transformer's architecture eliminates the need for recurrent connections, enabling it to process all tokens in a sequence simultaneously. This parallelism significantly enhances efficiency and allows the model to capture long-term dependencies more effectively. The use of multi-head self-attention within the transformer ensures that the model can focus on different parts of the sequence simultaneously, leading to a richer and more nuanced understanding of context. Transformers paved the way for the remarkable capabilities of today's large language models (LLMs), such as ChatGPT [13].
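The core of this mechanism can be sketched in a few lines. The toy example below (NumPy, a single attention head, random matrices standing in for learned projections) shows how every token's query is compared against every other token's key in parallel, and how the resulting weights mix the value vectors into contextualized representations.

```python
# Minimal sketch of scaled dot-product self-attention, the core mechanism of
# the transformer described above (single head, toy dimensions, NumPy only).
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

seq_len, d_model = 6, 16                     # 6 tokens, 16-dimensional embeddings
tokens = np.random.randn(seq_len, d_model)

# Random matrices stand in for the learned query/key/value projections.
W_q, W_k, W_v = (np.random.randn(d_model, d_model) for _ in range(3))
Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v

# Every token attends to every other token in parallel: the weights express
# how important each position is relative to all the others.
scores = Q @ K.T / np.sqrt(d_model)          # (seq_len, seq_len)
weights = softmax(scores, axis=-1)
contextualized = weights @ V                 # context-aware token representations
print(weights.shape, contextualized.shape)   # (6, 6) (6, 16)
```

Multi-head attention repeats this computation several times in parallel with different projections and concatenates the results.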

Generative Adversarial Networks (GANs), on the other hand, are highly effective in generating realistic data across various domains, such as images, video, and audio. GANs consist of two neural networks—a generator and a discriminator—that are trained together in a competitive setting. The generator attempts to produce data that mimics the real data distribution, while the discriminator tries to distinguish between real and generated data. This adversarial process pushes the generator to create increasingly convincing outputs, ultimately resulting in the generation of highly realistic data [14].
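A schematic sketch of this adversarial loop is shown below (PyTorch, with random one-dimensional vectors standing in for real samples); it is meant only to illustrate the alternating discriminator and generator updates, not a production GAN.

```python
# Schematic sketch of the adversarial training loop described above
# (PyTorch, toy one-dimensional "data"; illustrative only).
import torch
import torch.nn as nn

generator = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 64))
discriminator = nn.Sequential(nn.Linear(64, 32), nn.ReLU(),
                              nn.Linear(32, 1), nn.Sigmoid())
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(200):
    real = torch.randn(8, 64) + 2.0               # stand-in for real samples
    fake = generator(torch.randn(8, 16))          # generated samples

    # Discriminator update: label real data 1 and generated data 0.
    d_loss = bce(discriminator(real), torch.ones(8, 1)) + \
             bce(discriminator(fake.detach()), torch.zeros(8, 1))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator update: try to make the discriminator call the fakes real.
    g_loss = bce(discriminator(fake), torch.ones(8, 1))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
```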

Teaching Points:

  • AI and ML are critical for performing tasks that mimic human cognition, with ML enabling algorithms to learn from data and improve predictions.

  • Deep Learning, a subset of ML, uses large neural networks to process complex data, excelling in tasks like image and speech recognition.

  • GAI models, powered by transformers and GANs, are revolutionizing AI applications by creating realistic data, such as images and text, from simple inputs.

AI applications in cardiovascular imaging

AI can analyze vast amounts of image data to identify subtle patterns and anomalies that may be overlooked by human experts. For instance, AI-powered systems can accurately quantify coronary artery stenosis from CT angiography in real time [15]. Neural networks can also be trained with the appropriate data to detect early signs of heart failure from chest X-rays [16]. Such applications can lead to earlier and more accurate diagnoses, enabling timely interventions and improved patient outcomes. Beyond diagnostic capabilities, AI is optimizing imaging workflows. Automated image acquisition, reconstruction, and segmentation tasks reduce human error and expedite the interpretation process [17]. Additionally, AI-driven predictive models can identify patients at high risk for cardiovascular events based on imaging data, allowing for proactive risk management strategies [18].

Generative AI (GAI) is revolutionizing cardiovascular imaging by enhancing image quality, automating complex tasks, and improving diagnostic precision across various modalities [19]. In cardiac MRI (CMR), for example, GAI plays a crucial role in accelerating image reconstruction and reducing motion artifacts, with methods like those developed by Ghodrati et al. enabling free-breathing scans, thus enhancing patient comfort and scan efficiency [20]. Advanced reconstruction techniques such as variational neural networks (VNNs) allow for high-quality imaging from undersampled data, significantly reducing scan times without compromising accuracy [21]. This is particularly beneficial for procedures requiring detailed volumetric and functional analysis of the heart, making CMR more accessible and reliable for clinical decision-making.

In Cardiac Computed Tomography (CCT), GAI-based approaches have shown significant promise in improving both image quality and diagnostic accuracy. AI-driven algorithms, such as Itu et al.'s method for Fractional Flow Reserve CT (FFR-CT), have drastically reduced analysis time while maintaining high predictive accuracy, showcasing the potential of AI to enhance non-invasive coronary artery disease (CAD) evaluation [22]. These AI-powered advancements are not only streamlining clinical workflows but also providing more consistent and reliable diagnostic information, ultimately improving patient outcomes in cardiovascular care. Table 1 provides an overview of the AI concepts and applications discussed in this section.

Table 1 AI Concepts and Applications

Teaching Points:

  • AI improves diagnostic accuracy by identifying subtle patterns in cardiovascular imaging, such as detecting coronary artery stenosis and early heart failure.

  • AI optimizes imaging workflows by automating tasks like image acquisition and reconstruction, reducing human error and speeding up diagnosis.

The “black box” challenges

The "black box" nature of AI models

Despite these benefits, the complexity of AI models, particularly deep learning methods, poses significant challenges [23, 24]. In the context of AI in radiology, "black box" refers to situations where the AI model's decision-making process is opaque or not easily understandable by humans. This means that while the AI can provide results or recommendations, the underlying reasoning or mechanisms that led to these conclusions are not transparent. Such opacity can pose challenges in clinical settings because clinicians may not fully understand or trust the AI's outputs, which can affect patient care [25, 26]. Understanding and explaining AI's decisions is crucial for clinical acceptance and ethical deployment [27].

Ensuring the reliability of an AI system requires demonstrating that the system has learned the underlying properties and that the decisions made are not based on irrelevant correlations between input and output values in the training dataset [28]. While it is possible to minimize an AI method's weaknesses by carefully selecting its model architecture and training algorithm, errors cannot be eliminated [29].

The degree to which users can understand the models generated by different AI methods varies significantly. With the emergence of new and powerful DL methods, it is becoming increasingly difficult to reconstruct their decisions. Frequently, the resulting models function as "black boxes," making it arduous for users to comprehend the internal processes [28]. Users can observe only the input and output values, even though designers possess an understanding of the system's architecture and the methodologies employed to generate the models [30]. In contrast, interpretable models are referred to as white boxes: weights are assigned to each feature, allowing for easy reading and interpretation. An intermediate stage between the two is the gray box; gray box models provide a certain level of insight into internal data processing [31].

It is important to note that in practice, a method cannot always be clearly classified as a white, gray, or black box method. Thus, to address the lack of explainability, explanation models are needed for black box models to help users understand how they work. Figure 1 provides a visual representation of the black box problem in comparison with explainable AI.

Fig. 1

Visual representation of the black box problem in comparison with explainable AI

Challenges and limitations associated with AI's "black box" nature in cardiovascular imaging

As stated before, the decision-making process of AI is often unclear, which presents a challenge in interpreting and understanding its results. Although the results of DL in cardiovascular imaging are promising, they are still modest, and several challenges must be overcome to improve them [32]. Common deep learning architectures, such as convolutional neural networks (CNNs), generative adversarial networks (GANs), and recurrent neural networks (RNNs), do not provide explanations for their outcomes [33]. In the clinical context, this opacity, commonly referred to as the black box problem, is the most important challenge. Therefore, in the health sector, developing explainable machine learning systems remains a top priority for computer scientists, policymakers, and users [34].

There is currently no agreed-upon definition of explainability, despite the consensus on the importance of developing and implementing interpretable models [35]. For instance, Luo et al. proposed a new data preprocessing technique for detecting cardiac diseases using cardiac magnetic resonance (CMR) imaging and a new network structure for the estimation of left ventricular volume. Their study demonstrated that the method had high accuracy in predicting left ventricular (LV) volumes. However, they pointed out a significant challenge commonly encountered in deep learning methods: the lack of interpretability for physicians. Achieving true interpretability in LV volume prediction might, for example, mean enabling physicians to identify the specific pixels used in blood volume computations. They emphasized that future research should focus on achieving interpretability in the direct prediction of LV volumes [36].

Even though AI algorithms can detect coronary artery disease, heart failure, conduction abnormalities, and valvular heart disease and aid in diagnoses, the lack of transparency raises concerns about their reliability, interpretability, and potential biases. To ensure that AI's clinical integration aligns with practical standards in healthcare, it is essential to understand the inner workings of these algorithms.

Teaching Points:

  • The complexity of AI models, particularly deep learning methods, often leads to a "black box" phenomenon, where the decision-making process is not transparent and difficult to interpret.

  • Explainability is crucial for clinical acceptance and ethical deployment of AI in healthcare, yet remains a challenge due to the opaque nature of many AI models.

  • In cardiovascular imaging, the lack of transparency in AI models raises concerns about reliability, interpretability, and potential biases, making explainable AI a priority.

Impact of explainability on clinical decision-making and patient outcomes in cardiovascular imaging AI

As elaborated before, AI has an essential role in cardiovascular imaging, and understanding how it works is critical for effective implementation [37]. Evidence-based medicine is challenged by the opaqueness of ML models, especially in medical imaging. In evidence-based medicine, clinical decisions are informed by the best available evidence from scientific research, combined with clinical expertise and patient values. This approach relies heavily on transparent and interpretable data and models, allowing clinicians to understand the rationale behind recommendations or decisions. However, ML models, including those used in cardiovascular imaging, often operate as "black boxes," meaning their internal decision-making processes are not easily interpretable or explainable. This lack of transparency poses a significant challenge for evidence-based medicine because clinicians may struggle to trust or understand the outputs of these models, hindering their ability to integrate them effectively into clinical practice. In the context of cardiovascular imaging, where accurate diagnosis and treatment decisions are paramount, the opaqueness of ML models can lead to uncertainty or skepticism among healthcare professionals. Clinicians may hesitate to rely on ML-based recommendations without a clear understanding of how the model arrived at its conclusions.

One significant challenge relates to error detection. It is plausible that AI systems may sometimes deviate from accepted standards of clinical decision-making [38]. Image classification algorithms, such as convolutional neural networks, are particularly susceptible to unexpected and unusual classification errors [39], making it difficult to comprehend the causal factors underlying these ML models' correlations. This ambiguity can undermine healthcare practitioners' confidence in relying on AI predictions, particularly when they conflict with conventional clinical judgment [40]. To optimize ML systems, it is imperative to comprehend their decision-making process. AI explainability allows individuals to understand how an AI model makes decisions, going beyond just improving AI actions [41].

Qualitative research indicates that clinicians prioritize pertinent and easily comprehensible ML model information to make informed decisions. A study conducted by Tonekaboni et al. found that clinicians do not necessarily prefer to understand the causal mechanisms of action behind ML decision-making. Instead, they prefer easily understandable and relevant information about how the model works in the context of healthcare decision-making. This information may include confidence scores, the reasoning behind a decision, and details that are tailored to the specific clinical context [27].

Lang and colleagues have also pointed out that some of the most effective applications of AI in cardiovascular imaging may not be explainable. This has raised concerns among some experts who suggest that the use of unexplainable models should be stopped due to the significant problems they may pose [38, 42]. In conclusion, while technical experts may not possess a comprehensive understanding of machine learning (ML) algorithms, it is imperative that these systems furnish outputs or associated information enabling users to assess predictions pertinent to their clinical decision-making. Although efforts are underway to develop mechanisms for contextualizing ML predictions based on user needs, achieving full comprehension of AI predictions remains an evolving research frontier [43]. Table 2 provides an overview of the importance of explainability in AI for cardiovascular imaging.

Table 2 Explainability in AI for cardiovascular imaging

Teaching Points:

  • The lack of transparency in AI models, particularly in cardiovascular imaging, poses a challenge to evidence-based medicine by making it difficult for clinicians to understand and trust AI-generated recommendations.

  • The opaqueness of AI models can undermine healthcare professionals' confidence, especially when AI predictions conflict with traditional clinical judgment.

  • Clinicians prioritize AI outputs that are relevant, easily comprehensible, and tailored to specific clinical contexts, even if they do not fully understand the underlying mechanisms.

Legal and ethical implications

Challenges related to unexplainable AI in healthcare

The opacity of AI systems introduces significant legal and ethical challenges in healthcare. Clinician trust is crucial for AI integration into clinical workflows. A lack of explainability and transparency can lead to ethical dilemmas and affect reliance on AI for patient care [44]. Ethical principles such as beneficence (acting in the best interest of patients) and non-maleficence (doing no harm) come into play when considering the potential risks associated with using AI systems with opaque decision-making processes. Transparency in algorithmic processes is key to facilitating comprehension [45]. In clinical settings, AI techniques must provide justifications for their decisions to increase clinicians' confidence in the accuracy of the results [46]. The use of AI models with low transparency or interpretability also raises concerns about accountability, patient safety, and decision-making processes. From a legal perspective, the issue of clinician trust intersects with liability and accountability. If clinicians rely on AI-driven diagnoses or treatment recommendations without fully understanding the rationale behind them, it can complicate matters in cases of medical errors or adverse outcomes. Determining responsibility becomes challenging when the decision-making process of AI remains opaque, potentially raising questions about liability and legal accountability [38].

Unfortunately, many AI-based cardiovascular imaging applications remain unexplainable "black boxes." It can be challenging to evaluate the clinical risks and benefits of unexplainable models, particularly when there is a risk of biased decision-making. The challenge becomes even greater when it comes to distinguishing between AI models that can be explained and those that cannot [30]. The use of unexplainable AI in medical applications has been a topic of debate in recent times. While some argue that regulations should deal more strictly with unexplainable models, others believe that stricter regulations might impede innovation and clinical adoption and lead to suboptimal patient outcomes [38]. The replication of clinical trials for technically unexplainable models is uniquely challenging, since commercial developers often do not wish to divulge their trade secrets [47]. Nevertheless, it is essential to recognize that uncertainty surrounding medical interventions is not a new challenge. Even so, the unique complexities of AI-based cardiovascular imaging applications warrant careful consideration of whether distinct regulatory approaches are necessary, including adherence to validation plans and regulations set forth by regulatory bodies such as the FDA for the deployment of medical AI [30].

Legal frameworks governing unexplainable AI extend to medical malpractice, making it more difficult for clinicians to set standards of care. The changing landscape necessitates a reevaluation of professional expectations and guidelines. Increasingly, AI-based care poses challenges to traditional ethical beliefs, as automated decision-making impacts comprehensibility [38, 48].

Another challenge in the context of unexplainable AI is the concept of informed consent. Clinical experts believe that informed consent is essential before using AI on patients. They also believe that computer-aided detection applications should be disclosed in reports, explaining the reasons for any eventual disagreement. The provision of inaccurate information to patients and clinicians about the risks of AI algorithms may indeed constitute a breach of the duty of care, so the adequacy of the information provided to users is crucial in making judgments. When it comes to that information, however, experts question what exactly needs to be disclosed to the patient [49]. These challenges become even more complicated with the use of unexplainable AI. Patients have the right to understand and agree to the procedures or treatments suggested by AI algorithms.

Several proposals have been made to mitigate these legal and ethical issues. One possible solution is to efficiently extract interpretable features for disease classification by leveraging the power of deep learning. Researchers have proposed techniques for extracting features from deep learning models that are not only accurate for disease classification but also interpretable by healthcare professionals. By leveraging the capabilities of deep learning algorithms, these techniques aim to identify and extract meaningful and interpretable features or patterns from medical images that are indicative of specific diseases or conditions [50]. This approach allows clinicians to better understand how the deep learning model arrives at its predictions by providing insights into the features or characteristics of the medical images that contribute to the classification process.

Another approach is to provide visible explanations of the output of neural networks after their application to medical images. GRADCAM, short for Gradient-weighted Class Activation Mapping, is a technique used in computer vision and deep learning for visualizing and understanding the decision-making process of convolutional neural networks (CNNs). It works by generating a heatmap that highlights the regions of an input image that are most important for the CNN's classification decision. This heatmap is produced by computing the gradient of the predicted class score with respect to the final convolutional layer of the CNN. By visualizing which parts of the input image contribute most strongly to the network's decision, GRADCAM provides valuable insights into how the model is processing the data and making predictions. This can significantly improve the understanding of the decisions made by these networks and enhance the trust and adoption of AI technologies among medical professionals [45]. An example of GRADCAM use in a cardiovascular context was highlighted by Zhang et al., who employed attention supervision in a deep learning model to guide a multi-stream convolutional neural network (CNN) to focus on specific myocardial segments for automated motion artifact detection in cardiac T1-mapping [51]. However, some commentators have suggested that it may be necessary to abandon unexplainable AI models because of the significant problems that arise from their use [47].
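The following simplified sketch illustrates the GRADCAM idea on a small, untrained CNN (the model, layer choice, and input are placeholders, not a validated clinical pipeline): gradients of the predicted class score are captured at the final convolutional layer, spatially averaged into channel weights, and combined with the activations to form a heatmap.

```python
# Simplified sketch of the GRADCAM idea using PyTorch hooks
# (untrained toy CNN and random input; illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Conv2d(8, 8, 3, padding=1), nn.ReLU(),     # "final" convolutional layer at index 2
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 2),
)
target_layer = model[2]

activations, gradients = {}, {}
target_layer.register_forward_hook(lambda m, i, o: activations.update(a=o))
target_layer.register_full_backward_hook(lambda m, gi, go: gradients.update(g=go[0]))

image = torch.randn(1, 1, 64, 64)                 # stand-in for a cardiac image
scores = model(image)
scores[0, scores.argmax()].backward()             # gradient of the predicted class score

# Channel weights = spatially averaged gradients; heatmap = weighted activations.
weights = gradients["g"].mean(dim=(2, 3), keepdim=True)
heatmap = F.relu((weights * activations["a"]).sum(dim=1, keepdim=True))
heatmap = F.interpolate(heatmap, size=image.shape[-2:], mode="bilinear")
print(heatmap.shape)                              # (1, 1, 64, 64): regions driving the decision
```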

The European and North American Multi-society Statement highlights numerous AI-related ethical challenges and opportunities. Recognizing the need for practical guidelines, a framework has been called for to assist AI practitioners. However, it is worth noting that the rapid pace of change in AI techniques and tools makes it challenging to maintain a comprehensive and up-to-date understanding of the ethical landscape [52, 53].

Physician liability and fault

The use of unexplainable AI models in cardiovascular imaging raises complex questions regarding physician liability within the existing medical malpractice framework. The foundation of medical practice is based on the duty of care, which includes providing treatment, information, follow-up, and maintaining patient confidentiality. However, the evolving landscape of AI in clinical settings introduces uncertainties regarding the appropriate standard of care for clinicians employing unexplainable models [38, 54].

At present, regulations do not appear to draw any legally significant distinction between medical imaging AI models that can be explained and those that cannot, leaving open the question of whether this regulatory approach adequately protects patient interests and strikes an appropriate balance between innovation and safety.

As establishing a direct link between breach of duty and patient harm becomes increasingly difficult in AI-related medical malpractice, causation becomes especially intricate. In cases of unexplainable AI models contributing to patient injury, true causation, determined by a "but-for" test, may prove elusive [55]. A legal cause-and-effect analysis adds to the complexity, especially with models that operate beyond human comprehension and are technically unexplainable [38]. It is difficult to hold physicians legally responsible for injury under circumstances where foreseeable outcomes are difficult to identify.

Physicians must provide treatment consistent with professional best practices as mandated by law. When someone claims medical malpractice, they must prove that a physician failed to meet their duty of care and that as a result, they suffered legally recognizable harm. Courts face considerable challenges when it comes to adopting perspectives about the unexplainable nature of AI models in medical imaging, potentially complicating the attribution of liability [56]. Establishing standards of care and legal causation in medical malpractice cases is a complex task and presents inherent difficulties. Furthermore, the introduction of unexplainable medical AI adds another layer of complexity in product liability cases, leading to discussions on whether manufacturers should be held accountable for the unforeseeable outcomes of their products [56]. For instance, if a DL-powered model is used for cardiovascular CT image reconstruction and a patient is injured due to misdiagnosis of a cardiovascular abnormality, it may not be immediately clear whether the physician is responsible for the injury, even if a court finds that the physician breached their duty of care. Notably, the automatic presumption of fault in product liability regimes contrasts with the evidence-based approach in civil liability regimes [45].

Currently, the European and North American Multi-society Statement notes that physicians, including radiologists, are held liable in cases where the "standard of care" is not provided. In cases where AI is used as a decision aid, radiologists will likely still be considered liable, though it is probable that litigation will also target AI product manufacturers. Since models incorporate large amounts of data, some of which are not perceptible to humans, the question will arise whether physicians should remain solely responsible or whether responsibility should be shifted to those who produce, market, and sell the models. If, for example, low-dose CT images are enhanced by an algorithm to improve image quality, but this processing alters an important yet subtle feature so much that it becomes barely perceptible, the software developer should be liable. In the end, it is up to practice and case law to resolve these complex legal issues [52]. The American College of Radiology (ACR) also holds that, for now, because there are no diagnostic radiology models cleared for autonomous use in the U.S., responsibility remains solely with the radiologist.

Ultimately, the evolving landscape of medical AI necessitates careful consideration of regulatory approaches, ongoing technological advancements, and dynamic interpretations by courts. It is still difficult to strike a balance between encouraging innovation, ensuring patient safety, and setting clear standards of accountability. Despite existing literature discussing multiple liability theories about AI use, a definitive and unanimous answer to this issue has not yet been found [49]. In the coming years, solutions that improve interpretability and transparency while addressing ethical considerations will play a pivotal role in shaping the responsible integration of AI into cardiovascular imaging. Table 3 provides a brief overview of the ethical and legal implications of unexplainable AI in cardiovascular imaging.

Table 3 Legal and ethical implications of unexplainable AI in cardiovascular imaging

Teaching Points:

  • The opacity of AI systems raises significant ethical and legal challenges, particularly regarding accountability in cases of medical errors or adverse outcomes. Clinicians need transparency to build trust and make informed decisions.

  • The use of unexplainable AI models complicates physician liability within the existing medical malpractice framework. Establishing a clear standard of care and causation becomes increasingly difficult when AI decisions are not fully understood.

  • Informed consent is crucial when using AI in healthcare. Patients have the right to understand and agree to AI-driven procedures or treatments, making transparency in AI systems vital for maintaining trust and ensuring patient safety.

Bridging the gap and future directions

To overcome the ethical issues and challenges associated with the use of unexplainable AI in healthcare, particularly in cardiovascular imaging, there has been a surge of interest in explainable AI (XAI) techniques. This section explores these techniques and outlines future directions to address the black box problem.

Advancements in explainable AI (XAI) techniques and innovative solutions for interpretability

Model-based versus post hoc explanation

Model-based explanation refers to models, such as linear regression or support vector machines, that are simple enough to be easily understood while still being sophisticated enough to effectively capture the relationship between inputs and output [43]. These are usually traditional ML models that are simpler and more interpretable, in contrast to more modern, complex models such as deep neural networks. Sparsity and simulatability are two well-known properties of such models. Sparsity refers to models that force many coefficients exactly to zero; this leads to a sparse model in which only a subset of features contributes significantly to the output, making the inner construct of the model explainable [57]. Simulatability refers to whether a human can internally reason about the model's computations and decision-making process. In simpler models, such as linear regression, it is easier for an individual to comprehend how each feature contributes to the final output [58]. Figure 2 shows how some explainable models can have a minimal black box problem.

Fig. 2

An illustration of how some explainable models can have a minimal black box problem
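As a concrete illustration of the sparsity property described above, the sketch below fits an L1-penalized (lasso-style) logistic regression to synthetic data with scikit-learn; many coefficients are driven exactly to zero, leaving a short, directly readable list of contributing features.

```python
# Minimal sketch of sparsity: an L1-penalized logistic regression keeps only a
# small subset of features with nonzero weights (synthetic data; illustrative only).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=400, n_features=20, n_informative=4,
                           random_state=0)

sparse_model = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
sparse_model.fit(X, y)

coefs = sparse_model.coef_.ravel()
kept = np.flatnonzero(coefs)
print(f"{kept.size} of {coefs.size} features have nonzero weights:")
for idx in kept:
    print(f"  feature {idx}: weight = {coefs[idx]:+.3f}")   # directly readable
```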

In contrast to model-based explanation, post hoc explanation trains a neural network and subsequently tries to elucidate the behavior of the resulting black box network rather than forcing the neural network to be explainable. This makes post hoc explanation easier to understand and more user-friendly, and it can be applied to any model, regardless of its complexity [57]. Techniques include inspection of the learned features, feature importance, interaction of features, and visual explanation by saliency maps [59,60,61,62]. However, the weakness of this approach is its limited capacity to capture the full complexity of a model. Therefore, the choice between the two is a trade-off between accuracy and interpretability and depends on the specific use case.
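A simple post hoc example is permutation feature importance, sketched below with scikit-learn on synthetic data: a "black box" random forest is trained first, and the explanation is computed afterwards by measuring how much shuffling each feature degrades performance, without opening the model itself.

```python
# Minimal post hoc explanation: permutation feature importance applied to an
# already-trained model (synthetic data; illustrative only).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

X, y = make_classification(n_samples=400, n_features=8, n_informative=3,
                           random_state=0)
black_box = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# The explanation is computed after training, treating the model as a black box.
result = permutation_importance(black_box, X, y, n_repeats=10, random_state=0)
for idx, score in enumerate(result.importances_mean):
    print(f"feature {idx}: importance = {score:.3f}")
```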

Global and local explanation

Global explanation, also called dataset-level explanation, refers to understanding the overall workings of a machine learning model across the entire dataset. It can quantify the importance of features and present them as scores at the dataset level, determining how much each feature contributes to the output across the entire dataset [60]. Local explanation describes how the model reached a particular decision for an individual instance or data point. As an example, in a neural network model, the global explanation can establish at the dataset level that high blood pressure increases the risk of cardiovascular events, whereas the local explanation shows why an increase in blood pressure increases the risk of cardiovascular events in a single person [57, 63].
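The toy sketch below (synthetic data, hypothetical feature names) illustrates this distinction on a simple linear risk model: the global view averages each feature's contribution over the whole dataset, while the local view lists the contributions for one individual patient.

```python
# Toy sketch of global versus local explanation on a linear risk model
# (synthetic data and hypothetical feature names; illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
features = ["systolic_bp", "age", "ldl_cholesterol"]
X = rng.normal(size=(500, 3))
y = (0.9 * X[:, 0] + 0.4 * X[:, 1] + rng.normal(size=500) > 0).astype(int)
model = LogisticRegression().fit(X, y)

# Global (dataset-level): average magnitude of each feature's contribution.
global_importance = np.abs(model.coef_ * X).mean(axis=0)
print("Global:", dict(zip(features, global_importance.round(3))))

# Local (instance-level): contribution of each feature for one patient.
patient = X[0]
local_contributions = model.coef_.ravel() * patient
print("Local:", dict(zip(features, local_contributions.round(3))))
```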

There are examples of global and local explainability in cardiovascular imaging as well. In 2019, Clough et al. presented a classification framework for identifying cardiac diseases using temporal sequences of cardiac MR segmentations, based on a convolutional neural network (CNN) model [64]. Their model not only performed the classification but, with the help of variational autoencoders (VAE), also allowed global and local interpretation. Variational autoencoders (VAEs) are a type of generative model that learns a latent representation of input data and can reconstruct input data from a compressed latent space [65]. By local interpretation, they meant the ability to ask, "Which features of this particular image led to it being classified in this particular way?" and by global interpretation, they meant, "Which common features were generally associated with images assigned to this particular class?"

Techniques for interpretable features and visual explanations

Techniques that extract interpretable features from deep learning models are essential for demystifying black box AI systems. Research should focus on developing methods that transform complex neural network representations into more understandable formats without losing the accuracy and robustness of the original models [45]. Furthermore, visual explanation tools such as Gradient-weighted Class Activation Mapping (GRADCAM) can provide intuitive insights into AI decisions by highlighting important regions in an image that contribute to the model’s output. These visual aids can help clinicians understand and trust AI diagnoses by showing which parts of an image were most influential [62].

Development of hybrid models

Hybrid models that combine interpretable models with black box systems can enhance transparency without sacrificing performance. These models can use interpretable components to provide explanations and black box components to handle complex, high-dimensional data [66]. Research should focus on optimizing these hybrid approaches to maintain accuracy while improving interpretability.

User-friendly interfaces

Designing user-friendly interfaces that present AI explanations in an accessible manner is also crucial. Future research should prioritize developing tools and platforms that allow clinicians to interact with and query AI models easily. Interactive dashboards, visualization tools, and customizable explanation reports can help make AI insights more usable and trustworthy [67].

By advancing XAI techniques and developing innovative solutions for interpretability, the medical community can enhance the transparency and trustworthiness of AI models in cardiovascular imaging. These efforts will facilitate the responsible and effective integration of AI technologies into clinical practice, ultimately leading to better patient outcomes and improved healthcare delivery.

Education and training for healthcare professionals

The integration of AI in healthcare, particularly in cardiovascular imaging, necessitates comprehensive education and training programs for healthcare professionals. These programs are essential for equipping clinicians with the necessary skills to understand, interpret, and effectively use AI models in their practice. Without proper training, the benefits of AI cannot be fully realized, and the potential for misuse or mistrust may increase [27].

Training programs should be designed to provide a robust understanding of AI concepts, including machine learning, deep learning, and explainable AI (XAI) [68]. These programs should cover both the theoretical foundations and practical applications of AI in healthcare. Clinicians need to understand not only how to use AI tools but also how these tools work, their limitations, and the ethical considerations involved [69].

An example of successful AI training can be found in radiology. Many radiology departments have started incorporating AI training into their residency programs. These programs often include courses on AI fundamentals, hands-on training with AI tools, and case studies demonstrating AI applications in radiological practice [70]. For instance, the Radiological Society of North America (RSNA) offers educational resources and workshops on AI, helping radiologists stay updated with the latest AI advancements and best practices [71].

Workshops and continuing medical education (CME) programs are vital for keeping healthcare professionals abreast of the latest developments in AI. Organizations such as the American College of Cardiology (ACC) and the European Society of Cardiology (ESC) can play a pivotal role by offering regular workshops, webinars, and courses focused on AI in cardiovascular imaging. These sessions can cover new AI tools, clinical case studies, and interactive discussions on the challenges and benefits of AI integration [72].

Online platforms and resources can also provide accessible training opportunities for healthcare professionals. In addition, developing certification programs for AI proficiency in healthcare can standardize training and ensure a high level of competency among clinicians [73]. Certification can also provide a benchmark for institutions to assess the AI skills of their staff. For instance, a certification program could cover topics such as AI fundamentals, practical applications in cardiovascular imaging, ethical considerations, and patient communication.

Patient involvement and informed consent

As AI technologies, particularly those with black box characteristics, become more integrated into healthcare, especially in cardiovascular imaging, it is crucial to focus on patient involvement and informed consent. Addressing the challenges associated with black box AI models requires specific strategies to ensure patients are informed and engaged in their care decisions [74]. Future efforts should focus on creating and refining communication strategies that help patients understand the use of black box AI in their care. This involves developing educational materials that clearly explain AI technologies, their benefits, risks, and limitations in an accessible manner. Visualization tools, such as interactive diagrams or videos, can be particularly effective in demystifying complex AI concepts [75].

To address the challenges posed by black box AI, the informed consent process must be enhanced. Consent forms should include detailed information about the AI technology being used, how it contributes to the diagnostic or treatment process, and any potential uncertainties or limitations. Future work should explore standardized consent frameworks that can be adapted across various healthcare settings to ensure consistency and thoroughness [76].

Workshops, online courses, and informational brochures can help bridge the knowledge gap and empower patients to participate actively in their care decisions [77]. In addition, future research should aim to make black box AI algorithms more transparent and interpretable to patients. This could involve developing intermediate explanation models or user-friendly interfaces that provide insights into how AI algorithms arrive at their conclusions. For instance, integrating explainable AI (XAI) techniques that generate patient-friendly summaries of the AI’s decision-making process can enhance transparency [78].

Standardization, regulatory frameworks and policy aspects in imaging

The lack of standardized criteria for AI explainability presents significant challenges for consistent assessment across various applications [36]. Developing standardized frameworks and metrics for explainability is essential to provide common ground for developers, clinicians, and policymakers [79]. These standards will help ensure that AI models are evaluated consistently, enhancing their reliability and ethical deployment [52].

Ethical challenges posed by unexplainable AI models necessitate robust regulatory frameworks [44]. Future work should focus on creating guidelines that balance transparency, innovation, patient safety, and accountability [66]. Legal frameworks must address the unique complexities of AI in healthcare, ensuring that ethical principles such as beneficence, non-maleficence, autonomy, and justice are upheld in clinical practice. This approach will help build trust among clinicians and patients, facilitating the integration of AI into healthcare workflows [49].

Establishing mechanisms for the regular assessment of AI algorithms will help identify deviations from accepted standards and ensure that AI systems remain aligned with clinical needs [80, 81]. By implementing standardized explainability, robust ethical frameworks, and continuous monitoring, the medical community can ensure the responsible and effective use of AI in cardiovascular imaging and beyond.

Teaching Points:

  • Different XAI techniques like model-based and post hoc explanations offer various trade-offs between accuracy and interpretability.

  • Understanding both global (dataset-level) and local (instance-level) explanations helps in interpreting how AI models make decisions across different contexts and individual cases.

  • Techniques like GRADCAM and feature extraction make complex AI models more understandable by highlighting important features and decision-making processes.

  • Combining interpretable models with black box systems and developing user-friendly interfaces can enhance both transparency and performance in AI applications.

  • Comprehensive training for healthcare professionals on AI concepts and applications is crucial for effective and safe AI integration into clinical practice.

  • Ensuring that patients are well-informed and involved in decisions regarding AI-based treatments is vital for ethical and effective healthcare delivery.

  • Developing standardized criteria and robust regulatory frameworks for AI explainability will help balance innovation with patient safety and ethical considerations.

Conclusion

The integration of AI in cardiovascular imaging holds great potential but is hindered by the black box nature of many of the models commonly used in AI, which poses significant challenges for clinical decision-making, interpretability, and trust. While AI has demonstrated promising results in detecting various cardiovascular conditions, the lack of transparency raises concerns about its reliability and application in evidence-based medicine. To overcome these challenges, there is a pressing need to develop explainable AI (XAI) techniques that provide clear insights into AI decision-making processes. These techniques, including model-based and post hoc explanations, can bridge the gap between complex AI models and the need for transparency in clinical settings.

Moreover, comprehensive education and training programs for healthcare professionals are essential to ensure the effective and responsible use of AI in practice. These programs should equip clinicians with the knowledge and skills to understand and apply AI tools while addressing the ethical implications of their use. Additionally, patient involvement and informed consent must be prioritized to maintain autonomy and trust in AI-driven healthcare.

Finally, establishing robust ethical and regulatory frameworks is crucial for the safe and effective integration of AI in clinical workflows. By addressing these challenges, we can ensure that AI technologies are deployed responsibly, ultimately enhancing patient outcomes and transforming cardiovascular care.