Introduction

Long-term postoperative cognitive change is a concern for older patients and their care partners. There are at least four important, credible sources of information on long-term postoperative cognitive outcomes which should be considered when trying to develop a unified, patient-centered picture of what might reasonably be expected. These different sources have achieved different degrees of methodological rigor (particularly control of confounding factors) and generalizability (Fig. 1).

Fig. 1

Sources of information on long-term postoperative cognition are varied in their methodology, strengths, and limitations; all must be considered together to provide a unified understanding of the knowns and unknowns in this highly clinically relevant area of study

Addressing this topic first requires a comment on terminology. Previously, the research diagnosis of long-term cognitive effects following surgery/anesthesia was called “POCD,” for “postoperative cognitive dysfunction” or “postoperative cognitive decline.” We use this term in the review; much of the extant published literature applying neuropsychological tests to surgical cohorts is indeed referencing POCD. POCD was statistically defined, i.e., based on neuropsychiatric test performance normalized to baseline or a control group; for example, a postoperative decline of 1 or 2 standard deviations below baseline may meet POCD criteria. This definition lacked a clinical interpretation: it was a research definition only and was not intended for clinical use [1]. Thus, in 2018 [2], new definitions were published, which retained the statistical definition, though slightly modified, but added a requirement for subjective cognitive decline, either by self-report or by others’ observation. The new definition, which attempted to better align with how mild cognitive impairment and/or dementia are defined, is used to make the research diagnosis of “postoperative neurocognitive disorder” (sometimes abbreviated pNCD, PND, or NCD-P). It is not yet known how research using the pNCD research definition will change how we clinically interpret postoperative cognitive change.
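To make the statistical definition of POCD described above concrete, the following is a minimal sketch in Python. It assumes a single test on which higher scores mean better performance, and a known baseline standard deviation. The function names and the example scores are hypothetical, and published studies differ in which tests they combine and which cutoff (for example, 1 or 2 standard deviations) they apply.

```python
def decline_in_sd(baseline: float, postoperative: float, baseline_sd: float) -> float:
    """Return the postoperative change in units of baseline standard deviations.

    Negative values mean worse performance after surgery, assuming higher
    scores indicate better performance on this hypothetical test.
    """
    if baseline_sd <= 0:
        raise ValueError("baseline_sd must be positive")
    return (postoperative - baseline) / baseline_sd


def meets_statistical_criterion(
    baseline: float,
    postoperative: float,
    baseline_sd: float,
    threshold_sd: float = 1.0,  # illustrative cutoff; studies vary
) -> bool:
    """True if the decline is at least `threshold_sd` standard deviations."""
    return decline_in_sd(baseline, postoperative, baseline_sd) <= -threshold_sd


# Hypothetical patient: score falls from 50 to 38, baseline SD of 10.
print(meets_statistical_criterion(50, 38, 10, threshold_sd=1.0))
print(meets_statistical_criterion(50, 38, 10, threshold_sd=2.0))
```

The same hypothetical patient is flagged under a 1 standard deviation cutoff but not under a 2 standard deviation cutoff, which is one reason results from studies using different definitions are hard to compare.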

Anecdote, the Origin of POCD

Anecdote is a powerful and important source of information that motivated the movement towards a scientific definition of POCD. Certainly, the concern around long-term adverse cognitive outcomes after surgery was initiated by, and has persisted in part because of, the sharing of anecdotes. Consider an article in the British Medical Journal dating from 1887, wherein several cases linked anesthetics used at the time, such as chloroform and nitrous oxide, to “insanity” [3]. In one case, a young woman “never regained her senses or recognized her friends.” In 1899, an article in the journal The Hospital responded, “Anaesthetics rarely produced insanity except in patients who had previous attacks, or were predisposed” [4]. In the modern era, popular media articles discussing POCD perhaps universally lead with a brief anecdote of cognitive change told from a patient-centered perspective. It is important to acknowledge explicitly that these anecdotes continue to accumulate despite modern intraoperative hypnotics and monitoring. In contrast, modern scientific publications on postoperative cognition rarely include patient-centered descriptions, now that formal research definitions are available. Anecdote, although underexamined, still provides us with critical patient perspectives on these definitions, as well as a lay understanding of long-term postoperative cognitive outcomes.

Do lay descriptions of anecdotal postoperative cognitive change align with the medical understanding of patients meeting the research definition of POCD? A popular media article published recently in the British news source The Guardian received over 80 submitted comments, including nearly 40 unique anecdotes of POCD [5]. These anecdotes relayed a lay perspective of neurocognitive change after surgery or other medical exposures, ranging from minor sedation cases to perioperative cardiac arrest. Writers cited a wide variety of perceived causes, from anesthetic medications to surgical trauma to in-hospital experiences (e.g., poor food, limited activity).

In these anecdotes, patients and caregivers described deficits in memory and executive function, psychological changes like depression and anger, and “brain fog” after surgery [5]. The reported symptoms they describe align with and extend beyond neuropsychiatric domain-based deficits measured in research-defined POCD, like the memory and executive function deficits identified in primary cohort studies. While “brain fog” was the single most common anecdotal descriptor, it is poorly localized in a neuropsychological sense, sometimes reflecting deficits in attention, processing speed, word-finding, memory, or other sources [6]. However, in contrast to the research understanding of POCD, most of the anecdotes from these comments describe long-lasting, even permanent, symptoms. For example, the writer of one anecdote stated, “Three years on I am only managing to read a few pages at a time,” capturing what may be an executive function deficit (attention and short-term memory), its functional impact, and its duration. Although this patient’s experience is likely extreme — and, more broadly, the captivating details of anecdotes are inherently nongeneralizable — there are important lessons in how patients who perceive cognitive decline express, and relate to, their symptoms. For example, studying and describing potential subjective cognitive outcomes after surgery and their prevalence, in patient-centered, functional terms, would help all perioperative clinicians better communicate POCD risks.

Critically, there is presently an absence of data on how frequently patients experience permanent and subjectively impactful cognitive decline after surgery. Understanding common subjective symptoms of postoperative cognitive change may support earlier identification and referral for testing and therapy for patients. Anecdotes also relayed dissatisfaction with the medical establishment, describing medical professionals as failing to give warning about POCD symptoms and their management and generally leaving patients and their caregivers unprepared [5]. This presents an opportunity, and a patient care imperative, to systematically study the patient experience of cognitive recovery or non-recovery after surgery. Anecdote, while not equivalent to research-defined pNCD, shapes the patient-centered understanding of POCD and is critical to how perioperative clinicians discuss these risks.

Primary Cohort Studies: Detailed Characterization

Much of the published evidence about POCD is derived from primary cohort studies, where a group of patients is recruited shortly before surgery and then undergoes cognitive testing both before and at specified intervals after surgery. Primary cohort studies have the advantage of closely prespecifying analytic time points and neuropsychological tests of interest and offer detailed characterization of a small group of patients. POCD determinations are formally made in comparison to volunteer nonsurgical patients or to the participants’ own baseline measurements. The underlying assumption, that surgical patients would experience cognitive change similar to volunteer nonsurgical controls absent surgery, or similar to their own baseline performance after correction for learning effects, is most easily met shortly after surgery, where long-term cognitive aging would be expected to be subtle.
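One way to picture the comparison to nonsurgical controls described above is a control-adjusted change score: the average change seen in controls over the same interval is treated as the learning (practice) effect and subtracted from the patient's change, and the result is scaled by the spread of the controls' changes. The Python sketch below is an illustration under that assumption only; the function name and all numbers are hypothetical, and individual studies may compute their scores differently.

```python
from statistics import mean, stdev


def control_adjusted_z(patient_change: float, control_changes: list[float]) -> float:
    """Z-score of a patient's change relative to nonsurgical controls.

    `patient_change` is (follow-up score - baseline score) for one patient.
    `control_changes` holds the same quantity for each control participant.
    The controls' mean change stands in for the learning effect.
    """
    learning_effect = mean(control_changes)
    spread = stdev(control_changes)  # sample standard deviation; needs >= 2 controls
    if spread == 0:
        raise ValueError("control changes have no variability")
    return (patient_change - learning_effect) / spread


# Hypothetical controls improve by about 2 points on retest (practice effect).
controls = [1.0, 2.0, 3.0, 2.0, 2.0]

# A patient who also improves by 2 points shows no change relative to controls.
print(control_adjusted_z(2.0, controls))

# A patient whose score drops by 1 point has declined relative to controls.
print(control_adjusted_z(-1.0, controls))
```

Note that a patient whose raw score is unchanged would still register as a relative decline here, because controls are expected to improve on retest.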

The ISPOCD-1 study [7], one of the earliest to systematically measure prevalence of POCD after major surgery, found that 26% of patients met POCD criteria at 1 week, and 9.9% at 3 months, after major noncardiac surgery. ISPOCD-1 and others like it have precisely defined cohort-average behavior on cognitive tests in the time following surgery, offering a short-term recovery trajectory which may be helpful for clinical understanding. Cohorts limited to a specific anesthetic strategy or a specific type of surgery have yielded important insights. From this, we know that 3-month POCD rates may be similar whether a person undergoes coronary angiography with sedation (21%), hip replacement under general anesthesia (16%), or coronary artery bypass grafting (16%) [8]. POCD rates are also similar 3 months and 1 year after total intravenous anesthesia versus inhaled volatile anesthesia [9] and 3 months after neuraxial anesthesia versus general anesthesia [10]. Meta-analyses of POCD studies after coronary artery bypass grafting have demonstrated that, on balance, the short-term cognitive decline seen after surgery recovers coincident with clinical recovery [11]. However, caution should be exercised when comparing results across studies, because definitions of POCD vary widely among studies and the resulting rates may not be comparable. Early work implementing the new definition for pNCD, which adds the requirement for subjective cognitive decline, has yielded the surprising conclusion that while only 3% of the investigators’ cohort of hip arthroplasty patients met POCD criteria at 12 months, nearly 30% met criteria for pNCD [12].

Deliberate selection and rigorous administration of neuropsychological tests in primary cohort studies have allowed precise characterization of the typical cognitive deficits seen after anesthesia/surgery. The most impacted domains include memory and executive function/attention [13] (which may be difficult to discriminate between). More broadly, primary cohort studies can offer detailed assessment modalities which generate new insights but would be impractical to perform across a population. The NeuroVISION study performed postoperative brain magnetic resonance imaging of their noncardiac surgical cohort and found a surprising 7% rate of perioperative covert stroke, which was associated with twofold odds of cognitive decline at 1 year [14]. Reproducible associations between POCD and older age, lower baseline cognitive performance [15], frailty [12], and other intrinsic predictors of accelerated cognitive decline have been extensively shown. However, it is worth restating that the methodology for primary cohort studies relies on an underlying assumption that, except for the effects of surgery, surgical patients would experience cognitive change similar to volunteer nonsurgical controls or similar to their own baseline performance after correction for learning effects. At extended durations after surgery (perhaps 6 months or longer), the impact of longitudinal cognitive aging will be greater. If control populations are not well-matched on age, frailty, and health covariates like stroke risk, surgical patients on accelerated decline trajectories (perhaps due to advanced age, frailty, or progressing cerebrovascular disease) are likely to meet criteria for POCD or pNCD while not being cognitively impacted by the surgery itself. In other words, imbalanced comorbid conditions in surgical patients may drive cognitive decline greater than that of controls, producing the “decline” in neuropsychological performance relative to controls necessary to meet criteria.
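The confounding argument above can be illustrated with a small calculation. The Python sketch below assumes, purely for illustration, that change scores are normally distributed with the same spread in both groups, that surgery has no cognitive effect at all, and that surgical patients decline faster than controls for unrelated reasons (for example, comorbid disease). The function name, the 1.96 standard deviation cutoff, and the rate of excess decline are hypothetical values, not figures from any cited study.

```python
from statistics import NormalDist


def expected_flag_rate(
    excess_decline_per_year: float,
    years: float,
    change_sd: float = 1.0,
    threshold_sd: float = 1.96,  # illustrative cutoff only
) -> float:
    """Fraction of surgical patients expected to meet a control-referenced
    decline criterion when surgery itself has NO cognitive effect.

    `excess_decline_per_year` is how much faster (in score units per year)
    surgical patients decline than controls for reasons unrelated to surgery.
    """
    if change_sd <= 0:
        raise ValueError("change_sd must be positive")
    # The gap between groups grows with time, shifting the mean z-score down.
    mean_z = -excess_decline_per_year * years / change_sd
    # z ~ Normal(mean_z, 1); a patient is flagged when z <= -threshold_sd.
    return NormalDist(mu=mean_z, sigma=1.0).cdf(-threshold_sd)


# Hypothetical scenario: surgical patients decline 0.5 SD per year faster.
for years in (0.25, 1.0, 2.0):
    rate = expected_flag_rate(excess_decline_per_year=0.5, years=years)
    print(f"{years:>4} years: {rate:.1%} flagged with no surgical effect")
```

With no excess decline, about 2.5% of patients are flagged by chance under this cutoff. Under the hypothetical mismatch, the flagged fraction rises steadily with follow-up time even though surgery contributes nothing, which is why long-term comparisons depend so heavily on well-matched controls.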

Further, one of the greatest challenges for primary cohort studies is that they are resource-intensive. This creates barriers to using this methodology to measure long-term outcomes; for feasibility, the emphasis is often on short-term (e.g., 3–6 month) cognitive performance. Long-term assessments may be hampered by participant drop-out, which tends to induce bias — more impaired participants may be less likely to volunteer to continue in the study. Because cognitive recovery may be ongoing during the first year after major surgery [11], short-term outcomes do not reflect full cognitive recovery for some patients. To understand long-term cognitive change (i.e., >12 months), an alternative approach is needed.

Epidemiology: a Long-Term View

Epidemiologic cohorts are designed to provide generalized, often repeatedly measured, information on health conditions and function for a broad population of subjects relevant to a broad variety of health questions. They are often publicly funded or derived from data collected for other purposes, because of the huge data-gathering effort they require, but they offer a correspondingly powerful perspective on common conditions affecting human health, like surgery. Some of these studies include repeated objective and/or patient-reported measures of cognitive function. Unlike primary cohort studies, where investigators have a wide variety of choices for neuropsychological tests that correspond to the diverse domains of cognitive function, epidemiologic cohorts typically do not provide detailed assessment of cognition. The power of an epidemiologic approach to cognition lies in the potential to collect data over a very long (e.g., decades) time frame. For perioperative cognition work, this enables modeling of both pre- and postoperative cognitive trajectory that better describes, and controls for, the effect of time on cognitive change in an older population. Epidemiological studies need not assume that controls would behave similarly to surgical patients; this assumption can and should be verified.

There has also been rapid development, over the past two decades or so, of a theoretical basis for making causal conclusions using observational data [16]. This field of “causal inference” has critical implications for postoperative cognition work, seeking to answer the question of whether surgery and anesthesia affect long-term cognitive outcomes. As randomizing patients into surgery and anesthesia versus a non-intervention control group is often both unethical and unrealistic in healthcare settings, sophisticated observational analysis designed to offer causal inference offers an alternative approach. Not all epidemiologic studies of cognition are causal, nor does the use of causal inference techniques — like propensity weighting — imply that conclusions regarding cause and effect are appropriate. Nonetheless, epidemiological analysis has yielded critical insights into long-term cognition before and after surgery, and we discuss some relevant conclusions here.

Prospective cohort designs can be approximated using epidemiological data. Using the Mayo Clinic Study of Aging, Schulte and colleagues identified subtly faster decline in cognitive trajectory among those patients with a history of surgery and anesthesia, as compared to nonsurgical controls [17]. Like primary cohorts, however, the nonsurgical controls were not fully comparable to surgical patients; thus, it is impossible to attribute differences in cognitive outcome to surgery and anesthesia versus the other imbalanced covariates. Designed in this way, analysis of epidemiological data can reproduce the conclusions of prospective cohort studies while offering higher precision or a longer duration of analysis.

Observational approaches using epidemiologic data can yield novel insights, despite not being designed to evaluate cause and effect. Using data from the English Longitudinal Study of Aging, Krause and colleagues compared cognitive trajectories between elective surgical patients and those undergoing a medical hospital admission or a stroke, with a particular focus on the statistical model’s measurement of acute cognitive decline at the time of the health exposure [18]. Rather than dividing a cohort into surgery versus non-surgery groups, as one might for a primary cohort, this work shifted the framework by comparing the neurocognitive outcomes of major surgical and medical hospitalizations with those of no major admissions. Patients who underwent surgical hospitalization experienced a minimal decline in cognitive trajectory compared to those with no hospitalization, but the degree of cognitive change was negligible compared with the substantial declines observed after major medical admissions or stroke events [18]. The negligible cognitive impact of surgical hospitalizations compared with medical hospitalizations was verified in an Australian epidemiological cohort, offering strong evidence of generalizability [19]. On the basis of this work, patients should be reassured that average long-term cognitive decline after major surgery is not meaningfully different from the decline that would have been expected even without surgery.

Observational data also offer powerful opportunities for causal inference, in carefully limited situations. Here, the design and analysis theoretically justify that the causal impact of surgery and anesthesia is the only difference between the surgical group and controls. One such opportunity exists in the comparison of coronary artery bypass grafting (CABG) versus percutaneous coronary intervention (PCI) to address serious coronary artery disease. Factors determining CABG versus PCI are known and can be mathematically accounted for; disease severity is balanced between the two groups, since both have severe enough disease to merit intervention; and the assumption that preoperative rate of cognitive decline is equal between the groups can be verified. This creates a “target trial” — an epidemiological term for conceptualizing an observational analysis using causal inference techniques as theoretically equivalent to a randomized trial [20]. In this case, observational data were used to create a “randomized trial” of CABG versus PCI in order to study long-term cognitive outcomes [21]. There was no difference in average cognitive outcomes at 5 years, and up to 10 years, after traditional CABG (using a cardiopulmonary bypass pump) versus PCI; however, off-pump CABG — a clinically inferior revascularization strategy — also appeared to yield inferior cognitive outcomes. This offers strong evidence that, perhaps surprisingly, there is no across-the-board adverse cognitive impact of traditional CABG and its associated cognitively relevant exposures (major surgery, hypnotics, cardiopulmonary bypass pump, postoperative intensive care unit stay, mechanical ventilation, postoperative delirium, etc.).

The stream of evidence from population-level data consistently finds there is, on average, no or at most minimal additional long-term cognitive decline after anesthesia and surgery and identifies other major health-related exposures — such as a medical hospital admission — as themselves potentially implicated in long-term cognitive outcome. However, the caveat is that these population-level findings do not identify or explain individual-level outliers; not all participants experience the same or average outcome. While epidemiologic cohorts can be used to identify individuals with poorer cognitive outcome than the population average, approximating POCD/pNCD [22], this analysis used methodology which still experiences the same limitations as primary cohort studies — that individuals on a more rapid baseline rate of cognitive decline will meet the definition, despite not being causally impacted by surgery. An epidemiological approach to predicting adverse cognitive outcomes, while promising, is not yet sufficiently developed for clinical application.

Cognition in Context: Dementia Literature

Understanding long-term postoperative cognitive outcomes requires us to expand our perspective beyond surgery and anesthesia. Age, limited cognitive reserve, and baseline vulnerability with predisposing disease conditions are all risk factors for accelerated preoperative neurocognitive decline and are also risk factors for POCD [23]. More broadly, health factors including hypertension, alcohol overuse, smoking, physical inactivity, and diabetes (which may themselves prompt a surgery such as for lung cancer or peripheral vascular disease) are some of the many causes of accelerated cognitive decline in late life [24]. Complicated hospitalization or comorbid delirium, sometimes accompanied by, but not limited to, surgery and anesthesia, is associated with long-term cognitive decline. In an extreme situation, for 24% of critical illness survivors, new cognitive decline equivalent to mild cognitive impairment persists 12 months or longer and, interestingly, is unrelated to the use of sedative or analgesic medications [25]; this further develops the evidence that health stressors may broadly be accompanied by adverse cognitive outcomes, which accumulate (or recover) over time. Although long-term POCD/pNCD is a different entity from related neurocognitive disorders like dementia, successes in modeling dementia risk across the life course [24] have important implications for studying POCD as well.

A broadened framework for POCD must encompass the various causal or associative factors that determine patients’ underlying cognitive resilience or influence the development of neurocognitive dysfunction over the longer timeframe of the life course (Fig. 2). In this perspective, the impact of short-term provocative events such as surgery and anesthesia on the cognitive trajectory may be limited: surgery is one potential explanatory factor among many. Considering life-course cognitive trajectory also offers the opportunity to think about improvement in long-term cognitive function if surgery and anesthesia successfully address drivers of neurocognitive decline that are surgically amenable. While long-term postoperative cognitive improvement (POCI) has not yet been systematically studied, shorter-term neurocognitive improvement is seen after kidney transplant [26], left ventricular assist device implantation [27], and cochlear implant placement [28].

Fig. 2

A life-course philosophy, when considering postoperative cognitive change, incorporates elements of the broader dementia/late-life cognition literature, as well as surgery, as potentially provocative and cognitively impactful exposures. The potential outcomes include both cognitive decline, which we hypothesize is particularly likely in the setting of perioperative complications, and cognitive improvement which may result from a successful surgery intended to address a surgically amenable problem responsible for causing excess cognitive decline

Summarizing the Literature to Inform Patient Care

Considering only one of these streams of evidence yields an incomplete picture of plausible long-term cognitive outcomes after major surgery and anesthesia. If one attended only to anecdote or to primary cohort studies reporting incidences of 30% or more, surgery would rarely be undertaken; if one attended only to epidemiological studies, which provide strong evidence that the average cognitive impact of surgery/anesthesia is negligible, one would dismiss the credible, idiosyncratic adverse cognitive experiences of a small number of patients. A balanced interpretation must use elements of all four of these methodologies.

From anecdote, we know that there exist patients who experience functionally impactful, sustained (or permanent) cognitive decline following surgery/anesthesia. Primary cohort studies have characterized those deficits — and anecdotes confirm this — as particularly occurring in the cognitive domains of memory and executive function, allowing a patient-centered description of what the cognitive change experience might feel like. Asking questions about memory and executive function (e.g., planning and executing tasks) may offer a feasible way to screen patients with cognitive complaints for symptoms potentially consistent with POCD. But clearly, health events in general can be cognitively impactful for older patients — comorbid medical diseases are often associated with accelerated cognitive decline, and healthcare exposures like medical hospitalization, even though they are not thought to impact the brain directly, also are followed by a measurable decline in cognition. Surgery is thus not qualitatively different from other adverse, cognitively relevant health exposures an older patient may encounter in late life. Since surgery is often elective and occurs at a discrete time point, it offers a useful paradigm for studying pre-, intra-, and postoperative interventions hypothesized to be cognitively protective. While this line of research has not yet yielded broadly applicable therapies, it is exciting to think about what the future may hold for the role of perioperative care in supporting optimal cognitive health outcomes for older patients.

Conclusion

Most surgical patients will experience no long-term change in cognition, after an appropriate interval for recovery from the acute physical trauma of surgery. It is important to state clearly that uncomplicated surgery for appropriately selected older patients which effectively corrects a life-limiting or functionally impactful health issue is beneficial; older patients are not undertaking major surgery for trivial reasons. But for those who do experience a decline in cognition, healthcare professionals must neither dismiss nor catastrophize their symptoms. Healthcare exposures do result in cognitive change, and surgery is no exception. For many patients, short-term cognitive change will resolve as physiological healing occurs. However, patients experiencing durable cognitive decline deserve detailed characterization of that decline, ongoing discussions with their care team, cognitive rehabilitation, and a focus on identifying modifiable risk factors (e.g., managing cerebrovascular risk factors after subclinical perioperative stroke; identifying and treating postoperative wound infection; controlling hyperglycemia) to help facilitate cognitive stabilization or recovery.