Introduction

Osteoarthritis (OA) is a chronic disease primarily affecting the cartilage and tissues surrounding the joints, usually manifesting as joint pain, swelling, and dysfunction. The knee and hip are two commonly affected joints. Approximately 16% of individuals over the age of 45 exhibit radiographic evidence of knee OA, and about 10% have hip OA1,2. Early identification or treatment based on precise diagnostic markers, before patients develop the advanced symptoms and structural damage that make them candidates for total joint replacement, could substantially reduce costs and mortality3,4.

The extracellular matrix (ECM), comprising water, proteins, polysaccharides, and other molecules, plays a vital role in joint health by providing structural and cellular support5. The degradation of the ECM, particularly of collagen and proteoglycans (PG), constitutes a significant pathological process in OA that stems from various factors including inflammation, injury, and aging. Consequently, cartilage gradually fractures, potentially leading to bone-on-bone contact, friction, pain, and impaired joint function6,7. Studies have highlighted the associations between ECM degradation and increased protease activity, oxidative stress, and inflammatory factors8. Therefore, interventions aimed at these degradation processes may aid in the development of more effective therapies to slow down or halt the progression of OA9.

Cathepsins constitute a class of proteases divided into three main families based on the catalytic type: cysteine proteases (e.g., cathepsin B, K, L), aspartic proteases (e.g., cathepsin D), and serine proteases (e.g., cathepsin A)10,11. They play crucial roles in diverse biological processes, including protein degradation, regulation of cell signaling, immune response, and apoptosis, functioning both intracellularly and extracellularly12,13,14. Deficient or excessive expression of cathepsins can lead to various diseases and abnormal phenotypes, including cancer and arthritis10. Specifically, aberrant activity or expression of cathepsins is closely linked to the onset and progression of OA, contributing to ECM degradation15. Thus, cathepsins are acknowledged as potential diagnostic biomarkers and therapeutic targets for OA16.

Recent studies have elucidated the roles of several cathepsins in promoting the progression of OA. A recent study revealed that cathepsin B upregulation induced NOD-like receptor protein-3 (NLRP3) inflammasome activation, subsequently triggering chondrocyte (CH) pyroptosis and promoting the development of OA17. Cathepsin D has been found to play a crucial role in CH death18. An elevation in serum levels of cathepsin K, rather than D, is implicated in the pathogenesis of OA by stimulating bone resorption and cartilage degradation19. Moreover, cathepsin K is activated in CHs, osteoclasts, and synoviocytes in OA states, and its inhibition could be a potential therapeutic strategy for modifying OA disease20. Most research on cathepsins in OA to date has focused on their in situ expression in the knee, with findings frequently generalized to the hip, and this practice extends to treatment recommendations outlined in clinical guidelines. However, the roles of serum cathepsins may vary significantly between knee and hip OA, and the causality between individual cathepsins and the risk of knee or hip OA has not been adequately investigated.

As genomics advances, there is mounting evidence unveiling the role of heritability in disease pathogenesis. Mendelian randomization (MR) is a statistical method that relies on genome-wide association studies (GWAS) to examine potential causal relationships in observational data. Its basic idea is to exploit the random allocation of genetic variation in nature, employing genetic variants as instruments (instrumental variables, IVs) to estimate the causal effect of a risk factor on a health outcome21,22. These GWAS-identified variants may modify the amino acid sequence of a protein, which may affect the amount, stability, function, or physiological processing of cathepsins and thereby trigger the pathological process of OA. In this study, bidirectional MR analyses were performed to investigate the causal effects between serum cathepsins and OA (knee and hip). Our findings offer primary genetic evidence for novel diagnostic serum biomarkers and potential therapeutic targets for knee and hip OA treatment.

Materials and methods

Study design

Two-sample MR is a technique that uses genetic variants as IVs to establish causal relationships between exposure phenotypes and outcomes. It addresses common limitations of observational studies by utilizing publicly available datasets from large-scale GWAS on both exposures and outcomes. The study design is shown in Fig. 1. Three hypotheses underpin the design of this study: The first hypothesis posits a strong correlation between single nucleotide polymorphisms (SNPs) and exposure. The second hypothesis asserts that the selected SNPs should be unaffected by confounding variables. The third hypothesis suggests that SNPs should not be directly linked to the outcome; rather, their impact should only be mediated through exposure23,24.
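The role of a single SNP as an instrument can be illustrated with the Wald ratio, the standard single-instrument MR estimator: the SNP-outcome effect divided by the SNP-exposure effect. The sketch below is illustrative only; the function name and the summary statistics are hypothetical and are not taken from the study data, and the standard error uses a first-order approximation that ignores uncertainty in the SNP-exposure effect.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-SNP causal estimate: SNP-outcome effect / SNP-exposure effect."""
    if beta_exposure == 0:
        raise ValueError("SNP has no effect on the exposure; it cannot be an instrument")
    estimate = beta_outcome / beta_exposure
    # First-order delta-method SE (assumption: beta_exposure is treated as fixed).
    se = se_outcome / abs(beta_exposure)
    return estimate, se


# Hypothetical summary statistics for one SNP (not from the study).
est, se = wald_ratio(beta_exposure=0.20, beta_outcome=0.016, se_outcome=0.006)
print(f"log-OR per unit exposure: {est:.3f} (SE {se:.3f}), OR = {math.exp(est):.3f}")
```

When several SNPs are available, their ratio estimates are combined by a meta-analytic method such as IVW.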

Figure 1

MR design for causal analysis of serum cathepsins and knee/hip OA on genetic predisposition.

Data sources

The summary-level data used in the MR analysis were derived from GWAS datasets. SNPs associated with the levels of various serum cathepsins (µg/L) were obtained from healthy donors in the INTERVAL study, which included 3301 European individuals25. Supplementary Table S1 presents the GWAS IDs of the serum cathepsin traits. The knee OA (GCST007090) and hip OA (GCST007091) datasets were obtained from the GWAS by Tachmazidou et al.26. The knee OA dataset includes 24,955 cases and 378,169 controls of European ancestry, with a total of 29,999,696 SNPs. The hip OA dataset comprises 15,704 cases and 378,169 controls, also of European ancestry, with 29,771,219 SNPs. All datasets can be accessed at the IEU OpenGWAS project (https://gwas.mrcieu.ac.uk).

Selection of instrumental variables

Due to the limited availability of SNPs, we selected a significance threshold of P < 5 × 10⁻⁶ for screening the SNPs of serum cathepsins, so as to obtain enough SNPs for subsequent analysis. To mitigate bias resulting from linkage disequilibrium (LD), we employed SNP clumping at a distance of 10,000 kb with an r² threshold of 0.001 (refs. 27,28). Palindromic SNPs were also excluded from our analysis. To follow the third hypothesis of MR, we further examined and excluded SNPs that were strongly associated with the outcome (P < 5 × 10⁻⁸).
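The filtering steps described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the `Snp` record, the function names, and the example SNP identifiers are hypothetical, LD clumping is omitted because it requires an LD reference panel, and all A/T and C/G SNPs are dropped regardless of allele frequency, which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    rsid: str
    effect_allele: str
    other_allele: str
    p_exposure: float  # P value for association with the cathepsin level
    p_outcome: float   # P value for association with OA

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_palindromic(snp):
    # A/T and C/G SNPs look identical on both strands, so strand is ambiguous.
    return COMPLEMENT.get(snp.effect_allele.upper()) == snp.other_allele.upper()


def select_instruments(snps, p_exposure_max=5e-6, p_outcome_min=5e-8):
    """Apply the exposure threshold, palindrome filter, and outcome exclusion."""
    kept = []
    for snp in snps:
        if snp.p_exposure >= p_exposure_max:
            continue  # not associated strongly enough with the exposure
        if is_palindromic(snp):
            continue  # ambiguous strand
        if snp.p_outcome < p_outcome_min:
            continue  # directly associated with the outcome
        kept.append(snp)
    return kept


# Hypothetical candidates (identifiers and P values are made up).
candidates = [
    Snp("rs_a", "A", "G", 1e-7, 0.30),
    Snp("rs_b", "A", "T", 1e-8, 0.50),
    Snp("rs_c", "C", "T", 1e-4, 0.20),
    Snp("rs_d", "G", "A", 2e-7, 1e-9),
]
selected = select_instruments(candidates)
print([s.rsid for s in selected])
```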

Given the comparatively lenient criteria, F statistics were computed to gauge weak instrument bias and the sample overlap effect; SNPs with F < 10 were deemed weak instruments29. Furthermore, we conducted a reverse MR analysis, in which knee and hip OA were the exposures and cathepsins were the outcomes. For SNP selection in the reverse analysis, we applied a threshold of P < 5 × 10⁻⁸, with all other criteria as described above.
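The text does not state which formula was used for the F statistic. Two commonly used forms are sketched below as an assumption: a per-SNP approximation from the SNP-exposure effect and its standard error, and the general form based on the variance explained (R²), the sample size N, and the number of instruments k. The input values other than N = 3301 (the INTERVAL sample size) are hypothetical.

```python
def f_from_beta_se(beta, se):
    """Per-SNP approximation: F = (beta / se)^2."""
    return (beta / se) ** 2


def f_from_r2(r2, n, k=1):
    """General form: F = R^2 (N - k - 1) / (k (1 - R^2))."""
    return r2 * (n - k - 1) / (k * (1.0 - r2))


# Hypothetical SNP-exposure effect and standard error.
f_snp = f_from_beta_se(beta=0.35, se=0.075)

# Hypothetical variance explained, with N taken from the exposure GWAS.
f_r2 = f_from_r2(r2=0.007, n=3301)

for label, value in (("beta/se", f_snp), ("R^2", f_r2)):
    flag = "weak" if value < 10 else "acceptable"
    print(f"F ({label}): {value:.2f} -> {flag}")
```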

Statistical analysis

In this study, we estimated the causal relationship between serum cathepsins and knee/hip OA using several MR methods. The random-effects inverse-variance weighted (IVW) method, a traditional approach in MR analysis, served as the primary technique. It computes a weighted average of the SNP-specific estimates, using the inverse of the variance of each IV as the weight, so that all IVs contribute to the analysis; its results are most reliable when all IVs are valid30. MR-Egger and weighted median (WM) methods were used as supplementary analyses. MR-Egger can provide robust causal estimates in the presence of pleiotropy, whereas the WM method is less sensitive to outliers and measurement error. The weighted mode and simple mode methods were additionally used to check whether the direction of effect was consistent with that of the primary method31,32. Agreement in effect direction between these four methods and the IVW method further supports the validity of the results.

We also conducted sensitivity tests to assess the robustness of our findings. Heterogeneity was assessed using Cochran's Q test33. Genetic pleiotropy was assessed using the MR-Egger intercept34 and the MR-PRESSO (Mendelian Randomization Pleiotropy RESidual Sum and Outlier) test35. In addition, the "leave-one-out" sensitivity test and funnel plots were employed to evaluate the stability of the MR results. Statistical analyses were conducted using the "TwoSampleMR" package (version 0.5.7) in R (version 4.3.1). Results were considered statistically significant at P < 0.05.
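The following sketch illustrates the quantities named above: an IVW estimate with a multiplicative random-effects standard error, Cochran's Q, and the MR-Egger slope and intercept. It is a simplified illustration in Python under stated assumptions, not the TwoSampleMR implementation used in the study; the function names and the example summary statistics are hypothetical, and P values for Q and for the intercept are omitted.

```python
import math


def ivw(beta_x, beta_y, se_y):
    """Random-effects (multiplicative) IVW estimate, its SE, Cochran's Q, and df."""
    weights = [bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y)]
    ratios = [by / bx for bx, by in zip(beta_x, beta_y)]
    total_w = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_w
    # Cochran's Q: weighted squared deviations of SNP estimates from the pooled one.
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    df = len(ratios) - 1
    # Inflate the fixed-effect SE only when there is over-dispersion (Q > df).
    scale = math.sqrt(max(1.0, q / df)) if df > 0 else 1.0
    return estimate, scale / math.sqrt(total_w), q, df


def egger(beta_x, beta_y, se_y):
    """MR-Egger: weighted regression of SNP-outcome on SNP-exposure effects."""
    # Orient every SNP so that its exposure effect is positive.
    signs = [1.0 if bx >= 0 else -1.0 for bx in beta_x]
    x = [s * bx for s, bx in zip(signs, beta_x)]
    y = [s * by for s, by in zip(signs, beta_y)]
    w = [1.0 / sy ** 2 for sy in se_y]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # a non-zero intercept hints at directional pleiotropy
    return slope, intercept


# Hypothetical summary statistics for four SNPs (not from the study).
bx = [0.10, 0.20, 0.15, 0.30]
by = [0.008, 0.016, 0.012, 0.024]
sy = [0.004, 0.005, 0.004, 0.006]

est, se, q, df = ivw(bx, by, sy)
slope, intercept = egger(bx, by, sy)
print(f"IVW log-OR {est:.3f} (SE {se:.3f}), Q = {q:.3f} on {df} df")
print(f"Egger slope {slope:.3f}, intercept {intercept:.4f}")
```

In this toy example every SNP implies the same ratio, so Q is essentially zero and the Egger intercept is essentially zero; real data would show scatter around the pooled estimate.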

Results

Exploration of the bidirectional causal relationship between serum cathepsins and knee OA

Our IVW results, incorporating nine cathepsins as exposures, indicated that the serum cathepsin O level was positively associated with knee OA (OR 1.08, 95% CI 1.03–1.13, P = 0.001), suggesting that elevated serum cathepsin O levels may increase the risk of knee OA (Figs. 2A, 3B). The weighted median method showed similar results (OR 1.07, 95% CI 1.01–1.13, P = 0.016). Furthermore, the consistent direction of effect across all five methods enhanced the credibility of the findings (Fig. 3A). Cochran's Q test indicated no heterogeneity (IVW pval = 0.994, MR-Egger pval = 0.993). No evidence of pleiotropy was detected using the MR-Egger intercept (pval = 0.564) or MR-PRESSO (pval = 0.992) (Table 1). MR-PRESSO analysis also revealed no outliers. In the leave-one-out analysis, no single SNP of cathepsin O exhibited a significant effect on the MR results (Fig. 3C). The funnel plot showed no obvious heterogeneity between IVs (Fig. 3D).
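As a rough plausibility check, the reported odds ratio, confidence interval, and P value can be related to one another through the log-odds scale: OR = exp(β), 95% CI = exp(β ± 1.96 × SE), and a two-sided P value from z = β / SE. The sketch below back-calculates the SE from the rounded IVW interval for cathepsin O and knee OA; because the published values are rounded, the result is only approximate, and the helper names are hypothetical.

```python
import math


def se_from_ci(lower, upper, z_crit=1.96):
    """Recover the SE of the log-OR from a reported 95% CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z_crit)


def summarize(beta, se, z_crit=1.96):
    """Convert a log-OR and SE into OR, CI bounds, and a two-sided normal P value."""
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return odds_ratio, lower, upper, p_value


# Rounded values reported for cathepsin O and knee OA (IVW).
beta = math.log(1.08)
se = se_from_ci(1.03, 1.13)
odds_ratio, lower, upper, p_value = summarize(beta, se)
print(f"OR {odds_ratio:.2f}, 95% CI {lower:.2f}-{upper:.2f}, P = {p_value:.4f}")
```

The recomputed P value is of the same order as the reported P = 0.001, which is what one would expect given rounding of the interval bounds.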

Figure 2

(A) The forest plot of the causal effect of cathepsins on Knee OA; (B) The forest plot of the causal effect of Knee OA on cathepsins.

Figure 3

The MR results of cathepsin O on knee OA. (A) Scatter plot showing the causal effect of cathepsin O on knee OA. (B) Forest plot for the overall causal effects of cathepsin O on knee OA. (C) Leave-one-out plot for the causal effect of cathepsin O on knee OA. (D) Funnel plot of SNPs related to cathepsin O and knee OA.

Table 1 Evaluation of heterogeneity and directional pleiotropy of causal effect between cathepsin O and knee osteoarthritis using different methods.

To further explore whether knee OA leads to an elevation in serum cathepsin levels, we conducted reverse MR analyses using the same methodology, but the findings did not indicate any causal effect (Fig. 2B). Thus, from a genetic perspective, excessive elevation of serum cathepsin O increases the risk of knee OA. However, serum cathepsin levels are not necessarily elevated in individuals with knee OA, and there is no bidirectional causal relationship between them.

Exploration of the bidirectional causal relationship between serum cathepsin H and hip OA

To ascertain whether serum cathepsins exhibit the same causal relationship with hip OA as with knee OA, we conducted the same MR analyses and identified a causal effect of serum cathepsin H on hip OA (IVW OR 1.05, 95% CI 1.00–1.10, P = 0.031), indicating that cathepsin H might be a detrimental factor for the onset of hip OA (Figs. 4A, 5B). The remaining four MR methods corroborated these findings by demonstrating consistent effect trends, thereby enhancing the robustness of the results (Fig. 5A). Additionally, Cochran's Q test revealed no heterogeneity (IVW pval = 0.665, MR-Egger pval = 0.677). Results from the MR-Egger intercept (pval = 0.315) and MR-PRESSO (pval = 0.664) indicated no evidence of pleiotropy (Table 2). No outliers were detected using MR-PRESSO. Furthermore, the leave-one-out analysis and funnel plot provided additional support for the reliability of the results (Fig. 5C, D).
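The leave-one-out analysis mentioned above re-estimates the pooled effect after dropping each SNP in turn, so that a single influential instrument becomes visible. The sketch below illustrates the idea with a plain IVW point estimate; it is not the TwoSampleMR routine used in the study, and the SNP labels and summary statistics are hypothetical.

```python
def ivw_estimate(beta_x, beta_y, se_y):
    """IVW point estimate: weighted regression of beta_y on beta_x through the origin."""
    num = sum(bx * by / sy ** 2 for bx, by, sy in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / sy ** 2 for bx, sy in zip(beta_x, se_y))
    return num / den


def leave_one_out(snp_ids, beta_x, beta_y, se_y):
    """Return the IVW estimate obtained when each SNP is excluded in turn."""
    results = {}
    for i, rsid in enumerate(snp_ids):
        keep = [j for j in range(len(snp_ids)) if j != i]
        results[rsid] = ivw_estimate(
            [beta_x[j] for j in keep],
            [beta_y[j] for j in keep],
            [se_y[j] for j in keep],
        )
    return results


# Hypothetical data: the last SNP implies a much larger effect than the others.
ids = ["snp1", "snp2", "snp3", "snp4"]
bx = [0.10, 0.20, 0.15, 0.30]
by = [0.008, 0.016, 0.012, 0.060]
sy = [0.004, 0.005, 0.004, 0.006]

loo = leave_one_out(ids, bx, by, sy)
for rsid, estimate in loo.items():
    print(f"without {rsid}: IVW log-OR = {estimate:.3f}")
```

In this toy example the estimate shifts markedly only when the fourth SNP is removed, which is the pattern a leave-one-out plot is designed to expose; the study reports no such influential SNP.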

Figure 4

(A) The forest plot of the causal effect of cathepsins on Hip OA; (B) The forest plot of the causal effect of Hip OA on cathepsins.

Figure 5

The MR results of cathepsin H on hip OA. (A) Scatter plot showing the causal effect of cathepsin H on hip OA. (B) Forest plot for the overall causal effects of cathepsin H on hip OA. (C) Leave-one-out plot for the causal effect of cathepsin H on hip OA. (D) Funnel plot of SNPs related to cathepsin H and hip OA.

Table 2 Evaluation of heterogeneity and directional pleiotropy of causal effect between cathepsin H and hip osteoarthritis using different methods.

In the reverse MR results, when hip OA was considered as the exposure, no causal relationship was found with the serum cathepsins (Fig. 4B). Therefore, significant elevation of serum cathepsin H raises the risk of hip OA. Nonetheless, there is also no bidirectional causal association between them.

Discussion

Nowadays, the rise in rates of hip and knee joint replacements in the late stage of OA underscores the urgent need for early diagnosis and non-surgical treatments. Hip OA differs from knee OA in various aspects2. However, most research has primarily focused on the knee or on mixed populations with hip and knee OA. This may limit our understanding of hip OA-specific disease features and raise concerns about the external validity of treatment responses, thereby constraining the development and implementation of successful hip OA therapies. Therefore, the present study identified the pathological factors associated with hip and knee OA separately and observed the potential distinctions.

Cathepsins, as well as other catabolic cytokines and enzymes such as interleukin-6/17 (IL-6/17) and matrix metalloproteinases (MMPs), have been identified as responsible for ECM destruction and remodeling in the OA process5. However, no studies have examined their functions separately in knee and hip OA. Over the past decade, several MR studies have investigated causal variables associated with OA, such as inflammatory, hormonal, and nutrient-related exposures36. The rapid expansion of large-scale GWAS focusing on OA risk factors, coupled with the accessibility of summary results, has facilitated the development of genetic tools for MR research.

In this work, we employed this genetic tool to explore the causal relationship between nine distinct serum cathepsins and the risk of hip and knee OA. This marks the first large-scale genetic consortia-based MR investigation into the causal association between serum cathepsins and OA. To gain a more comprehensive understanding of the complex biological relationship between cathepsins and hip or knee OA, we employed a bidirectional MR approach37 that considers causality in both directions. This approach helps to capture potential bidirectional influences and feedback mechanisms between variables, thereby enhancing the reliability of inferences. Integrating the findings, we concluded that serum cathepsin O is a significant risk factor for knee OA and cathepsin H for hip OA, which complements earlier research. Although many other cathepsins have been reported to be activated in OA, MR analysis of their serum levels revealed no significant causal association with knee or hip OA, which may be due to complex genetic and environmental factors and gene-environment interactions.

Cathepsin O is a cysteine protease reported to participate in immune response38,39 and tumor metastasis40,41, but it has not gained as much attention as other cathepsins in OA. Cathepsin H, a lysosomal cysteine protease known for its specific aminopeptidase activity42, is implicated in pathological processes such as tumor metastasis43, inflammatory response44, neurotransmitter production45, and ECM degradation46. Our findings revealed that serum cathepsin O raised the risk of knee OA, and cathepsin H increased the risk of hip OA. The reliability of the results was supported by different MR methods as well as sensitivity analyses. Furthermore, the negative results of the reverse MR analyses suggest that the presence of knee or hip OA does not necessarily indicate elevated serum cathepsin levels. Therefore, abnormally increased serum cathepsin O and cathepsin H may serve as diagnostic biomarkers for early knee and hip OA, respectively.

Serum proteins play crucial roles in biological processes and serve as direct targets for numerous drugs25. Presently, drugs targeting cathepsin O or H comprise endogenous peptide inhibitors, along with both natural and synthetic low-molecular-weight inhibitors42. Suppression of cathepsin O or H by appropriate drugs might decrease the risk of knee and hip OA.

Strengths and limitations: As a bidirectional MR study, the current work reduces confounding bias, because SNPs are randomly assigned at conception, and guards against reverse causality. Nevertheless, some limitations remain. First, we relaxed the thresholds for the selection of instrumental variables in order to obtain a sufficient number of SNPs for analysis, which may have contributed to the relatively small statistical power. Second, the expression of cathepsins may also be influenced by age and gender, and their relationship with OA could be nonlinear. However, because individual-level data were unavailable, further statistical analysis could not be conducted.

Conclusion

To our knowledge, few studies have investigated the direct effect of serum cathepsins in OA, especially cathepsin O and H. The present data, for the first time, provide genetic evidence for a causal role of serum cathepsin O in knee OA and of cathepsin H in hip OA, both of which could serve as diagnostic serum biomarkers. However, their regulation by upstream regulators remains largely unknown. Further research is necessary to confirm whether cathepsin O and H levels are balanced between the joint and the peripheral circulation. Moreover, significant attention should be devoted to determining how precisely modifying pathologically elevated cathepsins with pharmacological and non-pharmacological strategies could affect the incidence and progression of OA across different joints based on their distinct characteristics.