Background

Cancer arises from uncontrolled cell proliferation and can develop anywhere in the human body (Farrior n.d.). It is a major concern for human health because it is one of the leading causes of death (Sung et al. 2021). Breast cancer (BC) is one of the most common types of cancer, with more than a million cases diagnosed each year (Lin et al. 2022). VCAN may be a promising biomarker for BC prognosis. The VCAN gene provides instructions for making versican, a chondroitin sulfate proteoglycan, that is, a protein with several sugar molecules attached to it. Versican is found in the extracellular matrix of many different tissues and organs and is important for changes in cell phenotype associated with development and disease (Jørgensen et al. 2022). The extracellular matrix is the intricate lattice of proteins and other molecules that forms in the spaces between cells (Antonius et al. 2022; Fadholly et al. 2020a, b, c; Yamaguchi 2000). Versican interacts with many proteins and molecules to facilitate the assembly of the extracellular matrix and ensure its stability (Wu et al. 2005). It is involved in cell adhesion, proliferation, and migration, and alterations in versican function can seriously disrupt these processes (Dituri et al. 2022). Overexpression of VCAN can drive uncontrolled cell proliferation. VCAN mRNA expression analysis has indicated that it can be a biomarker for breast cancer, brain cancer (Ricciardelli et al. 2009), bladder cancer, esophagus cancer (Thelin et al. 2012), lung cancer (Brunn et al. 2021), pancreas cancer (Skandalis et al. 2006) and stomach cancer (Setoguchi et al. 2011). Human BC is a heterogeneous disease with different neoplasms originating from the epithelial cells (Azam and Sounni 2022; Zhang et al. 2023).
Characterization of the molecular underpinnings of BC cells has led to the identification of prognostic tools that estimate therapeutic response and predict gene expression signatures, which can improve the long-term survival of patients (Munshi et al. 2022; Al Saber et al. 2022; Samad et al. 2020). Early prognosis of human BC can save many lives, and a potent biomarker can reveal BC in its early stages, when it can still be cured (Agostinetto et al. 2022; Biswas et al. 2021, 2023; Han et al. 2016; Kaddoura et al. 2023). VCAN is a potential biomarker for human BC because its expression is correlated with the disease. According to estimates for 2022, BC accounts for 15% of all cancer cases and for 43,250 deaths, which is 7.1% of all cancer deaths (https://seer.cancer.gov/). BC mortality continues to rise, and BC is one of the cancers that produce hardly any symptoms in the early stages (Azamjah et al. 2019; Biswas et al. 2022; Kumaratharan 2015). Numerous transcriptome studies have been conducted on VCAN, and the results indicate that VCAN expression is upregulated in a number of common solid malignancies, including ovarian, pancreatic, breast, lung, esophageal, and colorectal cancers (Hirani et al. 2021). However, research focusing specifically on VCAN overexpression is lacking (Fadholly et al. 2020a, b, c). The novel aspect of this study is therefore its comprehensive examination of VCAN transcriptional expression in breast cancer and its evaluation as a prognostic biomarker using in silico techniques, which may reveal important information about the molecular mechanisms underlying the development of breast cancer and identify a new biomarker for forecasting patient outcomes. This study has the potential to inform breast cancer research and clinical practice, guiding tailored treatment plans and improving patient outcomes.

Methods

Differential expression analysis

The UALCAN (http://ualcan.path.uab.edu/) (Chandrashekar et al. 2017, 2022), TIMER2.0 (http://timer.cistrome.org/) (Li et al. 2016, 2017, 2020; Ricciardelli et al. 2009), GEPIA2 (http://gepia2.cancer-pku.cn/#index) (Tang et al. 2019), and GENT2 (http://gent2.appex.kr/gent2/) (Park et al. 2019) databases were used to analyze VCAN mRNA expression in a number of cancers and normal tissues. The TCGA dataset on the UALCAN server includes 738 normal tissues and 8609 tumors. The TIMER2.0 server holds 10,656 tumor samples and 790 normal tissue samples. In GEPIA2, samples from the TCGA and GTEx projects comprise 8587 normal samples and 9736 tumor samples. GENT2 provides gene expression data across 72 different tissues and more than 68,000 samples.

Expression analysis of targeted biomarker in normal tissues and human BC-affected tissues

The GEPIA2 (Tang et al. 2019) and UALCAN (Chandrashekar et al. 2017, 2022) servers, both widely used for comparing expression between healthy and cancerous tissues, were used to assess the differences in VCAN mRNA expression between normal and BC-affected tissues. The GEPIA2 analysis used 291 normal tissue samples and 1085 human BC tumor samples from the matched TCGA and GTEx datasets, with cutoffs of 1 for |Log2FC| and 0.01 for the q-value under the ANOVA differential method. The UALCAN analysis used 114 TCGA normal tissue samples and 1097 TCGA human BC tumor samples.
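To illustrate how such cutoffs act as a filter, the following minimal Python sketch flags a gene as differentially expressed only when both thresholds are passed. The function name, the strict handling of boundary values, and the example values are assumptions made for illustration; they are not taken from GEPIA2 or from the data of this study.

```python
def is_differentially_expressed(log2fc, q_value, log2fc_cutoff=1.0, q_cutoff=0.01):
    """Return True when |log2FC| exceeds its cutoff and the q-value is below its cutoff.

    Default cutoffs mirror the values stated in the text (1 and 0.01).
    Treating values exactly at a cutoff as not significant is an assumption.
    """
    return abs(log2fc) > log2fc_cutoff and q_value < q_cutoff


# Hypothetical example values, for illustration only.
strong_change = is_differentially_expressed(log2fc=1.8, q_value=0.001)
weak_change = is_differentially_expressed(log2fc=0.4, q_value=0.001)
not_significant = is_differentially_expressed(log2fc=-2.0, q_value=0.2)
```

Taking the absolute value means that both upregulation and downregulation can pass the fold-change cutoff.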

Immunohistochemistry staining

Immunohistochemistry (IHC) is a technique used to detect and localize specific proteins in cells and tissues. Immunohistochemical staining of VCAN at the protein level was examined using the Human Protein Atlas (HPA) server (https://www.proteinatlas.org/) (Uhlén et al. 2015). IHC images of VCAN protein expression in normal tissues and human BC tissues, based on the GTEx dataset, were downloaded from the HPA server to compare the differences at the protein level.

Relationship between VCAN expression and clinical features of BC patients

The UALCAN server (Chandrashekar et al. 2017, 2022) was used with the TCGA dataset to assess VCAN mRNA expression according to the clinicopathological characteristics of the BC patients: major subclass, age, gender, race, menopause status, and individual cancer stage. The server allows normal tissues to be compared with tumor-affected tissues.

Promoter methylation of VCAN expression

Promoter methylation is a frequent epigenetic alteration that has been linked to several diseases, including cancer. It is believed to contribute to tumor growth by silencing genes essential for cell proliferation, differentiation, and repair. The connection between promoter methylation and overexpression of VCAN was examined using MEXPRESS (https://mexpress.be/) (Koch et al. 2015, 2019) and the UALCAN website. A p value of 0.05 or lower was regarded as significant.

Correlation of VCAN expression with mutation and copy number alteration

Somatic copy number can change as a result of unregulated cell division brought on by mutations. The cBioPortal server (https://www.cbioportal.org/) (version 5.3.3) (Cerami et al. 2012; Gao et al. 2013) was used to assess mutations and copy number alterations of the VCAN gene in BC patients. The analysis used 3835 samples from four TCGA datasets: (TCGA, Cell 2015), (TCGA, Firehose Legacy), (TCGA, Nature 2012), and (TCGA, PanCancer Atlas).

PPI network of VCAN-enriched genes

A protein–protein interaction (PPI) network is a graph that depicts the physical interactions that take place between proteins in a cell. Proteins are represented by the nodes of the graph, and the interactions between them by its edges.

PPI networks can be used to investigate the composition and operation of cells and to find new therapeutic targets. They can also support the development of new treatments and help clarify the underlying causes of disease. A PPI network comprising VCAN-enriched genes was generated using the STRING server (https://string-db.org/) (version 11.5) (von Mering et al. 2003) and Cytoscape software (version 3.9.1) (Paul Shannon et al. 1971). The VCAN-enriched gene dataset was obtained from the STRING server through Cytoscape.

Gene–drug interaction network of the VCAN

The study of gene–drug interaction networks has the potential to change how medications are developed and used. Understanding the complex connections between genes and medications can help create safer and more effective treatments for a variety of disorders. The VCAN gene–drug interaction network was created using the Comparative Toxicogenomics Database (CTD) (http://ctdbase.org/) (Mattingly et al. 2003) and Cytoscape software (version 3.9.1) (Paul Shannon et al. 1971). Datasets of several chemotherapeutic medicines that can affect VCAN were downloaded from the CTD database (last updated on March 31, 2023), which contains 2,690,443 curated chemical–gene interactions, and the interaction network was visualized using Cytoscape.

Results

Differential expression analysis of VCAN in several cancer types

Four web server-based platforms were used to examine VCAN expression in various cancer types. According to the UALCAN analysis, VCAN is upregulated in a variety of cancers, including those of the bladder, breast, bile duct, colon, esophagus, head and neck, liver, lung, and stomach (Fig. 1A). It is also upregulated in sarcoma, skin and thyroid cancers, and thymoma. Surprisingly, VCAN is downregulated in several malignancies, including uterine, pancreatic, and pancreaticobiliary cancers, pheochromocytoma, and paraganglioma (Fig. 1A), as well as cervical, kidney chromophobe (KICH), and prostate cancers. Three more servers, TIMER2.0, GEPIA2, and GENT2, were used to investigate VCAN expression in various cancer types; the results are shown in Fig. 1B–D, respectively. All four servers used the TCGA database for the analysis.

Fig. 1

VCAN gene expression pattern in several human cancers. A Comparison of cancer and normal tissues on the GEPIA2 server, where high and low mRNA expression are denoted by red and green, respectively. B Box plot from the UALCAN server showing the expression profile of the VCAN gene in 24 types of human malignancies, including tumors and matching normal tissue samples; the average expression values in tumor and normal tissues are shown in red and blue boxes, respectively. C Comparison of cancerous and normal tissues on the TIMER2.0 server, where high and low mRNA expression are denoted by red and blue, respectively. D Box plot of VCAN mRNA expression in tumors and their corresponding normal tissues based on the GENT2 database, where boxes indicate the median, dots indicate outliers, red boxes represent tumor tissues, and blue boxes represent normal tissues

VCAN expression analysis in normal and breast cancer-affected tissues

Using the TCGA dataset, UALCAN and GEPIA2 were used to analyze VCAN expression in normal and cancer-affected tissues. The GEPIA2 analysis used a total of 291 healthy and 1085 cancer-affected samples (Fig. 2A). The UALCAN analysis of normal and tumor-affected samples used 114 normal and 1097 tumor TCGA samples (Fig. 2B). VCAN expression across BC subtypes was also examined on the GEPIA2 server, which distinguished four subtypes (Fig. 2C): basal-like (291 samples), HER2 (66 samples), Luminal A (415 samples), and Luminal B (194 samples).

Fig. 2

VCAN expression analysis in normal and breast cancer-affected tissues. A, B Box plots of VCAN mRNA expression in breast cancer and normal tissues from the GEPIA2 and UALCAN servers. C Box plot of tissue-wise expression in different breast cancer subtypes on the GEPIA2 server, where gray denotes normal tissues and red denotes tumor-affected tissues

Immunohistochemistry (IHC) staining

Protein-level immunohistochemistry (IHC) of VCAN was examined using the Human Protein Atlas (HPA) server, and the VCAN expression data from this dataset were compared with the data acquired from the TCGA database. The analysis revealed strong IHC staining of VCAN in BC tissues, compared with moderate staining in normal tissues (Fig. 3).

Fig. 3

Comparison of VCAN gene expression and immunohistochemistry images in normal and tumor tissues. Breast invasive carcinoma (BRCA) tissues had significantly greater VCAN protein expression levels than normal tissues

Relationship between VCAN expression and clinical features of BC patients

The UALCAN server was used to investigate VCAN expression in relation to the clinical characteristics of BC patients and healthy individuals. The findings pointed to increased VCAN expression across patient groups defined by major subclass, age, gender, race, menopause status, and individual cancer stage (Fig. 4A–F). This analysis used 1065 TCGA samples with information on cancer stage, 967 samples with menopause status, 1087 samples with age, 1089 samples with gender, 988 samples with race, and 833 samples of patient’s major subclass with information on histologic subtypes. A total of 114 TCGA samples from healthy individuals served as controls. Although there were only a few cases of BC in male patients, VCAN expression was substantially higher in men than in women. VCAN expression is higher in the initial stage of cancer and appears slightly lower in the fourth stage. Patients aged between 21 and 40 had higher levels of VCAN expression than the other age groups.

Fig. 4

VCAN expression analysis in relation to clinical characteristics of BC patients. VCAN mRNA expression in BRCA by A breast cancer major subclasses, B patient's age, C patient's gender, D patient's race, E menopause status, and F individual cancer stages. Significant differences were assessed by comparing normal samples with each of the other groups

Correlation of promoter methylation with VCAN expression

Promoter methylation is a key factor in regulating gene expression. It is an important epigenetic process, and changes or irregularities in promoter methylation may contribute to diseases such as cancer. MEXPRESS and the UALCAN server were used to evaluate the relationship between promoter methylation and the overexpression of VCAN. The MEXPRESS results reveal a negative association between promoter methylation and VCAN upregulation (a p value less than 0.05 is considered significant) (Fig. 5A), which suggests that the overexpression of VCAN might be caused by hypomethylation. The UALCAN server examined promoter methylation in the TCGA samples using the beta value (Fig. 5B). The beta value represents the level of DNA methylation, which ranges from 0 (unmethylated) to 1 (fully methylated). Different beta value cut-offs have been considered to signify hyper-methylation [beta value: 0.7–0.5] or hypo-methylation [beta value: 0.3–0.25]. The UALCAN analysis used 97 normal and 793 tumor-affected TCGA samples.
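The beta value ranges above can be read as threshold bands, which the following minimal Python sketch makes explicit. It assumes that a beta value at or above a chosen hyper-methylation cut-off (taken here from the 0.5 to 0.7 band) is called hyper-methylated and that a value at or below a chosen hypo-methylation cut-off (from the 0.25 to 0.3 band) is called hypo-methylated. The function name, the default cut-offs, and the label for intermediate values are illustrative assumptions and do not describe how UALCAN classifies samples internally.

```python
def classify_beta(beta, hyper_cutoff=0.5, hypo_cutoff=0.3):
    """Classify a DNA methylation beta value (0 = unmethylated, 1 = fully methylated).

    The default cut-offs are assumptions chosen from the bands quoted in the text:
    0.5-0.7 for hyper-methylation and 0.25-0.3 for hypo-methylation.
    """
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta value must lie between 0 and 1")
    if beta >= hyper_cutoff:
        return "hyper-methylated"
    if beta <= hypo_cutoff:
        return "hypo-methylated"
    return "intermediate"


# Hypothetical beta values, for illustration only.
labels = [classify_beta(b) for b in (0.80, 0.40, 0.10)]
```

A stricter call can be obtained by passing the other end of each band, for example `classify_beta(0.6, hyper_cutoff=0.7, hypo_cutoff=0.25)`.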

Fig. 5

Correlation between promoter methylation and VCAN expression, analyzed using the MEXPRESS and UALCAN servers. A Relationship between the expression of VCAN and the methylation of its promoter from the MEXPRESS server. A negative sign indicates a negative association between VCAN expression and promoter methylation for a given probe at a particular CpG island, while a positive sign indicates a positive correlation between the two (p < 0.05). B Beta values of normal and primary tumor samples from the UALCAN server. The beta value represents the level of DNA methylation, which ranges from 0 (unmethylated) to 1 (fully methylated)

Copy number alteration and mutation

The cBioPortal server was used to assess mutations and copy number alterations. Four TCGA datasets of BC investigations yielded a total of 3835 TCGA samples (Fig. 6A). The findings revealed that VCAN was altered in a total of 75 samples, including 48 duplicate mutations in patients with multiple samples; of these, 58 were missense mutations, 13 were nonsense mutations, and four were splice mutations, with a somatic mutation frequency of 1.8% (Fig. 6B, C and Table 1). The breast invasive carcinoma (TCGA, Cell 2015) study had the highest alteration frequency, 3.06% of 817 samples. This comprised 2.2% mutation (18 cases), 0.73% deep deletion (six cases), and 0.12% multiple alterations (one case). According to the cBioPortal data, the overexpression of VCAN in BC may be correlated with mutation and copy number alteration.
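The alteration frequencies reported for the (TCGA, Cell 2015) study follow directly from the case counts and the 817 samples. The short Python sketch below reproduces that arithmetic; the variable names are illustrative, and the counts are those stated in the text.

```python
# Case counts for VCAN alterations in the (TCGA, Cell 2015) study, as stated in the text.
TOTAL_SAMPLES = 817
cases = {"mutation": 18, "deep deletion": 6, "multiple alterations": 1}

# Percentage of samples carrying each alteration type, rounded to two decimals.
percentages = {kind: round(100 * n / TOTAL_SAMPLES, 2) for kind, n in cases.items()}

# Overall alteration frequency: all altered cases over all samples.
total_cases = sum(cases.values())
total_pct = round(100 * total_cases / TOTAL_SAMPLES, 2)

for kind, pct in percentages.items():
    print(f"{kind}: {cases[kind]} cases, {pct}%")
print(f"overall: {total_cases} cases, {total_pct}%")
```

The three categories sum to 25 altered cases, which corresponds to the overall frequency of 3.06%.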

Fig. 6

VCAN mutations and genetic changes in BC tissue. A Bar diagram of the mutation rates and genomic changes in the VCAN gene. B Putative VCAN copy number alterations from GISTIC. C Mutation types, and samples without mutation, across mutation counts in the TCGA dataset

Table 1 VCAN mutation sites and types in BC from the TCGA datasets

PPI network construction and visualization

The VCAN-enriched genes were identified using the STRING database and the Cytoscape program to analyze protein–protein interactions. According to the functional interaction network analysis, VCAN was closely related to ten genes (Fig. 7 and Table 2). The coexpression scores of VCAN with CD44, FBLN1, FBN1, FN1, and GPC1 were 0.062, 0.081, 0.102, 0.097, and 0.062, respectively. In contrast, no coexpression was observed with SDC1, SDC4, SELL, TLR2, or TLR6.
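As a minimal illustration of the network structure described above, the following Python sketch stores VCAN and its ten partners as a graph in which proteins are nodes and interactions are edges, and then ranks the partners by coexpression score. Only the VCAN-to-partner edges and the scores stated in the text are included; edges among the partners themselves are omitted, and the function names and the use of None for an unreported score are illustrative assumptions rather than STRING or Cytoscape conventions.

```python
# Coexpression scores of VCAN with its ten partner genes, as stated in the text.
# None marks partners for which no coexpression was observed.
VCAN_COEXPRESSION = {
    "CD44": 0.062, "FBLN1": 0.081, "FBN1": 0.102, "FN1": 0.097, "GPC1": 0.062,
    "SDC1": None, "SDC4": None, "SELL": None, "TLR2": None, "TLR6": None,
}


def build_network(hub, scores):
    """Return an adjacency mapping: nodes are proteins, edges carry the score."""
    network = {hub: dict(scores)}
    for partner, score in scores.items():
        network[partner] = {hub: score}  # undirected edge, stored in both directions
    return network


def rank_coexpressed(scores):
    """Return (partner, score) pairs that have a score, highest score first."""
    reported = [(gene, s) for gene, s in scores.items() if s is not None]
    return sorted(reported, key=lambda pair: pair[1], reverse=True)


network = build_network("VCAN", VCAN_COEXPRESSION)
ranked = rank_coexpressed(VCAN_COEXPRESSION)
```

In this representation the number of entries under `network["VCAN"]` is the degree of the VCAN node, and `ranked` lists the five coexpressed partners in descending order of score.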

Fig. 7

PPI network of genes with VCAN enrichment

Table 2 Coexpression of VCAN-enriched genes

Gene–Drug interaction network analysis of the VCAN

The gene–drug interaction network was built using the CTD database and visualized using Cytoscape software to determine the association between VCAN expression and the numerous medicinal drugs that are currently on the market. Certain compounds may be able to regulate the expression of VCAN, as shown in Fig. 8 and Table 3. Aflatoxin B1 and bisphenol A increase VCAN methylation, and sodium arsenite also affects it.

Fig. 8

VCAN-targeted drugs from CTD's gene–drug interaction network. The number of arrows for each interaction shows the number of investigations that have supported that interaction

Table 3 Details of VCAN-targeted drugs from CTD's gene–drug interaction network

Discussion

This study used bioinformatics analysis to examine the VCAN expression pattern, assess its link to the prognosis of breast cancer, and predict the relationship between VCAN and other proteins as well as the interaction of VCAN with drugs. VCAN is a member of the versican/aggrecan family of proteoglycans. Versican is a proteoglycan protein, which means that several sugar molecules are attached to it (Dibha et al. 2022). Versican is found in the extracellular matrix of numerous organs and tissues (Yamaguchi 2000). The extracellular matrix is the complex web of substances, including proteins, that forms in the gaps between cells (Fadholly et al. 2020a, b, c; Kharisma et al. 2020; Widyananda et al. 2021). Versican promotes extracellular matrix assembly and ensures its stability by interacting with a wide variety of proteins and substances. Cell growth is correlated with VCAN expression, and VCAN has been labeled a cancer-causing gene after some studies suggested that it might be carcinogenic (Yu et al. 2003). VCAN expression levels have been linked to several tumors, including breast, ovarian, and lung cancers (Du et al. 2013). VCAN could therefore be a predictor of breast cancer. However, the part VCAN might play in tumor immunology and tumor growth remains unknown. We studied the expression levels of VCAN in breast cancer using the UALCAN, GENT2, TIMER, and GEPIA databases. VCAN was found to be highly expressed in breast tumors, which is consistent with earlier reports. We evaluated the expression of VCAN in normal samples and in various subtypes of breast cancer and found a significant association, suggesting that VCAN expression may be a viable diagnostic biomarker for breast cancer. Using TCGA samples, we also examined VCAN expression in BRCA in relation to clinical characteristics, namely cancer stage, menopause status, and patient’s age, gender and race.
Although only a few cases in males have been diagnosed, the results unexpectedly showed that VCAN expression is significantly higher in males than in females. The IHC analysis further showed that VCAN staining is stronger in tumor-affected tissues than in normal tissues, suggesting that VCAN may be a significant biomarker for breast cancer. Gains or losses in a genomic region can have either suppressive or cancer-promoting effects (Qiu et al. 2008). VCAN mutations, copy number changes, and mutant mRNA expression were investigated using the cBioPortal web server (Gao et al. 2013). Seventy-five altered samples were identified, with a somatic mutation frequency of 1.8%. We then examined the connection between DNA methylation and VCAN expression. Comparison of different DNA methylation clusters shows that VCAN expression increases as DNA methylation levels decrease. VCAN DNA methylation has also been shown to be decreased in human cancer. We further examined the protein–protein interaction network of VCAN using the STRING service (von Mering et al. 2003) and Cytoscape tools (Paul Shannon et al. 1971) to identify the VCAN-enriched genes. According to the functional interaction network analysis, VCAN interacts strongly with ten additional genes. Analysis of VCAN gene–drug interactions using the CTD database (Mattingly et al. 2003) and the Cytoscape tool further indicated that a number of compounds may be able to affect the expression of VCAN. For instance, sodium arsenite affects methylation, while aflatoxin B1 and bisphenol A increase it. In conclusion, the association between high VCAN expression and prognosis in breast cancer suggests that VCAN could be employed as a biomarker for BC prediction and diagnosis.

Conclusions

The analyses revealed that VCAN is overexpressed in the majority of human malignancies, suggesting a role for VCAN in the oncogenesis of these tumors. We investigated mRNA expression, DNA methylation, mutations and CNAs, associated genes, and prognostic features to assess the viability of VCAN as a potential prognostic marker in BC development. Our analysis indicates that VCAN overexpression is associated with the development of breast cancer. Taken together, these multi-omics analyses suggest that VCAN represents a novel diagnostic and prognostic biomarker for breast cancer. Further research is required to establish any therapeutic implications for the treatment of breast cancer.