Introduction

Cancer is an alarming health concern and a leading cause of death worldwide1. Conventional clinical therapies such as chemotherapy, radiotherapy, immunotherapy, surgical intervention and targeted therapy are widely used for cancer treatment2. However, these methods damage normal cells and cause severe side effects such as infections, bleeding, and pronounced immunosuppression3,4. Research has shown that therapeutic peptides such as anticancer peptides (ACPs) and host defense peptides (HDPs) offer high selectivity and specificity, making them favorable, safe drug agents against cancer5. HDPs, found in both amphibians and plants, can recognize cancer cells in breast cancer, melanoma, and lung cancer with minimal drug resistance6. Similarly, D-K6LP, a polypeptide, exhibits anticancer activity by interacting electrostatically with phosphatidylserine exposed on the surface of cancer cells7. Among these agents, however, ACPs are considered a competent alternative for developing anticancer vaccines owing to their diverse advantages, such as low toxicity, high specificity, and minimal activation of innate immunity.

ACPs are short polypeptide sequences (roughly 5–30 residues) known for their anticancer activity. Typically, an ACP consists of 10–50 amino acids (AAs) and exhibits a complex structure, functioning as a molecular polymer of AAs and proteins8. Polymerization occurs through peptide bonds connecting several to dozens of AAs. ACPs can disrupt the structure of tumor cell membranes, thereby impeding the proliferation and migration of cancer cells. Moreover, they can induce apoptosis in cancer cells without harming normal human cells9. The unique ability of ACPs to interact specifically with the anionic membrane components of cancer cells enables them to eliminate cancer cells selectively, with minimal impact on normal cells10. Additionally, certain ACPs, such as cell-penetrating peptides or peptide drugs, can inhibit the cell cycle or other cellular functions, enhancing their safety profile compared to traditional broad-spectrum drugs11. These attributes have made ACPs a highly competitive therapeutic choice compared to small molecules and antibodies. Recent research indicates that ACPs are selective toward cancer cells while leaving normal physiological functions unaffected12,13, making them a promising therapeutic approach for cancer treatment. Over the past decade, numerous peptide-based therapies targeting various tumor types have been assessed and are presently under evaluation across various stages of preclinical and clinical trials14,15. This underscores the significance of developing novel ACPs for cancer treatment. Nonetheless, only a limited number of these ACPs may ultimately progress to clinical use because of the rigorous selection process16. Moreover, validating potential new ACPs through in vitro or in vivo methods is both time-consuming and expensive, further compounded by limitations in laboratory resources17.
Considering the significant therapeutic potential of ACPs in biomedical applications, there is a pressing demand for high-throughput, rapid, and cost-effective discovery of ACPs using computational methods.

In recent years, the abundance of peptide sequences accumulating in the postgenomic era has sparked researchers' interest in applying artificial intelligence, particularly machine learning (ML)18,19,20,21,22,23,24,25, ensemble learning (EL)26,27,28,29,30,31 and deep learning (DL) methods32,33,34,35,36,37,38,39, to ACP identification. Among ML-based ACP predictors, support vector machines (SVM) and random forests (RF) were the most widely adopted classifiers; for example, the iACP18, AntiCP19, cACP23, and AntiCP_2.024 predictors were developed using traditional ML classifiers. Similarly, EL-based methods adopted majority-voting and stacking ensemble strategies over multi-feature representations to predict ACPs; examples include ACPred-FL26, StackACPred30, and ACPpred-Fuse28. Moreover, advanced DL-based methods further improved the prediction of ACPs and non-ACPs from sequence information. For example, S. Ahmed et al. proposed a multi-headed deep convolutional neural network (ACP-MHCNN)33, Z. Lv proposed iACP-DRLF (identifying anticancer peptides via deep representation learning features)38, H. C. Yi proposed a deep learning-based long short-term memory model (ACP-DL)40, and W. Zhou developed the tri-fusion neural network (TriNet)36 for predicting ACP activity. For more details, readers are referred to comprehensive review articles on existing ACP tools41,42.

Against this backdrop, in this study we develop PLMACPred, a protein-language-model-based approach for identifying and characterizing ACP activity (see Fig. 1). First, we used four feature representation schemes, namely (a) composite protein sequence representation (CPSR), (b) histogram of oriented gradients over the position-specific scoring matrix (HOG-PSSM), (c) ProtT5, and (d) ESM-2, to extract features from ACP sequences. Then, 2D wavelet denoising (WD) was employed on the extracted features to remove noise and enhance the prediction of the proposed model. Finally, the denoised features were fed into an upgraded cascade deep forest (CDF) classifier to build the final ACP prediction model. The proposed PLMACPred tool reaches superior accuracy of up to 94.76%, 96.39% and 98.65% on three independent datasets, i.e., ACPmain, ACPAlter, and ACP740. The contributions of this work can be highlighted as follows:

  (a) We extracted local and global features from peptide sequences using transformer-based models, i.e., complementary embedding techniques through ProtT5 and ESM-2, evolutionary features through HOG-PSSM, and compositional features through the CPSR encoding method.

  (b) We implemented a 2D WD algorithm to effectively denoise the extracted feature vectors and enhance the prediction performance of the proposed ACP prediction model.

  (c) We developed PLMACPred, a modified cascade deep forest-based model using the above hybrid feature set, and obtained the best accuracy, outperforming existing models on three benchmark datasets.

Figure 1

The schematic diagram of the PLMACPred method.

Results and discussion

Classifier performance using various feature encoding schemes before and after applying Wavelet Denoising method

In this article, we used four types of single-view features, CPSR, ESM-2, ProtT5 and HOG-PSSM, and their combinations (multi-view features). We use the notation F1, F2, F3 and F4 to represent the CPSR, HOG-PSSM, ESM-2 and ProtT5 features in Tables 1 and 2, respectively. We used SVM and NB as baseline models and the cascade deep forest (CDF) as the proposed learning model for ACP prediction, as depicted in Fig. 1. We used individual features as well as their combinations for building the ML models. During training, we utilized three benchmark datasets, namely ACPmain, ACPAlter and ACP740. After applying the various encoding schemes, we implemented the 2D wavelet denoising algorithm to reduce noise and enhance the classifiers' prediction performance. Table 1 reports the prediction performance on the benchmark datasets ACPmain, ACPAlter and ACP740 using 10-fold CV with and without the 2D WD algorithm. Similarly, Table 2 shows the prediction performance on the three independent datasets with and without the 2D WD method.

Table 1 Performance of the model on the training set.
Table 2 Performance of the model on the test set.

Table 1 shows that the prediction performance of the CDF classifier using single features, particularly ProtT5 and ESM-2, and hybrid features (F1+F2+F3) and (F1+F2+F4) increased significantly with 2D WD. On ACPmain, CDF achieved the highest Acc of 0.992 with an MCC of 0.953 under 10-fold CV, and an Acc of 0.948 with a corresponding MCC of 0.896 on the independent dataset (Table 2). On the second dataset, ACPAlter, the CDF classifier again afforded the highest performance, with an Acc of 0.988 and an MCC of 0.996 using hybrid features after applying the 2D WD method. The CDF classifier was also validated on the second independent dataset and attained an Acc of 0.990 and an MCC of 0.995 (Table 4). In contrast, the worst performance was achieved by the NB classifier using the HOG-PSSM feature encoding scheme. We further validated and compared the prediction performance of the same feature sets using the CDF, SVM and NB learning engines on ACP740. Again, the CDF model beat the other classifiers using the hybrid feature combination F1+F2+F4, affording an Acc of 0.992 and an MCC of 0.983 on the training dataset (Table 3) and an Acc of 0.987 with an MCC of 0.973 on the independent dataset (Table 2).

Table 3 Performance comparison with existing methods on the ACPmain independent dataset.

Based on the above results, we can make three observations. First, the best classification performance was achieved by the CDF classifier, an ensemble-based model. Second, the 2D WD method helped enhance the overall performance of the model (Fig. 2). Third, fusing the evolutionary-based, physicochemical-based and deep embedding-based features helped improve ACP prediction.

Figure 2

Performance of PLMACPred on the ACPmain independent dataset with and without WD. Only the best-performing feature combination, ProtT5 + CPSR + HOG-PSSM, is highlighted. After WD, performance improves for all experimental setups.

Comparison with existing methods

To understand the strengths and weaknesses of a newly designed method, it is important to compare it with state-of-the-art methods. For this purpose, we compared the performance of the proposed PLMACPred model with existing ML-based and DL-based ACP predictors on three independent datasets, namely ACPmain, ACPAlter and ACP740. Table 3 lists the prediction results for ACP identification on the ACPmain test dataset for previous models, i.e., iACP18, ACP-MHCNN33, iACP-DRLF38, AntiCP_2.024, AntiCP8, ACPred22, ACPred-FL26, ACPpred-Fuse28, ACP-check39, TriNet36 and ACPPfel37. Figure 3a–c shows the success rates of the various computational ACP predictors in terms of Acc, Sen, Spe, and MCC for the three datasets. From the prediction outcomes in Table 3 and Fig. 3a, it is clear that the overall efficacy of PLMACPred is superior to existing ACP tools, with the best performance in terms of Acc (96.60%), Sn (94.80%), Sp (97.10%) and MCC (0.896). These scores indicate that our proposed method surpasses recently developed ACP predictors such as ACPPfel, TriNet and ACP-check by up to 18.53% in Acc, 13.51% in Sn, 18.34% in Sp and 30% in MCC.

Figure 3

Performance comparison between PLMACPred and other models on the (a) ACPmain, (b) ACPAlter, and (c) ACP740 datasets.

To further validate the robustness of the proposed method, PLMACPred was compared with five deep learning models and six ensemble-based ML ACP models on the ACPAlter independent dataset. The detailed comparison outcomes are shown in Table 4 and Fig. 3b. We can observe that the PLMACPred model achieved significantly better performance than ACP-MHCNN, iACP-DRLF, AntiCP_2.0, ACP-DL, ACP-check, ME-ACP, TriNet and ACPPfel. For example, the optimal performance of PLMACPred on this dataset is an Acc of 99.00% and an MCC of 0.989. The second-best performance was attained by TriNet, with an Acc of 96.60% and an MCC of 0.871. The ACPPfel method obtained results comparable to other developed tools, i.e., ME-ACP and ACP-check, with respect to Acc and MCC. Overall, PLMACPred produced dominant performance, improving Acc, Sen, Spe and MCC by 2.4–6%, 1.2–8%, 4.8–6.5% and 5.9–12.90%, respectively.

Table 4 Performance comparison with existing methods on the ACPAlter independent dataset.

For a more intuitive comparison, we considered another verified independent dataset, ACP740. The validation results of multiple DL-based predictors on ACP740 are depicted in Table 5 and Fig. 3c. Again, the proposed PLMACPred model outperforms ACPPfel, TriNet, ME-ACP and ACP-check, with increases of 7.59% in Acc, 6.11% in Sen, 5.88% in Spe, and 13.47% in MCC. Thus, the empirical outcomes demonstrate that PLMACPred produces promising performance on all three datasets in terms of all performance indicators, namely Acc, MCC, AUC, Spe and Sen.

Table 5 Performance comparison with existing methods on the ACP740 independent dataset.

Motif-based analysis and feature contribution in ML model

Sequence motifs are conserved regions across a sequence collection, often linked to the function of certain genes or peptides. We discovered sequence motifs in the collection of ACPs compared to non-ACPs using the MEME tool43. Figure 4 shows the top-ranked motifs from ACPs in the main training dataset. Certain enriched motifs, such as FLPYLAGVAAKVLPKIFCKIT, IPCGESCVFIPCITP and GCSCKSKVCYR, were found exclusively in ACPs, whereas FAKKLAKLAKK, RKAFRWAWRMLKKAA and DTPLDLAIQHLQRLTIQELPDPPTDLPE were exclusive to non-ACPs. From the top motifs, we can observe that the ACP motifs are mainly enriched in C, F, L and Y, whereas the non-ACP motifs are dominantly enriched in A, D, K, P, Q, R and W. Supplementary Table S1 lists the exclusive motifs in ACPs and non-ACPs.

Figure 4

The identified top motifs from ACPs (a–c) vs. non-ACPs (d–f).

In machine learning, model interpretation plays a significant role in quantifying prediction reliability44. Before developing the model for ACP prediction, the sequence properties of the training samples were analyzed using the SHAP algorithm to visualize the contribution of important attributes. Figure 5 illustrates the impact of the top 25 ranked features from CPSR, HOG-PSSM, ESM-2 and ProtT5 in predicting ACPs.

Figure 5

SHAP diagrams of the important features for PLMACPred from (a) ESM-2, (b) CPSR, (c) ProtT5, and (d) HOG-PSSM.

Material and methods

Benchmark dataset

In designing a computational model, constructing a valid benchmark dataset is a crucial step for training and testing the prediction system45,46,47,48. For ACP prediction, we collected three validated datasets, ACPmain and ACPAlter from AntiCP2.049 and ACP740 from ACP-DL40, to enable fair comparison. All these datasets derive from the CancerPPD database50. A series of preprocessing steps (labeling, removal of redundant and ambiguous sequences) was undertaken to ensure data quality. Following these steps, the ACPmain dataset includes 861 experimentally verified positive peptide samples (ACPs) and an equal number of negative peptide samples (non-ACPs). The ACPAlter dataset comprises 970 ACPs as positive samples and the same number of non-ACPs as negative samples. Similarly, the ACP740 dataset contains 740 peptides (376 ACPs and 364 non-ACPs). We split each benchmark dataset into training and independent testing subsets at an 80:20 ratio. Table 6 shows the statistics of ACPs and non-ACPs in all three datasets.
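The 80:20 split described above can be sketched with scikit-learn's stratified splitting. The peptide sequences below are illustrative placeholders, not samples drawn from the actual datasets:

```python
from sklearn.model_selection import train_test_split

# Illustrative placeholder peptides; the real samples come from the
# CancerPPD-derived ACPmain, ACPAlter and ACP740 datasets.
acps = ["FLPAIAGILSQLF", "GLFDIIKKIAESF", "KWKLFKKIEKVGQ",
        "ALWKTMLKKLGTV", "GIGKFLHSAKKFG"]      # positives (label 1)
non_acps = ["DTHFPICIFCCGC", "AAGMGFFGARCLA", "SPAGNVRQLAHQE",
            "NDVTSLISNNEQS", "QADFRKCMVNCDQ"]  # negatives (label 0)

sequences = acps + non_acps
labels = [1] * len(acps) + [0] * len(non_acps)

# Stratified 80:20 split into training and independent test subsets,
# preserving the ACP/non-ACP ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    sequences, labels, test_size=0.20, stratify=labels, random_state=42)
```

Stratification keeps the class balance of the benchmark datasets intact in both subsets, which matters because the reported datasets are (near-)balanced by construction.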

Table 6 Summary of dataset.

Feature encoding methods

Formulating a biological protein/peptide as numerical values using feature encoding methods is a crucial step51,52,53. In this research, to encode ACPs and non-ACPs, we considered compositional (CPSR) and evolutionary (HOG-PSSM) features, and exploited the power of large language models to encode peptide sequences into fixed-length feature vectors. Each feature encoding method is explained in the subsequent subsections.

Composition-based feature extraction method

ACPs are polymers composed of the twenty natural AA residues, which are chemically different but structurally similar, distinguished by the side chain or functional group of each AA. These peptides have unique physicochemical properties, residue frequencies and sequence lengths that play an effective role in characterizing the functions of proteins or peptides5. In this study, we use a composition-based feature called the composite protein sequence representation (CPSR), previously used for the prediction of membrane proteins54 and anti-MRSA peptides55. The CPSR descriptor extracts a set of seven different AA properties: conventional AA frequency, hydrophobicity sum, sequence length, bi-gram exchange groups, and R-group, electronic and hydrophobic groups, listed in Supplementary Table S2. The resultant CPSR feature vector has 71 dimensions. Readers are referred to our previous study for more details56.
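A minimal sketch of the composition-style part of such an encoding is shown below. The Kyte-Doolittle hydropathy values and the 22-dimensional feature subset are illustrative stand-ins; the full CPSR descriptor spans 71 dimensions across further property groups:

```python
# Sketch of composition-style features in the spirit of CPSR:
# 20 AA frequencies plus sequence length and a hydrophobicity sum.
AAS = "ACDEFGHIKLMNPQRSTVWY"
KD = {"A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8, "G": -0.4,
      "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8, "M": 1.9, "N": -3.5,
      "P": -1.6, "Q": -3.5, "R": -4.5, "S": -0.8, "T": -0.7, "V": 4.2,
      "W": -0.9, "Y": -1.3}  # Kyte-Doolittle hydropathy scale

def composition_features(seq):
    n = len(seq)
    freqs = [seq.count(a) / n for a in AAS]   # 20 AA frequencies
    hydro = sum(KD.get(a, 0.0) for a in seq)  # hydrophobicity sum
    return freqs + [float(n), hydro]          # 22-D illustrative vector

vec = composition_features("FLPKIFCKIT")
```

The frequencies always sum to 1, so the vector length is fixed regardless of peptide length, which is exactly the property a downstream classifier needs.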

Evolutionary-based feature extraction method

In recent years, evolutionary features have been successfully used to improve the performance of various bioinformatics predictors, such as bacteriophage virion protein prediction57, DNA-binding proteins58, missense mutations59, and phosphorylated proteins60. The PSSM descriptor can effectively capture hidden evolutionary patterns from protein sequence alignments61. The PSSM is an L × 20-dimensional matrix generated by running the PSI-BLAST program62, where L denotes the peptide's length and 20 denotes the twenty kinds of AA residues. The values in the PSSM matrix are either positive or negative: positive values indicate strongly conserved positions and negative values weakly conserved ones. Since the peptide sequences in our datasets have variable lengths, we cannot use the PSSM directly. The conventional PSSM-composition method is often used for this purpose; however, a major concern with that approach is the loss of local sequence information63. To tackle this challenge, we introduce the histogram of oriented gradients (HOG)64 position-specific scoring matrix (HOG-PSSM)65 method to obtain a fixed-length feature vector. In pattern recognition and computer vision, the HOG algorithm64 has been widely used as a feature extractor for human detection66. Motivated by this, we adapted the HOG encoding method to transform the PSSM matrix into biological features. The working principle of the HOG-PSSM method is explained in the following steps. First, the horizontal gradient Hx(a, b) and vertical gradient Hy(a, b) of the PSSM matrix are computed by the specified formulation:

$$H_{x}(a,b) = \begin{cases} PSSM(a+1,\,b) - 0, & a = 1,\\ PSSM(a+1,\,b) - PSSM(a-1,\,b), & 1 < a < 20,\\ 0 - PSSM(a-1,\,b), & a = 20, \end{cases}$$
(1)
$$H_{y}(a,b) = \begin{cases} PSSM(a,\,b+1) - 0, & b = 1,\\ PSSM(a,\,b+1) - PSSM(a,\,b-1), & 1 < b < L,\\ 0 - PSSM(a,\,b-1), & b = L. \end{cases}$$
(2)

Subsequently, the gradient’s direction and magnitude can be calculated by the below mathematical expression:

$$H(a,b) = \sqrt{H_{x}(a,b)^{2} + H_{y}(a,b)^{2}},$$
(3)
$$\Theta(a,b) = \tan^{-1}\frac{H_{y}(a,b)}{H_{x}(a,b)},$$
(4)

where H(a, b) denotes the gradient magnitude and Θ(a, b) the gradient direction of the PSSM matrix. In the third step, the matrix is segmented into 16 × 16 connected areas known as cells. Each cell encompasses a feature set comprising the gradient magnitudes and directions within the sub-matrix.

$$H_{ij}(s,t) = H\left(5 \times i + 1 + s,\; j \times \frac{L}{4} + 1 + t\right),$$
(5)
$$\Theta_{ij}(s,t) = \Theta\left(5 \times i + 1 + s,\; j \times \frac{L}{4} + 1 + t\right).$$
(6)

Here, i and j denote the sub-matrix subscripts (0 ≤ i ≤ 2, 0 ≤ j ≤ 2), and s and t denote the locations inside the sub-matrix (0 ≤ s ≤ 9, 0 ≤ t ≤ L/2 − 1). Each sub-matrix produces sixteen histogram channels on the basis of the gradient direction. As a result, HOG-PSSM generates a 16 × 16 = 256-dimensional (D) feature vector for each peptide sample.
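The gradient step above (Eqs. 1–4) can be sketched on a toy 20 × L matrix as follows. The cell binning of Eqs. (5)–(6) is omitted for brevity, and `np.arctan2` is used for the direction as in standard HOG, so this is an illustrative sketch rather than the exact HOG-PSSM implementation:

```python
import numpy as np

# Toy stand-in for a PSSM: 20 rows (AA types) x L columns (positions).
rng = np.random.default_rng(0)
pssm = rng.normal(size=(20, 30))

def gradients(P):
    # Row-wise gradient (Eq. 1): one-sided at a = 1 and a = 20,
    # central differences in between.
    Hx = np.zeros_like(P)
    Hx[0, :] = P[1, :]
    Hx[1:-1, :] = P[2:, :] - P[:-2, :]
    Hx[-1, :] = -P[-2, :]
    # Column-wise gradient (Eq. 2), same scheme along positions.
    Hy = np.zeros_like(P)
    Hy[:, 0] = P[:, 1]
    Hy[:, 1:-1] = P[:, 2:] - P[:, :-2]
    Hy[:, -1] = -P[:, -2]
    mag = np.sqrt(Hx**2 + Hy**2)   # gradient magnitude (Eq. 3)
    theta = np.arctan2(Hy, Hx)     # gradient direction (Eq. 4)
    return mag, theta

mag, theta = gradients(pssm)
```

The full encoding would then bin (mag, theta) pairs into per-cell orientation histograms and concatenate them into the fixed 256-D vector, independent of L.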

Protein language models

In the realm of natural language processing, the evolution of large language models (LLMs)67,68 has become a cornerstone, showcasing remarkable potential across a wide range of problems69. Recently, protein representation learning has become a prominent topic for understanding protein function and structure70. Protein language models (PLMs)71 have made extraordinary advances in bioinformatics and computational biology for tasks such as molecular property prediction72, antihypertensive peptides73, and antimicrobial peptides74. PLMs typically employ a self-supervised learning strategy over large-scale collections of available protein sequences75. The main advantages of PLM-based feature representation of peptide sequences compared to traditional feature engineering are twofold. First, embedding vectors are initialized randomly before training, and the model can learn effective feature representations of peptide sequences automatically rather than requiring extensive effort to design hand-crafted features. Second, the representation vector of a peptide sequence is much denser than most statistical features and can therefore capture more hidden semantics from the sequence. Several PLM versions have been released for exploring protein sequence data. We deployed ESM-2 and ProtT5, explained below, as mainstream methods for representing crucial peptide features.

ESM embedding features

ESM-2 is a recently developed pre-trained protein language model built on the ESM framework76, trained on a vast collection of protein sequences with up to 15 billion parameters. The model leverages the transformer language model architecture. During pre-training, ESM constructs a contextual sequence feature matrix, establishing an embedding space that captures a variety of dimensions, including sequence similarity, site-specific functional attributes, and three-dimensional configurations related to biochemical traits.

ProtT5 embedding features

Recently, Elnaggar et al. proposed the ProtTrans71 framework, which leverages an extensive collection of over 39.3 billion AAs drawn from the UniRef and BFD data repositories. Through its sophisticated pretraining phase, the model captures a wide range of biophysical characteristics of proteins without labels, offering insights into aspects such as the cellular compartmentalization of proteins and their solubility in lipid membranes compared to aqueous environments. Echoing the capabilities of the ESM framework, ProtTrans is similarly equipped to delineate features at the individual-residue level within protein chains, assigning each protein a 1024-dimensional profile that enriches our understanding of the molecular vicinity of mutations in the sequence. For brevity, we refer to this feature as ProtT5. To the best of our knowledge, we are the first to introduce ProtT5 and ESM-2 as PLMs for peptide sequence encoding to predict bioactive peptides with strong anticancer activity.
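In practice, the per-residue embedding matrix for a peptide would be produced by ESM-2 (e.g., via the fair-esm package) or by the ProtT5 encoder (via HuggingFace Transformers). The snippet below uses a random stand-in matrix simply to show how variable-length peptides are pooled into one fixed-length vector:

```python
import numpy as np

# Stand-in for an L x D per-residue embedding matrix; a real run would
# obtain this from ESM-2 or the ProtT5 encoder for the given peptide.
L, D = 25, 1024  # peptide length, ProtT5 embedding width
rng = np.random.default_rng(1)
per_residue = rng.normal(size=(L, D))

# Mean-pooling over the residue axis gives one D-dimensional vector per
# peptide regardless of its length, yielding the fixed-length input
# required by downstream classifiers.
peptide_vec = per_residue.mean(axis=0)
```

Mean-pooling is the common choice for turning token-level PLM embeddings into a sequence-level descriptor; other reductions (max, CLS-token) are possible but change what the vector emphasizes.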

Wavelet denoising

In bioinformatics and machine learning, preprocessing is a crucial step for removing data redundancy and outliers. Over the past years, the 2D wavelet denoising (WD) method has been extensively employed in proteomics research77,78. The WD algorithm, also called threshold wavelet, is capable of removing irregular non-stationary components and eliminating noise from 2D data79. The AA residues in a peptide sequence can be expressed as a signal in the time and frequency domains, and decomposing this signal through wavelet analysis can efficiently improve the model's prediction performance by extracting the characteristic signal of each peptide80. The WD process comprises three phases that significantly reduce the noise effect: the wavelet transformation (function), thresholding of the wavelet coefficients, and reconstruction of the 2D signal81. The entire operation of this algorithm is given in Algorithm 1. A 2D signal with additive noise can be formulated as:

$$D(x,y) = f(x,y) + \sigma \mu(x,y), \quad x, y = 1, 2, 3, \ldots, m - 1.$$
(7)
Algorithm 1

The pseudocode for the 2D WD algorithm is derived from ref. 79.
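As an illustration of the three phases, the snippet below runs a one-level 2D Haar transform with soft thresholding of the detail sub-bands. The wavelet family, decomposition level and threshold rule are illustrative stand-ins for those of Algorithm 1 (ref. 79), not a reproduction of it:

```python
import numpy as np

def haar2d(X):
    # Phase 1: one-level 2D Haar decomposition into an approximation
    # band (LL) and three detail bands (LH, HL, HH).
    a = (X[0::2, :] + X[1::2, :]) / 2.0
    d = (X[0::2, :] - X[1::2, :]) / 2.0
    LL = (a[:, 0::2] + a[:, 1::2]) / 2.0
    LH = (a[:, 0::2] - a[:, 1::2]) / 2.0
    HL = (d[:, 0::2] + d[:, 1::2]) / 2.0
    HH = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return LL, LH, HL, HH

def ihaar2d(LL, LH, HL, HH):
    # Phase 3: reconstruct the 2D signal from the four sub-bands.
    a = np.empty((LL.shape[0], LL.shape[1] * 2))
    a[:, 0::2], a[:, 1::2] = LL + LH, LL - LH
    d = np.empty_like(a)
    d[:, 0::2], d[:, 1::2] = HL + HH, HL - HH
    X = np.empty((a.shape[0] * 2, a.shape[1]))
    X[0::2, :], X[1::2, :] = a + d, a - d
    return X

def soft(c, t):
    # Phase 2: soft thresholding shrinks small coefficients to zero.
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)

def wavelet_denoise(X, t=0.1):
    LL, LH, HL, HH = haar2d(X)
    return ihaar2d(LL, soft(LH, t), soft(HL, t), soft(HH, t))

noisy = np.random.default_rng(2).normal(size=(16, 16))
clean = wavelet_denoise(noisy)
```

With the threshold set to zero the transform reconstructs the input exactly, which is a useful sanity check; a positive threshold suppresses the small, noise-dominated detail coefficients while keeping the approximation band intact.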

Model development, evaluation and performance metrics

We used the off-the-shelf support vector machine (SVM) and Naïve Bayes (NB) classifiers as baselines, as well as the proposed CDF model, for training and validating the model.

CDF is an ensemble-based framework proposed by Zhou et al.82 that can serve as a substitute for deep neural networks (DNNs)65. In recent times, the CDF model has become a dominant learning algorithm in a wide range of domains, such as pattern recognition83,84 and bioinformatics85. The CDF structure is an ensemble of trees hierarchically arranged in multiple layers86. This top-down architecture makes the classifier well suited to training even on a limited number of samples. Furthermore, Zhou and Feng noted in their pioneering work that CDF is much easier to tune than a DNN with respect to hyper-parameters82. Considering this, we developed an improved version of CDF containing an ensemble of RF87, XGBoost88 and Extremely Randomized Trees (ERT)89 classifiers. Each layer is composed of four learners drawn from the XGBoost, RF and ERT classifiers and takes the feature vector of the previous layer as input; the previous layer's class probabilities are then passed on to the next layer. To produce the augmented attributes, the corresponding heterogeneous feature vectors are merged and averaged, and the maximum probability value is generated as the output. We set K = 500 decision trees for RF, XGBoost and ERT. The node-split attributes were selected by randomly sampling d candidate features, where d is the total number of features. Training terminated when there was no substantial performance improvement. Figure 6 shows the layer-by-layer architecture of the CDF classifier. Hyperparameters of the models were tuned using Python's GridSearchCV. We utilized four evaluation measures, i.e., accuracy (Acc), sensitivity (Sn), specificity (Sp) and Matthew's correlation coefficient (MCC), for a comprehensive examination of our proposed ACP predictor. The evaluation metrics are formulated as follows:

$$Acc = \, \frac{(tp + tn)}{{(tp + tn + fp + fn)}},$$
(8)
$$Sen = \frac{t p}{{tp + fn}},$$
(9)
$$Spe = \frac{tn}{{tn + fp}},$$
(10)
$$MCC = \frac{tp \cdot tn - fp \cdot fn}{\sqrt{(tp + fp)(tp + fn)(tn + fp)(tn + fn)}}.$$
(11)
Figure 6

Architecture of the CDF model.
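One pass of such a cascade layer can be sketched with scikit-learn alone. GradientBoostingClassifier stands in for XGBoost so the example has no extra dependency, the tree counts are far below the paper's K = 500, and the probabilities here are fit on the training data directly rather than out-of-fold as a production deep forest would use:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (ExtraTreesClassifier,
                              GradientBoostingClassifier,
                              RandomForestClassifier)

# Synthetic binary classification data standing in for peptide features.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

def cascade_layer(X_in, y_in):
    # Heterogeneous learners of one cascade layer (GBM replaces XGBoost).
    learners = [RandomForestClassifier(n_estimators=50, random_state=0),
                ExtraTreesClassifier(n_estimators=50, random_state=0),
                GradientBoostingClassifier(n_estimators=50, random_state=0)]
    probs = [clf.fit(X_in, y_in).predict_proba(X_in) for clf in learners]
    # Augment: original features + each learner's class probabilities
    # become the input of the next layer.
    return np.hstack([X_in] + probs)

X_layer2 = cascade_layer(X, y)  # 20 + 3 learners x 2 classes = 26 columns
```

Stacking further layers repeats this augmentation until validation performance stops improving, mirroring the early-stopping rule described above.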

In the above notation, tp denotes correctly predicted peptides with ACP activity and tn denotes correctly predicted peptides without ACP activity. Similarly, fp denotes non-ACPs incorrectly predicted as ACPs, and fn denotes ACPs incorrectly predicted as non-ACPs. The aforementioned assessment metrics are threshold-dependent. Furthermore, we used the receiver operating characteristic (ROC) curve, along with the area under the ROC curve (AUC), as threshold-independent indexes to evaluate the overall effectiveness of the proposed method90,91. The closer the AUC is to 1, the better the predictive performance of the classification algorithm, and vice versa. We adopted the 10-fold CV method to construct an intelligent predictive model for accurate ACP identification.
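Eqs. (8)–(11) translate directly into code; the confusion-matrix counts below are made-up numbers for illustration, not results from the paper:

```python
import math

# Illustrative confusion-matrix counts for a binary ACP/non-ACP task.
tp, tn, fp, fn = 90, 85, 15, 10

acc = (tp + tn) / (tp + tn + fp + fn)   # Eq. (8): accuracy
sen = tp / (tp + fn)                    # Eq. (9): sensitivity (recall)
spe = tn / (tn + fp)                    # Eq. (10): specificity
mcc = (tp * tn - fp * fn) / math.sqrt(  # Eq. (11): Matthew's corr. coeff.
    (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
```

Unlike accuracy, MCC stays informative under class imbalance because it uses all four confusion-matrix cells, which is why it is reported alongside Acc, Sn and Sp throughout the results.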

Conclusion

ACPs are short therapeutic peptides that play a significant role in designing effective anticancer drugs. The demand for predicting novel ACPs via in silico methods remains urgent for understanding their functions and potential role in cancer treatment. In this study, we introduced a novel ensemble-based model, PLMACPred, for ACP prediction, which leverages the power of PLMs, sequence embeddings, and biologically relevant features from peptide sequences. The superior performance of PLMACPred on multiple challenging benchmark datasets solidifies its efficiency as a valuable prediction tool for the discovery of new ACPs in particular and other therapeutic peptides in general. In future work, we will develop a publicly accessible web server for the proposed method and further extend our research to identifying other activities, such as antiviral, antimicrobial, antifungal and anti-coronavirus activity, in large-scale therapeutic peptides. We believe PLMACPred will serve as a useful tool for aiding the discovery and design of novel ACPs in a rapid, high-throughput and cost-effective fashion.