Abstract
Determination of fundamental mechanisms of disease often hinges on histopathology visualization and quantitative image analysis. Currently, the analysis of multi-channel fluorescence tissue images is primarily achieved by manual measurements of tissue cellular content and sub-cellular compartments. Since the current manual methodology for image analysis is a tedious and subjective approach, there is clearly a need for an automated analytical technique to process large-scale image datasets. Here, we introduce Nuquantus (Nuclei quantification utility software) - a novel machine learning-based analytical method, which identifies, quantifies and classifies nuclei based on cells of interest in composite fluorescent tissue images, in which cell borders are not visible. Nuquantus is an adaptive framework that learns the morphological attributes of intact tissue in the presence of anatomical variability and pathological processes. Nuquantus allowed us to robustly perform quantitative image analysis on remodeling cardiac tissue after myocardial infarction. Nuquantus reliably classifies cardiomyocyte versus non-cardiomyocyte nuclei and detects cell proliferation, as well as cell death in different cell classes. Broadly, Nuquantus provides innovative computerized methodology to analyze complex tissue images that significantly facilitates image analysis and minimizes human bias.
Introduction
Histological evaluation of tissue samples is a prevailing diagnostic method in the study of cellular pathologies in a variety of diseases. Tissues undergoing pathological remodeling, with changes in the number and types of cells are often evaluated using immunofluorescent staining to define changes in specific cell types in complex multicellular organs. In most instances, a single tissue section is simultaneously stained for multiple cell markers. Advanced microscopy imaging1 enables high-resolution visualization of stained tissue and acquisition of a multitude of images that require detailed analyses. However, most image analytical techniques currently used to define the number and types of cells in complex tissues, as well as the proliferative or apoptotic “state” of specific cell types, are usually subjective and time consuming.
Biomedical image processing is a progressively developing field that applies computerized approaches to facilitate and augment microscopy image analysis2. Well established methods have been previously applied on histopathological images to execute morphological cell analyses3,4,5. Some studies have segmented whole cells6,7 specifically, in isolated cell lines8 or intact tissue sections, in which cell borders were clearly apparent9,10. However, segmentation of intact cells in situ becomes problematic when cell boundaries are not stained and cannot be easily visualized. In addition, in situ cell segmentation is exceedingly challenging due to intact tissue arrangement, condensed cellular structures and the presence of heterogeneous cell populations. Further complications are observed in pathology images because the damaged tissue forms irregular cell structures and undergoes dynamic healing processes11. In consequence, whole cell segmentation may be an ineffective strategy for intact tissue image analysis.
Automated nuclei segmentation is another well-studied approach that has been previously applied to histopathology images12,13,14. Nuclei segmentation employs spatial information such as size, shape and texture to classify nuclei. This information can be further used to identify mitotic nuclei, abnormal chromatin distribution and irregular nuclear boundaries. While nuclei segmentation has been widely applied in cancer cell studies12,15, its application in other pathology disciplines (e.g. acute tissue injury, ischemia, hypoxia) requires more intricate techniques, because after tissue injury the nuclei of existing cell subtypes may change morphologically and new cell types can migrate into the damaged tissue. Some of these different cell subtypes may exhibit similar nuclear features, so that global nuclei segmentation cannot distinguish between the corresponding cell subpopulations. Therefore, general nuclei segmentation may not be adequate to quantify the information contained in injured tissue images.
Currently, there is a paucity of reliable automated tools to quantitatively analyze complex fluorescent images obtained from injured tissues. Therefore, manual subjective examination still remains the standard approach. This extremely laborious task is plagued by three major limitations: 1) Time–manual analysis of hundreds and even thousands of images is a meticulous task that is time consuming. Additionally, the manual workload is significantly increased if an image was acquired via multiple fluorescent channels. In this case, an image has to be dissected into multi-channels so that each channel is manually examined. 2) Bias–tendency of the analyst to favorably or unfavorably quantify cells or cellular components that fit one’s hypothesis may severely skew results. 3) Inconsistency–Different analysts may quantify the information perceived from the same set of images in a way that introduces large variability and results in conflicting findings. Considering the complexity of histological tissue images and the key challenges involving the manual approach for in situ image examination, development of automated image analysis algorithms is a fast growing necessity.
To address the shortcomings of the computerized tissue image analysis, we have designed a new image processing software we termed Nuquantus (Nuclei quantification utility software). Nuquantus is a novel supervised machine learning framework that segments, classifies and quantifies nuclei of cells of interest in complicated tissue images. Nuquantus is an open source Matlab program that can be downloaded from https://www.cbica.upenn.edu/sbia/Erdem.Varol/nuquantus.html.
We demonstrate the utility of Nuquantus in cardiac tissue images taken from animal models of ischemic cardiac injury, known as myocardial infarction (MI). Fluorescent histopathology images of the heart after MI illustrate the difficulty and the complexity involved in defining cardiomyocytes (CMs) and CM nuclei with computerized image processing. Firstly, common MI animal models include small and large animals in which cardiac anatomy and physiology differ. For this reason, development of generic software that is adaptive to biological differences between species is challenging. Secondly, the heart is a muscular organ in which muscle fibers are oriented in different directions across the cardiac wall, making the identification of CMs difficult. Upon heart fixation and tissue slicing into histological sections, the orientation of CMs can vary from round cross sections to transverse or longitudinal rectangular shapes. Additionally, the heart contains many different cell types, with CMs comprising less than one-third of the cells16,17. In healthy cardiac tissue, these anatomical issues can complicate the identification of CMs and their corresponding nuclei. In cardiac injury, the identification of CMs and CM nuclei becomes further complicated since CMs can morphologically remodel or uncouple from their neighbors, in concurrence with massive infiltration of inflammatory cells18.
CM nuclei can be manually classified and quantified without cell membrane specific staining. However, as stated above, this technique is time-consuming, prone to bias and can be inconsistent between different analysts. Recent studies on cardiac regeneration have greatly relied on manual measurements of the number or percentages of apoptotic CM nuclei, as well as new CM nuclei that have been recently formed19,20,21,22. The data from these studies can be widely variable, likely due to the subjective analytical tools that were employed.
The innovative approach of Nuquantus allows streamlining image analysis of complex tissue, such as cardiac tissue. While many other automated cell analysis methods explicitly attempt to segment cells in situ using information of the membrane stains, Nuquantus detects and quantifies nuclei corresponding to cells of interest even in the absence of cell membrane information. Moreover, Nuquantus allows customized image analysis of a variety of imaging datasets from different animal models or imaging protocols since it is an adaptive learning based framework. Importantly, the semi-automated pipeline of Nuquantus greatly speeds up image analysis, reduces human bias and produces reproducible data that increases consistency between different analysts.
Results
Nuquantus design
Nuquantus was designed to identify and classify specific cell subtype nuclei in tissue sections with cellular diversity. In this study, we portray the application of Nuquantus in fluorescent stained cardiac tissue images.
Nuquantus is a sequential pipeline of image processing modules that takes raw histological images and outputs the desired quantification of nuclei of cells of interest. The initial step in the Nuquantus pipeline (Fig. 1) includes image preprocessing to correct microscopy illumination artifacts (Fig. 1b and Supplemental Fig. S1 online). All nuclei present in the image are then segmented, and hundreds of high-level shape and texture descriptors are extracted from each nucleus and its surroundings (Fig. 1c). Next, nuclei of cells of interest within the heterogeneous pool of cell populations are predicted based on patterns learned a priori from a labeled training set of images via supervised learning models (Fig. 1d). To further refine the predictions, the user can confirm the automated predictions and rule out possible false positives (Fig. 1e). Lastly, a final output with classified cell-specific nuclei is provided to the user along with a quantitative summary (Fig. 1f,g). The quantitative analysis includes colocalization information of the selected nuclei with additional molecular markers. For instance, additional markers may include specific stainings for DNA synthesis, DNA fragmentation, or cytosolic or nuclear proteins.
We exemplify Nuquantus classification performance of CM nuclei in composite cardiac tissue images after ischemic myocardial injury. We used histological samples from two different MI animal models: a large animal model for ischemia reperfusion in swine and small animal model with permanent coronary artery occlusion in mice (see methods). Histological observation of the cardiac tissue reveals morphological variations between the two animal models of post MI remodeling (Fig. 2). These variations are attributable to a variety of factors including differences in heart size and MI experimental protocols23,24. We identified normal viable CMs as nucleated cells positive for α-sarcomeric actin–a cardiac specific structural protein. Nuclei that were not found in the center of α-sarcomeric actin staining represent nuclei of interstitial cells that reside within the cardiac tissue. These nuclei belong to fibroblasts, endothelial cells, vascular smooth muscle cells and immune cells17,23. Three sub-regions of the heart were compared: I) Infarct Area (IA) as the site of ischemic injury where most CMs and supportive tissue die and are eventually replaced by scar (Fig. 2c); II) Border zone (BZ) as the interface between the infarct area and the viable tissue with mixtures of CMs and scar (Fig. 2b) and III) Viable zone (VZ) (Fig. 2a). The IA and BZ contain a mixture of cell types along with variation in CM size and abnormal shape. These differences exist between the three cardiac sub-regions within the same animal model. Moreover, the features of the same cardiac sub-region differ between the large and small animal models, making the overall CMs identification very difficult.
To achieve accurate CM nuclei segmentation in images with high anatomical and morphological variability, we trained a specific machine learning model for each type of animal (Fig. 3). An expert cardiac histologist manually labeled 84 images of mouse post-MI cardiac tissue (26 images of BZ, 34 images of VZ and 24 images of IA). In these images, 16,957 nuclei were labeled, of which 1,673 were identified as CM nuclei. In addition, 88 images of swine post-MI cardiac tissue (35 images of BZ, 37 images of VZ and 16 images of IA) were labeled. These images included 19,162 labeled nuclei, of which 3,361 were labeled as CM nuclei. All the images were pre-processed as detailed in the Methods section. Common cell segmentation features such as Haralick25 were extracted for each nucleus. In addition, novel features such as Fourier histograms of gradients (HOG)26 and cell membrane detector (CMD)27,28 were implemented for the first time, to the best of our knowledge, for the purpose of nucleus segmentation. These extracted features quantitatively captured the shape and the texture information of the nucleus and its surrounding environment to mimic the features recognized by the histologist when attempting to detect CM nuclei. Extracted features of all nuclei, both CM and non-CM, were input into a logistic regression model29 that learned the patterns of the ground truth provided by the expert cardiac histologist’s labels. Next, predictions on unseen nuclei (leave-image-out training as described in methods) were computed as probability scores in order to discriminate a non-CM nucleus from a CM nucleus (Supplemental Fig. S2 online). Based on these scores, we generated a precision-recall curve to assess the performance of each animal prediction model.
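The training and scoring procedure described above can be sketched in a few lines. The following is a minimal, dependency-free Python illustration and not the Nuquantus Matlab code: the two-dimensional toy feature vectors, the image identifiers, the plain gradient-descent fitter and every function name are assumptions made for the example. It shows the two ideas that matter here: each nucleus is scored by a logistic regression model that never saw any nucleus from the same image, and a decision threshold is then read off the resulting scores for a target recall.

```python
import math

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_logistic(X, y, lr=0.5, epochs=300):
    """Batch gradient descent on the logistic loss; the bias is stored as the last weight."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + w[-1]) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict(w, x):
    # Probability score that a nucleus is a CM nucleus.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])

def leave_one_image_out_scores(X, y, image_ids):
    """Score each nucleus with a model trained on all other images."""
    scores = [None] * len(X)
    for held_out in sorted(set(image_ids)):
        train = [i for i, g in enumerate(image_ids) if g != held_out]
        test = [i for i, g in enumerate(image_ids) if g == held_out]
        w = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i in test:
            scores[i] = predict(w, X[i])
    return scores

def precision_recall(scores, labels, threshold):
    tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
    fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
    fn = sum(1 for s, l in zip(scores, labels) if s < threshold and l == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def threshold_for_recall(scores, labels, target_recall):
    """Highest threshold whose recall reaches the target."""
    for t in sorted(set(scores), reverse=True):
        if precision_recall(scores, labels, t)[1] >= target_recall:
            return t
    return min(scores)

# Hypothetical toy data: 9 nuclei from 3 images, 2 made-up features each.
X = [[2.0, 1.8], [0.2, 0.1], [0.3, 0.4], [1.7, 2.1], [0.1, 0.5],
     [0.4, 0.2], [2.2, 1.9], [1.9, 1.6], [0.5, 0.3]]
y = [1, 0, 0, 1, 0, 0, 1, 1, 0]            # 1 = CM nucleus, 0 = non-CM nucleus
image_ids = ["a", "a", "a", "b", "b", "b", "c", "c", "c"]

scores = leave_one_image_out_scores(X, y, image_ids)
t = threshold_for_recall(scores, y, target_recall=0.85)
print(t, precision_recall(scores, y, t))
```

Sweeping the threshold over all observed scores and recording the precision and recall at each value yields the precision-recall curve; picking the highest threshold that still meets the recall target is one simple way to trade precision against recall.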
For both the mouse and swine models, the logistic regression thresholds were determined by leave-one-image-out cross-validation. For the swine model, we selected a logistic regression threshold corresponding to CM nuclei detection at 90% recall and 75% precision (Fig. 3a). With this approach, for every 20 nuclei predicted as CM nuclei, 5 may be false positives. On the other hand, for every 20 true CM nuclei, only 2 will be lost as false negatives. For mouse CM nuclei detection, we selected a threshold of 85% recall and 32% precision (Fig. 3b). The interpretation of this precision-recall trade-off is that for every true positive CM nucleus detected, approximately 2 false positive nuclei will also be detected. The false positive nuclei can then be removed in the optional user correction step. Even at this precision, image analysis using Nuquantus remains competitive with the manual approach because of the substantial image clean-up it provides. Given that mouse images contain up to 90% non-CM nuclei, according to the histologist’s ground truth, Nuquantus automatically removes approximately 80% of non-CM nuclei. The remaining clean-up to be carried out in the user correction step is therefore estimated to be only about 20% of non-CM nuclei.
The difference in precision between the mouse and swine models is explained by the relatively more complicated image textures found in mouse images. Because of the small overall heart size, mouse cardiac tissue histological specimens contain CMs at a variety of planes of section (cross, longitudinal and oblique) within the same slide. Histological specimens from large animal models are more likely to have CMs at only one plane of section, making analysis more uniform. Nevertheless, the high recall selected for the mouse model limits the number of CM nuclei lost as false negatives. Furthermore, we tested the model performance on the different cardiac sub-regions (VZ, BZ and IA) to observe the region-specific performance of the prediction models. The prediction of CM nuclei in the mouse BZ performs at a higher precision-recall operating point (52% precision, 85% recall) than in the VZ or IA. While the IA is the region where the majority of CMs die due to the injury, the BZ remains relatively enriched with CMs. At the same time, the CM density in the BZ is less compact than in the VZ. Hence, segmentation of individual CM nuclei in the BZ is an easier task than in the IA or VZ.
Swine CM nuclei detection is shown in Fig. 3c. The model reliably identifies the vast majority of the true positive CM nuclei, while filtering out most of the non-CM nuclei.
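The arithmetic behind these operating points can be checked directly. The short Python sketch below converts a precision/recall pair into expected counts of true positives, false negatives and false positives, using the labeled nuclei totals reported above. It assumes, for illustration only, that the selected precision and recall apply to the whole labeled set; the function and key names are hypothetical.

```python
def operating_point_counts(n_total, n_cm, recall, precision):
    """Expected confusion counts at a given precision/recall operating point."""
    tp = recall * n_cm                        # CM nuclei that are detected
    fn = n_cm - tp                            # CM nuclei that are missed
    fp = tp * (1.0 - precision) / precision   # non-CM nuclei predicted as CM
    n_non_cm = n_total - n_cm
    return {
        "tp": tp,
        "fn": fn,
        "fp": fp,
        "fp_per_tp": fp / tp,                    # false positives per true positive
        "non_cm_removed": 1.0 - fp / n_non_cm,   # fraction of non-CM nuclei filtered out
    }

# Mouse training set from the text: 16,957 labeled nuclei, 1,673 of them CM.
mouse = operating_point_counts(16957, 1673, recall=0.85, precision=0.32)
# Swine training set from the text: 19,162 labeled nuclei, 3,361 of them CM.
swine = operating_point_counts(19162, 3361, recall=0.90, precision=0.75)

print(mouse["fp_per_tp"], mouse["non_cm_removed"])
print(swine["fp_per_tp"], swine["non_cm_removed"])
```

For the mouse model this gives roughly 2.1 false positives per true positive and about 80% of non-CM nuclei removed automatically, which matches the figures quoted in the text. For the swine model, 75% precision corresponds to one false positive for every three true positives, that is, 5 false positives among every 20 nuclei predicted as CM.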
Rapid and reliable image analysis using Nuquantus
The effectiveness and efficiency of CM nuclei segmentation by Nuquantus were compared to the standard approach of manual CM nuclei identification (see methods). We computed inter-observer agreement between four independent cardiac histologists who manually labeled CM nuclei in a set of five mouse cardiac VZ images (Fig. 4a). Then, the same four histologists used Nuquantus to segment CM nuclei in a new set of five mouse cardiac VZ images. A different set of images was used in each experiment to avoid bias towards previously seen images; however, the complexity of the cardiac tissue was comparable across all images. The inter-observer agreement between the histologists in each experiment was computed using the Fleiss Kappa30 (see methods). The Fleiss Kappa value between the histologists using the manual approach was 0.511 ± 0.006 (Fleiss Kappa ± standard error), which represents moderate agreement. The Fleiss Kappa value between the histologists using Nuquantus improved to 0.695 ± 0.006 (Fleiss Kappa ± standard error), which reflects substantial agreement. With both the manual approach and Nuquantus, the observed agreement was greater than expected by chance (P value < 0.0001). We observed a significant improvement in inter-observer agreement using Nuquantus (Z-score: 21.72, P value < 0.0001). Consequently, CM nuclei segmentation via the Nuquantus software was more consistent across different users (Fig. 4a).
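For readers unfamiliar with the statistic, the following Python sketch shows the standard Fleiss Kappa computation on a table of per-item rating counts, together with one common way to compare two kappa values. The comparison formula (difference divided by the root of the summed squared standard errors) is an assumption, since the text does not spell out how the Z-score was obtained; the toy rating table and the function names are likewise hypothetical.

```python
import math

def fleiss_kappa(counts):
    """counts[i][j] = number of raters assigning item i to category j.

    Every item must be rated by the same number of raters, and the ratings
    must not all fall in a single category (chance agreement would be 1).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Per-item agreement: fraction of rater pairs that agree on the item.
    p_items = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_items) / n_items
    # Chance agreement from the overall category proportions.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters)
             for j in range(n_cats)]
    p_e = sum(p * p for p in p_cat)
    return (p_bar - p_e) / (1.0 - p_e)

def kappa_difference_z(k1, se1, k2, se2):
    """Z-score and two-sided P value for the difference of two independent kappas."""
    z = (k2 - k1) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided

# Hypothetical toy table: 5 nuclei, 4 raters, categories (CM, non-CM).
toy = [[4, 0], [3, 1], [0, 4], [2, 2], [4, 0]]
print(fleiss_kappa(toy))

# Reported values: manual 0.511 +/- 0.006, Nuquantus 0.695 +/- 0.006.
z, p = kappa_difference_z(0.511, 0.006, 0.695, 0.006)
print(z, p)   # z is roughly 21.7
```

With the rounded values reported in the text, this comparison gives a Z-score of roughly 21.7, close to the reported 21.72; the small gap is presumably due to rounding of the kappa values and standard errors.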
Another important improvement introduced by Nuquantus was the analysis speed up of CM nuclei segmentation in complex tissue images. To demonstrate the rate of image analysis, a total of 36 mouse cardiac VZ images were analyzed by four histologists and the time of analysis was recorded for each image. Half of the images were analyzed by Nuquantus and the other half was manually analyzed. CM nuclei segmentation using Nuquantus significantly expedited the average time for image analysis. On average, one image analysis via Nuquantus was completed in (2.2 ± 0.9) minutes (mean ± standard deviation) versus (12.6 ± 5.6) minutes via manual analysis (Fig. 4b).
In summary, Nuquantus accelerated image analysis approximately 6-fold and significantly improved the level of agreement between different histologists in segmenting CM nuclei.
Cell proliferation and apoptosis quantified by Nuquantus
Cell proliferation with synthesis of new DNA takes place in tissue post injury. Cell proliferation in the post-MI heart occurs primarily in non-CM interstitial cells and potentially in CMs17,31,32. Detection of proliferating cells is possible via 5-ethynyl-2′-deoxyuridine (EdU), a thymidine analog that is incorporated into the DNA of a cell nucleus during the DNA replication phase. The incorporated EdU reacts with an azide-modified fluorescent dye that can be detected by confocal imaging33,34. In our study design, we implanted EdU minipumps in a mouse in vivo MI model (see methods). DNA replicating cell nuclei in the heart were observed 1 week post-MI in the IA, BZ and VZ (Fig. 5a). Nuclei positive for DAPI staining (blue) and EdU staining (green) surrounded by positive α-sarcomeric actin (red) were defined as CM nuclei undergoing DNA synthesis. Nuclei positive for DAPI and EdU staining but not surrounded by positive α-sarcomeric actin were defined as non-CM nuclei undergoing DNA synthesis. Overall, nuclei undergoing DNA replication were largely detected in the IA and BZ. However, the existence of clumped nuclei and the lack of clear CM cell borders complicate the quantification and classification of EdU and DAPI positive CM nuclei versus non-CM nuclei. Additionally, inhomogeneous illumination, a known limitation of confocal imaging, hampers the identification of EdU and DAPI positive nuclei at the dark edges of the image.
To quantitatively evaluate Nuquantus performance given the above-mentioned challenges, we compared manually classified nuclei counts obtained by a histologist with Nuquantus counts obtained after image intensity correction and nuclei segmentation following the user correction step (Supplementary Fig. S3 and Fig. S4 online). The ratio of Nuquantus segmented CM nuclei out of total nuclei was similar to that of the manual approach in all post-MI and sham zones (Fig. 5b). As expected, Nuquantus identified a gradually decreasing percentage of CM nuclei in the BZ and IA compared to the VZ, suggesting that the site of injury contained a reduced number of CMs. An opposite trend was observed for the percentage of nuclei undergoing DNA synthesis. Similarly to the manual approach, Nuquantus confirmed that proliferating cells with EdU and DAPI positive nuclei were found primarily in the IA (56% manual vs. 44% software) and BZ (50% vs. 42%) and secondarily in the VZ (26% vs. 27%) (Fig. 5c). In comparison to the manual counts, Nuquantus shows slightly increased detection of proliferating CM nuclei (Fig. 5d,e), which can be attributed to the image preprocessing and illumination correction. These findings are consistent with the pathogenesis of injured tissue, in which the site of injury becomes an attractive platform for proliferating inflammatory cells, active angiogenesis, scar tissue formation and potentially tissue regeneration. Notably, almost no proliferation was measured in sham mice (<3% using manual and Nuquantus approaches), confirming that these animals did not suffer from any ischemic injury in the heart.
A controversial question in the cardiac field is whether adult CMs are capable of proliferating after MI. Among the cell subtypes that survived the infarct injury in the IA and the cell subtypes newly infiltrating the region of the insult, only ~1% of nuclei were identified as CM nuclei (Fig. 5b). Of this ~1% of CM nuclei, 16% were identified by manual counts and 31% were identified by Nuquantus as positive for EdU and DAPI staining (Fig. 5d), indicating DNA synthesis. Out of ~5% CMs found in the BZ, only 4% vs. 8% (manual vs. Nuquantus, respectively) contained nuclei undergoing DNA synthesis. Among ~12% of CMs found in the VZ, only 2% vs. 6% (manual vs. Nuquantus, respectively) of CM nuclei were undergoing DNA synthesis. In shams, both manual and Nuquantus methods indicated <1% EdU + DAPI positive CM nuclei. No statistical difference was found between the proportions reported by the two approaches (2-sample z-test). Viewed across all nuclei, Nuquantus confirmed that less than 1% of nuclei positive for EdU + DAPI staining were classified as CM nuclei in the IA (Fig. 5e). A similar tendency was observed in the BZ and VZ (<2%) via manual and Nuquantus methods.
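Because the percentages above use different denominators (all nuclei, CM nuclei only, or marker-positive nuclei only), it can help to see the tally written out. The Python sketch below is an illustration of that bookkeeping, not the Nuquantus implementation: the `Nucleus` record, the field names and the toy counts are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Nucleus:
    is_cm: bool        # classified as a CM nucleus (after the user correction step)
    marker_pos: bool   # colocalizes with the marker channel (e.g. EdU or TUNEL)

def percent(part, whole):
    # Guard against empty denominators, e.g. a region without CM nuclei.
    return 100.0 * part / whole if whole else 0.0

def summarize(nuclei):
    total = len(nuclei)
    cm = sum(n.is_cm for n in nuclei)
    pos = sum(n.marker_pos for n in nuclei)
    cm_pos = sum(n.is_cm and n.marker_pos for n in nuclei)
    return {
        "cm_of_total": percent(cm, total),           # cf. Fig. 5b
        "marker_pos_of_total": percent(pos, total),  # cf. Fig. 5c
        "marker_pos_of_cm": percent(cm_pos, cm),     # cf. Fig. 5d
        "cm_of_marker_pos": percent(cm_pos, pos),    # cf. Fig. 5e
    }

# Hypothetical region with 100 nuclei: 2 CM nuclei (1 EdU positive)
# and 98 non-CM nuclei (49 EdU positive).
region = (
    [Nucleus(True, True), Nucleus(True, False)]
    + [Nucleus(False, True)] * 49
    + [Nucleus(False, False)] * 49
)
summary = summarize(region)
print(summary)
```

In this toy region, half of the CM nuclei are marker positive, yet CM nuclei make up only 2% of all marker-positive nuclei. The same contrast between the two denominators explains how a sizeable fraction of the few CM nuclei in the IA can be EdU positive while CM nuclei remain a very small share of all EdU positive nuclei.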
These findings suggest that Nuquantus reliably detects nuclei undergoing DNA synthesis and is capable of classifying these nuclei into CM versus non-CM nuclei. Quantification of these data by Nuquantus matches the standard manual approach and confirms that DNA synthesis in CM nuclei is a low frequency event in the post-MI heart.
CMs undergo programmed cell death (apoptosis) after an ischemic injury35,36. Histological detection of apoptosis is possible with a fluorometric TUNEL assay (see methods) that identifies nuclear DNA fragmentation37. To validate Nuquantus for detection and classification of nuclei undergoing DNA fragmentation, we obtained tissue images from healthy mouse hearts. In those hearts, no apoptosis or only a very low physiological rate of apoptosis should be evident. Therefore, these images represent TUNEL negative controls. In the positive TUNEL control (see methods), extensive DNA fragmentation is detected, indicating widespread apoptosis (Fig. 6a). Staining for sarcomeric tropomyosin was performed to detect CMs. We were able to validate Nuquantus classification of CM nuclei using our mouse training model (Fig. 6b and Supplementary Fig. S5 online). Nuclei were defined as TUNEL positive CM nuclei if they were positive for DAPI staining (blue) and TUNEL staining (green) and surrounded by positive sarcomeric tropomyosin staining (red).
In the TUNEL negative controls, the percentage of total apoptotic nuclei remained very low: 0.09% vs. 0.04% (manual vs. Nuquantus, respectively). In contrast, in the positive TUNEL control the percentage of total apoptotic nuclei was elevated to 96% vs. 88% (manual vs. Nuquantus, respectively) (Fig. 6c). Similar results were obtained when comparing the percentage of apoptotic CM nuclei in negative and positive TUNEL controls between manual and Nuquantus segmentation. No statistical difference was found between the percentages obtained with the manual and Nuquantus approaches (2-tailed 2-sample z-test).
These studies show that Nuquantus successfully classified CM and non-CM nuclei undergoing DNA fragmentation. The quantification output of Nuquantus was validated by comparison with quantification using the manual approach.
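A two-tailed two-sample z-test for proportions, as named above, can be sketched as follows. This Python version uses the pooled-proportion standard error, which is the common textbook form; whether the authors pooled the variance is an assumption, and the nuclei counts in the usage example are hypothetical because the text reports percentages rather than raw counts.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed z-test for the difference between two proportions.

    x1, x2: number of positive nuclei in each group.
    n1, n2: total number of nuclei counted in each group.
    Returns (z, two_sided_p_value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n1 + 1.0 / n2))
    if se == 0.0:
        # Both groups are all-positive or all-negative: no detectable difference.
        return 0.0, 1.0
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical counts: 10 of 100 nuclei positive by manual counting
# versus 12 of 100 nuclei positive by software counting.
z, p = two_proportion_z_test(10, 100, 12, 100)
print(z, p)
```

A P value above the chosen significance level, as in this toy case, means the test finds no evidence that the two counting methods yield different proportions; it does not by itself prove that the methods are equivalent.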
Discussion
Gaining histological insight into cellular mechanisms that are driving diseases requires efficient analysis tools to reliably detect and quantify sub-cellular components of interest. Yet, a major bottleneck exists in the analysis of complex fluorescence tissue images, which is often dependent on subjective visual assessment and manual labeling. Despite advances made with automated cell and nuclei segmentation6,12, the practical challenge remains to segment cells in intact tissue without visual cell boundaries and to identify specific cell of interest nuclei in the presence of diverse cell subtypes.
In this study, we have trained Nuquantus, a supervised machine learning software, to classify nuclei of a specific cell subtype, namely CM nuclei, in fluorescent tissue images of the remodeling heart. Nuquantus training was based on CMs that exhibited mature sarcomeric structure. Overall, Nuquantus extracts hundreds of high-level image features from which it learns patterns of tissue architecture. After learning informative patterns in these features, Nuquantus automatically distinguishes between nuclei subclasses (CM versus non-CM nuclei) via computation of probability scores. The advantage of this approach is that CM nuclei are classified in an objective computerized fashion rather than relying on subjective human perception. Previously, it was shown that combining information across multiple microscopy fluorescence channels substantially improved the identification of cellular components in complex tissue images38. Therefore, Nuquantus integrates information from the segmented nuclei (blue channel) and their cellular surroundings (red channel) to achieve accurate nuclei classification. To avoid machine learning overtraining39, we used different subsets of images to train Nuquantus. Images from the IA, BZ and VZ were used during Nuquantus training to make the system robust to the variability present in post-MI heart images within the same animal model. In this way, we ensured that the trained model generalizes and can be applied to new incoming images without being systematically biased towards a specific type of image. In the present study, we have trained CM nuclei classification in two MI animal models: one for small animals (mice) and one for large animals (swine). Ideally, a unified prediction model would be preferable to identify CM nuclei in all animal models. However, the differences in scale and CM morphology are too large to learn consistent patterns that generalize across all animal types.
Post MI cardiac remodeling exemplifies typical structural and cellular modifications that occur in tissue pathology: cellular rearrangement, fibrosis, cell death, cell regeneration, inflammation, angiogenesis, etc. Often these cellular modifications affect multiple cell populations, which in this case are CMs and non-CMs17. For this reason, specific CM nuclei identification is highly challenging. Although fully accurate CM classification may not be attainable, we have optimized the classification precision and recall in each animal model, such that an optimal trade-off is maintained between correctly classified CM nuclei and potentially misclassified CM nuclei. An ideal classifier would achieve an absolute separation between predicted CM and non-CM nuclei. However, nuclei that reside in convoluted tissue regions, such as the IA and BZ, may be classified as false positives or false negatives. The selected high recall enables detection of the vast majority of true CM nuclei, while the classifier eliminates most true non-CM nuclei from the predictions, which significantly facilitates image analysis. In addition, the corresponding level of precision can be further boosted by the user in the semi-automated correction step. Although somewhat labor consuming, the semi-automated user correction step focuses on reviewing nuclei segmentation results rather than manually annotating images. Indeed, Nuquantus has been shown to significantly speed up image analysis, decrease the analysis workload and increase the inter-observer agreement on nuclei of interest between multiple observers.
Cardiac regeneration through new CM formation remains an area of controversy40,41,42, partially due to a shortage of objective quantitative tools. In this work, Nuquantus was validated against the standard approach of manual image analysis. We conclude that Nuquantus is a consistent strategy for quantitative, reproducible image analysis. Notably, measurements of CM proliferation and CM apoptosis ratios are computed after the user interaction step, which keeps the introduction of human bias minimal. The user interaction step focuses only on verification of CM nuclei classification, while any additional fluorescent channels containing information regarding DNA synthesis (EdU stain) or DNA fragmentation (TUNEL stain) are masked out. Consequently, the quantification of EdU or TUNEL positive CM nuclei is objective and fully automated. Nevertheless, automated quantification of EdU positive CM nuclei does not determine the source of the putative new CMs and does not distinguish between CM proliferation and CMs deriving from a precursor (stem) cell.
The limitation of Nuquantus in the present study is that the image analysis is tailored to the same type of images on which the supervised training was performed. As a consequence, the analysis of images obtained at a different magnification or from different tissue subtypes requires training of another machine learning model. Morphological diversity exists between different types of tissues and organs. Also, image acquisition at either higher or lower magnification alters the amount and the quality of feature information that can be gathered from a single image. However, the modular design of Nuquantus and its feature toolbox allow users to train a new machine learning model to accommodate their needs and the specifications of their imaging datasets (also available at the software website).
In the future, Nuquantus will be applied to other tissue staining protocols such as hematoxylin and eosin stain (H&E stain) and immunofluorescence combined with cell membrane stains. Additional immunohistochemistry techniques are required to identify small immature CMs. While these staining protocols are in progress, Nuquantus can be further trained to classify the subgroup of newly forming CMs. In addition, Nuquantus will be upgraded to allow interactive machine learning43. In this type of online machine learning, the user’s interactive input is incorporated into the existing prediction model in real time to further improve the prediction accuracy.
In summary, Nuquantus is a free, open source, Matlab software that is customizable for user needs and could be used for annotating cell nuclei in a large variety of histology images, up to the most challenging ones. The machine learning approach of Nuquantus allows cell subtype specific nuclei classification in complex fluorescence tissue images in an accurate, objective and rapid manner. Nuquantus is a high-throughput software, which enables semi-automated image analysis and increases the amount of the data that can be quickly and accurately processed.
Methods
Implementation
The Nuquantus software was developed in Matlab 2014a and is freely available for research purposes. The software can be accessed at the following website: https://www.cbica.upenn.edu/sbia/Erdem.Varol/nuquantus.html. The experiments presented in this paper were carried out on a 64-bit Windows 7 machine with 4 GB RAM and an Intel i5 2.5 GHz CPU.
Animals and myocardial infarction
All animal procedures were approved and performed in accordance with the guidelines and regulations of the Temple University School of Medicine Institutional Animal Care and Use Committee (IACUC). Permanent occlusion myocardial infarction (MI) was induced in 12-week-old male C57BL/6 mice (The Jackson Laboratory; Bar Harbor, ME) by ligation of the left anterior descending coronary artery (LAD) as previously described44,45. Sham-operated animals underwent surgery without LAD ligation. After the mice recovered from surgery, osmotic mini pumps (Alzet; Cupertino, CA) containing 39 mg/mL 5′-ethynyl-2′-deoxyuridine (EdU) (Life Technologies; Carlsbad, CA) were implanted subcutaneously between the two scapulae to deliver a continuous infusion of EdU at a flow rate of 0.5 μL/hr for 1 week. Mini pumps containing only vehicle (50/50 DMSO/ddH2O) were implanted in the control group of mice. All pumps were removed after 1 week; the mice were then sacrificed and the hearts were extracted via cardioectomy. Female Gottingen mini-swine aged 7–9 months (Marshall BioResources, NY, USA) underwent angioplasty catheterization to temporarily occlude blood flow in the LAD and induce an ischemia-reperfusion MI model46. The swine were sacrificed 1 month post-MI and cardioectomy was performed. All invasive procedures were performed under anesthesia as previously described45,46.
Tissue Processing
In mice, the heart was arrested in diastole by intracoronary perfusion of 100 mM cadmium chloride and 1 M potassium chloride. Both mouse and pig hearts were gravity perfused with 10% formalin, processed and embedded into paraffin wax blocks. Wax blocks were cut into 5 μm thick sections and mounted onto glass slides (AML Laboratories, Baltimore, MD, USA).
Immunofluorescence histochemistry
Slides were deparaffinized and underwent antigen retrieval with citrate buffer. The following primary antibodies were used: α-sarcomeric actin (Sigma-Aldrich #A2172; St. Louis, MO) and sarcomeric tropomyosin (Sigma-Aldrich #T9283; St. Louis, MO). The primary antibodies (α-sarcomeric actin or sarcomeric tropomyosin) were detected with a rhodamine red-X donkey anti-mouse IgM secondary antibody (Jackson Immunoresearch Laboratories; West Grove, PA). Nuclei were stained with 4′,6-diamidino-2-phenylindole (DAPI, Millipore; Billerica, MA). EdU detection was performed using the EdU Click-iT Imaging kit (Life Technologies; Carlsbad, CA).
Immunostaining with terminal deoxynucleotidyl transferase-mediated deoxyuridine triphosphate nick-end labeling (TUNEL) was performed on paraffin-embedded heart tissues obtained from 12-week-old wild-type male C57BL/6 mice. These mice were subcutaneously injected with 200 μl of physiological saline solution and sacrificed 24 hours later. No heart injury is expected after saline injection. The DeadEnd Fluorometric TUNEL system was used to label apoptotic nuclei (Promega #G3250; Madison, WI). For positive controls, slides were treated with DNase I (10 U/l) prior to TUNEL staining.
Image acquisition
Confocal images were randomly acquired in the infarct area, border zone and viable (remote) zone from 19 mice and 4 swine MI animal models. 5–15 images were acquired using at least 3 slides per animal. Each slide contained all 3 post-MI cardiac zones. A Nikon Eclipse T1 confocal microscope (Nikon Inc; Melville, NY) at 40X magnification was used for image acquisition.
Image Processing
The images were corrected for lighting inhomogeneity and variation of tissue depth inside the field of view. These artifacts were removed by dividing, at each pixel, the intensity measured in each channel by the local background intensity, which was measured by applying a Gaussian filter with a scale of 10 pixels to the red (α-actin/tropomyosin staining) channel47. After this intensity correction, the nuclei were segmented by extracting the connected components48 of the blue DAPI channel. Nuclei smaller than 5 pixels were filtered out as noise.
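The correction and segmentation steps above can be sketched in a few lines of Python. This is a minimal illustration, not the Nuquantus Matlab implementation: the function names, the 3-sigma kernel truncation, the border clamping, the `eps` guard, the DAPI `threshold` and the 4-connectivity are all assumptions made for the example, since the text does not specify them.

```python
import math

def gaussian_kernel(sigma):
    """Normalized 1-D Gaussian kernel, truncated at 3 sigma (assumed cutoff)."""
    radius = int(3 * sigma)
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    """Convolve every row with the kernel, clamping indices at the borders."""
    radius = len(kernel) // 2
    width = len(img[0])
    return [[sum(w * row[min(max(x + k - radius, 0), width - 1)]
                 for k, w in enumerate(kernel))
             for x in range(width)]
            for row in img]

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, sigma):
    """Separable 2-D Gaussian blur: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(img, kernel)), kernel))

def correct_intensity(channel, red, sigma=10.0, eps=1e-6):
    """Divide a channel, pixel by pixel, by the blurred red channel (local background)."""
    background = gaussian_blur(red, sigma)
    return [[v / (b + eps) for v, b in zip(row, brow)]
            for row, brow in zip(channel, background)]

def segment_nuclei(dapi, threshold, min_size=5):
    """Extract 4-connected components above threshold; drop those under min_size pixels."""
    h, w = len(dapi), len(dapi[0])
    seen = [[False] * w for _ in range(h)]
    nuclei = []
    for y0 in range(h):
        for x0 in range(w):
            if seen[y0][x0] or dapi[y0][x0] <= threshold:
                continue
            stack, component = [(y0, x0)], []
            seen[y0][x0] = True
            while stack:  # iterative flood fill
                y, x = stack.pop()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and dapi[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(component) >= min_size:
                nuclei.append(component)
    return nuclei
```

Each returned nucleus is a list of `(row, column)` pixel coordinates, which is the form a per-nucleus feature summary would consume.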
Feature extraction
Five sets of features were extracted for each blue (DAPI staining) and red (α-actin/tropomyosin staining) pixel: (i) rotationally invariant histograms of gradients (Fourier HOG)26, (ii) Haralick features25, (iii) local intensity entropy, (iv) the output of a cell membrane detector (CMD) based on regularized steerable filters27 and tensor voting28 and (v) the size of the extracellular component containing the pixel (ECS). Further details are provided in the supplementary material. Then, four descriptive statistics (mean, median, standard deviation and median absolute deviation) were computed to summarize the feature information for each nucleus. These features were scaled independently to zero mean and unit variance for each image, in order to produce a set of measures that can be compared across images.
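The per-nucleus summary and the per-image scaling described above can be illustrated as follows. This is a sketch under assumptions: the function names are hypothetical, the population standard deviation is used (the text does not say whether the population or sample form was used), and constant features are left at zero rather than divided by a zero variance.

```python
import statistics

def summarize(values):
    """Mean, median, standard deviation and median absolute deviation
    of one pixel-level feature over the pixels of a single nucleus."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    # pstdev is the population standard deviation (an assumption).
    return [statistics.fmean(values), med, statistics.pstdev(values), mad]

def standardize_per_image(rows):
    """Scale every feature column to zero mean and unit variance,
    using only the nuclei (rows) of one image."""
    columns = list(zip(*rows))
    means = [statistics.fmean(c) for c in columns]
    # A constant column has zero spread; divide by 1.0 so it maps to 0.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
```

Applying `summarize` to each of the five feature sets and concatenating the results gives one feature vector per nucleus; `standardize_per_image` is then called once per image on the stacked vectors.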
Nuclei Classification
An expert cardiac histologist tagged the nuclei that are inside cardiomyocytes to serve as ground truth labels. A total of 16,957 mouse nuclei and 19,162 swine nuclei were labeled. The labeled mouse and swine nuclei were used for training a logistic regression model using the Liblinear29 package to obtain a mouse and a swine nuclei prediction model, respectively. Classification results were thoroughly cross-validated: the performance of the model was assessed by sequentially leaving out the nuclei from one image, termed the unseen image, and training a logistic regression model on the remaining images (leave-image-out procedure). The model was applied to the nuclei in the left-out image, in order to estimate their likelihood to be CM nuclei. This procedure was repeated for every single image in the training dataset to obtain prediction probability scores for all nuclei. The predictions of the different models were compared to the labels provided by the expert histologist to generate precision/recall curves and to quantify precisely the prediction performance of the models. The precision and recall were computed using the following standard formulas, where TP, FP and FN denote the numbers of true positive, false positive and false negative CM nuclei predictions, respectively:
$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}$$
Once an appropriate operating point for the model was chosen based on the precision/recall curves, the trained prediction models were used to identify CM nuclei in new images whose features were extracted using the same steps described above.
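The leave-image-out splitting and the precision/recall evaluation can be sketched as below. The logistic regression fit itself is left out, and all function names are hypothetical. The conventions used when a denominator is zero (precision 1.0 when no nucleus is predicted as CM, recall 0.0 when no nucleus is labeled as CM) are assumptions of this example.

```python
def leave_image_out_splits(image_ids):
    """Yield (held_out_image, train_indices, test_indices), one split per image.
    image_ids[i] is the image that nucleus i belongs to."""
    for held_out in sorted(set(image_ids)):
        train = [i for i, img in enumerate(image_ids) if img != held_out]
        test = [i for i, img in enumerate(image_ids) if img == held_out]
        yield held_out, train, test

def precision_recall(scores, labels, threshold):
    """Precision and recall when nuclei scoring >= threshold are called CM.
    labels use 1 for CM nuclei and 0 for non-CM nuclei."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    precision = tp / (tp + fp) if tp + fp else 1.0  # assumed convention
    recall = tp / (tp + fn) if tp + fn else 0.0     # assumed convention
    return precision, recall

def precision_recall_curve(scores, labels):
    """One (threshold, precision, recall) triple per distinct score."""
    return [(t, *precision_recall(scores, labels, t)) for t in sorted(set(scores))]
```

Pooling the held-out scores from every split and passing them to `precision_recall_curve` gives the curve from which an operating threshold is picked.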
Manual counts of EdU, TUNEL, DAPI and Myocytes
Cells from all images were manually counted in Adobe Photoshop CS6 Extended (Adobe Systems Inc.; San Jose, CA). For each image, DAPI positive staining was marked with a dot label and counted to obtain the total number of DAPI positive cells. To be considered EdU or TUNEL positive, cells had to contain both EdU/TUNEL and DAPI signals. In a separate Photoshop layer, myocyte nuclei were labeled and counted if the nuclei were positive for DAPI and surrounded by α-sarcomeric actin or tropomyosin positive staining. All DAPI negative cells were disregarded. EdU or TUNEL positive myocytes were counted if the cell contained all three stains (EdU/TUNEL, α-sarcomeric actin/tropomyosin and DAPI).
Fleiss Kappa inter-observer agreement
The agreement between four histologists, who manually labeled images for CM nuclei and also used Nuquantus to segment CM nuclei, was measured as follows. All the nuclei (CM and non-CM nuclei) in the images were segmented automatically. Nuclei whose center was located within a Euclidean distance of 15 pixels of a label placed by a histologist were considered as ‘selected by the histologist’ (CM nuclei), and the remaining nuclei as ‘non selected’ (non-CM nuclei). The distance of 15 pixels was chosen because it corresponds roughly to the largest radius of the CM nuclei observed in our images. The agreement was then measured by Fleiss Kappa30, considering all the nuclei in the images as samples attributed to one of two categories: selected nuclei (CM nuclei) and non-selected nuclei (non-CM nuclei). This procedure was repeated for the labeling obtained using Nuquantus, in order to measure the agreement between the histologists that used Nuquantus. The standard errors associated with these two Kappa values allowed us to derive a P value for the agreement improvement observed with Nuquantus. Kappa (κ) was interpreted as follows: 0.01 ≤ κ ≤ 0.2 is a slight agreement, 0.21 ≤ κ ≤ 0.4 is a fair agreement, 0.41 ≤ κ ≤ 0.6 is a moderate agreement and 0.61 ≤ κ ≤ 0.8 is a substantial agreement.
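For readers unfamiliar with the statistic, the standard Fleiss Kappa computation can be written compactly. This sketch assumes the ratings have already been tabulated as one row per nucleus, holding the number of histologists who placed it in each category; the function name and the input layout are choices made for the example.

```python
def fleiss_kappa(ratings):
    """Fleiss Kappa for a table where ratings[i][j] is the number of raters
    who assigned sample i to category j. Every row must sum to the same
    number of raters. Undefined (division by zero) if only one category
    is ever used."""
    n_samples = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])
    # Share of all assignments that fall in each category.
    p_cat = [sum(row[j] for row in ratings) / (n_samples * n_raters)
             for j in range(n_categories)]
    # Per-sample agreement: fraction of rater pairs that agree.
    p_i = [(sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
           for row in ratings]
    p_bar = sum(p_i) / n_samples       # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    return (p_bar - p_exp) / (1 - p_exp)
```

With four histologists and the two categories used here, a row `[4, 0]` is a nucleus that all four selected as CM, and `[2, 2]` is a nucleus on which they split evenly.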
Statistical analysis
The non-parametric Wilcoxon test was used for mean comparisons. The Wilcoxon test was chosen because the distribution of the means was not necessarily Gaussian. Thus, we essentially compared the medians of two distributions of the means. For comparison of sample proportions, a 2-sample z-test was applied. The P value is indicated where applicable. A z-test, which is an accepted test for assessing the significance of Kappa49, was used to obtain the significance of the difference between the inter-observer agreements in the manual versus Nuquantus methods.
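The two z-tests mentioned above can be illustrated with the textbook formulas. This is a sketch under assumptions, not the authors' exact procedure: the proportion test uses the pooled-variance form, the Kappa comparison treats the two Kappa estimates as independent, both P values are two-sided, and the function names are hypothetical.

```python
import math

def two_sided_p(z):
    """Two-sided P value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-sample z-test comparing proportions x1/n1 and x2/n2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    return z, two_sided_p(z)

def kappa_difference_z(k1, se1, k2, se2):
    """z-test for the difference between two Kappa values with standard
    errors se1 and se2, assuming the two estimates are independent."""
    z = (k1 - k2) / math.sqrt(se1 ** 2 + se2 ** 2)
    return z, two_sided_p(z)
```

For example, `two_proportion_z(60, 100, 40, 100)` compares 60% with 40% in two groups of 100, and `kappa_difference_z` takes the manual and Nuquantus Kappa values together with their standard errors.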
Additional Information
How to cite this article: Gross, P. et al. Nuquantus: Machine learning software for the characterization and quantification of cell nuclei in complex immunofluorescent tissue images. Sci. Rep. 6, 23431; doi: 10.1038/srep23431 (2016).
References
Schmolze, D. B., Standley, C., Fogarty, K. E. & Fischer, A. H. Advances in microscopy techniques. Archives of pathology & laboratory medicine 135, 255–263 (2011).
Eliceiri, K. W. et al. Biological imaging software tools. Nature methods 9, 697–710 (2012).
Chen, S., Zhao, M., Wu, G., Yao, C. & Zhang, J. Recent advances in morphological cell image analysis. Computational and Mathematical Methods in Medicine 2012, 1–10 (2012).
Jones, T. R. et al. Scoring diverse cellular morphologies in image-based screens with iterative feedback and machine learning. Proceedings of the National Academy of Sciences 106, 1826–1831 (2009).
Sailem, H. Z., Sero, J. E. & Bakal, C. Visualizing cellular imaging data using PhenoPlot. Nature communications 6, 1–6 (2015).
Meijering, E. Cell segmentation: 50 years down the road [life sciences]. Signal Processing Magazine, IEEE 29, 140–145 (2012).
Dimopoulos, S., Mayer, C. E., Rudolf, F. & Stelling, J. Accurate cell segmentation in microscopy images using membrane patterns. Bioinformatics 30, 2644–2651 (2014).
Rämö, P., Sacher, R., Snijder, B., Begemann, B. & Pelkmans, L. CellClassifier: supervised learning of cellular phenotypes. Bioinformatics 25, 3028–3030 (2009).
Baggett, D., Nakaya, M.a., McAuliffe, M., Yamaguchi, T. P. & Lockett, S. Whole cell segmentation in solid tissue sections. Cytometry Part A 67, 137–143 (2005).
Ortiz de Solorzano, C., Malladi, R., Lelievre, S. & Lockett, S. Segmentation of nuclei and cells using membrane related protein markers. Journal of Microscopy 201, 404–415 (2001).
Kumar, V., Abbas, A. K. & Aster, J. C. Robbins basic pathology. 59–73. (Elsevier Health Sciences, 2012).
Irshad, H., Veillard, A., Roux, L. & Racoceanu, D. Methods for Nuclei Detection, Segmentation and Classification in Digital Histopathology: A Review—Current Status and Future Potential. Biomedical Engineering, IEEE Reviews in 7, 97–114 (2014).
Wahlby, C., Sintorn, I. M., Erlandsson, F., Borgefors, G. & Bengtsson, E. Combining intensity, edge and shape information for 2D and 3D segmentation of cell nuclei in tissue sections. J Microsc 215, 67–76 (2004).
Wienert, S. et al. Detection and segmentation of cell nuclei in virtual microscopy images: a minimum-model approach. Scientific reports 2, 1–7 (2012).
Oswal, V., Belle, A., Diegelmann, R. & Najarian, K. An entropy-based automated cell nuclei segmentation and quantification: application in analysis of wound healing process. Computational and mathematical methods in medicine 2013, 1–10 (2013).
Doevendans, P. A., Daemen, M. J., de Muinck, E. D. & Smits, J. F. Cardiovascular phenotyping in mice. Cardiovascular research 39, 34–49 (1998).
Takeda, N. & Manabe, I. Cellular interplay between cardiomyocytes and nonmyocytes in cardiac remodeling. International journal of inflammation 2011, 1–13 (2011).
Shinde, A. V. & Frangogiannis, N. G. Fibroblasts in myocardial infarction: a role in inflammation and repair. J Mol Cell Cardiol 70, 74–82 (2014).
Malliaras, K. & Terrovitis, J. Cardiomyocyte proliferation vs progenitor cells in myocardial regeneration: The debate continues. Glob Cardiol Sci Pract 2013, 303–315 (2013).
Ellison, G. M. et al. Adult c-kit(pos) cardiac stem cells are necessary and sufficient for functional cardiac regeneration and repair. Cell 154, 827–842 (2013).
Ahmad, F. et al. Cardiomyocyte-specific deletion of Gsk3alpha mitigates post-myocardial infarction remodeling, contractile dysfunction and heart failure. Journal of the American College of Cardiology 64, 696–706 (2014).
Reuter, S., Soonpaa, M. H., Firulli, A. B., Chang, A. N. & Field, L. J. Recombinant neuregulin 1 does not activate cardiomyocyte DNA synthesis in normal or infarcted adult mice. Plos One 9, 1–15 (e115871) (2014).
Doevendans, P. A., Daemen, M. J., de Muinck, E. D. & Smits, J. F. Cardiovascular phenotyping in mice. Cardiovascular research 39, 34–49 (1998).
Lelovas, P. P., Kostomitsopoulos, N. G. & Xanthos, T. T. A comparative anatomic and physiologic overview of the porcine heart. Journal of the American Association for Laboratory Animal Science : JAALAS 53, 432–438 (2014).
Haralick, R. M. Statistical and structural approaches to texture. Proceedings of the IEEE 67, 786–804 (1979).
Liu, K. et al. Rotation-invariant HOG descriptors using fourier analysis in polar and spherical coordinates. International Journal of Computer Vision 106, 342–364 (2014).
Jacob, M. & Unser, M. Design of steerable filters for feature detection using canny-like criteria. Pattern Analysis and Machine Intelligence, IEEE Transactions on 26, 1007–1019 (2004).
Franken, E., van Almsick, M., Rongen, P., Florack, L. & ter Haar Romeny, B. An efficient method for tensor voting using steerable filters, In Computer Vision–ECCV 2006 228–240 (Springer, 2006).
Fan, R.-E., Chang, K.-W., Hsieh, C.-J., Wang, X.-R. & Lin, C.-J. LIBLINEAR: A library for large linear classification. The Journal of Machine Learning Research 9, 1871–1874 (2008).
Fleiss, J. L. Measuring nominal scale agreement among many raters. Psychological bulletin 76, 378–420 (1971).
Engel, F. B. et al. p38 MAP kinase inhibition enables proliferation of adult mammalian cardiomyocytes. Genes & development 19, 1175–1187 (2005).
Senyo, S. E., Lee, R. T. & Kuhn, B. Cardiac regeneration based on mechanisms of cardiomyocyte proliferation and differentiation. Stem cell research 13, 532–541 (2014).
Buck, S. B. et al. Detection of S-phase cell cycle progression using 5-ethynyl-2′-deoxyuridine incorporation with click chemistry, an alternative to using 5-bromo-2′-deoxyuridine antibodies. BioTechniques 44, 927–929 (2008).
Zeng, C. et al. Evaluation of 5-ethynyl-2′-deoxyuridine staining as a sensitive and reliable method for studying cell proliferation in the adult nervous system. Brain research 1319, 21–32 (2010).
Aufiero, B., Tsokos, G., Tsokos, M. & Wong, H. Apoptosis in Tissue Injury, In Forensic Pathology Reviews, Vol. 4. (ed. M. Tsokos ) 331–351 (Humana Press, 2006).
Palojoki, E. et al. Cardiomyocyte apoptosis and ventricular remodeling after myocardial infarction in rats. American Journal of Physiology-Heart and Circulatory Physiology 280, H2726–H2731 (2001).
Loo, D. T. In situ detection of apoptosis by the TUNEL assay: an overview of techniques, In DNA Damage Detection In Situ, Ex Vivo and In Vivo 3–13 (Springer, 2011).
Schüffler, P. J. et al. Automatic single cell segmentation on highly multiplexed tissue images. Cytometry Part A 87, 936–942 (2015).
Alpaydin, E. Introduction to machine learning. 39, 82, 172. (MIT press, 2014).
Malliaras, K. & Terrovitis, J. Cardiomyocyte proliferation vs progenitor cells in myocardial regeneration: The debate continues. Global cardiology science & practice 2013, 303 (2013).
Molkentin, J. D. & Houser, S. R. Are resident c-Kit+ cardiac stem cells really all that are needed to mend a broken heart? Circulation research 113, 1037–1039 (2013).
Buja, L. M. & Vela, D. Cardiomyocyte death and renewal in the normal and diseased heart. Cardiovascular Pathology 17, 349–374 (2008).
Porter, R., Theiler, J. & Hush, D. Interactive machine learning in data exploitation. Computing in Science & Engineering 15, 12–20 (2013).
Tarnavski, O. et al. Mouse cardiac surgery: comprehensive techniques for the generation of mouse models of human diseases and their application for genomic studies. Physiological genomics 16, 349–360 (2004).
Duran, J. M. et al. Bone-derived stem cells repair the heart after myocardial infarction through transdifferentiation and paracrine signaling mechanisms. Circulation research 113, 539–552 (2013).
Duran, J. M. et al. A characterization and targeting of the infarct border zone in a swine model of myocardial infarction. Clinical and translational science 5, 416–421 (2012).
Boulanger, J. et al. Patch-based nonlocal functional for denoising fluorescence microscopy image sequences. Medical Imaging, IEEE Transactions on 29, 442–454 (2010).
Samet, H. & Tamminen, M. Efficient component labeling of images of arbitrary dimension represented by linear bintrees. Pattern Analysis and Machine Intelligence, IEEE Transactions on 10, 579–586 (1988).
Fleiss, J. L., Levin, B. & Paik, M. C. Statistical methods for rates and proportions. 378–420 (John Wiley & Sons, 2013).
Acknowledgements
This study was supported by grants from NIH/NHLBI, HL108806 and NIH/NHLBI, HL091799.
Author information
Contributions
P.G., N.H. and E.V. contributed equally to this work. P.G., N.H. and E.V. designed the software and planned the experiments. P.G. performed the experiments, analyzed the data, prepared the figures and wrote the main manuscript. N.H. designed the software features, contributed to the design of the statistical analysis and wrote parts of the manuscript methods. E.V. wrote the software toolbox, contributed to the design of the statistical analysis and wrote parts of the manuscript methods. M.W. provided the TUNEL mouse histological samples and participated in performing the experiments. D.M.T. provided the EdU mouse histological samples and participated in performing the experiments. T.E.S. provided the swine histological samples and carried out the myocardial infarction surgical procedures in swine. T.S. contributed to the image acquisition and participated in performing the experiments. J.D. provided mouse histological samples. S.K. carried out myocardial infarction surgical procedures in mice. C.D. and S.R.H. supervised the work. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing financial interests.
Rights and permissions
This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/
About this article
Cite this article
Gross, P., Honnorat, N., Varol, E. et al. Nuquantus: Machine learning software for the characterization and quantification of cell nuclei in complex immunofluorescent tissue images. Sci Rep 6, 23431 (2016). https://doi.org/10.1038/srep23431