Abstract
Clustering of objects or individuals is one of the most widely used approaches to exploring multidimensional data. The two most common unsupervised clustering strategies, Hierarchical Ascending Clustering (HAC) and k-means partitioning, identify groups of similar objects in order to divide a dataset into homogeneous groups. The proposed Topological Clustering of Individuals (TCI) studies a homogeneous set of individuals, the rows of a data table, using the notion of neighborhood graphs; the variables, the columns, are more or less correlated or associated depending on whether they are quantitative or qualitative. TCI thus supports a topological analysis of the clustering of individuals described by variables that are quantitative, qualitative or a mixture of the two. It first analyzes the correlations or associations observed between the variables in the topological context of principal component analysis (PCA) or multiple correspondence analysis (MCA), depending on the type of variable, and then classifies the individuals into homogeneous groups relative to the structure of the variables considered. The method is presented and illustrated here on a real dataset with quantitative variables, but it can also be applied to qualitative or mixed variables.
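The general idea of combining a neighborhood graph with hierarchical clustering can be illustrated with a minimal Python sketch. It is not the TCI procedure itself, which the abstract does not specify in detail. It assumes Euclidean distance as the proximity measure, a relative neighborhood graph as the neighborhood structure, and Ward's criterion applied directly to the rows of the binary adjacency matrix; the function names and the toy data are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import cdist


def relative_neighborhood_adjacency(X):
    """Binary adjacency matrix of the relative neighborhood graph on the rows of X.

    Rows i and j are neighbors when no third row k is strictly closer to
    both of them than they are to each other.
    """
    d = cdist(X, X)  # Euclidean distances (an assumed proximity measure)
    n = d.shape[0]
    adj = np.eye(n, dtype=int)  # convention: each individual is its own neighbor
    for i in range(n):
        for j in range(i + 1, n):
            blocked = any(
                max(d[i, k], d[j, k]) < d[i, j]
                for k in range(n)
                if k != i and k != j
            )
            if not blocked:
                adj[i, j] = adj[j, i] = 1
    return adj


def cluster_individuals(X, n_clusters):
    """Ward hierarchical clustering of individuals described by their neighborhoods."""
    adj = relative_neighborhood_adjacency(X)
    # Each row of adj is treated as the profile of one individual (an assumption).
    tree = linkage(adj.astype(float), method="ward")
    return fcluster(tree, t=n_clusters, criterion="maxclust")


# Hypothetical toy data: six individuals measured on two quantitative variables.
X = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [10.0, 10.0], [11.0, 10.0], [10.0, 11.0]]
)
adj = relative_neighborhood_adjacency(X)
labels = cluster_individuals(X, n_clusters=2)
print(adj)
print(labels)
```

Clustering the adjacency rows rather than the raw coordinates means individuals are grouped by who their neighbors are, not by their measured values; a different proximity measure or graph (for example a Gabriel graph or a minimal spanning tree) would generally give a different adjacency matrix and possibly a different partition.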
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2023 The Author(s)
About this paper
Cite this paper
Abdesselam, R. (2023). A Topological Clustering of Individuals. In: Brito, P., Dias, J.G., Lausen, B., Montanari, A., Nugent, R. (eds) Classification and Data Science in the Digital Age. IFCS 2022. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-031-09034-9_1
DOI: https://doi.org/10.1007/978-3-031-09034-9_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09033-2
Online ISBN: 978-3-031-09034-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)