Abstract
We propose a technique that we call HodgeRank for ranking data that may be incomplete and imbalanced, characteristics common in modern datasets coming from e-commerce and internet applications. We are primarily interested in cardinal data based on scores or ratings though our methods also give specific insights on ordinal data. From raw ranking data, we construct pairwise rankings, represented as edge flows on an appropriate graph. Our statistical ranking method exploits the graph Helmholtzian, which is the graph theoretic analogue of the Helmholtz operator or vector Laplacian, in much the same way the graph Laplacian is an analogue of the Laplace operator or scalar Laplacian. We shall study the graph Helmholtzian using combinatorial Hodge theory, which provides a way to unravel ranking information from edge flows. In particular, we show that every edge flow representing pairwise ranking can be resolved into two orthogonal components, a gradient flow that represents the l 2-optimal global ranking and a divergence-free flow (cyclic) that measures the validity of the global ranking obtained—if this is large, then it indicates that the data does not have a good global ranking. This divergence-free flow can be further decomposed orthogonally into a curl flow (locally cyclic) and a harmonic flow (locally acyclic but globally cyclic); these provides information on whether inconsistency in the ranking data arises locally or globally. When applied to statistical ranking problems, Hodge decomposition sheds light on whether a given dataset may be globally ranked in a meaningful way or if the data is inherently inconsistent and thus could not have any reasonable global ranking; in the latter case it provides information on the nature of the inconsistencies. An obvious advantage over the NP-hardness of Kemeny optimization is that HodgeRank may be easily computed via a linear least squares regression. We also discuss connections with well-known ordinal ranking techniques such as Kemeny optimization and Borda count from social choice theory.
Article PDF
Similar content being viewed by others
Avoid common mistakes on your manuscript.
References
Ailon, N., Charikar, M., Newman, A.: Aggregating inconsistent information: ranking and clustering. In: Proceeding ACM Symposium Theory Computing (STOC ’605), vol. 37, pp. 684–693 (2005)
Arrow K.J.: A difficulty in the concept of social welfare. J. Polit. Econ. 58(4), 328–346 (1950)
Arrow K.J.: General economic equilibrium: purpose, analytic techniques, collective choice. Nobel Memorial Lecture, 12 December 1972. In: Lindbeck, A. (ed.) Nobel Lectures: Economic Sciences 1969–1980, pp. 109–131. World Scientific, Singapore (1992)
Barber B.M., Lehavy R., McNichols M., Trueman B.: Can investors profit from the prophets? Security analyst recommendations and stock returns. J. Finance 56(2), 531–563 (2001)
Bartholdi, L., Schick, T., Smale, N., Smale, S., Baker, A.W.: Hodge theory on metric spaces. preprint. http://www.arxiv.org/abs/0912.0284 (2009)
Bell, R.M., Koren, Y.: Scalable collaborative filtering with jointly derived neighborhood interpolation weights. In: Proceedings IEEE International Conference Data Mining (ICDM ’607), vol. 7, pp. 43–52. (2007)
Bradley R., Terry M.: The rank analysis of incomplete block designs: I. The method of paired comparisons. Biometrika 39, 324–345 (1952)
Brin S., Page L.: The anatomy of a large-scale hypertextual web search engine. Comput. Networks ISDN Syst. 30(1–7), 107–117 (1998)
Choi, S.-C.: Iterative methods for singular linear equations and least-squares problems. Ph.D. thesis, Stanford University, Stanford, CA. http://www.stanford.edu/group/SOL/dissertations/sou-cheng-choi-thesis.pdf (2006)
Choi, S.-C., Paige, C.C., Saunders, M.A.: minres -qlp: a Krylov subspace method for indefinite or singular symmetric systems, preprint. http://www.arxiv.org/abs/1003.4042 (2010)
Chung, F.: Spectral Graph Theory. In: CBMS Regional Conference Series in Mathematics, vol. 92. AMS, Providence, RI (1997)
Cortes, C., Mohri, M., Rastogi, A.: Magnitude-preserving ranking algorithms. In: Proceeding International Conference Machine Learning (ICML ’607), vol. 24, pp. 169–176 (2007)
David H.A.: The Method of Paired Comparisons, 2nd edn. Griffin’s Statistical Monographs and Courses, vol. 41. Oxford University Press, New York, NY (1988)
de Borda J.-C.: Mémoire sur les élections au scrutin. Histoire de’Académie Royale des Sci. 102, 657–665 (1781)
de Condorcet A.-N.: Éssai sur l’application de l’analyse à la probabilité des décisions rendues à la pluralité des voix. Imprimerie Royale, Paris (1785)
Diaconis P.: A generalization of spectral analysis with application to ranked data. Ann. Statist. 17(3), 949–979 (1989)
Dwork, C., Kumar, R., Naor, M., Sivakumar, D.: Rank aggregation methods for the web. In: Proceedings International Conference World Wide Web (WWW ’01), vol. 10, pp. 613–622 (2001)
Freund Y., Iyer R., Shapire R., Singer Y.: An efficient boosting algorithm for combining preferences. J. Mach. Learn. Res. 4(6), 933–969 (2004)
Friedman J.: Computing Betti numbers via combinatorial Laplacians. Algorithmica 21(4), 331–346 (1998)
Hastie T., Tibshirani R.: Classification by pairwise coupling. Ann. Statis. 26(2), 451–471 (1998)
Herbrich, R., Graepel, T., Obermayer, K.: Large margin rank boundaries for ordinal regression. In: Advances in Large Margin Classifiers, pp. 115–132. MIT Press, Cambridge, MA (2000)
Hochbaum D.: The separation, and separation-deviation methodology for group decision making and aggregate ranking. In: Hasenbein, J.J. (ed.) Tutorials in Operations Research. vol. 7, pp. 116–141. INFORMS, Hanover, MD (2010)
Hochbaum D., Levin A.: Methodologies and algorithms for group-rankings decision. Manag. Sci. 52(9), 1394–1408 (2006)
Kahle, M.: Topology of random clique complexes. Discrete Math. (to appear, 2008)
Kemeny J.G.: Mathematics without numbers. Daedalus 88, 571–591 (1959)
Kemeny J.G., Snell L.J.: Preference ranking: an axiomatic approach. In: Kemeny, J.G., Snell, L.J. (eds) Mathematical Models in the Social Sciences, pp. 9–23. MIT Press, Cambridge (1973)
Kendall M.G., Gibbons J.D.: Rank Correlation Methods. 5th ed. Oxford University Press, Oxford (1990)
Kendall M.G., Smith B.B.: On the method of paired comparisons. Biometrika 31(3–4), 324–345 (1940)
Kleinberg J.: Authoritative sources in a hyperlinked environment. J. ACM 46(5), 604–632 (1999)
Ma, M.: A matrix approach to asset pricing in foreign exchange market. preprint. http://www.ssrn.com/abstract=921755 (2008)
Miller G.A.: The magical number seven, plus or minus two some limits on our capacity for processing information. Psychol. Rev. 63(2), 81–97 (1956)
Morton J., Pachter L., Shiu A., Sturmfels B., Wienand O.: Convex rank tests and semigraphoids. SIAM J. Discrete Math. 23(3), 1117–1134 (2009)
Mosteller, F.: Remarks on the method of paired comparisons I, II, III. Psychometrika 16(1), pp. 3–9, no. 2, pp. 203–218 (1951)
Noether G.E.: Remarks about a paired comparison model. Psychometrika 25, 357–367 (1960)
Paige C.C., Saunders M.A.: lsqr: an algorithm for sparse linear equations and sparse least squares. ACM Trans. Math. Software 8(1), 43–71 (1982)
Saari D.G., Merlin V.R.: A geometric examination of Kemeny’s rule. Soc. Choice Welf. 17(3), 403–438 (2000)
Saaty T.L.: A scaling method for priorities in hierarchical structures. J. Math. Psych. 15(3), 234–281 (1977)
Sen A.K.: The impossibility of a paretian liberal. J. Polit. Econ. 78(1), 152–157 (1970)
Sen A.K.: The possibility of social choice, Nobel Lecture, 8 Dec 1998. In: Persson, T. (ed.) Nobel Lectures: Economic Sciences 1996–2000, pp. 178–215. World Scientific, Singapore (2003)
Tahbaz-Salehi, A., Jadbabaie, A.: Distributed coverage verification in sensor networks without location information. In: Proceeding IEEE Conference Decision Control, vol. 47. (2008), to appear
Thurstone L.L.: The method of paired comparisons for social values. J. Abnorm. Soc. Psychol. 21, 384–400 (1927)
Xu, Y., Dyer, J.S., Owen, A.B.: Empirical stationary correlations for semi-supervised learning on graphs. preprint (2009)
Young H.P.: Condorcet’s theory of voting. Am. Polit. Sci. Rev. 82, 1231–1244 (1988)
Young H.P., Levenglick A.: A consistent extension of Condorcet’s election principle. SIAM J. Appl. Math. 35(2), 285–300 (1978)
Zhang J.: Binary choice, subset choice, random utility, and ranking: a unified perspective using the permutahedron. J. Math. Psych. 48(2), 107–134 (2004)
Acknowledgments
The authors thank the anonymous referees for their useful suggestions. The authors also gratefully acknowledge helpful discussions with Gunnar Carlsson, Leo Guibas, and Steve Smale.
Open Access
This article is distributed under the terms of the Creative Commons Attribution Noncommercial License which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
Author information
Authors and Affiliations
Corresponding author
Additional information
X. Jiang acknowledges support from ARO Grant W911NF-04-R-0005 BAA and the School of Engineering fellowship at Stanford. L.-H. Lim acknowledges support from the Gerald J. Liebermann fellowship at Stanford and the Charles B. Morrey assistant professorship at Berkeley.
Y. Yao acknowledges supports from the National Basic Research Program of China (973 Program 2011CB809105), NSFC (61071157), Microsoft Research Asia, DARPA Grant HR0011-05-1-0007, NSF Grant DMS 0354543, and a professorship in the Hundred Talents Program at Peking University. Y. Ye acknowledges support from AFOSR Grant FA9550-09-1-0306 and DOE Grant DE-SC0002009.
Rights and permissions
Open Access This is an open access article distributed under the terms of the Creative Commons Attribution Noncommercial License (https://creativecommons.org/licenses/by-nc/2.0), which permits any noncommercial use, distribution, and reproduction in any medium, provided the original author(s) and source are credited.
About this article
Cite this article
Jiang, X., Lim, LH., Yao, Y. et al. Statistical ranking and combinatorial Hodge theory. Math. Program. 127, 203–244 (2011). https://doi.org/10.1007/s10107-010-0419-x
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10107-010-0419-x
Keywords
- Statistical ranking
- Rank aggregation
- Combinatorial Hodge theory
- Discrete exterior calculus
- Combinatorial Laplacian
- Hodge Laplacian
- Graph Helmholtzian
- HodgeRank
- Kemeny optimization
- Borda count