Abstract
Some data are collected on circular (rather than linear) scales. Often researchers are interested in comparing two samples of such circular data to test the hypothesis that they came from the same underlying population. Recently, we compared 18 statistical approaches to testing such a hypothesis, and recommended two as particularly effective. A very recent publication introduced a novel statistical approach that was claimed to outperform the methods that we had indicated were highest performing. However, the evidence base for this claim was limited. Here we perform simulation studies to offer a more detailed comparison of the new “Angular Randomisation Test” (ART) with existing tests. We expand previous evaluations in two ways: exploring small and medium sized samples, and exploring a range of different shapes for the underlying distribution(s). We find that the ART controls type I error rates at the nominal level. The ART had greater power than established methods in detecting a difference in underlying distribution caused by a shift around the circle. Its performance advantage in this case was strongest when samples were small and unbalanced in size. When the difference between underlying unimodal distributions was in shape rather than central tendency, the ART was at least as good as (and sometimes considerably more powerful than) the established methods, except when samples were small and uneven in size, and the smaller sample came from a more concentrated underlying distribution. In such cases its power could be markedly inferior to established alternatives. The ART was also inferior to alternatives in dealing with axially distributed data. We conclude that under widely-encountered circumstances the ART can be recommended for its simplicity of implementation, but researchers should be aware of situations where it cannot be recommended.
Introduction
Some variables (often related to orientations or timings) are recorded on circular scales. Such data need different statistical treatment from variables recorded on linear scales (see overviews in, for example, Mardia and Jupp1, Jammalamadaka and Sengupta2, and Ley and Verdebout3). A common question in circular statistics involves testing to see if two samples of circular data appear to come from different underlying distributions. For example, researchers interested in the effect of magnetic cues on the resting orientation of rodents might record the orientations of the long axis of some animals asleep under control conditions and some others under a manipulation of the prevailing magnetic field. Any substantial difference in these two samples might then be seen as evidence of magnetic sensitivity in rodents. Researchers have a wide choice of published methodologies for exploring this question statistically: recently we compared the performance of 18 such tests4. We concluded that two of these (Watson’s U2 test5 and a MANOVA approach6) could be recommended as controlling type I error rate near the nominal level and offering good statistical power over a broader range of situations than the other tests. Soon after the publication of our study, Ali and Abushilah7 published a novel angular randomisation test (ART) that they claimed was more powerful than Watson’s U2 test. This would suggest that this new test might become the most attractive published so far and (combined with the simplicity of the test) this would argue for supporting its widespread uptake. Our aim here is to provide further exploration of the power and control of type I error rate of the ART. The investigations of Ali and Abushilah7 need to be expanded in two important ways. Firstly, they only explored the performance of their test for large sample sizes. The smallest single sample size considered was 100, which is an unrealistically large sample size for many fields of biological research.
For example, in a survey of published studies on animal behaviour, Taborsky8 reported that the average sample sizes were 32 for field studies and 18 for studies based on captive animals. Secondly, they only considered a single shape of underlying distribution in their study, the von Mises distribution, which is a unimodal symmetrical bell-shaped distribution specified by two parameters (central location and dispersion, see e.g. Pewsey et al.9 for further discussion of its properties). However, a much broader range of distributions can occur, and the relative performance of tests can vary markedly with different underlying distributions (e.g., Landler et al.4,10). We will relax both these restrictions on the extent of investigations of the test here, as well as provide an easy-to-use R function for interested researchers to facilitate potential wider uptake of the ART.
Materials and methods
Defining the angular randomisation test (ART)
We assume that we record data in radian measure on a scale [0, 2π). We further assume that we have two samples of data of sizes m and n: {φ1, φ2, …, φm} and {ϕ1, ϕ2, …, ϕn}. Then the test statistic (G) is
\(G = \sum\limits_{i = 1}^{m} {\sum\limits_{j = 1}^{n} {D\left( {\varphi_{i} ,\phi_{j} } \right)} }\),
where \(D\left( {a,b} \right) = \pi - \left| {\pi - \left| {a - b} \right|} \right|\).
D is the shortest angular (geodesic) distance between two points. Therefore, the test statistic is simply the sum of these distances from every point in one sample to every point in the other. The original formulation7 included a scalar multiplier, which we omit for brevity, since it would not influence our evaluation of the test.
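As an illustration of the definition above, the following is a minimal Python sketch of the geodesic distance D and the unscaled statistic G. It is not the authors' R implementation; the function names `geodesic_distance` and `art_statistic` and the example angles are hypothetical choices made here for illustration, and inputs are assumed to be radians in [0, 2π).

```python
import math


def geodesic_distance(a, b):
    """Shortest angular distance between two angles (radians in [0, 2*pi))."""
    return math.pi - abs(math.pi - abs(a - b))


def art_statistic(sample1, sample2):
    """Unscaled G: sum of geodesic distances over all cross-sample pairs."""
    return sum(geodesic_distance(a, b) for a in sample1 for b in sample2)


# Hypothetical example angles (radians), chosen only for illustration.
sample1 = [0.0, math.pi / 2]
sample2 = [math.pi, 3 * math.pi / 2]

g = art_statistic(sample1, sample2)
print(g)
```

Because D wraps around the circle, two angles just either side of zero (for example 0.1 and 2π − 0.1) are treated as close together rather than almost 2π apart.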
To obtain the p-value associated with two samples we perform a permutation test. Firstly, we record the test statistic associated with the observed data (G*). We also attach a label to each data point associating it with either sample 1 or sample 2. We then produce a large number N of permutations of these m + n labels. For each permutation we can calculate a G value. If the number of permutations that produce a G value greater than G* is Q, then the p-value is simply (Q + 1)/(N + 1). This is a standard way of carrying out a two-sample test by permutation; see Manly11, for example, for further discussion.
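The permutation procedure described above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' R code: the function name `art_permutation_test`, the `seed` argument, the default number of permutations and the small floating-point tolerance `tol` are all choices made here. Shuffling the pooled data and splitting it into groups of size m and n is used as an equivalent way of permuting the sample labels.

```python
import math
import random


def geodesic_distance(a, b):
    """Shortest angular distance between two angles (radians in [0, 2*pi))."""
    return math.pi - abs(math.pi - abs(a - b))


def art_statistic(sample1, sample2):
    """Unscaled G: sum of geodesic distances over all cross-sample pairs."""
    return sum(geodesic_distance(a, b) for a in sample1 for b in sample2)


def art_permutation_test(sample1, sample2, n_perm=9999, seed=None, tol=1e-9):
    """Return the permutation p-value (Q + 1) / (N + 1).

    Q counts permutations whose G exceeds the observed G*. The tolerance
    `tol` is an assumption added here so that rounding error in the
    summation order is not counted as a strictly larger statistic.
    """
    rng = random.Random(seed)
    m = len(sample1)
    pooled = list(sample1) + list(sample2)
    g_observed = art_statistic(sample1, sample2)
    q = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of the m + n labels
        if art_statistic(pooled[:m], pooled[m:]) > g_observed + tol:
            q += 1
    return (q + 1) / (n_perm + 1)


# Hypothetical, clearly separated samples on opposite sides of the circle.
sample1 = [0.1 * i for i in range(1, 9)]
sample2 = [math.pi + 0.1 * i for i in range(1, 9)]
p_value = art_permutation_test(sample1, sample2, n_perm=999, seed=1)
print(p_value)
```

With N permutations the smallest attainable p-value is 1/(N + 1), so N should be chosen large enough for the significance level of interest.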
Simulations
Our methods closely follow the approach we took in Landler et al.4. We compare the angular randomisation test with five other tests. Ali and Abushilah7 compared this test with tests using the same randomisation approach but the test statistics of Watson’s U2 test and the Watson-Wheeler test. For these tests we obtain the test statistic from the implementation of those tests in the circular package in R. These implementations also provide p-values calculated using the analytic asymptotic version of the test statistic. In addition, we applied the recently proposed Rao spacing frequency test using the R code provided in Jammalamadaka et al.12. We call the six tests considered ART, WU2, pWU2, WW, pWW and Rsf, with the p prefix denoting the permutation version. In all cases we used 10,000 permutations.
We use the rcircmix function in the NPCirc package13 in R to produce either a unimodal von Mises, an axial von Mises (two modes on opposite sides of the circle) or a wrapped skew-normal distribution (see Pewsey14 for a full description of the latter). We selected the wrapped skew-normal distribution because a previous investigation of tests of uniformity based on a single sample of circular data10 suggested that the relative performance of tests under this distribution was a good representation of their performance against plausible alternative skewed distributions. To specify a von Mises distribution, two parameters need to be specified: the mean (μ) and the concentration parameter (K). K takes the value zero for a circular uniform distribution, with the distribution becoming increasingly concentrated for higher positive values of K. Three parameters are needed to specify a wrapped skew-normal distribution: a location parameter (ξ), a dispersion parameter (ρ) and a shape parameter (α). The location parameter describes the central tendency of the distribution, whereas increasing (positive) values of the dispersion parameter indicate greater variance in values. Negative values of the shape parameter indicate a right skew, and positive values a left skew. The larger the magnitude of this parameter, the stronger the skew (α = 0 indicates a symmetric distribution).
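To make the first two distribution shapes concrete, here is a minimal Python sketch that draws unimodal and axial von Mises samples using the standard library function `random.vonmisesvariate`. It is not the rcircmix call used in the study. The helper names are hypothetical, and treating the axial distribution as an equal-weight mixture of two von Mises components with means μ and μ + π is an assumption made here; the wrapped skew-normal case is omitted because its parameterisation is not fully specified in the text.

```python
import math
import random

TWO_PI = 2 * math.pi


def sample_von_mises(n, mu, kappa, rng):
    """Draw n angles in [0, 2*pi) from a von Mises(mu, kappa) distribution."""
    return [rng.vonmisesvariate(mu, kappa) % TWO_PI for _ in range(n)]


def sample_axial_von_mises(n, mu, kappa, rng):
    """Draw n angles from an assumed 50:50 mixture with modes at mu and mu + pi."""
    draws = []
    for _ in range(n):
        centre = mu if rng.random() < 0.5 else (mu + math.pi) % TWO_PI
        draws.append(rng.vonmisesvariate(centre, kappa) % TWO_PI)
    return draws


def circular_mean(angles):
    """Mean direction in [0, 2*pi)."""
    s = sum(math.sin(t) for t in angles)
    c = sum(math.cos(t) for t in angles)
    return math.atan2(s, c) % TWO_PI


def resultant_length(angles):
    """Mean resultant length: near 1 when concentrated, near 0 when balanced."""
    s = sum(math.sin(t) for t in angles)
    c = sum(math.cos(t) for t in angles)
    return math.hypot(s, c) / len(angles)


rng = random.Random(42)  # fixed seed only so the sketch is repeatable
unimodal = sample_von_mises(2000, 1.0, 50.0, rng)
axial = sample_axial_von_mises(2000, 1.0, 50.0, rng)

print(circular_mean(unimodal), resultant_length(unimodal))
print(resultant_length(axial))
```

Doubling the angles of an axial sample (modulo 2π) folds the two opposite modes onto one, which is a common way to check that a sample is axial rather than uniform.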
Having defined the parameters of the two underlying distributions, we report statistical power to detect a difference (or type I error rate for identical distributions) on the basis of 10,000 samples from the distributions. With 10,000 replicates, binomial theory gives a standard error of at most 0.005 for each estimated rate.
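One way to read the accuracy statement above is through the binomial standard error of an estimated rejection rate, √(p(1 − p)/n), which is largest at p = 0.5. The short Python sketch below works through that arithmetic for 10,000 replicates; interpreting the quoted 0.005 as this worst-case standard error is an assumption made here, and the function name is hypothetical.

```python
import math


def binomial_standard_error(rate, n_replicates):
    """Standard error of a proportion estimated from n independent replicates."""
    return math.sqrt(rate * (1.0 - rate) / n_replicates)


n_replicates = 10_000

# Worst case: a true rejection rate of 0.5 maximises rate * (1 - rate).
worst_case = binomial_standard_error(0.5, n_replicates)

# At a nominal type I error rate of 0.05 the estimate is more precise.
at_nominal = binomial_standard_error(0.05, n_replicates)

print(worst_case, at_nominal)
```

The same function shows how the precision would degrade with fewer replicates, for example when a quick exploratory run uses only 1,000 simulated data sets.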
Results
The good control of type I error rate reported by Ali and Abushilah7 for the ART using large samples from a von Mises distribution held for small sample sizes too, even when the sample sizes were as small as ten in each sample, and even when sample sizes were strongly unbalanced (Fig. 1). This was true for a broad range of common concentration parameters (K) and for the majority of tests investigated (only the Rsf had slightly elevated type I error rates in specific situations). We observed the same low type I error rates for identical skewed distributions, as well as for identical axial von Mises distributions (Figs. 2 and 3).
We further explored the power to detect an underlying difference in dispersion (spread) of the data points when the mean values of two underlying von Mises distributions were identical (Fig. 4). Here, the ART offered more power than the other tests when sample sizes were small and balanced, i.e., when sample sizes were comparable. In unbalanced cases, power was low when the more dispersed sample was also the larger one. The performance was similar for skewed data: the ART offered superior performance for balanced sample sizes, but substantially less power for unbalanced sample sizes if the smaller sample had lower dispersion (Fig. 5). This drop in performance was most dramatic in the most uneven situation tested, with sample sizes of 10 (with low dispersion) and 50 (with high dispersion), where the ART offered close to zero power. Power to detect differences between samples drawn from axial von Mises distributions (same mean values, different concentration) was low overall, with the Rsf showing superior power and the ART showing no usable detection rate (Fig. 6).
We also explored the situation where both distributions had the same shape—but one was shifted around the circle relative to the other (Figs. 7, 8 and 9). Here we find the ART was overall the most powerful test for both unimodal symmetric and skewed distributions, with the most pronounced power advantage when sample sizes were small and balanced. However, for axial distributions the Rsf showed best power, with unusable power levels for ART (Fig. 9).
Discussion
Here, we have evaluated the power of a recently proposed statistical test for the comparison of two circular samples, the ART7. For the most part we were able to strengthen the foundation for arguing for greater uptake of the ART (at least when underlying distributions are expected to be unimodal). Specifically, we show that it offers good control of type I error rate even if sample sizes are small, and/or the underlying distribution is quite different from a von Mises one. We also show that it offers good power in unimodal situations, regardless of whether the difference between underlying distributions is in central location or dispersion. Most importantly, the new test offers generally better power than the asymptotic and randomisation versions of Watson’s U2 test, the former of which was the joint winner in our comparison of 18 previously introduced tests4.
However, we have also uncovered two situations where the ART offers very poor or no power relative to alternatives. If samples are small and uneven in size, and the more dispersed sample is the larger sample, and the suspected difference in distributions is in dispersion (rather than shift), then the ART offers low power and cannot be recommended. If researchers find themselves in such a situation, then Watson’s U2 test can still be recommended. Also, for symmetrical bimodal (i.e., axial) distributions the ART offers almost no power and should not be used. Many commonly-used tests perform poorly in this situation15. This problem likely extends to other symmetric multimodal situations. The power of the ART for asymmetric multimodal situations has not been explored. Pending further exploration, we would not recommend the ART when underlying distributions are expected to be multimodal. In this case, the recently proposed Rao spacing frequency test described in Jammalamadaka et al.12 can be recommended. In other unimodal circumstances, our results and those of Ali and Abushilah7 argue that the ART is worthy of consideration for widespread uptake for the comparison of two circular distributions. There is no practical barrier to its implementation—we demonstrate above that its formulation is simple, and we offer an implementation in R here (Code can be downloaded at https://github.com/Malkemperlab/Geodesic-distance-test). The ART appears to offer more statistical power without any potential drawbacks in many standard situations, which should compel researchers to consider adding this novel test to their statistical repertoire.
Although we have developed the empirical support for the ART substantially over that offered by Ali and Abushilah7, there are certain unimodal situations we did not investigate. We do not know, for example, how the test behaves when faced with data rounded to a finite number of possible values (often called grouped data). However, similar tests seem relatively insensitive to even high levels of grouping16. Further, it may be possible to extend the methodology to compare more than two samples. Given the performance of the test in standard situations as reported here, such further explorations of its potential are warranted.
Conclusions
We offer a considerably extended investigation of the properties of the recently introduced ART for comparing two samples of circular data. We conclude that under many circumstances the ART can be recommended for its simplicity of implementation combined with excellent control of type I error rate and power. Its power is generally superior to any of the previously introduced tests for this common research question. We caution, however, that we have uncovered situations where the newly introduced test has markedly poorer power than many previous tests—when underlying distributions are axially symmetric (or more generally symmetrically multimodal); or when underlying unimodal distributions vary in degree of concentration rather than location, samples are small and uneven in size, and the smaller sample comes from a more concentrated underlying distribution. If experimenters avoid these situations then uptake of this new test can be recommended.
Data availability
All code needed to rerun our analysis is available at https://github.com/Malkemperlab/Geodesic-distance-test.
References
Mardia, K. V. & Jupp, P. E. Directional Statistics (Wiley, Hoboken, 2000).
Jammalamadaka, S. R. & Sengupta, A. Topics in Circular Statistics Vol. 5 (World Scientific, Singapore, 2001).
Ley, C. & Verdebout, T. Modern Directional Statistics (Chapman and Hall/CRC, New York, 2017).
Landler, L., Ruxton, G. & Malkemper, E. P. Advice on comparing two independent samples of circular data in biology. Sci. Rep. 11, 20337 (2021).
Watson, G. S. Goodness-of-fit tests on a circle. II. Biometrika 49, 57–63 (1962).
Landler, L., Ruxton, G. D. & Malkemper, E. P. The multivariate analysis of variance as a powerful approach for circular data. Mov. Ecol. 10, 21 (2022).
Ali, A. J. & Abushilah, S. F. Distribution-free two-sample homogeneity test for circular data based on geodesic distance. Int. J. Nonlinear Anal. Appl. 13, 2703–2711 (2022).
Taborsky, M. Sample size in the study of behaviour. Ethology 116, 185–202 (2010).
Pewsey, A., Neuhäuser, M. & Ruxton, G. D. Circular Statistics in R (Oxford University Press, Oxford, 2013).
Landler, L., Ruxton, G. D. & Malkemper, E. P. Circular data in biology: Advice for effectively implementing statistical procedures. Behav. Ecol. Sociobiol. 72, 128 (2018).
Manly, B. F. Randomization, Bootstrap and Monte Carlo Methods in Biology (Chapman and Hall/CRC, New York, 2018).
Jammalamadaka, S. R., Guerrier, S. & Mangalam, V. A two-sample nonparametric test for circular data—Its exact distribution and performance. Sankhya B 83, 140–166 (2021).
Oliveira Pérez, M., Crujeiras Casais, R. M. & Rodríguez Casal, A. NPCirc: An R package for nonparametric circular methods. J. Stat. Softw. 61, 1–26 (2014).
Pewsey, A. Problems of inference for Azzalini’s skewnormal distribution. J. Appl. Stat. 27, 859–870 (2000).
Gatto, R. & Jammalamadaka, S. R. On two-sample tests for circular data based on spacing-frequencies. In Geometry Driven Statistics (eds Dryden, I. L. & Kent, J. T.) 129–145 (Wiley, Chichester, 2015).
Landler, L., Ruxton, G. & Malkemper, E. P. Grouped circular data in biology: advice for effectively implementing statistical procedures. Behav. Ecol. Sociobiol. 74, 100 (2020).
Acknowledgements
This research was funded in whole, or in part, by the Austrian Science Fund (FWF) Grant Number: P32586 to LL. For the purpose of open access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission. EPM receives funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant Agreement No. 948728). EPM acknowledges the HPC facility at the Max Planck Institute for Neurobiology of Behavior—caesar.
Author information
Contributions
L.L. and G.D.R. wrote the code. All authors discussed the results and prepared the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ruxton, G.D., Malkemper, E.P. & Landler, L. Evaluating the power of a recent method for comparing two circular distributions: an alternative to the Watson U2 test. Sci Rep 13, 10007 (2023). https://doi.org/10.1038/s41598-023-36960-1
DOI: https://doi.org/10.1038/s41598-023-36960-1