Abstract
Alexander Serebrenik
Multiple studies of gender in software engineering require identifying the gender of the individuals involved, either by asking them (when conducting interviews and surveys) or by "guessing" it from archival data recorded in software repositories. In this chapter we discuss ways to ask about gender in surveys and interviews as well as three groups of automated genderization approaches proposed in the literature: name-to-gender, face-to-gender, and artifact-to-gender. For each group of approaches, we discuss how they work, the associated ethical concerns, the reliability and accuracy concerns, and the assumptions they make.
Introduction
When it comes to studies of diversity in software engineering, gender is by far the most studied diversity dimension: 61% of the scientific studies recently surveyed by Rodriguez-Perez et al. have considered gender [34]. Indeed, previous studies have shown that women participating in open source projects disengage faster than men [32], that women concentrate their work across fewer projects and organizations than men do [14], that men and women follow different comprehension strategies when reading source code [40], and that men tend to switch more frequently between debugging strategies [8]. Several studies in this volume also consider gender as a variable of interest: for example, Hashmati and Penzenstadler in Chapter 5, "How Users Perceive the Representation of Non-binary Gender in Software Systems: An Interview Study," report on an interview study of the representation of gender in software; Gama et al. in Chapter 16, "Toward More Gender-Inclusive Game Jams and Hackathons," focus on the experiences of transgender (binary and non-binary) and gender-nonconforming people at jams and hackathons; Kohl and Prikladnicki in Chapter 11, "Gender Diversity on Software Development Teams: A Qualitative Study," conduct a survey of gender diversity in software development teams; Simmonds et al. in Chapter 23, "Rethinking Gender Diversity and Inclusion Initiatives for CS and SE in a University Setting," discuss the findings of a focus group of women and non-binary students; and Happe in Chapter 25, "Effective Interventions to Promote Diversity in CS Classroom," studies the frustrations steering women away from computer science. All these studies require the researchers to obtain information about the gender identity of the study participants (for controlled experiments, interviews, and surveys) or of the individuals who contributed to the dataset analyzed (for data-driven archival studies such as repository mining).
As we will see in the following, obtaining such information is fraught with challenges, and inappropriate ways of doing so might both alienate study participants and threaten the validity of the scientific results. The challenges are not limited to researchers: indeed, everyone conducting internal surveys, performing marketing analysis, adding "gender" questions to a software user interface, or trying to understand user satisfaction has to find a way of recording information about gender identity.
To support both researchers and practitioners, in this chapter we take a look at the techniques used to obtain information about gender and the associated advantages and challenges.
Before we even start discussing how information about gender can be obtained, one has to remember that gender is a complex social construct of norms, behaviors, and roles that varies between societies and over time. Hence, the study of gender in general, or of the gender identities or gender expression of individuals, should be done with the utmost care. Whatever technique we use, we should keep in mind that gender is privacy sensitive and should be treated as such even if regulations such as the General Data Protection Regulation (GDPR) do not classify this information as sensitive. In particular, open source contributors might hide their gender on purpose: many women developers, for example, prefer not to disclose their gender due to safety concerns. Moreover, some open source projects do not necessarily want us to know the genders of their members (but some do!), and companies might be sensitive to this topic as well.
Talking to People
One of the most popular ways of obtaining information about gender is asking the individuals themselves, as part of a survey or an interview. We should keep in mind, however, that the reliability of this method strongly depends on the ability of the respondents to understand the question and to find an answer corresponding to the way they see themselves.
In Asking Questions: The Definitive Guide to Questionnaire Design, Bradburn et al. [6] suggest recording the respondent's gender by asking, "What is your/NAME's sex?" and offering two answer options, male and female. This question conflates biological sex and socially constructed gender and reduces the spectrum of options to merely two. By now, however, it is well known that both the biological reality of sex and the social reality of gender are much more complex [17]. For example, a recent Stack Overflow survey indicates that 1.42% of software developers identify as non-binary, genderqueer, or gender nonconforming and 0.92% prefer to self-describe.Footnote 1 In a survey of the Linux Foundation, 4% of the respondents indicated their gender as "non-binary/third gender."Footnote 2 Surprisingly, in December 2018, the popular survey platform SurveyMonkey was still offering "female" and "male" as the only options for the "What is your gender?" question [43].
Hence, at the very least, the phrasing of the question about gender should reflect the existence of genders other than women and men. One of the simplest ways of phrasing such a question would be "Are you…male, female, something else? Specify ____." In December 2018, a similar phrasing was the default in Google Forms [43], and a similar question was used by Roberts et al. in a 2022 study of Australian adolescents' eating pathology [33]. Such a phrasing is profoundly problematic, as it expresses a preference toward "male" and "female," pushing all other gender identities outside of the norm. This process is known as othering [10, 47], "differentiating discourses that lead to moral and political judgments of superiority and inferiority between 'us' and 'them'" [12]. Moreover, by allowing the respondents to select only one answer option, this phrasing excludes people who are, for example, both women and non-binary. Finally, in the empirical evaluation performed by Bauer et al. [3], cisgender participants had no problems answering this question, but transgender participants tried to understand what exactly the researchers were asking and reached different conclusions: both transfeminine (assigned male at birth and identifying as women/non-binary) and transmasculine (assigned female at birth and identifying as men/non-binary) respondents gave all three possible answers (male, female, other), rendering the question useless. When used in interviews, this item was cognitively taxing for transgender participants [3].
The previous discussion suggests that (a) one should avoid referring to certain gender identities as “other”; (b) if answer options are provided, respondents should be able to select several options; and (c) the phrasing should explicitly refer to gender identity. Several proposals satisfying these requirements have been made in the literature. For example, Spiel et al. [43] recommend asking, “What is your gender?” with the following five checkboxes: “woman,” “man,” “non-binary,” “prefer not to disclose,” and “prefer to self-describe.” If the last option is checked, a free-form field opens up. Nikki Stevens, author of the Open Demographics project,Footnote 3 suggests phrasing this question as “Where do you identify on the gender spectrum?” followed by a list of 30 gender identities taken from The ABC’s of LGBT+ by Ashley Mardell [24], as well as “prefer not to answer” and “self-identify: ____.” One should be aware, however, that a lengthy list of gender identities might be experienced as confusing and take too much time if used as part of a larger survey.
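The requirements above (multiple selections allowed, an explicit self-describe option, no "other" category) can be captured in a small validation routine. The following is a minimal sketch, assuming the five answer options recommended by Spiel et al.; the names `GENDER_OPTIONS` and `validate_response` are illustrative and not taken from any existing library.

```python
# Illustrative sketch: validating a multi-select gender question following
# the recommendations discussed above. Not a real survey library.
GENDER_OPTIONS = {
    "woman",
    "man",
    "non-binary",
    "prefer not to disclose",
    "prefer to self-describe",
}

def validate_response(selected, self_description=""):
    """Accept any non-empty subset of the options; require free text
    only when 'prefer to self-describe' is checked."""
    selected = set(selected)
    if not selected or not selected <= GENDER_OPTIONS:
        return False
    if "prefer to self-describe" in selected and not self_description.strip():
        return False
    return True

# Respondents may select several options, e.g. both "woman" and "non-binary":
assert validate_response({"woman", "non-binary"})
assert validate_response({"prefer to self-describe"}, "genderfluid")
assert not validate_response({"prefer to self-describe"})  # free text missing
```

Allowing any subset of the checkboxes, rather than forcing a single choice, is what avoids excluding respondents who identify with more than one option.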
Instead of offering answer options, one might also ask an open question, as recommended by Scheuerman et al. in "HCI Guidelines for Gender Equity and Inclusivity."Footnote 4 In this case, researchers will have to code the responses manually, which is feasible only when the number of participants is modest, as is often the case for software engineering surveys. Moreover, open questions might elicit absurd reactions such as "bagel" or aggressive reactions such as "attack helicopter," the latter originating from a meme ridiculing non-binary gender identification [16].
Summary
When conducting interviews or small surveys where the risk of aggressive or absurd responses is deemed small, prefer an open question such as "Where do you identify on the gender spectrum?" For larger surveys, or surveys of populations that are more likely to provide absurd or aggressive responses, consider offering the following five checkboxes: "woman," "man," "non-binary," "prefer not to disclose," and "prefer to self-describe ____."
Mining Software Repository Data
Repository mining studies analyze contributions from tens of thousands [32] to tens of millions of individuals [35]. This wealth of data allows one to carefully distinguish fine-grained statistical effects or perform longitudinal studies spanning over 50 years. However, when analyzing these amounts of data, it is no longer feasible to contact every single individual and ask them about their gender identity. In the case of longitudinal studies, individuals might have retired or passed away; in the case of large-scale studies of contemporary software development practices, contacting tens or hundreds of software developers might be technically possible, but it will likely lead to community disengagement, threatening the already low response rates [41]. To address this challenge, multiple tools have been proposed to automatically obtain gender information from the way developers present themselves, for example, through their username or avatar, or from the artifacts they produce, such as source code or comments. These tools can be broadly classified as name-to-gender, face-to-gender, and artifact-to-gender. Many of them have not been designed with software engineering data in mind, yet software engineering data has its own peculiarities, which we discuss in the following.
Name-to-Gender
As Bradburn et al. [6] have put it, "[S]ometimes a person's gender is obvious from his or her name." Phrasing this more carefully, we can say that many cultures tend to associate specific names with specific genders: for example, Božidar is a Bulgarian name commonly given to men, while Nijolė is a Lithuanian name commonly given to women. At the same time, טל (Tal) is a Hebrew name that can be given to a child of any gender. In their most basic form, name-to-gender tools merely look up a given name in lists of names typically associated with women and men and return "woman," "man," or "unknown" depending on the relative prevalence of the name within a specific gender. Prevalence is sometimes used to express the degree of confidence of the tool in the gender inferred. For example, genderize.io states "male" with 0.99 confidence for "bozidar," "female" with 0.99 confidence for "nijole," and "male" with 0.68 confidence for "tal."
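The basic lookup scheme just described can be sketched in a few lines. The counts below are made-up illustrations, not real census data; production tools such as genderize.io rely on large name databases, but the prevalence-as-confidence logic is the same.

```python
# Illustrative sketch of a basic name-to-gender lookup.
# Counts are invented for illustration: given name -> (women, men).
NAME_COUNTS = {
    "nijole":  (990, 10),
    "bozidar": (5, 995),
    "tal":     (320, 680),
}

def name_to_gender(name, threshold=0.6):
    """Return ('woman'|'man'|'unknown', confidence), where confidence is
    the relative prevalence of the name within the majority gender."""
    counts = NAME_COUNTS.get(name.lower())
    if counts is None:
        return "unknown", 0.0
    women, men = counts
    confidence = max(women, men) / (women + men)
    if confidence < threshold:
        return "unknown", confidence
    return ("woman" if women > men else "man"), confidence

assert name_to_gender("Nijole") == ("woman", 0.99)
assert name_to_gender("Tal") == ("man", 0.68)
assert name_to_gender("Govindjee")[0] == "unknown"  # mononym not in the lists
```

Note how a name like Tal, common for any gender, still yields a (low-confidence) "man" answer unless the threshold is raised, which is exactly the kind of behavior discussed below.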
However, this kind of basic approach fails to take differences between cultures into account: for example, Andrea is more commonly associated with men in Italy and with women in Germany, while Karen is mostly associated with women in the United States and with men in Armenia. International collaboration means that the same software engineering project or the same software engineering dataset might involve contributors from different cultures, which requires more advanced name-to-gender tools to take the cultural background into account. genderComputer, which was designed to analyze Stack Overflow data, uses location as a proxy for cultural background [45]. However, less than 20% of the Stack Overflow users in the sample analyzed by Vasilescu et al. indicated their location, and the location indicated by users does not necessarily correspond to an actual geographic location (e.g., The Matrix) [45]. Moreover, using location as a proxy for national culture fails to take immigration-related effects into account.
This is why NamsorFootnote 5 uses the individual's surname as a proxy for national culture. This allows Namsor to infer that Andrea Rossini is more likely to be a man, while Andrea Parker is more likely to be a woman.Footnote 6 This also makes Namsor one of the most accurate name-to-gender tools [36, 38]. Closer inspection reveals a different story, however. Santamaria and Mihaljević [36] reported that the confidence of Namsor is almost perfect for European names, but the median confidence drops to 70% for Asian names. In particular, Eastern and Southeastern Asian names are difficult to genderize: half of the East Asian names have a confidence score of 0, indicating that Namsor is essentially guessing randomly. This Eurocentric bias is problematic when trying to apply automatic gender inference techniques to software developers: a recent Stack Overflow survey shows that almost one out of four software developers indicated an Asian region as their ethnic background; in particular, 4.2% of the respondents are East Asian and 4.39% Southeast Asian.Footnote 7 Recognizing this limitation of Namsor, Qiu et al. combined it with genderComputer and designed a classifier trained on public name lists and celebrity name lists [32]. The features of this classifier included the last character (e.g., in Spanish, names ending in a are usually associated with women), the last two characters (e.g., in Japan, names ending in ko are usually associated with women), and tri-grams and 4-grams to capture romanized Chinese, Japanese, and Korean names. The combined name-to-gender tool outperformed both genderComputer and Namsor: for example, its accuracy on Chinese names was 60%, as opposed to 7% for Namsor and 18% for genderComputer [32]. A similar, character-based approach has been combined with deep learning models by Hu et al. [13]. The work of Qiu et al. has also inspired the Namsor developers to further develop special techniques for ChineseFootnote 8 and JapaneseFootnote 9 names. Still, a 2022 study by Sebo shows that even for the current version of Namsor, the overall proportion of errors (misclassifications and non-classifications) is 53% [39]. Even more problematic, Namsor tends to perform worse for names associated with women than for those associated with men: 19.2% of the former were categorized correctly, as opposed to 66.5% of the latter [39].
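The character-based features used by classifiers of this kind (last character, last two characters, character tri-grams and 4-grams) are easy to extract. The sketch below shows only the feature extraction; the name `char_features` is illustrative, and the trained classifier that Qiu et al. feed such features into is omitted.

```python
# Illustrative sketch: character-based features for name genderization,
# in the spirit of the classifier of Qiu et al. (feature extraction only).
def char_features(name):
    name = name.lower()
    feats = {f"last1={name[-1:]}", f"last2={name[-2:]}"}
    # Boundary markers so that n-grams also capture prefixes and suffixes.
    padded = f"^{name}$"
    for n in (3, 4):
        feats.update(padded[i:i + n] for i in range(len(padded) - n + 1))
    return feats

# 'Hanako' ends in 'ko', commonly associated with women in Japan:
assert "last2=ko" in char_features("Hanako")
# 'Maria' ends in 'a', commonly associated with women in Spanish:
assert "last1=a" in char_features("Maria")
```

The n-grams are what let such a classifier pick up patterns in romanized Chinese, Japanese, and Korean names that simple name-list lookups miss.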
However, for all these improvements, Namsor cannot be applied if the individual is known by a mononym: for example, the Indian-American scientist and educator Govindjee is known by a single name only. This also limits the applicability of Namsor to datasets such as Stack Overflow: 43% of the Stack Overflow usernames do not use spaces and hence cannot be analyzed using Namsor. Since genderComputer was designed for Stack Overflow, it implements several heuristics targeting software developers. In particular, if the name cannot be easily split into first name(s) and last name(s), genderComputer assumes it is formatted according to common naming conventions for usernames (e.g., "johns" for "John Smith") [4] and restarts the genderization process (e.g., with "john" derived from "johns") [45].
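A minimal sketch of this username heuristic is shown below. `KNOWN_FIRST_NAMES` is a toy stand-in for the name lists a real tool would consult, and only one convention (first name plus last-name initial) is handled; genderComputer's actual rules are considerably more elaborate [45].

```python
# Illustrative sketch: deriving first-name candidates from a username,
# loosely following the genderComputer heuristic described above.
KNOWN_FIRST_NAMES = {"john", "maria", "tal"}  # toy stand-in for real name lists

def candidate_first_names(username):
    username = username.lower()
    if " " in username:                 # "John Smith" -> "john"
        return [username.split()[0]]
    candidates = []
    # Convention: first name followed by last-name initial, "johns" -> "john".
    if username[:-1] in KNOWN_FIRST_NAMES:
        candidates.append(username[:-1])
    if username in KNOWN_FIRST_NAMES:   # the handle may itself be a first name
        candidates.append(username)
    return candidates

assert candidate_first_names("johns") == ["john"]
assert candidate_first_names("John Smith") == ["john"]
```

Each candidate first name would then be fed back into the name-to-gender lookup, restarting the genderization process as described above.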
Another limitation of all the aforementioned approaches is their inability to take age into account. For example, in Pennsylvania, names such as Morgan and Robin that were predominantly associated with men in the past have evolved to being associated with people of any gender and, more recently, to being more commonly associated with women [2].
Summarizing the discussion of name-to-gender tools, we can say that multiple such tools have been developed by open source practitioners, academic researchers, and company-based software engineers. These tools tend to approximate cultural background by analyzing the location or the surname of the individual. While age might have affected the popularity of names among individuals of different genders, to the best of our knowledge, no currently available name-to-gender tool takes age into account.
To conclude this discussion, we list several examples of name-to-gender tools. As providing a complete overview of these tools would not be feasible, Table 28-1 lists only examples of name-to-gender tools that (a) are available at the moment of writing and (b) have been empirically evaluated in scientific publications other than the paper that introduced them.
Face-to-Gender
Another way developers present themselves on social platforms such as Stack Overflow and GitHub is by using avatars. This means that face recognition techniques such as Facelytics,Footnote 10 Face Analysis by Visage Technologies,Footnote 11 and PicPurifyFootnote 12 can be applied to avatars to identify the gender of the individuals depicted. Indeed, on the task of identifying the gender of Stack Overflow users based on their avatars, the face-to-gender tool Face++ has been shown to perform comparably to genderComputer [22], while on different avatar datasets, Face++, Amazon, and MS achieve more than 90% accuracy when identifying gender based on automatically detected faces [18]. However, not all faces can be correctly detected: in the study of Jung et al. [18], the very best tool correctly identified faces in merely 76% of the analyzed images. Moreover, not everybody has a profile picture representing a human face. For instance, approximately 30% of Stack Overflow users only have a default profile picture automatically generated from the MD5 hash of the user's email, leaving approximately 70% of Stack Overflow users possibly amenable to face-to-gender inference. However, not all Stack Overflow profile images represent faces (rather than logos or cat pictures). This is why Lin and Serebrenik carefully selected 900 non-generated profile images of users of different ages and reputations and classified them manually; reputation classes were selected according to the different privileges Stack Overflow users might have, and age intervals according to the general distribution of ages on Stack Overflow. Among the 900 profile images, only 53% represent faces [22], suggesting that overall, face-to-gender tools might be applicable to approximately 37% (= 70% × 53%) of Stack Overflow users.
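Since the default avatars mentioned above are Gravatar images generated from the MD5 hash of the email, custom avatars can be distinguished from generated ones using Gravatar's documented `d=404` parameter, which makes the service answer HTTP 404 when a user has uploaded no image. The following is a minimal sketch that only constructs the probe URL from the documented hashing scheme (trimmed, lowercased email); the actual HTTP request, and the function name `gravatar_probe_url`, are left as illustrative assumptions.

```python
import hashlib

# Illustrative sketch: building a Gravatar probe URL. Requesting this URL
# yields HTTP 404 when the user has no custom avatar (per the d=404
# parameter documented by Gravatar); the request itself is omitted here.
def gravatar_probe_url(email):
    digest = hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()
    return f"https://www.gravatar.com/avatar/{digest}?d=404"

url = gravatar_probe_url("Jane.Doe@example.com ")
assert url.startswith("https://www.gravatar.com/avatar/")
assert url.endswith("?d=404")
```

Filtering out generated defaults this way is what yields the roughly 70% of users whose avatars are even candidates for face detection.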
Artifact-to-Gender
Artifact-to-gender tools are based on the assumption that people of different genders express themselves differently in writing. Not surprisingly, the lion's share of the research in this area has been based on personal writing on social media such as tweets and Facebook posts [21]. For example, the technique of Company and Wanner [42] was designed in the first place for attributing the authorship of blog posts and novels to one of the authors within a predefined set and was then retrained to predict the gender of the author. Authorship attribution techniques have been designed for source code as well [11, 15]; similarly to Company and Wanner [42], they aim to associate code fragments with one author from a predefined set of approximately 100–160 candidates. This shows that deanonymization of source code is possible despite a much more constrained use of language compared to social media texts. This is why Naz and Rice applied similar techniques to predict the gender of code authors: on a dataset of 100 student assignments, their approach achieved an accuracy of 72% [29]. It remains to be seen whether these techniques can scale to the tens of thousands of contributors common in repository mining studies.
Limitations and Concerns
The automated techniques discussed in the previous sections show that gender-related information can be obtained from names, avatars, and the code and text developers write. However, we need to remember that these methods are far from perfect, and one has to be very careful when applying them.
The first group of concerns is ethical. Such concerns are mostly raised in relation to face-to-gender techniques, but they apply to any automated genderization method, as they relate to assigning categories to human beings without their explicit consent. While as humans we continuously assign categories to other people, for example, when describing them, automating this process can be dangerous: what if a tool recognizes a woman driving a car in a country where women are not allowed to drive? In fact, Nature surveyed approximately 500 researchers in facial recognition, computer science, and AI, and about two-thirds believe that applying facial recognition methods to recognize or predict personal characteristics (such as gender, sexual identity, age, or ethnicity) from appearance should be done only with the informed consent of those whose faces are used or after discussion with representatives of the groups that might be affected [31]. Getting informed consent from all GitHub or Stack Overflow developers is, of course, not realistic. Furthermore, individuals do not necessarily want to disclose their gender and sometimes take steps to hide it: one of the developers surveyed by Vasilescu et al. stated that they "have used a fake GitHub handle (my normal GitHub handle is my first name, which is a distinctly female name) so that people would assume I was male" [46]. In this case, "correct" genderization would explicitly contradict the individual's intention, which can hardly be seen as ethical.
The second concern is related to the gender binary assumption underlying the automatic techniques discussed previously. Two meta-studies illustrate how pervasive this assumption is: Keyes has shown that 92.9% of the papers introducing automatic face-to-gender tools assume gender binary, as do 96.7% of the papers that use automatic gender recognition [19]; for the artifact-to-gender literature surveyed by Krüger and Hermann, this percentage goes up to 100% [21]. Name-to-gender tools do a bit better: while they are still ignorant of non-binary genders, they at least tend to provide confidence scores, that is, they at least recognize their own lack of confidence [36]. Due to this gender binary assumption, automatic genderization tools can harm non-binary individuals as well as individuals with a limited ability to appear and be treated as their preferred gender [37].
Third, both the applicability and the accuracy of the techniques are imperfect. Restricted applicability might bias the conclusions of a study, since they are based only on the data that the tools could analyze. Moreover, applicability and accuracy can be even lower for some subcommunities, for example, for Chinese names, where some of the gender-specific information is lost during romanization.
All these reasons can lead to tools assigning an individual a gender that they do not agree with (e.g., because they do not want to disclose it, because this gender cannot be identified by the tool, or because the tool is imprecise), a problem known as misgendering, which can be seen as a form of verbal violence [26]. This is why we believe that (a) automated techniques should never be applied at the level of an individual subject but only at the level of large groups; (b) the techniques used should not show unequal performance on specific groups (e.g., if we know that name-to-gender techniques underperform on Asian names, conclusions based on applying these techniques to Asian names might be wrong); and (c) one has to continuously reflect on the potential risks of applying these techniques.
Summary
Automated tools are necessary when analyzing large-scale data. When using them, one should never apply them at the level of an individual subject but only at the level of large groups, and one should either ensure that the performance of the tools is equal across different subpopulations or recognize unequal performance as a threat to the validity of the conclusions derived.
Beyond Software Engineering
Several insights discussed previously can also be applied outside the realm of software engineering. As the guidelines related to interviews and surveys are borrowed from the field of Human-Computer Interaction, they can be expected to be applicable to any interview or survey seeking to collect information about gender. Similarly, the techniques discussed in the context of mining software repositories are applicable to the analysis of any large-scale archival data, ranging from social media sites such as Twitter and Facebook [20] to corpora of scientific publications [23], from a movie-related knowledge-sharing platform [25] to museum catalogs [44], and from Wikipedia [1] to collections of crowd-sourced recommendations [9]. Applying these techniques beyond software engineering might, however, require rethinking the aforementioned limitations and concerns, as their relevance and importance might depend on the application domain.
Summary
The techniques discussed in this chapter can be applied beyond software engineering, but doing so might require carefully rethinking the limitations and concerns described previously.
Conclusions
Gender and gender diversity are popular topics in contemporary software engineering research. To conduct this research, one has to identify the gender of the individuals involved. To this end, we have discussed two large groups of techniques for identifying a contributor's gender: asking questions and applying algorithmic tools. Neither group of techniques is perfect: questionnaires do not scale, and algorithmic tools guessing gender from GitHub information assume gender binary. The choice of technique should, of course, be made in function of the research questions one is trying to answer. However, it might be equally important to discuss the limitations and problems of the chosen techniques (and not only the advantages that made us choose them in the first place).
Summary
- For interview studies and small surveys, ask an open question: "Where do you identify on the gender spectrum?"
- For larger surveys, use the same phrasing and the following five checkboxes: "woman," "man," "non-binary," "prefer not to disclose," and "prefer to self-describe ____."
- When mining repositories, evaluate name-to-gender and face-to-gender tools and either ensure that the performance of the tools is equal across different subpopulations or recognize unequal performance as a threat to the validity of the conclusions derived.
Bibliography
David Bamman and Noah A. Smith. Unsupervised Discovery of Biographical Structure from Text. Transactions of the Association for Computational Linguistics, 2:363–376, October 2014.
Herbert Barry and Aylene S. Harper. Feminization of unisex names from 1960 to 1990. Names, 41(4):228–238, 1993.
Greta R. Bauer, Jessica Braimoh, Ayden I. Scheim, and Christoffer Dharma. Transgender-inclusive measures of sex/gender for population surveys: Mixed-methods evaluation and recommendations. PLOS ONE, 12(5):1–28, May 2017.
Christian Bird, Alex Gourley, Premkumar T. Devanbu, Michael Gertz, and Anand Swaminathan. Mining email social networks. In Stephan Diehl, Harald C. Gall, and Ahmed E. Hassan (editors), Proceedings of the 2006 International Workshop on Mining Software Repositories, MSR 2006, Shanghai, China, May 22–23, 2006, pages 137–143. ACM, 2006.
Hanjo D. Boekhout, Inge van der Weijden, and Ludo Waltman. Gender differences in scientific careers: A large-scale bibliometric analysis. CoRR, abs/2106.12624, 2021.
Norman M. Bradburn, Seymour Sudman, and Brian Wansink. Asking Questions: The Definitive Guide to Questionnaire Design – For Market Research, Political Polls, and Social and Health Questionnaires. Research Methods for the Social Sciences. Jossey-Bass, revised edition, 2004.
Nicolas Bérubé, Gita Ghiasi, Maxime Sainte-Marie, and Vincent Larivière. Wiki-Gendersort: Automatic gender detection using first names in Wikipedia. March 2020.
Jill Cao, Kyle Rector, Thomas H. Park, Scott D. Fleming, Margaret M. Burnett, and Susan Wiedenbeck. A debugging perspective on end-user mashup programming. In Christopher D. Hundhausen, Emmanuel Pietriga, Paloma Díaz, and Mary Beth Rosson (editors), IEEE Symposium on Visual Languages and Human-Centric Computing, VL/HCC 2010, Leganés-Madrid, Spain, September 21–25, 2010, Proceedings, pages 149–156. IEEE Computer Society, 2010.
Abhijnan Chakraborty, Johnnatan Messias, Fabricio Benevenuto, Saptarshi Ghosh, Niloy Ganguly, and Krishna Gummadi. Who makes trends? Understanding demographic biases in crowdsourced recommendations. Proceedings of the International AAAI Conference on Web and Social Media, 11(1):22–31, May 2017.
Ben Colliver, Adrian Coyle, and Marisa Silvestri. The online “othering” of transgendering and non-binary people. In Karen Lumsden and Emily Harmer (editors), Online Othering: Exploring the Dark Side of the Web. Palgrave Macmillan, 2019.
Edwin Dauber, Aylin Caliskan, Richard E. Harang, Gregory Shearer, Michael J. Weisman, Frederica Free-Nelson, and Rachel Greenstadt. Git blame who? Stylistic authorship attribution of small, incomplete source code fragments. Proc. Priv. Enhancing Technol., 2019(3):389–408, 2019.
Fred Dervin. Discourses of Othering, pages 43–55. Palgrave Macmillan UK, London, 2016.
Yifan Hu, Changwei Hu, Thanh Tran, Tejaswi Kasturi, Elizabeth Joseph, and Matt Gillingham. What’s in a name? Gender classification of names with character based machine learning models. Data Mining and Knowledge Discovery, 35(4):1537–1563, July 2021.
Nasif Imtiaz, Justin Middleton, Joymallya Chakraborty, Neill Robson, Gina R. Bai, and Emerson R. Murphy-Hill. Investigating the effects of gender bias on GitHub. In Joanne M. Atlee, Tevfik Bultan, and Jon Whittle (editors), Proceedings of the 41st International Conference on Software Engineering, ICSE 2019, Montreal, QC, Canada, May 25–31, 2019, pages 700–711. IEEE/ACM, 2019.
Aylin Caliskan Islam, Richard E. Harang, Andrew Liu, Arvind Narayanan, Clare R. Voss, Fabian Yamaguchi, and Rachel Greenstadt. De-anonymizing programmers via code stylometry. In Jaeyeon Jung and Thorsten Holz (editors), 24th USENIX Security Symposium, USENIX Security 15, Washington, DC, USA, August 12–14, 2015, pages 255–270. USENIX Association, 2015.
Samantha Jaroszewski, Danielle Lottridge, Oliver L. Haimson, and Katie Quehl. ”Genderfluid” or ”attack helicopter”: Responsible HCI research practice with non-binary gender variation in online communities. In Proceedings of the 2018 CHI Conference on Human Factors in Computing Systems, CHI ’18, page 115. Association for Computing Machinery, New York, NY, USA, 2018.
Joy L. Johnson and Robin Repta. Sex and gender: Beyond the binaries. In John L. Oliffe and Lorraine Greaves (editors), Designing and Conducting Gender, Sex, & Health Research, pages 17–38. SAGE Publications, Inc., Thousand Oaks, July 2012.
Soon-gyo Jung, Jisun An, Haewoon Kwak, Joni Salminen, and Bernard Jansen. Assessing the accuracy of four popular face recognition tools for inferring gender, age, and race. Proceedings of the International AAAI Conference on Web and Social Media, 12(1), June 2018.
Os Keyes. The misgendering machines: Trans/HCI implications of automatic gender recognition. Proc. ACM Hum.-Comput. Interact., 2(CSCW), November 2018.
Athanasios Kokkos and Theodoros Tzouramanis. A robust gender inference model for online social networks and its application to LinkedIn and Twitter. First Monday, 19(9), August 2014.
Stefan Krüger and Ben Hermann. Can an online service predict gender?: on the state-of-the-art in gender identification from texts. In Ivica Crnkovic, Karina Kohl Silveira, and Sara Sprenkle (editors), Proceedings of the 2nd International Workshop on Gender Equality in Software Engineering, GE@ICSE 2019, Montreal, QC, Canada, May 27, 2019, pages 13–16. IEEE/ACM, 2019.
Bin Lin and Alexander Serebrenik. Recognizing gender of stack overflow users. In Proceedings of the 13th International Conference on Mining Software Repositories, MSR ’16, pages 425–429. Association for Computing Machinery, New York, NY, USA, 2016.
Sarah Jill Mah, Mallika Makkar, Kathy Huang, Tharani Anpalagan, Clare J. Reade, and Julie My Van Nguyen. Gender imbalance in gynecologic oncology authorship and impact of COVID-19 pandemic. International Journal of Gynecologic Cancer, 32(5):583–589, 2022.
Ashley Mardell. The ABC’s of LGBT+. Mango Media Incorporated, 2016.
Antoine Mazières, Telmo Menezes, and Camille Roth. Computational appraisal of gender representativeness in popular movies. Humanities and Social Sciences Communications, 8(1):137, June 2021.
Chan Tov McNamarah. Misgendering. California Law Review, 109(6), December 2021.
David Arroyo Menéndez, Jesús M. González-Barahona, and Gregorio Robles. Damegender: Writing and comparing gender detection tools. In SATToSE, 2020.
Jörg Michael. 40000 Namen, Anredebestimmung anhand des Vornamens. c’t, (17):182–183, 2007.
Fariha Naz and Jacqueline E. Rice. Sociolinguistics and programming. In IEEE Pacific Rim Conference on Communications, Computers and Signal Processing, PACRIM 2015, Victoria, BC, Canada, August 24–26, 2015, pages 74–79. IEEE, 2015.
Ehsan Noei and Kelly Lyons. A study of gender in user reviews on the Google Play Store. Empirical Software Engineering, 27(2):34, December 2021.
Richard Van Noorden. The ethical questions that haunt facial-recognition research. Nature, 2020.
Huilian Sophie Qiu, Alexander Nolte, Anita Brown, Alexander Serebrenik, and Bogdan Vasilescu. Going farther together: The impact of social capital on sustained participation in open source. In 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE), pages 688–699, 2019.
Savannah R. Roberts, Phillipa Hay, Kay Bussey, Nora Trompeter, Alexandra Lonergan, and Deborah Mitchison. Associations among relationship status, gender, and sexual attraction in Australian adolescents’ eating pathology. International Journal of Eating Disorders, n/a(n/a).
Gema Rodríguez-Pérez, Reza Nadri, and Meiyappan Nagappan. Perceived diversity in software engineering: a systematic literature review. Empir. Softw. Eng., 26(5):102, 2021.
Davide Rossi and Stefano Zacchiroli. Worldwide gender differences in public code contributions and how they have been affected by the COVID-19 pandemic. In 44th IEEE/ACM International Conference on Software Engineering: Software Engineering in Society ICSE (SEIS) 2022, Pittsburgh, PA, USA, May 22–24, 2022, pages 172–183. IEEE, 2022.
Lucía Santamaría and Helena Mihaljević. Comparison and benchmark of name-to-gender inference services. PeerJ Computer Science, 4:e156, 2018.
Morgan Klaus Scheuerman, Jacob M. Paul, and Jed R. Brubaker. How computers see gender: An evaluation of gender classification in commercial facial analysis services. Proc. ACM Hum.-Comput. Interact., 3(CSCW), November 2019.
Paul Sebo. Performance of gender detection tools: a comparative study of name-to-gender inference services. Journal of the Medical Library Association, 109(3):414–421, 2021.
Paul Sebo. How accurate are gender detection tools in predicting the gender for Chinese names? a study with 20,000 given names in Pinyin format. Journal of the Medical Library Association, 110(2):205–211, 2022.
Zohreh Sharafi, Zéphyrin Soh, Yann-Gaël Guéhéneuc, and Giuliano Antoniol. Women and men – different but equal: On the impact of identifier style on source code reading. In Dirk Beyer, Arie van Deursen, and Michael W. Godfrey (editors), IEEE 20th International Conference on Program Comprehension, ICPC 2012, Passau, Germany, June 11–13, 2012, pages 27–36. IEEE Computer Society, 2012.
Edward K. Smith, Robert T. Loftin, Emerson R. Murphy-Hill, Christian Bird, and Thomas Zimmermann. Improving developer participation rates in surveys. In 6th International Workshop on Cooperative and Human Aspects of Software Engineering, CHASE 2013, San Francisco, CA, USA, May 25, 2013, pages 89–92. IEEE Computer Society, 2013.
Juan Soler Company and Leo Wanner. On the role of syntactic dependencies and discourse relations for author and gender identification. Pattern Recognit. Lett., 105:87–95, 2018.
Katta Spiel, Oliver L. Haimson, and Danielle Lottridge. How to do better with gender on surveys: A guide for HCI researchers. Interactions, 26(4):62–65, June 2019.
Chad M. Topaz, Bernhard Klingenberg, Daniel Turek, Brianna Heggeseth, Pamela E. Harris, Julie C. Blackwood, C. Ondine Chavoya, Steven Nelson, and Kevin M. Murphy. Diversity of artists in major US museums. PLOS ONE, 14(3):1–15, March 2019.
Bogdan Vasilescu, Andrea Capiluppi, and Alexander Serebrenik. Gender, representation and online participation: A quantitative study. Interacting with Computers, 26(5):488–511, 2014.
Bogdan Vasilescu, Vladimir Filkov, and Alexander Serebrenik. Perceptions of diversity on GitHub: A user survey. In Andrew Begel, Rafael Prikladnicki, Yvonne Dittrich, Cleidson R. B. de Souza, Anita Sarma, and Sandeep Athavale (editors), 8th IEEE/ACM International Workshop on Cooperative and Human Aspects of Software Engineering, CHASE 2015, Florence, Italy, May 18, 2015, pages 50–56. IEEE Computer Society, 2015.
Jock Young. The Vertigo of Late Modernity. SAGE Publications, 2007.
Rights and permissions
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.
The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
Copyright information
© 2024 The Author(s)
Cite this chapter
Serebrenik, A. (2024). How to Ask About Gender Identity of Software Engineers and “Guess” It from the Archival Data. In: Damian, D., Blincoe, K., Ford, D., Serebrenik, A., Masood, Z. (eds) Equity, Diversity, and Inclusion in Software Engineering. Apress, Berkeley, CA. https://doi.org/10.1007/978-1-4842-9651-6_28
DOI: https://doi.org/10.1007/978-1-4842-9651-6_28
Publisher Name: Apress, Berkeley, CA
Print ISBN: 978-1-4842-9650-9
Online ISBN: 978-1-4842-9651-6