Paper on Impact of Sample Sizes on Calculating Pillai Scores in Journal of the Acoustical Society of America

One of our lab co-directors, Dr. Betsy Sneller, recently had a paper published in the Journal of the Acoustical Society of America. The article is titled “Sample size matters in calculating Pillai scores”, and it is authored by Joey Stanley (Brigham Young University) and Betsy Sneller.

“The article takes a look at how sociolinguists measure mergers in pronunciation (like when people pronounce “Don” and “Dawn” the same). We provide some suggestions for how to handle datasets of different sizes, including running through a small case study analyzing vowel mergers in conversational speech compared with a word list.”

Betsy Sneller


Jack Rechsteiner accepted into the Linguistics PhD program at the University of Pittsburgh

Jack Rechsteiner (MA Linguistics) has accepted a funded PhD position in the Department of Linguistics at the University of Pittsburgh, starting Fall 2023.

Jack received their B.A. in Linguistics from Michigan State University in 2021 and is currently a second-year MA student in the Linguistics program at MSU. Their research focuses primarily on sociophonetic variation in nonbinary speakers.

Jack’s interests span many fields: sociolinguistics, data analysis, and natural language processing. They are particularly passionate about understanding the inner workings of language and the interplay between language and society, as well as how insights into these topics can be applied to other areas.

Jack shared some thoughts on why they chose the program at the University of Pittsburgh and their goals for the coming years:

I’m excited to study linguistics at the University of Pittsburgh because its linguistics department considers applied and descriptive methods to be equally important in examining the intersections of language, culture, and society. My research focuses on applying linguistic theory to identity and gender, and studying at the University of Pittsburgh would allow me to learn from linguists who have done great work on gender identity in language and the ways that social meanings become attached to linguistic variation. My goal is to become a researcher and professor who works to support a diverse range of backgrounds in academia while producing societally relevant research and communicating it with the community at large, and the University of Pittsburgh presents a great opportunity for pursuing that path. 

Jack has also received a grant from the MSU Multilingual Lab to attend the Lavender Languages Institute this summer.

Congratulations, Jack!


Mikayla Thompson accepted for NSF Research Experience for Undergraduates

Socio Lab member Mikayla Thompson (Linguistics major) has been accepted for a competitive NSF Research Experience for Undergraduates (NSF-REU) opportunity this summer.

Mikayla will spend 8 weeks at the University of Oregon as part of its initiative “Increasing American Indian/Alaska Native Perspectives in Field and Experimental Linguistics”. The REU includes instruction on topics in descriptive linguistics and experimental linguistics, hands-on research in two labs, and input from local Indigenous educators and researchers.

Mikayla shared why she chose this program and her goals and hopes:

“This opportunity to study language revitalization methods at the University of Oregon stood out to me initially because of the particular nature of the classes and research. The focus on language revitalization processes in relation to my compiled knowledge of linguistics is exactly what I would like to do post-graduation. I intend to utilize the knowledge presented in these classes and fieldwork to better inform myself of methods of preserving and reviving Indigenous American languages. As a descendant of the Cherokee Nation, I know quite intimately the degree to which language repression and subsequent language endangerment has influenced Indigenous communities, and what it means for the future. I hope to apply what is learned at the University of Oregon to my own communities, so that I may more deeply familiarize myself with my ancestral language, Cherokee, and to eventually pass it down to others in my communities.”


Newt Kelbley accepted into the Forensic Linguistics MA program at Cardiff University

Socio Lab member Newt Kelbley (BA Linguistics) has been accepted into the Forensic Linguistics MA program at Cardiff University in Wales. Congratulations, Newt!

Newt is a Linguistics major investigating the syntax of sociolinguistic prompt questions in the MI Diaries project. Newt will start the one-year MA program in the fall of 2023. They shared why they want to study Forensic Linguistics at Cardiff University:

“I want to study there because it is one of the few places that have such a specific degree program, and I want to know more about the interface between language and law. I’m interested in this because this branch of linguistics is still growing and reaching its potential, and the applications seem unlimited. Mostly what’s appealing is what I’ve learned about the work of forensic linguists seeking to critically highlight problems in the judicial system, like comprehension challenges in jury texts, inadequate courtroom translations, or falsified written documents. Using research to inform and enhance the practice of law and make it fairer for the disadvantaged seems like a noble goal.”


Colloquium talk: Dr. Tsung-Lun Alan Wan


Dr. Tsung-Lun Alan Wan is joining us to give a colloquium talk this spring! Please see details of the talk below.

Dr. Tsung-Lun Alan Wan received his PhD from the University of Edinburgh and is a postdoctoral researcher in medical humanities at National Cheng Kung University. He will be presenting his work on agentive language use among deaf or hard-of-hearing speakers in Taiwan.

Time: Friday, April 21, 2023, 8:45-10:45pm Eastern Time

Event: Virtual via Zoom

Abstract: To be announced

If you are interested in joining the talk, please email Yongqing for the Zoom link.


Summer Research Opportunities

MI Diaries Research Experience for Undergraduates (REU) 2023

Are you interested in how people tell the stories of their community? 

Or in how the pandemic might have affected the way people speak?

Do you want to gain some research experience?

Apply to join us in summer 2023 at Michigan State Sociolinguistics Lab!

Click here for more information about the MI Diaries Summer 2023 Research Experience for Undergraduates on our project website!

Click here to watch the informal webinar with a presentation by Dr. Betsy Sneller on the details of the MI Diaries Summer 2023 Research Experience for Undergraduates — what it is, how to apply, and Q&A.

NSF Research Experience for Undergraduates (REU)

For students looking for a full-time paid experience, we offer a summer Research Experience for Undergraduates (REU). MI Diaries is a National Science Foundation funded project. We especially encourage students from historically underrepresented groups and/or minority-serving institutions to apply.

  • Location: The Sociolinguistics Lab at Michigan State University’s East Lansing, MI campus.
  • Eligibility: US citizens registered as undergraduate students in Summer 2023 (depending on the institution, this may include incoming freshmen).
  • Duration: 8 weeks in the summer (June 5 – July 28, 2023).
  • Pay: $600 per week for 30 hours of work per week.
  • Background: Students do not need prior linguistics experience to apply!

Dan Villarreal talk November 3 on auto-coding

Dr. Dan Villarreal (University of Pittsburgh) is visiting the Sociolinguistics Lab in early November. He’ll be giving a talk, open to the public, on Thursday November 3, 2022. Dan’s presentation is of special interest to us because it’s about automating analyses of large-scale datasets. As we build a corpus of Michigan speech in the MI Diaries project, we’ve been using automatic speech recognition (ASR) to speed up our transcription time, and working with MSU’s Institute for Cyber-Enabled Research (ICER) to move some of our data processing to their supercomputer.

Dr. Villarreal is also giving a talk to the SoConDi group at the University of Michigan on Nov 4, 2022, 3-4pm. If you are interested in joining that talk, please contact Yongqing Ye or Suzanne Wagner for the Zoom link.

Sociolinguistic auto-coding: Applications and pitfalls

Dan Villarreal, University of Pittsburgh

Time: Thursday, Nov 3, 4:30-6:15pm

Location: Wells Hall B342 and on Zoom

Zoom: Meeting ID 984 1836 0065, passcode: sociolab.

Researchers in sociophonetics and variationist sociolinguistics have increasingly turned to computational methods to automate time-consuming research tasks such as data extraction (e.g., Fromont & Hay 2012), phonetic alignment (e.g., McAuliffe et al. 2017), and accurate vowel measurement (e.g., Barreda 2021). In this talk, I discuss the advantages and challenges of using sociolinguistic auto-coding (SLAC), a method in which machine learning classifiers assign variants to variable data (Kendall et al. 2021; McLarty, Jones & Hall 2019; Villarreal et al. 2020; Villarreal under review). 

Villarreal et al. (2020) trained random forest classifiers of two sociolinguistic variables of New Zealand English, non-prevocalic /r/ (varying between Present vs. Absent) and intervocalic medial /t/ (Voiced vs. Voiceless), using over 4,000 previously hand-coded tokens (per variable). Cross-validation revealed accuracy rates of 84.5% for /r/ and 91.8% for /t/. In addition to binary predictions, these auto-coders calculate classifier probabilities: the likelihood that a given /r/ token was Present, or a /t/ token was Voiced. In a listening experiment in which 11 phonetically trained listeners coded 60 /r/ tokens, we found a significant positive linear relationship between classifier probability and human judgments; this indicates that classifier probability successfully captures listeners’ perception of phonetically gradient rhoticity. Finally, auto-coders can report which features were most important in classification, helping to shed light on acoustically complex variables like /r/. In short, SLAC can be used for at least three specific functions: binary coding, gradient ‘coding’, and feature selection. 

As with other machine learning (ML) methods, however, there are inherent concerns about SLAC’s fairness, that is, whether it generates equally valid predictions for different speaker groups (e.g., Koenecke et al. 2020). First, given that there are multiple definitions of ML fairness that are mutually incompatible (Berk et al. 2018; Corbett-Davies et al. 2017; Kleinberg et al. 2017), fairness metrics must be decided upon within individual research domains; I argue for three fairness metrics relevant to the domain of sociolinguistic auto-coding. Second, I re-analyze Villarreal et al.’s (2020) /r/ auto-coder for fairness; I find poor performance on all three fairness metrics, with women’s tokens coded more accurately than men’s (88.8% vs. 81.4%). Third, to remedy these imbalances, I used the same data to test a variety of unfairness-mitigation strategies from the ML fairness literature; I find substantial improvement with respect to fairness, albeit at the expense of predictive performance.

Given these fairness issues, I reconsider SLAC under Markl’s (2022) premise that some speech and language technologies are too inherently flawed to use. I argue that while SLAC does not fit into this category, its potential users and consumers deserve a “warts and all” awareness of its drawbacks. To that end, I close with concrete recommendations for using SLAC in large-scale research projects. 
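
For readers new to fairness auditing, the kind of per-group accuracy comparison mentioned in the abstract (women’s vs. men’s tokens) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `accuracy_by_group` and the eight tokens are hypothetical, invented for this example, and are not Dr. Villarreal’s code, data, or fairness metrics.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Return {group: accuracy} for (group, hand_code, auto_code) triples.

    Hypothetical helper: a token counts as correct when the auto-coder's
    variant matches the human coder's variant.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, hand_code, auto_code in records:
        total[group] += 1
        if hand_code == auto_code:
            correct[group] += 1
    return {group: correct[group] / total[group] for group in total}


# Made-up tokens: (speaker group, hand-coded variant, auto-coded variant)
tokens = [
    ("women", "Present", "Present"),
    ("women", "Absent", "Absent"),
    ("women", "Absent", "Absent"),
    ("women", "Present", "Absent"),
    ("men", "Present", "Absent"),
    ("men", "Absent", "Absent"),
    ("men", "Absent", "Present"),
    ("men", "Present", "Present"),
]

scores = accuracy_by_group(tokens)
# Difference between the best- and worst-served groups
gap = max(scores.values()) - min(scores.values())
print(scores, gap)
```

A gap near zero would suggest the auto-coder serves both groups about equally well on this one measure; other fairness definitions can disagree with overall accuracy, so a single number like this is a starting point, not a verdict.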


Barreda, Santiago. 2021. Fast Track: fast (nearly) automatic formant-tracking using Praat. Linguistics Vanguard 7(1). 

Fromont, Robert & Jennifer Hay. 2012. LaBB-CAT: An annotation store. Proceedings of Australasian Language Technology Association Workshop 113–117. 

Kendall, Tyler, Charlotte Vaughn, Charlie Farrington, Kaylynn Gunter, Jaidan McLean, Chloe Tacata & Shelby Arnson. 2021. Considering performance in the automated and manual coding of sociolinguistic variables: Lessons from variable (ING). Frontiers in Artificial Intelligence 4(43). 

Markl, Nina. 2022. Language variation and algorithmic bias: Understanding algorithmic bias in British English automatic speech recognition. In 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22), 521–534. New York, NY, USA: Association for Computing Machinery. 

McAuliffe, Michael, Michaela Socolof, Sarah Mihuc, Michael Wagner & Morgan Sonderegger. 2017. Montreal Forced Aligner: Trainable text-speech alignment using Kaldi. In. 

McLarty, Jason, Taylor Jones & Christopher Hall. 2019. Corpus-based sociophonetic approaches to postvocalic r-lessness in African American Language. American Speech 94. 

Villarreal, Dan. under review. Sociolinguistic auto-coding has fairness problems too: Measuring and mitigating bias. Linguistics Vanguard.

Villarreal, Dan, Lynn Clark, Jennifer Hay & Kevin Watson. 2020. From categories to gradience: Auto-coding sociophonetic variation with random forests. Laboratory Phonology 11(6). 1–31. 


Colloquium talk: Dr. Annette D’Onofrio

Dr. Annette D’Onofrio is joining us to give a colloquium talk this fall! Please see details of the talk below.

Dr. Annette D’Onofrio is an Assistant Professor in the Linguistics Department at Northwestern University. She will present her work on the Chicagoland project, style, and personae.

Time: Thursday, September 15, 2022, 4:30-6:15pm Eastern Time

Event: In-person and Zoom

Talk Abstract

Locating sound change reversal: Racialized and age-based patterns of the Northern Cities Shift in a Chicago community

While dialectological work once indicated that American English regional dialects were becoming increasingly disparate over time (e.g. Labov 2014), recent sociolinguistic studies are revealing the opposite trend in some regions, showing movement away from regionally distinctive language features (e.g. Prichard & Tamminga 2012, Dodsworth & Kohn 2012). Specifically, the Inland North region’s characteristic Northern Cities Vowel Shift (NCS), which had been advancing throughout the 20th century (Labov 2007), has begun to reverse its trajectory in some Inland North locales (Driscoll & Lape 2015; Wagner et al. 2016), including in Chicago (McCarthy 2011, Durian & Cameron 2019). In this talk, I explore the ways in which NCS reversal is socially conditioned in one Chicago neighborhood area. I demonstrate how both broader sociohistorical dynamics of migration and racialization, as well as highly localized oppositions and ideologies, inform patterns of vocalic change in this neighborhood.


PhD Research Assistantship with MI Diaries

Thinking about PhD studies in language variation and change? 

Want to work on a big linguistic data collection project from your very first semester? 

Interested in five years of funding? 

Apply to Michigan State University’s Linguistics PhD program!  

Come to the Sociolinguistics Lab at Michigan State University! The MSU Linguistics PhD provides a generous 5 years of funding, including a stipend, health insurance, and tuition. First-year PhD students work part-time as Research Assistants (RAs). The MI Diaries project would love to recruit a strong RA with a research interest in language variation and change to help with our longitudinal study of self-recorded “audio diaries” from hundreds of people across the state. Become involved with everything from project management, community outreach, data analysis, recruitment, and mentoring undergraduates and youth interns to developing best practices for eliciting speech from a broad range of participants. Work closely with our faculty, Prof. Betsy Sneller and Prof. Suzanne Wagner, and with our team of students and other collaborators. Get started on your own related project, so that you’ll have a great foundation for building the research skills you’ll need for your PhD career and beyond.

Apply here by November 30, 2022 for full consideration for Fall 2023 admission.

Grad student testimonials 

MSU Linguistics graduate students have had great experiences with MI Diaries.

“Being involved with the MI Diaries project has enhanced my graduate school experience because it has given me the chance to work on a large-scale collaborative research project. Thanks to this project, I’ve been able to gain knowledge and experiences that can be applied to my own research that I would not have been able to acquire on my own. Working with the MI Diaries has also been incredibly enriching because it has provided me with so many opportunities to deepen my connections with other students and faculty in the department in a professional, but enjoyable setting. It’s also been a great opportunity to mentor undergraduate students and high school students on participating in an academic project and performing linguistic research which has been a personally fulfilling experience.”

Jack Rechsteiner

“I am able to get hands-on experience of nearly every aspect of a research project: collaboration with faculty and students, mentoring, public outreach, writing, turning research ideas into conference presentations and papers, etc. I am grateful for the professional development opportunities this project offers, as well as all the wonderful personal connections I made working with people in this project.”

Yongqing Ye

“If you are a student interested in sociolinguistics who thrives in a supportive, tight-knit departmental community, continuing your education at MSU is a wonderful choice. In my time here so far, I have not only enjoyed the instruction and guidance of a host of brilliant scholars – including two world-class sociolinguists doing research on the cutting edge – I have also been embedded in one of the most innovative and largest-scale sociolinguistics projects being conducted today. Even after just a year of working in MI Diaries, my knowledge of sociolinguistics, and my ability to both approach research in an ethical, community-conscious manner as well as to operate within a big team of faculty and fellow students, have increased drastically.”

Adam Barnhardt


Talk on language choice in Ukraine

The lab’s Visiting Research Scholar, Dr. Irina Zaykoskaya, gave a talk at MSU on April 18, 2022, titled “When native language is a matter of choice: The linguistic situation in Ukraine before and during the War”. Irina provided some background on multilingualism in Ukraine and on historical and 21st-century attitudes to the Ukrainian language, and closed by discussing the phenomenon of language rejection. Anecdotal evidence suggests that since Russia’s recent invasion of Ukraine, some Ukrainians have symbolically given up speaking Russian out of resistance or disgust. Irina compared this with German-speaking Holocaust refugees in the early 20th century who similarly gave up their language and in some cases lost it altogether. Irina touched on the ethics of gathering data from traumatized individuals, and cautioned that we cannot know the true linguistic situation in Ukraine at this time.

The talk was co-hosted by the MSU Sociolinguistics Lab and the MSU Language Policy and Practice Lab. It was delivered in a hybrid format. We were delighted that so many people could join via Zoom, in addition to the audience in Wells Hall. The talk abstract is below, and the slides can be found here.


Ukraine is a large and multilingual country, with Ukrainian and Russian especially dominating its linguistic landscape for decades. However, not only are the statuses of these languages different (i.e., Ukrainian being the official state language and Russian currently not having any formal status), but the attitudes towards them among the Ukrainian people differ as well. Even before the Russian attack on Ukraine on February 24, 2022, Ukrainians, including those from the Eastern, historically considered Russian-speaking parts of the country, would demonstrate symbolic preference for Ukrainian over Russian: for example, in a 2020 poll, only 21.8% of Eastern Ukrainians admitted speaking Ukrainian at home but 44.3% of the same respondents named it as their native language, which implies the view of one’s native language as a matter of choice rather than a matter of chance. Now, Russian-speaking Twitter is getting flooded by tweets like “I want lightning to strike me so that I forget the Russian language”. This talk will present an overview of historical events and policies that led to the current linguistic situation in Ukraine as compared to a few other post-Soviet countries, such as Belarus and Latvia. It will also attempt to capture the ongoing shift in attitudes among Ukrainians, from recognizing Russian as the language the enemies speak to perceiving it as the essence of the enemy.
