Dr. Gareth Roberts Colloquium Talk on Investigating Sociolinguistic Indexicality

Dr. Gareth Roberts was invited to give an in-person talk on Thursday, March 16th, as part of this year's Linguistics colloquium series. Dr. Roberts is an associate professor in Linguistics at the University of Pennsylvania and a former co-author of Dr. Sneller, one of the Socio Lab co-directors here at MSU. It was great to have Dr. Roberts here!


MI Diaries at 2023 MSU Science Festival

MI Diaries, run by the Sociolinguistics Lab at MSU, partnered with Inquiry Arts as part of the STEAM Expo Day at the MSU Science Festival on April 1st and 2nd, 2023. We invited Michigan residents and visitors to reflect on their lives, curiosities, and hopes for the future while learning about sociolinguistic research. In addition to providing information about MI Diaries, we shared some of our featured participant stories with Science Festival attendees and welcomed them to record their own stories!


Colloquium talk: Dr. Tsung-Lun Alan Wan

 

Dr. Tsung-Lun Alan Wan is joining us to give a colloquium talk this spring! Please see details of the talk below.

Dr. Tsung-Lun Alan Wan received his PhD from the University of Edinburgh and is a postdoctoral researcher in medical humanities at National Cheng Kung University. He will be presenting his work on agentive language use among deaf or hard-of-hearing speakers in Taiwan.

Time: Friday, April 7, 2023, 8:30-10:30am Eastern Time

Event: Virtual via Zoom

Abstract:

Deaf identity and style-shifting in read speech


Within a medical discourse of disability, deaf ways of speaking spoken languages are approached from pathological perspectives. In this talk, I instead focus on speaker agency among deaf speakers of Taiwan Mandarin in utilizing speech style-shifting to perform hearingness/deafness. Looking at the linguistic variable ㄕ sh /ʂ/, in the first part of the talk, I will emphasize the importance of identifying indexical fields of variants from the perspectives of deaf speakers. In the second part of the talk, I will look at topic-based shifting which takes place when deaf speakers read aloud a passage about the oppression of deaf signers by hearing people. The data show that even if the participants argue deaf speakers should conform to hearing ways of speaking Mandarin, some of them shift to deaf ways of realizing the variable when engaging with the identity politics topic, while others instead shift to hearing ways of realizing the variable. I argue that this difference in topic effect is mobilized by different stances toward the content of the passage, and the stance-taking is mediated by the presence of a hearing interviewer.

If you are interested in joining the talk, please email Yongqing (yeyongqi@msu.edu) for the Zoom link.


Dan Villarreal talk November 3 on auto-coding

Dr. Dan Villarreal (University of Pittsburgh) is visiting the Sociolinguistics Lab in early November. He’ll be giving a talk, open to the public, on Thursday, November 3, 2022. Dan’s presentation is of special interest to us because it’s about automating analyses of large-scale datasets. As we build a corpus of Michigan speech in the MI Diaries project, we’ve been using automatic speech recognition (ASR) to speed up transcription, and working with MSU’s Institute for Cyber-Enabled Research (ICER) to move some of our data processing to their supercomputer.

Dr. Villarreal is also giving a talk to the SoConDi group at University of Michigan on Nov 4th, 2022, 3-4pm. If you are interested in joining that talk, please contact Yongqing Ye (yeyongqi@msu.edu) or Suzanne Wagner (wagnersu@msu.edu) for the Zoom link.

Sociolinguistic auto-coding: Applications and pitfalls

Dan Villarreal, University of Pittsburgh

Time: Thursday, Nov 3, 4:30-6:15pm

Location: Wells Hall B342 and on Zoom

Zoom link: https://msu.zoom.us/j/98418360065, Meeting ID: 984 1836 0065, passcode: sociolab

Researchers in sociophonetics and variationist sociolinguistics have increasingly turned to computational methods to automate time-consuming research tasks such as data extraction (e.g., Fromont & Hay 2012), phonetic alignment (e.g., McAuliffe et al. 2017), and accurate vowel measurement (e.g., Barreda 2021). In this talk, I discuss the advantages and challenges of using sociolinguistic auto-coding (SLAC), a method in which machine learning classifiers assign variants to variable data (Kendall et al. 2021; McLarty, Jones & Hall 2019; Villarreal et al. 2020; Villarreal under review). 

Villarreal et al. (2020) trained random forest classifiers of two sociolinguistic variables of New Zealand English, non-prevocalic /r/ (varying between Present vs. Absent) and intervocalic medial /t/ (Voiced vs. Voiceless), using over 4,000 previously hand-coded tokens (per variable). Cross-validation revealed accuracy rates of 84.5% for /r/ and 91.8% for /t/. In addition to binary predictions, these auto-coders calculate classifier probabilities: the likelihood that a given /r/ token was Present, or a /t/ token was Voiced. In a listening experiment in which 11 phonetically trained listeners coded 60 /r/ tokens, we found a significant positive linear relationship between classifier probability and human judgments; this indicates that classifier probability successfully captures listeners’ perception of phonetically gradient rhoticity. Finally, auto-coders can report which features were most important in classification, helping to shed light on acoustically complex variables like /r/. In short, SLAC can be used for at least three specific functions: binary coding, gradient ‘coding’, and feature selection. 
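The difference between binary coding and gradient 'coding' described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the probabilities are invented toy values, the function name `binary_code` is hypothetical, and the 0.5 cutoff is an assumption rather than a threshold stated in the abstract.

```python
# Toy illustration only: these probabilities are invented, not taken from the study.
# Each value stands for a classifier's estimated probability that an /r/ token is Present.
toy_probs = [0.92, 0.55, 0.48, 0.07]


def binary_code(prob_present, threshold=0.5):
    """Turn a classifier probability into a categorical variant.

    The 0.5 threshold is an assumption made for this sketch.
    """
    return "Present" if prob_present >= threshold else "Absent"


# Binary coding: one categorical label per token.
binary_codes = [binary_code(p) for p in toy_probs]

# Gradient 'coding': keep the probability itself as a continuous measure.
gradient_codes = list(toy_probs)

for prob, label in zip(gradient_codes, binary_codes):
    print(f"probability={prob:.2f} -> {label}")
```

In this sketch the two middle tokens receive different binary labels even though their probabilities are close, which is the kind of information gradient 'coding' preserves.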

As with other machine learning (ML) methods, however, there are inherent concerns about SLAC’s fairness, that is, whether it generates equally valid predictions for different speaker groups (e.g., Koenecke et al. 2020). First, given that there are multiple definitions of ML fairness that are mutually incompatible (Berk et al. 2018; Corbett-Davies et al. 2017; Kleinberg et al. 2017), fairness metrics must be decided upon within individual research domains; I argue for three fairness metrics relevant to the domain of sociolinguistic auto-coding. Second, I re-analyze Villarreal et al.’s (2020) /r/ auto-coder for fairness; I find poor performance on all three fairness metrics, with women’s tokens coded more accurately than men’s (88.8% vs. 81.4%). Third, to remedy these imbalances, I used the same data to test a variety of unfairness-mitigation strategies from the ML fairness literature; I find substantial improvement with respect to fairness, albeit at the expense of predictive performance. 
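One simple way to check whether an auto-coder is equally accurate across speaker groups is to compare its agreement with hand-coding group by group. The Python sketch below assumes that approach. It is not necessarily one of the three fairness metrics argued for in the talk, the function name `accuracy_by_group` is hypothetical, and the tokens are invented toy data rather than the New Zealand English data.

```python
from collections import defaultdict


def accuracy_by_group(records):
    """Compute auto-coder accuracy separately for each speaker group.

    records: iterable of (group, hand_code, auto_code) tuples.
    Returns a dict mapping each group to its share of matching codes.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, hand_code, auto_code in records:
        total[group] += 1
        correct[group] += int(hand_code == auto_code)
    return {group: correct[group] / total[group] for group in total}


# Invented toy tokens: (speaker group, hand-coded variant, auto-coded variant).
toy_tokens = [
    ("F", "Present", "Present"),
    ("F", "Absent", "Absent"),
    ("F", "Absent", "Absent"),
    ("F", "Present", "Absent"),
    ("M", "Present", "Absent"),
    ("M", "Absent", "Absent"),
    ("M", "Present", "Absent"),
    ("M", "Absent", "Absent"),
]

acc = accuracy_by_group(toy_tokens)
# The gap between the best- and worst-served groups is one rough unfairness signal.
gap = max(acc.values()) - min(acc.values())
print(acc, gap)
```

A gap of zero would mean the auto-coder agrees with hand-coding equally often for every group; overall accuracy alone can hide a nonzero gap.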

Given these fairness issues, I reconsider SLAC under Markl’s (2022) premise that some speech and language technologies are too inherently flawed to use. I argue that while SLAC does not fit into this category, its potential users and consumers deserve a “warts and all” awareness of its drawbacks. To that end, I close with concrete recommendations for using SLAC in large-scale research projects. 

References 

Barreda, Santiago. 2021. Fast Track: fast (nearly) automatic formant-tracking using Praat. Linguistics Vanguard 7(1). https://doi.org/10.1515/lingvan-2020-0051. 

Fromont, Robert & Jennifer Hay. 2012. LaBB-CAT: An annotation store. Proceedings of Australasian Language Technology Association Workshop 113–117. 

Kendall, Tyler, Charlotte Vaughn, Charlie Farrington, Kaylynn Gunter, Jaidan McLean, Chloe Tacata & Shelby Arnson. 2021. Considering performance in the automated and manual coding of sociolinguistic variables: Lessons from variable (ING). Frontiers in Artificial Intelligence 4(43). https://doi.org/10.3389/frai.2021.648543. 

Markl, Nina. 2022. Language variation and algorithmic bias: Understanding algorithmic bias in British English automatic speech recognition. In 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’22), 521–534. New York, NY, USA: Association for Computing Machinery. https://doi.org/10.1145/3531146.3533117. 

McAuliffe, Michael, Michaela Socolof, Sarah Mihuc, Michael Wagner & Morgan Sonderegger. 2017. Montreal Forced Aligner: Trainable text-speech alignment using Kaldi. In. 

McLarty, Jason, Taylor Jones & Christopher Hall. 2019. Corpus-based sociophonetic approaches to postvocalic r-lessness in African American Language. American Speech 94. https://doi.org/10.1215/00031283-7362239. 

Villarreal, Dan. under review. Sociolinguistic auto-coding has fairness problems too: Measuring and mitigating bias. Linguistics Vanguard.

Villarreal, Dan, Lynn Clark, Jennifer Hay & Kevin Watson. 2020. From categories to gradience: Auto-coding sociophonetic variation with random forests. Laboratory Phonology 11(6). 1–31. https://doi.org/10.5334/labphon.216. 


Colloquium talk: Dr. Annette D’Onofrio

Dr. Annette D’Onofrio is joining us to give a colloquium talk this fall! Please see details of the talk below.

Dr. Annette D’Onofrio is an Assistant Professor in the Linguistics Department at Northwestern University. She will present her work on the Chicagoland project, style, and personae.

Time: Thursday, September 15, 2022, 4:30-6:15pm Eastern Time

Event: In-person and Zoom

Talk Abstract

Locating sound change reversal: Racialized and age-based patterns of the Northern Cities Shift in a Chicago community

While dialectological work once indicated that American English regional dialects were becoming increasingly disparate over time (e.g. Labov 2014), recent sociolinguistic studies are revealing the opposite trend in some regions, showing movement away from regionally distinctive language features (e.g. Prichard & Tamminga 2012, Dodsworth & Kohn 2012). Specifically, the Inland North region’s characteristic Northern Cities Vowel Shift (NCS), which had been advancing throughout the 20th century (Labov 2007), has begun to reverse its trajectory in some Inland North locales (Driscoll & Lape 2015; Wagner et al. 2016), including in Chicago (McCarthy 2011, Durian & Cameron 2019). In this talk, I explore the ways in which NCS reversal is socially conditioned in one Chicago neighborhood area. I demonstrate how both broader sociohistorical dynamics of migration and racialization and highly localized oppositions and ideologies inform patterns of vocalic change in this neighborhood.
