Jerid Francom has recently released a novel corpus of ‘everyday’ Spanish drawn from TV/film dialogue from Argentina, Mexico, and Spain. Initially supported by an NEH Digital Start-up grant, the ACTIV-ES corpus is now available to scholars, instructors, and the general public for exploring dialect variation in colloquial Spanish. The data for this project were acquired from an online repository of TV/film subtitles and were subsequently text-normalized and part-of-speech annotated. To evaluate how closely the language in the corpus approximates usage by native populations, in-field psycholinguistic testing was conducted.

 

 
