Jerid Francom has recently released a novel corpus of ‘everyday’ Spanish drawn from TV and film dialogue from Argentina, Mexico, and Spain. Initially supported by an NEH Digital Start-up grant, the ACTIV-ES corpus is now available to scholars, instructors, and the general public for exploring dialect variation in colloquial Spanish. The data for this project were acquired from an online repository of TV/film subtitles and were subsequently text-normalized and annotated for part of speech. To evaluate how closely the language in the corpus approximates usage by native speakers, in-field psycholinguistic testing was conducted.
Our goal
The DH Community is a program of Wake Forest's Humanities Institute. We are faculty from across campus interested in investigating the emergence of digital humanities as a field of study, and its relevance and usefulness as a research and teaching tool in the humanities.