Jerid Francom has recently released a novel corpus of ‘everyday’ Spanish drawn from TV and film dialogue from Argentina, Mexico, and Spain. Initially supported by an NEH Digital Start-up grant, the ACTIV-ES corpus is now available to scholars, instructors, and the general public for exploring dialect variation in colloquial Spanish. The data for this project were acquired from an online repository of TV/film subtitles and were subsequently text-normalized and part-of-speech annotated. To evaluate how closely the language in the corpus approximates usage by native speakers, in-field psycholinguistic testing was conducted.
Our goal
The DH Community is a program of Wake Forest's Humanities Institute. We are faculty from across campus interested in investigating the emergence of digital humanities as a field of study, and its relevance and usefulness as a research and teaching tool in the humanities.