Virginia Tech CS4984: Computational Linguistics

Instructor: Edward A. Fox


Supported by a National Science Foundation grant, Computing in Context (NSF DUE-1141209), and a resulting subaward from Villanova to Virginia Tech, this course will give students the opportunity to engage in active learning about how to work with large collections of text, one aspect of 'big data'.

An 11-node Hadoop cluster, along with other tailored computing resources, will aid handling of over 500 million tweets and over 11 terabytes of webpages. Using methods employed in search engines, including linguistic analysis and natural language processing, as well as statistical techniques, students will engage in problem-based learning, with the semester-long challenge of automatically analyzing content collections, extracting key information, and generating easily readable English summaries of important events. Just-in-time learning will help students develop an understanding of the concepts, techniques, and toolkits they need to master the key methods of computational linguistics (CL).


Professor Edward A. Fox, fox @,, 540-231-5113


Prerequisite: senior standing in CS, or instructor permission



Different Aspects of the Common Project:


Prototypes, Iterative Refinement:



Connection with Ensemble:

Logistics for Fall 2014:

Last updated 7/4/2014