CS5604 - Information Retrieval
Why take CS5604?
- To prepare you for working at Google, Microsoft, or any company
working on search, the Web, text, and/or machine learning.
- To prepare you for research involving search engines, social
media, natural language processing, text mining, classification,
clustering, indexing, and/or information seeking/exploration.
- To gain proficiency with parallel processing on clusters.
- To prepare you to take CS6604 (Digital Libraries) in Spring 2017.
Resources
- 20+ node Hadoop cluster with a 10 Gbit network connection
- Cloudera software including HBase, HDFS, Hive, Mahout,
MapReduce, Nutch, Pig, Solr, Spark, and Sqoop
- NLTK and Python toolkits
Readings
- Introduction to Information Retrieval by Christopher D.
Manning, Prabhakar Raghavan, and Hinrich Schütze. 2008, 496 pages,
Cambridge University Press, ISBN-10: 0521865719, ISBN-13:
978-0521865715. See also online versions, slides, etc.
- Free (through Library download), recommended:
ChengXiang Zhai and Sean Massung. 2016. Text Data Management and
Analysis: a Practical Introduction to Information Retrieval and Text
Mining. Association for Computing Machinery and Morgan & Claypool, New
York, NY, USA.
- VTechWorks reports from CS5604 projects
About the Instructor
- Professor Edward A. Fox, firstname.lastname@example.org, 540-231-5113, Torg. 2160G.
Office hours are TBD or by appointment.
- Dr. Fox's 1983 Ph.D. was supervised at Cornell University by
Prof. Gerard Salton, often called "the father of information
retrieval."
- GRAs, working in 2030 Torg.:
- Sunshin Lee, email@example.com
- Liuqing Li, firstname.lastname@example.org
Last Updated: September 18, 2016
© Edward A. Fox 2016