INFORMATION RETRIEVAL
17:610:551
Spring 2002
Gheorghe Muresan

SCHEDULE AND ASSIGNED READINGS (expect revisions)

 

You are requested to read at least two chapters or articles from the list provided each week. Depending on your background, your interests, the project you have in mind, and the time available, you are encouraged to read the other items on the list, to read the recommended background material, or to implement some of the algorithms we discuss.


Lecture 1 - Jan 24: Introduction and overview of the course.


Lecture 2 - Jan 31: The goals of IR. IR problems, the IR situation, and IR systems.

Readings: Hersh, Chapters 1 and 2. In Sparck Jones & Willett, "Overall introduction", and Chapter Two, "Introduction". Belew, R. K. (2000), Chapter 1, “Overview”. Belkin, N.J. (1980) Anomalous states of knowledge as a basis for information retrieval. Canadian Journal of Information Science, v. 5: 133-143. Also: Belkin & Vickery (1985) Chapters 1 and 2; Ingwersen (1992), Chapter 3; the introductory chapters to any of: Lancaster (1978); Lancaster & Warner (1993); Meadow (1992); van Rijsbergen (1979), chapter 1: Introduction; Salton & McGill (1983).


Lecture 3 - Feb 7: Fundamental concepts in IR. Information, meaning, aboutness, relevance.


Readings: In Sparck Jones & Willett, from Chapter 3: the "Introduction".  Belkin, N.J. (1978) Information concepts for information science. Journal of Documentation, v. 34, no.1: 55-85. Hutchins, W.J. (1978) The concept of "aboutness" in subject indexing. Aslib Proceedings, vol. 30: 172-181 (Also in Sparck Jones & Willett, pp. 93-97). Saracevic, T. (1975) Relevance: a review of and a framework for the thinking on the topic. Journal of the American Society for Information Science, vol. 26: 321-343 (Also in Sparck Jones & Willett, pp. 143-165).


Lecture 4 - Feb 14: Actors and processes in IR systems. What do we want from Information Retrieval?

Readings: Belkin, N.J. (1993) Interaction with texts: Information retrieval as information-seeking behavior. In: Information Retrieval '93: Von der Modellierung zur Anwendung. Konstanz: Universitaetsverlag Konstanz, 55-66. Croft, W.B. (1995) What do people want from information retrieval? D-Lib Magazine, November. In Kowalski and Maybury: Chapter 2 “Information Retrieval System Capabilities”. Belkin N.J. & Croft, W.B. (1992) Information filtering and information retrieval: Two sides of the same coin? Communications of the ACM, v. 35 no. 12: 29-38.

 


Lecture 5 - Feb 21: Document and query representation. Manual vs. automatic indexing.


Compulsory readings: Hersh, Chapter 5: “Indexing”. J. D. Anderson & J. Perez-Carballo, “The nature of indexing: how humans and machines analyze messages and texts for retrieval. Part I: Research, and the nature of human indexing; Part II: Machine indexing, and the allocation of human versus machine effort”, Information Processing and Management, vol. 37 (2001), p. 231-254, p. 255-277.

Other readings: In Sparck Jones & Willett, from Chapter 6, the "Introduction" (especially the section on Indexing). Foskett, D.J. (1980) Thesaurus. In A. Kent, J. Lancour & J.E. Daily, eds., Encyclopedia of Library and Information Science, v. 30, pp. 416-462. New York: Marcel Dekker (Also in Sparck Jones & Willett, pp. 111-134).

 


Lecture 6 - Feb 28: Automatic indexing. Lexical analysis. Weighting. Data structures.


Compulsory readings: van Rijsbergen (1979), Chapter 2: “Automatic text analysis”. Also review “Automatic indexing” from last week.

Other readings: Hersh, Chapter 8: “Lexical-statistical systems”. Belew, R. K. (2000), Chapter 2, “Extracting lexical features”. In Sparck Jones & Willett, from Chapter 6, the "Introduction" (especially the section on Indexing). Salton, G. & Buckley, C. (1988) “Term weighting approaches in automatic text retrieval”, Information Processing and Management, vol. 24: 513-523 (Also in Sparck Jones & Willett, pp. 323-328). Robertson, S. E. and Sparck Jones, K. (1997), “Simple, proven approaches to text retrieval”, University of Cambridge Computer Laboratory Technical Report no. 356, 1994 (updated 1996, 1997).

For stemming code or a demo, see Martin Porter’s site.

Presentations:

Eakins, J. P. and Graham, M. E.  "Content-based Image Retrieval: A Report to the JISC Technology Applications Programme" - Stacy Adduci.

Mikheev, Andrei “Document Centered Approach to Text Normalization”, SIGIR 2000, Athens - Craig Willard.

Homework!


Lecture 7 - Mar 7: Models of IR. Interaction models. Indexing models. Relevance feedback.

Readings: In Sparck Jones & Willett, from Chapter 5, the "Introduction". Cooper, W.S. Getting beyond Boole. Information Processing and Management, vol. 24: 243-248. Also in Sparck Jones & Willett, pp. 265-267. Robertson, S.E. The probability ranking principle in IR. Journal of Documentation, vol. 33: 294-304 (Also in Sparck Jones & Willett, pp. 281-286). Salton, G., Wong, A. & Yang, C.S. (1975) “A vector space model for automatic indexing”, Communications of the ACM, vol. 18: 613-620. Also in Sparck Jones & Willett, pp. 273-280. Saracevic, T. (1996). Interactive models in information retrieval (IR): Progress, problems, proposal. In Proceedings of the 1996 ASIS Annual Meeting. Medford, NJ: Learned Information. Turtle, H. & Croft, W.B. (1990) “Inference networks for document retrieval”, SIGIR 1990, New York: ACM, 1-24.

Readings proposed for presentation:

Rajashekar, T. B. and Croft, W. B. “Combining Automatic and Manual Index Representations in Probabilistic Retrieval”, JASIS, 1995.

Campbell, I. “Supporting Information Needs by Ostensive Definition in an Adaptive Information Space”, MIRO'95.

Presentation:

Bates, Marcia J. “The Design of Browsing and Berrypicking Techniques for the Online Search Interface.” Online Review 13 (October 1989): 407-424 - Sharon Kaye.

 

 


Lecture 8 - Mar 14: User interfaces for IR systems.

Part I: Interaction models.


Compulsory readings: Chapter 10: “User Interfaces and Visualization” by Marti Hearst in “Modern Information Retrieval”.

Recommended readings: Journal of the American Society for Information Science, vol. 43, issue 2, 1992, special issue on Human-Computer Interface: “Introduction and Overview” by Lunin and Harman, “Interfaces for end-user information seeking” by Gary Marchionini, “User-friendly systems instead of user-friendly front-ends” by Donna Harman, “Intelligent information retrieval: An introduction” by Susan Gauch, “Models for hypertext” by Mark F. Frisse and Steve B. Cousins; Muresan, G. and Harper, D. J. “Document Clustering and Language Models for System-Mediated Information Access”, ECDL'01, Darmstadt, p. 438-449.

Presentations:

Bates, M. (1990) “Where should the person stop and the information search interface start?” Information Processing and Management, v. 26(5): 575-591 - Cheryl Milburn.

O’Day, V. L. and Jeffries, R. “Orienteering in an information landscape: how information seekers get from here to there”, InterCHI'93, Amsterdam - Tamara Richman.

Hendry, D. G. and Harper, D. J. “An informal information-seeking environment”, JASIS 48 (11), 1997 - Roman Santillan.

 


 

Spring break!


Lecture 9 - Mar 28: User interfaces for IR systems.

Part II: Tools and techniques. Information Visualization. Structure. Categorization vs. clustering.


Readings: Shneiderman, Ben, chapter “Information Search and Visualization” in “Designing the User Interface”, 3rd ed., 1997 (and associated webpage); Belkin, N.J., Marchetti, P.-G., Cool, C. (1993) BRAQUE: Design of an interface to support user interaction in information retrieval. Information Processing and Management, 29 (3): 325-344; Chalmers, M. and Chitson, P. “Bead: Exploration in information visualization”, SIGIR'92, Copenhagen, p. 330-337; Nowell, L.T., France, R.K., Hix, D., Heath, L.S., Fox, E.A. (1996) “Visualizing search results: Some alternatives to query-document similarity”, SIGIR'96, Zurich, p. 67-75; Williamson, C., Shneiderman, B. (1992) “The Dynamic HomeFinder: Evaluating dynamic queries in a real-estate information exploration system”, SIGIR'92, New York, p. 338-346; Lin, Xia “Map displays for information retrieval”, JASIS, 48(1), 1997, p. 40-54.

Further readings on HCI:

Preece, J., Rogers, Y. and Sharp, H. (2002) - “Interaction Design - Beyond Human-Computer Interaction” (and associated webpage).

Further readings on Information Visualization (IV):

Spence, R. (2000) - “Information Visualization”, ISBN: 0201596261; Chen, C. (1999) - “Information Visualisation and Virtual Environments”, ISBN: 1852331364; Card, S. K., MacKinlay, J. D. and Shneiderman, B. (1999) - “Readings in Information Visualization: Using Vision to Think”, ISBN: 1558605339. Also, University of Maryland’s HCI Lab website, and InfoViz, a repository for IV.

Readings proposed for presentation:

Korfhage, Robert R. “To see, or not to see - is that the query?”, SIGIR'91, Chicago, p. 134-141;

Cutting, D. R., Pedersen, J. O., Karger, D. and Tukey, J. W. “Scatter/Gather: A cluster-based approach to browsing large document collections”, SIGIR'92, Copenhagen, p. 318-329.

Presentations:

Gary Marchionini, “Interfaces for end-user information seeking”, JASIS, 43(2), 1992 - Minsoo Park.


Lecture 10 - Apr 4: Evaluation of IR systems. Experimental vs operational IR systems.


Readings: Hersh, chapter 3: “System evaluation”, and chapter 7: “Evaluation”. In Baeza-Yates & Ribeiro-Neto “Modern Information Retrieval”, chapter 3: “Retrieval Evaluation”. In Sparck Jones & Willett, from Chapter 4, the "Introduction" and the articles by Saracevic, et al., Lancaster, and Harman. Su, L. (1992) Evaluation measures for interactive information retrieval. Information Processing and Management, 28(4): 503-516; Harman, Donna “Overview of the first TREC conference”, SIGIR'93, Pittsburgh.

In JASIS, 47(1), January 1996, Special Issue: Evaluation of Information Retrieval :- Tague-Sutcliffe, J. M. - “Some perspectives on the evaluation of information retrieval systems”; Blair, D. C. - “STAIRS redux: Thoughts on the STAIRS evaluation, ten years after”; Hersh, W. et al. - “A task-oriented approach to information retrieval evaluation”; Ellis, D. - “The dilemma of measurement in information retrieval research”; Beaulieu, M. et al. - “Evaluating interactive systems in TREC”.

In Information Processing and Management, 31 (3), May-June 1995, Special issue: TREC :- Harman, D. - “Overview of the Second Text Retrieval Conference (TREC-2)”; Sparck Jones, K. - “Reflections on TREC”; Robertson, S. E. et al. - “Large Test Collection Experiments on an Operational, Interactive System: Okapi at TREC”; Belkin, N. et al. - “Combining the Evidence of Multiple Query Representations for Information Retrieval”.

In Information Processing and Management, 36 (1), January 2000, Special issue: TREC :- Harman, D. - “Overview of the Sixth Text REtrieval Conference (TREC-6)”; Sparck Jones, K. - “Further reflections on TREC”; Robertson, S. E. et al. - “Experimentation as a way of life: Okapi at TREC”.

The Text Retrieval Conference (TREC) webpage.

Presentations:

Brajnik, G., Mizzaro, S., Tasso, C. and Venuti, F. “Strategic Help in User Interfaces for Information Retrieval”, JASIST, 53(5), 2002, p. 343-358 - Tina Marie Doody.

Saracevic, T. “Evaluation of Evaluation in Information Retrieval”, SIGIR'95, Seattle - Dana Knauff.


Lecture 11 - Apr 11: Evaluation of interactive IR systems. IR evaluation in context.

Readings: Hersh, Chapters 3, 7. In Sparck Jones & Willett, from Chapter 4, the "Introduction" and the articles by Saracevic, et al., Lancaster, and Harman. Su, L. (1992) Evaluation measures for interactive information retrieval. Information Processing and Management, 28(4): 503-516; Borlund, P. and Ingwersen, P. (1997) “The development of a method for the evaluation of interactive information retrieval systems”, Journal of Documentation, 53(3).

In Information Processing and Management, 37 (3), May 2001, Special issue: Interactive TREC :- Hersh, W. and Over, P. - “Interactivity at the Text Retrieval Conference (TREC)”; Over, P. - “The TREC interactive track: an annotated bibliography”; Hersh et al. - “Challenging conventional assumptions of automated information retrieval with real users: Boolean searching and batch retrieval evaluations”; Belkin, N. et al. - “Iterative exploration, design and evaluation of support for query reformulation in interactive information retrieval”; Allan, J. et al. - “Evaluating combinations of ranked lists and visualizations of inter-document similarity”; Wu, M. et al. - “Using clustering and classification approaches in interactive retrieval”; Larson, R. R. - “TREC interactive with Cheshire II”; Bodner, R. C. et al. - “The impact of text browsing on text retrieval performance”; Yang, K. - “Passage feedback with IRIS”.

Belkin et al. “Rutgers' TREC 2001 Interactive Track Experience”, at TREC 2001.

Preece, J., Rogers, Y. and Sharp, H. (2002) - “Interaction Design - Beyond Human-Computer Interaction” (and associated webpage) - chapters on Evaluation.

Hull, D. “Using Statistical Testing in the Evaluation of Retrieval Experiments”, SIGIR'93; Wilcox, R. R. “Statistics for Social Sciences” or any other book on statistics; also, a Statistics textbook online.

Presentations:

Borlund, P. “Experimental Components for the evaluation of interactive information retrieval systems”, Journal of Documentation, Vol. 56, no. 1, 2000, 71-90 - Christine Bates.

Reid, J. “A Task-Oriented Non-Interactive Evaluation Methodology for Information Retrieval Systems”, Information Retrieval, 2(1), Feb 2000 - Melissa Roll.

 

PROJECT TOPICS DUE.


Lecture 12 - Apr 18: Structure. Classification. Clustering.


Readings: van Rijsbergen (1979), Chapter 3: “Automatic classification”. In Sparck Jones & Willett, from Chapter 6 the article by Griffiths, Luckhurst & Willett; from Chapter 8, the article by Hayes, Knecht and Cellio and the article by Rau; Leuski, Anton "Evaluating Document Clustering for Interactive Information Retrieval", CIKM'01, 33-40; Hearst, Marti “The Use of Categories and Clusters in Information Access Interfaces”, in Natural Language Information Retrieval, Strzalkowski (ed.), Kluwer Academic Publishers, 1999; Sanderson, M. and Croft, W. B. “Deriving concept hierarchies from text”, SIGIR 1999, Berkeley; Tombros, A., Villa, R. and Van Rijsbergen, C. J. (2002) “The effectiveness of query-specific hierarchic clustering in information retrieval”, Information Processing and Management, 38(4); Yang, Yiming “An Evaluation of Statistical Approaches to Text Categorization”, Information Retrieval 1, 1999, p. 69-90.

Presentations:

Hearst, M. A. and Pedersen, J. O. “Reexamining the cluster hypothesis: scatter/gather on retrieval results”, SIGIR'96, Zurich, p. 76-84 - Mary Ellen Valverde.

Kural, Y. and Robertson, S. and Jones, S. “Deciphering cluster representations”, Information Processing and Management, 37, 2001, p. 593-601 - Brendan Banks.


Lecture 13 - Apr 25: IR on the Web.


Readings: See Journal of the American Society for Information Science and Technology, 53(2), 2002 - Special issue on Web research; Almind, T. C. and Ingwersen, P. (1997) “Informetric Analysis on the World Wide Web: Methodological Approaches to Webometrics”, Journal of Documentation, 53(4); Chu, H. and Rosenthal, M. (1996) “Search Engines for the World Wide Web: A Comparative Study and Evaluation Methodology”, Proceedings of ASIS'96.

“The Internet: Bringing Order from Chaos”, special report in Scientific American, March 1997.

“PageRank: Bringing Order to the Web”, the model behind Google.

Presentations:

Spink, Amanda (2002) “A user-centered approach to evaluating human interaction with Web search engines: an exploratory study”, Information Processing and Management, 38(3) - Shilpa Shanbhag.

Ellis, D., Ford, N. and Furner, J. (1998) “In search of the unknown user: indexing, hypertext and the World Wide Web”, Journal of Documentation, 54(1) - Jinyoung Park.

 


Lecture 14 - May 2: Current research and future directions for IR systems. Multimedia IR. Collaborative systems. Recommender systems. User modeling. Document summarization. Information extraction.

Course evaluation.

Readings: Hersh, Chapter 9: “Linguistic Systems”. In Sparck Jones & Willett, from Chapter 8, the "Introduction" and any other article there that looks interesting; Chapter 9. SIGIR'99 Workshop on Recommender Systems, UC Berkeley; Set of articles on Recommender Systems in Communications of the ACM, 40 (3), March 1997 - leading article: Resnick, P. and Varian, H. R. “Recommender Systems”; Xie, H. “Patterns between Interactive Intentions and Information-Seeking Strategies”, Information Processing and Management, 38, 2002; Chalmers, Matthew “Paths and Contextually Specific Recommendations”, DELOS Workshop, 2001; Pazzani, M. and Billsus, D. “Learning and Revising User Profiles: The Identification of Interesting Web Sites”, Machine Learning 27, 1997, p. 313-331.

Presentation:

Gaizauskas, R. and Wilks, Y. (1998) “Information Extraction: Beyond Document Retrieval”, Journal of Documentation, 54(1) - Fran Pfeffer.


Lecture 15 - May 9: Discussion/presentation of final projects.

PROJECTS DUE.


Mon - May 13: Grades due.