Knowledge Discovery Group

Web Information Retrieval

People

    Lecturer: Prof. Dr. Ansgar Scherp
    Exercises: Falk Böschen

Times

    Lectures: Wednesday, noon to 2pm, LMS2 – R.Ü2
    Exercises: Thursday, 4pm to 6pm, CAP4 – R.1304a

    PLEASE NOTE: The first lecture for this class is on April 19th in LMS2 – R.Ü1.

Organization

This lecture will be taught either in German or in English (depending on the audience).

Summary

The ability to find information on the web is an essential skill in our digital age. To help students understand today’s search engines and retrieval systems, the course offers an introduction to basic as well as advanced techniques. The topics cover the crawling and processing of large document corpora, different retrieval models, and the evaluation of information retrieval systems.

Goals

Students will be able to understand, reflect on, and apply different methods and techniques in web information retrieval.

Content

This course gives an introduction to basic and advanced methods of information retrieval, with a specific focus on dealing with Web data. The course introduces the topic by briefly looking into the process of information retrieval and information seeking. Subsequently, the evaluation of information retrieval systems is discussed. This includes the classical Cranfield paradigm, set-based metrics, ranking-aware metrics, and significance tests. Furthermore, different tasks in the pre-processing of the data are presented, such as tokenization and filtering. The core part of the course covers different information retrieval models such as the Boolean Retrieval Model, the Vector Space Model, and Probabilistic Retrieval Models. Further topics of the course include the crawling of web documents and understanding the web as a graph. The latter notion is used for authority ranking methods such as PageRank and HITS. Finally, machine learning techniques for information retrieval are considered, such as Learning to Rank and Language Models.

Learning Material

Learning material will be provided in the form of presentation slides in OLAT (https://lms.uni-kiel.de/url/RepositoryEntry/1529282568).

Requirements

Knowledge of algorithms and data structures as well as programming, as taught in the Bachelor studies of computer science or business informatics.

Course Assessment

The exam will be oral or written, depending on the size of the class. Active participation in the tutorials is a prerequisite for admission to the exam.

News

  • Homepage kicked off!