Person in charge: | (-) |
Others: | (-) |
Credits | Dept. | Type | Requirements |
---|---|---|---|
7.5 (6.0 ECTS) | CS | | PRED - Prerequisite for DIE, DCSYS; PS - Prerequisite for DCSFW |
Understanding the problems of retrieving information. Knowing the different components of an information retrieval system, the factors and techniques that can optimize the process, and how to use and adapt them. Knowing some applications of these systems, at least in bioinformatics and the Web.
Estimated time (hours):
T | P | L | Alt | Ext. L | Stu | A. time |
---|---|---|---|---|---|---|
Theory | Problems | Laboratory | Other activities | External Laboratory | Study | Additional time |

T | P | L | Alt | Ext. L | Stu | A. time | Total |
---|---|---|---|---|---|---|---|
4.0 | 4.0 | 2.0 | 0 | 2.0 | 10.0 | 0 | 22.0 |
Formal definition and basic concepts: Abstract document models and query languages. Boolean model. Vector model. Latent semantic indexing.

T | P | L | Alt | Ext. L | Stu | A. time | Total |
---|---|---|---|---|---|---|---|
2.0 | 2.0 | 0 | 0 | 0 | 4.0 | 0 | 8.0 |
Inverted files. Index compression. Example: Efficient implementation of the cosine rule with tf-idf. Example: Lucene.
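As an illustration of the cosine rule with tf-idf over an inverted file, the following is a minimal sketch in Python. It is not the implementation used in the course or in Lucene; the function names `build_index` and `cosine_rank`, the whitespace tokenizer, and the weighting `tf * log(N / df)` are assumptions chosen for brevity. The point it shows is that only the postings lists of the query terms are traversed, and document norms are computed once at indexing time.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Build an inverted file plus the idf table and document norms.

    Assumed weighting (one common variant): weight = tf * log(N / df).
    """
    index = defaultdict(list)  # term -> list of (doc_id, term frequency)
    for doc_id, text in enumerate(docs):
        for term, tf in Counter(text.lower().split()).items():
            index[term].append((doc_id, tf))
    n = len(docs)
    idf = {term: math.log(n / len(postings)) for term, postings in index.items()}
    # Squared norms are accumulated once here, not once per query.
    squared = defaultdict(float)
    for term, postings in index.items():
        for doc_id, tf in postings:
            squared[doc_id] += (tf * idf[term]) ** 2
    norms = {doc_id: math.sqrt(value) for doc_id, value in squared.items()}
    return index, idf, norms


def cosine_rank(query, index, idf, norms):
    """Return (doc_id, cosine) pairs, best first, scoring term at a time."""
    weights = {t: qtf * idf.get(t, 0.0)
               for t, qtf in Counter(query.lower().split()).items()}
    query_norm = math.sqrt(sum(w * w for w in weights.values()))
    if query_norm == 0:
        return []
    scores = defaultdict(float)
    for term, query_weight in weights.items():
        # Only postings of query terms are read; other documents are never touched.
        for doc_id, tf in index.get(term, []):
            scores[doc_id] += query_weight * tf * idf[term]
    ranked = [(d, s / (query_norm * norms[d])) for d, s in scores.items() if s > 0]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


docs = ["the cat sat", "the dog sat", "the cat chased the dog"]
index, idf, norms = build_index(docs)
result = cosine_rank("cat", index, idf, norms)
```

In this toy collection the term "the" occurs in every document, so its idf is zero and it contributes nothing to any score.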

T | P | L | Alt | Ext. L | Stu | A. time | Total |
---|---|---|---|---|---|---|---|
2.0 | 2.0 | 2.0 | 0 | 4.0 | 4.0 | 0 | 14.0 |
Precision and Recall. Other measures of performance. Reference collections. Relevance feedback and query expansion.
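Precision and recall can be illustrated with a short Python sketch. The function names and the sample document identifiers are hypothetical; the definitions are the standard set-based ones (precision is the fraction of retrieved documents that are relevant, recall is the fraction of relevant documents that are retrieved), and F1 is included as one example of another performance measure.

```python
def precision_recall(retrieved, relevant):
    """Set-based precision and recall for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical query: 4 documents retrieved, 3 documents judged relevant.
p, r = precision_recall([1, 2, 3, 4], [2, 4, 6])
f = f1_score(p, r)
```

Here two of the four retrieved documents are relevant (precision 1/2) and two of the three relevant documents were found (recall 2/3).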

T | P | L | Alt | Ext. L | Stu | A. time | Total |
---|---|---|---|---|---|---|---|
8.0 | 6.0 | 0 | 0 | 0 | 8.0 | 0 | 22.0 |
Ranking and relevance in Web models. PageRank algorithm. Architecture of web search engines. Web crawling. Link-based analysis of social networks.
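The PageRank algorithm can be sketched as a power iteration. The following Python example is a minimal illustration under stated assumptions, not a production implementation: the function name `pagerank`, the damping factor 0.85, the fixed iteration count, and the even redistribution of rank from pages without outlinks are all choices made for this sketch.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power iteration for PageRank.

    links: dict mapping each page to the list of pages it links to.
    """
    nodes = set(links) | {v for targets in links.values() for v in targets}
    if not nodes:
        return {}
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        # Rank held by pages without outlinks is spread evenly over all pages.
        dangling = sum(rank[u] for u in nodes if not links.get(u))
        new_rank = {u: (1 - damping) / n + damping * dangling / n for u in nodes}
        for u, targets in links.items():
            for v in targets:
                # Each page splits its current rank equally among its outlinks.
                new_rank[v] += damping * rank[u] / len(targets)
        rank = new_rank
    return rank


# Hypothetical three-page web graph.
ranks = pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this graph page "c" receives links from both other pages and ends up with the highest rank, followed by "a" and then "b".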

T | P | L | Alt | Ext. L | Stu | A. time | Total |
---|---|---|---|---|---|---|---|
8.0 | 8.0 | 3.0 | 0 | 2.0 | 16.0 | 0 | 37.0 |
Pattern search. Algorithms for approximate search and exact search. Hidden Markov models. Tries. Inverted files, suffix tree. Construction algorithms, use and analysis.
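As an example of approximate search, the sketch below uses the classic dynamic programming approach in which the first row of the edit distance table is kept at zero, so that a match may start at any text position. The function name `approximate_matches` and the sample strings are hypothetical; unit costs for insertion, deletion, and substitution are assumed.

```python
def approximate_matches(pattern, text, max_errors):
    """Return the end positions (exclusive) of text substrings that match
    the pattern with at most max_errors edits (unit-cost edit distance)."""
    m = len(pattern)
    # column[i] = least edit distance between pattern[:i] and some
    # substring of the text ending at the current position.
    column = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        diagonal = column[0]  # column[0] stays 0: a match may start anywhere
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            current = min(diagonal + cost,    # match or substitution
                          column[i] + 1,      # extra character in the text
                          column[i - 1] + 1)  # pattern character missing
            diagonal = column[i]
            column[i] = current
        if column[m] <= max_errors:
            ends.append(j)
    return ends


exact = approximate_matches("abc", "xxabcxx", 0)
fuzzy = approximate_matches("abc", "xxabdxx", 1)
```

The running time is proportional to the product of the pattern and text lengths, and only one column of the table is kept in memory. With `max_errors` set to zero the function degenerates into exact search.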

T | P | L | Alt | Ext. L | Stu | A. time | Total |
---|---|---|---|---|---|---|---|
6.0 | 6.0 | 3.0 | 0 | 2.0 | 15.0 | 0 | 32.0 |
DNA chain patterns. Sequence similarity. DNA sequencing. DNA databases.
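Sequence similarity is often measured with a global alignment score. The following Python sketch computes such a score by dynamic programming in the style of Needleman and Wunsch; the function name and the scoring parameters (match +1, mismatch -1, gap -2) are assumptions chosen for illustration, not values prescribed by the course.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best global alignment score of sequences a and b (linear gap cost)."""
    # previous[j] = best score aligning the processed prefix of a with b[:j].
    previous = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        current = [i * gap]
        for j in range(1, len(b) + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            current.append(max(previous[j - 1] + pair,  # align a[i-1] with b[j-1]
                               previous[j] + gap,       # gap in b
                               current[j - 1] + gap))   # gap in a
        previous = current
    return previous[-1]


identical = global_alignment_score("GATTACA", "GATTACA")
one_gap = global_alignment_score("ACGT", "ACT")
```

For "ACGT" against "ACT" the best alignment pairs three matching bases and leaves one gap, which gives 3 - 2 = 1 under the assumed scores.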

 | T | P | L | Alt | Ext. L | Stu | A. time | Total |
---|---|---|---|---|---|---|---|---|
Total per kind | 32.0 | 29.0 | 12.0 | 0 | 11.0 | 59.0 | 0 | 143.0 |
Additional hours for evaluation | | | | | | | | 5.0 |
Total work hours for student | | | | | | | | 148.0 |
In the lab classes, students will implement variations of the algorithms seen in the theory and problem sessions, or will apply them to search for information in realistic situations.
Students may be required to prepare for some of the lab sessions. Some sessions will require a short report or a code submission. This work will count toward the student's assessment.
The lab sessions currently use the Lucene package.
There will be a first part exam around the midpoint of the course. At the end of the course, students can choose whether to take a second part exam or a final exam covering the whole course.
The lab grade will be based on the reports or programs submitted after the lab sessions.
The course grade is calculated as follows:
Students who choose to take the second part exam:
0.2 * lab grade
+ 0.4 * 1st part exam
+ 0.4 * 2nd part exam
Students who choose to take the final exam:
0.2 * lab grade
+ max(0.2 * 1st part exam + 0.6 * final exam, 0.8 * final exam)
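The two formulas above can be written as a small Python function for checking a result by hand. This is only a sketch: the function name `course_grade`, its parameter names, and the sample marks are hypothetical, and no rounding rule is applied because none is stated.

```python
def course_grade(lab, first_part, second_part=None, final=None):
    """Course grade from the lab grade, the first part exam, and exactly
    one of the second part exam or the final exam."""
    if (second_part is None) == (final is None):
        raise ValueError("provide exactly one of second_part or final")
    if second_part is not None:
        return 0.2 * lab + 0.4 * first_part + 0.4 * second_part
    # The first part exam counts only when doing so raises the grade.
    return 0.2 * lab + max(0.2 * first_part + 0.6 * final, 0.8 * final)


two_parts = course_grade(lab=8, first_part=6, second_part=7)
weak_first = course_grade(lab=8, first_part=4, final=7)
strong_first = course_grade(lab=8, first_part=9, final=7)
```

Since 0.2 * first part + 0.6 * final exceeds 0.8 * final exactly when the first part mark is higher than the final exam mark, a weak first part exam never lowers the grade of a student who takes the final exam.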
Ability to write medium-sized programs, preferably object-oriented.
Ability to design and analyze simple data structures.
Knowledge of the difference between main memory and secondary memory and its impact on a program's efficiency.