BSc and MSc thesis supervision
I am supervising BSc and MSc projects in the broad areas of information retrieval, natural language processing and data science. Across BSc and MSc, I have supervised more than 45 projects. For all research directions, taking a look at papers from recent conferences (such as SIGIR, CIKM, WSDM, EMNLP, ACL, KDD) and at ongoing benchmark efforts (MS MARCO, SQuAD 2.0, GLUE, decaNLP, TREC, Kaggle) may help you figure out a topic of interest.
To give you a few concrete ideas, here are a number of thesis projects I supervised in recent years:
- The Impact of Group Size on Collaborative Search (MSc)
- The Learning Tracker (MSc)
- Context-Based Spelling Correction for the Dutch Language: Applied on spelling errors extracted from the Dutch Wikipedia revision history (MSc)
- A Multi-Language Comparison of Influences on Author Verification using Character N-Grams (MSc)
- Evaluating Collaborative Search for a Learning-Oriented Search Task (MSc)
- MANtIS: a novel information seeking dialogues dataset (MSc)
- Axiomatic Thinking in Neural IR (MSc)
- Presenting Web Search Results over a Speech-Only Channel with Minimal Cognitive Load (BSc)
- Expanding LogUI: Adding Screen Capturing and a Statistical Analysis Dashboard for Web-Based Experiments (BSc)
- EGA Membership Card System (BSc)
- Search Assistant: Effect of Chatbot on User’s Collaborative Search Behavior (BSc)
Below are the resources I have developed for my courses (some are more up-to-date than others): Big Data Processing, Web and Database Technology and Information Retrieval.
Big Data Processing
Since 2013/2014 I have been teaching the second-year Bachelor course Big Data Processing at TU Delft (with 2016/17 being the last time for now). The course covers a range of technologies in the Hadoop ecosystem after a short excursion into the streaming world; I created the material based on a number of great books, including Mining of Massive Datasets, Data-Intensive Text Processing with MapReduce, Hadoop: The Definitive Guide, Programming Pig and ZooKeeper.
Slides - 2016/17 Edition
- Streams I
- Streams II
- Algorithm design for MapReduce
- Pig I
- Pig II
- Graph algorithms
- Two more lectures on Spark completed the course.
Assignments - 2016/17 Edition
- Assignment 1: Streaming
- Assignment 2: Streaming and Hadoop
- Assignment 3: Hadoop
- Assignment 4: Pig data
- Assignment 5: Pig data
- Assignment 6: Giraph
- Assignment 7: Spark
A Sample of Previous Exams
- 24 questions on streaming
- 32 questions on MapReduce/Hadoop
- 10 questions on graphs and Giraph
- 12 questions on Pig/Pig Latin
Web (and Database) Technology
Since 2013/2014 I have also been teaching the first-year Bachelor course Web and Database Technology (known as TI1506 or CSE1500) at TU Delft, together with Alessandro Bozzon. I teach the Web technology part, which turned out to be quite a challenge due to the wide variety of skill sets our incoming students possess (some work as Web developers, others have never written a single line of HTML before the start of this course).
The web lecture transcripts (with self-check questions, demo code, assignments, etc.) are available here.
Feel free to use the materials with acknowledgement.
Needless to say, this is ongoing work at all times: web tech changes quickly.
Information Retrieval
In 2019/20 I co-taught the MSc Information Retrieval course with Nava Tintarev, splitting it along an IR and NLP line. The course setup, slides and group projects can be found here.