Top Grant

The Netherlands Organisation for Scientific Research has recently announced the list of proposals awarded funding through the Top Grant Scheme. The TOP2 is a personal grant worth about €220K, which funds a 4-year PhD position and some extras. This year my proposal on Large-Scale Collaborative Search & Sensemaking (or LACrOSSE for short) was one of the convincing ones. In this blog post I describe what exactly LACrOSSE is about.

The Pitch

In 2011, the MOOC revolution began: Stanford University offered the first so-called Massive Open Online Course (MOOC) in Artificial Intelligence, followed by more than 160,000 learners worldwide. The idea of MOOCs quickly spread. Today, MOOCs are being offered by many world-renowned universities, reaching millions of learners. Learning, though, is a complex process and requires more than just watching a set of videos. Research endeavours have begun to investigate how to enable active learning in MOOCs, a challenging task due to the learners’ diversity and the extremely skewed learner-staff ratio. These conditions make peer learning a necessity: MOOC learners learn from each other, with little intervention from instructors.

A ubiquitous part of learning is search and sensemaking: learners turn to the Web seeking additional information, and over time create mental representations of the material. Past works have shown that exploratory searches for highly complex information needs, as encountered during learning, yield significantly better results with respect to material coverage and, importantly, knowledge gains when conducted in collaboration.

Based on these insights, I propose to develop collaborative search & sensemaking (CSS) as an important activating element of MOOCs. The computer science research challenges I address are derived from the fact that existing works in CSS have focused on collaborations between extremely small groups of users (<5). I aim for the very opposite: explicit collaborations between large groups of learners (>100) that share similar information needs. How to “scale up” CSS to large numbers of collaborators is the question I will answer in this project.

Why Now

LACrOSSE has at its disposal the resources necessary to study learners’ behavior and develop new techniques. Until now, due to study limitations, the research in CSS has focused on collaborations between extremely small groups of users (up to 5) that share a complex information need. Previous works have suggested that larger group sizes may lead to better collaborative search & sensemaking, though this hypothesis has not been tested. Moreover, the vast majority of existing research findings in CSS are based on laboratory studies with a small number of study participants (usually fewer than 30) and simulated information needs. LACrOSSE will be carried out on the basis of complex real-world information needs. By leveraging a large number of MOOC learners for my studies, I am not only able to use research on small-group collaborative search as a starting point for large-scale CSS, but can also verify and validate previous research findings in a setting that is several orders of magnitude larger and more realistic.

The Research Topics

(1) Scaling up existing technologies

The need for collaborative search arises regularly, not only in the context of tertiary education, but in general Web search as well, with users relying on commercial Web search engines to plan trips together or decide on a common purchase (two classic tasks of CSS laboratory studies). To enable collaborations, a number of CSS prototypes have been developed in the past, many of which have been confined to small-scale, single-use laboratory studies with small groups of collaborators. In this setting, collaborative search has generally been found to lead to more effective search behaviour (groups find relevant documents faster than single users) and a greater coverage of the search space (groups find more unique relevant documents than single users). LACrOSSE will empirically investigate the boundaries of existing technologies (by employing them in the MOOC setting), guided by the following questions:

  • What group sizes can existing approaches to explicit collaborative search support effectively?
  • What impact do group coherence and diversity have on search effectiveness and coverage?
  • At what point does the cognitive effort of collaboration outweigh the benefits?
  • Are the guiding principles of small-group CSS valid in large-group settings?
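To make the notion of search-space coverage mentioned above a little more concrete, here is a minimal Python sketch. It assumes coverage is measured as the fraction of known relevant documents found by at least one group member; the function name, the data layout and the toy document identifiers are illustrative assumptions, not part of any existing CSS prototype.

```python
def coverage(found_by_member, relevant):
    """Fraction of the relevant documents found by at least one member.

    found_by_member: dict mapping a member id to the set of documents
                     that member found (hypothetical data layout).
    relevant:        set of documents judged relevant for the task.
    """
    if not relevant:
        return 0.0
    # Union over all members: a document counts once, however many found it.
    found = set().union(*found_by_member.values())
    return len(found & relevant) / len(relevant)


# Toy data: four relevant documents, two collaborators.
relevant_docs = {"d1", "d2", "d3", "d4"}
group = {"anna": {"d1", "d2", "x9"}, "ben": {"d2", "d3"}}
single = {"anna": group["anna"]}

group_coverage = coverage(group, relevant_docs)
single_coverage = coverage(single, relevant_docs)
```

In this toy case the pair covers more of the relevant set than either member alone, which is the small-group effect that the questions above ask about for much larger groups.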

(2) Iterative collaborative search & sensemaking

One of LACrOSSE’s challenges is how to effectively enable a large number of collaborators (hundreds) to work together towards a goal. The asynchronous nature of online learning (learners are active at different times) offers a natural manner of partitioning the collaborative experience: groups of collaborators work in iterations and in each iteration group members have access to a subset of the results achieved in the previous iteration. The questions to be explored here include:

  • How can iterative search be supported algorithmically? Based on which criteria should the search space be partitioned across collaborators and iterations?
  • How can users visualize their sensemaking process effectively (besides knowledge maps)?
  • For how many iterations do we observe positive effects in knowledge gain, depth of understanding and speed of content assimilation?
  • Do some factors negatively influence the utility of iterative sensemaking, such as the diverse expertise of the collaborators or inaccuracies in the generated mental models?
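The iteration scheme described above can be sketched in a few lines of Python. This is only a rough illustration under stated assumptions: the subset handed to each learner is a uniform random sample of the previous iteration's results (which criteria to use is exactly one of the open questions), and `run_iterations`, `toy_search` and `share_size` are hypothetical names introduced for this example.

```python
import random


def run_iterations(groups, search, share_size, seed=0):
    """Run one group of learners per iteration.

    groups:     list of lists of learner ids, one list per iteration.
    search:     callback (learner, inherited_results) -> iterable of results;
                stands in for a learner's actual search session.
    share_size: how many previous results each learner gets to see.
    """
    rng = random.Random(seed)
    previous = []  # results achieved in the previous iteration
    history = []   # results of every iteration, in order
    for group in groups:
        current = set()
        for learner in group:
            # Each learner sees only a subset of the previous results.
            k = min(share_size, len(previous))
            inherited = rng.sample(previous, k)
            current.update(search(learner, inherited))
        previous = sorted(current)
        history.append(previous)
    return history


def toy_search(learner, inherited):
    # Placeholder behaviour: keep what was inherited, add one new result.
    return list(inherited) + ["doc-" + learner]


history = run_iterations([["a", "b"], ["c"]], toy_search, share_size=1)
```

Here the second iteration's single learner inherits one of the two results from the first iteration and adds their own. Replacing the random sample with a smarter partitioning strategy is where the algorithmic questions above would come in.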

(3) Different roles for different people

An alternative, orthogonal strategy to iterative CSS is the assignment of different roles to collaborators based on their preferences, expertise and availability:

  • How many different types of roles (and which ones) are beneficial to assign in large-group collaborative search? How can collaborators effectively share a single role?
  • What is the interplay between iterative collaborative search and role-based collaborative search? Do they lead to additional gains in our assessment metrics when combined in a single iterative and role-based collaborative search setup?

(4) Learning through search metrics

The existing indicators of learning whilst searching (e.g. query complexity and document display time) are based on heuristics and intuitions instead of user models and have been investigated for individual users only. Using these indicators as a foundation to build upon, questions to investigate include:

  • How can the amount of learning that is taking place in the search process be quantified on a group and individual user level?
  • Which models of user or group behaviour are sufficiently accurate and elementary to enable the development of feasible model-based learning metrics?
  • When and by whom (all users or only a subset of users involved in the search) does learning occur in the collaborative search process?


If this topic interests you, get in touch; I am always happy to collaborate! If you are looking for a PhD position and have a background in information retrieval, contact me as well. An official PhD position announcement will be made in the upcoming weeks.