All errors are my own! Though if you find any, an email would be appreciated ....
Be aware that some of these questions may not make a lot of sense outside of the taught course.
-Claudia Hauff

Graphs and Pregel/Giraph

For graphs similar to the one shown below, at most how many iterations are required to compute parallel breadth-first search in Hadoop?
  1. The max. number of iterations depends on the diameter of the graph.
    Incorrect.
  2. The max. number of iterations depends on the number of nodes in the graph.
    Correct!
  3. The max. number of iterations depends on the number of edges in the graph.
    Incorrect.
  4. The max. number of iterations does not depend on either of these three aspects (diameter, number of nodes, number of edges).
    Incorrect.

You are given the Giraph code shown below. You can assume as input a directed graph (all edge weights are 1.0) which is encoded in adjacency list format (each node is encoded together with all its outgoing edges). What does this code compute?
  1. A node's PageRank score.
    Incorrect.
  2. The shortest path between any two nodes in the graph.
    Incorrect.
  3. A node's inlink count.
    Correct!
  4. A node's outlink count.
    Incorrect.
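
A minimal sketch of how an inlink count can be computed in the vertex-centric model, written in Python rather than against Giraph's Java API. It is not the listing the question refers to; the function name `inlink_counts` and the sample graph are made up for illustration. The input follows the question's assumption that each node is stored with its outgoing edges.

```python
from collections import defaultdict


def inlink_counts(adjacency):
    """Simulate two Pregel-style supersteps that count incoming edges.

    adjacency maps each vertex ID to the list of vertices it links to.
    """
    # Superstep 0: every vertex sends the value 1 along each outgoing edge.
    inbox = defaultdict(list)
    for vertex, targets in adjacency.items():
        for target in targets:
            inbox[target].append(1)

    # Superstep 1: every vertex sums the messages it received, then halts.
    return {vertex: sum(inbox[vertex]) for vertex in adjacency}


# Hypothetical example graph: a -> b, a -> c, b -> c, c -> a, d -> c
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
counts = inlink_counts(graph)
print(counts)
```

Because a vertex only knows its outgoing edges, it cannot count its inlinks locally; the count only becomes available once the messages from superstep 0 have been delivered.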

To compute breadth-first search in Giraph, what information needs to be sent across the network in each superstep?
  1. The entire graph structure.
    Incorrect.
  2. The recomputed distances.
    Correct!
  3. The nodes and their associated meta-data.
    Incorrect.
  4. The edges and their associated meta-data.
    Incorrect.
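
The following Python sketch simulates vertex-centric breadth-first search to show that only candidate distances travel between vertices, while the graph structure stays where it is. It is a simplification under stated assumptions: the name `bfs_supersteps` is invented, all edges have weight 1, and the source is seeded with a message of distance 0 instead of initialising itself in the first superstep.

```python
import math


def bfs_supersteps(adjacency, source):
    """Vertex-centric BFS with unit edge weights.

    Returns the final distances and the number of supersteps executed.
    """
    distance = {v: math.inf for v in adjacency}
    # Messages in flight: target vertex -> list of candidate distances.
    inbox = {source: [0]}
    supersteps = 0

    while inbox:
        outbox = {}
        for vertex, candidates in inbox.items():
            best = min(candidates)
            if best < distance[vertex]:
                # The distance improved: store it and notify the neighbours.
                distance[vertex] = best
                for target in adjacency[vertex]:
                    outbox.setdefault(target, []).append(best + 1)
            # Otherwise the vertex sends nothing, i.e. it votes to halt.
        inbox = outbox
        supersteps += 1

    return distance, supersteps


# Hypothetical example graph.
graph = {"s": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["d"], "d": []}
distances, steps = bfs_supersteps(graph, "s")
print(distances, steps)
```

The only data placed in `outbox` are numbers (the recomputed distances); the adjacency lists are read locally by each vertex and never shipped.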

To compute PageRank in Pregel, an Aggregator is used. What is it used for?
  1. To compute the out-degree of each node.
    Incorrect.
  2. To compute the PageRank mass of dangling nodes.
    Correct!
  3. To compute the sum of all PageRank mass in the graph.
    Incorrect.
  4. To compute the PageRank score of each node.
    Incorrect.
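
As a rough illustration of why dangling nodes need special handling, the Python sketch below performs one PageRank update in which the mass of vertices without outgoing edges is collected in a single shared value and then spread evenly over all vertices. The function name `pagerank_step` and the damping value 0.85 are assumptions for this example. In Pregel the aggregated value would only become visible in the following superstep; here it is applied immediately to keep the sketch short.

```python
def pagerank_step(adjacency, rank, damping=0.85):
    """One PageRank update with an aggregator-like sum for dangling mass."""
    n = len(adjacency)
    incoming = {v: 0.0 for v in adjacency}
    dangling_mass = 0.0  # plays the role of the aggregator

    for vertex, targets in adjacency.items():
        if targets:
            # Distribute the vertex's rank evenly over its outgoing edges.
            share = rank[vertex] / len(targets)
            for target in targets:
                incoming[target] += share
        else:
            # A dangling vertex has nobody to send to; it reports its
            # mass to the aggregator instead.
            dangling_mass += rank[vertex]

    # The aggregated dangling mass is spread evenly over all vertices.
    return {
        v: (1 - damping) / n + damping * (incoming[v] + dangling_mass / n)
        for v in adjacency
    }


# Hypothetical example graph: a -> b -> c, where c is dangling.
graph = {"a": ["b"], "b": ["c"], "c": []}
start = {v: 1 / len(graph) for v in graph}
updated = pagerank_step(graph, start)
print(updated, sum(updated.values()))
```

Without the `dangling_mass` term the scores would no longer sum to 1, because the rank held by `c` would leak out of the graph in every iteration.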

An aggregator in Giraph/Pregel is somewhat similar to a counter in Hadoop. What is the main difference?
  1. Counters are meant to let machines communicate small values; aggregators are not meant to do that.
    Incorrect.
  2. Aggregators provide reliable values during the execution of a job; counters only provide reliable values after the job has been executed.
    Correct!
  3. Counters can be used to terminate a job, based on the counter value. Aggregators cannot be used to achieve this.
    Incorrect.
  4. Aggregators and counters are the same. Only their names are different.
    Incorrect.

Which of the following statements is true? Vertices that have voted to halt in Pregel/Giraph ...
  1. cannot be reactivated.
    Incorrect.
  2. can only be reactivated through an aggregator.
    Incorrect.
  3. can be reactivated through incoming messages.
    Correct!
  4. can be reactivated through the combiner.
    Incorrect.

When Pregel/Giraph are used to compute the minimum existing vertex value in a directed graph, the final value read from an arbitrary vertex will always be correct if ...
  1. the graph is strongly connected.
    Correct!
  2. the graph is weakly connected.
    Incorrect.
  3. the graph is bipartite.
    Incorrect.
  4. the graph has a bow-tie structure.
    Incorrect.
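
The difference between strong and weak connectivity can be seen in a small simulation. The Python sketch below propagates the minimum value along directed edges in Pregel style; the name `propagate_minimum` and both example graphs are invented for illustration. A vertex that receives no improving message stays halted, and a halted vertex is woken up again only when a message arrives.

```python
def propagate_minimum(adjacency, values):
    """Pregel-style minimum propagation along directed edges."""
    current = dict(values)

    # Superstep 0: every vertex announces its own value to its out-neighbours.
    inbox = {}
    for vertex, targets in adjacency.items():
        for target in targets:
            inbox.setdefault(target, []).append(current[vertex])

    # Later supersteps: only vertices with incoming messages are active.
    while inbox:
        outbox = {}
        for vertex, received in inbox.items():
            smallest = min(received)
            if smallest < current[vertex]:
                # The vertex adopts the smaller value and forwards it.
                current[vertex] = smallest
                for target in adjacency[vertex]:
                    outbox.setdefault(target, []).append(smallest)
        inbox = outbox
    return current


values = {"a": 3, "b": 1, "c": 2}
weak = {"a": ["b"], "b": ["c"], "c": []}       # chain a -> b -> c
strong = {"a": ["b"], "b": ["c"], "c": ["a"]}  # directed cycle

weak_result = propagate_minimum(weak, values)
strong_result = propagate_minimum(strong, values)
print(weak_result)
print(strong_result)
```

In the chain, vertex `a` has no incoming edge, so the minimum held by `b` never reaches it and `a` keeps its own value. In the cycle every vertex is reachable from every other vertex, so all of them end up with the global minimum.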

What is the minimum number of supersteps required when using Pregel/Giraph to compute the outdegree of each vertex in a directed graph?
  1. One superstep if the adjacency list of a vertex $v$ contains the IDs of all vertices linking to $v$.
    Incorrect.
  2. One superstep if the adjacency list of a vertex $v$ contains the IDs of all vertices that $v$ links to.
    Correct!
  3. Zero supersteps if the adjacency list of a vertex $v$ contains the IDs of all vertices that $v$ links to.
    Incorrect.
  4. Two supersteps if the adjacency list of a vertex $v$ contains the IDs of all vertices linking to $v$.
    Incorrect.
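
When the adjacency list of a vertex holds the vertices it links to, the outdegree is purely local information. A minimal Python sketch, with the hypothetical function name `outdegrees`, of what each vertex would compute in that single superstep:

```python
def outdegrees(adjacency):
    """Single superstep: each vertex counts its own outgoing edges."""
    # No messages are exchanged; the answer is available locally.
    return {vertex: len(targets) for vertex, targets in adjacency.items()}


# Hypothetical example graph.
graph = {"a": ["b", "c"], "b": ["c"], "c": []}
degrees = outdegrees(graph)
print(degrees)
```

At least one superstep is still needed because a vertex can only run its compute logic inside a superstep, even if it never sends a message.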

When does the shuffle & sort phase take place in Giraph?
  1. After every superstep.
    Incorrect.
  2. After every $n$ supersteps ($n$ can be set by the developer).
    Incorrect.
  3. Never.
    Correct!
  4. That depends on the task to solve; it can be switched on by the developer.
    Incorrect.

Which of the following statements about Giraph are correct?