The document collection is MS MARCO Passages and has to be stored in collection.tsv.
Furthermore, for the doc2query-based approaches, descriptive queries for each document in the collection must be stored in doc2query.tsv.
This file can be automatically generated using this script.
An MS MARCO document collection has been provided here.
A pre-generated doc2query.tsv file has been made available here.
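
Before running the doc2query-based approaches, it can help to confirm that every passage in collection.tsv has a matching entry in doc2query.tsv. The following is a minimal sketch, not part of this repository. It assumes both files are tab-separated with the passage id in the first column; that is the layout of the standard MS MARCO passage collection, but it is only an assumption for doc2query.tsv. The helper names `load_tsv` and `missing_queries` are hypothetical.

```python
import csv


def load_tsv(path):
    """Read a TSV file into a dict: first column (id) -> list of remaining columns.

    Assumes one record per line with the id in the first column.
    """
    rows = {}
    with open(path, newline="", encoding="utf-8") as handle:
        # QUOTE_NONE keeps quotation marks inside passages as literal text.
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) < 2:
                continue  # skip blank or malformed lines
            rows[row[0]] = row[1:]
    return rows


def missing_queries(collection, queries):
    """Return the sorted ids that appear in the collection but not in the queries."""
    return sorted(set(collection) - set(queries))


if __name__ == "__main__":
    # File names as given above; adjust the paths to where the files are stored.
    collection = load_tsv("collection.tsv")
    queries = load_tsv("doc2query.tsv")
    missing = missing_queries(collection, queries)
    print(f"{len(collection)} passages, {len(missing)} without doc2query entries")
```

If the second number is not zero, doc2query.tsv was probably generated for a different or partial collection and should be regenerated or downloaded again.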