How to recommend an unseen paper?

The script recommend.py takes a trained DSSM and a stream of unseen papers as input and decides, for each paper, whether it should be recommended.

Usage

In a console, you can use the following command:

$ python recommend.py -d "../data/dataset-pasi.npz" -n "Gabriella Pasi" -s "pasi" -af "../data/pasi-papers.txt" -v "../data/dssm-pasi.npz" -u "../data/unseen-papers.txt"

API

recommend.compute_features_batch(papers, ngrams=None, verbosity=1)

Compute the features of each paper in the given list with respect to the given n-grams.

Parameters:
  • papers (list of dicts) – the list of papers whose features are to be computed
  • ngrams (list of strings) – the n-grams with which we compute the features
  • verbosity (int) – 0: quiet, 1: normal, 2: high
Returns: the features of each paper, identified by its id
Return type: dict
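
To illustrate the return shape described above, here is a minimal sketch. It is not the project's implementation. It assumes that each paper dict has the hypothetical keys `id`, `title` and `abstract`, and that a feature vector is simply the number of occurrences of each n-gram in the lowercased title and abstract. The real `compute_features_batch` may differ on every one of these points.

```python
def count_ngram(text, ngram):
    """Count (possibly overlapping) occurrences of `ngram` in `text`."""
    if not ngram:
        return 0
    width = len(ngram)
    return sum(
        1 for i in range(len(text) - width + 1) if text[i:i + width] == ngram
    )


def sketch_features_batch(papers, ngrams):
    """Hypothetical stand-in for recommend.compute_features_batch.

    Returns a dict mapping each paper's id to a list of counts,
    one per n-gram, in the same order as `ngrams`.
    """
    features = {}
    for paper in papers:
        # Assumed keys: "id", "title", "abstract" (not confirmed by the docs).
        text = (paper["title"] + " " + paper["abstract"]).lower()
        features[paper["id"]] = [count_ngram(text, g) for g in ngrams]
    return features


# Made-up papers, for illustration only.
papers = [
    {"id": "p1", "title": "Fuzzy retrieval", "abstract": "A fuzzy model."},
    {"id": "p2", "title": "Deep learning", "abstract": "Neural networks."},
]
ngrams = ["fuz", "zzy", "lea"]
features = sketch_features_batch(papers, ngrams)
```

Keying the result by paper id, as the documented return value does, lets later steps look up a paper's features without relying on list order.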

recommend.main(dataset, author_papers_file, author_name, author_slug, author_dssm, unseen_papers)

Given a stream of unseen papers, decide whether each paper should be recommended.

Parameters:
  • dataset (string) – path to the user’s dataset
  • author_papers_file (string) – path to the user’s papers file (raw text file)
  • author_name (string) – author’s full name
  • author_slug (string) – short and ASCII string for the author’s name
  • author_dssm (string) – path to the trained DSSM’s parameters file
  • unseen_papers (string) – path to the raw file containing unseen papers’ titles and abstracts
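
The command in the Usage section passes six options, and `main` takes six parameters. The sketch below shows one plausible mapping between them, written with `argparse`. The mapping is inferred from the parameter names and the example values only (for instance, `-v` is assumed to be the DSSM parameters file because its value is `dssm-pasi.npz`). How the script actually parses its arguments is not documented here.

```python
import argparse


def build_parser():
    """Build a parser mirroring the options of the Usage command.

    The flag-to-parameter mapping is an assumption, not taken from
    the script's source.
    """
    parser = argparse.ArgumentParser(prog="recommend.py")
    parser.add_argument("-d", dest="dataset", required=True)
    parser.add_argument("-n", dest="author_name", required=True)
    parser.add_argument("-s", dest="author_slug", required=True)
    parser.add_argument("-af", dest="author_papers_file", required=True)
    parser.add_argument("-v", dest="author_dssm", required=True)
    parser.add_argument("-u", dest="unseen_papers", required=True)
    return parser


# Same values as in the Usage command.
argv = [
    "-d", "../data/dataset-pasi.npz",
    "-n", "Gabriella Pasi",
    "-s", "pasi",
    "-af", "../data/pasi-papers.txt",
    "-v", "../data/dssm-pasi.npz",
    "-u", "../data/unseen-papers.txt",
]
kwargs = vars(build_parser().parse_args(argv))

# With the real module importable, the equivalent call would be:
# recommend.main(**kwargs)
```

Because the `dest` names match the parameter names of `main`, the parsed options can be passed as keyword arguments without further renaming.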