The Wayback Machine - https://web.archive.org/web/20210120085538/https://github.com/ecly/see_et_al_2017_rouge
# ROUGE scoring See et al. (2017)

Repository to replicate the ROUGE scores from See et al. (2017).

We find that the reported scores correspond to those produced by the Python re-implementation py-rouge, rather than those produced by pyrouge, the wrapper around the official ROUGE-1.5.5 Perl implementation.

The `evaluate.py` script accepts a hypothesis folder and a reference folder, computes the ROUGE scores with both py-rouge and pyrouge, and prints them to standard out.

The `test_output` folder contains the test outputs from See et al. (2017), which can be downloaded from the README.md of the official repository.
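For reference, ROUGE-1 F1 is the harmonic mean of unigram precision and recall between a hypothesis summary and a reference summary. The following is a minimal illustrative sketch of that computation; the function name `rouge1_f1` is ours, not from the repository, and real implementations additionally apply stemming and tokenization, which is part of why py-rouge and pyrouge disagree slightly.

```python
from collections import Counter

def rouge1_f1(hypothesis: str, reference: str) -> float:
    """Illustrative ROUGE-1 F1: unigram overlap F-score (no stemming)."""
    hyp = Counter(hypothesis.lower().split())
    ref = Counter(reference.lower().split())
    # Clipped overlap: each hypothesis token counts at most as often
    # as it appears in the reference.
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```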

## Setup

```
pip install py-rouge pyrouge
```

### pyrouge prerequisites

Ensure the Perl XML parser library is installed:

- On Arch Linux: `sudo pacman -S perl-xml-xpath`
- On Ubuntu: `sudo apt-get install libxml-parser-perl`

ROUGE-1.5.5 installation tips/debugging:
https://stackoverflow.com/questions/47045436/how-to-install-the-python-package-pyrouge-on-microsoft-windows

## Evaluate

Note that pyrouge evaluates roughly 4× slower than py-rouge, so some patience is required.

```
# Evaluate Pointer Generator
$> python evaluate.py test_output/pointer-gen test_output/reference

Python (py-rouge) scores:
         ROUGE-1 (F1): 36.43
         ROUGE-2 (F1): 15.66
         ROUGE-L (F1): 33.42

Perl (pyrouge) scores:
         ROUGE-1 (F1): 36.16
         ROUGE-2 (F1): 15.61
         ROUGE-L (F1): 33.21

# Evaluate Pointer Generator + Coverage
$> python evaluate.py test_output/pointer-gen-cov test_output/reference

Python (py-rouge) scores:
         ROUGE-1 (F1): 39.53
         ROUGE-2 (F1): 17.28
         ROUGE-L (F1): 36.38

Perl (pyrouge) scores:
         ROUGE-1 (F1): 39.24
         ROUGE-2 (F1): 17.22
         ROUGE-L (F1): 36.15
```
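The ROUGE-L rows above score the longest common subsequence (LCS) between hypothesis and reference rather than fixed n-gram overlap. A minimal sketch of the idea, under the same assumptions as before (function names are ours; real scorers additionally stem tokens and handle multi-sentence summaries sentence by sentence):

```python
def lcs_len(a, b):
    """Length of the longest common subsequence via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l_f1(hypothesis: str, reference: str) -> float:
    """Illustrative ROUGE-L F1: F-score over LCS length (no stemming)."""
    hyp, ref = hypothesis.lower().split(), reference.lower().split()
    lcs = lcs_len(hyp, ref)
    if lcs == 0:
        return 0.0
    precision, recall = lcs / len(hyp), lcs / len(ref)
    return 2 * precision * recall / (precision + recall)
```

Unlike ROUGE-1, this rewards tokens appearing in the same order in both summaries, not just shared vocabulary.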
