Oct-26-2016, 07:28 PM
I have modified this code quite a bit, and have put it on GitHub at:
https://github.com/Larz60p/NLTK-Corpora-Catalog
The output is now a better JSON file (dictionary format).
A sample of the printed output is:
Output:
RecId: RecId61
url: https://raw.githubusercontent.com/nltk/nltk_data/gh-pages/packages/corpora/sentiwordnet.zip
unzip: 1
license: Creative Commons Attribution ShareAlike 3.0 Unported license
id: sentiwordnet
webpage: http://sentiwordnet.isti.cnr.it/
subdir: corpora
author: Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani
copyright: Copyright (C) 2013 SentiWordNet Project
checksum: 5043f00829b7db4dd5f21507e092b76a
name: SentiWordNet
size: 4686546
unzipped_size: 13591402
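
For anyone who wants to read the catalog back in, here is a rough sketch of looking up one record and printing it in the style shown above. It assumes the JSON is a dictionary keyed by RecId, with each value holding that record's fields; the actual layout in the repository may differ. The names sample_catalog, find_by_id and format_record are made up for this illustration, and the sample only carries a few of the fields from the output above.

```python
import json

# Assumed layout (not confirmed against the repo): a dict keyed by RecId,
# each value being a dict of that corpus's fields.
sample_catalog = {
    "RecId61": {
        "id": "sentiwordnet",
        "name": "SentiWordNet",
        "subdir": "corpora",
        "size": "4686546",
        "unzipped_size": "13591402",
    }
}


def find_by_id(catalog, corpus_id):
    """Return (rec_id, record) for the first record whose 'id' matches, else None."""
    for rec_id, record in catalog.items():
        if record.get("id") == corpus_id:
            return rec_id, record
    return None


def format_record(rec_id, record):
    """Build the 'key: value' listing, starting with the RecId line."""
    lines = [f"RecId: {rec_id}"]
    lines.extend(f"{key}: {value}" for key, value in record.items())
    return "\n".join(lines)


# Round-trip through JSON text to mimic loading the file from disk.
catalog = json.loads(json.dumps(sample_catalog))

match = find_by_id(catalog, "sentiwordnet")
if match is not None:
    print(format_record(*match))
```

With a real file you would replace the round-trip line with json.load on the opened catalog file; the rest should work unchanged as long as the layout matches the assumption above.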