data.zip (425.27 kB)

Title and subtitles of Wikipedia articles

Dataset posted on 06.06.2017 by David Sanchez-Charles
This dataset contains 871 Wikipedia articles (retrieved on 8 August 2016), selected from the list of featured articles (https://en.wikipedia.org/wiki/Wikipedia:Featured_articles) in the 'Media', 'Literature and Theater', 'Music biographies', 'Media biographies', 'History biographies' and 'Video gaming' categories. For each article, the document structure, i.e. the titles of its sections and subsections, was extracted. The dataset also contains a proposed clustering of the event names, intended to make the Wikipedia articles easier to compare.

Contributors

CA Strategic Research

Publisher

4TU.Centre for Research Data

Format

Media types: application/zip, text/csv
