TY  - DATA
T1  - Title and subtitles of Wikipedia articles
PY  - 2017/06/06
AU  - David Sanchez-Charles
UR  - https://data.4tu.nl/articles/dataset/Title_and_subtitles_of_Wikipedia_articles/12697145/1
DO  - 10.4121/uuid:61fb9665-40ab-4b70-8214-767c521cc950
KW  - text events
KW  - text mining
KW  - unstructured process
N2  - This dataset contains 871 Wikipedia articles (retrieved on 8 August 2016), selected from the list of featured articles (https://en.wikipedia.org/wiki/Wikipedia:Featured_articles) in the 'Media', 'Literature and Theater', 'Music biographies', 'Media biographies', 'History biographies' and 'Video gaming' categories. For each article, the document structure, i.e. the sections and subsections of the text, is extracted. The dataset also contains a proposed clustering of the event names, intended to make the Wikipedia articles easier to compare.
ER  - 