Data and Code for the PhD Thesis "Sensing the Cultural Significance with AI for Social Inclusion"

doi: 10.4121/42144de2-d61e-48b9-a288-aa4da3a806fe.v1
The doi above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
doi: 10.4121/42144de2-d61e-48b9-a288-aa4da3a806fe
Datacite citation style:
Bai, Nan (2023): Data and Code for the PhD Thesis "Sensing the Cultural Significance with AI for Social Inclusion". Version 1. 4TU.ResearchData. dataset.
Other citation styles (APA, Harvard, MLA, Vancouver, Chicago, IEEE) available at Datacite

This is the repository of all the research data for the PhD thesis of doctoral candidate Nan BAI from the Faculty of Architecture and the Built Environment at Delft University of Technology, titled '*Sensing the Cultural Significance with AI for Social Inclusion: A Computational Spatiotemporal Network-based Framework of Heritage Knowledge Documentation using User-Generated*', to be defended on October 5th, 2023.

Social inclusion has become an increasingly important goal in heritage management. Whereas the 2011 UNESCO Recommendation on the Historic Urban Landscape (HUL) called for tools of knowledge documentation, social media already functions as a platform on which online communities actively engage in heritage-related discussions. Such discussions happen both in “baseline scenarios”, when people calmly share their experiences of the cities they live in or travel to, and in “activated scenarios”, when radical events trigger their emotions. To organize, process, and analyse the massive, unstructured, multi-modal (mainly image and text) user-generated data from social media efficiently and systematically, Artificial Intelligence (AI) is shown to be indispensable. This thesis explores the use of AI in a methodological framework that uses user-generated data to include contributions from a larger and more diverse group of participants. It is an interdisciplinary study integrating methods and knowledge from heritage studies, computer science, social sciences, network science, and spatial analysis. AI models were applied, nurtured, and tested to analyse the massive information content and derive knowledge of the cultural significance perceived by online communities. The framework was tested in case study cities including Venice, Paris, Suzhou, Amsterdam, and Rome for the baseline and/or activated scenarios. The AI-based methodological framework proposed in this thesis is shown to be able to collect information in cities and map the communities' knowledge of cultural significance, fulfilling the expectations and requirements of HUL and providing useful information for future socially inclusive heritage management processes.

Some parts of this data are published as GitHub repositories:

WHOSe Heritage

The data of Chapter_3_Lexicon is published as a GitHub repository, which also contains the code for the paper "WHOSe Heritage: Classification of UNESCO World Heritage Statements of “Outstanding Universal Value” Documents with Soft Labels", published in Findings of EMNLP 2021.

Heri Graphs

The data of Chapter_4_Datasets is published as a GitHub repository, which also contains the code and dataset for the paper "Heri-Graphs: A Dataset Creation Framework for Multi-modal Machine Learning on Graphs of Heritage Values and Attributes with Social Media", published in ISPRS International Journal of Geo-Information. It shows the collection, preprocessing, and rearrangement of data related to heritage values and attributes in three cities that have canal-related UNESCO World Heritage properties: Venice, Suzhou, and Amsterdam.

Stones Venice

The data of Chapter_5_Mapping is published as a GitHub repository, which also contains the code and dataset for the paper "Screening the stones of Venice: Mapping social perceptions of cultural significance through graph-based semi-supervised classification", published in ISPRS Journal of Photogrammetry and Remote Sensing. It shows the mapping of cultural significance in the city of Venice.

  • 2023-09-25 first online, published, posted
A .zip file containing .py Python scripts, .ipynb Jupyter notebooks, .png images, and .csv datasets
  • Heriland (grant code 813883), European Union’s Horizon 2020 research and innovation programme
TU Delft, Faculty of Architecture and the Built Environment, Architectural Engineering + Technology;
Heriland College of Heritage Planning

