# Replication data for "Information Frictions, Overconfidence, and Learning: Experimental Evidence from a Floodplain"

This .zip archive contains data for the partial replication (see "Sharing and access information" below) of the working paper "Information Frictions, Overconfidence, and Learning: Experimental Evidence from a Floodplain" by [Sofia Badini](https://sofiabadini.github.io/). The pre-analysis plan for this project is hosted at the [Open Science Framework](https://osf.io/yxc3m), while the most current working paper version is [here](https://drive.google.com/file/d/12N7N-KCTPBidlzxtHDkb5e8cirPAoJeh/view?usp=sharing).

**Abstract** (as of 25/06/2024): I use an online experiment to study whether offering information to floodplain residents is sufficient to change their perceived risk exposure and demand for insurance. The participants are offered information on the flood risk profile at their address and on the rules over compensation of flood damages. I find that respondents tend to misperceive their risk category according to publicly available flood maps, but express high levels of confidence in their guesses. When not prompted to engage with the information they are offered, one third of them read nothing. Respondents who are asked to read information on their risk profile tend to stop reading any further and report a lower willingness-to-pay for insurance. However, this effect does not seem to be driven by respondents learning more from the information they are provided with, at least based on how they update their beliefs. Instead, I find suggestive evidence of backlash to information among residents of high risk areas and individuals who initially underestimated their risk category.

## Sharing and access information

This data is automatically downloaded by the replication package of this project, which will be hosted on [GitHub](https://github.com/SofiaBadini). This version only includes data that is open access. A future version will include all data necessary for the replication, and thus will be subject to restrictions.

## Data structure and data sources

The .zip archive *replication_data* contains four sub-folders:

  * Folder *BAG*: contains data about all the (full) addresses in the Netherlands, derived from the Addresses and Buildings Key Registry (Basisregistraties Adressen en Gebouwen, or BAG) and acquired in .csv format via [Geotoko](https://geotoko.nl/) in July 2022. The file ``bag-adressen-woning-nl.csv.zip`` provides building-related information, such as the function of each (dwelling within a) building, its perimeter and size, and its year of construction. The dataset codebook ``codebook-bag-2022[DUTCH].pdf`` (in Dutch) is included in the folder.

  * Folder *ENW*: contains shapefiles representing the areas of the southern Netherlands (Limburg and North Brabant) that were flooded in July 2021. These maps have been shared by the ENW (Expertise Netwerk Waterveiligheid), the association of Dutch flood protection specialists, via the 4TU.ResearchData repository of Delft University of Technology (see [here](https://data.4tu.nl/search?search=%22Limburg%20floods%22)). The maps are based on the best available information collected via aerial photography and during fieldwork at the flooded sites. They include the realized inundation extent (in the sub-folders *floodsGeul*, *floodsMaas*, and *floodsRoer*), areas evacuated via emergency ordinances (in the sub-folder *evacuations*), and locations where incidents involving water management infrastructure occurred (in the sub-folder *incidents*).

  * Folder *RISICOKAART*: contains shapefiles of the flood maps developed for the European Floods Directive (ROR2), delivered at the end of 2019 via the [Risicokaart website](https://www.risicokaart.nl/) and obtained in October 2021 by contacting lbo@risicokaart.nl. The sub-folder *floodmaps2019* contains the shapefiles of flood maps developed for four different scenarios: large probability (10% risk in any given year, named *10depth*), medium probability (1% risk in any given year, named *100depth*), small probability (0.1% risk in any given year, or 1 in 1,000 years flood, named *1000depth*), and extraordinary events (0.01% risk in any given year, or 1 in 10,000 years flood, named *10000depth*). These shapefiles consist of polygons classified by maximum water depth. The sub-folder *otherinfo* contains additional data on the location of primary water defences (*primarydef2019*, which protect the Netherlands against floods from major rivers and the sea) and of regional water defences (*regionaldef2019*, which protect the Netherlands against floods from smaller rivers).

  * Folder *SURVEY*: contains, for now, synthetic data created by the author. This project uses survey data collected with the platform Qualtrics. The original dataset, anonymized and restricted to the subset of the Qualtrics data that I used for the analysis, will be shared once this paper is published. The *SURVEY* folder contains the dataset of survey recipients without any kind of identifying information (e.g., no addresses), named ``survey_recipients.csv``, and a synthetic dataset named ``synthetic_survey_data.csv``, generated using [Synthetic Data Vault (SDV)](https://sdv.dev/) and based on the original Qualtrics data. The folder also contains two further datasets: ``original_variables.csv``, which details which original variables will be cleaned (where by original I mean the "raw" data collected by Qualtrics), and ``final_variables.csv``, which details which variables will appear in the final dataset.
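
As a quick orientation aid, the folder layout and the flood scenario naming described above can be sketched in Python using only the standard library. This is a minimal illustration and not part of the replication package: the helper names (`files_by_top_folder`, `return_period_years`) are hypothetical, and the exact nesting of paths inside the archive is an assumption, so the grouping simply looks for the four folder names anywhere in each path.

```python
import zipfile
from collections import defaultdict

# The four data folders named in this README.
DATA_FOLDERS = ("BAG", "ENW", "RISICOKAART", "SURVEY")

# Scenario names from *floodmaps2019*, mapped to the annual flood
# probability stated in this README (e.g. "100depth" = 1% per year).
SCENARIO_ANNUAL_PROBABILITY = {
    "10depth": 0.10,
    "100depth": 0.01,
    "1000depth": 0.001,
    "10000depth": 0.0001,
}


def return_period_years(scenario):
    """Return period implied by a scenario name, e.g. '100depth' -> 100."""
    suffix = "depth"
    if not scenario.endswith(suffix):
        raise ValueError("unexpected scenario name: %r" % scenario)
    return int(scenario[: -len(suffix)])


def files_by_top_folder(archive):
    """Group the files in a .zip archive by the data folder they sit in.

    `archive` is a path or a binary file-like object. Files that are not
    inside one of the four data folders are grouped under "other".
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(archive) as zf:
        for name in zf.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            parts = name.split("/")
            folder = next((p for p in parts if p in DATA_FOLDERS), "other")
            groups[folder].append(name)
    return dict(groups)


# Example call (the archive file name is an assumption):
# groups = files_by_top_folder("replication_data.zip")
# print(sorted(groups))
```

The scenario mapping makes the naming convention explicit: the number in each name is the return period in years, so the annual probability is its reciprocal.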


## Contact

For further information, please contact sofia.badini@wur.nl
