Data related to the paper "Studying social unrest through the lens of social media"
doi: 10.4121/649e8f5d-8e40-4ab7-9d07-b5ef53d810f0
Dataset corresponding to the paper "Studying social unrest through the lens of social media".
A total of 107,674 geolocated visual posts from a social media platform were collected during and after the 'Nahel Merzouk' riots of summer 2023 in 7 French cities. These posts were fed to a computer vision model with the objective of identifying riot-related posts. This dataset contains the metadata (date, time, and location) of those posts, along with the probability, according to the model, that each post represents a riot. Riot-related posts are then clustered into "events" based on their spatiotemporal proximity (see the paper for more details).
Columns:
- "timestamp" (TIMESTAMP): Date and time of the post
- "latitude" (REAL): Latitude at which the post was published
- "longitude" (REAL): Longitude at which the post was published
- "prob" (REAL): Probability that the post represents a riot according to the model
- "pred_class" (INTEGER): Binary variable with value 1 if the model classifies the post as representing a riot, 0 otherwise
- "event" (TEXT): Event associated with the post, structured as follows:
  - "No event" if the post is not marked as riot-related
  - "day_city_id", where "day" is the day of the month associated with the event (such as "2"), "city" is the city in which the event happened (such as "Paris"), and "id" is an integer. For example, "29_Marseille_0" corresponds to event "0" happening in Marseille on June 29th, 2023. If the value of "id" is "-1", the post could not be associated with any event.
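The "event" format described above can be unpacked with a few lines of Python. This is only a minimal sketch, assuming every value is either "No event" or follows the "day_city_id" pattern exactly; the names `EventLabel` and `parse_event` are hypothetical and not part of the dataset. Note that the label carries only the day of the month, so the month and year would have to come from the "timestamp" column.

```python
from typing import NamedTuple, Optional


class EventLabel(NamedTuple):
    """Hypothetical container for the three parts of an "event" value."""
    day: int       # day of the month associated with the event
    city: str      # city in which the event happened
    event_id: int  # -1 means the post could not be associated with any event


def parse_event(value: str) -> Optional[EventLabel]:
    """Return None for "No event", otherwise split a "day_city_id" string."""
    if value == "No event":
        return None
    # Split from both ends so that a city name containing "_" would stay
    # intact (an assumption; the documented examples contain no such name).
    day, rest = value.split("_", 1)
    city, event_id = rest.rsplit("_", 1)
    return EventLabel(int(day), city, int(event_id))
```

For instance, `parse_event("29_Marseille_0")` yields day 29, city "Marseille", and event id 0, while `parse_event("No event")` returns `None`.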
History:
- 2024-12-12 first online, published, posted
DATA - under embargo
The files in this dataset are under embargo until 2025-05-31.
Reason: Embargo during the review process of the article