============================================================================================================
# General 
============================================================================================================
## Author(s): 
P (Puneet) Mishra 1) ORCID ID: 0000-0001-8895-798X
P (Peter) Offermans 2) ORCID ID: 0000-0001-5823-270X
MGJ (Manon) Mensink 1) ORCID ID: 0009-0002-9948-4481
HJ (Rick) van de Zedde 3) ORCID ID: 0000-0002-8394-4538
A (Aneesh) Chauhan 1) ORCID ID: 0000-0001-9012-0070
LMJ (Lydia) Meesters 1) ORCID ID: 0000-0002-1917-7119

## Data collection: 
P (Puneet) Mishra 1) ORCID ID: 0000-0001-8895-798X
N (Najim) El Harchioui 1)
MGJ (Manon) Mensink 1) ORCID ID: 0009-0002-9948-4481
MA (Mariska) Nijenhuis 1)
M (Miaomiao) Yao 1) ORCID ID: 0009-0002-8597-2222
GW (Gerwoud) Otten 1) ORCID ID: 0000-0003-1630-1142
P (Peter) Offermans 2) ORCID ID: 0000-0001-5823-270X
MH (Mathijs) van Kilsdonk 2) ORCID ID: 0009-0003-2048-986X

## Affiliation: 
1) Wageningen Food and Biobased Research, Wageningen University & Research  
2) OnePlanet Research Center
3) Wageningen Plant Research, Wageningen University & Research 

## Project: 
Research program Data-Driven and High-Tech KB-38-001-003
https://www.wur.nl/en/research-results/research-funded-by-the-ministry-of-lvvn/kennisbasis-onderzoek/ongoing-kb-projects-2019-2024/datadriven-and-hightech.htm 

## Contact: 
lydia.meesters@wur.nl 

============================================================================================================
# Title 
============================================================================================================
Dataset: Sensing Potential in the Food Supply Chain - Mango

============================================================================================================
# TableOfContents 
============================================================================================================
    - General
        - Author(s)
        - Data collection
        - Affiliation
        - Project
        - Contact
    - Title
    - TableOfContents
    - Methods
        - Introduction
        - Description
    - FolderContents
    - FileFormats
    - Software requirements
    - License
    - Contributing
    - Citation

============================================================================================================
# Methods 
============================================================================================================
## Introduction

Purpose of the research:
The overall aim of this research is to explore the feasibility of using sensing technology to measure fruit quality properties non-destructively, at both batch level and individual product level.

The specific objective for this dataset was to utilize various sensor data inputs for AI models aimed at quantifying the brix (sugar content) and firmness of mangoes. AI models were developed using data from individual sensors, as well as models based on the fusion of all sensor data. The brix and firmness were modeled using a multi-sensor approach, incorporating ultrasound and spectroscopy signals. Refractometer measurements for brix and puncture measurements for firmness served as the reference standards.

## Description

Data provenance: 
The data were collected at the research institute Wageningen Food & Biobased Research on 8 and 12 December 2023.

Origin mangos:
Three lots of mango, from two locations in Brazil, were shipped to Belgium. The fruit was obtained from a Belgian fruit importing company (reference VO/23/033442, WUR1702334) and transported to Wageningen Food & Biobased Research, Wageningen University & Research (WUR), Wageningen, NL, on December 1st 2023.
Both locations are GLOBALG.A.P. certified; the GGN label is known at Wageningen Food & Biobased Research.
Fruit from the three lots was delivered in boxes containing 4 kg of fruit:
	- Kent size 06, 17 boxes, 102 fruits, number of fruits in the dataset: 99
	- Palmer size 08, 38 boxes, 304 fruits, number of fruits in the dataset: 294
	- Palmer size 10, 10 boxes, 100 fruits, number of fruits in the dataset: 95
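
The lot figures above can be cross-checked with a few lines of Python. This is only a sketch; it assumes that the size code equals the number of fruits per 4 kg box, as described for the Size column of 02_reference.csv.

```python
# Lots as listed above:
# (cultivar, size code, boxes, fruits delivered, fruits in the dataset)
lots = [
    ("Kent", 6, 17, 102, 99),
    ("Palmer", 8, 38, 304, 294),
    ("Palmer", 10, 10, 100, 95),
]

for cultivar, size, boxes, delivered, in_dataset in lots:
    # Assumption: size code = number of fruits per 4 kg box
    assert boxes * size == delivered, (cultivar, size)

total_delivered = sum(lot[3] for lot in lots)
total_in_dataset = sum(lot[4] for lot in lots)
print(total_delivered, total_in_dataset)  # 506 488
```

The dataset total of 488 matches the number of measured fruits reported under "Data collection steps".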

Storage of the fruits:
Upon arrival, the fruits were placed in storage rooms at different temperatures (8°C, 15°C or 22°C) and 85% RH for a maximum of six days, until one day prior to the measurements, so that they reached different ripeness stages.
Firmness development was checked regularly. Once the fruits reached the desired firmness range, they were stored at 8°C and 85% RH for the remaining storage days (maximum storage days = 6).
One day before the measurements, on December 7th 2023, the fruits were moved to 20°C and 60% RH to avoid condensation during measurement and temperature-related differences in firmness.

Data collection steps:
In total 488 mangos were measured. Each mango received a unique number (from 1 to 490) which was written on the skin using a waterproof marker, on one of the mango sides, ensuring all measurements corresponded to a specific mango.
Fruit numbers 288 and 404 were missing, resulting in 488 fruits in total in the dataset.

The data were collected by different instruments in a sequential manner. The mangos were measured in 23 batches. 
The device measurement order for each batch of fruit was as follows:
	- Microwave (the data is not included)
	- Ultrasound
	- VNIR-NIR interactance (the data is not included)
	- Hyperspectral diffuse (the data is not included)
	- Raman
	- Felix (NIR)
	- Firmness reference measurement - Fruit texture analyzer (FTA)
	- Brix reference measurement - Refractometer
	- Dry matter reference measurement 

Steps undertaken in processing data:
The data formats have been converted to CSV file format. No additional data processing has been applied.
	- Felix data format *.f750dat has been converted to a CSV file format
	- Raman data format *.DX has been converted to a CSV file format
	 
Steps undertaken in analyzing data:
The data analysis is not included in this dataset publication.

============================================================================================================
# FolderContents 
============================================================================================================
The folder consists of the following files:

- This 01_readme.txt file

- The 02_reference.csv containing the reference data for all 488 measured fruits, numbered 1 to 490, where numbers 288 and 404 are missing.
	The reference measurements on the fresh fruits (firmness, Brix and fresh weight) were performed on December 8th 2023, after all non-destructive measurements had been completed.
	Dry weight was determined on December 12th 2023, after drying over the weekend.

	Firmness
	The skin from one mango side, where all non-destructive measurements were done, was removed using a cheese slicer.
	The FTA fruit texture analyzer from Güss Manufacturing Ltd (Strand, SA) model GS-15, serial nr. FTA21029 (PHT0002) mounted with a stainless steel plunger (diameter 7.9 mm), was used to perform the puncture test.
	The deformation curves were recorded in FTAGraph software (version 3.0035, January 31st, 2010).
	From the deformation curve, the maximum force needed in the puncture test, the slope over 0.5 to 1.5 mm measuring distance (representing limited compression)
	and traveled distance at the maximum force point (mm) were calculated in Excel.
	Settings FTA:
		Measure distance	7.9 mm
		Threshold		0.1 kg
		Speed forward		30 mm/sec
		Speed backward		40 mm/sec
		Measuring speed		10 mm/sec
		Reverse increment	20 mm
	After the firmness measurements, this mango cheek was cut off and three 9 mm plugs were taken around the area of the puncture hole. The plugs were trimmed from the seed side to 1 cm length.
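
The three firmness metrics described above (maximum force, slope over 0.5 to 1.5 mm, and distance at maximum force) were calculated in Excel; the exact spreadsheet formulas are not part of this dataset. The sketch below shows one possible way to derive them from a deformation curve. The function name `firmness_metrics` is hypothetical, and the use of a least-squares fit for the slope is an assumption.

```python
def firmness_metrics(distance_mm, force_kg, lo=0.5, hi=1.5):
    """Return (max force, slope over [lo, hi] mm, distance at max force)."""
    # Maximum force and the travelled distance at which it first occurs
    i_max = max(range(len(force_kg)), key=force_kg.__getitem__)
    max_force = force_kg[i_max]
    dist_at_max = distance_mm[i_max]

    # Assumption: slope = least-squares line through the points in [lo, hi] mm
    pts = [(d, f) for d, f in zip(distance_mm, force_kg) if lo <= d <= hi]
    n = len(pts)
    mean_d = sum(d for d, _ in pts) / n
    mean_f = sum(f for _, f in pts) / n
    sxx = sum((d - mean_d) ** 2 for d, _ in pts)
    sxy = sum((d - mean_d) * (f - mean_f) for d, f in pts)
    return max_force, sxy / sxx, dist_at_max


# Synthetic curve (NOT measured data): force rises by 2 kg/mm up to 3 mm,
# then drops slowly.
distance = [i * 0.25 for i in range(32)]  # 0 to 7.75 mm
force = [2 * d if d <= 3 else 6 - 0.5 * (d - 3) for d in distance]

max_force, slope, dist_at_max = firmness_metrics(distance, force)
print(max_force, slope, dist_at_max)
```

For this synthetic curve the function recovers the 2 kg/mm slope and places the maximum force at 3 mm.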

	Brix
	Juice from one of the plugs was collected in the measuring well of a HANNA HI 96801 refractometer, Hanna Instruments (Temse, BE).
	The measured brix value was entered manually in the data sheet in Excel.

	Dry matter content
	Fresh and dry weight of the other two plugs were determined using a three-digit scale PHT0380, Mettler Toledo (Tiel, NL) type MS403TS/00 serial nr. C150149398.
	For automated data transfer to Excel, BalanceLink software from Mettler Toledo was used (version 4.1.0 Mettler Toledo 1992-2012 Greifensee, CH).
	The samples were dried for 60 hours in aluminium trays in an Ezidri Ultra FD1000 food drier at 65°C, Hydraflow Industries Ltd (Upper Hutt, NZ).
	Dry matter content is defined as dry weight to fresh weight ratio expressed as percentage. 
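
The definition above can be written out as a short function. This is an illustrative sketch; the function name `dry_matter_percent` and the example weights are hypothetical and are not taken from the dataset.

```python
def dry_matter_percent(fresh_weight_g, dry_weight_g):
    """Dry matter content (%) = dry weight / fresh weight * 100."""
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    return 100.0 * dry_weight_g / fresh_weight_g


# Illustrative weights only (not taken from the dataset)
dm = dry_matter_percent(fresh_weight_g=2.0, dry_weight_g=0.25)
print(dm)  # 12.5
```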

	Merging of the data
	All data were merged in the 02_reference.csv file. The CSV file format is as follows: 
		row 1: column headers
		column A: fruit ID [1-490], numbers 288 and 404 missing
		column B: CV, cultivar Kent or Palmer,
		column C: Brix (°), the amount of soluble sugars,
		column D: DM (%), the dry matter content,
		column E: Max. force (kg), the maximum force (kg) needed in the puncture test to travel 7.9 mm in the fruit flesh,
		column F: Slope (kg/mm), the slope of the deformation curve over 0.5 to 1.5 mm traveling distance, representing limited compression
		column G: Distance at Max. force (mm), the traveled distance (mm) at the maximum force point,
		column H: Size, refers to number of fruit per box of 4 kg, e.g. size 6: fruits are ~ 650 g,
		column I: remark, any remark on fruit or measurement
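
A minimal sketch of reading this layout with the Python standard library is given below. It addresses columns by position (A to I) because the exact header strings are not reproduced here. The comma delimiter, the dot as decimal separator, the helper names and the two sample rows are assumptions made for illustration; the sample values are placeholders, not measured data.

```python
import csv
import io

# Numeric columns by position, following the layout above (A=0 ... I=8)
NUMERIC = {"brix": 2, "dm": 3, "max_force": 4, "slope": 5, "dist_at_max": 6}


def to_float(text):
    """Parse a numeric cell; empty or non-numeric cells become None."""
    try:
        return float(text)
    except ValueError:
        return None


def read_reference(handle, delimiter=","):
    """Return {fruit ID: record} from a file laid out like 02_reference.csv."""
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader)  # row 1: column headers
    fruits = {}
    for rec in reader:
        if not rec or not rec[0].strip():
            continue  # skip blank lines
        rec = rec + [""] * (9 - len(rec))  # pad rows without a remark
        entry = {name: to_float(rec[col]) for name, col in NUMERIC.items()}
        entry["cultivar"] = rec[1].strip()
        entry["size"] = rec[7].strip()
        entry["remark"] = rec[8].strip()
        fruits[int(rec[0])] = entry
    return fruits


# Placeholder rows for demonstration only (values are invented)
sample = io.StringIO(
    "ID,CV,Brix,DM,Max force,Slope,Distance,Size,remark\n"
    "1,Kent,12.5,15.0,3.2,1.1,4.0,6,\n"
    "2,Palmer,,14.0,2.8,0.9,3.5,8,placeholder remark\n"
)
ref = read_reference(sample)
print(ref[1]["brix"], ref[2]["brix"])  # 12.5 None
```

For the real file, the same function can be called as `read_reference(open("02_reference.csv", newline=""))`, adjusting `delimiter` if the file turns out to use semicolons.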

- The 03_felix_absorbance_spectra.csv file containing the absorbance spectra measured for 488 mangoes, numbered 1 to 490, where numbers 288 and 404 are missing.
	For all fruits, absorbance spectra in the NIR range from 285 to 1200 nm, in steps of 3 nm, were collected using a handheld NIR device Felix F-750 Produce Quality Meter, serial number 008-16021 (Camas, US).
	The device was placed on a lab bench. Each mango was put on the measuring area of the device, on top of the plain ring around the scan area, at the middle of the mango side holding the written fruit number.
	For every fruit one absorbance spectrum was saved in the Felix data format *.f750dat. These absorbance spectra were merged to a CSV file format:
		row 1: column headers
		column A: fruit ID [1-490], numbers 288 and 404 missing
		column B: Felix filename
		column C: Created Date & Time
		column D: Shutter Closed Integration Time
		column E: Shutter Open Integration Time Scans To Average
		columns F-KY: wavelength from 285 to 1200 nm
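
The column range above can be checked against the stated wavelength grid. The sketch below converts spreadsheet column letters to indices and compares the number of columns in F-KY with the number of wavelengths from 285 to 1200 nm in 3 nm steps. The helper name `column_index` is hypothetical.

```python
def column_index(letters):
    """Convert a spreadsheet column label (A, ..., Z, AA, ...) to a 1-based index."""
    index = 0
    for ch in letters.upper():
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


# Wavelength axis as described above: 285 to 1200 nm in steps of 3 nm
wavelengths_nm = list(range(285, 1201, 3))

n_spectral_columns = column_index("KY") - column_index("F") + 1
print(len(wavelengths_nm), n_spectral_columns)  # 306 306
```

The same helper gives 1023 columns for the range B-AMJ and 16000 columns for the range C-WQL.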
	 	
- The 04_raman.csv file contains the spectra measured for 488 mangoes, numbered 1 to 490, where numbers 288 and 404 are missing.
	Raman spectra in the range of 56 to 2934 nm were collected using an Avantes AvaSpec-HSC1024x58TEC-EVO, serial number 2201308U1.
	The measurement was done in a dark room, at the middle of the mango side holding the written fruit number.
	For every fruit one spectrum was collected, and the spectra were merged into a CSV file.
	The CSV file format is as follows:
		row 1: column headers
		column A: fruit ID [1-490]
		columns B-AMJ: wavelength from 56 to 2933 nm

- The 05_ultrasound.csv contains ultrasound reflectance data acquired using a Verasonics Vantage 32LE system.
	The system is equipped with an Ultran NCG500-D13-P25 non-contact focused-point 344 kHz transducer through a UTA 260-MUX adapter.
	The transducer was placed at a distance to the fruit such that its focal point approximately coincided with the targeted measurement spot at the fruit surface.
	The measurement spot was defined as the point on the widest perimeter of the marked side of the mango, facing the transducer, that was closest to the transducer.
	Six measurements were taken successively at this spot, at decreasing distance to the fruit in steps of 0.5 mm.
	Each measurement entails sending out a single 33.1 V pulse and recording the multiple reflections between fruit surface and probe, using the probe itself as the receiver.
	This results in recorded time traces of 16000 points per measurement, acquired at a sampling frequency of approximately 10 MHz.
	The measured time traces of all fruits were merged into one CSV file: 05_ultrasound.csv
	The time traces were recorded for 484 mangoes, numbered 1 to 490, where numbers 195, 242, 288, 396, 404 and 454 are missing.
	The CSV file format is as follows:
		column A: fruit ID [1-490]
		column B: measurement number [1-6]
		columns C-WQL: time trace consisting of 16000 points
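
Based on the description above, a time axis and the expected fruit IDs for the ultrasound file can be reconstructed as sketched below. The sampling frequency is only approximate, so the time axis is approximate as well. The expected row count assumes that all six measurements are present for every fruit, and starting the time axis at zero at the first sample is also an assumption.

```python
N_POINTS = 16000      # points per time trace
FS_HZ = 10e6          # approximate sampling frequency (about 10 MHz)
N_MEASUREMENTS = 6    # measurements per fruit

# Approximate time axis in microseconds (assumption: first sample at t = 0)
time_us = [i / FS_HZ * 1e6 for i in range(N_POINTS)]
trace_duration_ms = N_POINTS / FS_HZ * 1e3  # about 1.6 ms

# Fruit IDs expected in the ultrasound file
missing_ids = {195, 242, 288, 396, 404, 454}
expected_ids = [i for i in range(1, 491) if i not in missing_ids]

# Assumption: six rows per fruit, not counting any header row
expected_rows = len(expected_ids) * N_MEASUREMENTS
print(len(expected_ids), expected_rows)  # 484 2904
```

Comparing `expected_ids` with the fruit IDs actually found in column A is a simple way to confirm that a download is complete.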
		
============================================================================================================
# FileFormats 
============================================================================================================
- *.csv
- *.txt

============================================================================================================
# Software requirements 
============================================================================================================
- any software that can open or process csv files, e.g. MS Excel
- any software that can open or process txt files, e.g. the text editor Notepad

============================================================================================================
# License 
============================================================================================================
This work is licensed under CC BY-SA 4.0. To view a copy of this license, visit https://creativecommons.org/licenses/by-sa/4.0/

============================================================================================================
# Contributing 
============================================================================================================
Contact us at lydia.meesters@wur.nl for any issues you may have found with the collected data or for any possibilities to help expand upon the data / research.
If you are unable to make contact with lydia.meesters@wur.nl, you can always direct your question to wfbr.secretariat@wur.nl.

============================================================================================================
# Citation 
============================================================================================================
Mishra P, Offermans P, Mensink MGJ, Zedde HJ van de, Chauhan A and Meesters LMJ, 2025, "Dataset: Sensing Potential in the Food Supply Chain - Mango", DOI https://doi.org/10.4121/fb26fd3f-ba3c-4cf0-8926-14768a256933 

============================================================================================================
END readme
============================================================================================================

