*** Oligomerization-dependent and synergistic regulation of Cdc42 GTPase cycling by a GEF and a GAP ***
Authors: S. Tschirpke, W. K-G. Daalman, F. van Opstal, R., and L. Laan
Delft University of Technology, Faculty of Applied Sciences, Department of Bionanoscience
Delft University of Technology, Kavli Institute of Nanoscience
Corresponding author: L. Laan
Contact Information:
l.laan@tudelft.nl
Delft University of Technology, Faculty of Applied Sciences, Department of Bionanoscience
Van der Maasweg 9, 2629 HZ Delft, The Netherlands

***General Introduction***
This dataset contains data collected for the publication 'Oligomerization-dependent and synergistic regulation of Cdc42 GTPase cycling by a GEF and a GAP' (doi: https://doi.org/10.1101/2023.06.26.546500).
The data in this data set was collected in the Laan lab of the Delft University of Technology - Faculty of Applied Sciences, Department of Bionanoscience, between 2023 and 2024.
This research project was made possible by funding from the European Research Council under the European Union’s Horizon 2020 research and innovation programme (grant agreement 758132), funding from the Netherlands Organization for Scientific Research (Nederlandse Organisatie voor Wetenschappelijk Onderzoek) through a Vidi grant (016.Vidi.171.060), and funding from the Kavli Synergy Post-doctoral Fellowship program of the Kavli Institute of Nanoscience Delft.

***Purpose***
(1) GTPase assay data: The purpose of these experiments was to quantify the GTPase activity of the S. cerevisiae GTPase Cdc42, alone and in combination with its GEF Cdc24 and GAP Rga2, and variants thereof (Cdc24-DH3, GAP domain). We used the 'GTPase-Glo assay' by Promega. The data consists of assay data in which different concentrations of the mentioned proteins are used in different combinations.
(2) SEC-MALS data: The purpose was to assess the oligomeric state of the S. cerevisiae proteins Cdc24, Rga2, and the GAP domain of Rga2.

***Equipment***
(1) All GTPase assays were measured on a Synergy HTX plate reader (BioTek) in luminescence mode.
(2) All SEC-MALS data was collected using a Superdex 200 Increase 10/300 GL column (Cytiva) connected to a high-performance liquid chromatography (HPLC) unit (1260 Infinity II, Agilent) running in series with an online UV detector (1260 Infinity II VWD, Agilent), an 8-angle static light scattering detector (DAWN HELEOS 8+; Wyatt Technology), and a refractometer (Optilab T-rEX; Wyatt Technology).

***Description of the data in this data set***
The data is organised in 2 zip-folders: 'GTPase Glo assay' and 'SEC MALS'. 

--- FOLDER: GTPase Glo assay ---
This folder contains xlsx files with GTPase Glo assay data, plus a summary file with the rates obtained from this data.

The file 'data_Cdc42.xlsx' contains GTPase Glo assay data collected for the GTPase Cdc42 and Cdc42-effector mixtures. The purpose of these experiments was to quantify the GTPase activity of the GTPase Cdc42 and the concentration-dependent effect of the GEF Cdc24 and GAP Rga2.
The data is organised by protein mixture, with each tab containing data of assays of similar mixtures. E.g. the tab 'dCdc42' contains assay data for Cdc42 alone, and the tab 'dBSA' contains assay data for Cdc42-BSA mixtures.
For 2 proteins, different subsets of the data are given:
Cdc24: All data is given in the tab 'dCdc24' and was used for the primary analysis. For the polyfit analysis described in S2 of this publication, only experiments with at least 5 Cdc24 concentrations were used. These are given in the tab 'dCdc24-min5datapoints', which is a subset of the data in 'dCdc24'.
Rga2: We used 2 Rga2 versions: HRga2 and HRga2F. All data is given in the tabs 'dHRga2' and 'dHRga2F'. Because the effect of these proteins saturates at higher concentrations, only a subset of the data can be used for determining the rates (data where the effect of Rga2 is in the 'linear regime'). These subsets are given in tabs with the suffix '-linregime' added to the tab name.
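The selection rule for 'dCdc24-min5datapoints' can be illustrated with a minimal sketch. It is not the code used to build the tab. It assumes that rows have already been read into dictionaries keyed by the column headers, that the concentration column is named 'Cdc24_conc' (following the ''protein-name'_conc' pattern), and that every distinct value, including 0, counts as a concentration. The run names and values are hypothetical.

```python
from collections import defaultdict


def runs_with_min_concentrations(rows, conc_key="Cdc24_conc", min_count=5):
    """Keep only rows from runs that cover at least `min_count` distinct concentrations."""
    concentrations = defaultdict(set)
    for row in rows:
        concentrations[row["Run"]].add(row[conc_key])
    keep = {run for run, concs in concentrations.items() if len(concs) >= min_count}
    return [row for row in rows if row["Run"] in keep]


# Hypothetical rows; real ones would come from the 'dCdc24' tab.
rows = (
    [{"Run": "E1", "Cdc24_conc": c} for c in (0, 0.05, 0.1, 0.2, 0.4)]
    + [{"Run": "E2", "Cdc24_conc": c} for c in (0, 0.1, 0.2)]
)
subset = runs_with_min_concentrations(rows)
print(sorted({row["Run"] for row in subset}))
```

Here only run 'E1' has five distinct concentrations, so the three rows of 'E2' are dropped.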

Each tab has data with the following headers:
Run: assay/experiment number
Time: incubation time
GTP_remaining: Amount of remaining GTP (normalised to the buffer, see below)
Error: error for the GTP_remaining values
Buffer_error: error of the buffer wells that were used for normalisation
'protein-name'_conc: Concentration of the protein used, in uM

The data shows the amount of remaining GTP for various Cdc42 constructs. The amount of remaining GTP correlates with the measured luminescence. Wells without protein (’buffer’) were used for the normalization and represent 0% GTP hydrolysis. Reactions were carried out with at least 3 replicates (wells) per assay, and the average and standard deviation of each set of replicate wells were used to calculate 'GTP_remaining', its error ('Error'), and 'Buffer_error'. A detailed description of how this data is obtained is given in REF1.
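As a rough illustration of the normalisation described above, the sketch below turns raw luminescence readings of replicate wells into the three derived columns. It is one plausible reading of the description, not the procedure from REF1, which remains authoritative. The assumptions are that values are expressed as a fraction of the buffer mean (multiply by 100 for percent), that both errors are sample standard deviations scaled by the buffer mean, and that the luminescence readings are hypothetical.

```python
from statistics import mean, stdev


def normalise_to_buffer(sample_wells, buffer_wells):
    """Return (GTP_remaining, Error, Buffer_error) as fractions of the buffer mean.

    Assumption: errors are sample standard deviations divided by the buffer mean.
    """
    buffer_mean = mean(buffer_wells)
    gtp_remaining = mean(sample_wells) / buffer_mean
    error = stdev(sample_wells) / buffer_mean
    buffer_error = stdev(buffer_wells) / buffer_mean
    return gtp_remaining, error, buffer_error


# Hypothetical luminescence readings (arbitrary units), 3 wells each.
sample_wells = [600.0, 620.0, 580.0]
buffer_wells = [1000.0, 1020.0, 980.0]
gtp_remaining, error, buffer_error = normalise_to_buffer(sample_wells, buffer_wells)
print(gtp_remaining, error, buffer_error)
```

With this convention the buffer wells themselves give a value of 1, corresponding to 0% GTP hydrolysis.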


The file 'data_Timeseries.xlsx' contains additional time series GTPase Glo assay data for the GTPase Cdc42 and the effectors Cdc24 and Rga2. The data is organised by protein mixture, with each tab containing data of assays of similar mixtures. Each tab has data with the same headers as described for 'data_Cdc42.xlsx'. Each tab contains data for 3 incubation times (in contrast to 'data_Cdc42.xlsx', where each Run only consists of data for 1 incubation time). The purpose of this 3-time point data is to verify that the GTP decline is exponential over time.
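An exponential decline, GTP(t) = GTP(0) * exp(-k t), means that ln(GTP_remaining) is linear in the incubation time. A simple way to check this for three time points is a least-squares line through (Time, ln GTP_remaining). The sketch below is only an illustration of that check and is not the analysis from REF1/REF2. The function name, the time values, and the decay constant are hypothetical.

```python
import math


def fit_log_linear(times, gtp_remaining):
    """Least-squares line through (t, ln GTP); returns (rate, intercept, r_squared)."""
    logs = [math.log(g) for g in gtp_remaining]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    s_ty = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = s_ty / s_tt
    intercept = y_mean - slope * t_mean
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, logs))
    ss_tot = sum((y - y_mean) ** 2 for y in logs)
    # The decay rate is the negative slope of ln(GTP) versus time.
    return -slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical, noise-free example with decay constant 0.4 per time unit.
times = [1.0, 1.5, 2.0]
gtp = [math.exp(-0.4 * t) for t in times]
rate, intercept, r_squared = fit_log_linear(times, gtp)
print(rate, r_squared)
```

An R2 close to 1 on the log scale is consistent with an exponential decline. The result does not depend on whether GTP_remaining is stored as a fraction or a percentage, because a constant scale factor only shifts the intercept.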


The file 'data_Ras_BSA_Casein.xlsx' contains control GTPase Glo assay data for the GTPase Ras and the proteins BSA and Casein. The data is organised by protein mixture, with each tab containing data of assays of similar mixtures. Each tab has data with the same headers as described for 'data_Cdc42.xlsx'.


These 'data_XYZ.xlsx' files can be used as input files for the analysis procedure described in REF1, with the MATLAB code available at REF2. This analysis/code produces rates as output.

The file 'Rate_summary.xlsx' gives an overview of all obtained rates from the analysis. The tab 'Cdc42 + effectors' gives rates obtained from Cdc42-Cdc24 effector mixtures. The tabs 'Ras' and 'BSA, Casein' give rates obtained from the control data from Ras, BSA and Casein. The tab 'Cdc42 + Cdc24 polyfit (S2)' contains rates obtained from the polyfit analysis described in S2 of this publication. 

Each tab has the same headers. A detailed description is given in REF1. In brief, the column ’Run’ shows the assay number, followed by two ’varied protein’ columns stating which protein(s) were varied in the assay. If the assay contains only one varied protein, the second column is empty. Then Parameter 1 is named, and its value, 95% upper and lower bounds, and standard error are shown. The same is repeated for possible parameters 2 and 3. The last columns show the adjusted R2 of the fit and a Boolean indicating whether the experiment is used for pooling (’Used for pooling’).
For each assay ('Run') the rate value estimates are shown. Below these, the row which contains ’pooled’ in the ’Run’ column states the pooled values of the rate estimates.


REF1: Tschirpke, S., Daalman, W. K., & Laan, L. (2024). Quantification of GTPase Cycling Rates of GTPases and GTPase:Effector Mixtures Using GTPase Glo Assays. Current Protocols, 4(4), 1–29, https://doi.org/10.1002/cpz1.1000.
REF2: Tschirpke, S., Daalman, W. K. G., & Laan, L. (2023a). Data accompanying the publication: Quantification of GTPase cycling rates of GTPases and GTPase : effector mixtures using GTPase Glo assays. 4TU.ResearchData (CC BY-SA 4.0), https://doi.org/10.4121/ac196f25-1c20-4c0c-a0b9-f01cd3fadc45.



--- FOLDER: SEC-MALS ---

The folder contains .txt files with SEC-MALS data. The files use the following naming convention: '[protein name(s)]_[YYYYMMDD].txt'.
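The naming convention can be split into its two parts programmatically. The following is a minimal sketch. The file name 'Rga2_20230115.txt' is a hypothetical example, not necessarily a file in this folder, and the pattern assumes that the date is always the last underscore-separated part before '.txt'.

```python
import re
from datetime import datetime

# '[protein name(s)]_[YYYYMMDD].txt'; the protein part may itself contain underscores.
FILENAME_PATTERN = re.compile(r"^(?P<proteins>.+)_(?P<date>\d{8})\.txt$")


def parse_sec_mals_filename(filename):
    """Return (protein name(s), measurement date) for a SEC-MALS file name."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    measured_on = datetime.strptime(match["date"], "%Y%m%d").date()
    return match["proteins"], measured_on


proteins, measured_on = parse_sec_mals_filename("Rga2_20230115.txt")
print(proteins, measured_on.isoformat())
```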
The data in those files is organized by the following headers:
volume (mL): the elution volume of the size exclusion column at the moment of the dRI measurement.
[protein name(s)][Sequence2] (dRI): dRI measurement values, which are indicative of the concentration of protein at a specific elution volume.
volume (mL): the elution volume of the size exclusion column at the moment of the Rayleigh ratio measurement.
[protein name(s)][Sequence2] (Rayleigh ratio): Rayleigh ratio measurement values, which are indicative of the amount of light scattered by the protein at a specific elution volume.
volume (mL): the elution volume of the size exclusion column used for the molar mass calculation.
[protein name(s)][Sequence2] (Molar mass): the calculated molar mass in Da for each elution volume, based on a combination of the dRI and Rayleigh ratio.

The signal columns are named '[protein name(s)][Sequence2] (dRI)', '[protein name(s)][Sequence2] (Rayleigh ratio)' and '[protein name(s)][Sequence2] (Molar mass)', e.g. 'Rga2[Sequence2] (Rayleigh ratio)'. “[Sequence2]” means that the full experiment was run in two increments. Likely the first increment was a BSA control in which no fractions were collected, while Sequence2 contained all runs with the proteins, with fraction collection.
“dRI” and “Rayleigh ratio” name the signal measured at each elution volume. “dRI” stands for differential refractive index: the difference in refractive index between the sample and the buffer, which is used to determine the concentration. The “Rayleigh ratio” or “reduced Rayleigh ratio” is indicative of the amount of light scattered by the protein/analyte relative to the laser intensity. The Rayleigh ratio in combination with the dRI measurement is used to determine the molecular mass. For a more detailed description, see page three, subheading “The solution: light scattering”, in https://www.wyatt.com/files/literature/white-papers/WP1615-SEC-MALS-for-absolute-biophysical-characterization.pdf.
There are three different volume columns because the different measurements are not all done on exactly the same parts of the machine. There is a separate volume column which matches the dRI signal and a separate one for the Rayleigh ratio. The molar mass is only calculated within the selected peaks, so its volume column only covers the volumes within those peaks.
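Because each signal has its own volume column, it is convenient to read the files as three independent (volume, value) series rather than as one table with a shared x-axis. The sketch below shows one way to do this. It is not taken from SEC-MALS_plotting.ipynb, and it assumes that the files are tab-delimited, have a single header row, and leave cells empty where a shorter series (e.g. molar mass) has no value. The sample values are hypothetical.

```python
import csv
import io


def read_sec_mals_pairs(text, delimiter="\t"):
    """Split a six-column export into one (volume, value) list per signal column."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    header = next(reader)
    # Columns come in (volume, signal) pairs, so signal names sit at odd indices.
    pair_starts = range(0, len(header) - 1, 2)
    series = {header[i + 1]: [] for i in pair_starts}
    for row in reader:
        for i in pair_starts:
            if i + 1 >= len(row):
                continue  # row is shorter than the header
            volume, value = row[i].strip(), row[i + 1].strip()
            if volume and value:  # skip empty cells of shorter series
                series[header[i + 1]].append((float(volume), float(value)))
    return series


# Hypothetical two-row excerpt; in practice `text` would be the content of a .txt file.
sample = (
    "volume (mL)\tRga2[Sequence2] (dRI)\t"
    "volume (mL)\tRga2[Sequence2] (Rayleigh ratio)\t"
    "volume (mL)\tRga2[Sequence2] (Molar mass)\n"
    "10.00\t1.0e-5\t9.80\t2.0e-6\t10.50\t1.2e5\n"
    "10.01\t1.1e-5\t9.81\t2.1e-6\t\t\n"
)
series = read_sec_mals_pairs(sample)
for name, points in series.items():
    print(name, len(points))
```

Keeping the series separate avoids plotting the molar mass against the wrong volume axis.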

The folder also contains a Jupyter notebook used for plotting (SEC-MALS_plotting.ipynb).

