# Property-Based Testing in Python using Hypothesis

This dataset contains data gathered to investigate the use of property-based testing (PBT) in Python with the Hypothesis framework. The findings are described in [1]. Below we describe the contents of each file and how the data was collected.

## Files
Here we describe what data can be found in each file.

### [Repositories](./repositories.csv)
This file contains, for each of the 7 Python repositories:
- its name,
- a link to the specific commit that was used,
- the total number of lines of code,
- the number of lines of Python code,
- the number of GitHub stars at the time of cloning,
- the total number of tests found,
- the number of property-based tests (PBTs) found,
- the PBT density (defined as the number of PBTs divided by the total number of tests), and
- the number of PBTs divided by the number of millions of lines of Python code.
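
The last two derived columns can be recomputed from the raw counts. The snippet below is a minimal sketch of that arithmetic; the function names and the example numbers are hypothetical and are not taken from the dataset.

```python
def pbt_density(num_pbts: int, num_tests: int) -> float:
    """Number of PBTs divided by the total number of tests."""
    if num_tests <= 0:
        raise ValueError("num_tests must be positive")
    return num_pbts / num_tests


def pbts_per_million_python_lines(num_pbts: int, python_loc: int) -> float:
    """Number of PBTs divided by the number of millions of lines of Python code."""
    if python_loc <= 0:
        raise ValueError("python_loc must be positive")
    return num_pbts / (python_loc / 1_000_000)


# Hypothetical repository: 20 PBTs, 400 tests, 250,000 lines of Python.
print(pbt_density(20, 400))                        # 0.05
print(pbts_per_million_python_lines(20, 250_000))  # 80.0
```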

### [Property-Based Tests](./property-based-tests.csv)
This file contains a digital summary of the manual analysis of 87 property-based tests. The meaning of each column is explained below.

| characteristic | explanation |
| -------------- | ----------- |
| `link` | URL to the first line of the PBT in the analyzed commit of the repository. |
| `repo` | Name of the repository where the PBT is located. |
| `name` | Name of the PBT. |
| `summary` | Description of what the PBT tests for. |
| `min_assertions` | Minimum number of assertions run if the test completes. |
| `max_assertions` | Maximum number of assertions run if the test completes. |
| `amt_assrts_dpndt_inp_size` | Whether the number of assertions depends on the size of the input. |
| `assertions_independent` | Whether all assertions are independent. |
| `can_decompose_assertions` | Whether any pair of assertions is independent. |
| `amt_sut_calls` | Maximum number of calls made to the system under test. |
| `is_local` | Whether the system under test is local (as opposed to a library). |
| `test_level` | Whether the PBT tests functionality, integration, or environment. |
| `is_nested` | Whether the assertions are in a different function than the test. |
| `f_or_nf` | Whether the tested property is functional, non-functional, or both. |
| `tests_state_modification` | Whether the tested property involves modification of state. |
| `category` | The category that the tested property falls under. |
| `has_assumptions` | Whether the test uses Hypothesis assumptions. |
| `generated_input_type` | A list of inputs that are generated by Hypothesis. |
| `is_input_filtered` | Whether any generated input is filtered (i.e. the domain of the input is restricted by some rule other than its type). |
| `is_input_sampled` | Whether any generated input is sampled from a list of possibilities. |
| `uses_custom_generator` | Whether the PBT uses a custom generator to create any of the inputs. A custom generator is considered to be a generator that is decorated with `@hypothesis.strategies.composite` or makes use of non-Hypothesis functions to process the generated values. |
| `uses_dynamic_generation` | Whether inputs are generated using Hypothesis from within the test, as opposed to from a decorator. |
| `uses_non_hypothesis_generation` | Whether some other module is used to generate random data. |
| `amt_example_inputs` | How many non-generated test cases are specified (if any). |
| `is_parameterized` | Whether some inputs to the test come from a parameterization decorator. |
| `generated_input_used_directly` | Whether the input is passed directly onto the SUT/environment (if not, it is processed first). |
| `uses_custom_shrinker` | Whether the PBT makes use of a custom shrinker. |
| `asserts_exceptions` | Whether the test asserts that a certain exception is thrown by the SUT. |
| `can_throw_exception` | Whether the test includes an exception-throwing clause. |
| `uses_assert_close` | Whether there is an assertion with specified tolerances (i.e. two values are asserted to be close, but not necessarily equal). |
| `uses_custom_assertion` | Whether the test uses an assertion function that does not originate in a library. |
| `uses_hypothesis_note` | Whether the PBT uses Hypothesis note to display a message when the test fails. |
| `test_precondition` | What condition must be met for the PBT to be run (if any). |
| `explicitly_functionally_system_dependent` | Whether the results of the test are explicitly shown to be affected by hardware. |
| `deterministic_generation` | Whether data are deterministically generated by Hypothesis. |
| `member_test` | Whether the PBT is a member of a class. |
| `generated_SUT` | Whether the system under test is generated by Hypothesis. |
| `xfail` | In which cases the test is expected to fail (if any). |
| `number_of_testcases` | Maximum number of test cases that may be generated by Hypothesis. |
| `deadline (ms)` | Maximum time in milliseconds that Hypothesis is allowed to run the test. |
| `skip_test` | In which cases the test is skipped (if any). |
| `has_test_helper` | Whether the test uses another function that was defined for testing purposes (but not a custom assertion or custom generator). |
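
To make some of these columns concrete, the sketch below shows two small property-based tests and annotates which columns each construct would correspond to. The tests and their names are hypothetical illustrations written for this README; they are not taken from the analyzed repositories, and the annotations reflect our reading of the column definitions above.

```python
from hypothesis import assume, example, given, note, settings
from hypothesis import strategies as st


@st.composite
def sorted_int_lists(draw):
    """Custom generator (uses_custom_generator): draws a list, then sorts it."""
    values = draw(st.lists(st.integers(), min_size=1))
    return sorted(values)


@settings(max_examples=50, deadline=200)  # number_of_testcases, deadline (ms)
@example(xs=[0])                          # amt_example_inputs: one explicit case
@given(xs=sorted_int_lists())
def test_first_element_is_minimum(xs):
    assume(len(xs) < 100)                 # has_assumptions
    note(f"input list: {xs}")             # uses_hypothesis_note
    assert xs[0] == min(xs)               # min_assertions = max_assertions = 1


@given(
    n=st.integers().filter(lambda n: n % 2 == 0),  # is_input_filtered
    data=st.data(),
)
def test_even_times_anything_is_even(n, data):
    # uses_dynamic_generation: the value is drawn inside the test body.
    k = data.draw(st.integers(min_value=0, max_value=10))
    assert (n * k) % 2 == 0
```

Calling a decorated test function without arguments, or collecting it with a test runner such as pytest, lets Hypothesis generate the inputs.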

### [Results: Categories](./results_categories.csv)
Each of the 87 analyzed property-based tests was put in one or more of 20 categories, which are explained below. Categories marked with a (*) are derived from [this article](https://fsharpforfunandprofit.com/posts/property-based-testing-2/).

| Name | Explanation |
| ---- | ----------- |
| differentPathsSameDestination* | Running operations in different orders yields the same result. |
| idempotence* | The result of an operation does not change if it is run multiple times. |
| roundtrip* | Applying an operation followed by its inverse to some data yields the original data. |
| testOracle* | A function defined for testing purposes provides the correct result for the system under test to give. |
| transformationInvariance* | Some characteristic of the data does not change when the data is transformed. |
| additivity | The sum of the results of an operation applied to two values is equal to the result of that operation applied to the sum of the values, i.e. op(a) + op(b) = op(a+b). |
| monotonicity | The results of the system under test do not change direction as its argument solely increases or solely decreases, i.e. the system under test is a non-decreasing or non-increasing function. |
| symmetry | Applying a function to some arguments has the same result as applying it to the symmetric counterpart of those arguments. |
| referenceConsistency | A reference implementation for the system under test provides the correct result for the system under test to give. The system under test is an optimization of the reference function and requires specialized hardware or some extra setup. |
| systemEquivalence | An alternative implementation of the system under test provides the correct result for the system under test to give. |
| constructionIntegrity | The construction of an object based on some data results in an object that is consistent with the provided data. |
| equivalenceExtensionality | Two objects are equal if and only if they have a specific characteristic in common, i.e. obj₁ = obj₂ iff obj₁.x = obj₂.x. |
| argumentIdempotence | Applying a function to a value has the same result as applying the function to that value as multiple arguments, i.e. f(x) = f(x, x) = f(x, x, x) = ... |
| outputJustification | The output of the system under test is consistent with the generated justification. |
| postCondition | The output of the system under test always has a certain characteristic. |
| exceptionGuarantee | The system under test always throws an exception in the provided circumstances (input or environment). |
| noException | The system under test never throws an exception in the provided circumstances (input or environment). |
| testByRunning | The system under test has its own direct means of passing or failing a test (e.g. through exceptions or assertions). |
| objectCached | The system under test caches equal objects, such that they have the same memory location. |
| recursionPerformance | The system under test provides a satisfactory output within a certain recursion depth. |
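
As an illustration of how a few of these categories can look in practice, here is a minimal sketch with Hypothesis. The systems under test (`json`, `sorted`, `len`) are standard-library stand-ins chosen for brevity; the tests are hypothetical and do not come from the dataset.

```python
import json

from hypothesis import given
from hypothesis import strategies as st

int_lists = st.lists(st.integers())


@given(int_lists)
def test_roundtrip_json(xs):
    # roundtrip: an operation followed by its inverse yields the original data.
    assert json.loads(json.dumps(xs)) == xs


@given(int_lists)
def test_idempotence_sorted(xs):
    # idempotence: sorting an already sorted list does not change it.
    once = sorted(xs)
    assert sorted(once) == once


@given(int_lists, int_lists)
def test_additivity_len(a, b):
    # additivity: op(a) + op(b) = op(a + b), with op = len and + = concatenation.
    assert len(a) + len(b) == len(a + b)
```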

### [Results: Inputs](./results_inputs.csv)
This file summarizes the `generated_input_type` column from [property-based-tests.csv](./property-based-tests.csv) by giving the frequencies (how many tests generated that input type) for each input type.
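
A frequency count of this kind could be computed roughly as follows. This is only a sketch: it assumes the input types in a cell are separated by commas, and the sample rows are made up. The actual encoding of the column in [property-based-tests.csv](./property-based-tests.csv) may differ, so check the file before reusing this.

```python
import csv
import io
from collections import Counter


def input_type_frequencies(csv_file, separator=","):
    """Count, per input type, how many tests generate that type at least once."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        cell = row["generated_input_type"]
        # A set ensures a type is counted once per test, even if listed twice.
        types = {part.strip() for part in cell.split(separator) if part.strip()}
        counts.update(types)
    return counts


# Hypothetical sample rows; the separator is an assumption.
SAMPLE = """\
name,generated_input_type
test_a,"integer, list"
test_b,integer
test_c,"text, integer, integer"
"""

frequencies = input_type_frequencies(io.StringIO(SAMPLE))
print(sorted(frequencies.items()))
```

To run it on the real file, pass an open file handle instead of the `io.StringIO` object, for example `open("property-based-tests.csv", newline="")`.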

### [Results: Key Features](./results_key-features.csv)
This file summarizes some columns from [property-based-tests.csv](./property-based-tests.csv) by giving their counts (how many tests had that feature).

### [Results: Filtering and Skipping](./results_filtering-skipping.csv)
This file summarizes columns from [property-based-tests.csv](./property-based-tests.csv) related to restriction of input generation and test execution flow.

## How the data was collected
The property-based tests were collected from open-source GitHub repositories, in the following way:
1. GitHub repositories that list Hypothesis as a dependency were collected.
2. The repositories found were sorted by number of GitHub stars, from highest to lowest.
3. The top repositories that had at most 25 property-based tests were selected for analysis.

More detailed information about how the repositories and property-based tests were gathered can be found at https://github.com/DaKoning/hypothesis-dataset.
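
The ranking and selection steps can be sketched as below. This is an illustration under stated assumptions, not the script that was actually used: the `Repo` record, the `limit` parameter, and the candidate data are hypothetical, and we read step 3 as walking down the star ranking and keeping repositories with at most 25 property-based tests.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Repo:
    name: str
    stars: int
    num_pbts: int


def select_repositories(
    repos: List[Repo], max_pbts: int = 25, limit: Optional[int] = None
) -> List[Repo]:
    """Rank by stars (descending) and keep repositories with at most max_pbts PBTs."""
    ranked = sorted(repos, key=lambda repo: repo.stars, reverse=True)
    eligible = [repo for repo in ranked if repo.num_pbts <= max_pbts]
    return eligible if limit is None else eligible[:limit]


# Hypothetical candidates, not the repositories in this dataset.
candidates = [
    Repo("alpha", stars=900, num_pbts=40),
    Repo("beta", stars=1500, num_pbts=12),
    Repo("gamma", stars=300, num_pbts=25),
    Repo("delta", stars=1200, num_pbts=3),
]
selected = select_repositories(candidates, limit=2)
print([repo.name for repo in selected])  # ['beta', 'delta']
```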

After collecting 87 property-based tests from 7 repositories, we printed them all out on paper and analyzed them using an open coding technique [2]. We summarized emerging themes and concepts in a spreadsheet ([property-based-tests.csv](./property-based-tests.csv)).



[1]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;D. de Koning, “Property-Based Testing in Practice using Hypothesis: In-depth study on how developers use Property-Based Testing in Python using Hypothesis,” Bachelor Thesis, Delft University of Technology, 2025. Available: https://resolver.tudelft.nl/uuid:aa9cc98d-032f-4544-9447-d6e24bb8ebd2.

[2]&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;R. Hoda, *Qualitative Research with Socio-Technical Grounded Theory: A Practical Guide to Qualitative Data Analysis and Theory Development in the Digital World*. Springer, 2024, ch. 10.