Self-adaptive Executors for Big Data Processing
Dataset posted on 06.09.2019 by S. (Sobhan) Omranian Khorasani
This dataset contains measurements obtained with Apache Spark under different strategies for adapting the number of executor threads in order to reduce I/O contention. Two main strategies are explored: a static solution, in which the number of executor threads for I/O-intensive tasks is predetermined, and a dynamic solution, which employs an active control loop that measures epoll_wait time.
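To make the distinction between the two strategies concrete, the following is a minimal Python sketch. It is not the code that produced these measurements. The function names, the thresholds (`high`, `low`), the halve-or-increment adjustment rule, and the use of a "fraction of time spent waiting" as the control signal are all assumptions chosen for illustration; the dataset description only states that the dynamic solution measures epoll_wait time in a control loop.

```python
from dataclasses import dataclass


@dataclass
class ThreadBounds:
    """Hypothetical limits on the executor thread count."""
    min_threads: int = 1
    max_threads: int = 32


def static_thread_count(io_intensive: bool, io_threads: int, default_threads: int) -> int:
    """Static strategy: I/O-intensive tasks get a predetermined thread count."""
    return io_threads if io_intensive else default_threads


def next_thread_count(current: int, wait_fraction: float, bounds: ThreadBounds,
                      high: float = 0.5, low: float = 0.2) -> int:
    """One step of an illustrative control loop.

    wait_fraction is the assumed share of the last interval spent blocked
    on I/O (for example, derived from measured epoll_wait time).
    The thresholds and the adjustment rule are assumptions, not the
    dataset author's policy.
    """
    if not 0.0 <= wait_fraction <= 1.0:
        raise ValueError("wait_fraction must be between 0 and 1")
    if wait_fraction > high:
        proposed = current // 2      # heavy contention: back off quickly
    elif wait_fraction < low:
        proposed = current + 1       # little contention: probe upward slowly
    else:
        proposed = current           # within the target band: hold steady
    # Clamp the proposal to the configured bounds.
    return max(bounds.min_threads, min(bounds.max_threads, proposed))
```

In this sketch the static strategy is a single lookup made before the task runs, while the dynamic strategy would be called repeatedly, once per measurement interval, feeding each result back in as `current`.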