This page provides all the code and data used in the experiments reported in Paper 910, submitted for consideration at ECML PKDD 2019. Below we describe how the experiments can be reproduced. We start by explaining the system requirements and then how to use the code. Finally, we show some auxiliary figures that were not included in the paper due to space constraints.
You will need Python 2.7 to run the experiments, along with the following packages: scikit-learn (Sklearn), SciPy, Pandas and Imbalanced-learn.
Tested on Python 2.7.9.
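As a quick sanity check before running anything, the requirements above can be verified from within Python. The following is only a minimal sketch and not part of the published code: the helper names `python_version_ok` and `missing_packages` are hypothetical, and the import names `sklearn` and `imblearn` are the names under which scikit-learn and Imbalanced-learn are normally imported.

```python
import importlib
import sys

# Import names of the required packages (scikit-learn is imported as
# "sklearn" and Imbalanced-learn as "imblearn").
REQUIRED = ["sklearn", "scipy", "pandas", "imblearn"]


def python_version_ok(version_info=sys.version_info):
    """Return True if the interpreter is a Python 2.7 release."""
    return tuple(version_info[:2]) == (2, 7)


def missing_packages(names=REQUIRED):
    """Return the subset of names that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing
```

Calling `missing_packages()` with no arguments returns the list of required packages that still need to be installed; an empty list means all four are importable.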
You can download all the code and data sets used in our experiments here. Once the download is complete, you can use the instructions below to reproduce all our experiments.
To download all of the necessary datasets:
python datasets.py
To initialize the databases:
python databases.py
To schedule the experiments associated with the [preliminary|final] analysis:
python experiments/schedule_final.py
To start a runner, pulling unfinished trials until there are none left (note that several runners can operate simultaneously):
python run.py
To export the results from a previously initialized database into a CSV file:
python databases.py
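The steps up to and including the runners can also be chained from a small driver script. The sketch below is an illustration under stated assumptions rather than part of the published framework: the function names `build_commands` and `run_pipeline` are hypothetical, the script is assumed to be started from the root of the downloaded code, and `sys.executable` is assumed to be the Python 2.7 interpreter. The export step is left out and should be run separately as described above.

```python
import subprocess
import sys

# Setup steps in the order given in the instructions; each one must
# finish before the next starts.
SETUP_STEPS = [
    ["datasets.py"],
    ["databases.py"],
    ["experiments/schedule_final.py"],
]


def build_commands(python=sys.executable, n_runners=2):
    """Return (setup commands, runner commands) as argument lists."""
    setup = [[python] + step for step in SETUP_STEPS]
    runners = [[python, "run.py"] for _ in range(n_runners)]
    return setup, runners


def run_pipeline(n_runners=2):
    """Run the setup steps one by one, then n_runners runners in parallel."""
    setup, runners = build_commands(n_runners=n_runners)
    for cmd in setup:
        # check_call raises CalledProcessError if a step fails.
        subprocess.check_call(cmd)
    procs = [subprocess.Popen(cmd) for cmd in runners]
    # Wait for every runner and collect the exit codes.
    return [proc.wait() for proc in procs]
```

Calling `run_pipeline(n_runners=4)` would start four runners side by side, relying on the note above that several runners can operate simultaneously. The number of runners is a free choice, typically bounded by the number of available CPU cores.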
This experimental framework is based on the framework published in:
Koziarski, Michal, Bartosz Krawczyk, and Michal Woźniak. “Radial-Based oversampling for noisy imbalanced data classification.” Neurocomputing (2019).
In this section we present further auxiliary figures that were not included in the paper due to space constraints. These figures illustrate the impact of changing one of the parameters of our CURE algorithm when the other parameter is fixed at a certain value.
Rankings variation of CURE method for IR50 for \(s=0.25\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR50 for \(s=0.45\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR50 for \(s=0.65\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR50 for \(s=0.85\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR50 for \(s=1.0\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR30 for \(s=0.25\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR30 for \(s=0.45\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR30 for \(s=0.65\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR30 for \(s=0.85\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR30 for \(s=1.0\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR10 for \(s=0.25\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR10 for \(s=0.45\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR10 for \(s=0.65\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR10 for \(s=0.85\) and \(0.25 \leq \alpha \leq 0.85\)
Rankings variation of CURE method for IR10 for \(s=1.0\) and \(0.25 \leq \alpha \leq 0.85\)