Let's say we have a new version of FauxPy ready for release. We need to verify two key aspects:
- Ensure the new version runs on real-world programs without crashing.
- Confirm that the results produced by the new version are consistent with those from the previous version. This helps us ensure no regressions occurred during development.
To accomplish this, we run the new version with a selection of real-world Python projects, checking for successful execution without crashes.
Next, we extract features from the results produced by the new version of FauxPy for each project and compare them to the same features extracted from the results produced by the previous version of FauxPy using the same projects.
We expect the features for both FauxPy versions to match. If any discrepancies arise, further investigation is necessary to determine why the new version of FauxPy is not consistent with the previous one.
In the following we introduce the materials and concepts for this testing activity.
For each version of
FauxPy, there is a dedicated directory
named vX_Y
, where
X
and Y
are the major
and minor versions of FauxPy
in that directory.
For example, FauxPy version 0.1
is located in the v0_1 directory.
We have defined three testing scenarios to test each version of FauxPy. Each scenario involves running a subset of the subject scripts created in bash_script_generator.
The three scenarios are as follows
(currently, only tiny
is configured):
tiny
: Should finish within 5 hours on a personal laptop.small
: Must include at least one subject from the projects listed in README.md and is expected to finish within 24 hours on a personal laptop.large
: Runs all subject scripts generated in bash_script_generator with no time limit.
Within each version directory vX_Y
,
and for each testing scenario SCENARIO
,
there is a corresponding directory named
scripts_SCENARIO
.
For example, for FauxPy 0.1,
the directory for the tiny
test scenario
is scripts_tiny
, located in
v0_1/scripts_tiny.
Additionally, in each version directory
vx_y
,
for each testing
scenario SCENARIO
,
there is a bash script named
run_SCENARIO_scripts.sh
(e.g.,
v0_1/run_tiny_scripts.sh)
that simply executes all the scripts
for that specific testing scenario.
After running the current and previous versions of FauxPy with all subject scripts for a given testing scenario, we extract features from their outputs, which are saved in the extracted_features directory.
The features are stored in files named
vX_Y_SCENARIO_fl_features.json
(e.g.,
v0_1_tiny_fl_features.json).
Here, X
and Y
denote the major
and minor versions of FauxPy,
while SCENARIO
refers to the specific
testing scenario from which the features are extracted.
By comparing these feature files, we check for inconsistencies between the current and previous versions of FauxPy.
Currently, we focus on two specific features:
- Generalized E_Inspect
- Output Length
These features are fault localization evaluation metrics outlined in the paper "An Empirical Study of Fault Localization in Python Programs" by Mohammad Rezaalipour and Carlo A. Furia.
In the future, we plan to add more features for more comprehensive testing.
The scripts for running FauxPy are designed to work with projects from BugsInPy, so you need to configure your machine accordingly.
BugsInPy requires three specific Python versions (3.6, 3.7, and 3.8) because different projects within this framework rely on these versions. It's recommended to use Python virtual environments for each of these Python versions.
This is because BugsInPy does not
recognize Python commands like
python3.6
or python3.7
.
Instead, it relies on the python3
command. The BugsInPy shell script
that compiles a project—creating a
virtual environment and installing all
necessary dependencies—uses
the
python3
command for environment
creation, as detailed
here.
To use BugsInPy correctly, you must either:
- Configure your OS's
python
command to point to the correct version for the project being compiled, or - Install three separate virtual environments for the required Python versions and activate the appropriate environment for each project.
We recommend the second approach, which is detailed below.
-
Install
dos2unix
, a requirement for BugsInPy:sudo apt-get update sudo apt-get install dos2unix
-
After installing Anaconda, create three conda environments for Python 3.6, 3.7, and 3.8, which are the versions required by BugsInPy projects. These conda environments are used to create the Python virtual environments in the next step. Once you have created the Python virtual environments, you can remove these conda environments if desired.
conda create --name fauxpy-3.6 python=3.6 conda create --name fauxpy-3.7 python=3.7 conda create --name fauxpy-3.8 python=3.8
-
Use each of the conda environments created in the previous step to set up Python virtual environments. These Python virtual environments are necessary to run the scripts generated in the
bash_script_generator
directory at the root of this repository. The scripts are designed to work with Python virtual environments located in your home directory, with the specific namesbugsinpyenv36
,bugsinpyenv37
, andbugsinpyenv38
. This naming convention is required by the bash scripts generated in thebash_script_generator
directory.cd ~ conda activate fauxpy-3.6 python3.6 -m venv bugsinpyenv36 conda deactivate conda activate fauxpy-3.7 python3.7 -m venv bugsinpyenv37 conda deactivate conda activate fauxpy-3.8 python3.8 -m venv bugsinpyenv38 conda deactivate
-
Create a backup of your
.bashrc
file in your home directory, naming it_bashrc
. Some of the programs in BugsInPy modify your.bashrc
file during compilation, so our scripts require a backup version of.bashrc
to fix these side effects.cp ~/.bashrc ~/_bashrc
Testing involves running
comparisons between two
different versions:
the previous version ($previous
)
and
the current version ($current
).
For example,
the previous version might
be v0_0
, while the current
version could be v0_1
.
In the steps below, we'll
demonstrate how to execute
the tiny
test scenario.
The process for running
the other two scenarios,
small
and large
,
is similar.
-
First, clone this repository and navigate to the
fauxpy-test/versions
directory.git clone git@github.com:mohrez86/fauxpy-test.git cd fauxpy-test/versions
-
Create a Python environment within the
version
directory and activate it:python3 -m venv env source env/bin/activate
-
Navigate to the previous version directory (e.g.,
v0_0
) and execute the script for the selected scenario. For instance, for thetiny
test scenario, runrun_tiny_scripts.sh
.cd $previuos ./run_tiny_scripts.sh
When the script finishes,
you can find the results
produced by FauxPy for
each project in directories
named after each project
(e.g., black
) within
the directory
containing the scripts
(e.g., scripts_tiny
).
-
While you are inside the
$previous
directory, runmain.py
using the following command:python main.py -e tiny
In the above script,
tiny
indicates that
we are executing the
tiny test scenario.
Running this script will
create a directory named
csv_output_tiny
within
$previous
, which will
contain the output of
each fault localization
session in .csv
files.
-
Next, navigate to the
$current
directory (e.g.,v0_1
) and repeat the same steps.cd ../$previuos ./run_tiny_scripts.sh python main.py -e tiny
-
Now, return to the
versions
directory and executemain.py
with the following command:cd .. python main.py -e tiny -p $previous -c $current
Running the above command
generates and saves feature
files for both FauxPy
versions in the
extracted_features
directory.
The script also checks the
features of these two
versions for inconsistencies.
The final output will
indicate whether the
current version's output
matches the previous
version's output.
For example,
a message like
Output of v0_1 is equivalent to the output of v0_0
means that the current
version is consistent
with the previous
one.