This project contains an implementation of all the necessary components as well as a full ViT model. In particular:
- Library headers (include/)
- A single-threaded serial implementation (src/)
- An OpenMP parallel implementation (omp_src/)
- Some utility scripts (scripts/)
- A pre-defined model to be used for weight initialization (model/)
- Some datasets (data_0/, data_1/, data_2/, data_3/)
- Files for individual component testing (test_src/)
- Scripts for easy benchmarking (params.sh, run_cpp.sh, run_omp.sh, elaborate.sh)
- A copy of the OpenMP source (acc_src/) that you can work on
To build the OpenACC version, load the NVIDIA HPC compiler module and use makefile.acc:
module load nvhpc/24.3
make -f makefile.acc acc_bin/vit.exe
Set up a Python virtual environment for the utility scripts:
python3 -m venv .venv
source .venv/bin/activate
pip install numpy
Make sure the scripts are executable:
chmod +x <your_script>
IMPORTANT: do not run these commands directly; they must be launched through SLURM (see below).
- Serial: ./run_cpp_nsys.sh --profile
- OpenMP: ./run_omp_nsys.sh --profile
- OpenACC: ./run_acc_nsys.sh --profile
To run properly, you must first request resources via SLURM. There are two ways to do so:
# "Inline"
srun -N 1 -p boost_usr_prod -A tra24_hckunitn --reservation=s_tra_hckunitn -t 00:05:00 --gres=gpu:1 <your_command_here>
# "Batch script"
sbatch slurm_script.slurm
In the second case, stdout and stderr are saved to files; see the script for details.
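The repository's slurm_script.slurm is the reference; as an illustration only, a minimal batch script requesting the same resources as the inline srun example might look like the sketch below (the job name, output/error file names, and final command are placeholders to adapt):
#!/bin/bash
#SBATCH --job-name=vit_profile            # illustrative job name
#SBATCH --nodes=1
#SBATCH --partition=boost_usr_prod
#SBATCH --account=tra24_hckunitn
#SBATCH --reservation=s_tra_hckunitn
#SBATCH --time=00:05:00
#SBATCH --gres=gpu:1
#SBATCH --output=job_%j.out               # stdout file (illustrative name)
#SBATCH --error=job_%j.err                # stderr file (illustrative name)

# Replace with the command you want to run, e.g. one of the profiling scripts:
./run_omp_nsys.sh --profile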
./down.sh <account> <sys> <path> [exclude]
or, for usage information:
./down.sh -h
C++ single-threaded serial implementation:
make bin/vit.exe
OpenMP parallel implementation:
make omp_bin/vit.exe
Clean the folder from all compiled file:
make clean
C++ single-threaded serial implementation:
./bin/vit.exe <model.cvit> <in_file.cpic> <out_file.cprd> <measures_file.csv>
OpenMP parallel implementation:
export OMP_NUM_THREADS=<num_threads>
./omp_bin/vit.exe <model.cvit> <in_file.cpic> <out_file.cprd> <measures_file.csv>
Example:
./bin/vit.exe model/vit.cvit data_0/pic_0.cpic out/prd_0.cprd measures/cpp.csv
export OMP_NUM_THREADS=16
./omp_bin/vit.exe model/vit.cvit data_0/pic_0.cpic omp_out/prd_0.cprd measures/omp_16.csv
You can use two scripts to easily automate this process (they both rely on params.sh):
# Edit params.sh
bash run_cpp.sh
bash run_omp.sh
When the models run, these scripts automatically create output labels as well as measurement files.
If you used bash run_cpp.sh, you will already find a cpp_summary.txt file in the measures folder; the same goes for bash run_omp.sh.
Otherwise you can manually compute it with:
python3 scripts/analyze_time_measures.py <measure_file.csv> <out_file.txt>
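For instance, with the measurement file produced by the earlier serial example (the summary file name here is just an illustrative choice):
python3 scripts/analyze_time_measures.py measures/cpp.csv measures/cpp_summary.txt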
Compare two prediction files:
python3 scripts/compare_cpred.py <prediction_a.cprd> <prediction_b.cprd> <out_file.txt> <high_threshold> <low_threshold>
These comparisons have a section for each batch, so for large datasets you may want to summarize them in a shorter file:
python3 scripts/summary_cpred_comparison.py <comparison_file.txt> <summarized_file.txt>
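For example, to compare the serial and OpenMP predictions produced earlier and then summarize the result (the output file names and the threshold values are illustrative):
python3 scripts/compare_cpred.py out/prd_0.cprd omp_out/prd_0.cprd out_comparison/cmp_0.txt 1e-3 1e-6
python3 scripts/summary_cpred_comparison.py out_comparison/cmp_0.txt out_comparison/cmp_0_summary.txt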
If you used bash run_cpp.sh and bash run_omp.sh, you can run another script to automatically generate the comparisons and summarize them in the out_comparison/ folder:
# It also relies on params.sh
bash elaborate.sh
Plot an input batch file:
python3 scripts/plot_cpic.py <batch_file.cpic>
Plot an output prediction file:
python3 scripts/plot_prediction.py <prediction_file.cprd>
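For example, using the files from the earlier runs (assuming the Python environment above also has the plotting dependencies installed):
python3 scripts/plot_cpic.py data_0/pic_0.cpic
python3 scripts/plot_prediction.py out/prd_0.cprd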
Create an input batch containing a random number of pictures, each composed of random pixels:
python3 scripts/random_cpic.py <out_file.cpic> <min_batch_dim> <max_batch_dim> <num_channels> <picture_height> <picture_width> <min_pixel_val> <max_pixel_val>
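As an illustration only (the output file, batch-size range, picture shape, and pixel range below are arbitrary; adapt them to what the model expects):
python3 scripts/random_cpic.py random_batch.cpic 4 8 3 224 224 0.0 1.0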
Generate the lines of code to insert a random value tensor in your test file:
python3 scripts/create_tensor.py <dim_0> <optional_dim_1> <optional_dim_2> <optional_dim_3>
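For example, to generate the code for a random 2x16x64 tensor (arbitrary dimensions; the trailing dimensions are optional):
python3 scripts/create_tensor.py 2 16 64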
To run an individual component test, first modify the proper file in test_src/test_<module_name>.cpp, where <module_name> can be: datatypes, modules, mlp, conv2d, attention, block, patch_embed, vision_transformer, or utils.
Then compile it:
make test_bin/test_<module_name>.exe
And run it:
./test_bin/test_<module_name>.exe
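For example, to build and run the attention component test:
make test_bin/test_attention.exe
./test_bin/test_attention.exe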