FormatBench

FormatBench is a Python benchmark that evaluates data formats for storing tabular and image data.

See also: ASV FormatBench

Usage

python3 main.py (--tabular|--compression|--image) [--webface <path>] [--report]

--tabular - run tabular benchmark suite
--compression - run compression benchmark suite
--image - run image benchmark suite
--webface - run benchmarks with the Webface10M dataset; <path> is a path to the Webface10M dataset
--report - generate a report from the benchmark results
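
The option list above maps naturally onto Python's standard argparse module. The following is a minimal sketch of how such a command line could be parsed, not the actual contents of main.py. It assumes that exactly one of the three suite flags must be given, as the usage line suggests; the function name build_parser and the help strings are illustrative.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (illustrative sketch)."""
    parser = argparse.ArgumentParser(
        prog="main.py",
        description="Benchmark of data formats",
    )

    # Assumption: exactly one suite flag is required per run.
    suite = parser.add_mutually_exclusive_group(required=True)
    suite.add_argument("--tabular", action="store_true",
                       help="run tabular benchmark suite")
    suite.add_argument("--compression", action="store_true",
                       help="run compression benchmark suite")
    suite.add_argument("--image", action="store_true",
                       help="run image benchmark suite")

    # Optional path to the Webface10M dataset.
    parser.add_argument("--webface", metavar="<path>",
                        help="run benchmarks with the Webface10M dataset")
    parser.add_argument("--report", action="store_true",
                        help="generate a report from the benchmark results")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--image", "--report"])
print(args.image, args.report, args.webface)
```

With a mutually exclusive group, argparse itself rejects combinations such as --tabular --image and prints a usage error, so no hand-written validation is needed.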

Examples of usage

Run tabular benchmarks: python3 main.py --tabular

Run image benchmarks and create a report: python3 main.py --image --report

Run compression benchmarks with the Webface10M dataset: python3 main.py --compression --webface ~/synthetic_webface10M.h5

Run tabular benchmarks with the Webface10M dataset and create a report: python3 main.py --tabular --webface ~/synthetic_webface10M.h5 --report
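
To run several suites in a row, the commands above can also be assembled from a small script. This is only a sketch built on the documented flags: the helpers build_command and run_all are hypothetical and not part of FormatBench, and the script assumes it is started from the repository root, where main.py lives.

```python
import subprocess

# Suite names taken from the documented flags.
SUITES = ("tabular", "compression", "image")


def build_command(suite, webface=None, report=False):
    """Return the argument list for one benchmark run (hypothetical helper)."""
    if suite not in SUITES:
        raise ValueError(f"unknown suite: {suite!r}")
    cmd = ["python3", "main.py", f"--{suite}"]
    if webface is not None:
        # Note: "~" is not expanded when arguments are passed as a list,
        # so pass a path already expanded with os.path.expanduser().
        cmd += ["--webface", webface]
    if report:
        cmd.append("--report")
    return cmd


def run_all(webface=None, report=False):
    """Run every suite in turn, stopping at the first failure."""
    for suite in SUITES:
        subprocess.run(build_command(suite, webface, report), check=True)


print(build_command("image", report=True))
```

Calling run_all(report=True) would then execute the three suites one after another, each with report generation enabled.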

Related publication

TARAGEĽ, Marián. Column-oriented and Image Data Format Benchmarks. Brno, 2024. Bachelor’s thesis. Brno University of Technology, Faculty of Information Technology. Supervisor Ing. Jakub Špaňhel.

Acknowledgements

I would like to convey my gratitude to Ing. Jakub Špaňhel for his supervision. I also express my thanks to my consultant Ing. Petr Chmelař. Both of them provided me with support and advice during the work on this thesis. Last but not least, I would like to thank the external submitter, the Innovatrics company, for their professional help.
