NanoSNAP, Nanoscale Signal, Noise and Audio Processing library in C++11

NanoSNAP is a small and portable signal, audio and noise processing library in C++11. 🤞 NanoSNAP depends only on C++11 STL.

Usage

For running TTS (text-to-speech) and ASR (automatic speech recognition) on C++ embedded devices.
Image processing with neural network inference in C++ on embedded devices.
Implementing audio and speech features (e.g. rfft, mfcc, stft, istft) in your C++ machine learning library.

Install and integration

Simply copy the include and src folders into your project.

Requirements

CMake (for building examples and tests, and for building NanoSNAP as a submodule)
C++11 compiler

Supported platform

Setup(can be skipped if you don't build tests)

If you want to build tests (building tests is enabled by default), you need to check out the submodules.

$ git submodule update --init --recursive --depth 1

Build

$ mkdir build
$ cd build
$ cmake ..
$ make

Build on Visual Studio 2017

> vcsetup.bat

Open build/nanosnap.sln and build it.

Build and run tests

$ mkdir build
$ cd build
$ cmake -DNANOSNAP_ENABLE_TESTS=On ..
$ make
$ ./bin/test_nanosnap

Compiler macros

NANOSNAP_NO_STDIO : Disable file I/O (e.g. wav_read is not available). This is useful when you want to use NanoSNAP on Android or embedded devices.

API design

NanoSNAP takes a raw pointer for input array values, followed by its length information (or shape information).

bool proc(const float *input, int n);

The output array is usually a std::vector so that NanoSNAP can allocate the output buffer internally. The output array is a function argument when a function needs to return a status.

bool proc(int n, std::vector<float> *output);

Otherwise, the output array is the return value.

std::vector<float> proc(int n);
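
The three conventions above can be sketched as follows. The function names (`sum_values`, `normalize_sum`, `make_ramp`) are hypothetical and are not part of the NanoSNAP API; they only illustrate the calling style.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical functions for illustration only, NOT part of NanoSNAP.

// Input convention: a raw pointer followed by its length.
float sum_values(const float *input, int n) {
  assert(n >= 0);
  float s = 0.0f;
  for (int i = 0; i < n; i++) {
    s += input[i];
  }
  return s;
}

// Output as an argument: the return value reports success or failure,
// and the function resizes the output vector itself.
bool normalize_sum(const float *input, int n, std::vector<float> *output) {
  if (input == nullptr || output == nullptr || n <= 0) {
    return false;
  }
  const float s = sum_values(input, n);
  if (s == 0.0f) {
    return false;  // cannot normalize by zero
  }
  output->resize(static_cast<std::size_t>(n));
  for (int i = 0; i < n; i++) {
    (*output)[static_cast<std::size_t>(i)] = input[i] / s;
  }
  return true;
}

// Output as a return value: used when no status needs to be reported.
std::vector<float> make_ramp(int n) {
  std::vector<float> out(static_cast<std::size_t>(n > 0 ? n : 0));
  for (std::size_t i = 0; i < out.size(); i++) {
    out[i] = static_cast<float>(i);
  }
  return out;
}
```

The caller never has to pre-size an output buffer: it passes the address of an empty std::vector, or simply takes the returned vector.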

Internal state.

No API function keeps internal state.

Multithreading

The NanoSNAP API is re-entrant because it does not have any internal state, so it should be safe to use in a multi-threaded program as long as the input/output memory regions do not overlap between threads.
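
As a minimal sketch of this usage pattern, the code below runs a stateless function on two threads, each writing to a disjoint half of the output buffer. `scale` and `scale_in_two_threads` are hypothetical helpers, not NanoSNAP functions; the example assumes only the C++11 `std::thread`.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stateless function (NOT part of NanoSNAP): it touches only
// the memory passed in through its arguments.
void scale(const float *input, std::size_t n, float gain, float *output) {
  for (std::size_t i = 0; i < n; i++) {
    output[i] = input[i] * gain;
  }
}

// Each thread processes its own half. The output ranges
// [0, half) and [half, size) do not overlap, so no locking is needed.
std::vector<float> scale_in_two_threads(const std::vector<float> &input,
                                        float gain) {
  std::vector<float> output(input.size());
  const std::size_t half = input.size() / 2;

  std::thread t0(scale, input.data(), half, gain, output.data());
  std::thread t1(scale, input.data() + half, input.size() - half, gain,
                 output.data() + half);
  t0.join();
  t1.join();
  return output;
}
```

Reading the same input from several threads is fine; the requirement is that no thread writes to memory another thread reads or writes.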

CMake option for developers

-DSANITIZE_ADDRESS=On : Enable Address Sanitizer(for developer).

Data layout of array

NanoSNAP processes 2D and higher-dimensional (ND) array data as a flattened 1D array.

The ordering of array data follows the C language (the same behavior as a numpy array in C order). For example, img[H][W] has W pixels in width (columns) and H pixels in height (rows).

-> memory address increases

+-----------+-----------+     +-------------+-----------+     +---------------+
| img[0][0] | img[0][1] | ... | img[0][W-1] | img[1][0] | ... | img[H-1][W-1] |
+-----------+-----------+     +-------------+-----------+     +---------------+
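
The layout in the diagram corresponds to the usual row-major offset computation. The helpers below (`offset_2d`, `offset_3d`) are illustrative names, not NanoSNAP functions.

```cpp
#include <cassert>
#include <cstddef>

// Offset of img[y][x] in a flattened array of shape [H][W].
// Moving one step in x moves one element; one step in y moves W elements.
std::size_t offset_2d(std::size_t x, std::size_t y, std::size_t W) {
  assert(x < W);
  return y * W + x;
}

// Offset of vol[z][y][x] in a flattened array of shape [D][H][W].
std::size_t offset_3d(std::size_t x, std::size_t y, std::size_t z,
                      std::size_t W, std::size_t H) {
  assert(x < W && y < H);
  return (z * H + y) * W + x;
}
```

For example, with W = 4 the element img[2][3] lives at offset 2 * 4 + 3 = 11.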

In contrast to numpy or vision/ML libraries such as OpenCV, the dimensional arguments in a function signature start from the innermost dimension (the right-most array dimension). This notation is rather common in the C language and graphics communities, i.e.,

// `output` has the shape of [h][w]
void create_image(size_t w, size_t h, float *output);

// `output` has the shape of [d][h][w]
void create_3d_tensor(size_t w, size_t h, size_t d, float *output);

// `input` has the shape of [nrows][nframes].
void rfft(const float *input, size_t nframes, size_t nrows, ...);

Features

Numpy

NanoSNAP	Description	Python equivalent
`reshape_with_strides`	Create an array with the given shape and strides.	`numpy.lib.stride_tricks.as_strided`
`convolve`	1D convolution	`numpy.convolve`
`loadtxt`	Load 1D or 2D array	`numpy.loadtxt`
`savetxt`	Save 1D or 2D array	`numpy.savetxt`
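
For reference, the sketch below shows what a 1D convolution in 'full' mode (the default mode of `numpy.convolve`) computes. `convolve_full` is a hypothetical standalone function; NanoSNAP's actual `convolve` signature and supported modes are not shown in this document and may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative 'full' 1D convolution, NOT NanoSNAP's implementation.
// out[k] = sum over i of a[i] * v[k - i]; the output has n + m - 1 samples.
std::vector<float> convolve_full(const float *a, std::size_t n,
                                 const float *v, std::size_t m) {
  if (n == 0 || m == 0) {
    return std::vector<float>();
  }
  std::vector<float> out(n + m - 1, 0.0f);
  for (std::size_t i = 0; i < n; i++) {
    for (std::size_t j = 0; j < m; j++) {
      out[i + j] += a[i] * v[j];
    }
  }
  return out;
}
```

With a = {1, 2, 3} and v = {0, 1, 0.5}, the result is {0, 1, 2.5, 4, 1.5}, which matches the example in the numpy documentation.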

Random number generation

NanoSNAP	Description	Python equivalent
`random_uniform`	Uniform random number	`numpy.random.rand`
`random_shuffle`	Randomly shuffle array	`numpy.random.shuffle`

FFT

NanoSNAP	Description	Python equivalent
`rfft`	Real 1D FFT	`numpy.fft.rfft`
`ifft`	Inverse Complex FFT	`numpy.fft.ifft`
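
As a reminder of the output shape of a real FFT, the sketch below computes the same values as `numpy.fft.rfft` for a single row with a naive O(n^2) DFT. `naive_rfft` is a hypothetical reference function for checking results; it is not how NanoSNAP computes the transform (the library lists pocketfft among its third-party code).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive reference DFT of a real signal, NOT NanoSNAP's implementation.
// For n real samples only the first n/2 + 1 bins are returned, because the
// remaining bins are complex conjugates of those.
std::vector<std::complex<float> > naive_rfft(const float *x, std::size_t n) {
  std::vector<std::complex<float> > out;
  if (n == 0) {
    return out;
  }
  const double kPi = 3.14159265358979323846;
  out.resize(n / 2 + 1);
  for (std::size_t k = 0; k < out.size(); k++) {
    std::complex<double> acc(0.0, 0.0);
    for (std::size_t t = 0; t < n; t++) {
      const double phase = -2.0 * kPi * static_cast<double>(k) *
                           static_cast<double>(t) / static_cast<double>(n);
      acc += static_cast<double>(x[t]) *
             std::complex<double>(std::cos(phase), std::sin(phase));
    }
    out[k] = std::complex<float>(static_cast<float>(acc.real()),
                                 static_cast<float>(acc.imag()));
  }
  return out;
}
```

For the input {1, 2, 3, 4} this yields three bins: 10, -2 + 2i and -2.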

Scipy

NanoSNAP	Description	Python equivalent
`lfilter`	Filter data along one dimension with an IIR or FIR filter.	`scipy.signal.lfilter`
`medfilt`	Median filter	`scipy.signal.medfilt`
`wav_read`	Read .WAV file	`scipy.io.wavfile.read`
`wav_write`	Save .WAV file	`scipy.io.wavfile.write`
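
The difference equation that `scipy.signal.lfilter` evaluates can be sketched as below, assuming zero initial conditions. `lfilter_sketch` is a hypothetical name and argument list chosen for illustration; NanoSNAP's `lfilter` signature may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative direct-form filter, NOT NanoSNAP's implementation.
// a[0] * y[i] = sum_k b[k] * x[i - k]  -  sum_{k >= 1} a[k] * y[i - k]
// Samples before the start of the signal are treated as zero.
std::vector<float> lfilter_sketch(const float *b, std::size_t nb,
                                  const float *a, std::size_t na,
                                  const float *x, std::size_t n) {
  assert(na > 0 && a[0] != 0.0f);
  std::vector<float> y(n, 0.0f);
  for (std::size_t i = 0; i < n; i++) {
    double acc = 0.0;
    // Feed-forward (FIR) part.
    for (std::size_t k = 0; k < nb && k <= i; k++) {
      acc += static_cast<double>(b[k]) * x[i - k];
    }
    // Feedback (IIR) part; skipped entirely when na == 1.
    for (std::size_t k = 1; k < na && k <= i; k++) {
      acc -= static_cast<double>(a[k]) * y[i - k];
    }
    y[i] = static_cast<float>(acc / a[0]);
  }
  return y;
}
```

With b = {1}, a = {1, -0.5} and a unit impulse as input, the output is the decaying sequence 1, 0.5, 0.25, 0.125.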

Python speech features

NanoSNAP	Description	Python equivalent
`mel2hz`	Mel to Hz	`mel2hz`
`hz2mel`	Hz to Mel	`hz2mel`
`lifter`	Apply a cepstral lifter to the matrix of cepstra	`lifter`
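
The formulas behind these three functions, as defined in python_speech_features, can be sketched as follows. The `_sketch` names and the argument order are assumptions made for illustration (the argument order follows the innermost-dimension-first convention described under "Data layout of array"); NanoSNAP's actual signatures may differ.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Illustrative versions of the python_speech_features formulas,
// NOT NanoSNAP's implementation.

// Hz -> Mel: 2595 * log10(1 + hz / 700)
float hz2mel_sketch(float hz) {
  return 2595.0f * std::log10(1.0f + hz / 700.0f);
}

// Mel -> Hz: 700 * (10^(mel / 2595) - 1)
float mel2hz_sketch(float mel) {
  return 700.0f * (std::pow(10.0f, mel / 2595.0f) - 1.0f);
}

// Cepstral lifter applied in place to a flattened [nframes][ncoeff] matrix:
// each coefficient c is multiplied by 1 + (L / 2) * sin(pi * c / L).
// L <= 0 leaves the matrix unchanged, as in python_speech_features.
void lifter_sketch(float *cepstra, std::size_t ncoeff, std::size_t nframes,
                   int L = 22) {
  if (L <= 0) {
    return;
  }
  const double kPi = 3.14159265358979323846;
  for (std::size_t f = 0; f < nframes; f++) {
    for (std::size_t c = 0; c < ncoeff; c++) {
      const double lift =
          1.0 + (L / 2.0) * std::sin(kPi * static_cast<double>(c) / L);
      cepstra[f * ncoeff + c] =
          static_cast<float>(cepstra[f * ncoeff + c] * lift);
    }
  }
}
```

The two conversions are inverses of each other, so mel2hz_sketch(hz2mel_sketch(f)) returns f up to floating-point rounding.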

Work in progress

NanoSNAP	Description	Python equivalent
`mfcc`	Mel Frequency Cepstral Coefficients	`mfcc`
`fbank`	Filterbank Energies	`fbank`
`logfbank`	Log Filterbank Energies	`logfbank`
`ssc`	Spectral Subband Centroids	`ssc`

Librosa

NanoSNAP	Description	Python equivalent
`stft`	Short-time Fourier transform	`librosa.stft`
`istft`	Inverse STFT	`librosa.istft`
`mel`	Create a Filterbank matrix to combine FFT bins into Mel-frequency bins	`librosa.filters.mel`

Image

NanoSNAP	Description	Python equivalent
`resize_bilinear`	Resize image with bilinear interpolation	`cv2.resize`
`imread`	Load LDR image	`cv2.imread`
`imsave`	Save image in LDR format	`cv2.imwrite`

limited support

get_window : 'hann' only. Equivalent to scipy.signal.get_window.
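
For reference, a Hann window can be sketched as below. By default `scipy.signal.get_window` returns the periodic form (intended for FFT use), while the symmetric form is meant for filter design. `hann_sketch` and its `periodic` flag are hypothetical; which form NanoSNAP's `get_window` produces is not stated here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative Hann window, NOT NanoSNAP's implementation.
// periodic:  w[i] = 0.5 - 0.5 * cos(2 * pi * i / n)
// symmetric: w[i] = 0.5 - 0.5 * cos(2 * pi * i / (n - 1))
std::vector<float> hann_sketch(std::size_t n, bool periodic = true) {
  std::vector<float> w(n, 1.0f);
  if (n <= 1) {
    return w;  // a single-sample window is just {1}
  }
  const double kPi = 3.14159265358979323846;
  const double denom =
      periodic ? static_cast<double>(n) : static_cast<double>(n - 1);
  for (std::size_t i = 0; i < n; i++) {
    w[i] = static_cast<float>(
        0.5 - 0.5 * std::cos(2.0 * kPi * static_cast<double>(i) / denom));
  }
  return w;
}
```

A periodic window of length 4 is {0, 0.5, 1, 0.5}; a symmetric window of length 5 is {0, 0.5, 1, 0.5, 0}.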

TODO

Developer note

Adding tests

Write a test vector generator and put it in tests/gen/.
- Generate the test vector file (.inc).
Add a .cc file to tests and add it to CMakeLists.txt.

License

NanoSNAP is licensed under the MIT license.

Third party licenses.

dr_wav : Public domain or MIT-0. By David Reid.
python_speech_features : The MIT License (MIT). Copyright (c) 2013 James Lyons. https://github.com/jameslyons/python_speech_features
pocketfft : FFT library used in numpy. Copyright (C) 2004-2018 Max-Planck-Society. 3-clause BSD-style license. https://gitlab.mpcdf.mpg.de/mtr/pocketfft
c_speech_features : Copyright (c) 2017 Chris Lord. MIT license. https://github.com/Cwiiis/c_speech_features
STB image : Public domain. https://github.com/nothings/stb
sRGB transform : Copyright (c) 2017 Project Nayuki. (MIT License) https://www.nayuki.io/page/srgb-transform-library
fastBPE : Copyright (c) 2019 Guillaume Lample, Timothée Lacroix. MIT License. https://github.com/glample/fastBPE
