Utils

Contains methods to access LMDB database and convert the AN4 dataset into the appropriate format.

util.mk_lmdb(root_path, index_path, dict_path, out_dir, windowSize, stride)

Used in the prepare.sh script to create the AN4 training and test dataset for our neural net. Converts the given directory into LMDB databases that contain the data used in online training/testing.

index_path Path to the index file.

dict_path Path to the dictionary file.

out_dir Directory where the LMDBs are stored.

windowSize, stride Parameters chosen for the spectrogram.

Index file

The index file contains a list of file paths associated to a transcript like below:

<wave_file_path>@<transcript>@
an4/test/example.wav@EXAMPLE TRANSCRIPT@

The @ symbols are important to add for each entry. More information about extending to your own dataset can be seen here.