Pretrained Models

Transfer learning, sound classification, feature embeddings, pretrained audio deep learning networks

Audio Toolbox™ provides MATLAB® and Simulink® support for pretrained audio deep learning networks. Locate and classify sounds with YAMNet and estimate pitch with CREPE. Extract VGGish or OpenL3 feature embeddings to input to machine learning and deep learning systems. Use i-vector systems to produce compact representations of audio signals for applications such as speaker recognition, verification, identification, and diarization; speech emotion recognition; and acoustic machine fault detection.

This functionality requires Deep Learning Toolbox™. The Audio Toolbox pretrained networks are available in Deep Network Designer (Deep Learning Toolbox).
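
As a rough illustration of the workflow described above, the following minimal sketch calls three of the functions listed on this page on a synthetic signal. It assumes that Audio Toolbox, Deep Learning Toolbox, and the corresponding pretrained model files are already installed. The sample rate, tone frequency, and variable names are illustrative choices, not values taken from the documentation.

```matlab
% Minimal sketch. Assumes Audio Toolbox, Deep Learning Toolbox, and the
% pretrained YAMNet, CREPE, and VGGish model files are installed.
fs = 16e3;                        % sample rate in Hz (illustrative choice)
t = (0:2*fs-1)'/fs;               % two seconds of sample times, column vector
audioIn = 0.5*sin(2*pi*220*t);    % 220 Hz tone standing in for real audio

% Sound classification with the YAMNet-based classifier
sounds = classifySound(audioIn, fs);

% Pitch estimation with the CREPE-based estimator
f0 = pitchnn(audioIn, fs);

% VGGish feature embeddings, one row per analysis frame
embeddings = vggishEmbeddings(audioIn, fs);
```

In practice you would replace the synthetic tone with a recorded signal, for example one loaded with audioread. The embeddings can then be passed to a downstream classifier or other learning system.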

Functions

vggish VGGish neural network
vggishPreprocess Preprocess audio for VGGish feature extraction
vggishEmbeddings Extract VGGish feature embeddings
classifySound Classify sounds in audio signal
yamnet YAMNet neural network
yamnetGraph Graph of YAMNet AudioSet ontology
yamnetPreprocess Preprocess audio for YAMNet classification
openl3 OpenL3 neural network
openl3Preprocess Preprocess audio for OpenL3 feature extraction
openl3Embeddings Extract OpenL3 feature embeddings
crepe CREPE neural network
crepePreprocess Preprocess audio for CREPE deep learning network
crepePostprocess Postprocess output of CREPE deep learning network
pitchnn Estimate pitch with deep learning neural network
ivectorSystem Create i-vector system
speakerRecognition Pretrained speaker recognition system

Blocks

VGGish Embeddings Extract VGGish embeddings
VGGish Preprocess Preprocess audio for VGGish feature extraction
VGGish VGGish embeddings extraction network
Sound Classifier Classify sounds in audio signal
YAMNet YAMNet sound classification network
YAMNet Preprocess Preprocess audio for YAMNet classification

Apps

Deep Network Designer Design, visualize, and train deep learning networks