audioFeatureExtractor

Streamline audio feature extraction

Since R2019b

Description

`audioFeatureExtractor` encapsulates multiple audio feature extractors into a streamlined and modular implementation.

Creation

Syntax

aFE = audioFeatureExtractor()

aFE = audioFeatureExtractor(Name=Value)

Description

`aFE = audioFeatureExtractor()` creates an audio feature extractor with default property values.

`aFE = audioFeatureExtractor(Name=Value)` specifies nondefault properties for `aFE` using one or more name-value arguments.

Properties

Main Properties

`Window`—Analysis window
`hamming(1024,"periodic")` (default) | real vector

Analysis window, specified as a real vector.

Data Types: `single` | `double`

`OverlapLength`—Overlap length of adjacent analysis windows
`512` (default) | integer in the range [0, `numel(Window)`)

Overlap length of adjacent analysis windows, specified as an integer in the range [0, `numel(Window)`).

Data Types: `single` | `double`

`FFTLength`—FFT length
`[]` (default) | positive integer

FFT length, specified as a positive integer. The default value of `[]` means that the FFT length is equal to the window length, `numel(Window)`.

Data Types: `single` | `double`

`SampleRate`—Input sample rate (Hz)
`44100` (default) | positive scalar

Input sample rate in Hz, specified as a positive scalar.

Data Types: `single` | `double`

`SpectralDescriptorInput`—Input to spectral descriptors
`"linearSpectrum"` (default) | `"melSpectrum"` | `"barkSpectrum"` | `"erbSpectrum"`

Input to spectral descriptors, specified as `"linearSpectrum"`, `"melSpectrum"`, `"barkSpectrum"`, or `"erbSpectrum"`.

Spectral descriptors affected by this property are `spectralCentroid`, `spectralCrest`, `spectralDecrease`, `spectralEntropy`, `spectralFlatness`, `spectralFlux`, `spectralKurtosis`, `spectralRolloffPoint`, `spectralSkewness`, `spectralSlope`, and `spectralSpread`.

The spectrum input to the spectral descriptors is the same as the output from the corresponding feature: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

For example, if you set `SpectralDescriptorInput` to `"barkSpectrum"` and `spectralCentroid` to `true`, then `aFE` returns the centroid of the default Bark spectrum.

[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");
aFE = audioFeatureExtractor(SampleRate=fs, ...
    SpectralDescriptorInput="barkSpectrum", ...
    spectralCentroid=true);
barkSpectralCentroid = extract(aFE,audioIn);

If you specify a nondefault `barkSpectrum` using `setExtractorParameters`, then the nondefault Bark spectrum is the input to the spectral descriptors. For example, if you call `setExtractorParameters(aFE,"barkSpectrum",NumBands=40)`, then `aFE` returns the centroid of a 40-band Bark spectrum.

setExtractorParameters(aFE,"barkSpectrum",NumBands=40)
bark40SpectralCentroid = extract(aFE,audioIn);

Data Types: `char` | `string`

`FeatureVectorLength`—Number of features output from extract
positive integer

This property is read-only.

Total number of features output from `extract` for the current object configuration, specified as a positive integer. `FeatureVectorLength` is equal to the second dimension of the output from the `extract` function.

Data Types: `single` | `double`
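As a minimal sketch of how `FeatureVectorLength` relates to the output of `extract`, the following enables `mfcc` and `mfccDelta` and compares the property with the number of columns returned. The noise input `x` is a placeholder signal chosen only for illustration, and the expected count of 26 assumes the default of 13 coefficients per feature.

```matlab
% Enable MFCC and delta MFCC; with default parameters each contributes 13 columns.
aFE = audioFeatureExtractor(mfcc=true,mfccDelta=true);
numFeatures = aFE.FeatureVectorLength;

% Placeholder input: one second of noise at the default 44.1 kHz sample rate.
x = randn(44100,1);
features = extract(aFE,x);

% The second dimension of the extracted matrix should match the property.
isConsistent = (size(features,2) == numFeatures);
```

Because the property is read-only, it changes only when you enable or disable features or change extractor parameters.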

Features to Extract

`linearSpectrum`—Extract linear spectrum
`false` (default) | `true`

Extract the one-sided linear spectrum, specified as `true` or `false`.

To set parameters of the linear spectrum extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"linearSpectrum",Name=Value)

Settable parameters for the linear spectrum extraction are:

`FrequencyRange`: Frequency range of the extracted spectrum in Hz, specified as a two-element vector of increasing numbers in the range [0, `SampleRate`/2]. If unspecified, `FrequencyRange` defaults to `[0,SampleRate/2]`.
`SpectrumType`: Spectrum type, specified as `"power"` or `"magnitude"`. If unspecified, `SpectrumType` defaults to `"power"`.
`WindowNormalization`: Apply window normalization, specified as `true` or `false`. If unspecified, `WindowNormalization` defaults to `true`.

Data Types: `logical`
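The parameters listed above can be combined in a single `setExtractorParameters` call. The following is a sketch under assumed values: the 16 kHz sample rate, the 100 Hz to 4 kHz band, and the noise input are illustrative choices, not values taken from this page.

```matlab
fs = 16e3;                      % assumed sample rate for illustration
aFE = audioFeatureExtractor(SampleRate=fs,linearSpectrum=true);

% Restrict the spectrum to an assumed band and request magnitude instead of power.
setExtractorParameters(aFE,"linearSpectrum", ...
    FrequencyRange=[100 4000],SpectrumType="magnitude");

x = randn(fs,1);                % placeholder input signal
spec = extract(aFE,x);          % one row per analysis window, one column per bin
```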

`melSpectrum`—Extract mel spectrum
`false` (default) | `true`

Extract the one-sided mel spectrum, specified as `true` or `false`.

To set parameters of the mel spectrum extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"melSpectrum",Name=Value)

Settable parameters for the mel spectrum extraction are:

`FrequencyRange`: Frequency range of the extracted spectrum in Hz, specified as a two-element vector of increasing numbers in the range [0, `SampleRate`/2]. If unspecified, `FrequencyRange` defaults to `[0,SampleRate/2]`.
`SpectrumType`: Spectrum type, specified as `"power"` or `"magnitude"`. If unspecified, `SpectrumType` defaults to `"power"`.
`NumBands`: Number of mel bands, specified as an integer. If unspecified, `NumBands` defaults to `32`.
`FilterBankNormalization`: Normalization applied to bandpass filters, specified as `"bandwidth"`, `"area"`, or `"none"`. If unspecified, `FilterBankNormalization` defaults to `"bandwidth"`.
`WindowNormalization`: Apply window normalization, specified as `true` or `false`. If unspecified, `WindowNormalization` defaults to `true`.
`FilterBankDesignDomain`: Domain in which the filter bank is designed, specified as either `"linear"` or `"warped"`. If unspecified, `FilterBankDesignDomain` defaults to `"linear"`.

Data Types: `logical`
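A short sketch of changing the mel filter bank, assuming a 40-band configuration and a noise input chosen only for illustration. The number of columns returned by `extract` is expected to follow `NumBands`.

```matlab
aFE = audioFeatureExtractor(melSpectrum=true);

% Assumed values: 40 bands and no filter bank normalization.
setExtractorParameters(aFE,"melSpectrum", ...
    NumBands=40,FilterBankNormalization="none");

x = randn(44100,1);             % placeholder input at the default sample rate
melSpec = extract(aFE,x);
numBandsOut = size(melSpec,2);  % one column per mel band
```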

`barkSpectrum`—Extract Bark spectrum
`false` (default) | `true`

Extract the one-sided Bark spectrum, specified as `true` or `false`.

To set parameters of the Bark spectrum extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"barkSpectrum",Name=Value)

Settable parameters for the Bark spectrum extraction are:

`FrequencyRange`: Frequency range of the extracted spectrum in Hz, specified as a two-element vector of increasing numbers in the range [0, `SampleRate`/2]. If unspecified, `FrequencyRange` defaults to `[0,SampleRate/2]`.
`SpectrumType`: Spectrum type, specified as `"power"` or `"magnitude"`. If unspecified, `SpectrumType` defaults to `"power"`.
`NumBands`: Number of Bark bands, specified as an integer. If unspecified, `NumBands` defaults to `32`.
`FilterBankNormalization`: Normalization applied to bandpass filters, specified as `"bandwidth"`, `"area"`, or `"none"`. If unspecified, `FilterBankNormalization` defaults to `"bandwidth"`.
`WindowNormalization`: Apply window normalization, specified as `true` or `false`. If unspecified, `WindowNormalization` defaults to `true`.
`FilterBankDesignDomain`: Domain in which the filter bank is designed, specified as either `"linear"` or `"warped"`. If unspecified, `FilterBankDesignDomain` defaults to `"linear"`.

Data Types: `logical`

`erbSpectrum`—Extract ERB spectrum
`false` (default) | `true`

Extract the one-sided ERB spectrum, specified as `true` or `false`.

To set parameters of the ERB spectrum extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"erbSpectrum",Name=Value)

Settable parameters for the ERB spectrum extraction are:

`FrequencyRange`: Frequency range of the extracted spectrum in Hz, specified as a two-element vector of increasing numbers in the range [0, `SampleRate`/2]. If unspecified, `FrequencyRange` defaults to `[0,SampleRate/2]`.
`SpectrumType`: Spectrum type, specified as `"power"` or `"magnitude"`. If unspecified, `SpectrumType` defaults to `"power"`.
`NumBands`: Number of ERB bands, specified as an integer. If unspecified, `NumBands` defaults to `ceil(hz2erb(FrequencyRange(2))-hz2erb(FrequencyRange(1)))`.
`FilterBankNormalization`: Normalization applied to bandpass filters, specified as `"bandwidth"`, `"area"`, or `"none"`. If unspecified, `FilterBankNormalization` defaults to `"bandwidth"`.
`WindowNormalization`: Apply window normalization, specified as `true` or `false`. If unspecified, `WindowNormalization` defaults to `true`.

Data Types: `logical`

`mfcc`—Extract mel-frequency cepstral coefficients (MFCC)
`false` (default) | `true`

Extract mel-frequency cepstral coefficients (MFCC), specified as `true` or `false`.

To set parameters of the MFCC extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"mfcc",Name=Value)

Settable parameters for the MFCC extraction are:

`NumCoeffs`: Number of coefficients returned for each window, specified as a positive integer. If unspecified, `NumCoeffs` defaults to `13`.
`DeltaWindowLength`: Delta window length, specified as an odd integer greater than 2. If unspecified, `DeltaWindowLength` defaults to `9`. This parameter affects the `mfccDelta` and `mfccDeltaDelta` features.
`Rectification`: Type of nonlinear rectification, specified as `"log"` or `"cubic-root"`.

The mel-frequency cepstral coefficients are calculated using the `melSpectrum`.

Data Types: `logical`
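As an illustration of the MFCC parameters above, this sketch requests 20 coefficients with cubic-root rectification. Both values and the noise input are assumptions made for the example.

```matlab
aFE = audioFeatureExtractor(mfcc=true);

% Assumed settings: 20 coefficients per window, cubic-root rectification.
setExtractorParameters(aFE,"mfcc",NumCoeffs=20,Rectification="cubic-root");

x = randn(44100,1);             % placeholder input at the default sample rate
coeffs = extract(aFE,x);
numCoeffsOut = size(coeffs,2);  % one column per coefficient
```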

`mfccDelta`—Extract delta of MFCC
`false` (default) | `true`

Extract delta of MFCC, specified as `true` or `false`.

The delta MFCC is calculated based on the extracted MFCC. Parameters set on `mfcc` affect `mfccDelta`.

Data Types: `logical`

`mfccDeltaDelta`—Extract delta-delta of MFCC
`false` (default) | `true`

Extract delta-delta of MFCC, specified as `true` or `false`.

The delta-delta MFCC is calculated based on the extracted MFCC. Parameters set on `mfcc` affect `mfccDeltaDelta`.

Data Types: `logical`

`gtcc`—Extract gammatone cepstral coefficients (GTCC)
`false` (default) | `true`

Extract gammatone cepstral coefficients (GTCC), specified as `true` or `false`.

To set parameters of the GTCC extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"gtcc",Name=Value)

Settable parameters for the GTCC extraction are:

`NumCoeffs`: Number of coefficients returned for each window, specified as a positive integer. If unspecified, `NumCoeffs` defaults to `13`.
`DeltaWindowLength`: Delta window length, specified as an odd integer greater than 2. If unspecified, `DeltaWindowLength` defaults to `9`. This parameter affects the `gtccDelta` and `gtccDeltaDelta` features.
`Rectification`: Type of nonlinear rectification, specified as `"log"` or `"cubic-root"`.

The gammatone cepstral coefficients are calculated using the `erbSpectrum`.

Data Types: `logical`
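Because `DeltaWindowLength` is set on `gtcc` but affects the delta features, a sketch with both features enabled may help. The values 10 and 5 and the noise input are illustrative assumptions.

```matlab
aFE = audioFeatureExtractor(gtcc=true,gtccDelta=true);

% Assumed settings: 10 coefficients and a 5-sample delta window (odd, greater than 2).
setExtractorParameters(aFE,"gtcc",NumCoeffs=10,DeltaWindowLength=5);

x = randn(44100,1);             % placeholder input at the default sample rate
features = extract(aFE,x);

% info returns the column indices of each enabled feature.
idx = info(aFE);
numGtcc = numel(idx.gtcc);
numGtccDelta = numel(idx.gtccDelta);
```

The delta feature has as many columns as the coefficients it is derived from, so changing `NumCoeffs` changes both.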

`gtccDelta`—Extract delta of GTCC
`false` (default) | `true`

Extract delta of GTCC, specified as `true` or `false`.

The delta GTCC is calculated based on the extracted GTCC. Parameters set on `gtcc` affect `gtccDelta`.

Data Types: `logical`

`gtccDeltaDelta`—Extract delta-delta of GTCC
`false` (default) | `true`

Extract delta-delta of GTCC, specified as `true` or `false`.

The delta-delta GTCC is calculated based on the extracted GTCC. Parameters set on `gtcc` affect `gtccDeltaDelta`.

Data Types: `logical`

`spectralCentroid`—Extract spectral centroid
`false` (default) | `true`

Extract spectral centroid, specified as `true` or `false`.

The spectral centroid is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`spectralCrest`—Extract spectral crest
`false` (default) | `true`

Extract spectral crest, specified as `true` or `false`.

The spectral crest is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`spectralDecrease`—Extract spectral decrease
`false` (default) | `true`

Extract spectral decrease, specified as `true` or `false`.

The spectral decrease is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`spectralEntropy`—Extract spectral entropy
`false` (default) | `true`

Extract spectral entropy, specified as `true` or `false`.

The spectral entropy is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`spectralFlatness`—Extract spectral flatness
`false` (default) | `true`

Extract spectral flatness, specified as `true` or `false`.

The spectral flatness is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`spectralFlux`—Extract spectral flux
`false` (default) | `true`

Extract spectral flux, specified as `true` or `false`.

The spectral flux is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

To set parameters of the spectral flux extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"spectralFlux",Name=Value)

Settable parameters for the spectral flux extraction are:

`NormType`: Norm type used to calculate the spectral flux, specified as `1` or `2`. If unspecified, `NormType` defaults to `2`.

Data Types: `logical`
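The following sketch extracts the spectral flux twice from the same signal, first with the default `NormType` of 2 and then with `NormType` set to 1. The 16 kHz sample rate and noise input are assumptions for illustration.

```matlab
fs = 16e3;                      % assumed sample rate for illustration
x = randn(fs,1);                % placeholder input signal

aFE = audioFeatureExtractor(SampleRate=fs,spectralFlux=true);
flux2 = extract(aFE,x);         % default NormType of 2

% Switch to the 1-norm and extract again from the same signal.
setExtractorParameters(aFE,"spectralFlux",NormType=1);
flux1 = extract(aFE,x);
```

Both outputs have one value per analysis window; only the norm used to compare successive spectra differs.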

`spectralKurtosis`—Extract spectral kurtosis
`false` (default) | `true`

Extract spectral kurtosis, specified as `true` or `false`.

The spectral kurtosis is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`spectralRolloffPoint`—Extract spectral rolloff point
`false` (default) | `true`

Extract spectral rolloff point, specified as `true` or `false`.

The spectral rolloff point is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

To set parameters of the spectral rolloff point extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"spectralRolloffPoint",Name=Value)

Settable parameters for the spectral rolloff point extraction are:

`Threshold`: Threshold of the rolloff point, specified as a scalar in the range (0, 1). If unspecified, `Threshold` defaults to `0.95`.

Data Types: `logical`
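A minimal sketch of lowering the rolloff threshold, assuming a value of 0.85, a 16 kHz sample rate, and a noise input chosen only for illustration.

```matlab
fs = 16e3;                      % assumed sample rate for illustration
aFE = audioFeatureExtractor(SampleRate=fs,spectralRolloffPoint=true);

% Assumed threshold of 0.85 instead of the default 0.95.
setExtractorParameters(aFE,"spectralRolloffPoint",Threshold=0.85);

x = randn(fs,1);                % placeholder input signal
rolloff = extract(aFE,x);       % one value per analysis window
```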

`spectralSkewness`—Extract spectral skewness
`false` (default) | `true`

Extract spectral skewness, specified as `true` or `false`.

The spectral skewness is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`spectralSlope`—Extract spectral slope
`false` (default) | `true`

Extract spectral slope, specified as `true` or `false`.

The spectral slope is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`spectralSpread`—Extract spectral spread
`false` (default) | `true`

Extract spectral spread, specified as `true` or `false`.

The spectral spread is calculated on one of the following spectral representations, as specified by the `SpectralDescriptorInput` property: `linearSpectrum`, `melSpectrum`, `barkSpectrum`, or `erbSpectrum`.

Data Types: `logical`

`pitch`—Extract pitch
`false` (default) | `true`

Extract pitch, specified as `true` or `false`.

To set parameters of the pitch extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"pitch",Name=Value)

Settable parameters for the pitch extraction are:

`Method`: Method used to calculate the pitch, specified as `"PEF"`, `"NCF"`, `"CEP"`, `"LHS"`, or `"SRH"`. If unspecified, `Method` defaults to `"NCF"`. For a description of available pitch extraction methods, see `pitch`.
`Range`: Range within which to search for the pitch, in Hz, specified as a two-element row vector of increasing values. If unspecified, `Range` defaults to `[50,400]`.
`MedianFilterLength`: Median filter length used to smooth pitch estimates over time, specified as a positive integer. If unspecified, `MedianFilterLength` defaults to `1` (no median filtering).

Data Types: `logical`
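As a sketch of the pitch parameters above, this example narrows the search range and adds median smoothing while keeping the default `"NCF"` method. The 200 Hz test tone, the 16 kHz sample rate, and the chosen range and filter length are assumptions for illustration.

```matlab
fs = 16e3;                      % assumed sample rate for illustration
t = (0:fs-1)'/fs;
x = sin(2*pi*200*t);            % placeholder input: a 200 Hz tone

aFE = audioFeatureExtractor(SampleRate=fs,pitch=true);

% Assumed settings: search 80 Hz to 400 Hz and smooth over 3 estimates.
setExtractorParameters(aFE,"pitch",Range=[80 400],MedianFilterLength=3);

f0 = extract(aFE,x);            % one pitch estimate (Hz) per analysis window
```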

`harmonicRatio`—Extract harmonic ratio
`false` (default) | `true`

Extract harmonic ratio, specified as `true` or `false`.

Data Types: `logical`

`zerocrossrate`—Extract zero-crossing rate
`false` (default) | `true`

Extract zero-crossing rate, specified as `true` or `false`.

To set parameters of the zero-crossing rate extraction, use `setExtractorParameters`:

setExtractorParameters(aFE,"zerocrossrate",Name=Value)

Settable parameters for the zero-crossing rate extraction are:

`Method`: Method for computing the zero-crossing rate, specified as `"difference"` or `"comparison"`. If unspecified, `Method` defaults to `"difference"`. For more information, see `zerocrossrate`.
`Level`: Signal level for which the crossing rate is computed, specified as a real scalar. `audioFeatureExtractor` subtracts the `Level` value from the signal and then finds the zero crossings. If unspecified, `Level` defaults to `0`.
`Threshold`: Threshold above and below the `Level` value over which the crossing rate is computed, specified as a real scalar. `audioFeatureExtractor` sets all the values of the input in the range `[-Threshold,Threshold]` to `0` and then finds the zero crossings. If unspecified, `Threshold` defaults to `0`.
`TransitionEdge`: Transitions to include when counting zero crossings, specified as `"falling"`, `"rising"`, or `"both"`. If you specify `"falling"`, only negative-going transitions are counted. If you specify `"rising"`, only positive-going transitions are counted. If unspecified, `TransitionEdge` defaults to `"both"`.
`ZeroPositive`: Sign convention, specified as a logical scalar. If you specify `ZeroPositive` as `true`, then `0` is considered positive. If you specify `ZeroPositive` as `false`, then `audioFeatureExtractor` considers `0`, `-1`, and `+1` to have distinct signs following the convention of the `sign` function. If unspecified, `ZeroPositive` defaults to `false`.

Data Types: `logical`
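The following sketch counts only positive-going crossings by setting `TransitionEdge` to `"rising"`. The 100 Hz test tone and the 8 kHz sample rate are assumptions for illustration.

```matlab
fs = 8e3;                       % assumed sample rate for illustration
t = (0:fs-1)'/fs;
x = sin(2*pi*100*t);            % placeholder input: a 100 Hz tone

aFE = audioFeatureExtractor(SampleRate=fs,zerocrossrate=true);

% Count only positive-going transitions through the default Level of 0.
setExtractorParameters(aFE,"zerocrossrate",TransitionEdge="rising");

zcr = extract(aFE,x);           % one rate value per analysis window
```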

`shortTimeEnergy`—Extract short-time energy
`false` (default) | `true`

Extract short-time energy, specified as `true` or `false`. The short-time energy is computed using

sTE = sum(xbw.^2,1),

where `xbw` is the buffered and windowed signal.

Example: Chirp Function

Generate a chirp sampled at 1 kHz for 3 seconds. The instantaneous frequency is 100 Hz at $t = 0$ and crosses 200 Hz at $t = 1$ second. Divide the signal into 103-sample segments with 43 samples of overlap between adjoining segments. Window each segment with a periodic Hamming window.

fs = 1e3;
x = chirp(0:1/fs:3,100,1,200)';
win = hamming(103,"periodic");
nover = 43;
[xb,~] = buffer(x,length(win),nover,"nodelay");
xbw = xb.*win;

Compute the short-time energy using the definition.

Edef = sum(xbw.^2,1)';

Use `audioFeatureExtractor` to compute the short-time energy.

EaFE = extract(audioFeatureExtractor(shortTimeEnergy=true, ...
    SampleRate=fs,Window=win,OverlapLength=nover),x);

Verify that both procedures give the same short-time energy.

dff = max(abs(EaFE-Edef))

dff = 0

Data Types: `logical`

Object Functions

`extract`	Extract audio features
`setExtractorParameters`	Set nondefault parameter values for individual feature extractors
`info`	Output mapping and individual feature extractor parameters
`generateMATLABFunction`	Create MATLAB function compatible with C/C++ code generation
`plotFeatures`	Plot extracted audio features

Examples

Extract Multiple Audio Features

Read in an audio signal.

[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");

Create an `audioFeatureExtractor` object that extracts the MFCC, delta MFCC, delta-delta MFCC, pitch, spectral centroid, zero-crossing rate, and short-time energy of the signal. Use a 30 ms analysis window with 20 ms overlap.

aFE = audioFeatureExtractor( ...
    SampleRate=fs, ...
    Window=hamming(round(0.03*fs),"periodic"), ...
    OverlapLength=round(0.02*fs), ...
    mfcc=true, ...
    mfccDelta=true, ...
    mfccDeltaDelta=true, ...
    pitch=true, ...
    spectralCentroid=true, ...
    zerocrossrate=true, ...
    shortTimeEnergy=true);

Call `extract` to extract the audio features from the audio signal.

features = extract(aFE,audioIn);

Use `info` to determine which column of the feature extraction matrix corresponds to the requested pitch extraction.

idx = info(aFE)

idx = struct with fields:
                mfcc: [1 2 3 4 5 6 7 8 9 10 11 12 13]
           mfccDelta: [14 15 16 17 18 19 20 21 22 23 24 25 26]
      mfccDeltaDelta: [27 28 29 30 31 32 33 34 35 36 37 38 39]
    spectralCentroid: 40
               pitch: 41
       zerocrossrate: 42
     shortTimeEnergy: 43

Plot the detected pitch over time.

t = linspace(0,size(audioIn,1)/fs,size(features,1));
plot(t,features(:,idx.pitch))
title("Pitch")
xlabel("Time (s)")
ylabel("Frequency (Hz)")

Figure: line plot titled "Pitch", with Time (s) on the x-axis and Frequency (Hz) on the y-axis.

Plot the zero-crossing rate over time.

plot(t,features(:,idx.zerocrossrate))
title("Zero-Crossing Rate")
xlabel("Time (s)")

Figure: line plot titled "Zero-Crossing Rate", with Time (s) on the x-axis.

Plot the short-time energy over time.

plot(t,features(:,idx.shortTimeEnergy))
title("Short-Time Energy")
xlabel("Time (s)")

Figure: line plot titled "Short-Time Energy", with Time (s) on the x-axis.

Extract Features from Dataset

Create an audio datastore that points to audio samples included with Audio Toolbox®.

folder = fullfile(matlabroot,"toolbox","audio","samples");
ads = audioDatastore(folder);

Find all files that correspond to a sample rate of 44.1 kHz and then `subset` the datastore.

keepFile = cellfun(@(x)contains(x,"44p1"),ads.Files);
ads = subset(ads,keepFile);

Convert the data to a `tall` array. `tall` arrays are evaluated only when you request them explicitly using `gather`. MATLAB® automatically optimizes the queued calculations by minimizing the number of passes through the data. If you have Parallel Computing Toolbox™, you can spread the calculations across multiple workers. The audio data is represented as an `M`-by-1 tall cell array, where `M` is the number of files in the audio datastore.

adsTall = tall(ads)

Starting parallel pool (parpool) using the 'local' profile ...
Connected to the parallel pool (number of workers: 6).

adsTall =

  M×1 tall cell array

    { 539648×1 double}
    { 227497×1 double}
    {   8000×1 double}
    { 685056×1 double}
    { 882688×2 double}
    {1115760×2 double}
    { 505200×2 double}
    {3195904×2 double}
        :        :
        :        :

Create an `audioFeatureExtractor` object to extract the mel spectrum, Bark spectrum, ERB spectrum, and linear spectrum from each audio file. Use the default analysis window and overlap length for the spectrum extraction.

aFE = audioFeatureExtractor(SampleRate=44.1e3, ...
    melSpectrum=true, ...
    barkSpectrum=true, ...
    erbSpectrum=true, ...
    linearSpectrum=true);

Define a `cellfun` function so that audio features are extracted from each cell of the tall array. Call `gather` to evaluate the tall array.

specsTall = cellfun(@(x)extract(aFE,x),adsTall,UniformOutput=false);
specs = gather(specsTall);

Evaluating tall expression using the Parallel Pool 'local':
- Pass 1 of 1: Completed in 14 sec
Evaluation completed in 14 sec

The `specs` variable returned from `gather` is a `numFiles`-by-1 cell array, where `numFiles` is the number of files in the datastore. Each element of the cell array is a `numHops`-by-`numFeatures`-by-`numChannels` array, where the number of hops and the number of channels depend on the length and number of channels of the audio file, and the number of features is the number of features requested from the audio data.

numFiles = numel(specs)

numFiles = 12

[numHops1,numFeaturesFile1,numChannelsFile1] = size(specs{1})

numHops1 = 1053

numFeaturesFile1 = 620

numChannelsFile1 = 1

[numHops2,numFeaturesFile2,numChannelsFile2] = size(specs{2})

numHops2 = 443

numFeaturesFile2 = 620

numChannelsFile2 = 1
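The sizes reported above can be reproduced by hand. The sketch below assumes that only complete analysis windows are counted and that the one-sided linear spectrum has `numel(Window)/2 + 1` bins when `FFTLength` is left at its default; both assumptions are consistent with the output shown here but are not stated on this page. It uses the default band counts listed under the spectrum properties.

```matlab
winLength = 1024;               % numel of the default Window
overlap   = 512;                % default OverlapLength
hop       = winLength - overlap;

% Lengths of the first two files in the datastore, taken from the tall array display.
numSamples = [539648 227497];
numHops = floor((numSamples - winLength)/hop) + 1;

% Columns contributed by each enabled spectrum with default parameters.
fs = 44.1e3;
numLinear = winLength/2 + 1;                  % assumed one-sided bin count
numMel    = 32;                               % default NumBands
numBark   = 32;                               % default NumBands
numERB    = ceil(hz2erb(fs/2) - hz2erb(0));   % default NumBands expression
numFeatures = numLinear + numMel + numBark + numERB;
```

Working through the counts this way is a quick check that an extractor configuration produces the matrix size you expect before processing a whole dataset.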

Visualize Extracted Audio Features

Use `plotFeatures` to visualize audio features extracted with an `audioFeatureExtractor` object.

Read in an audio signal from a file.

[audioIn,fs] = audioread("Counting-16-44p1-mono-15secs.wav");

Create an `audioFeatureExtractor` object that extracts the gammatone cepstral coefficients (GTCCs) and the delta of the GTCCs. Set the `SampleRate` property to the sample rate of the audio signal, and use the default values for the other properties.

afe = audioFeatureExtractor(SampleRate=fs,gtcc=true,gtccDelta=true);

Plot the features extracted from the audio signal.

plotFeatures(afe,audioIn)

Figure: two image plots titled "GTCC" and "GTCC Delta", each with Time (s) on the x-axis and Coefficient on the y-axis.

Algorithms

The `audioFeatureExtractor` creates a feature extraction pipeline based on your selected features. To reduce computations, `audioFeatureExtractor` reuses intermediate representations and outputs some of them as features.

For example, to create an object that extracts the centroid of the Bark spectrum, the flux of the Bark spectrum, the pitch, the harmonic ratio, and the delta-delta of the MFCC, specify the `audioFeatureExtractor` as follows.

aFE = audioFeatureExtractor( ...
    SpectralDescriptorInput="barkSpectrum", ...
    spectralCentroid=true, ...
    spectralFlux=true, ...
    pitch=true, ...
    harmonicRatio=true, ...
    mfccDeltaDelta=true)

aFE =
  audioFeatureExtractor with properties:

   Properties
                     Window: [1024×1 double]
              OverlapLength: 512
                 SampleRate: 44100
                  FFTLength: []
    SpectralDescriptorInput: 'barkSpectrum'

   Enabled Features
     mfccDeltaDelta, spectralCentroid, spectralFlux, pitch, harmonicRatio

   Disabled Features
     linearSpectrum, melSpectrum, barkSpectrum, erbSpectrum, mfcc, mfccDelta
     gtcc, gtccDelta, gtccDeltaDelta, spectralCrest, spectralDecrease, spectralEntropy
     spectralFlatness, spectralKurtosis, spectralRolloffPoint, spectralSkewness, spectralSlope, spectralSpread

   To extract a feature, set the corresponding property to true.
   For example, obj.mfcc = true, adds mfcc to the list of enabled features.

This configuration corresponds to the highlighted feature extraction pipeline.

Note

Because `audioFeatureExtractor` reuses intermediate representations, the features output from `audioFeatureExtractor` might not correspond with the default configuration of features output by the corresponding individual feature extractors.

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using MATLAB® Coder™.

Usage notes and limitations:

You cannot generate code directly from `audioFeatureExtractor`. You can generate C/C++ code from the function returned by `generateMATLABFunction`.
Functions returned by `generateMATLABFunction` that compute an auditory spectrum (mel, Bark, ERB) support optimized code generation using single instruction, multiple data (SIMD) instructions. For more information about SIMD code generation, see Generate SIMD Code for MATLAB Functions (MATLAB Coder).
`zerocrossrate` code generation does not support disabling dynamic memory allocation when the input is multichannel.
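As a sketch of the first step toward code generation, the following writes a standalone function from a configured extractor. The file name `extractMFCC` and the 16 kHz sample rate are hypothetical choices for illustration, and the call writes a file to the current folder. Generating C/C++ code from the resulting function is a separate step with MATLAB Coder and is not shown here.

```matlab
% Configure the extractor; the sample rate is an assumed value.
aFE = audioFeatureExtractor(SampleRate=16e3,mfcc=true);

% Write a MATLAB function equivalent to this configuration.
% "extractMFCC" is a hypothetical file name.
generateMATLABFunction(aFE,"extractMFCC");

% Check that the function file now exists on the path (2 means a file).
wasGenerated = (exist("extractMFCC","file") == 2);
```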

GPU Arrays
Accelerate code by running on a graphics processing unit (GPU) using Parallel Computing Toolbox™.

This function fully supports GPU arrays. For more information, see Run MATLAB Functions on a GPU (Parallel Computing Toolbox).

Version History

Introduced in R2019b

R2023a: Generate optimized C/C++ code for auditory spectrum computation

Functions returned by `generateMATLABFunction` that compute an auditory spectrum (mel, Bark, ERB) support optimized C/C++ code generation using single instruction, multiple data (SIMD) instructions.

R2022b: Visualize extracted features

Use the `plotFeatures` object function to visualize extracted audio features.

R2020b: Computation of deltas and delta-deltas

The `audioDelta` function is now used to compute `mfccDelta`, `mfccDeltaDelta`, `gtccDelta`, and `gtccDeltaDelta`. The `audioDelta` algorithm has a different startup behavior than the previous algorithm. The default window length used to compute the deltas has changed from `2` to `9`. A delta window length of `2` is no longer supported.
