Deep Network Quantizer
Description
Use the Deep Network Quantizer app to reduce the memory requirement of a deep neural network by quantizing the weights, biases, and activations of convolution layers to 8-bit scaled integer data types. Using this app, you can:
Visualize the dynamic ranges of convolution layers in a deep neural network.
Select individual network layers to quantize.
Assess the performance of a quantized network.
Generate GPU code to deploy the quantized network using GPU Coder™.
Generate HDL code to deploy the quantized network to an FPGA using Deep Learning HDL Toolbox™.
Generate C++ code to deploy the quantized network to an ARM Cortex-A microcontroller using MATLAB® Coder™.
Generate a simulatable quantized network that you can explore in MATLAB without generating code or deploying to hardware.
This app requires Deep Learning Toolbox Model Quantization Library. To learn about the products required to quantize a deep neural network, see Quantization Workflow Prerequisites.
Open the Deep Network Quantizer App
MATLAB command prompt: Enter deepNetworkQuantizer.
MATLAB toolstrip: On the Apps tab, under Machine Learning and Deep Learning, click the app icon.
Examples
Quantize a Network for GPU Deployment
To explore the behavior of a neural network with quantized convolution layers, use the Deep Network Quantizer app. This example quantizes the learnable parameters of the convolution layers of the squeezenet neural network after retraining the network to classify new images according to the Train Deep Learning Network to Classify New Images example.
This example uses a DAG network with the GPU execution environment.
Load the network to quantize into the base workspace.
load squeezenetmerch
net
net = 
  DAGNetwork with properties:
         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}
Define calibration and validation data.
The app uses calibration data to exercise the network and collect the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. For the best quantization results, the calibration data must be representative of inputs to the network.
The app uses the validation data to test the network after quantization to understand the effects of the limited range and precision of the quantized learnable parameters of the convolution layers in the network.
In this example, use the images in the MerchData data set. Define an augmentedImageDatastore object to resize the data for the network. Then, split the data into calibration and validation data sets.
unzip('MerchData.zip');
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[calData, valData] = splitEachLabel(imds, 0.7, 'randomized');
aug_calData = augmentedImageDatastore([227 227], calData);
aug_valData = augmentedImageDatastore([227 227], valData);
At the MATLAB command prompt, open the app.
deepNetworkQuantizer
In the app, click New and select Quantize a network.
The app verifies your execution environment. For more information, see Quantization Workflow Prerequisites.
In the dialog, select the execution environment and the network to quantize from the base workspace. For this example, select a GPU execution environment and the DAG network, net.
The app displays the layer graph of the selected network.
In the Calibrate section of the toolstrip, under Calibration Data, select the augmentedImageDatastore object from the base workspace containing the calibration data, aug_calData. Select Calibrate.
The Deep Network Quantizer app uses the calibration data to exercise the network and collect range information for the learnable parameters in the network layers.
When the calibration is complete, the app displays a table containing the weights and biases in the convolution and fully connected layers of the network, the dynamic ranges of the activations in all layers of the network, and their minimum and maximum values recorded during the calibration. To the right of the table, the app displays histograms of the dynamic ranges of the parameters. The gray regions of the histograms indicate data that cannot be represented by the quantized representation. For more information on how to interpret these histograms, see Quantization of Deep Neural Networks.
In the Quantize column of the table, indicate whether to quantize the learnable parameters in the layer. Layers that are not quantized remain in single precision after quantization.
In the Validate section of the toolstrip, under Validation Data, select the augmentedImageDatastore object from the base workspace containing the validation data, aug_valData.
In the Validate section of the toolstrip, under Quantization Options, select the Default metric function and MinMax exponent scheme. Select Quantize and Validate.
The Deep Network Quantizer app quantizes the weights, activations, and biases of convolution layers in the network to scaled 8-bit integer data types and uses the validation data to exercise the network. The app determines a default metric function to use for the validation based on the type of network that is being quantized. For a classification network, the app uses Top-1 Accuracy.
When the validation is complete, the app displays the results of the validation, including:
Metric function used for validation
Result of the metric function before and after quantization
Memory requirement of the network before and after quantization (MB)
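The calibration and validation steps that the app performs can also be run at the command line with a dlquantizer object. The following is a minimal sketch, assuming that net, aug_calData, and aug_valData exist in the workspace as defined above and that the products required for GPU quantization are installed. The variable names quantObj, calResults, and valResults are chosen here for illustration.

```matlab
% Create a quantizer for the GPU execution environment.
quantObj = dlquantizer(net, 'ExecutionEnvironment', 'GPU');

% Exercise the network with the calibration data to collect dynamic ranges.
calResults = calibrate(quantObj, aug_calData);

% Quantize and validate with the default metric function
% (Top-1 accuracy for a classification network).
valResults = validate(quantObj, aug_valData);

% Metric result and memory statistics before and after quantization.
valResults.MetricResults.Result
valResults.Statistics
```

The two displayed fields roughly correspond to the metric and memory rows that the app shows in its validation results.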
If you want to use a different metric function for validation, for example to use the Top-5 accuracy metric function instead of the default Top-1 accuracy metric function, you can define a custom metric function. Save this function in a local file.
function accuracy = hComputeModelAccuracy(predictionScores, net, dataStore)
%% Computes model-level accuracy statistics

    % Load ground truth
    tmp = readall(dataStore);
    groundTruth = tmp.response;

    % Compare predicted label with actual ground truth
    predictionError = {};
    for idx=1:numel(groundTruth)
        [~, idy] = max(predictionScores(idx,:));
        yActual = net.Layers(end).Classes(idy);
        predictionError{end+1} = (yActual == groundTruth(idx)); %#ok
    end

    % Sum all prediction errors.
    predictionError = [predictionError{:}];
    accuracy = sum(predictionError)/numel(predictionError);
end
To revalidate the network using this custom metric function, under Quantization Options, enter the name of the custom metric function, hComputeModelAccuracy. Select Add to add hComputeModelAccuracy to the list of metric functions available in the app. Select hComputeModelAccuracy as the metric function to use.
The custom metric function must be on the path. If the metric function is not on the path, this step produces an error.
Select Quantize and Validate.
The app quantizes the network and displays the validation results for the custom metric function.
The app displays only scalar values in the validation results table. To view the validation results for a custom metric function with non-scalar output, export the dlquantizer object as described below, then validate using the validate function at the MATLAB command window.
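As a hedged sketch of that command-line validation, the custom metric function can be passed to validate through a dlquantizationOptions object. Here quantObj is assumed to be the name of the exported, calibrated dlquantizer object, and net and aug_valData are the workspace variables defined earlier in this example.

```matlab
% quantObj is assumed to be a calibrated dlquantizer object exported from the app.
% Wrap the custom metric so that it receives only the prediction scores.
quantOpts = dlquantizationOptions('MetricFcn', ...
    {@(x) hComputeModelAccuracy(x, net, aug_valData)});

% Quantize and validate using the custom metric function.
valResults = validate(quantObj, aug_valData, quantOpts);

% Inspect the metric output, which may be non-scalar for other metric functions.
valResults.MetricResults.Result
```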
If the performance of the quantized network is not satisfactory, you can choose to not quantize some layers by deselecting the layer in the table. You can also explore the effects of choosing a different exponent selection scheme for quantization in the Quantization Options menu. To see the effects of these changes, select Quantize and Validate again.
After calibrating the network, you can choose to export the quantized network or the dlquantizer object. Select the Export button. In the drop-down list, select from the following options:
Export Quantized Network - Add the quantized network to the base workspace. This option exports a simulatable quantized network that you can explore in MATLAB without deploying to hardware.
Export Quantizer - Add the dlquantizer object to the base workspace. You can save the dlquantizer object and use it for further exploration in the Deep Network Quantizer app or at the command line, or use it to generate code for your target hardware.
Generate Code - Open the GPU Coder app and generate GPU code from the quantized neural network. Generating GPU code requires a GPU Coder™ license.
Quantize a Network for CPU Deployment
This example uses:
- Deep Learning Toolbox
- MATLAB Coder
- Embedded Coder
- Deep Learning Toolbox Model Quantization Library
- MATLAB Support Package for Raspberry Pi Hardware
- MATLAB Coder Interface for Deep Learning
To explore the behavior of a neural network with quantized convolution layers, use the Deep Network Quantizer app. This example quantizes the learnable parameters of the convolution layers of the squeezenet neural network after retraining the network to classify new images according to the Train Deep Learning Network to Classify New Images example.
This example uses a DAG network with the CPU execution environment.
Load the network to quantize into the base workspace.
load squeezenetmerch
net
net = 
  DAGNetwork with properties:
         Layers: [68×1 nnet.cnn.layer.Layer]
    Connections: [75×2 table]
     InputNames: {'data'}
    OutputNames: {'new_classoutput'}
Define calibration and validation data.
The app uses calibration data to exercise the network and collect the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network and the dynamic ranges of the activations in all layers of the network. For the best quantization results, the calibration data must be representative of inputs to the network.
The app uses the validation data to test the network after quantization to understand the effects of the limited range and precision of the quantized learnable parameters of the convolution layers in the network.
In this example, use the images in the MerchData data set. Define an augmentedImageDatastore object to resize the data for the network. Then, split the data into calibration and validation data sets.
unzip('MerchData.zip');
imds = imageDatastore('MerchData', ...
    'IncludeSubfolders',true, ...
    'LabelSource','foldernames');
[calData, valData] = splitEachLabel(imds, 0.7, 'randomized');
aug_calData = augmentedImageDatastore([227 227], calData);
aug_valData = augmentedImageDatastore([227 227], valData);
At the MATLAB command prompt, open the app.
deepNetworkQuantizer
In the app, click New and select Quantize a network.
The app verifies your execution environment. For more information, see Quantization Workflow Prerequisites.
In the dialog, select the execution environment and the network to quantize from the base workspace. For this example, select a CPU execution environment and the DAG network, net.
The app displays the layer graph of the selected network.
In the Calibrate section of the toolstrip, under Calibration Data, select the augmentedImageDatastore object from the base workspace containing the calibration data, aug_calData. Select Calibrate.
The Deep Network Quantizer app uses the calibration data to exercise the network and collect range information for the learnable parameters in the network layers.
When the calibration is complete, the app displays a table containing the weights and biases in the convolution and fully connected layers of the network, the dynamic ranges of the activations in all layers of the network, and their minimum and maximum values recorded during the calibration. To the right of the table, the app displays histograms of the dynamic ranges of the parameters. The gray regions of the histograms indicate data that cannot be represented by the quantized representation. For more information on how to interpret these histograms, see Quantization of Deep Neural Networks.
In the Quantize column of the table, indicate whether to quantize the learnable parameters in the layer. Layers that are not quantized remain in single precision after quantization.
In the Validate section of the toolstrip, under Validation Data, select the augmentedImageDatastore object from the base workspace containing the validation data, aug_valData.
In the Validate section of the toolstrip, under Hardware Settings, select Raspberry Pi as the Simulation Environment. The app auto-populates the target credentials from an existing connection or from the last successful connection. You can also use this option to create a new Raspberry Pi connection.
In the Validate section of the toolstrip, under Quantization Options, select the Default metric function and MinMax exponent scheme. Select Quantize and Validate.
The Deep Network Quantizer app quantizes the weights, activations, and biases of convolution layers in the network to scaled 8-bit integer data types and uses the validation data to exercise the network. The app determines a default metric function to use for the validation based on the type of network that is being quantized. For a classification network, the app uses Top-1 Accuracy.
When the validation is complete, the app displays the results of the validation, including:
Metric function used for validation
Result of the metric function before and after quantization
Memory requirement of the network before and after quantization (MB)
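A command-line counterpart of this Raspberry Pi validation might look like the following sketch. It assumes that net, aug_calData, and aug_valData are defined as above and that the MATLAB Support Package for Raspberry Pi Hardware is installed. The board name and credentials passed to raspi are placeholders, not values from this example.

```matlab
% Placeholder board credentials; replace with the details of your own board.
r = raspi('raspiname', 'username', 'password');

% Create a quantizer for the CPU execution environment and calibrate it.
quantObj = dlquantizer(net, 'ExecutionEnvironment', 'CPU');
calResults = calibrate(quantObj, aug_calData);

% Validate on the Raspberry Pi board by passing the connection as the target.
quantOpts = dlquantizationOptions('Target', r);
valResults = validate(quantObj, aug_valData, quantOpts);
valResults.MetricResults.Result
```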
If the performance of the quantized network is not satisfactory, you can choose to not quantize some layers by deselecting the layer in the table. You can also explore the effects of choosing a different exponent selection scheme for quantization in the Quantization Options menu. To see the effects of these changes, select Quantize and Validate again.
After calibrating the network, you can choose to export the quantized network or the dlquantizer object. Select the Export button. In the drop-down list, select from the following options:
Export Quantized Network - Add the quantized network to the base workspace. This option exports a simulatable quantized network that you can explore in MATLAB without deploying to hardware.
Export Quantizer - Add the dlquantizer object to the base workspace. You can save the dlquantizer object and use it for further exploration in the Deep Network Quantizer app or at the command line, or use it to generate code for your target hardware.
Generate Code - Open the MATLAB Coder app and generate C++ code from the quantized neural network. Generating C++ code requires a MATLAB Coder™ license.
Quantize a Network for FPGA Deployment
To explore the behavior of a neural network that has quantized convolution layers, use the Deep Network Quantizer app. This example quantizes the learnable parameters of the convolution layers of the LogoNet neural network for an FPGA target.
For this example, you need the products listed under FPGA in Quantization Workflow Prerequisites.
Load the pretrained network to quantize into the base workspace. Create a file in your current working folder called getLogoNetwork.m. In the file, enter:
function net = getLogoNetwork
    if ~isfile('LogoNet.mat')
        url = '//www.tianjin-qmedu.com/supportfiles/gpucoder/cnn_models/logo_detection/LogoNet.mat';
        websave('LogoNet.mat',url);
    end
    data = load('LogoNet.mat');
    net = data.convnet;
end
Load the pretrained network.
snet = getLogoNetwork;
snet = 
  SeriesNetwork with properties:
         Layers: [22×1 nnet.cnn.layer.Layer]
     InputNames: {'imageinput'}
    OutputNames: {'classoutput'}
Define calibration and validation data to use for quantization.
The Deep Network Quantizer app uses calibration data to exercise the network and collect the dynamic ranges of the weights and biases in the convolution and fully connected layers of the network. The app also collects the dynamic ranges of the activations in all layers of the LogoNet network. For the best quantization results, the calibration data must be representative of inputs to the LogoNet network.
After quantization, the app uses the validation data set to test the network to understand the effects of the limited range and precision of the quantized learnable parameters of the convolution layers in the network.
In this example, use the images in the logos_dataset data set to calibrate and validate the LogoNet network. Define an imageDatastore object, then split the data into calibration and validation data sets.
Expedite the calibration and validation process for this example by using a subset of the calibration and validation data. Store the new reduced calibration data set in calData_concise and the new reduced validation data set in valData_concise.
currentDir = pwd;
openExample('deeplearning_shared/QuantizeNetworkForFPGADeploymentExample')
unzip('logos_dataset.zip');
imds = imageDatastore(fullfile(currentDir,'logos_dataset'), ...
    'IncludeSubfolders',true,'FileExtensions','.JPG','LabelSource','foldernames');
[calData,valData] = splitEachLabel(imds,0.7,'randomized');
calData_concise = calData.subset(1:20);
valData_concise = valData.subset(1:6);
Open the Deep Network Quantizer app.
deepNetworkQuantizer
Click New and select Quantize a network.
Set the execution environment to FPGA and select snet - SeriesNetwork as the network to quantize.
The app displays the layer graph of the selected network.
Under Calibration Data, select the calData_concise - ImageDatastore object from the base workspace containing the calibration data.
Click Calibrate. By default, the app uses the host GPU to collect calibration data, if one is available. Otherwise, the app uses the host CPU. You can use the Calibrate drop-down menu to select the calibration environment.
The Deep Network Quantizer app uses the calibration data to exercise the network and collect range information for the learnable parameters in the network layers.
When the calibration is complete, the app displays a table containing the weights and biases in the convolution and fully connected layers of the network. The table also shows the dynamic ranges of the activations in all layers of the network and their minimum and maximum values recorded during the calibration. The app displays histograms of the dynamic ranges of the parameters. The gray regions of the histograms indicate data that cannot be represented by the quantized representation. For more information on how to interpret these histograms, see Quantization of Deep Neural Networks.
In the Quantize Layer column of the table, indicate whether to quantize the learnable parameters in the layer. Layers that are not quantized remain in single precision.
Under Validation Data, select the valData_concise - ImageDatastore object from the base workspace containing the validation data.
In the Hardware Settings section of the toolstrip, select the environment to use for validation of the quantized network. For more information on these options, see Hardware Settings.
For this example, select Xilinx ZC706 (zc706_int8) and JTAG.
Under Quantization Options, select the Default metric function and MinMax exponent scheme. For more information on these options, see Quantization Options.
Click Quantize and Validate.
The Deep Network Quantizer app quantizes the weights, activations, and biases of convolution layers in the network to scaled 8-bit integer data types and uses the validation data to exercise the network. The app determines a default metric function to use for the validation based on the type of network that is being quantized. For more information, see Quantization Options.
When the validation is complete, the app displays the validation results.
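The same FPGA workflow can be approximated at the command line. The sketch below assumes that snet, calData_concise, and valData_concise are defined as above, that Deep Learning HDL Toolbox and the required support package are installed, and that a ZC706 board is connected over JTAG. The variable names quantObj, hTarget, and quantOpts are illustrative.

```matlab
% Create a quantizer for the FPGA execution environment and calibrate it.
quantObj = dlquantizer(snet, 'ExecutionEnvironment', 'FPGA');
calResults = calibrate(quantObj, calData_concise);

% Describe the target board connection (Xilinx board over JTAG).
hTarget = dlhdl.Target('Xilinx', 'Interface', 'JTAG');

% Select the int8 bitstream and target, then quantize and validate.
quantOpts = dlquantizationOptions('Bitstream', 'zc706_int8', 'Target', hTarget);
valResults = validate(quantObj, valData_concise, quantOpts);
valResults.MetricResults.Result
```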
After quantizing and validating the network, you can choose to export the quantized network.
Click the Export button. In the drop-down list, select Export Quantizer to create a dlquantizer object in the base workspace. You can deploy the quantized network to your target FPGA board and retrieve the prediction results by using MATLAB. For an example, see Classify Images on FPGA Using Quantized Neural Network (Deep Learning HDL Toolbox).
Import a dlquantizer Object into the Deep Network Quantizer App
This example shows you how to import a dlquantizer object from the base workspace into the Deep Network Quantizer app. This allows you to begin quantization of a deep neural network using the command line or the app, and resume your work later in the app.
Open the Deep Network Quantizer app.
deepNetworkQuantizer
In the app, click New and select Import dlquantizer object.
In the dialog, select the dlquantizer object to import from the base workspace. For this example, use quantObj that you create in the above example Quantize a Neural Network for GPU Target.
The app imports any data contained in the dlquantizer object that was collected at the command line. This data can include the network to quantize, calibration data, validation data, and calibration statistics.
The app displays a table containing the calibration data contained in the imported dlquantizer object, quantObj. To the right of the table, the app displays histograms of the dynamic ranges of the parameters. The gray regions of the histograms indicate data that cannot be represented by the quantized representation. For more information on how to interpret these histograms, see Quantization of Deep Neural Networks.
Parameters
Execution Environment
—Execution Environment
GPU (default) | FPGA | CPU | MATLAB
When you select New > Quantize a Network, the app allows you to choose the execution environment for the quantized network. How the network is quantized depends on the choice of execution environment.
When you select the MATLAB execution environment, the app performs target-agnostic quantization of the neural network. This option does not require you to have target hardware in order to explore the quantized network in MATLAB.
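At the command line, target-agnostic quantization can be sketched as follows. This assumes that a network net and calibration data aug_calData exist in the workspace, as in the examples above. The names quantObj, qNet, and qDetails are illustrative.

```matlab
% Create a quantizer that is not tied to specific target hardware.
quantObj = dlquantizer(net, 'ExecutionEnvironment', 'MATLAB');

% Collect dynamic ranges from the calibration data.
calResults = calibrate(quantObj, aug_calData);

% Produce a simulatable quantized network and inspect which layers were quantized.
qNet = quantize(quantObj);
qDetails = quantizationDetails(qNet);
qDetails.QuantizedLayerNames
```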
Hardware Settings
—Hardware settings
simulation environment | target
Specify hardware settings based on your execution environment.
GPU Execution Environment
Select from the following simulation environments:
Simulation Environment | Action |
---|---|
GPU (Simulate on host GPU) | Deploys the quantized network to the host GPU. Validates the quantized network by comparing performance to the single-precision version of the network. |
MATLAB (Simulate in MATLAB) | Simulates the quantized network in MATLAB. Validates the quantized network by comparing performance to the single-precision version of the network. |
FPGA Execution Environment
Select from the following simulation environments:
Simulation Environment | Action |
---|---|
Co-Simulation (Simulate with dlhdl.Simulator) | Simulates the quantized network in MATLAB using dlhdl.Simulator (Deep Learning HDL Toolbox). Validates the quantized network by comparing performance to the single-precision version of the network. |
Intel Arria 10 SoC (arria10soc_int8) | Deploys the quantized network to an Intel® Arria® 10 SoC board by using the arria10soc_int8 bitstream. Validates the quantized network by comparing performance to the single-precision version of the network. |
Xilinx ZCU102 (zcu102_int8) | Deploys the quantized network to a Xilinx® Zynq® UltraScale+™ MPSoC ZCU102 board by using the zcu102_int8 bitstream. Validates the quantized network by comparing performance to the single-precision version of the network. |
Xilinx ZC706 (zc706_int8) | Deploys the quantized network to a Xilinx Zynq-7000 ZC706 board by using the zc706_int8 bitstream. Validates the quantized network by comparing performance to the single-precision version of the network. |
When you select the Intel Arria 10 SoC, Xilinx ZCU102, or Xilinx ZC706 option, additionally select the interface to use to deploy and validate the quantized network.
Target Option | Action |
---|---|
JTAG | Programs the target FPGA board selected under Simulation Environment by using a JTAG cable. For more information, see JTAG Connection (Deep Learning HDL Toolbox). |
Ethernet | Programs the target FPGA board selected in Simulation Environment through the Ethernet interface. Specify the IP address for your target board in the IP Address field. |
CPU Execution Environment
Select from the following simulation environments:
Simulation Environment | Action |
---|---|
Raspberry Pi | Deploys the quantized network to the Raspberry Pi board. Validates the quantized network by comparing performance to the single-precision version of the network. |
When you select the Raspberry Pi option, additionally specify the following details for the raspi connection.
Target Option | Description |
---|---|
Hostname | Hostname of the board, specified as a string. |
Username | Linux® username, specified as a string. |
Password | Linux user password, specified as a string. |
Quantization Options
—Options for quantization and validation
metric function | exponent scheme
By default, the Deep Network Quantizer app determines a metric function to use for the validation based on the type of network that is being quantized.
Type of Network | Metric Function |
---|---|
Classification | Top-1 Accuracy - Accuracy of the network |
Object Detection | Average Precision - Average precision over all detection results |
Regression | MSE - Mean squared error of the network |
Semantic Segmentation | WeightedIOU - Average IoU of each class, weighted by the number of pixels in that class |
You can also specify a custom metric function to use for validation.
You can select the exponent selection scheme to use for quantization of the network:
MinMax— (default) Evaluate the exponent based on the range information in the calibration statistics and avoid overflows.
Histogram— Distribution-based scaling which evaluates the exponent to best fit the calibration data.
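To make the idea of a range-based exponent concrete, the following sketch quantizes a few values to a scaled int8 type using the smallest power-of-two scale that avoids overflow. It is an illustration of the general technique under stated assumptions, not the app's actual algorithm: the sample values in x are hypothetical, and the power-of-two scale and round-to-nearest choice are assumptions made for this example.

```matlab
% Hypothetical calibration samples for one layer (not taken from the example network).
x = single([-0.90 -0.25 0.10 0.40 1.70]);

% Range-based choice: smallest power-of-two scale whose int8 range covers max|x|.
maxAbs   = max(abs(x));
exponent = ceil(log2(double(maxAbs) / 127));   % scale = 2^exponent
scale    = 2^exponent;

% Quantize to int8 (round and saturate), then rescale to compare with the original.
q    = int8(max(-128, min(127, round(double(x) / scale))));
xHat = single(double(q) * scale);

% Worst-case rounding error; without overflow it is at most half the scale.
maxErr = max(abs(xHat - x));
```

Because the scale is chosen from the largest observed magnitude, no value saturates, and the error comes only from rounding. A distribution-based scheme can instead pick a smaller scale, trading occasional saturation of outliers for finer resolution on the bulk of the data.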
Export
—Options for exporting quantized network
Export Quantized Network | Export Quantizer | Generate Code
Export Quantized Network - After calibrating the network, quantize and add the quantized network to the base workspace. This option exports a simulatable quantized network, quantizedNet, that you can explore in MATLAB without deploying to hardware. This option is equivalent to using quantize at the command line.
Code generation is not supported for the exported quantized network, quantizedNet.
Export Quantizer - Add the dlquantizer object to the base workspace. You can save the dlquantizer object and use it for further exploration in the Deep Network Quantizer app or at the command line, or use it to generate code for your target hardware.
Generate Code
Execution Environment | Code Generation |
---|---|
GPU | Open the GPU Coder app and generate GPU code from the quantized and validated neural network. Generating GPU code requires a GPU Coder license. |
CPU | Open the MATLAB Coder app and generate C++ code from the quantized and validated neural network. Generating C++ code requires a MATLAB Coder license. |
Version History
Introduced in R2020a
R2023a: Specify Raspberry Pi connection in Hardware Settings
You can now validate the quantized network on a Raspberry Pi® board by selecting Raspberry Pi in Hardware Settings and specifying target credentials for a raspi object in the validation step.
R2022b: Calibrate on host GPU or host CPU
You can now choose whether to calibrate your network using the host GPU or host CPU. By default, the calibrate function and the Deep Network Quantizer app calibrate on the host GPU if one is available.
In previous versions, the execution environment had to be the same as the instrumentation environment used for the calibration step of quantization.
R2022b: dlnetwork support
The Deep Network Quantizer app now supports calibration and validation for dlnetwork objects.
R2022a: Validate the performance of quantized network for CPU target
The Deep Network Quantizer app now supports the quantization and validation workflow for CPU targets.
R2022a: Quantize neural networks without a specific target
Specify MATLAB as the Execution Environment to quantize your neural networks without generating code or committing to a specific target for code deployment. This can be useful if you:
Do not have access to your target hardware.
Want to inspect your quantized network without generating code.
Your quantized network implements int8 data instead of single data. It keeps the same layers and connections as the original network, and it has the same inference behavior as it would when running on hardware.
Once you have quantized your network, you can use the quantizationDetails function to inspect your quantized network. Additionally, you also have the option to deploy the code to a GPU target.