
Quantization and Pruning

Compress a deep neural network by performing quantization or pruning

Use Deep Learning Toolbox™ together with the Deep Learning Toolbox Model Quantization Library support package to reduce the memory footprint and computational requirements of a deep neural network by:

  • Quantizing the weights, biases, and activations of layers to reduced precision scaled integer data types. You can then generate C/C++, CUDA®, or HDL code from this quantized network.

  • Pruning filters from convolution layers by using first-order Taylor approximation. You can then generate C/C++ or CUDA code from this pruned network.
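The quantization workflow combines the functions listed below. The following is a minimal sketch, not a complete program: the pretrained network `net` and the calibration/validation image datastores `calDS` and `valDS` are placeholders you would supply.

```matlab
% Quantization sketch -- `net`, `calDS`, and `valDS` are assumed to exist.
quantObj = dlquantizer(net, ExecutionEnvironment="GPU");

% Exercise the network on calibration data to collect the dynamic
% ranges of weights, biases, and activations.
calResults = calibrate(quantObj, calDS);

% Quantize to 8-bit scaled integers and check accuracy on validation data.
quantOpts  = dlquantizationOptions;
valResults = validate(quantObj, valDS, quantOpts);

% Obtain the quantized network object for simulation or code generation.
qNet     = quantize(quantObj);
qDetails = quantizationDetails(qNet);
```

After validation, the quantized network can be passed to the C/C++, CUDA, or HDL code generation workflows mentioned above.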

Functions


dlquantizer Quantize a deep neural network to 8-bit scaled integer data types
dlquantizationOptions Options for quantizing a trained deep neural network
calibrate Simulate and collect ranges of a deep neural network
validate Quantize and validate a deep neural network
quantize Create quantized deep neural network
estimateNetworkMetrics Estimate metrics for specific layers of a neural network
quantizationDetails Display quantization details of a quantized network
taylorPrunableNetwork Network that can be pruned by using first-order Taylor approximation
forward Compute deep learning network output for training
predict Compute deep learning network output for inference
updatePrunables Remove filters from prunable layers based on importance scores
updateScore Compute and accumulate Taylor-based importance scores for pruning
dlnetwork Deep learning network for custom training loops
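The pruning functions above are used inside a custom training loop. The following is a rough sketch of the pruning phase only, under stated assumptions: `net` is an existing dlnetwork, `X` and `T` are a dlarray mini-batch and its targets, and `modelLoss` is a hypothetical helper that returns the loss together with the activations and gradients of the prunable layers.

```matlab
% Taylor pruning sketch -- `net`, `X`, `T`, `numPruningIterations`, and
% `modelLoss` are placeholders; this shows only the shape of the loop.
prunableNet = taylorPrunableNetwork(net);

for iteration = 1:numPruningIterations
    % Forward/backward pass; `modelLoss` is assumed to return the
    % activations and gradients of the prunable convolution layers.
    [loss, pruningActivations, pruningGradients] = ...
        dlfeval(@modelLoss, prunableNet, X, T);

    % Accumulate first-order Taylor importance scores ...
    prunableNet = updateScore(prunableNet, pruningActivations, ...
        pruningGradients);

    % ... then remove the lowest-scoring convolution filters.
    prunableNet = updatePrunables(prunableNet);
end
```

Once pruning converges, the pruned network can be fine-tuned and then passed to C/C++ or CUDA code generation.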

Apps

Deep Network Quantizer Quantize a deep neural network to 8-bit scaled integer data types

Topics

Deep Learning Quantization

Quantization for GPU Target

Quantization for FPGA Target

Quantization for CPU Target

Pruning