Deep Learning Tips and Tricks - MATLAB & Simulink - MathWorks Australia

Deep Learning Tips and Tricks

This page describes various training options and techniques for improving the accuracy of deep learning networks.

Choose Network Architecture

The appropriate network architecture depends on the task and the data available. Consider these suggestions when deciding which architecture to use and whether to use a pretrained network or to train from scratch.

Data	Description of Task	Learn More
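Images	Classification of natural images	Try different pretrained networks. For a list of pretrained deep learning networks, see Pretrained Deep Neural Networks. To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Deep Network Designer.
	Regression of natural images	Try different pretrained networks. For an example showing how to convert a pretrained classification network into a regression network, see Convert Classification Network into Regression Network.
	Classification and regression of non-natural images (for example, tiny images and spectrograms)	For an example showing how to classify tiny images, see Train Residual Network for Image Classification. For an example showing how to classify spectrograms, see Speech Command Recognition Using Deep Learning.
	Semantic segmentation	Computer Vision Toolbox™ provides tools to create deep learning networks for semantic segmentation. For more information, see Getting Started with Semantic Segmentation Using Deep Learning (Computer Vision Toolbox).
Sequences, time series, and signals	Sequence-to-label classification	For an example, see Sequence Classification Using Deep Learning.
	Sequence-to-sequence classification and regression	To learn more, see Sequence-to-Sequence Classification Using Deep Learning and Sequence-to-Sequence Regression Using Deep Learning.
	Sequence-to-one regression	For an example, see Sequence-to-One Regression Using Deep Learning.
	Time series forecasting	For an example, see Time Series Forecasting Using Deep Learning.
Text	Classification and regression	Text Analytics Toolbox™ provides tools to create deep learning networks for text data. For an example, see Classify Text Data Using Deep Learning.
	Text generation	For an example, see Generate Text Using Deep Learning.
Audio	Audio classification and regression	Try different pretrained networks. For a list of pretrained deep learning networks, see Pretrained Models (Audio Toolbox). To learn how to programmatically prepare a network for transfer learning, see Transfer Learning with Pretrained Audio Networks (Audio Toolbox). To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Pretrained Audio Networks in Deep Network Designer. For an example showing how to classify sounds using deep learning, see Classify Sound Using Deep Learning (Audio Toolbox).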

Choose Training Options

The `trainingOptions` function provides a variety of options to train your deep learning network.

Tip	More Information
Monitor training progress	To turn on the training progress plot, set the `'Plots'` option in `trainingOptions` to `'training-progress'`.
Use validation data	To specify validation data, use the `'ValidationData'` option in `trainingOptions`. Note: If your validation data set is too small and does not sufficiently represent the data, then the reported metrics might not help you. Using a validation data set that is too large can result in slower training.
For transfer learning, speed up the learning of new layers and slow down the learning in the transferred layers	Specify higher learning rate factors for new layers by using, for example, the `WeightLearnRateFactor` property of `convolution2dLayer`. Decrease the initial learning rate using the `'InitialLearnRate'` option of `trainingOptions`. When transfer learning, you do not need to train for as many epochs. Decrease the number of epochs using the `'MaxEpochs'` option in `trainingOptions`. To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Deep Network Designer.
Shuffle your data every epoch	To shuffle your data every epoch (one full pass of the data), set the `'Shuffle'` option in `trainingOptions` to `'every-epoch'`. Note: For sequence data, shuffling can have a negative impact on the accuracy as it can increase the amount of padding or truncated data. If you have sequence data, then sorting the data by sequence length can help. To learn more, see Sequence Padding, Truncation, and Splitting.
Try different optimizers	To specify different optimizers, use the `solverName` argument in `trainingOptions`.

For more information, see Set Up Parameters and Train Convolutional Neural Network.
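As a rough illustration of how several of the options in the table can be combined, the following sketch builds one set of training options for a transfer learning run. The solver, all numeric values, and the placeholder arrays `XValidation` and `YValidation` are assumptions chosen for illustration, not recommended settings.

```matlab
% Placeholder validation data (assumption): 100 random 28-by-28 grayscale
% images with up to 10 labels. Replace with your own held-out data.
XValidation = rand(28,28,1,100);
YValidation = categorical(randi(10,100,1));

% A newly added layer that should learn faster than the transferred layers.
% The factor of 10 is an illustrative value.
newConv = convolution2dLayer(3,16, ...
    'WeightLearnRateFactor',10, ...
    'BiasLearnRateFactor',10);

% First argument is the solver name: 'sgdm', 'rmsprop', or 'adam'.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-4, ...                  % low global rate slows the transferred layers
    'MaxEpochs',6, ...                            % transfer learning needs fewer epochs
    'Shuffle','every-epoch', ...
    'ValidationData',{XValidation,YValidation}, ...
    'ValidationFrequency',30, ...
    'Plots','training-progress');
```

The effective learning rate of a layer is the global rate multiplied by the layer's factor, so a low `'InitialLearnRate'` combined with a high `WeightLearnRateFactor` on new layers is what produces the fast-new, slow-transferred behavior described in the table.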

Improve Training Accuracy

If you notice problems during training, then consider these possible solutions.

Problem	Possible Solution
NaNs or large spikes in the loss	Decrease the initial learning rate using the `'InitialLearnRate'` option of `trainingOptions`. If decreasing the learning rate does not help, then try using gradient clipping. To set the gradient threshold, use the `'GradientThreshold'` option in `trainingOptions`.
Loss is still decreasing at the end of training	Train for longer by increasing the number of epochs using the `'MaxEpochs'` option in `trainingOptions`.
Loss plateaus	If the loss plateaus at an unexpectedly high value, then drop the learning rate at the plateau. To change the learning rate schedule, use the `'LearnRateSchedule'` option in `trainingOptions`. If dropping the learning rate does not help, then the model might be underfitting. Try increasing the number of parameters or layers. You can check whether the model is underfitting by monitoring the validation loss.
Validation loss is much higher than the training loss	To prevent overfitting, try one or more of the following: Use data augmentation. For more information, see Train Network with Augmented Images. Use dropout layers. For more information, see `dropoutLayer`. Increase the global L₂ regularization factor using the `'L2Regularization'` option in `trainingOptions`.
Loss decreases very slowly	Increase the initial learning rate using the `'InitialLearnRate'` option of `trainingOptions`. For image data, try including batch normalization layers in your network. For more information, see `batchNormalizationLayer`.

For more information, see Set Up Parameters and Train Convolutional Neural Network.
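The remedies in the table map onto a handful of `trainingOptions` arguments and layer types. The sketch below shows one possible combination; the specific values (threshold, drop period, drop factor, regularization strength, dropout probability) and the small example architecture are illustrative assumptions and would need tuning for a real problem.

```matlab
% Training options addressing unstable, plateauing, or overfitting loss.
options = trainingOptions('adam', ...
    'InitialLearnRate',1e-3, ...
    'GradientThreshold',1, ...            % gradient clipping against NaNs and spikes
    'LearnRateSchedule','piecewise', ...  % drop the learning rate during training
    'LearnRateDropPeriod',10, ...         % every 10 epochs
    'LearnRateDropFactor',0.1, ...        % multiply the rate by 0.1 at each drop
    'L2Regularization',1e-3, ...          % stronger weight decay against overfitting
    'MaxEpochs',30);

% Example layer array (assumed 28-by-28 grayscale input, 10 classes) that
% includes batch normalization and dropout.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,16,'Padding','same')
    batchNormalizationLayer
    reluLayer
    dropoutLayer(0.5)
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];
```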

Fix Errors in Training

If your network does not train at all, then consider these possible solutions.

Error	Description	Possible Solution
Out-of-memory error when training	The available hardware is unable to store the current mini-batch, the network weights, and the computed activations.	Try reducing the mini-batch size using the `'MiniBatchSize'` option of `trainingOptions`. If reducing the mini-batch size does not work, then try using a smaller network, reducing the number of layers, or reducing the number of parameters or filters in the layers.
Custom layer errors	There could be an issue with the implementation of the custom layer.	Check the validity of the custom layer and find potential issues using `checkLayer`. If a test fails when you use `checkLayer`, then the function provides a test diagnostic and a framework diagnostic. The test diagnostic highlights any layer issues, whereas the framework diagnostic provides more detailed information. To learn more about the test diagnostics and get suggestions for possible solutions, see Diagnostics.
Training throws the error `'CUDA_ERROR_UNKNOWN'`	Sometimes, the GPU throws this error when it is being used for both compute and display requests from the OS.	Try reducing the mini-batch size using the `'MiniBatchSize'` option of `trainingOptions`. If reducing the mini-batch size does not work, then in Windows®, try adjusting the Timeout Detection and Recovery (TDR) settings. For example, change the `TdrDelay` from 2 seconds (default) to 4 seconds (requires registry edit).

You can analyze your deep learning network using `analyzeNetwork`. The `analyzeNetwork` function displays an interactive visualization of the network architecture, detects errors and issues with the network, and provides detailed information about the network layers. Use the network analyzer to visualize and understand the network architecture, check that you have defined the architecture correctly, and detect problems before training. Problems that `analyzeNetwork` detects include missing or disconnected layers, mismatched or incorrect sizes of layer inputs, an incorrect number of layer inputs, and invalid graph structures.
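A minimal sketch of these checks follows. The layer array is a small assumed example, the mini-batch size of 32 is an arbitrary smaller-than-default value, and `myCustomLayer` is a hypothetical custom layer class, so that part is left commented out.

```matlab
% Small example architecture (assumption: 28-by-28 grayscale input, 10 classes).
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,16,'Padding','same')
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

% Open the interactive analyzer and report any errors or warnings
% before training starts.
analyzeNetwork(layers)

% If training runs out of memory, try a mini-batch size below the
% default of 128.
options = trainingOptions('sgdm','MiniBatchSize',32);

% For a custom layer, checkLayer runs validity tests against a typical
% input size. myCustomLayer is a hypothetical class and [24 24 20] an
% assumed input size, so these lines are commented out.
% layer = myCustomLayer;
% checkLayer(layer,[24 24 20],'ObservationDimension',4)
```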

Prepare and Preprocess Data

You can improve the accuracy by preprocessing your data.

Weight or Balance Classes

Ideally, all classes have an equal number of observations. However, for some tasks, classes can be imbalanced. For example, automotive datasets of street scenes tend to have more sky, building, and road pixels than pedestrian and bicyclist pixels because the sky, buildings, and roads cover more image area. If not handled correctly, this imbalance can be detrimental to the learning process because the learning is biased in favor of the dominant classes.

For classification tasks, you can specify class weights using the `'ClassWeights'` option of `classificationLayer`. For an example, see Sequence Classification Using Inverse-Frequency Class Weights. For semantic segmentation tasks, you can specify class weights using the `ClassWeights` (Computer Vision Toolbox) property of `pixelClassificationLayer` (Computer Vision Toolbox).
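One common way to choose the weights is inverse class frequency, so that rare classes contribute more to the loss. The sketch below assumes a small hypothetical label vector and computes each weight as the number of observations divided by the number of classes times the class count. It is one possible weighting scheme, not the only one.

```matlab
% Hypothetical, imbalanced training labels.
labels = categorical(["car";"car";"car";"car";"truck";"pedestrian"]);

classes = categories(labels);     % class names, in category order
counts = countcats(labels);       % observations per class, same order
numObservations = numel(labels);
numClasses = numel(classes);

% Inverse-frequency weights: frequent classes get weights below 1,
% rare classes get weights above 1.
classWeights = (numObservations ./ (numClasses * counts))';

% 'ClassWeights' requires the classes to be specified explicitly.
outputLayer = classificationLayer( ...
    'Classes',classes, ...
    'ClassWeights',classWeights);
```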

Alternatively, you can balance the classes by doing one or more of the following:

Add new observations from the least frequent classes.
Remove observations from the most frequent classes.
Group similar classes. For example, group the classes "car" and "truck" into the single class "vehicle".

Preprocess Image Data

For more information about preprocessing image data, see Preprocess Images for Deep Learning.

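Task	More Information
Resize images	To use a pretrained network, you must resize images to the input size of the network. To resize images, use `augmentedImageDatastore`. For example, this syntax resizes images in the image datastore `imds`: `auimds = augmentedImageDatastore(inputSize,imds);` Tip: Use `augmentedImageDatastore` for efficient preprocessing of images for deep learning, including image resizing. Do not use the `readFcn` option of the `imageDatastore` function for preprocessing or resizing, as this option is usually significantly slower.
Image augmentation	To avoid overfitting, use image transformations. To learn more, see Train Network with Augmented Images.
Normalize regression targets	Normalize the predictors before you input them to the network. If you normalize the responses before training, then you must transform the predictions of the trained network to obtain the predictions of the original responses. For more information, see Train Convolutional Neural Network for Regression.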

Preprocess Sequence Data

For more information about working with LSTM networks, see Long Short-Term Memory Networks.

Task	More Information
Normalize sequence data	To normalize sequence data, first calculate the per-feature mean and standard deviation for all the sequences. Then, for each training observation, subtract the mean value and divide by the standard deviation. To learn more, see Normalize Sequence Data.
Reduce sequence padding and truncation	To reduce the amount of padding or discarded data when padding or truncating sequences, try sorting your data by sequence length. To learn more, see Sequence Padding, Truncation, and Splitting.
Specify mini-batch size and padding options for prediction	When you make predictions with sequences of different lengths, the mini-batch size can impact the amount of padding added to the input data, which can result in different predicted values. Try using different values to see which works best with your network. To specify mini-batch size and padding options, use the `'MiniBatchSize'` and `'SequenceLength'` options of the `classify`, `predict`, `classifyAndUpdateState`, and `predictAndUpdateState` functions.
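The normalization and sorting steps in the table can be sketched as follows. The data is a hypothetical cell array of random sequences, each stored as a numFeatures-by-numTimeSteps matrix, and `net` and `XTest` in the final comment are placeholders for a trained network and test sequences.

```matlab
% Hypothetical training data: three sequences with 3 features each and
% different lengths, plus one label per sequence.
XTrain = {rand(3,10); rand(3,25); rand(3,7)};
YTrain = categorical(["a";"b";"a"]);

% Per-feature mean and standard deviation over all time steps of all
% sequences.
allSteps = [XTrain{:}];
mu = mean(allSteps,2);
sigma = std(allSteps,0,2);

% Subtract the mean and divide by the standard deviation for each sequence.
XTrain = cellfun(@(X) (X - mu) ./ sigma, XTrain, 'UniformOutput',false);

% Sort observations by sequence length so that each mini-batch contains
% sequences of similar length and needs less padding.
sequenceLengths = cellfun(@(X) size(X,2), XTrain);
[~,idx] = sort(sequenceLengths);
XTrain = XTrain(idx);
YTrain = YTrain(idx);

% At prediction time, mini-batch size and padding can be set explicitly,
% for example (net and XTest are placeholders):
% YPred = classify(net,XTest,'MiniBatchSize',27,'SequenceLength','longest');
```

Sorting only helps if the order is preserved during training, so it is usually paired with setting the `'Shuffle'` option of `trainingOptions` to `'never'`. Apply the same `mu` and `sigma` to validation and test sequences.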

Use Available Hardware

To specify the execution environment, use the `'ExecutionEnvironment'` option in `trainingOptions`.

Problem	More Information
Training on CPU is slow	If training is too slow on a single CPU, try using a pretrained deep learning network as a feature extractor and train a machine learning model. For an example, see Extract Image Features Using Pretrained Network.
Training LSTM on GPU is slow	The CPU is better suited for training an LSTM network using mini-batches with short sequences. To use the CPU, set the `'ExecutionEnvironment'` option in `trainingOptions` to `'cpu'`.
Software does not use all available GPUs	If you have access to a machine with multiple GPUs, set the `'ExecutionEnvironment'` option in `trainingOptions` to `'multi-gpu'`. For more information, see Deep Learning with MATLAB on Multiple GPUs.

For more information, see Scale Up Deep Learning in Parallel, on GPUs, and in the Cloud.
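As a small sketch of selecting the environment in code, the snippet below falls back to the CPU when no supported GPU is available. It assumes a release that provides the `canUseGPU` function. The choice of solver is arbitrary.

```matlab
% Pick an execution environment based on the available hardware.
if canUseGPU
    env = 'gpu';
else
    env = 'cpu';
end

% Other accepted values include 'auto' (the default), 'multi-gpu',
% and 'parallel'.
options = trainingOptions('sgdm','ExecutionEnvironment',env);
```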

Fix Errors With Loading from MAT-Files

If you are unable to load layers or a network from a MAT-file and get a warning of the form

Warning: Unable to load instances of class layerType into a heterogeneous array. The definition of layerType could be missing or contain an error. Default objects will be substituted.
Warning: While loading an object of class 'SeriesNetwork': Error using 'forward' in Layer nnet.cnn.layer.MissingLayer. The function threw an error and could not be executed.

then the network in the MAT-file may contain unavailable layers. This could be due to the following:

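The file contains a custom layer not on the path – To load networks containing custom layers, add the custom layer files to the MATLAB® path.
The file contains a custom layer from a support package – To load networks using layers from support packages, install the required support package at the command line by using the corresponding function (for example, `resnet18`) or using the Add-On Explorer.
The file contains a custom layer from a documentation example that is not on the path – To load networks containing custom layers from documentation examples, open the example as a live script and copy the layer from the example folder to your working directory.
The file contains a layer from a toolbox that is not installed – To access layers from other toolboxes, for example, Computer Vision Toolbox or Text Analytics Toolbox, install the corresponding toolbox.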

After trying the suggested solutions, reload the MAT-file.
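For the first case in the list, a minimal sketch of adding custom layer files to the path and then reloading might look like this. The folder name `customLayers` and the file name `trainedNetwork.mat` are hypothetical, and the guards only keep the snippet from erroring when they do not exist.

```matlab
% Hypothetical folder containing the custom layer class files.
layerFolder = fullfile(pwd,'customLayers');
if isfolder(layerFolder)
    addpath(layerFolder);
end

% Hypothetical MAT-file containing the saved network.
matFile = 'trainedNetwork.mat';
if isfile(matFile)
    data = load(matFile);
    disp(fieldnames(data))   % list the variables stored in the file
end
```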