
Deep Learning Tips and Tricks

This page describes various training options and techniques for improving the accuracy of deep learning networks.

Choose Network Architecture

The appropriate network architecture depends on the task and the data available. Consider these suggestions when deciding which architecture to use and whether to use a pretrained network or to train from scratch.

Data Description of Task Learn More
Images Classification of natural images

Try different pretrained networks. For a list of pretrained deep learning networks, see Pretrained Deep Neural Networks.

To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Deep Network Designer.

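Regression of natural images Try different pretrained networks. For an example showing how to convert a pretrained classification network into a regression network, see Convert Classification Network into Regression Network.
Classification and regression of non-natural images (for example, tiny images and spectrograms)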

For an example showing how to classify tiny images, see Train Residual Network for Image Classification.

For an example showing how to classify spectrograms, see Speech Command Recognition Using Deep Learning.

Semantic segmentation Computer Vision Toolbox™ provides tools to create deep learning networks for semantic segmentation. For more information, see Getting Started with Semantic Segmentation Using Deep Learning (Computer Vision Toolbox).
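Sequences, time series, and signals Sequence-to-label classification For an example, see Sequence Classification Using Deep Learning.
Sequence-to-sequence classification and regression To learn more, see Sequence-to-Sequence Classification Using Deep Learning and Sequence-to-Sequence Regression Using Deep Learning.
Sequence-to-one regression For an example, see Sequence-to-One Regression Using Deep Learning.
Time series forecasting For an example, see Time Series Forecasting Using Deep Learning.
Text Classification and regression Text Analytics Toolbox™ provides tools to create deep learning networks for text data. For an example, see Classify Text Data Using Deep Learning.
Text generation For an example, see Generate Text Using Deep Learning.
Audio Audio classification and regression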

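Try different pretrained networks. For a list of pretrained deep learning networks, see Pretrained Models (Audio Toolbox).

To learn how to programmatically prepare a network for transfer learning, see Transfer Learning with Pretrained Audio Networks (Audio Toolbox). To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Pretrained Audio Networks in Deep Network Designer.

For an example showing how to classify sounds using deep learning, see Classify Sound Using Deep Learning (Audio Toolbox).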

Choose Training Options

The trainingOptions function provides a variety of options to train your deep learning network.

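Tip More Information
Monitor training progress To turn on the training progress plot, set the 'Plots' option in trainingOptions to 'training-progress'.
Use validation data

To specify validation data, use the 'ValidationData' option in trainingOptions.

Note

If your validation data set is too small and does not sufficiently represent the data, then the reported metrics might not help you. Using a validation data set that is too large can result in slower training.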

For transfer learning, speed up the learning of new layers and slow down the learning in the transferred layers

Specify higher learning rate factors for new layers by using, for example, the WeightLearnRateFactor property of convolution2dLayer.

Decrease the initial learning rate using the 'InitialLearnRate' option of trainingOptions.

When transfer learning, you do not need to train for as many epochs. Decrease the number of epochs using the 'MaxEpochs' option in trainingOptions.

To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Deep Network Designer.
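As a minimal sketch of the transfer learning tips above, the following replaces the final learnable layer with a faster-learning one and lowers the global learning rate. The number of classes, the layer name, and all numeric values are illustrative assumptions, and fullyConnectedLayer is used here in place of convolution2dLayer because it exposes the same learning rate factor properties.

```matlab
% Assumed values for illustration only; tune them for your own task.
numClasses = 5;                          % hypothetical number of classes in the new task

% New layer that learns 10 times faster than the transferred layers.
newFC = fullyConnectedLayer(numClasses, ...
    'Name','new_fc', ...                 % hypothetical layer name
    'WeightLearnRateFactor',10, ...
    'BiasLearnRateFactor',10);

% Small global learning rate and few epochs, so the transferred layers
% change slowly while the new layer catches up.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-4, ...
    'MaxEpochs',6);
```

The effective learning rate of the new layer is the global rate multiplied by its factor, so the transferred layers keep the small global rate.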

Shuffle your data every epoch

To shuffle your data every epoch (one full pass of the data), set the 'Shuffle' option in trainingOptions to 'every-epoch'.

Note

For sequence data, shuffling can have a negative impact on the accuracy as it can increase the amount of padding or truncated data. If you have sequence data, then sorting the data by sequence length can help. To learn more, see Sequence Padding, Truncation, and Splitting.

Try different optimizers

To specify different optimizers, use the solverName argument in trainingOptions.

For more information, see Set Up Parameters and Train Convolutional Neural Network.
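The options in this section can be combined in a single trainingOptions call. The sketch below is one possible combination, not a recommended configuration: XValidation and YValidation are hypothetical variables holding held-out predictors and responses, and the numeric values are placeholders.

```matlab
% XValidation and YValidation are hypothetical held-out data; replace them
% with your own validation predictors and responses.
options = trainingOptions('adam', ...              % solverName: 'sgdm', 'rmsprop', or 'adam'
    'Plots','training-progress', ...               % show the training progress plot
    'ValidationData',{XValidation,YValidation}, ...
    'ValidationFrequency',50, ...                  % validate every 50 iterations (illustrative)
    'Shuffle','every-epoch', ...                   % reshuffle before each pass over the data
    'MaxEpochs',10, ...
    'InitialLearnRate',1e-3);
```

To compare optimizers, change only the first argument and keep the remaining options fixed.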

Improve Training Accuracy

If you notice problems during training, then consider these possible solutions.

Problem Possible Solution
NaNs or large spikes in the loss

Decrease the initial learning rate using the 'InitialLearnRate' option of trainingOptions.

If decreasing the learning rate does not help, then try using gradient clipping. To set the gradient threshold, use the 'GradientThreshold' option in trainingOptions.

Loss is still decreasing at the end of training Train for longer by increasing the number of epochs using the 'MaxEpochs' option in trainingOptions.
Loss plateaus

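If the loss plateaus at an unexpectedly high value, then drop the learning rate at the plateau. To change the learning rate schedule, use the 'LearnRateSchedule' option in trainingOptions.

If dropping the learning rate does not help, then the model might be underfitting. Try increasing the number of parameters or layers. You can check if the model is underfitting by monitoring the validation loss.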

Validation loss is much higher than the training loss

To prevent overfitting, try one or more of the following:

Loss decreases very slowly

Increase the initial learning rate using the 'InitialLearnRate' option of trainingOptions.

For image data, try including batch normalization layers in your network. For more information, see batchNormalizationLayer.

For more information, see Set Up Parameters and Train Convolutional Neural Network.
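Several of the remedies in this section are set through trainingOptions. The following sketch shows gradient clipping together with a piecewise learning rate schedule; the threshold, drop period, and drop factor are illustrative assumptions rather than recommended values.

```matlab
% Illustrative values only; choose them by watching the training plot.
options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-3, ...
    'GradientThreshold',1, ...           % clip gradients to limit loss spikes and NaNs
    'LearnRateSchedule','piecewise', ... % drop the learning rate during training
    'LearnRateDropPeriod',10, ...        % every 10 epochs
    'LearnRateDropFactor',0.1, ...       % multiply the learning rate by 0.1
    'MaxEpochs',30);
```

With these settings the learning rate is 1e-3 for epochs 1 to 10, 1e-4 for epochs 11 to 20, and 1e-5 for epochs 21 to 30.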

Fix Errors in Training

If your network does not train at all, then consider the possible solutions.

Error Description Possible Solution
Out-of-memory error when training The available hardware is unable to store the current mini-batch, the network weights, and the computed activations.

Try reducing the mini-batch size using the 'MiniBatchSize' option of trainingOptions.

If reducing the mini-batch size does not work, then try using a smaller network, reducing the number of layers, or reducing the number of parameters or filters in the layers.

Custom layer errors There could be an issue with the implementation of the custom layer.

Check the validity of the custom layer and find potential issues using checkLayer.

If a test fails when you use checkLayer, then the function provides a test diagnostic and a framework diagnostic. The test diagnostic highlights any layer issues, whereas the framework diagnostic provides more detailed information. To learn more about the test diagnostics and get suggestions for possible solutions, see Diagnostics.
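A call to checkLayer might look like the sketch below. Here myCustomLayer is a hypothetical custom layer class that must already be on the path, and the input size is an assumed size for a single observation.

```matlab
layer = myCustomLayer;            % hypothetical custom layer class on the MATLAB path
validInputSize = [24 24 20];      % assumed height, width, and channels of one observation

% The fourth dimension is the observation dimension for image input, which
% lets checkLayer also test the layer with multiple observations.
checkLayer(layer,validInputSize,'ObservationDimension',4)
```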

Training throws the error 'CUDA_ERROR_UNKNOWN' Sometimes, the GPU throws this error when it is being used for both compute and display requests from the OS.

Try reducing the mini-batch size using the 'MiniBatchSize' option of trainingOptions.

If reducing the mini-batch size does not work, then in Windows®, try adjusting the Timeout Detection and Recovery (TDR) settings. For example, change the TdrDelay from 2 seconds (default) to 4 seconds (requires registry edit).

You can analyze your deep learning network using analyzeNetwork. The analyzeNetwork function displays an interactive visualization of the network architecture, detects errors and issues with the network, and provides detailed information about the network layers. Use the network analyzer to visualize and understand the network architecture, check that you have defined the architecture correctly, and detect problems before training. Problems that analyzeNetwork detects include missing or disconnected layers, mismatched or incorrect sizes of layer inputs, an incorrect number of layer inputs, and invalid graph structures.
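For example, a small layer array can be checked before training as sketched below. The architecture is an arbitrary illustration, not a suggested design.

```matlab
% Arbitrary small image classification network, used only for illustration.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8,'Padding','same')
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

% Opens the interactive analyzer and reports any errors or warnings.
analyzeNetwork(layers)
```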

Prepare and Preprocess Data

You can improve the accuracy by preprocessing your data.

Weight or Balance Classes

Ideally, all classes have an equal number of observations. However, for some tasks, classes can be imbalanced. For example, automotive datasets of street scenes tend to have more sky, building, and road pixels than pedestrian and bicyclist pixels because the sky, buildings, and roads cover more image area. If not handled correctly, this imbalance can be detrimental to the learning process because the learning is biased in favor of the dominant classes.

For classification tasks, you can specify class weights using the 'ClassWeights' option of classificationLayer. For an example, see Sequence Classification Using Inverse-Frequency Class Weights. For semantic segmentation tasks, you can specify class weights using the ClassWeights (Computer Vision Toolbox) property of pixelClassificationLayer (Computer Vision Toolbox).

Alternatively, you can balance the classes by doing one or more of the following:

  • Add new observations from the least frequent classes.

  • Remove observations from the most frequent classes.

  • Group similar classes. For example, group the classes "car" and "truck" into the single class "vehicle".
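As an alternative to rebalancing the data, class weights can be derived from the label counts. The sketch below uses inverse-frequency weighting, which is one common choice and an assumption here; YTrain is a small made-up label vector.

```matlab
% Made-up labels for illustration; "car" dominates the data set.
YTrain = categorical(["car";"car";"car";"car";"truck";"bike"]);

classes = categories(YTrain);        % class names, in category order
counts = countcats(YTrain);          % observations per class, in the same order

% Inverse-frequency weights, rescaled so that the mean weight is 1.
classWeights = 1 ./ counts;
classWeights = classWeights' / mean(classWeights);

outputLayer = classificationLayer( ...
    'Classes',classes, ...
    'ClassWeights',classWeights);
```

Rare classes receive larger weights, so their errors contribute more to the loss.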

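Preprocess Image Data

For more information about preprocessing image data, see Preprocess Images for Deep Learning.

Task More Information
Resize images

To use a pretrained network, you must resize images to the input size of the network. To resize images, use augmentedImageDatastore. For example, this syntax resizes images in the image datastore imds: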

auimds = augmentedImageDatastore(inputSize,imds);

Tip

Use augmentedImageDatastore for efficient preprocessing of images for deep learning, including image resizing.

Do not use the readFcn option of the imageDatastore function for preprocessing or resizing, as this option is usually significantly slower.

Image augmentation

To avoid overfitting, use image transformation. To learn more, see Train Network with Augmented Images.
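Random transformations can be applied while resizing by passing an imageDataAugmenter to augmentedImageDatastore. In this sketch, imds is assumed to be an existing image datastore, and the input size and translation ranges are illustrative.

```matlab
% imds is assumed to be an existing imageDatastore of training images.
inputSize = [224 224 3];                 % assumed network input size

augmenter = imageDataAugmenter( ...
    'RandXReflection',true, ...          % random left-right flips
    'RandXTranslation',[-10 10], ...     % random horizontal shift, in pixels
    'RandYTranslation',[-10 10]);        % random vertical shift, in pixels

auimds = augmentedImageDatastore(inputSize,imds, ...
    'DataAugmentation',augmenter);
```

The transformations are applied on the fly during training, so the augmented images are not stored.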

Normalize regression targets

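Normalize the predictors before you input them to the network. If you normalize the responses before training, then you must transform the predictions of the trained network to obtain the predictions of the original responses.

For more information, see Train Convolutional Neural Network for Regression.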
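The round trip for normalized responses can be sketched as follows. The response values are made up, and the normalized responses stand in for network output so that the snippet runs without a trained network.

```matlab
YTrain = [10; 20; 30; 40];               % made-up regression responses

% Normalize the responses with training-set statistics.
mu = mean(YTrain);
sigma = std(YTrain);
YTrainNorm = (YTrain - mu) / sigma;

% After training on YTrainNorm, the network predicts in normalized units.
YPredNorm = YTrainNorm;                  % stand-in for predictions from the trained network

% Undo the normalization to recover predictions in the original units.
YPred = YPredNorm * sigma + mu;
```

Keep mu and sigma with the trained network, because the same values are needed at prediction time.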

Preprocess Sequence Data

For more information about working with LSTM networks, see Long Short-Term Memory Networks.

Task More Information
Normalize sequence data

To normalize sequence data, first calculate the per-feature mean and standard deviation for all the sequences. Then, for each training observation, subtract the mean value and divide by the standard deviation.

To learn more, see Normalize Sequence Data.
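The normalization described above can be sketched as follows, assuming that XTrain is a cell array in which each element is a numFeatures-by-numTimeSteps matrix. The random data is for illustration only.

```matlab
% Three made-up sequences with 3 features and different lengths.
XTrain = {rand(3,10); rand(3,7); rand(3,12)};

% Concatenate all time steps to compute per-feature statistics.
allSteps = [XTrain{:}];                  % 3-by-29 matrix
mu = mean(allSteps,2);                   % per-feature mean
sigma = std(allSteps,0,2);               % per-feature standard deviation

% Normalize every sequence with the same statistics.
XTrain = cellfun(@(X) (X - mu) ./ sigma,XTrain,'UniformOutput',false);
```

Apply the same mu and sigma to validation and test sequences rather than recomputing them.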

Reduce sequence padding and truncation

To reduce the amount of padding or discarded data when padding or truncating sequences, try sorting your data by sequence length.

To learn more, see Sequence Padding, Truncation, and Splitting.
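Sorting by sequence length might look like the sketch below. XTrain and YTrain are made-up data in which each sequence is a numFeatures-by-numTimeSteps matrix, and the mini-batch size is illustrative.

```matlab
% Made-up sequences and labels for illustration.
XTrain = {rand(3,10); rand(3,4); rand(3,7)};
YTrain = categorical(["a";"b";"a"]);

% Sort the sequences, and their labels, by number of time steps.
sequenceLengths = cellfun(@(X) size(X,2),XTrain);
[~,idx] = sort(sequenceLengths);
XTrain = XTrain(idx);
YTrain = YTrain(idx);

% Disable shuffling so that mini-batches keep sequences of similar length.
options = trainingOptions('adam', ...
    'Shuffle','never', ...
    'MiniBatchSize',2);
```

Because each mini-batch is padded only to its own longest sequence, grouping similar lengths reduces the padding added per batch.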

Specify mini-batch size and padding options for prediction

When you make predictions with sequences of different lengths, the mini-batch size can impact the amount of padding added to the input data, which can result in different predicted values. Try using different values to see which works best with your network.

To specify mini-batch size and padding options, use the 'MiniBatchSize' and 'SequenceLength' options of the classify, predict, classifyAndUpdateState, and predictAndUpdateState functions.
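For example, a prediction call with explicit padding settings could look like this sketch. Here net is assumed to be a trained sequence classification network and XTest a cell array of test sequences; both are hypothetical, and the mini-batch size is illustrative.

```matlab
% net and XTest are hypothetical: a trained network and its test sequences.
YPred = classify(net,XTest, ...
    'MiniBatchSize',27, ...              % illustrative; try several values
    'SequenceLength','longest');         % pad each mini-batch to its longest sequence
```

Using the same mini-batch size and padding settings as in training is a reasonable starting point.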

Use Available Hardware

To specify the execution environment, use the 'ExecutionEnvironment' option in trainingOptions.

Problem More Information
Training on CPU is slow If training is too slow on a single CPU, try using a pretrained deep learning network as a feature extractor and train a machine learning model. For an example, see Extract Image Features Using Pretrained Network.
Training LSTM on GPU is slow

The CPU is better suited for training an LSTM network using mini-batches with short sequences. To use the CPU, set the 'ExecutionEnvironment' option in trainingOptions to 'cpu'.

Software does not use all available GPUs If you have access to a machine with multiple GPUs, simply set the 'ExecutionEnvironment' option in trainingOptions to 'multi-gpu'. For more information, see Deep Learning with MATLAB on Multiple GPUs.

For more information, see Scale Up Deep Learning in Parallel, on GPUs, and in the Cloud.
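The two settings mentioned in the table can be written as follows. The solver choices are assumptions made for illustration.

```matlab
% Train an LSTM network with short sequences on the CPU.
optionsCPU = trainingOptions('adam', ...
    'ExecutionEnvironment','cpu');

% Use all available GPUs on one machine (requires Parallel Computing Toolbox).
optionsMultiGPU = trainingOptions('sgdm', ...
    'ExecutionEnvironment','multi-gpu');
```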

Fix Errors With Loading from MAT-Files

If you are unable to load layers or a network from a MAT-file and get a warning of the form

Warning: Unable to load instances of class layerType into a heterogeneous array. The definition of layerType could be missing or contain an error. Default objects will be substituted.
Warning: While loading an object of class 'SeriesNetwork': Error using 'forward' in Layer nnet.cnn.layer.MissingLayer. The function threw an error and could not be executed.
then the network in the MAT-file may contain unavailable layers. This could be due to the following:

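  • The file contains a custom layer not on the path – To load networks containing custom layers, add the custom layer files to the MATLAB® path.

  • The file contains a custom layer from a support package – To load networks using layers from support packages, install the required support package at the command line by using the corresponding function (for example, resnet18) or using the Add-On Explorer.

  • The file contains a custom layer from a documentation example that is not on the path – To load networks containing custom layers from documentation examples, open the example as a Live Script and copy the layer from the example folder to your working directory.

  • The file contains a layer from a toolbox that is not installed – To access layers from other toolboxes, for example, Computer Vision Toolbox or Text Analytics Toolbox, install the corresponding toolbox.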

After trying the suggested solutions, reload the MAT-file.
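For the first case, adding the layer files to the path and reloading might look like the sketch below. The folder name, MAT-file name, and variable name are all hypothetical.

```matlab
addpath('customLayers');                 % hypothetical folder containing the custom layer class files

data = load('trainedNetwork.mat');       % hypothetical MAT-file name
net = data.net;                          % assumes the network was saved in a variable named net
```

If the warning persists, check that the custom layer class name matches the name stored in the MAT-file.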
