Deep Learning Tips and Tricks
This page describes various training options and techniques for improving the accuracy of deep learning networks.
Choose Network Architecture
The appropriate network architecture depends on the task and the data available. Consider these suggestions when deciding which architecture to use and whether to use a pretrained network or to train from scratch.
Data | Description of Task | Learn More |
---|---|---|
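Images | Classification of natural images | Try different pretrained networks. For a list of pretrained deep learning networks, see Pretrained Deep Neural Networks. To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Deep Network Designer. |
Images | Regression of natural images | Try different pretrained networks. For an example showing how to convert a pretrained classification network into a regression network, see Convert Classification Network into Regression Network. |
Images | Classification and regression of non-natural images (for example, tiny images and spectrograms) | For an example showing how to classify tiny images, see Train Residual Network for Image Classification. For an example showing how to classify spectrograms, see Speech Command Recognition Using Deep Learning. |
Images | Semantic segmentation | Computer Vision Toolbox™ provides tools to create deep learning networks for semantic segmentation. For more information, see Getting Started with Semantic Segmentation Using Deep Learning (Computer Vision Toolbox). |
Sequences, time series, and signals | Sequence-to-label classification | For an example, see Sequence Classification Using Deep Learning. |
Sequences, time series, and signals | Sequence-to-sequence classification and regression | To learn more, see Sequence-to-Sequence Classification Using Deep Learning and Sequence-to-Sequence Regression Using Deep Learning. |
Sequences, time series, and signals | Sequence-to-one regression | For an example, see Sequence-to-One Regression Using Deep Learning. |
Sequences, time series, and signals | Time series forecasting | For an example, see Time Series Forecasting Using Deep Learning. |
Text | Classification and regression | Text Analytics Toolbox™ provides tools to create deep learning networks for text data. For an example, see Classify Text Data Using Deep Learning. |
Text | Text generation | For an example, see Generate Text Using Deep Learning. |
Audio | Audio classification and regression | Try different pretrained networks. For a list of pretrained deep learning networks, see Pretrained Models (Audio Toolbox). To learn how to programmatically prepare a network for transfer learning, see Transfer Learning with Pretrained Audio Networks (Audio Toolbox). To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Pretrained Audio Networks in Deep Network Designer. For an example showing how to classify sounds using deep learning, see Classify Sound Using Deep Learning (Audio Toolbox). |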
Choose Training Options
The trainingOptions function provides a variety of options to train your deep learning network.
Tip | More Information |
---|---|
Monitor training progress | To turn on the training progress plot, set the 'Plots' option in trainingOptions to 'training-progress'. |
Use validation data | To specify validation data, use the validation data option in trainingOptions. Note: If your validation data set is too small and does not sufficiently represent the data, then the reported metrics might not help you. Using a validation data set that is too large can result in slower training. |
For transfer learning, speed up the learning of new layers and slow down the learning in the transferred layers | Specify higher learning rate factors for the new layers by using, for example, the learning rate factor properties of those layers. Decrease the initial learning rate using the initial learning rate option of trainingOptions. When transfer learning, you do not need to train for as many epochs. Decrease the number of epochs using the 'MaxEpochs' option in trainingOptions. To learn how to interactively prepare a network for transfer learning using Deep Network Designer, see Transfer Learning with Deep Network Designer. |
Shuffle your data every epoch | To shuffle your data every epoch (one full pass of the data), set the shuffle option in trainingOptions accordingly. Note: For sequence data, shuffling can have a negative impact on accuracy because it can increase the amount of padding or truncated data. If you have sequence data, then sorting the data by sequence length can help. To learn more, see Sequence Padding, Truncation, and Splitting. |
Try different optimizers | To specify different optimizers, use the solver name argument of trainingOptions. |
For more information, see Set Up Parameters and Train Convolutional Neural Network.
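The tips in the table above can be combined in a single call to trainingOptions. The following is a minimal sketch, not a recommended configuration. The option names 'ValidationData', 'Shuffle', 'InitialLearnRate', 'WeightLearnRateFactor', and 'BiasLearnRateFactor' are not spelled out on this page; they are the Deep Learning Toolbox names as assumed here, and the validation data and class count are hypothetical placeholders.

```matlab
% Hypothetical placeholder validation data: ten random 28-by-28 grayscale
% images with random labels from three classes. Replace with held-out data.
XValidation = rand(28,28,1,10);
YValidation = categorical(randi(3,10,1));

% The first argument is the solver name, for example 'sgdm', 'rmsprop', or 'adam'.
options = trainingOptions('adam', ...
    'Plots','training-progress', ...                 % training progress plot
    'ValidationData',{XValidation,YValidation}, ...  % assumed option name
    'Shuffle','every-epoch', ...                     % assumed option name and value
    'InitialLearnRate',1e-4, ...                     % assumed option name
    'MaxEpochs',5);                                  % fewer epochs for transfer learning

% For transfer learning, a new final layer can learn faster than the
% transferred layers. numClasses = 5 is a hypothetical value.
numClasses = 5;
newLayer = fullyConnectedLayer(numClasses, ...
    'WeightLearnRateFactor',10, ...
    'BiasLearnRateFactor',10);
```

A small initial learning rate slows learning in the transferred layers, while the learning rate factors scale it back up for the new layer only.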
Improve Training Accuracy
If you notice problems during training, then consider these possible solutions.
Problem | Possible Solution |
---|---|
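NaNs or large spikes in the loss | Decrease the initial learning rate using the initial learning rate option of trainingOptions. If decreasing the learning rate does not help, then try using gradient clipping. To set the gradient threshold, use the gradient threshold option in trainingOptions. |
Loss is still decreasing at the end of training | Train for longer by increasing the number of epochs using the 'MaxEpochs' option in trainingOptions. |
Loss plateaus | If the loss plateaus at an unexpectedly high value, then drop the learning rate at the plateau. To change the learning rate schedule, use the learning rate schedule option in trainingOptions. If dropping the learning rate does not help, then the model might be underfitting. Try increasing the number of parameters or layers. You can check whether the model is underfitting by monitoring the validation loss. |
Validation loss is much higher than the training loss | To prevent overfitting, try one or more of the following: use data augmentation, add dropout layers, or increase the L2 regularization. |
Loss decreases very slowly | Increase the initial learning rate using the initial learning rate option of trainingOptions. For image data, try including batch normalization layers in your network. |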
For more information, see Set Up Parameters and Train Convolutional Neural Network.
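Several of the remedies in the table above map to training options. The sketch below shows one possible combination. The option names 'InitialLearnRate', 'GradientThreshold', 'LearnRateSchedule', 'LearnRateDropFactor', 'LearnRateDropPeriod', and 'L2Regularization' are assumed from the Deep Learning Toolbox rather than taken from this page, and the numeric values are illustrative only.

```matlab
options = trainingOptions('sgdm', ...
    'InitialLearnRate',1e-3, ...          % lower this if the loss spikes or becomes NaN
    'GradientThreshold',1, ...            % gradient clipping
    'LearnRateSchedule','piecewise', ...  % drop the learning rate during training
    'LearnRateDropFactor',0.1, ...        % multiply the rate by 0.1 at each drop
    'LearnRateDropPeriod',10, ...         % drop every 10 epochs
    'L2Regularization',1e-3, ...          % stronger regularization against overfitting
    'MaxEpochs',30);                      % train longer if the loss is still decreasing
```

It is usually easier to change one option at a time and compare the training and validation loss curves after each change.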
Fix Errors in Training
If your network does not train at all, then consider these possible solutions.
Error | Description | Possible Solution |
---|---|---|
Out-of-memory error when training | The available hardware is unable to store the current mini-batch, the network weights, and the computed activations. | Try reducing the mini-batch size using the mini-batch size option of trainingOptions. If reducing the mini-batch size does not work, then try using a smaller network, reducing the number of layers, or reducing the number of parameters or filters in the layers. |
Custom layer errors | There could be an issue with the implementation of the custom layer. | Check the validity of the custom layer using checkLayer. If a test fails when you use checkLayer, use the diagnostics that the function reports to locate the issue. |
Training throws the error 'CUDA_ERROR_UNKNOWN' | Sometimes, the GPU throws this error when it is being used for both compute and display requests from the OS. | Try reducing the mini-batch size using the mini-batch size option of trainingOptions. If reducing the mini-batch size does not work, then in Windows®, try adjusting the Timeout Detection and Recovery (TDR) settings. |
You can analyze your deep learning network using analyzeNetwork. The analyzeNetwork function displays an interactive visualization of the network architecture, detects errors and issues with the network, and provides detailed information about the network layers. Use the network analyzer to visualize and understand the network architecture, check that you have defined the architecture correctly, and detect problems before training. Problems that analyzeNetwork detects include missing or disconnected layers, mismatched or incorrect sizes of layer inputs, an incorrect number of layer inputs, and invalid graph structures.
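As an illustration of checking an architecture before training, the following sketch builds a small image classification network and passes it to analyzeNetwork. The layer sizes are hypothetical, and the call opens an interactive window, so it needs a MATLAB session with a display.

```matlab
% Hypothetical small network for 28-by-28 grayscale images and 10 classes.
layers = [
    imageInputLayer([28 28 1])
    convolution2dLayer(3,8,'Padding','same')
    batchNormalizationLayer
    reluLayer
    fullyConnectedLayer(10)
    softmaxLayer
    classificationLayer];

% Opens the interactive analyzer and reports any errors or warnings.
analyzeNetwork(layers)
```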
Prepare and Preprocess Data
You can improve the accuracy by preprocessing your data.
Weight or Balance Classes
Ideally, all classes have an equal number of observations. However, for some tasks, classes can be imbalanced. For example, automotive datasets of street scenes tend to have more sky, building, and road pixels than pedestrian and bicyclist pixels because the sky, buildings, and roads cover more image area. If not handled correctly, this imbalance can be detrimental to the learning process because the learning is biased in favor of the dominant classes.
For classification tasks, you can specify class weights using the 'ClassWeights' option of classificationLayer. For an example, see Sequence Classification Using Inverse-Frequency Class Weights. For semantic segmentation tasks, you can specify class weights using the ClassWeights (Computer Vision Toolbox) property of pixelClassificationLayer (Computer Vision Toolbox).
Alternatively, you can balance the classes by doing one or more of the following:
Add new observations from the least frequent classes.
Remove observations from the most frequent classes.
Group similar classes. For example, group the classes "car" and "truck" into the single class "vehicle".
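One common way to choose the class weights mentioned above is inverse-frequency weighting, where rarer classes receive larger weights. The sketch below computes such weights for a hypothetical label vector. Normalizing the weights to a mean of 1 is a design choice made here, and the 'Classes' option of classificationLayer is assumed rather than taken from this page.

```matlab
% Hypothetical imbalanced labels: 6 "road", 3 "sky", and 1 "pedestrian".
labels = categorical([repmat("road",6,1); repmat("sky",3,1); "pedestrian"]);

classes = categories(labels);      % class names in alphabetical order
counts = countcats(labels);        % observations per class, in the same order

% Inverse-frequency weights as a row vector, normalized to a mean of 1.
classWeights = 1 ./ counts;
classWeights = classWeights' / mean(classWeights);

% Pass the weights to the output layer (assumed 'Classes' option).
layer = classificationLayer('Classes',classes,'ClassWeights',classWeights);
```

With these counts, the single "pedestrian" observation receives the largest weight and the six "road" observations receive the smallest.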
Preprocess Image Data
For more information about preprocessing image data, see Preprocess Images for Deep Learning.
Task | More Information |
---|---|
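Resize images | To use a pretrained network, you must resize images to the input size of the network. To resize images, use augmentedImageDatastore. For example: auimds = augmentedImageDatastore(inputSize,imds); |
Image augmentation | To avoid overfitting, use image transformations. To learn more, see Train Network with Augmented Images. |
Normalize regression targets | Normalize the predictors before you input them to the network. If you normalize the responses before training, then you must transform the predictions of the trained network to obtain the predictions of the original responses. For more information, see Train Convolutional Neural Network for Regression. |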
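The last row of the table above implies a round trip: normalize the responses for training, then undo the normalization on the predictions. The following sketch shows the arithmetic with hypothetical response values. The vector YPredNorm stands in for network output and is not produced by a real network here.

```matlab
% Hypothetical training responses, for example rotation angles in degrees.
YTrain = [-30; -10; 0; 15; 45];

% Normalize the responses to zero mean and unit standard deviation.
mu = mean(YTrain);
sigma = std(YTrain);
YTrainNorm = (YTrain - mu) / sigma;   % train the network on these values

% Stand-in for predictions of the trained network on the normalized scale.
YPredNorm = [0.5; -1.2];

% Transform the predictions back to the units of the original responses.
YPred = YPredNorm * sigma + mu;
```

Store mu and sigma with the trained network, because the same values are needed every time you make predictions.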
Preprocess Sequence Data
For more information about working with LSTM networks, see Long Short-Term Memory Networks.
Task | More Information |
---|---|
Normalize sequence data | To normalize sequence data, first calculate the per-feature mean and standard deviation for all the sequences. Then, for each training observation, subtract the mean value and divide by the standard deviation. To learn more, see Normalize Sequence Data. |
Reduce sequence padding and truncation | To reduce the amount of padding or discarded data when padding or truncating sequences, try sorting your data by sequence length. |
Specify mini-batch size and padding options for prediction | When you make predictions with sequences of different lengths, the mini-batch size can impact the amount of padding added to the input data, which can result in different predicted values. Try using different values to see which works best with your network. To specify mini-batch size and padding options, use the corresponding options of the prediction functions. |
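The first two rows of the table above can be sketched in a few lines. The example assumes the sequences are stored in a cell array in which each element is a numFeatures-by-numTimeSteps matrix; the data itself is random and hypothetical.

```matlab
% Hypothetical data: three sequences with 2 features and different lengths.
XTrain = {rand(2,5); rand(2,3); rand(2,8)};

% Per-feature mean and standard deviation over all time steps of all sequences.
XAll = [XTrain{:}];                 % concatenate along time: 2-by-16
mu = mean(XAll,2);
sigma = std(XAll,0,2);

% Subtract the mean and divide by the standard deviation for each observation.
XTrain = cellfun(@(X) (X - mu) ./ sigma, XTrain, 'UniformOutput',false);

% Sort by sequence length so that each mini-batch holds sequences of similar
% length, which reduces padding.
sequenceLengths = cellfun(@(X) size(X,2), XTrain);
[~,idx] = sort(sequenceLengths);
XTrain = XTrain(idx);
```

If you have labels or responses, reorder them with the same idx so that they stay aligned with the sequences. Apply the training mu and sigma to validation and test data as well.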
Use Available Hardware
To specify the execution environment, use the 'ExecutionEnvironment' option in trainingOptions.
Problem | More Information |
---|---|
Training on CPU is slow | If training is too slow on a single CPU, try using a pretrained deep learning network as a feature extractor and train a machine learning model. For an example, see Extract Image Features Using Pretrained Network. |
Training LSTM on GPU is slow | The CPU is better suited for training an LSTM network using mini-batches with short sequences. To use the CPU, select it through the 'ExecutionEnvironment' option in trainingOptions. |
Software does not use all available GPUs | If you have access to a machine with multiple GPUs, simply set the 'ExecutionEnvironment' option in trainingOptions to 'multi-gpu'. For more information, see Deep Learning with MATLAB on Multiple GPUs. |
For more information, see Scale Up Deep Learning in Parallel, on GPUs, and in the Cloud.
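The choice of execution environment can also be made at run time. The sketch below picks a value based on the hardware it finds. The values 'cpu' and 'gpu' and the functions canUseGPU and gpuDeviceCount are assumptions not named on this page, and gpuDeviceCount requires Parallel Computing Toolbox.

```matlab
% Choose an execution environment from the available hardware.
if ~canUseGPU
    env = 'cpu';            % no supported GPU available
elseif gpuDeviceCount > 1
    env = 'multi-gpu';      % use all local GPUs
else
    env = 'gpu';            % single GPU
end

options = trainingOptions('sgdm','ExecutionEnvironment',env);
```

For an LSTM network trained on mini-batches of short sequences, you might set env to 'cpu' directly instead of relying on this check.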
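Fix Errors With Loading from MAT-Files
If you are unable to load layers or a network from a MAT-file and get a warning of the form
Warning: Unable to load instances of class layerType into a heterogeneous array. The definition of layerType could be missing or contain an error. Default objects will be substituted.
Warning: While loading an object of class 'SeriesNetwork': Error using 'forward' in Layer nnet.cnn.layer.MissingLayer. The function threw an error and could not be executed.
then the network in the MAT-file might contain unavailable layers. Possible causes include the following: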
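The file contains a custom layer not on the path – To load networks containing custom layers, add the custom layer files to the MATLAB® path.
The file contains a custom layer from a support package – To load networks using layers from support packages, install the required support package at the command line by using the corresponding function (for example, resnet18) or using the Add-On Explorer.
The file contains a custom layer from a documentation example that is not on the path – To load networks containing custom layers from documentation examples, open the example as a live script and copy the layer from the example folder to your working directory.
The file contains a layer from a toolbox that is not installed – To access layers from other toolboxes, for example, Computer Vision Toolbox or Text Analytics Toolbox, install the corresponding toolbox.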
After trying the suggested solutions, reload the MAT-file.
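For the first cause, adding the layer files to the path and reloading might look like the sketch below. The folder name customLayers and the file name trainedNetwork.mat are hypothetical placeholders, and the existence checks only keep the sketch from erroring when they are absent.

```matlab
% Hypothetical folder that holds the custom layer class files.
layerFolder = fullfile(pwd,'customLayers');
if isfolder(layerFolder)
    addpath(layerFolder)           % make the layer definitions visible to MATLAB
end

% Hypothetical MAT-file that contains the saved network.
matFile = 'trainedNetwork.mat';
if isfile(matFile)
    S = load(matFile);             % reload after fixing the path
    disp(fieldnames(S))            % list the variables stored in the file
end
```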
See Also
trainingOptions | checkLayer | analyzeNetwork | Deep Network Designer