
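Activity Recognition from Video and Optical Flow Data Using Deep Learning

This example shows how to train an Inflated 3-D (I3D) two-stream convolutional neural network for activity recognition using RGB and optical flow data from videos [1].

Vision-based activity recognition involves predicting the action of an object, such as walking, swimming, or sitting, using a set of video frames. Activity recognition from video has many applications, such as human-computer interaction, robot learning, anomaly detection, surveillance, and object detection. For example, online prediction of multiple actions for incoming videos from multiple cameras can be important for robot learning. Compared to image classification, action recognition from video is harder to model for three reasons: video data sets have noisy labels, the actions that actors perform are varied and heavily class imbalanced, and pretraining on large video data sets is computationally expensive. Some deep learning techniques, such as I3D two-stream convolutional networks [1], have shown improved performance by leveraging pretraining on large image classification data sets.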

Load Data

This example trains an I3D network using the HMDB51 data set. Use the downloadHMDB51 supporting function, listed at the end of this example, to download the HMDB51 data set to a folder named hmdb51.

downloadFolder = fullfile(tempdir,"hmdb51");
downloadHMDB51(downloadFolder);

After the download is complete, extract the RAR file hmdb51_org.rar to the hmdb51 folder. Next, use the checkForHMDB51Folder supporting function, listed at the end of this example, to confirm that the downloaded and extracted files are in place.

allClasses = checkForHMDB51Folder(downloadFolder);

The data set contains about 2 GB of video data for 7000 clips over 51 classes, such as drink, run, and shake hands. Each video frame has a height of 240 pixels and a minimum width of 176 pixels. The number of frames ranges from 18 to approximately 1000.

To reduce training time, this example trains an activity recognition network to classify 5 action classes instead of all 51 classes in the data set. Set useAllData to true to train with all 51 classes.

useAllData = false;
if useAllData
    classes = allClasses;
else
    classes = ["kiss","laugh","pick","pour","pushup"];
end
dataFolder = fullfile(downloadFolder,"hmdb51_org");

Split the data set into a training set for training the network and a test set for evaluating the network. Use 80% of the data for the training set and the rest for the test set. Use imageDatastore to split the data by label into training and test sets by randomly selecting a proportion of the files from each label.

imds = imageDatastore(fullfile(dataFolder,classes),...
    "IncludeSubfolders",true,...
    "LabelSource","foldernames",...
    "FileExtensions",".avi");
[trainImds,testImds] = splitEachLabel(imds,0.8,"randomized");
trainFilenames = trainImds.Files;
testFilenames = testImds.Files;

To normalize the input data for the network, the minimum and maximum values for the data set are provided in the MAT file inputStatistics.mat, attached to this example. To find the minimum and maximum values for a different data set, use the inputStatistics supporting function, listed at the end of this example.

inputStatsFilename = "inputStatistics.mat";
if ~exist(inputStatsFilename,"file")
    disp("Reading all the training data for input statistics...")
    inputStats = inputStatistics(dataFolder);
else
    d = load(inputStatsFilename);
    inputStats = d.inputStats;
end

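Create Datastores for Training Network

Create two FileDatastore objects for training and validation by using the createFileDatastore supporting function, defined at the end of this example. Each datastore reads a video file to provide RGB data, optical flow data, and the corresponding label information.

Specify the number of frames for each read by the datastore. Typical values are 16, 32, 64, or 128. Using more frames helps capture more temporal information, but requires more memory for training and prediction. Set the number of frames to 64 to balance memory usage against performance. You might need to lower this value depending on your system resources.

numFrames = 64;

Specify the height and width of the frames for the datastore to read. Fixing the height and width to the same value makes batching data for the network easier. Typical values are [112, 112], [224, 224], and [256, 256]. The minimum height and width of the video frames in the HMDB51 data set are 240 and 176, respectively. Specify [112, 112] to capture a larger number of frames at the cost of spatial information. If you want the datastore to read a frame size larger than the minimum values, such as [256, 256], first resize the frames using imresize.

frameSize = [112,112];

Add inputSize to the inputStats structure so that the read function of fileDatastore can read the specified input size.

inputSize = [frameSize,numFrames];
inputStats.inputSize = inputSize;
inputStats.Classes = classes;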

Create two FileDatastore objects, one for training and another for validation.

isDataForValidation = false;
dsTrain = createFileDatastore(trainFilenames,inputStats,isDataForValidation);
isDataForValidation = true;
dsVal = createFileDatastore(testFilenames,inputStats,isDataForValidation);
disp("Training data size: " + string(numel(dsTrain.Files)))
Training data size: 436
disp("Validation data size: " + string(numel(dsVal.Files)))
Validation data size: 109

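Define Network Architecture

I3D Network

Using a 3-D CNN is a natural approach to extracting spatio-temporal features from videos. You can create an I3D network from a pretrained 2-D image classification network, such as Inception v1 or ResNet-50, by expanding the 2-D filters and pooling kernels into 3-D. This procedure reuses the weights learned from the image classification task to bootstrap the video recognition task.

The following figure is a sample showing how to inflate a 2-D convolution layer to a 3-D convolution layer. The inflation involves expanding the filter size, weights, and bias by adding a third dimension (the temporal dimension).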
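As a rough illustration of the inflation step, the sketch below expands a 2-D filter bank into a 3-D one by repeating it along a new temporal dimension. Dividing by the temporal depth follows the bootstrapping idea described in [1]; whether the Inflated3D helper attached to this example does exactly this is an assumption, and all sizes and variable names are hypothetical.

```matlab
% Hypothetical sizes: a 7-by-7 filter bank with 3 input channels and 64 filters.
filterSize2d = [7 7];
numChannels = 3;
numFilters = 64;
temporalDepth = 7;   % assumed depth of the new (temporal) dimension

% Stand-in for pretrained 2-D weights, laid out as H-by-W-by-C-by-K.
weights2d = rand([filterSize2d numChannels numFilters],'single');

% Insert a singleton temporal dimension, then repeat along it.
% The 3-D layout is H-by-W-by-T-by-C-by-K.
weights3d = reshape(weights2d,[filterSize2d 1 numChannels numFilters]);
weights3d = repmat(weights3d,[1 1 temporalDepth 1 1]);

% Rescale so that a clip of identical frames produces the same response
% as the original 2-D filter does on a single frame.
weights3d = weights3d ./ temporalDepth;

% The bias has one value per filter, so only its shape changes.
bias2d = zeros(1,1,numFilters,'single');
bias3d = reshape(bias2d,[1 1 1 numFilters]);
```

Summing the inflated weights over the temporal dimension recovers the 2-D weights, which is why the pretrained image features remain a sensible starting point.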

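Two-Stream I3D Network

Video data can be considered to have two parts: a spatial component and a temporal component.

  • The spatial component comprises information about the shape, texture, and color of the objects in the video. RGB data contains this information.

  • The temporal component comprises information about the motion of objects across frames and depicts important movements between the camera and the objects in a scene. Computing optical flow is a common technique for extracting temporal information from video.

A two-stream CNN incorporates a spatial subnetwork and a temporal subnetwork [2]. A convolutional neural network trained on dense optical flow together with a video data stream can achieve better performance with limited training data than one trained on raw stacked RGB frames alone. The following illustration shows a typical two-stream I3D network.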

Create Two-Stream I3D Network

In this example, you create an I3D network using GoogLeNet, a network pretrained on the ImageNet database.

Specify the number of channels as 3 for the RGB subnetwork and 2 for the optical flow subnetwork. The two channels of the optical flow data are the x and y components of velocity, Vx and Vy, respectively.

rgbChannels = 3;
flowChannels = 2;

Obtain the minimum and maximum values for the RGB and optical flow data from the inputStats structure loaded from the inputStatistics.mat file. The image3dInputLayer of each I3D network needs these values to normalize the input data.

rgbInputSize = [frameSize,numFrames,rgbChannels];
flowInputSize = [frameSize,numFrames,flowChannels];
rgbMin = inputStats.rgbMin;
rgbMax = inputStats.rgbMax;
oflowMin = inputStats.oflowMin(:,:,1:2);
oflowMax = inputStats.oflowMax(:,:,1:2);
rgbMin = reshape(rgbMin,[1,size(rgbMin)]);
rgbMax = reshape(rgbMax,[1,size(rgbMax)]);
oflowMin = reshape(oflowMin,[1,size(oflowMin)]);
oflowMax = reshape(oflowMax,[1,size(oflowMax)]);
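The role of these minimum and maximum values can be illustrated with a small min-max rescaling sketch. It assumes the input layer maps each channel linearly onto [0, 1]; the normalization mode that Inflated3D actually configures is not stated here, and the data and variable names below are made up.

```matlab
% Made-up RGB clip: height-by-width-by-frames-by-channels.
clip = randi([0 255],[112 112 8 3]);

% Per-channel statistics, shaped 1-by-1-by-1-by-3 so that they broadcast
% across the height, width, and frame dimensions.
chMin = reshape([0 0 0],[1 1 1 3]);
chMax = reshape([255 255 255],[1 1 1 3]);

% Min-max rescaling to the range [0, 1] (assumed normalization mode).
clipNorm = (clip - chMin) ./ (chMax - chMin);
```

The leading singleton dimension added by the reshape calls above serves the same purpose: it lines the per-channel statistics up with the channel dimension of the 3-D input.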

Specify the number of classes for training the network.

numClasses = numel(classes);

Create the I3D RGB and optical flow subnetworks by using the Inflated3D supporting function, which is attached to this example. Both subnetworks are created from GoogLeNet.

cnnNet = googlenet;
netRGB = Inflated3D(numClasses,rgbInputSize,rgbMin,rgbMax,cnnNet);
netFlow = Inflated3D(numClasses,flowInputSize,oflowMin,oflowMax,cnnNet);

Create a dlnetwork object from the layer graph of each I3D network.

dlnetRGB = dlnetwork(netRGB);
dlnetFlow = dlnetwork(netFlow);

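Define Model Gradients Function

Create the supporting function modelGradients, listed at the end of this example. The modelGradients function takes as input the RGB subnetwork dlnetRGB, the optical flow subnetwork dlnetFlow, a mini-batch of input data dlRGB and dlFlow, and a mini-batch of ground truth label data dlY. The function returns the training loss value, the gradients of the loss with respect to the learnable parameters of each subnetwork, and the mini-batch accuracy of the subnetworks.

The loss is the average of the cross-entropy losses of the predictions from the two subnetworks. The output predictions of each network are probabilities between 0 and 1 for each class.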

$$\mathit{rgbLoss} = \operatorname{crossentropy}(\mathit{rgbPrediction})$$

$$\mathit{flowLoss} = \operatorname{crossentropy}(\mathit{flowPrediction})$$

$$\mathit{loss} = \operatorname{mean}([\mathit{rgbLoss}, \mathit{flowLoss}])$$

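Calculate the accuracy by averaging the RGB and optical flow predictions and comparing the result to the ground truth labels of the inputs. The accuracy of each subnetwork is computed the same way from its own predictions.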
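The loss and the fused accuracy described above can be sketched with plain arrays. The probabilities below are made up; each column is one observation and each row is one class. Averaging the cross-entropy over the batch and using the natural logarithm are assumptions of this sketch.

```matlab
% Made-up class probabilities for 3 classes and 2 observations.
probRGB  = [0.7 0.2; 0.2 0.5; 0.1 0.3];
probFlow = [0.5 0.1; 0.3 0.3; 0.2 0.6];

% One-hot ground truth: observation 1 is class 1, observation 2 is class 3.
Y = [1 0; 0 0; 0 1];

% Cross-entropy per stream, averaged over the observations (assumed reduction).
ce = @(p) -sum(Y .* log(p),'all') / size(Y,2);
rgbLoss = ce(probRGB);
flowLoss = ce(probFlow);
loss = mean([rgbLoss flowLoss]);

% Late fusion: average the probabilities, then take the most likely class.
probFused = (probRGB + probFlow) / 2;
[~,predicted] = max(probFused,[],1);
[~,actual] = max(Y,[],1);
accuracy = mean(predicted == actual);
```

In this made-up batch the RGB stream alone misclassifies the second observation, while the fused prediction gets both right, which is the motivation for averaging the two streams.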

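Specify Training Options

Train with a mini-batch size of 20 for 1500 iterations. Use the SaveBestAfterIteration parameter to specify the iteration after which the model with the best validation accuracy is saved.

Specify the parameters of the cosine-annealing learning rate schedule [3]. For both networks, use:

  • A minimum learning rate of 1e-4.

  • A maximum learning rate of 1e-3.

  • Cosine cycle lengths of 300, 500, and 700 iterations, after each of which the learning rate schedule restarts. The option CosineNumIterations defines the width of each cosine cycle.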
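Within one cycle, the schedule can be written as follows, where $t$ is the iteration count since the start of the current cycle, $T$ is the cycle width taken from CosineNumIterations, and $\eta_{\min}$ and $\eta_{\max}$ are the minimum and maximum learning rates. The symbols are introduced here for illustration only; the formula mirrors the cosineAnnealingLearnRate supporting function.

```latex
\eta(t) = \eta_{\min} + \frac{1}{2}\,(\eta_{\max} - \eta_{\min})\left(1 + \cos\frac{\pi t}{T}\right)
```

With widths of 300, 500, and 700, the three cycles end at iterations 300, 800, and 1500, so the last cycle finishes exactly when training does.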

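Specify the parameters for SGDM optimization. Initialize the SGDM optimization parameters at the beginning of training for each of the RGB and optical flow networks. For both networks, use:

  • A momentum of 0.9.

  • An initial velocity parameter initialized as [].

  • An L2 regularization factor of 0.0005.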
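One common way to write a single SGDM step with L2 regularization is shown below. Here $w$ is a weight, $g$ its gradient, $v$ the velocity, $\lambda$ the L2 factor, $\mu$ the momentum, and $\eta$ the learning rate. This is a sketch of the usual formulation with symbols chosen for illustration; the internal form used by sgdmupdate is assumed to be equivalent rather than confirmed here.

```latex
\begin{aligned}
g' &= g + \lambda w \\
v &\leftarrow \mu v - \eta g' \\
w &\leftarrow w + v
\end{aligned}
```

With the values above, $\mu = 0.9$ and $\lambda = 0.0005$, while $\eta$ comes from the cosine-annealing schedule at each iteration.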

Specify whether to dispatch the data in the background using a parallel pool. If DispatchInBackground is set to true, the example opens a parallel pool with the specified number of workers and creates a DispatchInBackgroundDatastore, provided as part of this example, that dispatches the data in the background to speed up training through asynchronous data loading and preprocessing. By default, this example uses a GPU if one is available. Otherwise, it uses a CPU. Using a GPU requires Parallel Computing Toolbox™ and a CUDA® enabled NVIDIA® GPU. For information about the supported compute capabilities, see GPU Support by Release (Parallel Computing Toolbox).

params.Classes = classes;
params.MiniBatchSize = 20;
params.NumIterations = 1500;
params.SaveBestAfterIteration = 900;
params.CosineNumIterations = [300,500,700];
params.MinLearningRate = 1e-4;
params.MaxLearningRate = 1e-3;
params.Momentum = 0.9;
params.VelocityRGB = [];
params.VelocityFlow = [];
params.L2Regularization = 0.0005;
params.ProgressPlot = false;
params.Verbose = true;
params.ValidationData = dsVal;
params.DispatchInBackground = false;
params.NumWorkers = 4;

Train Network

Train the subnetworks using the RGB data and optical flow data. Set the doTraining variable to false to download the pretrained subnetworks without having to wait for training to complete. Alternatively, if you want to train the subnetworks, set the doTraining variable to true.

doTraining = false;

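For each epoch:

  • Shuffle the data before looping over mini-batches of data.

  • Use minibatchqueue to loop over the mini-batches. The supporting function createMiniBatchQueue, listed at the end of this example, uses the given training datastore to create a minibatchqueue.

  • Use the validation data dsVal to validate the networks.

  • Display the loss and accuracy results for each epoch using the supporting function displayVerboseOutputEveryEpoch, listed at the end of this example.

For each mini-batch:

  • Convert the image data or optical flow data and the labels to dlarray objects with the underlying type single.

  • Treat the temporal dimension of the video and optical flow data as one of the spatial dimensions to enable processing with a 3-D CNN. Specify the dimension labels "SSSCB" (spatial, spatial, spatial, channel, batch) for the RGB or optical flow data, and "CB" for the label data.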
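The dimension handling in the two steps above can be sketched with plain arrays, leaving out the dlarray conversion. The clip sizes and class indices are made up for illustration.

```matlab
% Made-up clip as produced by a frame-by-frame read:
% height-by-width-by-channels-by-frames.
clip = zeros(112,112,3,64,'single');

% Move the frame (time) dimension in front of the channel dimension so it
% acts as the third spatial dimension: height-by-width-by-frames-by-channels.
clip = permute(clip,[1 2 4 3]);

% Stack several clips along the fifth (batch) dimension.
clips = {clip, clip, clip, clip};
batch = cat(5,clips{:});

% Labels form a classes-by-batch matrix with one-hot columns.
labelIdx = [2 5 1 2];           % made-up class indices for 5 classes
Y = zeros(5,numel(labelIdx),'single');
Y(sub2ind(size(Y),labelIdx,1:numel(labelIdx))) = 1;
```

The resulting batch is 112-by-112-by-64-by-3-by-4, which is the layout that the "SSSCB" label describes, and Y is 5-by-4, matching "CB".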

The minibatchqueue object uses the supporting function batchRGBAndFlow, listed at the end of this example, to batch the RGB and optical flow data.

modelFilename = "I3D-RGBFlow-" + numClasses + "Classes-hmdb51.mat";
if doTraining
    epoch = 1;
    bestValAccuracy = 0;
    accTrain = [];
    accTrainRGB = [];
    accTrainFlow = [];
    lossTrain = [];
    iteration = 1;
    shuffled = shuffleTrainDs(dsTrain);
    % Number of outputs is three: one for RGB frames, one for optical flow
    % data, and one for ground truth labels.
    numOutputs = 3;
    mbq = createMiniBatchQueue(shuffled,numOutputs,params);
    start = tic;
    trainTime = start;
    % Use the initializeTrainingProgressPlot and initializeVerboseOutput
    % supporting functions, listed at the end of the example, to initialize
    % the training progress plot and verbose output to display the training
    % loss, training accuracy, and validation accuracy.
    plotters = initializeTrainingProgressPlot(params);
    initializeVerboseOutput(params);
    while iteration <= params.NumIterations
        % Iterate through the data set.
        [dlX1,dlX2,dlY] = next(mbq);
        % Evaluate the model gradients and loss using dlfeval.
        [gradRGB,gradFlow,loss,acc,accRGB,accFlow,stateRGB,stateFlow] = ...
            dlfeval(@modelGradients,dlnetRGB,dlnetFlow,dlX1,dlX2,dlY);
        % Accumulate the loss and accuracies.
        lossTrain = [lossTrain,loss];
        accTrain = [accTrain,acc];
        accTrainRGB = [accTrainRGB,accRGB];
        accTrainFlow = [accTrainFlow,accFlow];
        % Update the network state.
        dlnetRGB.State = stateRGB;
        dlnetFlow.State = stateFlow;
        % Update the gradients and parameters for the RGB and optical flow
        % subnetworks using the SGDM optimizer.
        [dlnetRGB,gradRGB,params.VelocityRGB,learnRate] = ...
            updateDlNetwork(dlnetRGB,gradRGB,params,params.VelocityRGB,iteration);
        [dlnetFlow,gradFlow,params.VelocityFlow] = ...
            updateDlNetwork(dlnetFlow,gradFlow,params,params.VelocityFlow,iteration);
        if ~hasdata(mbq) || iteration == params.NumIterations
            % Current epoch is complete. Do validation and update progress.
            trainTime = toc(trainTime);
            [cmat,validationTime,lossValidation,accValidation,accValidationRGB,accValidationFlow] = ...
                doValidation(params,dlnetRGB,dlnetFlow);
            % Update the training progress.
            displayVerboseOutputEveryEpoch(params,start,learnRate,epoch,iteration,...
                mean(accTrain),mean(accTrainRGB),mean(accTrainFlow),...
                accValidation,accValidationRGB,accValidationFlow,...
                mean(lossTrain),lossValidation,trainTime,validationTime);
            updateProgressPlot(params,plotters,epoch,iteration,start,mean(lossTrain),mean(accTrain),accValidation);
            % Save the model with the trained dlnetwork objects and accuracy
            % values. Use the saveData supporting function, listed at the
            % end of this example.
            if iteration >= params.SaveBestAfterIteration
                if accValidation > bestValAccuracy
                    bestValAccuracy = accValidation;
                    saveData(modelFilename,dlnetRGB,dlnetFlow,cmat,accValidation);
                end
            end
        end
        if ~hasdata(mbq) && iteration < params.NumIterations
            % Current epoch is complete. Initialize the training loss, accuracy
            % values, and minibatchqueue for the next epoch.
            accTrain = [];
            accTrainRGB = [];
            accTrainFlow = [];
            lossTrain = [];
            trainTime = tic;
            epoch = epoch + 1;
            shuffled = shuffleTrainDs(dsTrain);
            numOutputs = 3;
            mbq = createMiniBatchQueue(shuffled,numOutputs,params);
        end
        iteration = iteration + 1;
    end
    % Display a message when training is complete.
    endVerboseOutput(params);
    disp("Model saved to: " + modelFilename);
end
% Download the pretrained model and video file for prediction.
filename = "activityRecognition-I3D-HMDB51.zip";
downloadURL = "https://ssd.mathworks.com/supportfiles/vision/data/" + filename;
filename = fullfile(downloadFolder,filename);
if ~exist(filename,"file")
    disp("Downloading the pretrained network...");
    websave(filename,downloadURL);
end
% Unzip the contents to the download folder.
unzip(filename,downloadFolder);
if ~doTraining
    modelFilename = fullfile(downloadFolder,modelFilename);
end

Evaluate Trained Network

Use the test data set to evaluate the accuracy of the trained subnetworks.

Load the best model saved during training.

d = load(modelFilename);
dlnetRGB = d.data.dlnetRGB;
dlnetFlow = d.data.dlnetFlow;

Create a minibatchqueue object to load batches of the test data.

numOutputs = 3;
mbq = createMiniBatchQueue(params.ValidationData,numOutputs,params);

For each batch of test data, make predictions using the RGB and optical flow networks, take the average of the predictions, and compute the prediction accuracy using a confusion matrix.

cmat = sparse(numClasses,numClasses);
while hasdata(mbq)
    [dlRGB,dlFlow,dlY] = next(mbq);
    % Pass the video input as RGB and optical flow data through the
    % two-stream subnetworks to get the separate predictions.
    dlYPredRGB = predict(dlnetRGB,dlRGB);
    dlYPredFlow = predict(dlnetFlow,dlFlow);
    % Fuse the predictions by calculating the average of the predictions.
    dlYPred = (dlYPredRGB + dlYPredFlow)/2;
    % Calculate the accuracy of the predictions.
    [~,YTest] = max(dlY,[],1);
    [~,YPred] = max(dlYPred,[],1);
    cmat = aggregateConfusionMetric(cmat,YTest,YPred);
end

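Compute the average classification accuracy for the trained networks.

accuracyEval = sum(diag(cmat))./sum(cmat,"all")
accuracyEval = 0.60909

Display the confusion matrix.

figure
chart = confusionchart(cmat,classes);

Because the number of training samples is limited, increasing the accuracy beyond 61% is challenging. Improving the robustness of the network requires additional training with a larger data set. In addition, pretraining on a larger data set, such as Kinetics [1], can help improve the results.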

Predict Using New Video

You can now use the trained networks to predict actions in new videos. Read and display the video pour.avi using VideoReader and vision.VideoPlayer.

videoFilename = fullfile(downloadFolder,"pour.avi");
videoReader = VideoReader(videoFilename);
videoPlayer = vision.VideoPlayer;
videoPlayer.Name = "pour";
while hasFrame(videoReader)
    frame = readFrame(videoReader);
    step(videoPlayer,frame);
end
release(videoPlayer);

Use the readRGBAndFlow supporting function, listed at the end of this example, to read the RGB and optical flow data.

isDataForValidation = true;
readFcn = @(f,u)readRGBAndFlow(f,u,inputStats,isDataForValidation);

The read function returns a logical done value that indicates whether all the data has been read from the file. Use the batchRGBAndFlow supporting function, defined at the end of this example, to batch the data and pass it through the two-stream subnetworks to get the predictions.

hasdata = true;
userdata = [];
YPred = [];
while hasdata
    [data,userdata,done] = readFcn(videoFilename,userdata);
    [dlRGB,dlFlow] = batchRGBAndFlow(data(:,1),data(:,2),data(:,3));
    % Pass the video input as RGB and optical flow data through the
    % two-stream subnetworks to get the separate predictions.
    dlYPredRGB = predict(dlnetRGB,dlRGB);
    dlYPredFlow = predict(dlnetFlow,dlFlow);
    % Fuse the predictions by calculating the average of the predictions.
    dlYPred = (dlYPredRGB + dlYPredFlow)/2;
    [~,YPredCurr] = max(dlYPred,[],1);
    YPred = horzcat(YPred,YPredCurr);
    hasdata = ~done;
end
YPred = extractdata(YPred);

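Count the predictions for each class using histcounts, and take the class with the largest number of predictions as the predicted action.

classes = params.Classes;
% Pass numel(classes)+1 edges so that every class index gets its own bin;
% with edges 1:numel(classes) the last two classes would share a bin.
counts = histcounts(YPred,1:numel(classes)+1);
[~,clsIdx] = max(counts);
action = classes(clsIdx)
action = "pour"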
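A small sketch of the voting step with made-up per-clip predictions follows. Note that histcounts with N edges produces N-1 bins and places values equal to the last edge in the last bin, so the sketch passes numClasses+1 edges to give each class index its own bin.

```matlab
% Made-up per-clip predictions (class indices) for one video, 5 classes.
numClasses = 5;
YPred = [4 4 2 4 5 5];

% One bin per class index: [1,2), [2,3), [3,4), [4,5), [5,6].
counts = histcounts(YPred,1:numClasses+1);

% The most frequent class index is the predicted action.
[~,clsIdx] = max(counts);
```

Here class 4 receives three of the six votes, so clsIdx is 4 even though two clips voted for class 5.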

Supporting Functions

inputStatistics

The inputStatistics function takes as input the name of the folder containing the HMDB51 data, and calculates the minimum and maximum values for the RGB data and the optical flow data. The minimum and maximum values are used as normalization inputs to the input layer of the networks. This function also obtains the number of frames in each of the video files for later use during training and testing of the network. To find the minimum and maximum values for a different data set, call this function with the name of the folder containing that data set.

function inputStats = inputStatistics(dataFolder)
    ds = createDatastore(dataFolder);
    ds.ReadFcn = @getMinMax;
    tic;
    tt = tall(ds);
    varnames = {'rgbMax','rgbMin','oflowMax','oflowMin'};
    stats = gather(groupsummary(tt,[],{'max','min'},varnames));
    inputStats.Filename = gather(tt.Filename);
    inputStats.NumFrames = gather(tt.NumFrames);
    inputStats.rgbMax = stats.max_rgbMax;
    inputStats.rgbMin = stats.min_rgbMin;
    inputStats.oflowMax = stats.max_oflowMax;
    inputStats.oflowMin = stats.min_oflowMin;
    save('inputStatistics.mat','inputStats');
    toc;
end
function data = getMinMax(filename)
    reader = VideoReader(filename);
    opticFlow = opticalFlowFarneback;
    data = [];
    while hasFrame(reader)
        frame = readFrame(reader);
        [rgb,oflow] = findMinMax(frame,opticFlow);
        data = assignMinMax(data,rgb,oflow);
    end
    totalFrames = floor(reader.Duration * reader.FrameRate);
    totalFrames = min(totalFrames,reader.NumFrames);
    [labelName,filename] = getLabelFilename(filename);
    data.Filename = fullfile(labelName,filename);
    data.NumFrames = totalFrames;
    data = struct2table(data,'AsArray',true);
end
function data = assignMinMax(data,rgb,oflow)
    % Initialize the running statistics from the first frame.
    if isempty(data)
        data.rgbMax = rgb.Max;
        data.rgbMin = rgb.Min;
        data.oflowMax = oflow.Max;
        data.oflowMin = oflow.Min;
        return;
    end
    % Update the running per-channel minimum and maximum.
    data.rgbMax = max(data.rgbMax,rgb.Max);
    data.rgbMin = min(data.rgbMin,rgb.Min);
    data.oflowMax = max(data.oflowMax,oflow.Max);
    data.oflowMin = min(data.oflowMin,oflow.Min);
end
function [rgbMinMax,oflowMinMax] = findMinMax(rgb,opticFlow)
    rgbMinMax.Max = max(rgb,[],[1,2]);
    rgbMinMax.Min = min(rgb,[],[1,2]);
    gray = rgb2gray(rgb);
    flow = estimateFlow(opticFlow,gray);
    oflow = cat(3,flow.Vx,flow.Vy,flow.Magnitude);
    oflowMinMax.Max = max(oflow,[],[1,2]);
    oflowMinMax.Min = min(oflow,[],[1,2]);
end
function ds = createDatastore(folder)
    ds = fileDatastore(folder,...
        'IncludeSubfolders',true,...
        'FileExtensions','.avi',...
        'UniformRead',true,...
        'ReadFcn',@getMinMax);
    disp("NumFiles: " + numel(ds.Files));
end

createFileDatastore

The createFileDatastore function creates a FileDatastore object using the given file names. The FileDatastore object reads the data in 'partialfile' mode, so each read can return a partial set of frames from a video. This feature helps when reading large video files whose frames do not all fit in memory.

function datastore = createFileDatastore(filenames,inputStats,isDataForValidation)
    readFcn = @(f,u)readRGBAndFlow(f,u,inputStats,isDataForValidation);
    datastore = fileDatastore(filenames,...
        'ReadFcn',readFcn,...
        'ReadMode','partialfile');
end

readRGBAndFlow

The readRGBAndFlow function reads RGB frames, the corresponding optical flow data, and the label values for a given video file. During training, the read function reads the number of frames required by the network input size, starting at a randomly chosen frame. The optical flow is computed from the beginning of the video file, but the results are discarded until the starting frame is reached. During testing, all the frames are read sequentially and the corresponding optical flow data is calculated. The RGB frames and the optical flow data are randomly cropped to the required network input size for training, and center cropped for testing and validation.

function [data,userdata,done] = readRGBAndFlow(filename,userdata,inputStats,isDataForValidation)
    if isempty(userdata)
        userdata.reader = VideoReader(filename);
        userdata.batchesRead = 0;
        userdata.opticalFlow = opticalFlowFarneback;
        [totalFrames,userdata.label] = getTotalFramesAndLabel(inputStats,filename);
        if isempty(totalFrames)
            totalFrames = floor(userdata.reader.Duration * userdata.reader.FrameRate);
            totalFrames = min(totalFrames,userdata.reader.NumFrames);
        end
        userdata.totalFrames = totalFrames;
    end
    reader = userdata.reader;
    totalFrames = userdata.totalFrames;
    label = userdata.label;
    batchesRead = userdata.batchesRead;
    opticalFlow = userdata.opticalFlow;
    inputSize = inputStats.inputSize;
    H = inputSize(1);
    W = inputSize(2);
    rgbC = 3;
    flowC = 2;
    numFrames = inputSize(3);
    if numFrames > totalFrames
        numBatches = 1;
    else
        numBatches = floor(totalFrames/numFrames);
    end
    imH = userdata.reader.Height;
    imW = userdata.reader.Width;
    imsz = [imH,imW];
    if ~isDataForValidation
        augmentFcn = augmentTransform([imsz,3]);
        cropWindow = randomCropWindow2d(imsz,inputSize(1:2));
        % 1. Randomly select the required number of frames,
        %    starting randomly at a specific frame.
        if numFrames >= totalFrames
            idx = 1:totalFrames;
            % Add more frames to fill in the network input size.
            additional = ceil(numFrames/totalFrames);
            idx = repmat(idx,1,additional);
            idx = idx(1:numFrames);
        else
            startIdx = randperm(totalFrames - numFrames);
            startIdx = startIdx(1);
            endIdx = startIdx + numFrames - 1;
            idx = startIdx:endIdx;
        end
        video = zeros(H,W,rgbC,numFrames);
        oflow = zeros(H,W,flowC,numFrames);
        i = 1;
        % Discard the first set of frames to initialize the optical flow.
        for ii = 1:idx(1)-1
            frame = read(reader,ii);
            getRGBAndFlow(frame,opticalFlow,augmentFcn,cropWindow);
        end
        % Read the next set of required number of frames for training.
        for ii = idx
            frame = read(reader,ii);
            [rgb,vxvy] = getRGBAndFlow(frame,opticalFlow,augmentFcn,cropWindow);
            video(:,:,:,i) = rgb;
            oflow(:,:,:,i) = vxvy;
            i = i + 1;
        end
    else
        augmentFcn = @(data)(data);
        cropWindow = centerCropWindow2d(imsz,inputSize(1:2));
        toRead = min([numFrames,totalFrames]);
        video = zeros(H,W,rgbC,toRead);
        oflow = zeros(H,W,flowC,toRead);
        i = 1;
        while hasFrame(reader) && i <= numFrames
            frame = readFrame(reader);
            [rgb,vxvy] = getRGBAndFlow(frame,opticalFlow,augmentFcn,cropWindow);
            video(:,:,:,i) = rgb;
            oflow(:,:,:,i) = vxvy;
            i = i + 1;
        end
        if numFrames > totalFrames
            % Repeat the frames to fill in the network input size.
            additional = ceil(numFrames/totalFrames);
            video = repmat(video,1,1,1,additional);
            oflow = repmat(oflow,1,1,1,additional);
            video = video(:,:,:,1:numFrames);
            oflow = oflow(:,:,:,1:numFrames);
        end
    end
    % The network expects the video and optical flow input in
    % the following dlarray format:
    % "SSSCB" ==> Height x Width x Frames x Channels x Batch
    %
    % Permute the data
    %  from
    %      Height x Width x Channels x Frames
    %  to
    %      Height x Width x Frames x Channels
    video = permute(video,[1,2,4,3]);
    oflow = permute(oflow,[1,2,4,3]);
    data = {video,oflow,label};
    batchesRead = batchesRead + 1;
    userdata.batchesRead = batchesRead;
    % Set the done flag to true if the reader has read all the frames or
    % if it is training.
    done = batchesRead == numBatches || ~isDataForValidation;
end
function [rgb,vxvy] = getRGBAndFlow(rgb,opticalFlow,augmentFcn,cropWindow)
    rgb = augmentFcn(rgb);
    gray = rgb2gray(rgb);
    flow = estimateFlow(opticalFlow,gray);
    % Vy is repeated as a third channel only so that the flow can be cropped
    % like a three-channel image; the extra channel is dropped afterward.
    vxvy = cat(3,flow.Vx,flow.Vy,flow.Vy);
    rgb = imcrop(rgb,cropWindow);
    vxvy = imcrop(vxvy,cropWindow);
    vxvy = vxvy(:,:,1:2);
end
function [label,fname] = getLabelFilename(filename)
    [folder,name,ext] = fileparts(string(filename));
    [~,label] = fileparts(folder);
    fname = name + ext;
    label = string(label);
    fname = string(fname);
end
function [totalFrames,label] = getTotalFramesAndLabel(info,filename)
    filenames = info.Filename;
    frames = info.NumFrames;
    [labelName,fname] = getLabelFilename(filename);
    idx = strcmp(filenames,fullfile(labelName,fname));
    totalFrames = frames(idx);
    label = categorical(string(labelName),string(info.Classes));
end
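The frame selection rule used during training can be isolated in a short sketch. The video lengths are made up, and randi is swapped in here for randperm to draw a single start index, so this is an illustration of the idea rather than the exact code of readRGBAndFlow.

```matlab
numFrames = 64;          % frames the network expects per read
totalFrames = 150;       % made-up length of a video

if numFrames >= totalFrames
    % Short video: repeat the available frames until the clip is full.
    idx = repmat(1:totalFrames,1,ceil(numFrames/totalFrames));
    idx = idx(1:numFrames);
else
    % Long video: pick a random start so the window stays inside the video.
    startIdx = randi(totalFrames - numFrames + 1);
    idx = startIdx:(startIdx + numFrames - 1);
end

% The same tiling rule applied to a made-up 20-frame video.
shortIdx = repmat(1:20,1,ceil(numFrames/20));
shortIdx = shortIdx(1:numFrames);
```

For the 20-frame video, the indices run 1 to 20 three times and then 1 to 4, so the clip always has exactly 64 frames regardless of the video length.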

augmentTransform

The augmentTransform function creates an augmentation function that applies random left-right flipping and random scaling.

function augmentFcn = augmentTransform(sz)
    % Randomly flip and scale the image.
    tform = randomAffine2d('XReflection',true,'Scale',[1 1.1]);
    rout = affineOutputView(sz,tform,'BoundsStyle','CenterOutput');
    augmentFcn = @(data)augmentData(data,tform,rout);
    function data = augmentData(data,tform,rout)
        data = imwarp(data,tform,'OutputView',rout);
    end
end

modelGradients

The modelGradients function takes as input a mini-batch of RGB data dlRGB, the corresponding optical flow data dlFlow, and the corresponding targets Y, and returns the corresponding loss, the gradients of the loss with respect to the learnable parameters, and the training accuracy. To compute the gradients, evaluate the modelGradients function using the dlfeval function in the training loop.

function [gradientsRGB,gradientsFlow,loss,acc,accRGB,accFlow,stateRGB,stateFlow] = modelGradients(dlnetRGB,dlnetFlow,dlRGB,dlFlow,Y)
    % Pass the video input as RGB and optical flow data through the
    % two-stream network.
    [dlYPredRGB,stateRGB] = forward(dlnetRGB,dlRGB);
    [dlYPredFlow,stateFlow] = forward(dlnetFlow,dlFlow);
    % Calculate the fused loss, gradients, and accuracy for the two-stream
    % predictions.
    rgbLoss = crossentropy(dlYPredRGB,Y);
    flowLoss = crossentropy(dlYPredFlow,Y);
    % Fuse the losses.
    loss = mean([rgbLoss,flowLoss]);
    gradientsRGB = dlgradient(loss,dlnetRGB.Learnables);
    gradientsFlow = dlgradient(loss,dlnetFlow.Learnables);
    % Fuse the predictions by calculating the average of the predictions.
    dlYPred = (dlYPredRGB + dlYPredFlow)/2;
    % Calculate the accuracy of the fused predictions.
    [~,YTest] = max(Y,[],1);
    [~,YPred] = max(dlYPred,[],1);
    acc = gather(extractdata(sum(YTest == YPred)./numel(YTest)));
    % Calculate the accuracy of the RGB and flow predictions.
    [~,YPredRGB] = max(dlYPredRGB,[],1);
    [~,YPredFlow] = max(dlYPredFlow,[],1);
    accRGB = gather(extractdata(sum(YTest == YPredRGB)./numel(YTest)));
    accFlow = gather(extractdata(sum(YTest == YPredFlow)./numel(YTest)));
end

doValidation

The doValidation function validates the network using the validation data.

function [cmat,validationTime,lossValidation,accValidation,accValidationRGB,accValidationFlow] = doValidation(params,dlnetRGB,dlnetFlow)
    validationTime = tic;
    numOutputs = 3;
    mbq = createMiniBatchQueue(params.ValidationData,numOutputs,params);
    lossValidation = [];
    numClasses = numel(params.Classes);
    cmat = sparse(numClasses,numClasses);
    cmatRGB = sparse(numClasses,numClasses);
    cmatFlow = sparse(numClasses,numClasses);
    while hasdata(mbq)
        [dlX1,dlX2,dlY] = next(mbq);
        [loss,YTest,YPred,YPredRGB,YPredFlow] = predictValidation(dlnetRGB,dlnetFlow,dlX1,dlX2,dlY);
        lossValidation = [lossValidation,loss];
        cmat = aggregateConfusionMetric(cmat,YTest,YPred);
        cmatRGB = aggregateConfusionMetric(cmatRGB,YTest,YPredRGB);
        cmatFlow = aggregateConfusionMetric(cmatFlow,YTest,YPredFlow);
    end
    lossValidation = mean(lossValidation);
    accValidation = sum(diag(cmat))./sum(cmat,"all");
    accValidationRGB = sum(diag(cmatRGB))./sum(cmatRGB,"all");
    accValidationFlow = sum(diag(cmatFlow))./sum(cmatFlow,"all");
    validationTime = toc(validationTime);
end

predictValidation

The predictValidation function calculates the loss and prediction values using the provided dlnetwork objects for RGB and optical flow data.

function [loss,YTest,YPred,YPredRGB,YPredFlow] = predictValidation(dlnetRGB,dlnetFlow,dlRGB,dlFlow,Y)
    % Pass the video input through the two-stream network.
    dlYPredRGB = predict(dlnetRGB,dlRGB);
    dlYPredFlow = predict(dlnetFlow,dlFlow);
    % Calculate the cross-entropy separately for the two-stream outputs.
    rgbLoss = crossentropy(dlYPredRGB,Y);
    flowLoss = crossentropy(dlYPredFlow,Y);
    % Fuse the losses.
    loss = mean([rgbLoss,flowLoss]);
    % Fuse the predictions by calculating the average of the predictions.
    dlYPred = (dlYPredRGB + dlYPredFlow)/2;
    % Convert the one-hot targets and the predictions to class indices.
    [~,YTest] = max(Y,[],1);
    [~,YPred] = max(dlYPred,[],1);
    [~,YPredRGB] = max(dlYPredRGB,[],1);
    [~,YPredFlow] = max(dlYPredFlow,[],1);
end

updateDlNetwork

The updateDlNetwork function updates the provided dlnetwork object with gradients and other parameters using the SGDM optimization function sgdmupdate.

function [dlnet,gradients,velocity,learnRate] = updateDlNetwork(dlnet,gradients,params,velocity,iteration)
    % Determine the learning rate using the cosine-annealing learning rate schedule.
    learnRate = cosineAnnealingLearnRate(iteration,params);
    % Apply L2 regularization to the weights.
    idx = dlnet.Learnables.Parameter == "Weights";
    gradients(idx,:) = dlupdate(@(g,w) g + params.L2Regularization*w,gradients(idx,:),dlnet.Learnables(idx,:));
    % Update the network parameters using the SGDM optimizer.
    [dlnet,velocity] = sgdmupdate(dlnet,gradients,velocity,learnRate,params.Momentum);
end

cosineAnnealingLearnRate

The cosineAnnealingLearnRate function computes the learning rate based on the current iteration number, minimum learning rate, maximum learning rate, and number of iterations for annealing [3].

function lr = cosineAnnealingLearnRate(iteration,params)
    if iteration == params.NumIterations
        lr = params.MinLearningRate;
        return;
    end
    cosineNumIter = [0,params.CosineNumIterations];
    csum = cumsum(cosineNumIter);
    % Find the cosine cycle that contains the current iteration.
    block = find(csum >= iteration,1,'first');
    cosineIter = iteration - csum(block - 1);
    annealingIteration = mod(cosineIter,cosineNumIter(block));
    cosineIteration = cosineNumIter(block);
    minR = params.MinLearningRate;
    maxR = params.MaxLearningRate;
    cosMult = 1 + cos(pi * annealingIteration / cosineIteration);
    lr = minR + ((maxR - minR) * cosMult / 2);
end
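To see how the cycle lookup behaves, the sketch below evaluates the same arithmetic at a few hand-picked iterations, using the cycle widths and learning rate bounds from the training options. The variable names it, t, and the chosen iterations are illustrative.

```matlab
cosineNumIter = [0 300 500 700];
minR = 1e-4;
maxR = 1e-3;
csum = cumsum(cosineNumIter);          % cycle boundaries: 0, 300, 800, 1500

iterations = [1 150 450 1499];
lr = zeros(size(iterations));
block = zeros(size(iterations));
for k = 1:numel(iterations)
    it = iterations(k);
    % Index of the cycle containing this iteration (2 is the first cycle,
    % because of the leading 0 in cosineNumIter).
    block(k) = find(csum >= it,1,'first');
    cycleWidth = cosineNumIter(block(k));
    % Position inside the current cycle.
    t = mod(it - csum(block(k)-1),cycleWidth);
    lr(k) = minR + (maxR - minR)*(1 + cos(pi*t/cycleWidth))/2;
end
```

Iteration 150 sits halfway through the first cycle and gets the midpoint rate of 5.5e-4, iteration 450 is 150 iterations into the second cycle, and iteration 1499 is at the very end of the last cycle, where the rate has decayed almost to the minimum.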

aggregateConfusionMetric

The aggregateConfusionMetric function incrementally fills a confusion matrix based on the predicted results YPred and the expected results YTest.

function cmat = aggregateConfusionMetric(cmat,YTest,YPred)
    YTest = gather(extractdata(YTest));
    YPred = gather(extractdata(YPred));
    [m,n] = size(cmat);
    cmat = cmat + full(sparse(YTest,YPred,1,m,n));
end
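The accumulation trick relies on sparse summing the values of repeated index pairs. A minimal sketch with made-up class indices, skipping the dlarray extraction, looks like this:

```matlab
numClasses = 3;
cmat = zeros(numClasses);

% Two made-up mini-batches of true and predicted class indices.
YTest = {[1 2 2 3], [3 3 1]};
YPred = {[1 2 3 3], [3 1 1]};

for k = 1:numel(YTest)
    % sparse adds 1 for every repeated (true, predicted) pair, which
    % yields the confusion counts for the batch.
    cmat = cmat + full(sparse(YTest{k},YPred{k},1,numClasses,numClasses));
end

% Overall accuracy is the diagonal mass divided by the total count.
accuracy = sum(diag(cmat)) / sum(cmat,'all');
```

Rows index the true class and columns the predicted class, so five of the seven observations land on the diagonal and the accuracy is 5/7.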

createMiniBatchQueue

The createMiniBatchQueue function creates a minibatchqueue object that provides miniBatchSize observations at a time from the given datastore. It also creates a DispatchInBackgroundDatastore if a parallel pool is open.

function mbq = createMiniBatchQueue(datastore,numOutputs,params)
    if params.DispatchInBackground && isempty(gcp('nocreate'))
        % Start a parallel pool, if DispatchInBackground is true, to dispatch
        % data in the background using the parallel pool.
        c = parcluster('local');
        c.NumWorkers = params.NumWorkers;
        parpool('local',params.NumWorkers);
    end
    p = gcp('nocreate');
    if ~isempty(p)
        datastore = DispatchInBackgroundDatastore(datastore,p.NumWorkers);
    end
    inputFormat(1:numOutputs-1) = "SSSCB";
    outputFormat = "CB";
    mbq = minibatchqueue(datastore,numOutputs,...
        "MiniBatchSize",params.MiniBatchSize,...
        "MiniBatchFcn",@batchRGBAndFlow,...
        "MiniBatchFormat",[inputFormat,outputFormat]);
end

batchRGBAndFlow

The batchRGBAndFlow function batches the image, flow, and label data into corresponding dlarray values in the data formats "SSSCB", "SSSCB", and "CB", respectively.

function [dlX1,dlX2,dlY] = batchRGBAndFlow(images,flows,labels)
    % Batch dimension: 5
    X1 = cat(5,images{:});
    X2 = cat(5,flows{:});
    % Batch dimension: 2
    labels = cat(2,labels{:});
    % Feature dimension: 1
    Y = onehotencode(labels,1);
    % Cast data to single for processing.
    X1 = single(X1);
    X2 = single(X2);
    Y = single(Y);
    % Move data to the GPU if possible.
    if canUseGPU
        X1 = gpuArray(X1);
        X2 = gpuArray(X2);
        Y = gpuArray(Y);
    end
    % Return X and Y as dlarray objects.
    dlX1 = dlarray(X1,"SSSCB");
    dlX2 = dlarray(X2,"SSSCB");
    dlY = dlarray(Y,"CB");
end

shuffleTrainDs

The shuffleTrainDs function shuffles the files present in the training datastore dsTrain.

function shuffled = shuffleTrainDs(dsTrain)
    shuffled = copy(dsTrain);
    n = numel(shuffled.Files);
    shuffledIndices = randperm(n);
    shuffled.Files = shuffled.Files(shuffledIndices);
    reset(shuffled);
end

saveData

The saveData function saves the given dlnetwork objects and accuracy values to a MAT file.

function saveData(modelFilename,dlnetRGB,dlnetFlow,cmat,accValidation)
    dlnetRGB = gatherFromGPUToSave(dlnetRGB);
    dlnetFlow = gatherFromGPUToSave(dlnetFlow);
    data.ValidationAccuracy = accValidation;
    data.cmat = cmat;
    data.dlnetRGB = dlnetRGB;
    data.dlnetFlow = dlnetFlow;
    save(modelFilename,'data');
end

gatherFromGPUToSave

The gatherFromGPUToSave function gathers data from the GPU so that the model can be saved to disk.

function dlnet = gatherFromGPUToSave(dlnet)
    if ~canUseGPU
        return;
    end
    dlnet.Learnables = gatherValues(dlnet.Learnables);
    dlnet.State = gatherValues(dlnet.State);
    function tbl = gatherValues(tbl)
        for ii = 1:height(tbl)
            tbl.Value{ii} = gather(tbl.Value{ii});
        end
    end
end

checkForHMDB51FoldergydF4y2Ba

的gydF4y2BacheckForHMDB51FoldergydF4y2Ba函数检查下载的下载文件夹中的数据。gydF4y2Ba

函数gydF4y2Ba类= checkForHMDB51Folder (dataLoc) hmdbFolder = fullfile (dataLoc,gydF4y2Ba“hmdb51_org”gydF4y2Ba);gydF4y2Ba如果gydF4y2Ba~存在(hmdbFoldergydF4y2Ba“dir”gydF4y2Ba)错误(gydF4y2Ba“下载hmdb51_org。r一个r' file using the supporting function 'downloadHMDB51' before running the example and extract the RAR file.");gydF4y2Ba结束gydF4y2Ba类= [gydF4y2Ba“brush_hair”gydF4y2Ba,gydF4y2Ba“车轮”gydF4y2Ba,gydF4y2Ba“抓”gydF4y2Ba,gydF4y2Ba“咀嚼”gydF4y2Ba,gydF4y2Ba“鼓掌”gydF4y2Ba,gydF4y2Ba“爬”gydF4y2Ba,gydF4y2Ba“climb_stairs”gydF4y2Ba,gydF4y2Ba…gydF4y2Ba“潜水”gydF4y2Ba,gydF4y2Ba“draw_sword”gydF4y2Ba,gydF4y2Ba“口水”gydF4y2Ba,gydF4y2Ba“喝”gydF4y2Ba,gydF4y2Ba“吃”gydF4y2Ba,gydF4y2Ba“fall_floor”gydF4y2Ba,gydF4y2Ba“击剑”gydF4y2Ba,gydF4y2Ba…gydF4y2Ba“flic_flac”gydF4y2Ba,gydF4y2Ba“高尔夫球”gydF4y2Ba,gydF4y2Ba“倒立”gydF4y2Ba,gydF4y2Ba“打”gydF4y2Ba,gydF4y2Ba“拥抱”gydF4y2Ba,gydF4y2Ba“跳”gydF4y2Ba,gydF4y2Ba“踢”gydF4y2Ba,gydF4y2Ba“kick_ball”gydF4y2Ba,gydF4y2Ba…gydF4y2Ba“吻”gydF4y2Ba,gydF4y2Ba“笑”gydF4y2Ba,gydF4y2Ba“选择”gydF4y2Ba,gydF4y2Ba“倒”gydF4y2Ba,gydF4y2Ba“引体向上”gydF4y2Ba,gydF4y2Ba“打”gydF4y2Ba,gydF4y2Ba“推”gydF4y2Ba,gydF4y2Ba“俯卧撑”gydF4y2Ba,gydF4y2Ba“ride_bike”gydF4y2Ba,gydF4y2Ba…gydF4y2Ba“ride_horse”gydF4y2Ba,gydF4y2Ba“运行”gydF4y2Ba,gydF4y2Ba“shake_hands”gydF4y2Ba,gydF4y2Ba“shoot_ball”gydF4y2Ba,gydF4y2Ba“shoot_bow”gydF4y2Ba,gydF4y2Ba“shoot_gun”gydF4y2Ba,gydF4y2Ba…gydF4y2Ba“坐”gydF4y2Ba,gydF4y2Ba“仰卧起坐”gydF4y2Ba,gydF4y2Ba“微笑”gydF4y2Ba,gydF4y2Ba“烟”gydF4y2Ba,gydF4y2Ba“筋斗”gydF4y2Ba,gydF4y2Ba“站”gydF4y2Ba,gydF4y2Ba“swing_baseball”gydF4y2Ba,gydF4y2Ba“剑”gydF4y2Ba,gydF4y2Ba…gydF4y2Ba“sword_exercise”gydF4y2Ba,gydF4y2Ba“交谈”gydF4y2Ba,gydF4y2Ba“扔”gydF4y2Ba,gydF4y2Ba“转”gydF4y2Ba,gydF4y2Ba“走”gydF4y2Ba,gydF4y2Ba“波”gydF4y2Ba];expectFolders = fullfile (hmdbFolder、类);gydF4y2Ba如果gydF4y2Ba~所有(arrayfun (@ (x)存在(x,gydF4y2Ba“dir”gydF4y2Ba),expectFolders)错误(gydF4y2Ba“下载hmdb51_org。r一个r使用the supporting function 'downloadHMDB51' before running the example and extract the RAR file.");gydF4y2Ba结束gydF4y2Ba结束gydF4y2Ba

downloadHMDB51

The downloadHMDB51 function downloads the data set and saves it to a directory.

function downloadHMDB51(dataLoc)

% Default to the current folder when no location is given.
if nargin == 0
    dataLoc = pwd;
end
dataLoc = string(dataLoc);

if ~exist(dataLoc,"dir")
    mkdir(dataLoc);
end

dataUrl     = "http://serre-lab.clps.brown.edu/wp-content/uploads/2013/10/hmdb51_org.rar";
options     = weboptions("Timeout",Inf);
rarFileName = fullfile(dataLoc,"hmdb51_org.rar");
fileExists  = exist(rarFileName,"file");

% Download the RAR file and save it to the download folder.
if ~fileExists
    disp("Downloading hmdb51_org.rar (2 GB) to the folder:")
    disp(dataLoc)
    disp("This download can take a few minutes...")
    websave(rarFileName,dataUrl,options);
    disp("Download complete.")
    disp("Extract the hmdb51_org.rar file contents to the folder: ")
    disp(dataLoc)
end
end

initializeTrainingProgressPlot

The initializeTrainingProgressPlot function configures two plots for displaying the training loss, training accuracy, and validation accuracy.

function plotters = initializeTrainingProgressPlot(params)
if params.ProgressPlot
    % Plot the loss, training accuracy, and validation accuracy.
    figure

    % Loss plot
    subplot(2,1,1)
    plotters.LossPlotter = animatedline;
    xlabel("Iteration")
    ylabel("Loss")

    % Accuracy plot
    subplot(2,1,2)
    plotters.TrainAccPlotter = animatedline("Color","b");
    plotters.ValAccPlotter = animatedline("Color","g");
    legend("Training Accuracy","Validation Accuracy","Location","northwest");
    xlabel("Iteration")
    ylabel("Accuracy")
else
    plotters = [];
end
end

initializeVerboseOutput

The initializeVerboseOutput function displays the column headings for the table of training values, which shows the epoch, mini-batch accuracy, and other training values.

function initializeVerboseOutput(params)
if params.Verbose
    disp(" ")
    if canUseGPU
        disp("Training on GPU.")
    else
        disp("Training on CPU.")
    end
    p = gcp("nocreate");
    if ~isempty(p)
        disp("Training on parallel cluster '" + p.Cluster.Profile + "'. ")
    end
    disp("NumIterations:" + string(params.NumIterations));
    disp("MiniBatchSize:" + string(params.MiniBatchSize));
    disp("Classes:" + join(string(params.Classes),","));
    % The rule spans the table: ten columns of widths 5, 9, 12, 26, 26, 10, 10, 13, 10,
    % and 15, joined by " | " separators (165 characters between the outer bars).
    rule = "|" + string(repmat('=',1,165)) + "|";
    disp(rule)
    disp("| Epoch | Iteration | Time Elapsed |    Mini-Batch Accuracy     |    Validation Accuracy     | Mini-Batch | Validation | Base Learning | Train Time | Validation Time |")
    disp("|       |           |  (hh:mm:ss)  |       (Avg:RGB:Flow)       |       (Avg:RGB:Flow)       |    Loss    |    Loss    |     Rate      | (hh:mm:ss) |   (hh:mm:ss)    |")
    disp(rule)
end
end

displayVerboseOutputEveryEpoch

The displayVerboseOutputEveryEpoch function displays the verbose output of the training values, such as the epoch, mini-batch accuracy, validation accuracy, and mini-batch loss.

function displayVerboseOutputEveryEpoch(params,start,learnRate,epoch,iteration,...
        accTrain,accTrainRGB,accTrainFlow,accValidation,accValidationRGB,accValidationFlow,...
        lossTrain,lossValidation,trainTime,validationTime)
if params.Verbose
    % Convert the elapsed times (in seconds) to hh:mm:ss durations.
    D = duration(0,0,toc(start),"Format","hh:mm:ss");
    trainTime = duration(0,0,trainTime,"Format","hh:mm:ss");
    validationTime = duration(0,0,validationTime,"Format","hh:mm:ss");

    lossValidation = gather(extractdata(lossValidation));
    lossValidation = compose("%.4f",lossValidation);

    accValidation = composePadAccuracy(accValidation);
    accValidationRGB = composePadAccuracy(accValidationRGB);
    accValidationFlow = composePadAccuracy(accValidationFlow);

    accVal = join([accValidation,accValidationRGB,accValidationFlow]," : ");

    lossTrain = gather(extractdata(lossTrain));
    lossTrain = compose("%.4f",lossTrain);

    accTrain = composePadAccuracy(accTrain);
    accTrainRGB = composePadAccuracy(accTrainRGB);
    accTrainFlow = composePadAccuracy(accTrainFlow);

    accTrain = join([accTrain,accTrainRGB,accTrainFlow]," : ");
    learnRate = compose("%.13f",learnRate);

    % Print one table row, padding each value to its column width.
    disp("| " + ...
        pad(string(epoch),5,"both") + " | " + ...
        pad(string(iteration),9,"both") + " | " + ...
        pad(string(D),12,"both") + " | " + ...
        pad(string(accTrain),26,"both") + " | " + ...
        pad(string(accVal),26,"both") + " | " + ...
        pad(string(lossTrain),10,"both") + " | " + ...
        pad(string(lossValidation),10,"both") + " | " + ...
        pad(string(learnRate),13,"both") + " | " + ...
        pad(string(trainTime),10,"both") + " | " + ...
        pad(string(validationTime),15,"both") + " |")
end
end

function acc = composePadAccuracy(acc)
% Format a fractional accuracy as a percentage string of at least 6 characters.
acc = compose("%.2f",acc*100) + "%";
acc = pad(string(acc),6,"left");
end
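To make the accuracy column easier to follow, here is a minimal sketch of how the three accuracies (average, RGB, flow) could be combined into one 26-character table cell. The accuracy values and the helper name fmt are illustrative assumptions, not results or code from the example, and the " : " separator is assumed to match the listing above.

```matlab
% Illustrative accuracies as fractions (assumed values, not example results).
accAvg = 0.8571; accRGB = 0.9; accFlow = 0.7;

% Hypothetical inline equivalent of composePadAccuracy:
% for instance, 0.8571 becomes "85.71%".
fmt = @(a) pad(compose("%.2f",a*100) + "%",6,"left");

% Join the three formatted values and center them in a 26-character cell.
joined = join([fmt(accAvg),fmt(accRGB),fmt(accFlow)]," : ");
cell26 = pad(joined,26,"both");
disp("| " + cell26 + " |")
```

Padding every value to a fixed width keeps the column bars aligned from one row to the next, regardless of how many digits each value has.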

endVerboseOutput

The endVerboseOutput function displays the end of the verbose output during training.

function endVerboseOutput(params)
if params.Verbose
    % Closing rule, the same width as the table header (165 characters between the bars).
    disp("|" + string(repmat('=',1,165)) + "|")
end
end
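The three verbose-output functions are meant to be called in sequence: header once, one row per reporting step, then the closing rule. The following is a rough sketch of that call order, assuming the supporting functions listed in this example are on the MATLAB path and Deep Learning Toolbox is available for dlarray. All numeric values and the params field values are made up for illustration.

```matlab
% Hypothetical training settings (assumed values for illustration only).
params.Verbose = true;
params.NumIterations = 100;
params.MiniBatchSize = 20;
params.Classes = ["kiss","laugh","pick","pour","pushup"];

initializeVerboseOutput(params);      % prints the table header

start = tic;
lossTrain = dlarray(0.52);            % losses are dlarray values in the listings above
lossValidation = dlarray(0.61);

% One table row: learning rate, epoch, iteration, accuracies (Avg, RGB, Flow)
% for training and validation, losses, and elapsed times in seconds.
displayVerboseOutputEveryEpoch(params,start,0.01,1,10, ...
    0.80,0.75,0.70,0.78,0.72,0.69, ...
    lossTrain,lossValidation,12.3,4.5);

endVerboseOutput(params);             % prints the closing rule
```

When params.Verbose is false, all three calls return without printing anything, so the training loop does not need its own verbosity check.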

updateProgressPlot

The updateProgressPlot function updates the progress plot with loss and accuracy information during training.

function updateProgressPlot(params,plotters,epoch,iteration,start,lossTrain,accuracyTrain,accuracyValidation)
if params.ProgressPlot

    % Update the training progress.
    D = duration(0,0,toc(start),"Format","hh:mm:ss");
    title(plotters.LossPlotter.Parent,"Epoch: " + epoch + ", Elapsed: " + string(D));
    addpoints(plotters.LossPlotter,iteration,double(gather(extractdata(lossTrain))));
    addpoints(plotters.TrainAccPlotter,iteration,accuracyTrain);
    addpoints(plotters.ValAccPlotter,iteration,accuracyValidation);
    drawnow
end
end
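As a quick way to check the plotting helpers without running a full training loop, the sketch below feeds them synthetic values. It assumes initializeTrainingProgressPlot and updateProgressPlot from this example are on the MATLAB path and that Deep Learning Toolbox is available for dlarray. The loss and accuracy curves are invented purely for illustration.

```matlab
% Enable the progress plot and create the loss and accuracy axes.
params.ProgressPlot = true;
plotters = initializeTrainingProgressPlot(params);

start = tic;
epoch = 1;
for iteration = 1:20
    % Synthetic values (assumed): a decaying loss and rising accuracies.
    lossTrain = dlarray(1/iteration);
    accuracyTrain = 1 - 1/iteration;
    accuracyValidation = 0.9*accuracyTrain;

    updateProgressPlot(params,plotters,epoch,iteration,start, ...
        lossTrain,accuracyTrain,accuracyValidation);
end
```

Because updateProgressPlot calls drawnow on every update, the figure refreshes as each point is added. In a long training run, updating less often may reduce plotting overhead.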

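References

[1] Carreira, Joao, and Andrew Zisserman. "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR): 6299-6308. Honolulu, HI: IEEE, 2017.

[2] Simonyan, Karen, and Andrew Zisserman. "Two-Stream Convolutional Networks for Action Recognition in Videos." Advances in Neural Information Processing Systems 27, Long Beach, CA: NIPS, 2017.

[3] Loshchilov, Ilya, and Frank Hutter. "SGDR: Stochastic Gradient Descent with Warm Restarts." International Conference on Learning Representations 2017. Toulon, France: ICLR, 2017.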