Lane Detection Optimized with GPU Coder

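This example shows how to develop a deep learning lane detection application that runs on NVIDIA® GPUs.

The pretrained lane detection network detects lane marker boundaries in an image and outputs them. It is based on the AlexNet network, with the last few layers replaced by a smaller fully connected layer and a regression output layer. The example generates a CUDA executable that runs on a CUDA-enabled GPU on the host machine.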

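Prerequisites

CUDA-enabled NVIDIA GPU.
NVIDIA CUDA toolkit and driver.
NVIDIA cuDNN library.
Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-Party Hardware. For setting up the environment variables, see Setting Up the Prerequisite Products.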

Verify GPU Environment

Use the coder.checkGpuInstall function to verify that the compilers and libraries needed to run this example are set up correctly.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

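Get Pretrained Lane Detection Network

This example uses the trainedLaneNet MAT-file, which contains the pretrained lane detection network. The file is approximately 143 MB in size. Download the file from the MathWorks website.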

laneNetFile = matlab.internal.examples.downloadSupportFile('gpucoder/cnn_models/lane_detection', ...
    'trainedLaneNet.mat');

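The network takes an image as input and outputs two lane boundaries that correspond to the left and right lanes of the ego vehicle. Each lane boundary is represented by the parabolic equation $y = a x^{2} + b x + c$, where y is the lateral offset and x is the longitudinal distance from the vehicle. The network outputs the three parameters a, b, and c for each lane. The network architecture is similar to AlexNet, except that the last few layers are replaced by a smaller fully connected layer and a regression output layer.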
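To make the parabolic lane model concrete, the following minimal sketch evaluates $y = a x^{2} + b x + c$ over a range of longitudinal distances using polyval. The coefficient values, the assumption that distances are in meters, and the sign convention for left and right are illustrative only; they are not outputs of the pretrained network.

```matlab
% Hypothetical lane coefficients [a b c]; NOT produced by laneNet.
% Assumption: positive lateral offset is to the left of the ego vehicle.
leftLane  = [0.001 -0.02  1.8];
rightLane = [0.001 -0.02 -1.8];

% Longitudinal distances ahead of the vehicle (assumed to be in meters).
x = 3:30;

% polyval evaluates a*x.^2 + b*x + c for coefficients ordered [a b c].
yLeft  = polyval(leftLane, x);
yRight = polyval(rightLane, x);

% At x = 10: 0.001*100 - 0.02*10 + 1.8 = 1.7
fprintf('Left lateral offset at x = 10: %.2f\n', yLeft(x == 10));
```

The same polyval call appears later in the computeBoundaryModel helper of laneNetPredict.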

load(laneNetFile);
disp(laneNet)

  SeriesNetwork with properties:

         Layers: [23×1 nnet.cnn.layer.Layer]
     InputNames: {'data'}
    OutputNames: {'output'}

To view the network architecture, use the analyzeNetwork function.

analyzeNetwork(laneNet)

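Download Test Video

To test the model, the example uses a video file from the Caltech lanes dataset. The file is approximately 8 MB in size. Download the file from the MathWorks website.

videoFile = matlab.internal.examples.downloadSupportFile('gpucoder/media','caltech_cordova1.avi');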

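Main Entry-Point Function

The detectLanesInVideo.m file is the main entry-point function for code generation. The detectLanesInVideo function uses the vision.VideoFileReader (Computer Vision Toolbox) System object to read frames from the input video, calls the predict method of the LaneNet network object, and draws the detected lanes on the input video. A vision.DeployableVideoPlayer (Computer Vision Toolbox) System object displays the lane-detected video output.

type detectLanesInVideo.m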

function detectLanesInVideo(videoFile,net,laneCoeffMeans,laneCoeffsStds)
% detectLanesInVideo(videoFile,net,laneCoeffMeans,laneCoeffsStds) uses the
% VideoFileReader system object to read frames from the input video, calls
% the predict method of the LaneNet network object, and draws the detected
% lanes on the input video. A DeployableVideoPlayer system object is used
% to display the lane detected video output.

%   Copyright The MathWorks, Inc.

%#codegen

%% Create Video Reader and Video Player Object
videoFReader   = vision.VideoFileReader(videoFile);
depVideoPlayer = vision.DeployableVideoPlayer(Name='Lane Detection on GPU');

%% Video Frame Processing Loop
while ~isDone(videoFReader)
    videoFrame = videoFReader();
    scaledFrame = 255.*(imresize(videoFrame,[227 227]));

    [laneFound,ltPts,rtPts] = laneNetPredict(net,scaledFrame, ...
        laneCoeffMeans,laneCoeffsStds);
    if(laneFound)
        pts = [reshape(ltPts',1,[]);reshape(rtPts',1,[])];
        videoFrame = insertShape(videoFrame, 'Line', pts, 'LineWidth', 4);
    end
    depVideoPlayer(videoFrame);
end

end

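LaneNet Predict Function

The laneNetPredict function computes the right and left lane positions in a single video frame. The laneNet network computes the parameters a, b, and c that describe the parabolic equation for the left and right lane boundaries. From these parameters, the function computes the x and y coordinates that correspond to the lane positions. These coordinates must then be mapped to image coordinates.

type laneNetPredict.m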

function [laneFound,ltPts,rtPts] = laneNetPredict(net,frame,means,stds)
% laneNetPredict Predict lane markers on the input image frame using the
% lane detection network
%
%   Copyright 2017-2022 The MathWorks, Inc.

%#codegen

% A persistent object lanenet is used to load the network object. At the
% first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is
% reused to call predict on inputs, thus avoiding reconstructing and
% reloading the network object.
persistent lanenet;
if isempty(lanenet)
    lanenet = coder.loadDeepLearningNetwork(net, 'lanenet');
end

lanecoeffsNetworkOutput = predict(lanenet,frame);

% Recover original coeffs by reversing the normalization steps.
params = lanecoeffsNetworkOutput .* stds + means;

% 'c' should be more than 0.5 for it to be a lane.
isRightLaneFound = abs(params(6)) > 0.5;
isLeftLaneFound =  abs(params(3)) > 0.5;

% From the networks output, compute left and right lane points in the image
% coordinates.
vehicleXPoints = 3:30;
ltPts = coder.nullcopy(zeros(28,2,'single'));
rtPts = coder.nullcopy(zeros(28,2,'single'));

if isRightLaneFound && isLeftLaneFound
    rtBoundary = params(4:6);
    rt_y = computeBoundaryModel(rtBoundary, vehicleXPoints);
    ltBoundary = params(1:3);
    lt_y = computeBoundaryModel(ltBoundary, vehicleXPoints);

    % Visualize lane boundaries of the ego vehicle.
    tform = get_tformToImage;

    % Map vehicle to image coordinates.
    ltPts = tform.transformPointsInverse([vehicleXPoints', lt_y']);
    rtPts = tform.transformPointsInverse([vehicleXPoints', rt_y']);
    laneFound = true;
else
    laneFound = false;
end

end

%% Helper Functions

% Compute boundary model.
function yWorld = computeBoundaryModel(model, xWorld)
yWorld = polyval(model, xWorld);
end

% Compute extrinsics.
function tform = get_tformToImage

% The camera coordinates are described by the caltech mono
% camera model.
yaw = 0;
pitch = 14; % Pitch of the camera in degrees
roll = 0;

translation = translationVector(yaw, pitch, roll);
rotation    = rotationMatrix(yaw, pitch, roll);

% Construct a camera matrix.
focalLength    = [309.4362, 344.2161];
principalPoint = [318.9034, 257.5352];
Skew = 0;

camMatrix = [rotation; translation] * intrinsicMatrix(focalLength, ...
    Skew, principalPoint);

% Turn camMatrix into 2-D homography.
tform2D = [camMatrix(1,:); camMatrix(2,:); camMatrix(4,:)]; % drop Z

tform = projective2d(tform2D);
tform = tform.invert();
end

% Translate to image co-ordinates.
function translation = translationVector(yaw, pitch, roll)
SensorLocation = [0 0];
Height = 2.1798;    % mounting height in meters from the ground
rotationMatrix = (...
    rotZ(yaw)*... % last rotation
    rotX(90-pitch)*...
    rotZ(roll)... % first rotation
    );

% Adjust for the SensorLocation by adding a translation.
sl = SensorLocation;
translationInWorldUnits = [sl(2), sl(1), Height];
translation = translationInWorldUnits*rotationMatrix;
end

% Rotation around X-axis.
function R = rotX(a)
a = deg2rad(a);
R = [...
    1   0        0;
    0   cos(a)  -sin(a);
    0   sin(a)   cos(a)];
end

% Rotation around Y-axis.
function R = rotY(a)
a = deg2rad(a);
R = [...
    cos(a)  0 sin(a);
    0       1 0;
    -sin(a) 0 cos(a)];
end

% Rotation around Z-axis.
function R = rotZ(a)
a = deg2rad(a);
R = [...
    cos(a) -sin(a) 0;
    sin(a)  cos(a) 0;
    0       0      1];
end

% Given the Yaw, Pitch, and Roll, determine the appropriate Euler angles
% and the sequence in which they are applied to align the camera's
% coordinate system with the vehicle coordinate system. The resulting
% matrix is a Rotation matrix that together with the Translation vector
% defines the extrinsic parameters of the camera.
function rotation = rotationMatrix(yaw, pitch, roll)
rotation = (...
    rotY(180)*...       % last rotation: point Z up
    rotZ(-90)*...       % X-Y swap
    rotZ(yaw)*...       % point the camera forward
    rotX(90-pitch)*...  % "un-pitch"
    rotZ(roll)...       % 1st rotation: "un-roll"
    );
end

% Intrinsic matrix computation.
function intrinsicMat = intrinsicMatrix(FocalLength, Skew, PrincipalPoint)
intrinsicMat = ...
    [FocalLength(1)  , 0                , 0; ...
    Skew             , FocalLength(2)   , 0; ...
    PrincipalPoint(1), PrincipalPoint(2), 1];
end
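The denormalization and lane-presence check in laneNetPredict can be tried in isolation, without the network or a GPU. In this minimal sketch, netOut, means, and stds are hypothetical stand-ins for the network output and for the laneCoeffMeans and laneCoeffsStds values; the real values come from predict and from the MAT-file.

```matlab
% Hypothetical normalized network output: [a b c] for the left lane,
% followed by [a b c] for the right lane. NOT real network output.
netOut = single([0.5 -1.0 0.2  0.5 -1.0 -0.2]);

% Hypothetical normalization statistics (stand-ins for laneCoeffMeans
% and laneCoeffsStds).
means = single([0 0  1.8  0 0 -1.8]);
stds  = single([0.002 0.05 0.5  0.002 0.05 0.5]);

% Reverse the normalization, as laneNetPredict does.
params = netOut .* stds + means;

% A boundary counts as found when |c| exceeds 0.5.
isLeftLaneFound  = abs(params(3)) > 0.5;
isRightLaneFound = abs(params(6)) > 0.5;

if isLeftLaneFound && isRightLaneFound
    % Evaluate both parabolas at the same longitudinal sample points.
    vehicleXPoints = 3:30;
    lt_y = polyval(params(1:3), vehicleXPoints);
    rt_y = polyval(params(4:6), vehicleXPoints);
    fprintf('Both lanes found; %d points per boundary.\n', numel(lt_y));
else
    disp('At least one lane boundary was not found.');
end
```

With these stand-in values both c coefficients have magnitude 1.9, so both boundaries pass the check. The mapping to image coordinates is omitted here because it depends on projective2d and the camera parameters defined in get_tformToImage.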

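Generate CUDA Executable

To generate a standalone CUDA executable for the detectLanesInVideo entry-point function, create a GPU code configuration object for the 'exe' target and set the target language to C++. Use the coder.DeepLearningConfig function to create a CuDNN deep learning configuration object, and assign it to the DeepLearningConfig property of the GPU code configuration object.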

cfg = coder.gpuConfig('exe');
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.GenerateReport = true;
cfg.GenerateExampleMain = 'GenerateCodeAndCompile';
cfg.TargetLang = 'C++';
inputs = {coder.Constant(videoFile),coder.Constant(laneNetFile), ...
    coder.Constant(laneCoeffMeans),coder.Constant(laneCoeffsStds)};

Run the codegen command.

codegen -args inputs -config cfg detectLanesInVideo

Code generation successful: View report

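Generated Code Description

The series network is generated as a C++ class containing an array of 18 layer classes (after layer fusion optimization). The setup() method of the class sets up handles and allocates memory for each layer object. The predict() method invokes prediction for each of the 18 layers in the network.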

class lanenet0_0 {
public:
  lanenet0_0();
  void setSize();
  void resetState();
  void setup();
  void predict();
  void cleanup();
  float *getLayerOutput(int layerIndex, int portIndex);
  int getLayerOutputSize(int layerIndex, int portIndex);
  float *getInputDataPointer(int b_index);
  float *getInputDataPointer();
  float *getOutputDataPointer(int b_index);
  float *getOutputDataPointer();
  int getBatchSize();
  ~lanenet0_0();

private:
  void allocate();
  void postsetup();
  void deallocate();

public:
  boolean_T isInitialized;
  boolean_T matlabCodegenIsDeleted;

private:
  int numLayers;
  MWTensorBase *inputTensors[1];
  MWTensorBase *outputTensors[1];
  MWCNNLayer *layers[18];
  MWCudnnTarget::MWTargetNetworkImpl *targetImpl;
};

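The cnn_lanenet*_conv*_w and cnn_lanenet*_conv*_b files are the binary weights and bias files for the convolution layers in the network. The cnn_lanenet*_fc*_w and cnn_lanenet*_fc*_b files are the binary weights and bias files for the fully connected layers in the network.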

codegendir = fullfile('codegen','exe','detectLanesInVideo');
dir([codegendir,filesep,'*.bin'])

cnn_lanenet0_0_conv1_b.bin  cnn_lanenet0_0_conv3_b.bin  cnn_lanenet0_0_conv5_b.bin       cnn_lanenet0_0_fc6_b.bin      cnn_lanenet0_0_fcLane2_b.bin
cnn_lanenet0_0_conv1_w.bin  cnn_lanenet0_0_conv3_w.bin  cnn_lanenet0_0_conv5_w.bin       cnn_lanenet0_0_fc6_w.bin      cnn_lanenet0_0_fcLane2_w.bin
cnn_lanenet0_0_conv2_b.bin  cnn_lanenet0_0_conv4_b.bin  cnn_lanenet0_0_data_offset.bin   cnn_lanenet0_0_fcLane1_b.bin  networkParamsInfo_lanenet0_0.bin
cnn_lanenet0_0_conv2_w.bin  cnn_lanenet0_0_conv4_w.bin  cnn_lanenet0_0_data_scale.bin    cnn_lanenet0_0_fcLane1_w.bin

Run the Executable

To run the executable, uncomment the following lines of code.

if ispc
    [status,cmdout] = system('detectLanesInVideo.exe');
else
    [status,cmdout] = system('./detectLanesInVideo');
end
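The system call above captures an exit status and the command output but does not inspect them. As a minimal sketch of how they might be checked, the snippet below uses a harmless echo command as a stand-in for the generated executable; substituting the executable name is left as an assumption about your build output.

```matlab
% Stand-in command; replace with the generated executable once it is built.
cmd = 'echo hello';

% system returns the exit status and, when a second output is requested,
% the text the command wrote to standard output.
[status, cmdout] = system(cmd);

if status == 0
    fprintf('Command succeeded. Output:\n%s', cmdout);
else
    warning('Command failed with exit status %d:\n%s', status, cmdout);
end
```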

See Also

Functions

codegen | coder.DeepLearningConfig | coder.loadDeepLearningNetwork | coder.checkGpuInstall

Objects

coder.gpuConfig | coder.gpuEnvConfig | coder.CuDNNConfig | coder.TensorRTConfig