
Lane Detection Optimized with GPU Coder

This example shows how to generate CUDA® code from a deep learning network represented by a SeriesNetwork object. In this example, the series network is a convolutional neural network that detects and outputs lane marker boundaries from an image.

Prerequisites

  • CUDA enabled NVIDIA® GPU.

  • NVIDIA CUDA toolkit and driver.

  • NVIDIA cuDNN library.

  • OpenCV libraries for video read and image display operations.

  • Environment variables for the compilers and libraries. For information on the supported versions of the compilers and libraries, see Third-Party Hardware. To set up the environment variables, see Setting Up the Prerequisite Products.

Verify GPU Environment

Use the coder.checkGpuInstall function to verify that the compilers and libraries necessary for running this example are set up correctly.

envCfg = coder.gpuEnvConfig('host');
envCfg.DeepLibTarget = 'cudnn';
envCfg.DeepCodegen = 1;
envCfg.Quiet = 1;
coder.checkGpuInstall(envCfg);

Get Pretrained SeriesNetwork

[laneNet, coeffMeans, coeffStds] = getLaneDetectionNetworkGPU();

The network takes an image as an input and outputs two lane boundaries that correspond to the left and right lanes of the ego vehicle. Each lane boundary is represented by the parabolic equation y = ax^2 + bx + c, where y is the lateral offset and x is the longitudinal distance from the vehicle. The network outputs the three parameters a, b, and c per lane. The network architecture is similar to AlexNet except that the last few layers are replaced by a smaller fully connected layer and a regression output layer. To view the network architecture, use the analyzeNetwork function.

analyzeNetwork(laneNet)
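
As a quick illustration of the parabolic boundary model described above (this snippet is not part of the original example), you can evaluate the lateral offset over a range of longitudinal distances with polyval, which is the same evaluation the detect_lane function below performs. The coefficient values here are made up for demonstration.

boundary = [-0.005 0.08 1.8];        % hypothetical [a b c] for one lane boundary
xWorld = 3:30;                       % longitudinal distance ahead of the sensor, in meters
yWorld = polyval(boundary, xWorld);  % lateral offset: y = a*x.^2 + b*x + c
plot(xWorld, yWorld), xlabel('x (m)'), ylabel('y (m)')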

Examine Main Entry-Point Function

type detect_lane
function [laneFound, ltPts, rtPts] = detect_lane(frame, laneCoeffMeans, laneCoeffStds)
% From the networks output, compute left and right lane points in the
% image coordinates. The camera coordinates are described by the caltech
% mono camera model.

% A persistent object mynet is used to load the series network object.
% At the first call to this function, the persistent object is constructed and
% setup. When the function is called subsequent times, the same object is reused
% to call predict on inputs, thus avoiding reconstructing and reloading the
% network object.
persistent lanenet;
if isempty(lanenet)
    lanenet = coder.loadDeepLearningNetwork('laneNet.mat', 'lanenet');
end

lanecoeffsNetworkOutput = lanenet.predict(permute(frame, [2 1 3]));

% Recover original coeffs by reversing the normalization steps
params = lanecoeffsNetworkOutput .* laneCoeffStds + laneCoeffMeans;

isRightLaneFound = abs(params(6)) > 0.5; % c should be more than 0.5 for it to be a right lane
isLeftLaneFound  = abs(params(3)) > 0.5;

vehicleXPoints = 3:30; % meters, ahead of the sensor
ltPts = coder.nullcopy(zeros(28,2,'single'));
rtPts = coder.nullcopy(zeros(28,2,'single'));

if isRightLaneFound && isLeftLaneFound
    rtBoundary = params(4:6);
    rt_y = computeBoundaryModel(rtBoundary, vehicleXPoints);
    ltBoundary = params(1:3);
    lt_y = computeBoundaryModel(ltBoundary, vehicleXPoints);

    % Visualize lane boundaries of the ego vehicle
    tform = get_tformToImage;
    % map vehicle to image coordinates
    ltPts = tform.transformPointsInverse([vehicleXPoints', lt_y']);
    rtPts = tform.transformPointsInverse([vehicleXPoints', rt_y']);
    laneFound = true;
else
    laneFound = false;
end
end

function yWorld = computeBoundaryModel(model, xWorld)
yWorld = polyval(model, xWorld);
end

function tform = get_tformToImage
% Compute extrinsics based on camera setup
yaw = 0;
pitch = 14; % pitch of the camera in degrees
roll = 0;

translation = translationVector(yaw, pitch, roll);
rotation    = rotationMatrix(yaw, pitch, roll);

% Construct a camera matrix
focalLength    = [309.4362, 344.2161];
principalPoint = [318.9034, 257.5352];
Skew = 0;

camMatrix = [rotation; translation] * intrinsicMatrix(focalLength, ...
    Skew, principalPoint);

% Turn camMatrix into 2-D homography
tform2D = [camMatrix(1,:); camMatrix(2,:); camMatrix(4,:)]; % drop Z

tform = projective2d(tform2D);
tform = tform.invert();
end

function translation = translationVector(yaw, pitch, roll)
SensorLocation = [0 0];
Height = 2.1798;    % mounting height in meters from the ground
rotationMatrix = (...
    rotZ(yaw)*...      % last rotation
    rotX(90-pitch)*...
    rotZ(roll)...      % first rotation
    );

% Adjust for the SensorLocation by adding a translation
sl = SensorLocation;
translationInWorldUnits = [sl(2), sl(1), Height];
translation = translationInWorldUnits*rotationMatrix;
end

%------------------------------------------------------------------
% Rotation around X-axis
function R = rotX(a)
a = deg2rad(a);
R = [...
    1   0        0;
    0   cos(a)  -sin(a);
    0   sin(a)   cos(a)];
end

%------------------------------------------------------------------
% Rotation around Y-axis
function R = rotY(a)
a = deg2rad(a);
R = [...
    cos(a)  0  sin(a);
    0       1  0;
   -sin(a)  0  cos(a)];
end

%------------------------------------------------------------------
% Rotation around Z-axis
function R = rotZ(a)
a = deg2rad(a);
R = [...
    cos(a) -sin(a) 0;
    sin(a)  cos(a) 0;
    0       0      1];
end

%------------------------------------------------------------------
% Given the Yaw, Pitch, and Roll, determine the appropriate Euler angles
% and the sequence in which they are applied to align the camera's
% coordinate system with the vehicle coordinate system. The resulting
% matrix is a Rotation matrix that together with the Translation vector
% defines the extrinsic parameters of the camera.
function rotation = rotationMatrix(yaw, pitch, roll)
rotation = (...
    rotY(180)*...       % last rotation: point Z up
    rotZ(-90)*...       % X-Y swap
    rotZ(yaw)*...       % point the camera forward
    rotX(90-pitch)*...  % "un-pitch"
    rotZ(roll)...       % 1st rotation: "un-roll"
    );
end

function intrinsicMat = intrinsicMatrix(FocalLength, Skew, PrincipalPoint)
intrinsicMat = ...
    [FocalLength(1)   , 0                , 0; ...
     Skew             , FocalLength(2)   , 0; ...
     PrincipalPoint(1), PrincipalPoint(2), 1];
end
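
Before generating code, you can optionally call detect_lane directly in MATLAB as a sanity check. The following sketch is not part of the original example; it assumes that the example video caltech_cordova1.avi (downloaded later in this example) is on the path and that resizing a frame to the 227-by-227 network input size is sufficient preprocessing.

vr = VideoReader('caltech_cordova1.avi');
img = readFrame(vr);
frame = single(imresize(img, [227 227]));   % network input size
[laneFound, ltPts, rtPts] = detect_lane(frame, coeffMeans, coeffStds)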

Generate Code for Network and Post-Processing Code

The network computes the parameters a, b, and c that describe the parabolic equation for the left and right lane boundaries.

From these parameters, compute the x and y coordinates corresponding to the lane positions. The coordinates must be mapped to image coordinates. The function detect_lane performs all these computations. Generate CUDA code for this function by creating a GPU code configuration object for a 'lib' target and setting the target language to C++. Use the coder.DeepLearningConfig function to create a CuDNN deep learning configuration object and assign it to the DeepLearningConfig property of the GPU code configuration object. Run the codegen command.

cfg = coder.gpuConfig('lib');
cfg.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
cfg.GenerateReport = true;
cfg.TargetLang = 'C++';
inputs = {ones(227,227,3,'single'),ones(1,6,'double'),ones(1,6,'double')};

codegen -args inputs -config cfg detect_lane
Code generation successful: View report
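
Optionally, before building the standalone library you can validate the generated CUDA code from within MATLAB by also generating a MEX version of detect_lane and comparing its outputs against the MATLAB function. This step is a sketch and is not part of the original example.

cfgMex = coder.gpuConfig('mex');
cfgMex.TargetLang = 'C++';
cfgMex.DeepLearningConfig = coder.DeepLearningConfig('cudnn');
codegen -args inputs -config cfgMex detect_lane
% Call detect_lane_mex with the same inputs as detect_lane and compare the results.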

Generated Code Description

The series network is generated as a C++ class containing an array of 23 layer classes.

class c_lanenet {
 public:
  int32_T batchSize;
  int32_T numLayers;
  real32_T *inputData;
  real32_T *outputData;
  MWCNNLayer *layers[23];

 public:
  c_lanenet(void);
  void setup(void);
  void predict(void);
  void cleanup(void);
  ~c_lanenet(void);
};

The setup() method of the class sets up handles and allocates memory for each layer object. The predict() method invokes prediction for each of the 23 layers in the network.

The cnn_lanenet_conv*_w and cnn_lanenet_conv*_b files are the binary weights and bias files for the convolution layers in the network. The cnn_lanenet_fc*_w and cnn_lanenet_fc*_b files are the binary weights and bias files for the fully connected layers in the network.

codegendir = fullfile('codegen', 'lib', 'detect_lane');
dir(codegendir)
. MWReLULayer.o .. MWReLULayerImpl.cu .gitignore MWReLULayerImpl.hpp DeepLearningNetwork.cu MWReLULayerImpl.o DeepLearningNetwork.h MWTargetNetworkImpl.cu DeepLearningNetwork.o MWTargetNetworkImpl.hpp MWCNNLayer.cpp MWTargetNetworkImpl.o MWCNNLayer.hpp MWTensor.hpp MWCNNLayer.o MWTensorBase.cpp MWCNNLayerImpl.cu MWTensorBase.hpp MWCNNLayerImpl.hpp MWTensorBase.o MWCNNLayerImpl.o _clang-format MWCUSOLVERUtils.cpp buildInfo.mat MWCUSOLVERUtils.hpp cnn_lanenet0_0_conv1_b.bin MWCUSOLVERUtils.o cnn_lanenet0_0_conv1_w.bin MWCudaDimUtility.hpp cnn_lanenet0_0_conv2_b.bin MWCustomLayerForCuDNN.cpp cnn_lanenet0_0_conv2_w.bin MWCustomLayerForCuDNN.hpp cnn_lanenet0_0_conv3_b.bin MWCustomLayerForCuDNN.o cnn_lanenet0_0_conv3_w.bin MWElementwiseAffineLayer.cpp cnn_lanenet0_0_conv4_b.bin MWElementwiseAffineLayer.hpp cnn_lanenet0_0_conv4_w.bin MWElementwiseAffineLayer.o cnn_lanenet0_0_conv5_b.bin MWElementwiseAffineLayerImpl.cu cnn_lanenet0_0_conv5_w.bin MWElementwiseAffineLayerImpl.hpp cnn_lanenet0_0_data_offset.bin MWElementwiseAffineLayerImpl.o cnn_lanenet0_0_data_scale.bin MWElementwiseAffineLayerImplKernel.cu cnn_lanenet0_0_fc6_b.bin MWElementwiseAffineLayerImplKernel.o cnn_lanenet0_0_fc6_w.bin MWFCLayer.cpp cnn_lanenet0_0_fcLane1_b.bin MWFCLayer.hpp cnn_lanenet0_0_fcLane1_w.bin MWFCLayer.o cnn_lanenet0_0_fcLane2_b.bin MWFCLayerImpl.cu cnn_lanenet0_0_fcLane2_w.bin MWFCLayerImpl.hpp cnn_lanenet0_0_responseNames.txt MWFCLayerImpl.o codeInfo.mat MWFusedConvReLULayer.cpp codedescriptor.dmr MWFusedConvReLULayer.hpp compileInfo.mat MWFusedConvReLULayer.o defines.txt MWFusedConvReLULayerImpl.cu detect_lane.a MWFusedConvReLULayerImpl.hpp detect_lane.cu MWFusedConvReLULayerImpl.o detect_lane.h MWInputLayer.cpp detect_lane.o MWInputLayer.hpp detect_lane_data.cu MWInputLayer.o detect_lane_data.h MWInputLayerImpl.hpp detect_lane_data.o MWKernelHeaders.hpp detect_lane_initialize.cu MWMaxPoolingLayer.cpp detect_lane_initialize.h MWMaxPoolingLayer.hpp detect_lane_initialize.o MWMaxPoolingLayer.o detect_lane_internal_types.h MWMaxPoolingLayerImpl.cu detect_lane_rtw.mk MWMaxPoolingLayerImpl.hpp detect_lane_terminate.cu MWMaxPoolingLayerImpl.o detect_lane_terminate.h MWNormLayer.cpp detect_lane_terminate.o MWNormLayer.hpp detect_lane_types.h MWNormLayer.o examples MWNormLayerImpl.cu gpu_codegen_info.mat MWNormLayerImpl.hpp html MWNormLayerImpl.o interface MWOutputLayer.cpp mean.bin MWOutputLayer.hpp predict.cu MWOutputLayer.o predict.h MWOutputLayerImpl.cu predict.o MWOutputLayerImpl.hpp rtw_proj.tmw MWOutputLayerImpl.o rtwtypes.h MWReLULayer.cpp MWReLULayer.hpp

Generate Additional Files for Post-Processing the Output

Export the mean and std values from the trained network for use during execution.

codegendir = fullfile(pwd, 'codegen', 'lib', 'detect_lane');
fid = fopen(fullfile(codegendir, 'mean.bin'), 'w');
A = [coeffMeans coeffStds];
fwrite(fid, A, 'double');
fclose(fid);
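
To confirm that the exported binary file has the expected contents, you can read it back and compare it against the workspace values. This check is not part of the original example.

fid = fopen(fullfile(codegendir, 'mean.bin'), 'r');
B = fread(fid, 12, 'double');    % 6 coefficient means followed by 6 standard deviations
fclose(fid);
isequal(B', [coeffMeans coeffStds])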

Main File

Compile the network code by using a main file. The main file uses the OpenCV VideoCapture method to read frames from the input video. Each frame is processed and classified until no more frames are read. Before displaying the output for each frame, the outputs are post-processed by using the detect_lane function generated in detect_lane.cu.

type main_lanenet.cu
/* Copyright 2016 The MathWorks, Inc. */
#include <stdio.h>
#include <stdlib.h>
#include <cuda.h>
#include <list>
#include <cmath>
#include "opencv2/opencv.hpp"
#include "detect_lane.h"

using namespace cv;

void readData(float *input, Mat& orig, Mat& im)
{
    Size size(227,227);
    resize(orig, im, size, 0, 0, INTER_LINEAR);
    for (int j = 0; j < 227*227; j++)
    {
        // BGR to RGB
        input[2*227*227+j] = (float)(im.data[j*3+0]);
        input[1*227*227+j] = (float)(im.data[j*3+1]);
        input[0*227*227+j] = (float)(im.data[j*3+2]);
    }
}

void addLane(float pts[28][2], Mat& im, int numPts)
{
    std::vector<Point2f> iArray;
    for (int k = 0; k < numPts; k++)
    {
        iArray.push_back(Point2f(pts[k][0], pts[k][1]));
    }
    Mat curve(iArray, true);
    curve.convertTo(curve, CV_32S); // adapt type for polylines
    polylines(im, curve, false, CV_RGB(255,255,0), 2, LINE_AA);
}

void writeData(float *inputBuffer, Mat& im, int N, double means[6], double stds[6])
{
    // Run the generated entry-point function and get the lane coordinates
    boolean_T laneFound = 0;
    float ltPts[56];
    float rtPts[56];
    detect_lane(inputBuffer, means, stds, &laneFound, ltPts, rtPts);
    if (!laneFound)
    {
        return;
    }
    float ltPtsM[28][2];
    float rtPtsM[28][2];
    for (int k = 0; k < 28; k++)
    {
        ltPtsM[k][0] = ltPts[k];
        ltPtsM[k][1] = ltPts[k+28];
        rtPtsM[k][0] = rtPts[k];
        rtPtsM[k][1] = rtPts[k+28];
    }
    addLane(ltPtsM, im, 28);
    addLane(rtPtsM, im, 28);
}

void readMeanAndStds(const char* filename, double means[6], double stds[6])
{
    // mean.bin holds the 6 coefficient means followed by the 6 standard deviations
    FILE* fp = fopen(filename, "rb");
    if (fp == NULL)
    {
        printf("Unable to open file %s\n", filename);
        exit(-1);
    }
    double buffer[12];
    fread(buffer, sizeof(double), 12, fp);
    fclose(fp);
    for (int k = 0; k < 6; k++)
    {
        means[k] = buffer[k];
        stds[k]  = buffer[k+6];
    }
}

int main(int argc, char* argv[])
{
    float *inputBuffer  = (float*)calloc(227*227*3, sizeof(float));
    float *outputBuffer = (float*)calloc(6, sizeof(float));
    if ((inputBuffer == NULL) || (outputBuffer == NULL))
    {
        printf("ERROR: Input/Output buffers could not be allocated!\n");
        exit(-1);
    }

    double means[6];
    double stds[6];
    readMeanAndStds("mean.bin", means, stds);

    if (argc < 2)
    {
        printf("Pass in input video file name as argument\n");
        return -1;
    }
    VideoCapture cap(argv[1]);
    if (!cap.isOpened())
    {
        printf("Could not open the video capture device.\n");
        return -1;
    }

    cudaEvent_t start, stop;
    float fps = 0;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);
    Mat orig, im;
    while (true)
    {
        cudaEventRecord(start);
        cap >> orig;
        if (orig.empty()) break;
        readData(inputBuffer, orig, im);
        writeData(inputBuffer, orig, 6, means, stds);
        cudaEventRecord(stop);
        cudaEventSynchronize(stop);

        char strbuf[50];
        float milliseconds = -1.0;
        cudaEventElapsedTime(&milliseconds, start, stop);
        fps = fps*.9 + 1000.0/milliseconds*.1;
        sprintf(strbuf, "%.2f FPS", fps);
        putText(orig, strbuf, Point(200,30), FONT_HERSHEY_DUPLEX, 1, CV_RGB(0,0,0), 2);
        imshow("Lane detection demo", orig);
        if (waitKey(50)%256 == 27) break; // stop on pressing ESC
    }
    destroyWindow("Lane detection demo");
    free(inputBuffer);
    free(outputBuffer);
    return 0;
}
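
For comparison, a rough MATLAB analogue of this frame-processing loop (an illustrative sketch, not part of the shipped example, and again assuming the example video from the next step is available) looks like this; readData and writeData correspond to the resize step and to the detect_lane call plus overlay, respectively.

vr = VideoReader('caltech_cordova1.avi');
figure
while hasFrame(vr)
    img = readFrame(vr);
    frame = single(imresize(img, [227 227]));
    [laneFound, ltPts, rtPts] = detect_lane(frame, coeffMeans, coeffStds);
    imshow(img), hold on
    if laneFound
        plot(ltPts(:,1), ltPts(:,2), 'g.', rtPts(:,1), rtPts(:,2), 'r.')
    end
    hold off, drawnow
end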

Download Example Video

if ~exist('./caltech_cordova1.avi', 'file')
    url = 'https://www.mathworks.com/supportfiles/gpucoder/media/caltech_cordova1.avi';
    websave('caltech_cordova1.avi', url);
end

Build Executable

if ispc
    setenv('MATLAB_ROOT', matlabroot);
    vcvarsall = mex.getCompilerConfigurations('C++').Details.CommandLineShell;
    setenv('VCVARSALL', vcvarsall);
    system('make_win_lane_detection.bat');
    cd(codegendir);
    system('lanenet.exe ..\..\..\caltech_cordova1.avi');
else
    setenv('MATLAB_ROOT', matlabroot);
    system('make -f Makefile_lane_detection.mk');
    cd(codegendir);
    system('./lanenet ../../../caltech_cordova1.avi');
end

Input Screenshot

Output Screenshot

See Also

Functions

Objects

Related Topics