Kernels from Library Calls
GPU Coder™ supports libraries optimized for CUDA® GPUs, such as the cuBLAS, cuSOLVER, cuFFT, Thrust, cuDNN, and TensorRT libraries.
The cuBLAS library is an implementation of Basic Linear Algebra Subprograms (BLAS) on top of the NVIDIA® CUDA runtime. It allows you to access the computational resources of the NVIDIA GPU.
The cuSOLVER library is a high-level package based on the cuBLAS and cuSPARSE libraries. It provides useful LAPACK-like features, such as common matrix factorization and triangular solve routines for dense matrices, a sparse least-squares solver, and an eigenvalue solver.
The cuFFT library provides a high-performance implementation of the Fast Fourier Transform (FFT) algorithm on NVIDIA GPUs. The cuBLAS, cuSOLVER, and cuFFT libraries are part of the NVIDIA CUDA Toolkit.
Thrust is a C++ template library for CUDA. The Thrust library is shipped with CUDA Toolkit and allows you to take advantage of GPU-accelerated primitives such as sort to implement complex high-performance parallel applications.
The NVIDIA CUDA Deep Neural Network library (cuDNN) is a GPU-accelerated library of primitives for deep neural networks. cuDNN provides highly tuned implementations for standard routines such as forward and backward convolution, pooling, normalization, and activation layers. NVIDIA TensorRT is a high-performance deep learning inference optimizer and runtime library. For more information, see Code Generation for Deep Learning Networks by Using cuDNN and Code Generation for Deep Learning Networks by Using TensorRT.
GPU Coder does not require a special pragma to generate kernel calls to libraries. During code generation, when you select the Enable cuBLAS option in the GPU Coder app or set the config_object.GpuConfig.EnableCUBLAS = true property in the CLI, GPU Coder replaces some functionality with calls to the cuBLAS library. When you select the Enable cuSOLVER option in the GPU Coder app or set the config_object.GpuConfig.EnableCUSOLVER = true property in the CLI, GPU Coder replaces some functionality with calls to the cuSOLVER library. For GPU Coder to replace high-level math functions with library calls, the following conditions must be met:
GPU-specific library replacement must exist for these functions.
MATLAB® Coder™ data size thresholds must be satisfied.
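As a minimal sketch of the CLI workflow described above, the script below enables both library options on a GPU code configuration object and generates a MEX function. The entry-point name matMulSolve, the variable names, and the 1024-element sizes are hypothetical; the sizes are chosen large on the assumption that they exceed the data size thresholds. Running it requires GPU Coder and a supported NVIDIA GPU.

```matlab
% Assumes a file matMulSolve.m on the path (hypothetical entry point):
%
%   function [C, x] = matMulSolve(A, B, b) %#codegen
%       C = A * B;   % matrix multiply, candidate for library replacement
%       x = A \ b;   % linear solve, candidate for library replacement
%   end

% Create a GPU code configuration object for MEX generation.
cfg = coder.gpuConfig('mex');

% Allow GPU Coder to replace supported operations with library calls.
cfg.GpuConfig.EnableCUBLAS = true;
cfg.GpuConfig.EnableCUSOLVER = true;

% Example inputs that define the argument types and sizes (illustrative).
A = ones(1024, 'single');
B = ones(1024, 'single');
b = ones(1024, 1, 'single');

% Generate code for the entry-point function.
codegen -config cfg -args {A,B,b} matMulSolve
```

Whether a given call is actually replaced depends on the two conditions listed above, so inspecting the generated code or the code generation report is the way to confirm it.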
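GPU Coder supports cuFFT, cuSOLVER, and cuBLAS library replacements for the functions listed in the table. For functions that have no replacement in CUDA, GPU Coder uses portable MATLAB functions that are mapped to the GPU.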
| MATLAB Function | Description | MATLAB Coder LAPACK Support | cuBLAS, cuSOLVER, cuFFT, Thrust Support |
|---|---|---|---|
|  | Matrix multiply | Yes | Yes |
|  | Solve system of linear equations | Yes | Yes |
|  | LU matrix factorization | Yes | Yes |
|  | Orthogonal-triangular decomposition | Yes | Partial |
|  | Matrix determinant | Yes | Yes |
|  | Cholesky factorization | Yes | Yes |
|  | Reciprocal condition number | Yes | Yes |
|  | Solve system of linear equations | Yes | Yes |
|  | Eigenvalues and eigenvectors | Yes | No |
|  | Schur decomposition | Yes | No |
|  | Singular value decomposition | Yes | Partial |
|  | Fast Fourier Transform | Yes | Yes |
|  | Inverse Fast Fourier Transform | Yes | Yes |
|  | Sort array elements |  | Yes, using |
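When you select the Enable cuFFT option in the GPU Coder app or set the config_object.GpuConfig.EnableCUFFT = true property in the CLI, GPU Coder maps fft, ifft, fft2, ifft2, fftn, and ifftn function calls in your MATLAB code to the appropriate cuFFT library calls. For 2-D transforms and higher, GPU Coder creates multiple 1-D batched transforms. These batched transforms have higher performance than single transforms. GPU Coder only supports out-of-place transforms. If Enable cuFFT is not selected, GPU Coder uses C FFTW libraries where available or generates kernels from portable MATLAB FFT. Single-precision data types are supported. Input and output can be real or complex-valued, but real-valued transforms are faster. The cuFFT library supports input sizes that are typically specified as a power of 2 or as a value that can be factored into a product of small prime numbers. In general, the smaller the prime factors, the better the performance.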
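The input-size guidance can be checked before code generation. The following sketch factors a candidate transform length and falls back to the next power of 2 when the largest prime factor is not small. The cutoff of 7 and the variable names are assumptions made for illustration, not values stated by GPU Coder.

```matlab
% Hypothetical signal length; 1000 = 2^3 * 5^3.
n = 1000;

% factor returns the prime factors of n as a row vector.
primeFactors = factor(n);
largestFactor = max(primeFactors);

% Assumption: treat 2, 3, 5, and 7 as "small" primes.
smallPrimeLimit = 7;

if largestFactor > smallPrimeLimit
    % Pad up to the next power of 2 when the length factors poorly.
    nfft = 2^nextpow2(n);
else
    % The length already factors into small primes; use it unchanged.
    nfft = n;
end

fprintf('n = %d, largest prime factor = %d, FFT length = %d\n', ...
    n, largestFactor, nfft);
```

The resulting nfft can then be passed as the transform length, for example fft(x, nfft), which zero-pads x when nfft is larger than the signal length.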
Note
Using CUDA library names such as cufft, cublas, and cudnn as the names of your MATLAB functions results in code generation errors.
See Also
coder.gpu.kernel | coder.gpu.kernelfun | gpucoder.matrixMatrixKernel | coder.gpu.constantMemory | gpucoder.stencilKernel | gpucoder.sort