cuBLAS API Support Status (v1.6)
cublas
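Compared with cublas 11.9.2, the support status of the cublas APIs is shown in the table below:
Most of the APIs commonly used in NN (neural network) workloads are already supported and tuned;
Benchmarked against Ampere, every API can be supported in software, and no PG1 hardware limitation currently blocks any of them; later software releases will fill in the remaining APIs in priority order;
The current API support rate is 89/290 = 30.7%. The unsupported APIs fall mainly into the following categories:
Complex data types: 131
Special matrix types (symmetric, packed, triangular, etc.): 36
Helper functions (set/get, memcpy, etc.): 18
batched gemv: 12
Uncommon algorithms (matrix addition geam, matrix inversion matinv): 4
Because AI workloads involve neither complex types nor special matrix types, those 131 + 36 = 167 APIs can be excluded, leaving 290 - 167 = 123; the API support rate for AI workloads is therefore 89/123 = 72.4%;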
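The arithmetic behind these figures can be re-checked with a minimal sketch. The counts are copied from the summary above; the variable names are illustrative only and are not part of any PG1 or cuBLAS tooling.

```python
# Counts copied from the summary; names are hypothetical, for illustration.
TOTAL_APIS = 290
SUPPORTED = 89
UNSUPPORTED_BY_CATEGORY = {
    "complex data types": 131,
    "special matrix types": 36,
    "helper functions": 18,
    "batched gemv": 12,
    "uncommon algorithms": 4,
}

# The five categories should account for every unsupported API.
unsupported = sum(UNSUPPORTED_BY_CATEGORY.values())
assert SUPPORTED + unsupported == TOTAL_APIS

overall_rate = SUPPORTED / TOTAL_APIS

# AI-relevant subset: drop complex and special-matrix APIs from the total.
ai_total = (
    TOTAL_APIS
    - UNSUPPORTED_BY_CATEGORY["complex data types"]
    - UNSUPPORTED_BY_CATEGORY["special matrix types"]
)
ai_rate = SUPPORTED / ai_total

print(f"overall: {SUPPORTED}/{TOTAL_APIS} = {overall_rate:.1%}")
print(f"AI workloads: {SUPPORTED}/{ai_total} = {ai_rate:.1%}")
```

The category counts sum to 201, which together with the 89 supported APIs gives the stated total of 290.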
api | cublas 11.9.2 | PG1 1.5 | Description |
cublasCreate_v2 | Yes | Yes | |
cublasDestroy_v2 | Yes | Yes | |
cublasGetProperty | Yes | Yes | |
cublasSetStream_v2 | Yes | Yes | |
cublasGetStream_v2 | Yes | Yes | |
cublasGetMathMode | Yes | Yes | |
cublasSetMathMode | Yes | Yes | |
cublasGetPointerMode_v2 | Yes | Yes | |
cublasSetPointerMode_v2 | Yes | Yes | |
cublasSetWorkspace_v2 | Yes | Yes | |
cublasGetStatusString | Yes | Yes | |
cublasIamaxEx | Yes | Yes | |
cublasIsamax_v2 | Yes | Yes | |
cublasIdamax_v2 | Yes | Yes | |
cublasIaminEx | Yes | Yes | |
cublasIsamin_v2 | Yes | Yes | |
cublasIdamin_v2 | Yes | Yes | |
cublasAsumEx | Yes | Yes | |
cublasSasum_v2 | Yes | Yes | |
cublasDasum_v2 | Yes | Yes | |
cublasAxpyEx | Yes | Yes | |
cublasSaxpy_v2 | Yes | Yes | |
cublasDaxpy_v2 | Yes | Yes | |
cublasCopyEx | Yes | Yes | |
cublasScopy_v2 | Yes | Yes | |
cublasDcopy_v2 | Yes | Yes | |
cublasDotEx | Yes | Yes | |
cublasSdot_v2 | Yes | Yes | |
cublasDdot_v2 | Yes | Yes | |
cublasNrm2Ex | Yes | Yes | |
cublasSnrm2_v2 | Yes | Yes | |
cublasDnrm2_v2 | Yes | Yes | |
cublasRotEx | Yes | Yes | |
cublasSrot_v2 | Yes | Yes | |
cublasDrot_v2 | Yes | Yes | |
cublasRotgEx | Yes | Yes | |
cublasSrotg_v2 | Yes | Yes | |
cublasDrotg_v2 | Yes | Yes | |
cublasRotmEx | Yes | Yes | |
cublasSrotm_v2 | Yes | Yes | |
cublasDrotm_v2 | Yes | Yes | |
cublasRotmgEx | Yes | Yes | |
cublasSrotmg_v2 | Yes | Yes | |
cublasDrotmg_v2 | Yes | Yes | |
cublasScalEx | Yes | Yes | |
cublasSscal_v2 | Yes | Yes | |
cublasDscal_v2 | Yes | Yes | |
cublasSwapEx | Yes | Yes | |
cublasSswap_v2 | Yes | Yes | |
cublasDswap_v2 | Yes | Yes | |
cublasSgemv_v2 | Yes | Yes | |
cublasDgemv_v2 | Yes | Yes | |
cublasSgemm_v2 | Yes | Yes | |
cublasDgemm_v2 | Yes | Yes | |
cublasHgemm | Yes | Yes | |
cublasSgemmEx | Yes | Yes | |
cublasGemmEx | Yes | Yes | |
cublasHgemmBatched | Yes | Yes | |
cublasSgemmBatched | Yes | Yes | |
cublasGemmBatchedEx | Yes | Yes | |
cublasGemmStridedBatchedEx | Yes | Yes | |
cublasSgemmStridedBatched | Yes | Yes | |
cublasDgemmBatched | Yes | Yes | |
cublasDgemmStridedBatched | Yes | Yes | |
cublasHgemmStridedBatched | Yes | Yes | |
cublasSgetrfBatched | Yes | Yes | |
cublasDgetrfBatched | Yes | Yes | |
cublasSgetrsBatched | Yes | Yes | |
cublasDgetrsBatched | Yes | Yes | |
cublasSger_v2 | Yes | Yes | |
cublasDger_v2 | Yes | Yes | |
cublasSsyr_v2 | Yes | Yes | |
cublasDsyr_v2 | Yes | Yes | |
cublasSspr_v2 | Yes | Yes | |
cublasDspr_v2 | Yes | Yes | |
cublasSsyr2_v2 | Yes | Yes | |
cublasDsyr2_v2 | Yes | Yes | |
cublasSspr2_v2 | Yes | Yes | |
cublasDspr2_v2 | Yes | Yes | |
cublasStrsm_v2 | Yes | Yes | |
cublasDtrsm_v2 | Yes | Yes | |
cublasStrsmBatched | Yes | Yes | |
cublasDtrsmBatched | Yes | Yes | |
cublasSgetriBatched | Yes | Yes | |
cublasDgetriBatched | Yes | Yes | |
cublasSgeqrfBatched | Yes | Yes | |
cublasDgeqrfBatched | Yes | Yes | |
cublasSgelsBatched | Yes | Yes | |
cublasDgelsBatched | Yes | Yes | |
cublasGetVersion_v2 | Yes | No | get version |
cublasGetCudartVersion | Yes | No | |
cublasGetAtomicsMode | Yes | No | set/get the atomics mode parameter |
cublasSetAtomicsMode | Yes | No | |
cublasGetSmCountTarget | Yes | No | set/get the SM count target parameter |
cublasSetSmCountTarget | Yes | No | |
cublasGetStatusName | Yes | No | get status string |
cublasLoggerConfigure | Yes | No | configure logger callback settings |
cublasSetLoggerCallback | Yes | No | set/get the logger callback function |
cublasGetLoggerCallback | Yes | No | |
cublasSetVector | Yes | No | memcpy H2D/D2H |
cublasGetVector | Yes | No | |
cublasSetMatrix | Yes | No | |
cublasGetMatrix | Yes | No | |
cublasSetVectorAsync | Yes | No | |
cublasGetVectorAsync | Yes | No | |
cublasSetMatrixAsync | Yes | No | |
cublasGetMatrixAsync | Yes | No | |
cublasSgemvBatched | Yes | No | gemv variants: batched and StridedBatched |
cublasDgemvBatched | Yes | No | |
cublasCgemvBatched | Yes | No | |
cublasZgemvBatched | Yes | No | |
cublasHSHgemvBatched | Yes | No | |
cublasHSSgemvBatched | Yes | No | |
cublasTSTgemvBatched | Yes | No | |
cublasTSSgemvBatched | Yes | No | |
cublasSgemvStridedBatched | Yes | No | |
cublasDgemvStridedBatched | Yes | No | |
cublasCgemvStridedBatched | Yes | No | |
cublasZgemvStridedBatched | Yes | No | |
cublasHSHgemvStridedBatched | Yes | No | |
cublasHSSgemvStridedBatched | Yes | No | |
cublasTSTgemvStridedBatched | Yes | No | |
cublasTSSgemvStridedBatched | Yes | No | |
cublasScnrm2_v2 | Yes | No | Level-1 functions, complex types |
cublasDznrm2_v2 | Yes | No | |
cublasDotcEx | Yes | No | |
cublasCdotu_v2 | Yes | No | |
cublasCdotc_v2 | Yes | No | |
cublasZdotu_v2 | Yes | No | |
cublasZdotc_v2 | Yes | No | |
cublasCscal_v2 | Yes | No | |
cublasCsscal_v2 | Yes | No | |
cublasZscal_v2 | Yes | No | |
cublasZdscal_v2 | Yes | No | |
cublasCaxpy_v2 | Yes | No | |
cublasZaxpy_v2 | Yes | No | |
cublasCcopy_v2 | Yes | No | |
cublasZcopy_v2 | Yes | No | |
cublasCswap_v2 | Yes | No | |
cublasZswap_v2 | Yes | No | |
cublasIcamax_v2 | Yes | No | |
cublasIzamax_v2 | Yes | No | |
cublasIcamin_v2 | Yes | No | |
cublasIzamin_v2 | Yes | No | |
cublasScasum_v2 | Yes | No | |
cublasDzasum_v2 | Yes | No | |
cublasCrot_v2 | Yes | No | |
cublasCsrot_v2 | Yes | No | |
cublasZrot_v2 | Yes | No | |
cublasZdrot_v2 | Yes | No | |
cublasCrotg_v2 | Yes | No | |
cublasZrotg_v2 | Yes | No | |
cublasCgemv_v2 | Yes | No | gemv, complex types |
cublasZgemv_v2 | Yes | No | |
cublasCgemm_v2 | Yes | No | gemm, complex types |
cublasCgemm3m | Yes | No | |
cublasCgemm3mEx | Yes | No | |
cublasZgemm_v2 | Yes | No | |
cublasZgemm3m | Yes | No | |
cublasCgemmEx | Yes | No | |
cublasCgemmBatched | Yes | No | batched gemm, complex types |
cublasCgemm3mBatched | Yes | No | |
cublasZgemmBatched | Yes | No | |
cublasCgemmStridedBatched | Yes | No | |
cublasCgemm3mStridedBatched | Yes | No | |
cublasZgemmStridedBatched | Yes | No | |
cublasCgetrfBatched | Yes | No | getrf, complex types |
cublasZgetrfBatched | Yes | No | |
cublasCgetrsBatched | Yes | No | getrs, complex types |
cublasZgetrsBatched | Yes | No | |
cublasSgbmv_v2 | Yes | No | gbmv(banded matrix-vector mul) |
cublasDgbmv_v2 | Yes | No | |
cublasCgbmv_v2 | Yes | No | |
cublasZgbmv_v2 | Yes | No | |
cublasStrmv_v2 | Yes | No | trmv(triangular matrix-vector mul) |
cublasDtrmv_v2 | Yes | No | |
cublasCtrmv_v2 | Yes | No | |
cublasZtrmv_v2 | Yes | No | |
cublasStbmv_v2 | Yes | No | tbmv(triangular banded matrix-vector mul) |
cublasDtbmv_v2 | Yes | No | |
cublasCtbmv_v2 | Yes | No | |
cublasZtbmv_v2 | Yes | No | |
cublasStpmv_v2 | Yes | No | tpmv(triangular packed matrix-vector mul) |
cublasDtpmv_v2 | Yes | No | |
cublasCtpmv_v2 | Yes | No | |
cublasZtpmv_v2 | Yes | No | |
cublasStrsv_v2 | Yes | No | trsv(solves triangular linear system with a single right-hand-side) |
cublasDtrsv_v2 | Yes | No | |
cublasCtrsv_v2 | Yes | No | |
cublasZtrsv_v2 | Yes | No | |
cublasStpsv_v2 | Yes | No | tpsv(solves the packed triangular linear system with a single right-hand-side) |
cublasDtpsv_v2 | Yes | No | |
cublasCtpsv_v2 | Yes | No | |
cublasZtpsv_v2 | Yes | No | |
cublasStbsv_v2 | Yes | No | tbsv(solves triangular banded linear system with a single right-hand-side) |
cublasDtbsv_v2 | Yes | No | |
cublasCtbsv_v2 | Yes | No | |
cublasZtbsv_v2 | Yes | No | |
cublasSsymv_v2 | Yes | No | symv(symmetric matrix-vector mul) |
cublasDsymv_v2 | Yes | No | |
cublasCsymv_v2 | Yes | No | |
cublasZsymv_v2 | Yes | No | |
cublasChemv_v2 | Yes | No | hemv(Hermitian matrix-vector mul) |
cublasZhemv_v2 | Yes | No | |
cublasSsbmv_v2 | Yes | No | sbmv(symmetric banded matrix-vector mul) |
cublasDsbmv_v2 | Yes | No | |
cublasChbmv_v2 | Yes | No | hbmv(Hermitian banded matrix-vector mul) |
cublasZhbmv_v2 | Yes | No | |
cublasSspmv_v2 | Yes | No | spmv(symmetric packed matrix-vector mul) |
cublasDspmv_v2 | Yes | No | |
cublasChpmv_v2 | Yes | No | hpmv(Hermitian packed matrix-vector mul) |
cublasZhpmv_v2 | Yes | No | |
cublasCgeru_v2 | Yes | No | ger(general rank-1 update), complex types |
cublasCgerc_v2 | Yes | No | |
cublasZgeru_v2 | Yes | No | |
cublasZgerc_v2 | Yes | No | |
cublasCsyr_v2 | Yes | No | syr(symmetric rank-1 update), complex types |
cublasZsyr_v2 | Yes | No | |
cublasCher_v2 | Yes | No | her(Hermitian rank-1 update) |
cublasZher_v2 | Yes | No | |
cublasChpr_v2 | Yes | No | hpr(packed Hermitian rank-1 update) |
cublasZhpr_v2 | Yes | No | |
cublasCsyr2_v2 | Yes | No | syr2(symmetric rank-2 update), complex types |
cublasZsyr2_v2 | Yes | No | |
cublasCher2_v2 | Yes | No | her2(Hermitian rank-2 update) |
cublasZher2_v2 | Yes | No | |
cublasChpr2_v2 | Yes | No | hpr2(packed Hermitian rank-2 update) |
cublasZhpr2_v2 | Yes | No | |
cublasSsyrk_v2 | Yes | No | syrk(symmetric rank-k update) |
cublasDsyrk_v2 | Yes | No | |
cublasCsyrk_v2 | Yes | No | |
cublasZsyrk_v2 | Yes | No | |
cublasCsyrkEx | Yes | No | |
cublasCsyrk3mEx | Yes | No | |
cublasCherk_v2 | Yes | No | herk(Hermitian rank-k update) |
cublasZherk_v2 | Yes | No | |
cublasCherkEx | Yes | No | |
cublasCherk3mEx | Yes | No | |
cublasSsyr2k_v2 | Yes | No | syr2k(symmetric rank-2k update) |
cublasDsyr2k_v2 | Yes | No | |
cublasCsyr2k_v2 | Yes | No | |
cublasZsyr2k_v2 | Yes | No | |
cublasCher2k_v2 | Yes | No | her2k(Hermitian rank-2k update) |
cublasZher2k_v2 | Yes | No | |
cublasSsyrkx | Yes | No | syrkx(variation of symmetric rank-k update) |
cublasDsyrkx | Yes | No | |
cublasCsyrkx | Yes | No | |
cublasZsyrkx | Yes | No | |
cublasCherkx | Yes | No | herkx(variation of Hermitian rank-k update) |
cublasZherkx | Yes | No | |
cublasSsymm_v2 | Yes | No | symm(symmetric matrix-matrix mul) |
cublasDsymm_v2 | Yes | No | |
cublasCsymm_v2 | Yes | No | |
cublasZsymm_v2 | Yes | No | |
cublasChemm_v2 | Yes | No | hemm(Hermitian matrix-matrix mul) |
cublasZhemm_v2 | Yes | No | |
cublasCtrsm_v2 | Yes | No | trsm(solves triangular linear system with multiple right-hand-sides), complex types |
cublasZtrsm_v2 | Yes | No | |
cublasStrmm_v2 | Yes | No | trmm(triangular matrix-matrix mul) |
cublasDtrmm_v2 | Yes | No | |
cublasCtrmm_v2 | Yes | No | |
cublasZtrmm_v2 | Yes | No | |
cublasSgeam | Yes | No | geam(matrix-matrix addition) |
cublasDgeam | Yes | No | |
cublasCgeam | Yes | No | |
cublasZgeam | Yes | No | |
cublasCgetriBatched | Yes | No | getri, complex types |
cublasZgetriBatched | Yes | No | |
cublasCtrsmBatched | Yes | No | trsmBatched, complex types |
cublasZtrsmBatched | Yes | No | |
cublasSmatinvBatched | Yes | No | the inversion of batched matrices |
cublasDmatinvBatched | Yes | No | |
cublasCmatinvBatched | Yes | No | |
cublasZmatinvBatched | Yes | No | |
cublasCgeqrfBatched | Yes | No | geqrfBatched(QR factorization), complex types |
cublasZgeqrfBatched | Yes | No | |
cublasCgelsBatched | Yes | No | gelsBatched(least squares problem), complex types |
cublasZgelsBatched | Yes | No | |
cublasSdgmm | Yes | No | dgmm(diagonal matrix-matrix mul) |
cublasDdgmm | Yes | No | |
cublasCdgmm | Yes | No | |
cublasZdgmm | Yes | No | |
cublasStpttr | Yes | No | tpttr(triangular packed to triangular) / trttp(triangular to triangular packed) |
cublasDtpttr | Yes | No | |
cublasCtpttr | Yes | No | |
cublasZtpttr | Yes | No | |
cublasStrttp | Yes | No | |
cublasDtrttp | Yes | No | |
cublasCtrttp | Yes | No | |
cublasZtrttp | Yes | No | |
cublasLt
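Compared with cublasLt 11.9.2, the support status of the cublasLt APIs is shown in the table below:
Most of the APIs commonly used in NN (neural network) workloads are already supported and tuned;
Benchmarked against Ampere, every API can be supported in software, and no PG1 hardware limitation currently blocks any of them; later software releases will fill in the remaining APIs in priority order;
The current API support rate is 31/42 = 73.8%;
api | cublasLt 11.9.2 | PG1 1.5 | Description |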
cublasLtCreate | Yes | Yes | |
cublasLtDestroy | Yes | Yes | |
cublasLtMatmul | Yes | Yes | |
cublasLtMatrixTransform | Yes | Yes | |
cublasLtMatrixLayoutInit_internal | Yes | Yes | |
cublasLtMatrixLayoutCreate | Yes | Yes | |
cublasLtMatrixLayoutDestroy | Yes | Yes | |
cublasLtMatrixLayoutSetAttribute | Yes | Yes | |
cublasLtMatrixLayoutGetAttribute | Yes | Yes | |
cublasLtMatmulDescInit_internal | Yes | Yes | |
cublasLtMatmulDescCreate | Yes | Yes | |
cublasLtMatmulDescDestroy | Yes | Yes | |
cublasLtMatmulDescSetAttribute | Yes | Yes | |
cublasLtMatmulDescGetAttribute | Yes | Yes | |
cublasLtMatrixTransformDescInit_internal | Yes | Yes | |
cublasLtMatrixTransformDescCreate | Yes | Yes | |
cublasLtMatrixTransformDescDestroy | Yes | Yes | |
cublasLtMatrixTransformDescSetAttribute | Yes | Yes | |
cublasLtMatrixTransformDescGetAttribute | Yes | Yes | |
cublasLtMatmulPreferenceInit_internal | Yes | Yes | |
cublasLtMatmulPreferenceCreate | Yes | Yes | |
cublasLtMatmulPreferenceDestroy | Yes | Yes | |
cublasLtMatmulPreferenceSetAttribute | Yes | Yes | |
cublasLtMatmulPreferenceGetAttribute | Yes | Yes | |
cublasLtMatmulAlgoGetHeuristic | Yes | Yes | |
cublasLtMatmulAlgoGetIds | Yes | Yes | |
cublasLtMatmulAlgoInit | Yes | Yes | |
cublasLtMatmulAlgoCheck | Yes | Yes | |
cublasLtMatmulAlgoCapGetAttribute | Yes | Yes | |
cublasLtMatmulAlgoConfigSetAttribute | Yes | Yes | |
cublasLtMatmulAlgoConfigGetAttribute | Yes | Yes | |
cublasLtGetVersion | Yes | No | get version information |
cublasLtGetCudartVersion | Yes | No | |
cublasLtGetProperty | Yes | No | |
cublasLtGetStatusName | Yes | No | get status string |
cublasLtGetStatusString | Yes | No | |
cublasLtLoggerSetCallback | Yes | No | logger control options |
cublasLtLoggerSetFile | Yes | No | |
cublasLtLoggerOpenFile | Yes | No | |
cublasLtLoggerSetLevel | Yes | No | |
cublasLtLoggerSetMask | Yes | No | |
cublasLtLoggerForceDisable | Yes | No | |
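The support ratios quoted for both tables could in principle be recomputed from the rows themselves. The following is a minimal sketch under stated assumptions: the table is available as pipe-delimited text with the PG1 status in the third column, and `support_rate` and `SAMPLE` are hypothetical names introduced here for illustration. `SAMPLE` holds only four rows taken from the cublasLt table, not the full table.

```python
def support_rate(table_text):
    """Return (supported, total) for a pipe-delimited status table."""
    supported = 0
    total = 0
    for line in table_text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Skip the header row and anything that is not an API row.
        if len(cells) < 3 or not cells[0].startswith("cublas"):
            continue
        total += 1
        # Third column is the PG1 status ("Yes" or "No").
        if cells[2] == "Yes":
            supported += 1
    return supported, total


# Illustrative four-row excerpt, not the complete table.
SAMPLE = """
api | cublasLt 11.9.2 | PG1 1.5 | Description |
cublasLtCreate | Yes | Yes | |
cublasLtMatmul | Yes | Yes | |
cublasLtGetVersion | Yes | No | get version information |
cublasLtLoggerForceDisable | Yes | No | |
"""

supported, total = support_rate(SAMPLE)
print(f"{supported}/{total} = {supported / total:.1%}")
```

Run against the full tables instead of the excerpt, the same function would be expected to give 89 of 290 for cublas and 31 of 42 for cublasLt, matching the figures stated in each section.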