Describe the bug
When running onnxruntime-genai-cuda, the CUDA provider fails to load because onnxruntime_providers_cuda.dll depends on cublasLt64_12.dll, which ships only with CUDA 12.
File "D:\Venna\main.py", line 76, in <module>
main()
File "D:\Venna\main.py", line 27, in main
model = og.Model(config)
^^^^^^^^^^^^^^^^
RuntimeError: E:\_work\1\s\onnxruntime\core\session\provider_bridge_ort.cc:1844 onnxruntime::ProviderLibrary::Get [ONNXRuntimeError] : 1 : FAIL : Error loading "D:\Venna\.venv\Lib\site-packages\onnxruntime\capi\onnxruntime_providers_cuda.dll" which depends on "cublasLt64_12.dll" which is missing. (Error 126: "The specified module could not be found.")
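To confirm which directories, if any, actually contain the DLL named in the error, a small check like the following can help. It is only a sketch: the helper name `find_dll` is made up for illustration, and it scans the `PATH` entries only, which is just one of the places Windows looks when resolving DLL dependencies.

```python
import os
from pathlib import Path


def find_dll(dll_name, search_dirs=None):
    """Return every directory in search_dirs that contains dll_name.

    search_dirs defaults to the entries of the PATH environment variable.
    This is a hypothetical helper, not part of onnxruntime or onnxruntime-genai.
    """
    if search_dirs is None:
        search_dirs = os.environ.get("PATH", "").split(os.pathsep)
    hits = []
    for entry in search_dirs:
        # Skip empty PATH entries; keep only directories holding the file.
        if entry and (Path(entry) / dll_name).is_file():
            hits.append(str(Path(entry)))
    return hits


if __name__ == "__main__":
    name = "cublasLt64_12.dll"  # the dependency named in the error above
    locations = find_dll(name)
    if locations:
        print(name, "found in:", locations)
    else:
        print(name, "was not found in any PATH directory")
```

An empty result on a machine that has only the CUDA 13 toolkit installed would be consistent with the Error 126 shown above.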
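I then tried building onnxruntime-genai from source against CUDA 13, but the build fails while compiling the CUDA sources: the CUDA 13 header curand_poisson.h triggers deprecation warning C4996 ('double4::x': use double4_16a or double4_32a), which is treated as an error (C2220).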
.\build.bat --use_cuda --config Release --skip_examples
2025-11-16 00:15:00,435 util.run [INFO] - Running subprocess in 'D:\onnxruntime-genai'
'D:\Program Files\Microsoft Visual Studio\18\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.EXE' -G 'Visual Studio 17 2022' -T 'cuda=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0' -DCMAKE_BUILD_TYPE=Release -S 'D:\onnxruntime-genai' -B 'D:\onnxruntime-genai\build\Windows\Release' -DCMAKE_POSITION_INDEPENDENT_CODE=ON -DUSE_CUDA=ON -DUSE_TRT_RTX=OFF -DUSE_ROCM=OFF -DUSE_DML=OFF -DENABLE_JAVA=OFF -DBUILD_WHEEL=ON -DUSE_GUIDANCE=OFF -DPUBLISH_JAVA_MAVEN_LOCAL=OFF '-DCMAKE_CUDA_COMPILER=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\nvcc'
-- Selecting Windows SDK version 10.0.26100.0 to target Windows 10.0.26200.
-- CET shadow stack enabled (/CETCOMPAT).
-- Using ONNX Runtime package Microsoft.ML.OnnxRuntime.Gpu.Windows version 1.22.0
-- ORT_HEADER_DIR: D:/onnxruntime-genai/build/Windows/Release/_deps/ortlib-src/buildTransitive/native/include
-- ORT_LIB_DIR: D:/onnxruntime-genai/build/Windows/Release/_deps/ortlib-src/runtimes/win-x64/native
Loading Dependencies URLs ...
Loading Dependencies ...
CMake Deprecation Warning at build/Windows/Release/_deps/pybind11_project-src/CMakeLists.txt:13 (cmake_minimum_required):
Compatibility with CMake < 3.10 will be removed from a future version of
CMake.
Update the VERSION argument <min> value. Or, use the <min>...<max> syntax
to tell CMake that the project requires at least <min> but has been updated
to work with policies introduced by <max> or earlier.
-- pybind11 v2.13.6
-- _STATIC_MSVC_RUNTIME_LIBRARY: OFF
-- Fetch json
-- Tokenizer needed.
-- CMAKE_CUDA_COMPILER_VERSION: 13.0.88
CMake Deprecation Warning at cmake/check_cuda.cmake:20 (cmake_policy):
The OLD behavior for policy CMP0104 will be removed from a future version
of CMake.
The cmake-policies(7) manual explains that the OLD behaviors of all
policies are deprecated and that a policy should be set to OLD only under
specific short-term circumstances. Projects should be ported to the NEW
behavior and not rely on setting a policy to OLD.
Call Stack (most recent call first):
CMakeLists.txt:102 (include)
-- CMAKE_CUDA_COMPILER_VERSION: 13.0.88
------------------Enabling tests------------------
-- Including CUDA kernel tests in the build.
-- Enable STABLE_TOPK in CUDA kernel tests.
------------------Enabling Python Wheel------------------
Setting up wheel files in : D:/onnxruntime-genai/build/Windows/Release/wheel
------------------Enabling model benchmark------------------
-- Configuring done (9.9s)
-- Generating done (0.2s)
-- Build files have been written to: D:/onnxruntime-genai/build/Windows/Release
2025-11-16 00:15:10,752 util.run [INFO] - Running subprocess in 'D:\onnxruntime-genai'
'D:\Program Files\Microsoft Visual Studio\18\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.EXE' --build 'D:\onnxruntime-genai\build\Windows\Release' --config Release
MSBuild version 18.0.2+995a3dce4 for .NET Framework
Checking File Globs
noexcep_operators.vcxproj -> D:\onnxruntime-genai\build\Windows\Release\lib\Release\noexcep_operators.lib
ocos_operators.vcxproj -> D:\onnxruntime-genai\build\Windows\Release\lib\Release\ocos_operators.lib
Compiling CUDA source file ..\..\..\src\cuda\beam_search_scorer_cuda.cu...
D:\onnxruntime-genai\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\nvcc.exe" --use-local-env -ccbin "D:\Program Files\Microsoft Visual Studio\18\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cccl" -I"D:\onnxruntime-genai\build\Windows\Release\_deps\ortlib-src\buildTransitive\native\include" -I"D:\onnxruntime-genai\src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" --keep-dir onnxrunt.A5C0B7CE\x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -allow-unsupported-compiler -Xcudafe --diag_suppress=2803 --expt-relaxed-constexpr -std=c++17 --generate-code=arch=compute_75,code=[compute_75,sm_75] -diag-suppress=1650 -diag-suppress=221 -Wno-deprecated-gpu-targets -Xcompiler="/EHsc -Ob2 /WX" -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -D_WINDLL -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS /MD " -Xcompiler "/Fdonnxruntime-genai-cuda.dir\Release\vc143.pdb" -o onnxruntime-genai-cuda.dir\Release\/src/cuda/beam_search_scorer_cuda.cu.obj "D:\onnxruntime-genai\src\cuda\beam_search_scorer_cuda.cu"
beam_search_scorer_cuda.cu
tmpxft_00008450_00000000-7_beam_search_scorer_cuda.cudafe1.cpp
Compiling CUDA source file ..\..\..\src\cuda\beam_search_topk.cu...
D:\onnxruntime-genai\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\nvcc.exe" --use-local-env -ccbin "D:\Program Files\Microsoft Visual Studio\18\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cccl" -I"D:\onnxruntime-genai\build\Windows\Release\_deps\ortlib-src\buildTransitive\native\include" -I"D:\onnxruntime-genai\src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" --keep-dir onnxrunt.A5C0B7CE\x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -allow-unsupported-compiler -Xcudafe --diag_suppress=2803 --expt-relaxed-constexpr -std=c++17 --generate-code=arch=compute_75,code=[compute_75,sm_75] -diag-suppress=1650 -diag-suppress=221 -Wno-deprecated-gpu-targets -Xcompiler="/EHsc -Ob2 /WX" -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -D_WINDLL -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS /MD " -Xcompiler "/Fdonnxruntime-genai-cuda.dir\Release\vc143.pdb" -o onnxruntime-genai-cuda.dir\Release\beam_search_topk.obj "D:\onnxruntime-genai\src\cuda\beam_search_topk.cu"
beam_search_topk.cu
tmpxft_000012ac_00000000-7_beam_search_topk.cudafe1.cpp
Compiling CUDA source file ..\..\..\src\cuda\cuda_sampling.cu...
D:\onnxruntime-genai\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\nvcc.exe" --use-local-env -ccbin "D:\Program Files\Microsoft Visual Studio\18\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cccl" -I"D:\onnxruntime-genai\build\Windows\Release\_deps\ortlib-src\buildTransitive\native\include" -I"D:\onnxruntime-genai\src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" --keep-dir onnxrunt.A5C0B7CE\x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -allow-unsupported-compiler -Xcudafe --diag_suppress=2803 --expt-relaxed-constexpr -std=c++17 --generate-code=arch=compute_75,code=[compute_75,sm_75] -diag-suppress=1650 -diag-suppress=221 -Wno-deprecated-gpu-targets -Xcompiler="/EHsc -Ob2 /WX" -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -D_WINDLL -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS /MD " -Xcompiler "/Fdonnxruntime-genai-cuda.dir\Release\vc143.pdb" -o onnxruntime-genai-cuda.dir\Release\cuda_sampling.obj "D:\onnxruntime-genai\src\cuda\cuda_sampling.cu"
cuda_sampling.cu
tmpxft_00000ab8_00000000-7_cuda_sampling.cudafe1.cpp
Compiling CUDA source file ..\..\..\src\cuda\cuda_topk.cu...
D:\onnxruntime-genai\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\nvcc.exe" --use-local-env -ccbin "D:\Program Files\Microsoft Visual Studio\18\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cccl" -I"D:\onnxruntime-genai\build\Windows\Release\_deps\ortlib-src\buildTransitive\native\include" -I"D:\onnxruntime-genai\src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" --keep-dir onnxrunt.A5C0B7CE\x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -allow-unsupported-compiler -Xcudafe --diag_suppress=2803 --expt-relaxed-constexpr -std=c++17 --generate-code=arch=compute_75,code=[compute_75,sm_75] -diag-suppress=1650 -diag-suppress=221 -Wno-deprecated-gpu-targets -Xcompiler="/EHsc -Ob2 /WX" -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -D_WINDLL -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS /MD " -Xcompiler "/Fdonnxruntime-genai-cuda.dir\Release\vc143.pdb" -o onnxruntime-genai-cuda.dir\Release\cuda_topk.obj "D:\onnxruntime-genai\src\cuda\cuda_topk.cu"
cuda_topk.cu
tmpxft_00005db4_00000000-7_cuda_topk.cudafe1.cpp
Compiling CUDA source file ..\..\..\src\cuda\model_kernels.cu...
D:\onnxruntime-genai\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\nvcc.exe" --use-local-env -ccbin "D:\Program Files\Microsoft Visual Studio\18\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cccl" -I"D:\onnxruntime-genai\build\Windows\Release\_deps\ortlib-src\buildTransitive\native\include" -I"D:\onnxruntime-genai\src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" --keep-dir onnxrunt.A5C0B7CE\x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -allow-unsupported-compiler -Xcudafe --diag_suppress=2803 --expt-relaxed-constexpr -std=c++17 --generate-code=arch=compute_75,code=[compute_75,sm_75] -diag-suppress=1650 -diag-suppress=221 -Wno-deprecated-gpu-targets -Xcompiler="/EHsc -Ob2 /WX" -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -D_WINDLL -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS /MD " -Xcompiler "/Fdonnxruntime-genai-cuda.dir\Release\vc143.pdb" -o onnxruntime-genai-cuda.dir\Release\model_kernels.obj "D:\onnxruntime-genai\src\cuda\model_kernels.cu"
model_kernels.cu
tmpxft_00006ed8_00000000-7_model_kernels.cudafe1.cpp
Compiling CUDA source file ..\..\..\src\cuda\search_cuda.cu...
D:\onnxruntime-genai\build\Windows\Release>"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\bin\nvcc.exe" --use-local-env -ccbin "D:\Program Files\Microsoft Visual Studio\18\Community\VC\Tools\MSVC\14.44.35207\bin\HostX64\x64" -x cu -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\cccl" -I"D:\onnxruntime-genai\build\Windows\Release\_deps\ortlib-src\buildTransitive\native\include" -I"D:\onnxruntime-genai\src" -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include" --keep-dir onnxrunt.A5C0B7CE\x64\Release -maxrregcount=0 --machine 64 --compile -cudart static -allow-unsupported-compiler -Xcudafe --diag_suppress=2803 --expt-relaxed-constexpr -std=c++17 --generate-code=arch=compute_75,code=[compute_75,sm_75] -diag-suppress=1650 -diag-suppress=221 -Wno-deprecated-gpu-targets -Xcompiler="/EHsc -Ob2 /WX" -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -D_WINDLL -D_MBCS -DWIN32 -D_WINDOWS -DNDEBUG -D_DISABLE_CONSTEXPR_MUTEX_CONSTRUCTOR -DUSE_WINML=0 -DUSE_CUDA=1 -DUSE_ROCM=0 -DUSE_DML=0 -DUSE_CXX17=1 -DBUILDING_ORT_GENAI_C -DUSE_GUIDANCE=0 -DTEST_PHI2=0 -D"CMAKE_INTDIR=\"Release\"" -Donnxruntime_genai_cuda_EXPORTS -Xcompiler "/EHsc /W4 /nologo /O2 /FS /MD " -Xcompiler "/Fdonnxruntime-genai-cuda.dir\Release\vc143.pdb" -o onnxruntime-genai-cuda.dir\Release\/src/cuda/search_cuda.cu.obj "D:\onnxruntime-genai\src\cuda\search_cuda.cu"
search_cuda.cu
tmpxft_00001388_00000000-7_search_cuda.cudafe1.cpp
beam_search_scorer_cuda.cpp
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(649,55): error C2220: the following warning is treated as an error [D:\onnxruntime-genai\build\Windows\Release\onnxruntime-genai-cuda.vcxproj]
(compiling source file '../../../src/cuda/beam_search_scorer_cuda.cpp')
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(649,55): warning C4996: 'double4::x': use double4_16a or double4_32a [D:\onnxruntime-genai\build\Windows\Release\onnxruntime-genai-cuda.vcxproj]
(compiling source file '../../../src/cuda/beam_search_scorer_cuda.cpp')
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(650,55): warning C4996: 'double4::y': use double4_16a or double4_32a [D:\onnxruntime-genai\build\Windows\Release\onnxruntime-genai-cuda.vcxproj]
(compiling source file '../../../src/cuda/beam_search_scorer_cuda.cpp')
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(651,55): warning C4996: 'double4::z': use double4_16a or double4_32a [D:\onnxruntime-genai\build\Windows\Release\onnxruntime-genai-cuda.vcxproj]
(compiling source file '../../../src/cuda/beam_search_scorer_cuda.cpp')
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(652,55): warning C4996: 'double4::w': use double4_16a or double4_32a [D:\onnxruntime-genai\build\Windows\Release\onnxruntime-genai-cuda.vcxproj]
(compiling source file '../../../src/cuda/beam_search_scorer_cuda.cpp')
ortcustomops.vcxproj -> D:\onnxruntime-genai\build\Windows\Release\lib\Release\ortcustomops.lib
Traceback (most recent call last):
File "D:\onnxruntime-genai\\build.py", line 793, in <module>
build(arguments, environment)
File "D:\onnxruntime-genai\\build.py", line 650, in build
util.run(make_command, env=env)
File "D:\onnxruntime-genai\tools\python\util\run.py", line 56, in run
completed_process = subprocess.run(
^^^^^^^^^^^^^^^
File "C:\Users\Lenovo\miniforge3\Lib\subprocess.py", line 571, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['D:\\Program Files\\Microsoft Visual Studio\\18\\Community\\Common7\\IDE\\CommonExtensions\\Microsoft\\CMake\\CMake\\bin\\cmake.EXE', '--build', 'D:\\onnxruntime-genai\\build\\Windows\\Release', '--config', 'Release']' returned non-zero exit status 1.
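The only diagnostics in the log above are the C4996 deprecation warnings from curand_poisson.h and the C2220 error that promotes them, which matches the /WX flag visible in the nvcc host-compiler options. For triaging a longer build log, the MSVC-style diagnostic lines can be grouped with a short script. This is a sketch: the names `DIAG_RE` and `summarize_diagnostics` are hypothetical, and the sample text is copied from the log above.

```python
import ntpath
import re
from collections import Counter

# Matches MSVC-style diagnostics such as:
#   path\file.h(649,55): warning C4996: message [project.vcxproj]
DIAG_RE = re.compile(
    r"^(?P<file>.+?)\((?P<line>\d+)(?:,(?P<col>\d+))?\):\s+"
    r"(?P<kind>warning|error)\s+(?P<code>[A-Z]+\d+):\s+(?P<msg>.*)$"
)

SAMPLE_LOG = r"""
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(649,55): error C2220: the following warning is treated as an error [D:\onnxruntime-genai\build\Windows\Release\onnxruntime-genai-cuda.vcxproj]
(compiling source file '../../../src/cuda/beam_search_scorer_cuda.cpp')
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(649,55): warning C4996: 'double4::x': use double4_16a or double4_32a [D:\onnxruntime-genai\build\Windows\Release\onnxruntime-genai-cuda.vcxproj]
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v13.0\include\curand_poisson.h(650,55): warning C4996: 'double4::y': use double4_16a or double4_32a [D:\onnxruntime-genai\build\Windows\Release\onnxruntime-genai-cuda.vcxproj]
"""


def summarize_diagnostics(log_text):
    """Count diagnostics by (kind, code, header/source file name)."""
    counts = Counter()
    for raw in log_text.splitlines():
        match = DIAG_RE.match(raw.strip())
        if match:
            # ntpath handles Windows-style paths on any host OS.
            name = ntpath.basename(match.group("file"))
            counts[(match.group("kind"), match.group("code"), name)] += 1
    return counts


if __name__ == "__main__":
    for (kind, code, name), n in sorted(summarize_diagnostics(SAMPLE_LOG).items()):
        print(f"{kind} {code} in {name}: {n}")
```

Grouping this way makes it easy to see that every failure traces back to a single CUDA toolkit header rather than to the project's own sources.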
Desktop (please complete the following information):
- OS: Windows
- Version: 11
nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Aug_20_13:58:20_Pacific_Daylight_Time_2025
Cuda compilation tools, release 13.0, V13.0.88
Build cuda_13.0.r13.0/compiler.36424714_0
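The mismatch behind the first error can be made explicit by comparing the toolkit release reported by nvcc with the version suffix in the DLL name. This is a rough sketch: `nvcc_release` and `dll_major` are hypothetical helpers, and reading the `_12` suffix as the CUDA major version is an assumption based on the error message and the description above.

```python
import re

# Output of `nvcc --version` as pasted in this report.
NVCC_VERSION_OUTPUT = """\
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2025 NVIDIA Corporation
Built on Wed_Aug_20_13:58:20_Pacific_Daylight_Time_2025
Cuda compilation tools, release 13.0, V13.0.88
Build cuda_13.0.r13.0/compiler.36424714_0
"""


def nvcc_release(version_output):
    """Extract (major, minor) from the 'release X.Y' part of nvcc's output."""
    match = re.search(r"release (\d+)\.(\d+)", version_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def dll_major(dll_name):
    """Extract the numeric suffix of names like 'cublasLt64_12.dll'.

    Assumption: the suffix is the CUDA major version the DLL belongs to.
    """
    match = re.search(r"_(\d+)\.dll$", dll_name, re.IGNORECASE)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    installed = nvcc_release(NVCC_VERSION_OUTPUT)
    required = dll_major("cublasLt64_12.dll")
    print("installed toolkit release:", installed)
    print("major version required by the provider DLL:", required)
    if installed and required and installed[0] != required:
        print("mismatch: the prebuilt provider expects a different CUDA major version")
```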