Deploy High-Performance AI Models in Windows Applications on NVIDIA RTX AI PCs

“Today, Microsoft is making Windows ML available to developers. Windows ML enables C#, C++, and Python developers to optimally run AI models locally across PC hardware, from CPUs and NPUs to GPUs. On NVIDIA RTX GPUs, it uses the NVIDIA TensorRT for RTX Execution Provider (EP), which leverages the GPU’s Tensor Cores and architectural advances such as FP8 and FP4 to deliver the fastest AI inference performance on Windows-based RTX AI PCs.
“Windows ML unlocks full TensorRT acceleration for GeForce RTX and RTX Pro GPUs, delivering exceptional AI performance on Windows 11,” said Logan Iyer, VP, Distinguished Engineer, Windows Platform and Developer. “We’re excited it’s generally available for developers today to build and deploy powerful AI experiences at scale.”
Windows ML is built upon the ONNX Runtime APIs for inferencing. It extends those APIs to handle dynamic initialization and dependency management of execution providers across the CPU, NPU, and GPU hardware on a PC. In addition, Windows ML automatically downloads the necessary execution provider on demand, removing the need for app developers to manage dependencies and packages across multiple hardware vendors…”
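The excerpt describes a runtime that picks the best available execution provider for the hardware it finds and falls back when an accelerator is absent. As a rough illustration of that fallback idea only, here is a toy Python sketch; the provider names are modeled loosely on ONNX Runtime conventions, and the selection logic is a simulation, not the actual Windows ML or ONNX Runtime API:

```python
# Toy sketch of execution-provider (EP) fallback, NOT the real Windows ML API.
# A runtime keeps a preference-ordered list of EPs and uses the first one
# that is actually installed/available on the current PC.

PREFERENCE = [
    "TensorRTForRTXExecutionProvider",  # illustrative name for the RTX GPU EP
    "NPUExecutionProvider",             # illustrative name for an NPU EP
    "CPUExecutionProvider",             # universal fallback
]

def select_provider(available: set[str]) -> str:
    """Return the highest-priority provider present in `available`."""
    for ep in PREFERENCE:
        if ep in available:
            return ep
    raise RuntimeError("no usable execution provider found")

# On an RTX AI PC the GPU EP wins; on a plain machine we fall back to CPU.
print(select_provider({"CPUExecutionProvider", "TensorRTForRTXExecutionProvider"}))
print(select_provider({"CPUExecutionProvider"}))
```

In the real system, Windows ML additionally downloads the chosen provider’s package on demand, so an application ships without bundling per-vendor binaries.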
Source: developer.nvidia.com/blog/deploy-ai-models-faster-with-windows-ml-on-rtx-pcs