Introduced support for the Maxwell architecture (sm_50). More information on Maxwell can be found at https://developer.nvidia.com/maxwell-compute-architecture. Although the CUDA Toolkit supports developing applications targeted to sm_50, the driver bundled with the CUDA installer does not. Users will need to obtain a driver compatible with the Maxwell architecture from http://www.nvidia.com/drivers.
Unified Memory is a new feature that enables a type of memory that can be accessed by both the CPU and the GPU without explicit copying between the two. In the software APIs, this is called "managed memory." Managed memory is automatically migrated to the physical memory attached to the processor that is accessing it. This migration provides high-performance access from either processor, unlike "zero-copy" memory, where all accesses come out of CPU system memory.
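As a minimal sketch of the managed-memory API (the kernel and sizes below are illustrative), a buffer allocated with cudaMallocManaged() can be written by a kernel and then read directly on the host after synchronization:

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void increment(int *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            data[i] += 1;   // GPU writes managed memory directly
    }

    int main(void)
    {
        const int n = 256;
        int *data = NULL;

        // One allocation visible to both CPU and GPU; no explicit copies.
        cudaMallocManaged(&data, n * sizeof(int));

        for (int i = 0; i < n; ++i)   // CPU initializes through the same pointer
            data[i] = i;

        increment<<<(n + 127) / 128, 128>>>(data, n);
        cudaDeviceSynchronize();      // required before the CPU touches the data again

        printf("data[0] = %d, data[%d] = %d\n", data[0], n - 1, data[n - 1]);

        cudaFree(data);
        return 0;
    }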
Added a standalone header library for calculating occupancy (the library is not dependent on the CUDA Runtime or CUDA Driver APIs). The header library provides a programmatic interface for the occupancy calculations previously contained in the CUDA Occupancy Calculator. This library is currently in beta status. The interface and implementation are subject to change.
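As an illustration of the kind of calculation the occupancy library performs (this sketch does not use the beta header itself; the per-SM limits and kernel resource figures below are illustrative values only, and allocation granularity is ignored), active blocks per multiprocessor can be estimated from the per-SM thread, register, and shared-memory limits:

    #include <stdio.h>

    /* Illustrative per-SM limits (roughly sm_50-class); query the real values
     * with cudaGetDeviceProperties. */
    #define MAX_THREADS_PER_SM   2048
    #define MAX_BLOCKS_PER_SM    32
    #define REGISTERS_PER_SM     65536
    #define SHARED_MEM_PER_SM    65536   /* bytes */

    int main(void)
    {
        /* Example kernel resource usage, as reported by nvcc --ptxas-options=-v. */
        int block_size      = 256;
        int regs_per_thread = 32;
        int smem_per_block  = 4096;     /* bytes */

        /* Each resource imposes its own cap on resident blocks per SM. */
        int by_threads = MAX_THREADS_PER_SM / block_size;
        int by_regs    = REGISTERS_PER_SM / (regs_per_thread * block_size);
        int by_smem    = smem_per_block ? SHARED_MEM_PER_SM / smem_per_block
                                        : MAX_BLOCKS_PER_SM;

        int blocks = by_threads;
        if (by_regs < blocks)           blocks = by_regs;
        if (by_smem < blocks)           blocks = by_smem;
        if (MAX_BLOCKS_PER_SM < blocks) blocks = MAX_BLOCKS_PER_SM;

        /* Theoretical occupancy = resident threads / maximum threads per SM. */
        double occupancy = (double)(blocks * block_size) / MAX_THREADS_PER_SM;
        printf("Active blocks per SM: %d, theoretical occupancy: %.2f\n",
               blocks, occupancy);
        return 0;
    }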
The Dynamic Parallelism runtime should no longer generate a cudaErrorLaunchPendingCountExceeded error when the number of
pending launches exceeds cudaLimitDevRuntimePendingLaunchCount. Instead, the runtime automatically extends the pending launch buffer beyond cudaLimitDevRuntimePendingLaunchCount, albeit with a performance penalty.
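To avoid the cost of the runtime growing the buffer on the fly, the limit can still be sized up front from the host with cudaDeviceSetLimit(); a minimal sketch (the kernels and launch counts are illustrative, and Dynamic Parallelism requires sm_35 or higher with -rdc=true):

    #include <cstdio>
    #include <cuda_runtime.h>

    // Trivial child kernel used only to generate device-side launches.
    __global__ void child(void) { }

    // Parent kernel that queues many child launches via Dynamic Parallelism.
    __global__ void parent(int launches)
    {
        for (int i = 0; i < launches; ++i)
            child<<<1, 32>>>();
    }

    int main(void)
    {
        // Reserve room for the expected number of pending device-side launches
        // so the runtime does not have to extend the buffer at launch time.
        cudaDeviceSetLimit(cudaLimitDevRuntimePendingLaunchCount, 4096);

        size_t value = 0;
        cudaDeviceGetLimit(&value, cudaLimitDevRuntimePendingLaunchCount);
        printf("Pending launch count limit: %zu\n", value);

        parent<<<1, 1>>>(2048);
        cudaDeviceSynchronize();
        return 0;
    }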
Support for the following Linux distributions has been added as of CUDA 6.0: Fedora 19, Ubuntu 13.04, CentOS 5.5+, CentOS 6.4, OpenSUSE 12.3, SLES 11 SP3, and NVIDIA Linux For Tegra (L4T) 19.1.
Support for the ICC Compiler has been upgraded to version 13.1.
Support for the Windows Server 2012 R2 operating system has been added as of CUDA 6.0.
RDMA (remote direct memory access) for GPUDirect is now supported for applications running under MPS (Multi-Process Service).
CUDA Inter-Process Communication (IPC) is now supported for applications running under MPS. CUDA IPC event and memory handles can be exported and opened by the MPS clients of a single MPS server.
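A minimal sketch of the exporter/importer pattern between two MPS clients follows; the host-side transport of the handle between the processes (pipe, socket, shared file, etc.) is not shown:

    #include <cuda_runtime.h>

    // Process A (one MPS client): allocate device memory and export a handle.
    void export_buffer(cudaIpcMemHandle_t *handle, void **d_buf, size_t bytes)
    {
        cudaMalloc(d_buf, bytes);
        cudaIpcGetMemHandle(handle, *d_buf);
        // The opaque handle must now be sent to the other process by some
        // host-side mechanism -- not shown here.
    }

    // Process B (another client of the same MPS server): open the handle and
    // use the other process's allocation as an ordinary device pointer.
    void import_buffer(cudaIpcMemHandle_t handle, void **d_buf)
    {
        cudaIpcOpenMemHandle(d_buf, handle, cudaIpcMemLazyEnablePeerAccess);
        // ... launch kernels that read or write *d_buf ...
        cudaIpcCloseMemHandle(*d_buf);
    }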
Applications running under MPS can now use assert() in their kernels. When an assert is triggered, all work submitted by MPS clients will be stalled until the assert is handled. The MPS client that triggered the assert will exit, but will not interfere with other running MPS clients.
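Device-side assertions use the standard assert() macro inside kernel code; a minimal sketch (the kernel and data are illustrative):

    #include <assert.h>
    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void check_positive(const int *data, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            assert(data[i] > 0);   // fires on the device when the condition is false
    }

    int main(void)
    {
        const int n = 128;
        int *d_data = NULL;
        cudaMalloc(&d_data, n * sizeof(int));
        cudaMemset(d_data, 0, n * sizeof(int));   // zeros will trip the assert

        check_positive<<<1, n>>>(d_data, n);

        // A triggered device assert surfaces as cudaErrorAssert on the next
        // synchronizing call; under MPS the offending client then exits.
        cudaError_t err = cudaDeviceSynchronize();
        printf("sync returned: %s\n", cudaGetErrorString(err));

        cudaFree(d_data);
        return 0;
    }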
Previously, a wide variety of errors were reported with an "Unspecified Launch Failure" (ULF) message or the corresponding error codes CUDA_ERROR_LAUNCH_FAILED and cudaErrorLaunchFailed. The CUDA driver now supports enhanced error reporting, providing richer error messages when exceptions occur. This helps developers determine the causes of application faults without the need for additional tools.
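The richer messages are surfaced through the existing error-query APIs; a minimal sketch of checking a launch from the host (the kernel and the deliberately bad pointer are illustrative):

    #include <cstdio>
    #include <cuda_runtime.h>

    __global__ void touch(int *p) { *p = 42; }   // faults if p is invalid

    int main(void)
    {
        touch<<<1, 1>>>(NULL);                   // deliberately bad pointer

        // Kernel faults are reported asynchronously; query them after a
        // synchronizing call and print the driver-provided message.
        cudaError_t err = cudaDeviceSynchronize();
        if (err != cudaSuccess)
            fprintf(stderr, "kernel failed: %s\n", cudaGetErrorString(err));
        return 0;
    }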