Class GPUContext
- java.lang.Object
- 
- org.apache.sysds.runtime.instructions.gpu.context.GPUContext
 
- 
public class GPUContext extends Object
Represents a context per GPU accessible through the same JVM. Each context holds cublas, cusparse, cudnn... handles, which are separate for each GPU.
- 
- 
Method Summary
Modifier and Type, Method: Description
- jcuda.Pointer allocate(String instructionName, long size): Default behavior for GPU memory allocation (init to zero).
- jcuda.Pointer allocate(String instructionName, long size, boolean initialize): Invokes memory manager's malloc method.
- void clearMemory(): Clears all memory used by this GPUContext.
- void clearTemporaryMemory()
- GPUObject createGPUObject(MatrixObject mo): Instantiates a new GPUObject initialized with the given MatrixObject.
- void cudaFreeHelper(String instructionName, jcuda.Pointer toFree, boolean eager): Does cudaFree calls, lazily.
- static int cudaGetDevice(): Returns which device is currently being used.
- void destroy(): Destroys this GPUContext object.
- void ensureComputeCapability(): Makes sure that GPU that SystemDS is trying to use has the minimum compute capability needed.
- long getAvailableMemory(): Gets the available memory on GPU that SystemDS can use.
- jcuda.jcublas.cublasHandle getCublasHandle(): Returns cublasHandle for BLAS operations on the GPU.
- jcuda.jcudnn.cudnnHandle getCudnnHandle(): Returns the cudnnHandle for Deep Neural Network operations on the GPU.
- jcuda.jcusolver.cusolverDnHandle getCusolverDnHandle(): Returns cusolverDnHandle for invoking solve() function on dense matrices on the GPU.
- jcuda.jcusparse.cusparseHandle getCusparseHandle(): Returns cusparseHandle for certain sparse BLAS operations on the GPU.
- int getDeviceNum(): Returns which device is assigned to this GPUContext instance.
- jcuda.runtime.cudaDeviceProp getGPUProperties(): Gets the device properties for the active GPU (set with cudaSetDevice()).
- JCudaKernels getKernels(): Returns utility class used to launch custom CUDA kernel, specific to the active GPU for this GPUContext.
- int getMaxBlocks(): Gets the maximum number of blocks supported by the active cuda device.
- long getMaxSharedMemory(): Gets the shared memory per block supported by the active cuda device.
- int getMaxThreadsPerBlock(): Gets the maximum number of threads per block for "active" GPU.
- GPUMemoryManager getMemoryManager()
- int getWarpSize(): Gets the warp size supported by the active cuda device.
- void initializeThread(): Sets the device for the calling thread.
- void printMemoryInfo(String opcode): Print information of memory usage.
- GPUObject shallowCopyGPUObject(GPUObject source, MatrixObject mo): Shallow copy the given source GPUObject to a new GPUObject and assign that to the given MatrixObject.
- String toString()
 
- 
- 
- 
Method Detail
- 
getMemoryManager
public GPUMemoryManager getMemoryManager()
 - 
cudaGetDevice
public static int cudaGetDevice()
Returns which device is currently being used.
- Returns:
- the current device for the calling host thread
 
 - 
printMemoryInfo
public void printMemoryInfo(String opcode)
Prints information about memory usage.
- Parameters:
- opcode- opcode of caller
 
 - 
getDeviceNum
public int getDeviceNum()
Returns which device is assigned to this GPUContext instance.
- Returns:
- active device assigned to this GPUContext instance
 
 - 
initializeThread
public void initializeThread()
Sets the device for the calling thread. This method must be called after ExecutionContext.getGPUContext(int). In a multithreaded environment such as parfor, this method must be called from within the appropriate thread.
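The CUDA runtime tracks the current device separately for each host thread, which is why the device has to be set again inside every worker thread. The following is a minimal pure-Java sketch of that behavior, using a ThreadLocal as a stand-in for the runtime's per-thread state. It makes no SystemDS or JCuda calls; the class PerThreadDeviceSketch and all of its methods are hypothetical names chosen for illustration.

```java
final class PerThreadDeviceSketch {
    // Stand-in for the "current device" that the CUDA runtime tracks per host thread.
    // A thread that never sets a device sees device 0, the CUDA default.
    private static final ThreadLocal<Integer> CURRENT_DEVICE =
            ThreadLocal.withInitial(() -> 0);

    // Hypothetical stand-in for what initializeThread() is described as doing.
    static void initializeThread(int assignedDevice) {
        CURRENT_DEVICE.set(assignedDevice);
    }

    static int currentDevice() {
        return CURRENT_DEVICE.get();
    }

    // Runs a fresh worker thread and reports which device it ends up using.
    static int deviceSeenByWorker(int assignedDevice, boolean callInitialize) {
        int[] seen = new int[1];
        Thread worker = new Thread(() -> {
            if (callInitialize) {
                initializeThread(assignedDevice);
            }
            seen[0] = currentDevice();
        });
        worker.start();
        try {
            worker.join(); // join() makes the worker's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(deviceSeenByWorker(1, true));  // worker bound to device 1
        System.out.println(deviceSeenByWorker(1, false)); // worker left on the default device
    }
}
```

In this model, a worker that skips the initialization step silently operates on the default device rather than the one assigned to its context.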
 - 
allocate
public jcuda.Pointer allocate(String instructionName, long size, boolean initialize)
Invokes the memory manager's malloc method.
- Parameters:
- instructionName- name of instruction for which to record per instruction performance statistics, null if you don't want to record
- size- size of data (in bytes) to allocate
- initialize- true if cudaMemset() should be called
- Returns:
- jcuda pointer
 
 - 
allocate
public jcuda.Pointer allocate(String instructionName, long size)
Default behavior for GPU memory allocation (init to zero).
- Parameters:
- instructionName- Name of the instruction calling allocate
- size- size in bytes
- Returns:
- jcuda pointer
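Because size is given in bytes, a caller has to convert matrix dimensions into a byte count before calling allocate. The sketch below shows one way to do that for a dense matrix of doubles with overflow checking. It is an illustration only: the class AllocationSizeSketch and the method denseDoubleBytes are hypothetical and not part of SystemDS.

```java
final class AllocationSizeSketch {
    // Bytes needed for a dense rows x cols matrix of 8-byte doubles.
    // Math.multiplyExact throws ArithmeticException instead of silently wrapping.
    static long denseDoubleBytes(long rows, long cols) {
        if (rows < 0 || cols < 0) {
            throw new IllegalArgumentException("dimensions must be non-negative");
        }
        long cells = Math.multiplyExact(rows, cols);
        return Math.multiplyExact(cells, (long) Double.BYTES);
    }

    public static void main(String[] args) {
        // The result is the value one would pass as the size argument.
        System.out.println(denseDoubleBytes(1000, 1000));
    }
}
```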
 
 - 
cudaFreeHelper
public void cudaFreeHelper(String instructionName, jcuda.Pointer toFree, boolean eager)
Does cudaFree calls, lazily unless eager is true.
- Parameters:
- instructionName- name of the instruction for which to record per instruction free time, null if you do not want to record
- toFree- Pointer instance to be freed
- eager- true if the free is to be done eagerly
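To illustrate the difference the eager flag can make, here is a toy model in plain Java. It assumes that a lazy free parks the block in a pool keyed by size so that a later allocation of the same size can reuse it, while an eager free returns the block to the device immediately. That reuse policy is an assumption made for this sketch, not a statement about how GPUMemoryManager is implemented, and the class LazyFreeSketch with its fields and methods is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

final class LazyFreeSketch {
    // Blocks freed lazily, grouped by size, waiting to be reused (assumed policy).
    private final Map<Long, Deque<Integer>> reusable = new HashMap<>();
    private int nextBlockId = 0;

    int deviceAllocations = 0; // simulated cudaMalloc calls
    int deviceFrees = 0;       // simulated cudaFree calls

    // Returns a block id; reuses a lazily freed block of the same size when possible.
    int allocate(long size) {
        Deque<Integer> candidates = reusable.get(size);
        if (candidates != null && !candidates.isEmpty()) {
            return candidates.pop();
        }
        deviceAllocations++;
        return nextBlockId++;
    }

    // eager == true frees on the device now; otherwise the block is kept for reuse.
    void free(int blockId, long size, boolean eager) {
        if (eager) {
            deviceFrees++;
        } else {
            reusable.computeIfAbsent(size, k -> new ArrayDeque<>()).push(blockId);
        }
    }

    public static void main(String[] args) {
        LazyFreeSketch mem = new LazyFreeSketch();
        int first = mem.allocate(1024);
        mem.free(first, 1024, false);    // lazy: block stays in the pool
        int second = mem.allocate(1024); // served from the pool, no new device allocation
        System.out.println(first == second);
        System.out.println(mem.deviceAllocations);
    }
}
```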
 
 - 
getAvailableMemory
public long getAvailableMemory()
Gets the available memory on the GPU that SystemDS can use.
- Returns:
- the available memory in bytes
 
 - 
ensureComputeCapability
public void ensureComputeCapability()
Makes sure that the GPU that SystemDS is trying to use has the minimum compute capability needed.
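A compute capability is a major.minor pair, so a minimum check compares the major version first and the minor version only on a tie. The sketch below shows that comparison in plain Java. The required minimum is not stated on this page, so the values used in main are placeholders, and the class ComputeCapabilitySketch and method meetsMinimum are hypothetical names.

```java
final class ComputeCapabilitySketch {
    // True if capability major.minor is at least minMajor.minMinor.
    static boolean meetsMinimum(int major, int minor, int minMajor, int minMinor) {
        return major > minMajor || (major == minMajor && minor >= minMinor);
    }

    public static void main(String[] args) {
        // Placeholder minimum of 3.0; the real threshold is not given in this document.
        System.out.println(meetsMinimum(7, 5, 3, 0));
        System.out.println(meetsMinimum(2, 1, 3, 0));
    }
}
```

With JCuda, the major and minor values would typically come from the cudaDeviceProp instance returned by getGPUProperties().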
 - 
createGPUObject
public GPUObject createGPUObject(MatrixObject mo)
Instantiates a new GPUObject initialized with the given MatrixObject.
- Parameters:
- mo- a MatrixObject that represents a matrix
- Returns:
- a new GPUObject instance
 
 - 
shallowCopyGPUObject
public GPUObject shallowCopyGPUObject(GPUObject source, MatrixObject mo)
Shallow copies the given source GPUObject to a new GPUObject and assigns it to the given MatrixObject. This copy doesn't memcopy the device memory.
- Parameters:
- source- a GPUObject which is the source of the copy
- mo- a MatrixObject to associate with the new GPUObject
- Returns:
- a new GPUObject instance
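The practical consequence of a copy that does not memcopy the device memory is that the source and the copy refer to the same underlying buffer. The toy model below demonstrates that aliasing with a Java array standing in for device memory. It does not use GPUObject; the class ShallowCopySketch and its nested Handle type are hypothetical.

```java
final class ShallowCopySketch {
    // Stand-in for an object that wraps a device buffer.
    static final class Handle {
        final double[] buffer;

        Handle(double[] buffer) {
            this.buffer = buffer;
        }
    }

    // New wrapper, same buffer: no data is copied.
    static Handle shallowCopy(Handle source) {
        return new Handle(source.buffer);
    }

    // New wrapper, new buffer: data is duplicated.
    static Handle deepCopy(Handle source) {
        return new Handle(source.buffer.clone());
    }

    public static void main(String[] args) {
        Handle source = new Handle(new double[] {1.0, 2.0});
        Handle shallow = shallowCopy(source);
        Handle deep = deepCopy(source);
        source.buffer[0] = 42.0;
        System.out.println(shallow.buffer[0]); // sees the write through the shared buffer
        System.out.println(deep.buffer[0]);    // keeps its own copy of the old value
    }
}
```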
 
 - 
getGPUProperties
public jcuda.runtime.cudaDeviceProp getGPUProperties()
Gets the device properties for the active GPU (set with cudaSetDevice()).
- Returns:
- the device properties
 
 - 
getMaxThreadsPerBlock
public int getMaxThreadsPerBlock()
Gets the maximum number of threads per block for the active GPU.
- Returns:
- the maximum number of threads per block
 
 - 
getMaxBlocks
public int getMaxBlocks()
Gets the maximum number of blocks supported by the active CUDA device.
- Returns:
- the maximum number of blocks supported
 
 - 
getMaxSharedMemory
public long getMaxSharedMemory()
Gets the shared memory per block supported by the active CUDA device.
- Returns:
- the shared memory per block
 
 - 
getWarpSize
public int getWarpSize()
Gets the warp size supported by the active CUDA device.
- Returns:
- the warp size
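The device limits reported by getMaxThreadsPerBlock(), getMaxBlocks(), and getWarpSize() are the usual inputs when sizing a kernel launch. The sketch below shows one common way to derive a launch configuration from such limits: round the thread count up to a multiple of the warp size, cap it at the per-block maximum, then use ceiling division to get the block count. The limits are passed in as plain integers, and the class LaunchConfigSketch and its methods are hypothetical; this is not how JCudaKernels necessarily computes its configuration.

```java
final class LaunchConfigSketch {
    // Rounds the requested thread count up to a warp multiple, capped at the device limit.
    static int threadsPerBlock(int requested, int warpSize, int maxThreadsPerBlock) {
        if (requested <= 0 || warpSize <= 0 || maxThreadsPerBlock <= 0) {
            throw new IllegalArgumentException("all arguments must be positive");
        }
        int rounded = ((requested + warpSize - 1) / warpSize) * warpSize;
        return Math.min(rounded, maxThreadsPerBlock);
    }

    // Number of blocks needed so that blocks * threads covers n elements.
    static int blocksFor(long n, int threads, int maxBlocks) {
        if (n <= 0 || threads <= 0) {
            throw new IllegalArgumentException("n and threads must be positive");
        }
        long blocks = (n + threads - 1) / threads; // ceiling division
        if (blocks > maxBlocks) {
            throw new IllegalArgumentException("problem too large for a single launch");
        }
        return (int) blocks;
    }

    public static void main(String[] args) {
        // Example limits; real values would come from the GPUContext getters.
        int threads = threadsPerBlock(100, 32, 1024);
        int blocks = blocksFor(1000, threads, 65535);
        System.out.println(threads + " threads x " + blocks + " blocks");
    }
}
```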
 
 - 
getCudnnHandle
public jcuda.jcudnn.cudnnHandle getCudnnHandle()
Returns the cudnnHandle for Deep Neural Network operations on the GPU.
- Returns:
- cudnnHandle for current thread
 
 - 
getCublasHandle
public jcuda.jcublas.cublasHandle getCublasHandle()
Returns the cublasHandle for BLAS operations on the GPU.
- Returns:
- cublasHandle for current thread
 
 - 
getCusparseHandle
public jcuda.jcusparse.cusparseHandle getCusparseHandle()
Returns the cusparseHandle for certain sparse BLAS operations on the GPU.
- Returns:
- cusparseHandle for current thread
 
 - 
getCusolverDnHandle
public jcuda.jcusolver.cusolverDnHandle getCusolverDnHandle()
Returns the cusolverDnHandle for invoking the solve() function on dense matrices on the GPU.
- Returns:
- cusolverDnHandle for current thread
 
 - 
getKernels
public JCudaKernels getKernels()
Returns the utility class used to launch custom CUDA kernels, specific to the active GPU for this GPUContext.
- Returns:
- JCudaKernels for current thread
 
 - 
destroy
public void destroy()
Destroys this GPUContext object.
 - 
clearMemory
public void clearMemory()
Clears all memory used by this GPUContext. Be careful to ensure that no memory is currently being used in the temporary memory before invoking this. If memory is being used between MLContext invocations, it is pointed to by a GPUObject instance that is part of the MatrixObject. The cleanup of that MatrixObject instance will cause the memory associated with that block on the GPU to be freed.
 - 
clearTemporaryMemory
public void clearTemporaryMemory()
 
- 
 
-