Class CSRPointer
- java.lang.Object
  - org.apache.sysds.runtime.instructions.gpu.context.CSRPointer
 
public class CSRPointer extends Object

Compressed Sparse Row (CSR) format for CUDA. Generalized matrix multiply, among other operations, is implemented for the CSR format in the cuSparse library.

Since we assume that the matrix is stored with zero-based indexing (i.e. CUSPARSE_INDEX_BASE_ZERO), the matrix

    1.0 4.0 0.0 0.0 0.0
    0.0 2.0 3.0 0.0 0.0
    5.0 0.0 0.0 7.0 8.0
    0.0 0.0 9.0 0.0 6.0

is stored as

    val    = 1.0 4.0 2.0 3.0 5.0 7.0 8.0 9.0 6.0
    rowPtr = 0 2 4 7 9
    colInd = 0 1 1 2 0 3 4 2 4
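The layout described above can be reproduced on the host with plain Java arrays. The following is a minimal sketch, not part of the CSRPointer API: the class `CsrFromDense`, its nested `Csr` holder, and `fromDense` are hypothetical names used only for illustration, and no GPU memory is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class CsrFromDense {
    /** Hypothetical host-side holder; field names mirror val, rowPtr and colInd. */
    public static final class Csr {
        public final double[] val;
        public final int[] rowPtr;
        public final int[] colInd;

        Csr(double[] val, int[] rowPtr, int[] colInd) {
            this.val = val;
            this.rowPtr = rowPtr;
            this.colInd = colInd;
        }
    }

    /** Converts a dense row-major matrix to zero-based CSR arrays. */
    public static Csr fromDense(double[][] dense) {
        int rows = dense.length;
        int[] rowPtr = new int[rows + 1];
        List<Double> vals = new ArrayList<>();
        List<Integer> cols = new ArrayList<>();
        for (int i = 0; i < rows; i++) {
            rowPtr[i] = vals.size();          // start of row i
            for (int j = 0; j < dense[i].length; j++) {
                if (dense[i][j] != 0.0) {
                    vals.add(dense[i][j]);
                    cols.add(j);
                }
            }
        }
        rowPtr[rows] = vals.size();           // end of last row + 1
        double[] val = vals.stream().mapToDouble(Double::doubleValue).toArray();
        int[] colInd = cols.stream().mapToInt(Integer::intValue).toArray();
        return new Csr(val, rowPtr, colInd);
    }

    public static void main(String[] args) {
        double[][] m = {
            {1.0, 4.0, 0.0, 0.0, 0.0},
            {0.0, 2.0, 3.0, 0.0, 0.0},
            {5.0, 0.0, 0.0, 7.0, 8.0},
            {0.0, 0.0, 9.0, 0.0, 6.0}
        };
        Csr csr = fromDense(m);
        System.out.println("val    = " + Arrays.toString(csr.val));
        System.out.println("rowPtr = " + Arrays.toString(csr.rowPtr));
        System.out.println("colInd = " + Arrays.toString(csr.colInd));
    }
}
```

Note that rowPtr has one more entry than the matrix has rows, and its last entry equals the number of non-zeroes.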

Field Summary
- jcuda.Pointer colInd- integer array of nnz values' column indices
- jcuda.jcusparse.cusparseMatDescr descr- descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
- static jcuda.jcusparse.cusparseMatDescr matrixDescriptor
- long nnz- Number of non zeroes
- jcuda.Pointer rowPtr- integer array of start of all rows and end of last row + 1
- jcuda.Pointer val- double array of non zero values

Method Summary
- static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows)
- static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows, boolean initialize)- Factory method to allocate an empty CSR Sparse matrix on the GPU
- static CSRPointer allocateForDgeam(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, CSRPointer B, int m, int n)- Estimates the number of non zero elements from the results of a sparse cusparseDgeam operation C = a op(A) + b op(B)
- static CSRPointer allocateForMatrixMultiply(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, int transA, CSRPointer B, int transB, int m, int n, int k)- Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the CSRPointer to C with the appropriate GPU memory.
- CSRPointer clone(int rows)
- static void copyPtrToHost(CSRPointer src, int rows, long nnz, int[] rowPtr, int[] colInd)- Static method to copy a CSR sparse matrix from Device to host
- static void copyToDevice(GPUContext gCtx, CSRPointer dest, int rows, long nnz, int[] rowPtr, int[] colInd, double[] values)- Static method to copy a CSR sparse matrix from Host to Device
- void deallocate()- Calls cudaFree lazily on the allocated Pointer instances
- void deallocate(boolean eager)- Calls cudaFree lazily or eagerly on the allocated Pointer instances
- static long estimateSize(long nnz2, long rows)- Estimate the size of a CSR matrix in GPU memory. Size of pointers is not needed and is not added in
- static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
- boolean isUltraSparse(int rows, int cols)- Check for ultra sparsity
- jcuda.Pointer toColumnMajorDenseMatrix(jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.jcublas.cublasHandle cublasHandle, int rows, int cols, String instName)- Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU.
- static int toIntExact(long l)
- String toString()
 
Field Detail

matrixDescriptor
public static jcuda.jcusparse.cusparseMatDescr matrixDescriptor

nnz
public long nnz
Number of non zeroes

val
public jcuda.Pointer val
double array of non zero values

rowPtr
public jcuda.Pointer rowPtr
integer array of start of all rows and end of last row + 1

colInd
public jcuda.Pointer colInd
integer array of nnz values' column indices

descr
public jcuda.jcusparse.cusparseMatDescr descr
descriptor of matrix, only CUSPARSE_MATRIX_TYPE_GENERAL supported
 
Method Detail

toIntExact
public static int toIntExact(long l)

getDefaultCuSparseMatrixDescriptor
public static jcuda.jcusparse.cusparseMatDescr getDefaultCuSparseMatrixDescriptor()
Returns:
- Singleton default matrix descriptor object (set with CUSPARSE_MATRIX_TYPE_GENERAL, CUSPARSE_INDEX_BASE_ZERO)
 
estimateSize
public static long estimateSize(long nnz2, long rows)
Estimate the size of a CSR matrix in GPU memory. The size of the pointers themselves is not needed and is not added in.
Parameters:
- nnz2- number of non zeroes
- rows- number of rows
Returns:
- size estimate
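As a rough illustration of what such an estimate covers, the three CSR arrays can be sized from nnz and rows alone. This is a sketch under stated assumptions, not the actual estimateSize implementation: it assumes 8-byte double values and 4-byte int indices, ignores the matrix descriptor, and uses the hypothetical names `CsrSizeEstimate` and `estimateCsrBytes`.

```java
public class CsrSizeEstimate {
    /**
     * Bytes needed for val (nnz doubles), colInd (nnz ints) and
     * rowPtr (rows + 1 ints). Assumption: double-precision values.
     * The *Exact methods throw ArithmeticException on long overflow.
     */
    public static long estimateCsrBytes(long nnz, long rows) {
        long valBytes = Math.multiplyExact(nnz, (long) Double.BYTES);
        long colIndBytes = Math.multiplyExact(nnz, (long) Integer.BYTES);
        long rowPtrBytes = Math.multiplyExact(Math.addExact(rows, 1L), (long) Integer.BYTES);
        return Math.addExact(Math.addExact(valBytes, colIndBytes), rowPtrBytes);
    }

    public static void main(String[] args) {
        // The 4 x 5 example matrix from the class description has 9 non-zeroes:
        // 9 * 8 + 9 * 4 + 5 * 4 = 128 bytes.
        System.out.println(estimateCsrBytes(9, 4)); // prints 128
    }
}
```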
 
copyToDevice
public static void copyToDevice(GPUContext gCtx, CSRPointer dest, int rows, long nnz, int[] rowPtr, int[] colInd, double[] values)
Static method to copy a CSR sparse matrix from Host to Device
Parameters:
- gCtx- GPUContext
- dest- [input] destination location (on GPU)
- rows- number of rows
- nnz- number of non-zeroes
- rowPtr- integer array of row pointers
- colInd- integer array of column indices
- values- double array of non zero values
 
copyPtrToHost
public static void copyPtrToHost(CSRPointer src, int rows, long nnz, int[] rowPtr, int[] colInd)
Static method to copy a CSR sparse matrix from Device to host
Parameters:
- src- [input] source location (on GPU)
- rows- [input] number of rows
- nnz- [input] number of non-zeroes
- rowPtr- [output] pre-allocated integer array of row pointers of size (rows+1)
- colInd- [output] pre-allocated integer array of column indices of size nnz
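The output arrays have to be sized before the call: rows + 1 entries for rowPtr and nnz entries for colInd. The sketch below shows only the host side, assuming the arrays have already been filled (here with the example values from the class description as a stand-in for a real device copy). The class `CsrHostPattern` and the method `expandRowIndices` are hypothetical names, and the standard `Math.toIntExact` is used for the long-to-int conversion of nnz.

```java
import java.util.Arrays;

public class CsrHostPattern {
    /** Expands rowPtr into one row index per non-zero, aligned with colInd. */
    public static int[] expandRowIndices(int[] rowPtr, int nnz) {
        int rows = rowPtr.length - 1;
        int[] rowInd = new int[nnz];
        for (int i = 0; i < rows; i++) {
            // Non-zeroes of row i occupy positions rowPtr[i] .. rowPtr[i + 1] - 1.
            for (int p = rowPtr[i]; p < rowPtr[i + 1]; p++) {
                rowInd[p] = i;
            }
        }
        return rowInd;
    }

    public static void main(String[] args) {
        int rows = 4;
        long nnz = 9L;

        // Pre-allocate with the sizes the method expects.
        int[] rowPtr = new int[rows + 1];
        int[] colInd = new int[Math.toIntExact(nnz)];

        // Stand-in for the device-to-host copy: fill with the example values.
        System.arraycopy(new int[] {0, 2, 4, 7, 9}, 0, rowPtr, 0, rowPtr.length);
        System.arraycopy(new int[] {0, 1, 1, 2, 0, 3, 4, 2, 4}, 0, colInd, 0, colInd.length);

        int[] rowInd = expandRowIndices(rowPtr, colInd.length);
        for (int p = 0; p < colInd.length; p++) {
            System.out.println("non-zero " + p + " at (" + rowInd[p] + ", " + colInd[p] + ")");
        }
        System.out.println(Arrays.toString(rowInd));
    }
}
```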
 
allocateForDgeam
public static CSRPointer allocateForDgeam(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, CSRPointer B, int m, int n)
Estimates the number of non-zero elements in the result of a sparse cusparseDgeam operation C = a op(A) + b op(B)
Parameters:
- gCtx- a valid GPUContext
- handle- a valid cusparseHandle
- A- Sparse Matrix A on GPU
- B- Sparse Matrix B on GPU
- m- Rows in A
- n- Columns in B
Returns:
- CSR (compressed sparse row) pointer
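To see why the non-zero count of a sum has to be determined before C can be allocated, the following host-side sketch counts the structural non-zeroes of A + B as the per-row union of the two column patterns. It is an illustration only and does not describe how cuSPARSE or this method works internally; `CsrAddPattern` and `sumRowPtr` are hypothetical names, and numerical cancellation is ignored.

```java
import java.util.Arrays;

public class CsrAddPattern {
    /**
     * Returns the rowPtr of C = A + B for m x n CSR inputs.
     * The last entry, cRowPtr[m], is the number of structural non-zeroes of C.
     */
    public static int[] sumRowPtr(int[] aRowPtr, int[] aColInd,
                                  int[] bRowPtr, int[] bColInd, int m, int n) {
        int[] cRowPtr = new int[m + 1];
        int[] lastSeenInRow = new int[n];   // row in which column j was last counted
        Arrays.fill(lastSeenInRow, -1);
        int count = 0;
        for (int i = 0; i < m; i++) {
            cRowPtr[i] = count;
            for (int p = aRowPtr[i]; p < aRowPtr[i + 1]; p++) {
                int j = aColInd[p];
                if (lastSeenInRow[j] != i) { lastSeenInRow[j] = i; count++; }
            }
            for (int p = bRowPtr[i]; p < bRowPtr[i + 1]; p++) {
                int j = bColInd[p];
                if (lastSeenInRow[j] != i) { lastSeenInRow[j] = i; count++; }
            }
        }
        cRowPtr[m] = count;
        return cRowPtr;
    }

    public static void main(String[] args) {
        // A: pattern of the 4 x 5 example matrix from the class description.
        int[] aRowPtr = {0, 2, 4, 7, 9};
        int[] aColInd = {0, 1, 1, 2, 0, 3, 4, 2, 4};
        // B: an arbitrary 4 x 5 pattern with 4 non-zeroes.
        int[] bRowPtr = {0, 1, 1, 3, 4};
        int[] bColInd = {0, 0, 1, 4};
        int[] cRowPtr = sumRowPtr(aRowPtr, aColInd, bRowPtr, bColInd, 4, 5);
        System.out.println(Arrays.toString(cRowPtr)); // prints [0, 2, 4, 8, 10]
    }
}
```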
 
allocateForMatrixMultiply
public static CSRPointer allocateForMatrixMultiply(GPUContext gCtx, jcuda.jcusparse.cusparseHandle handle, CSRPointer A, int transA, CSRPointer B, int transB, int m, int n, int k)
Estimates the number of non-zero elements from the result of a sparse matrix multiplication C = A * B and returns the CSRPointer to C with the appropriate GPU memory.
Parameters:
- gCtx- a valid GPUContext
- handle- a valid cusparseHandle
- A- Sparse Matrix A on GPU
- transA- 'T' if A is to be transposed, 'N' otherwise
- B- Sparse Matrix B on GPU
- transB- 'T' if B is to be transposed, 'N' otherwise
- m- Rows in A
- n- Columns in B
- k- Columns in A / Rows in B
Returns:
- a CSRPointer instance that encapsulates the CSR matrix on GPU
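The number of non-zeroes of a sparse product depends on how the patterns of A and B overlap, so it has to be computed in a separate symbolic pass before C is allocated. The sketch below performs such a pass on the host for the non-transposed case only. It is a hedged illustration of the idea, not the routine this method calls on the GPU; `CsrProductPattern` and `productRowPtr` are hypothetical names.

```java
import java.util.Arrays;

public class CsrProductPattern {
    /**
     * Returns the rowPtr of C = A * B, where A is m x k and B is k x n (both CSR).
     * The last entry, cRowPtr[m], is the number of structural non-zeroes of C.
     */
    public static int[] productRowPtr(int[] aRowPtr, int[] aColInd,
                                      int[] bRowPtr, int[] bColInd, int m, int n) {
        int[] cRowPtr = new int[m + 1];
        int[] lastSeenInRow = new int[n];   // row of C in which column j was last counted
        Arrays.fill(lastSeenInRow, -1);
        int count = 0;
        for (int i = 0; i < m; i++) {
            cRowPtr[i] = count;
            for (int p = aRowPtr[i]; p < aRowPtr[i + 1]; p++) {
                int kk = aColInd[p];        // A(i, kk) != 0 pulls in row kk of B
                for (int q = bRowPtr[kk]; q < bRowPtr[kk + 1]; q++) {
                    int j = bColInd[q];
                    if (lastSeenInRow[j] != i) { lastSeenInRow[j] = i; count++; }
                }
            }
        }
        cRowPtr[m] = count;
        return cRowPtr;
    }

    public static void main(String[] args) {
        // A (2 x 3) = [1 0 2; 0 3 0], B (3 x 2) = [4 0; 0 5; 6 7]
        int[] aRowPtr = {0, 2, 3};
        int[] aColInd = {0, 2, 1};
        int[] bRowPtr = {0, 1, 2, 4};
        int[] bColInd = {0, 1, 0, 1};
        int[] cRowPtr = productRowPtr(aRowPtr, aColInd, bRowPtr, bColInd, 2, 2);
        System.out.println(Arrays.toString(cRowPtr)); // prints [0, 2, 3]
    }
}
```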
 
allocateEmpty
public static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows, boolean initialize)
Factory method to allocate an empty CSR Sparse matrix on the GPU
Parameters:
- gCtx- a valid GPUContext
- nnz2- number of non-zeroes
- rows- number of rows
- initialize- memset to zero?
Returns:
- a CSRPointer instance that encapsulates the CSR matrix on GPU
 
allocateEmpty
public static CSRPointer allocateEmpty(GPUContext gCtx, long nnz2, long rows)

clone
public CSRPointer clone(int rows)

isUltraSparse
public boolean isUltraSparse(int rows, int cols)
Check for ultra sparsity
Parameters:
- rows- number of rows
- cols- number of columns
Returns:
- true if ultra sparse
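The documentation does not state how ultra sparsity is decided. A common way to express such a check is to compare the fraction of non-zero cells against a small threshold. The sketch below follows that assumption; the class `UltraSparsityCheck`, its methods, and the threshold value used in `main` are hypothetical and are not taken from the CSRPointer source.

```java
public class UltraSparsityCheck {
    /** Fraction of cells that are non-zero: nnz / (rows * cols). */
    public static double sparsity(long nnz, int rows, int cols) {
        return (double) nnz / ((double) rows * cols);
    }

    /** True if the non-zero fraction is below the caller-supplied threshold. */
    public static boolean isBelowThreshold(long nnz, int rows, int cols, double threshold) {
        return sparsity(nnz, rows, cols) < threshold;
    }

    public static void main(String[] args) {
        double threshold = 1e-4; // hypothetical value, for illustration only

        // The 4 x 5 example matrix: 9 of 20 cells are non-zero.
        System.out.println(isBelowThreshold(9, 4, 5, threshold));        // prints false

        // 50 non-zeroes in a 1000 x 1000 matrix.
        System.out.println(isBelowThreshold(50, 1000, 1000, threshold)); // prints true
    }
}
```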
 
toColumnMajorDenseMatrix
public jcuda.Pointer toColumnMajorDenseMatrix(jcuda.jcusparse.cusparseHandle cusparseHandle, jcuda.jcublas.cublasHandle cublasHandle, int rows, int cols, String instName)
Copies this CSR matrix on the GPU to a dense column-major matrix on the GPU. This is a temporary matrix for operations such as cusparseDcsrmv. Since the allocated matrix is temporary, bookkeeping is not updated. The caller is responsible for calling "free" on the returned Pointer object.
Parameters:
- cusparseHandle- a valid cusparseHandle
- cublasHandle- a valid cublasHandle
- rows- number of rows in this CSR matrix
- cols- number of columns in this CSR matrix
- instName- name of the invoking instruction to record Statistics.
Returns:
- A Pointer to the allocated dense matrix (in column-major format)
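In a column-major dense matrix, element (i, j) of a matrix with the given number of rows is stored at index j * rows + i. The host-side sketch below shows that layout for the example matrix from the class description. It only illustrates the result format and does not reflect how the conversion is carried out on the GPU; `CsrToColumnMajor` and `toColumnMajor` are hypothetical names.

```java
import java.util.Arrays;

public class CsrToColumnMajor {
    /** Scatters CSR non-zeroes into a dense array stored column by column. */
    public static double[] toColumnMajor(double[] val, int[] rowPtr, int[] colInd,
                                         int rows, int cols) {
        double[] dense = new double[rows * cols]; // zero-initialised by Java
        for (int i = 0; i < rows; i++) {
            for (int p = rowPtr[i]; p < rowPtr[i + 1]; p++) {
                dense[colInd[p] * rows + i] = val[p];
            }
        }
        return dense;
    }

    public static void main(String[] args) {
        double[] val = {1.0, 4.0, 2.0, 3.0, 5.0, 7.0, 8.0, 9.0, 6.0};
        int[] rowPtr = {0, 2, 4, 7, 9};
        int[] colInd = {0, 1, 1, 2, 0, 3, 4, 2, 4};
        double[] dense = toColumnMajor(val, rowPtr, colInd, 4, 5);
        // Each group of 4 consecutive entries is one column of the 4 x 5 matrix.
        System.out.println(Arrays.toString(dense));
    }
}
```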
 
deallocate
public void deallocate()
Calls cudaFree lazily on the allocated Pointer instances

deallocate
public void deallocate(boolean eager)
Calls cudaFree lazily or eagerly on the allocated Pointer instances
Parameters:
- eager- whether to do eager or lazy cudaFrees
 
 