Class LibMatrixCUDA
- java.lang.Object
  - org.apache.sysds.runtime.matrix.data.LibMatrixCUDA

- Direct Known Subclasses:
- LibMatrixCuDNN, LibMatrixCuDNNInputRowFetcher, LibMatrixCuMatMult

public class LibMatrixCUDA extends Object
All CUDA kernels and library calls are redirected through this class.
- See Also:
- GPUContext, GPUObject
 
Field Summary
Fields (Modifier and Type, Field)
- static CudaSupportFunctions cudaSupportFunctions
- static String customKernelSuffix
- static int sizeOfDataType
 - 
Constructor Summary
Constructors
- LibMatrixCUDA()
 - 
Method Summary
- static void abs(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs an "abs" operation on a matrix on the GPU.
- static void acos(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs an "acos" operation on a matrix on the GPU.
- static void asin(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs an "asin" operation on a matrix on the GPU.
- static void atan(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs an "atan" operation on a matrix on the GPU.
- static void axpy(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, double constant)
  Performs the daxpy operation.
- static void biasAdd(GPUContext gCtx, String instName, MatrixObject input, MatrixObject bias, MatrixObject outputBlock)
  Performs the operation corresponding to the DML script: ones = matrix(1, rows=1, cols=Hout*Wout) output = input + matrix(bias %*% ones, rows=1, cols=F*Hout*Wout). This operation often accompanies conv2d, hence the bias_add(input, bias) built-in function.
- static void biasMultiply(GPUContext gCtx, String instName, MatrixObject input, MatrixObject bias, MatrixObject outputBlock)
  Performs the operation corresponding to the DML script: ones = matrix(1, rows=1, cols=Hout*Wout) output = input * matrix(bias %*% ones, rows=1, cols=F*Hout*Wout). This operation often accompanies conv2d, hence the bias_multiply(input, bias) built-in function.
- static void cbind(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
- static void ceil(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "ceil" operation on a matrix on the GPU.
- static void channelSums(GPUContext gCtx, String instName, MatrixObject input, MatrixObject outputBlock, long C, long HW)
  Performs the channel_sums operation: out = rowSums(matrix(colSums(A), rows=C, cols=HW)).
- static int computeNNZ(GPUContext gCtx, jcuda.Pointer densePtr, int length)
  Utility to compute the number of non-zeroes on the GPU.
- static void cos(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "cos" operation on a matrix on the GPU.
- static void cosh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "cosh" operation on a matrix on the GPU.
- static void cumulativeScan(ExecutionContext ec, GPUContext gCtx, String instName, String kernelFunction, MatrixObject in, String outputName)
  Cumulative scan.
- static void cumulativeSumProduct(ExecutionContext ec, GPUContext gCtx, String instName, String kernelFunction, MatrixObject in, String outputName)
  Cumulative sum-product kernel cascade invocation.
- static void denseTranspose(ExecutionContext ec, GPUContext gCtx, String instName, jcuda.Pointer A, jcuda.Pointer C, long numRowsA, long numColsA)
  Computes C = t(A).
- static void deviceCopy(String instName, jcuda.Pointer src, jcuda.Pointer dest, int rlen, int clen)
  Performs a deep copy of the input device double pointer corresponding to a matrix.
- static jcuda.Pointer double2float(GPUContext gCtx, jcuda.Pointer A, jcuda.Pointer ret, int numElems)
- static void exp(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs an "exp" operation on a matrix on the GPU.
- static jcuda.Pointer float2double(GPUContext gCtx, jcuda.Pointer A, jcuda.Pointer ret, int numElems)
- static void floor(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "floor" operation on a matrix on the GPU.
- static JCudaKernels getCudaKernels(GPUContext gCtx)
- static MatrixObject getDenseMatrixOutputForGPUInstruction(ExecutionContext ec, String instName, String name, long numRows, long numCols)
  Helper method to get the output block (allocated on the GPU). Also records performance information into Statistics.
- static MatrixObject getDenseMatrixOutputForGPUInstruction(ExecutionContext ec, String instName, String name, long numRows, long numCols, boolean initialize)
- static jcuda.Pointer getDensePointer(GPUContext gCtx, MatrixObject input, String instName)
  Convenience method to get jcudaDenseMatrixPtr.
- static long getNnz(GPUContext gCtx, String instName, MatrixObject mo, boolean recomputeDenseNNZ)
  Note: if the matrix is in dense format, it explicitly re-computes the number of nonzeros.
- static boolean isInSparseFormat(GPUContext gCtx, MatrixObject mo)
- static void log(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "log" operation on a matrix on the GPU.
- static void matmultTSMM(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject left, String outputName, boolean isLeftTransposed)
  Performs tsmm, A %*% A' or A' %*% A, on the GPU by exploiting cublasDsyrk(...).
- static void matrixMatrixArithmetic(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, boolean isLeftTransposed, boolean isRightTransposed, BinaryOperator op)
  Performs the elementwise arithmetic operation specified by op on two input matrices, in1 and in2.
- static void matrixMatrixRelational(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, BinaryOperator op)
  Performs the elementwise relational operation specified by op on two input matrices, in1 and in2.
- static void matrixScalarArithmetic(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, boolean isInputTransposed, ScalarOperator op)
  Entry point to perform the elementwise matrix-scalar arithmetic operation specified by op.
- static void matrixScalarOp(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, boolean isInputTransposed, ScalarOperator op)
  Utility to launch the matrix-scalar operation kernel.
- static void matrixScalarRelational(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, ScalarOperator op)
  Entry point to perform the elementwise matrix-scalar relational operation specified by op.
- static jcuda.Pointer one()
  Convenience method to get a pointer to the value '1.0' on the device.
- static void rbind(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
- static void reluBackward(GPUContext gCtx, String instName, MatrixObject input, MatrixObject dout, MatrixObject outputBlock)
  Computes the backpropagation errors for the previous layer of a relu operation.
- static void resetFloatingPointPrecision()
  Sets the internal state based on the DMLScript.DATA_TYPE.
- static void round(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "round" operation on a matrix on the GPU.
- static void sigmoid(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "sigmoid" operation on a matrix on the GPU.
- static void sign(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "sign" operation on a matrix on the GPU.
- static void sin(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "sin" operation on a matrix on the GPU.
- static void sinh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "sinh" operation on a matrix on the GPU.
- static void sliceOperations(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, IndexRange ixrange, String outputName)
  Performs a rightIndex operation for given lower and upper bounds in the row and column dimensions.
- static void solve(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
  Implements the "solve" function for SystemDS: Ax = B, where A is of size m*n, B is of size m*1, and x is of size n*1.
- static void sqrt(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "sqrt" operation on a matrix on the GPU.
- static void tan(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "tan" operation on a matrix on the GPU.
- static void tanh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
  Performs a "tanh" operation on a matrix on the GPU.
- static int toInt(long num)
- static void transpose(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName)
  Transposes the input matrix using cublasDgeam.
- static void unaryAggregate(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String output, AggregateUnaryOperator op)
  Entry point to perform unary aggregate operations on the GPU.
- static jcuda.Pointer zero()
  Convenience method to get a pointer to the value '0.0f' on the device.
 
Field Detail
cudaSupportFunctions
public static CudaSupportFunctions cudaSupportFunctions
 - 
sizeOfDataType
public static int sizeOfDataType
 - 
customKernelSuffix
public static String customKernelSuffix
 
Method Detail
resetFloatingPointPrecision
public static void resetFloatingPointPrecision()
Sets the internal state based on the DMLScript.DATA_TYPE.
 - 
isInSparseFormat
public static boolean isInSparseFormat(GPUContext gCtx, MatrixObject mo)
 - 
getNnz
public static long getNnz(GPUContext gCtx, String instName, MatrixObject mo, boolean recomputeDenseNNZ)
Note: if the matrix is in dense format, the number of non-zeroes is explicitly recomputed (see recomputeDenseNNZ).
- Parameters:
- gCtx - a valid GPU context
- instName - instruction name
- mo - matrix object
- recomputeDenseNNZ - recompute NNZ if dense
- Returns:
- number of non-zeroes
 
 - 
getCudaKernels
public static JCudaKernels getCudaKernels(GPUContext gCtx) throws DMLRuntimeException
- Throws:
- DMLRuntimeException
 
 - 
double2float
public static jcuda.Pointer double2float(GPUContext gCtx, jcuda.Pointer A, jcuda.Pointer ret, int numElems)
 - 
float2double
public static jcuda.Pointer float2double(GPUContext gCtx, jcuda.Pointer A, jcuda.Pointer ret, int numElems)
 - 
one
public static jcuda.Pointer one()
Convenience method to get a pointer to the value '1.0' on the device, instead of allocating and deallocating it for every kernel invocation.
- Returns:
- jcuda pointer
 
 - 
zero
public static jcuda.Pointer zero()
Convenience method to get a pointer to the value '0.0f' on the device, instead of allocating and deallocating it for every kernel invocation.
- Returns:
- jcuda pointer
 
 - 
getDensePointer
public static jcuda.Pointer getDensePointer(GPUContext gCtx, MatrixObject input, String instName) throws DMLRuntimeException
Convenience method to get jcudaDenseMatrixPtr. This method explicitly converts sparse to dense format, so use it judiciously.
- Parameters:
- gCtx - a valid GPUContext
- input - input matrix object
- instName - the invoking instruction's name, used to record Statistics
- Returns:
- jcuda pointer
- Throws:
- DMLRuntimeException
 
 - 
reluBackward
public static void reluBackward(GPUContext gCtx, String instName, MatrixObject input, MatrixObject dout, MatrixObject outputBlock)
Computes the backpropagation errors for the previous layer of a relu operation.
- Parameters:
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- input - input image
- dout - error propagated from the next layer
- outputBlock - output
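
The relu backward step can be illustrated with a small CPU-side reference. This is a minimal sketch in plain Java, not the GPU kernel: the class name ReluBackwardSketch and the flat double[] layout are hypothetical, and it assumes the gradient is passed through only where the forward input was strictly positive.

```java
import java.util.Arrays;

// Hypothetical CPU reference for the relu backward step; not part of LibMatrixCUDA.
class ReluBackwardSketch {
    // out[i] = dout[i] where the forward input was positive, otherwise 0.
    // Assumption: an input of exactly 0 passes no gradient.
    static double[] reluBackward(double[] input, double[] dout) {
        if (input.length != dout.length) {
            throw new IllegalArgumentException("input and dout must have the same length");
        }
        double[] out = new double[input.length];
        for (int i = 0; i < input.length; i++) {
            out[i] = input[i] > 0 ? dout[i] : 0.0;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] input = {-1.0, 0.0, 2.5, 3.0};
        double[] dout = {10.0, 20.0, 30.0, 40.0};
        System.out.println(Arrays.toString(reluBackward(input, dout)));
    }
}
```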
 
 - 
channelSums
public static void channelSums(GPUContext gCtx, String instName, MatrixObject input, MatrixObject outputBlock, long C, long HW)
Performs the channel_sums operation: out = rowSums(matrix(colSums(A), rows=C, cols=HW))
- Parameters:
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- input - input image
- outputBlock - output
- C - number of channels
- HW - height*width
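
The channel_sums expression above can be unrolled into explicit loops. The following is a sketch only, written for the CPU with a hypothetical class ChannelSumsSketch; it assumes the input is an N x (C*HW) matrix whose columns are laid out channel by channel, and that the DML reshape fills rows first.

```java
import java.util.Arrays;

// Hypothetical CPU reference for out = rowSums(matrix(colSums(A), rows=C, cols=HW)).
class ChannelSumsSketch {
    // input: N rows, each of length C*HW; returns one sum per channel.
    static double[] channelSums(double[][] input, int C, int HW) {
        double[] out = new double[C];
        for (double[] row : input) {
            if (row.length != C * HW) {
                throw new IllegalArgumentException("each row must have C*HW columns");
            }
            for (int c = 0; c < C; c++) {
                for (int k = 0; k < HW; k++) {
                    // Column c*HW + k belongs to channel c.
                    out[c] += row[c * HW + k];
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2, 3, 4}, {5, 6, 7, 8}};
        System.out.println(Arrays.toString(channelSums(a, 2, 2)));
    }
}
```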
 
 - 
biasMultiply
public static void biasMultiply(GPUContext gCtx, String instName, MatrixObject input, MatrixObject bias, MatrixObject outputBlock)
Performs the operation corresponding to the DML script: ones = matrix(1, rows=1, cols=Hout*Wout) output = input * matrix(bias %*% ones, rows=1, cols=F*Hout*Wout). This operation often accompanies conv2d, hence the bias_multiply(input, bias) built-in function.
- Parameters:
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- input - input image
- bias - bias
- outputBlock - output
 
 - 
biasAdd
public static void biasAdd(GPUContext gCtx, String instName, MatrixObject input, MatrixObject bias, MatrixObject outputBlock)
Performs the operation corresponding to the DML script: ones = matrix(1, rows=1, cols=Hout*Wout) output = input + matrix(bias %*% ones, rows=1, cols=F*Hout*Wout). This operation often accompanies conv2d, hence the bias_add(input, bias) built-in function.
- Parameters:
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- input - input image
- bias - bias
- outputBlock - output
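
To make the indexing in the bias script concrete, here is a CPU-side sketch. It is illustrative only: BiasAddSketch and the parameter PQ (standing in for Hout*Wout) are hypothetical names, and the input is assumed to be N x (F*PQ) with one bias value per filter. The multiplicative variant would replace the addition with a multiplication.

```java
import java.util.Arrays;

// Hypothetical CPU reference for output = input + matrix(bias %*% ones, rows=1, cols=F*PQ).
class BiasAddSketch {
    // input: N rows of length F*PQ; bias: one value per filter (length F).
    static double[][] biasAdd(double[][] input, double[] bias, int PQ) {
        int F = bias.length;
        double[][] out = new double[input.length][];
        for (int n = 0; n < input.length; n++) {
            if (input[n].length != F * PQ) {
                throw new IllegalArgumentException("each row must have F*PQ columns");
            }
            out[n] = new double[F * PQ];
            for (int f = 0; f < F; f++) {
                for (int k = 0; k < PQ; k++) {
                    // Every position of filter f receives the same bias value.
                    out[n][f * PQ + k] = input[n][f * PQ + k] + bias[f];
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] in = {{1, 2, 3, 4}};
        System.out.println(Arrays.deepToString(biasAdd(in, new double[]{10, 20}, 2)));
    }
}
```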
 
 - 
matmultTSMM
public static void matmultTSMM(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject left, String outputName, boolean isLeftTransposed)
Performs tsmm, A %*% A' or A' %*% A, on the GPU by exploiting cublasDsyrk(...).
Memory usage: if dense, the input space is rows * cols, there is no intermediate memory, and the output is Max(rows*rows, cols*cols). If sparse, calls matmult.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- left - input matrix; in a tsmm expression such as A %*% A' or A' %*% A, only the left operand needs to be checked for transposition, hence the name 'left'
- outputName - output matrix name
- isLeftTransposed - if true, left transposed
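
The tsmm result is symmetric, which is what makes a symmetric rank-k update a natural fit. The sketch below shows the same idea on the CPU: it computes only the upper triangle and mirrors it. TsmmSketch is a hypothetical class written for illustration and does not call cuBLAS.

```java
import java.util.Arrays;

// Hypothetical CPU reference for tsmm; exploits the symmetry of the result.
class TsmmSketch {
    // Returns t(A) %*% A if leftTransposed is true, otherwise A %*% t(A).
    static double[][] tsmm(double[][] a, boolean leftTransposed) {
        int rows = a.length;
        int cols = a[0].length;
        int n = leftTransposed ? cols : rows; // output is n x n
        int k = leftTransposed ? rows : cols; // length of each inner product
        double[][] c = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i; j < n; j++) { // upper triangle only
                double sum = 0;
                for (int p = 0; p < k; p++) {
                    sum += leftTransposed ? a[p][i] * a[p][j] : a[i][p] * a[j][p];
                }
                c[i][j] = sum;
                c[j][i] = sum; // mirror into the lower triangle
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[][] a = {{1, 2}, {3, 4}, {5, 6}};
        System.out.println(Arrays.deepToString(tsmm(a, true)));
        System.out.println(Arrays.deepToString(tsmm(a, false)));
    }
}
```

The output dimensions match the memory note above: rows*rows for A %*% A' and cols*cols for A' %*% A.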
 
 - 
unaryAggregate
public static void unaryAggregate(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String output, AggregateUnaryOperator op)
Entry point to perform unary aggregate operations on the GPU. The execution context object is used to allocate memory for the GPU.
- Parameters:
- ec - instance of ExecutionContext, from which the output variable will be allocated
- gCtx - a valid GPUContext
- instName - name of the invoking instruction, used to record Statistics
- in1 - input matrix
- output - output matrix/scalar name
- op - instance of AggregateUnaryOperator, which encapsulates the direction of reduction/aggregation and the reduction operation
 
 - 
matrixScalarRelational
public static void matrixScalarRelational(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, ScalarOperator op)
Entry point to perform the elementwise matrix-scalar relational operation specified by op.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in - input matrix
- outputName - output matrix name
- op - scalar operator
 
 - 
matrixScalarArithmetic
public static void matrixScalarArithmetic(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, boolean isInputTransposed, ScalarOperator op)
Entry point to perform the elementwise matrix-scalar arithmetic operation specified by op.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in - input matrix
- outputName - output matrix name
- isInputTransposed - true if input transposed
- op - scalar operator
 
 - 
matrixMatrixRelational
public static void matrixMatrixRelational(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, BinaryOperator op)
Performs the elementwise relational operation specified by op on two input matrices, in1 and in2.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix 1
- in2 - input matrix 2
- outputName - output matrix name
- op - binary operator
 
 - 
matrixMatrixArithmetic
public static void matrixMatrixArithmetic(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, boolean isLeftTransposed, boolean isRightTransposed, BinaryOperator op)
Performs the elementwise arithmetic operation specified by op on two input matrices, in1 and in2.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix 1
- in2 - input matrix 2
- outputName - output matrix name
- isLeftTransposed - true if left-transposed
- isRightTransposed - true if right-transposed
- op - binary operator
 
 - 
matrixScalarOp
public static void matrixScalarOp(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName, boolean isInputTransposed, ScalarOperator op)
Utility to launch the matrix-scalar operation kernel.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in - input matrix
- outputName - output variable name
- isInputTransposed - true if input is transposed
- op - operator
 
 - 
deviceCopy
public static void deviceCopy(String instName, jcuda.Pointer src, jcuda.Pointer dest, int rlen, int clen)
Performs a deep copy of the input device double pointer corresponding to a matrix.
- Parameters:
- instName - the invoking instruction's name, used to record Statistics
- src - source matrix
- dest - destination matrix
- rlen - number of rows
- clen - number of columns
 
 - 
denseTranspose
public static void denseTranspose(ExecutionContext ec, GPUContext gCtx, String instName, jcuda.Pointer A, jcuda.Pointer C, long numRowsA, long numColsA) throws DMLRuntimeException
Computes C = t(A).
- Parameters:
- ec - execution context
- gCtx - GPU context
- instName - name of the instruction
- A - pointer to the input matrix
- C - pointer to the output matrix
- numRowsA - number of rows of the input matrix
- numColsA - number of columns of the input matrix
- Throws:
- DMLRuntimeException - if error
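
The index arithmetic behind C = t(A) on a flat buffer can be sketched on the CPU as follows. This is an illustration under stated assumptions: DenseTransposeSketch is a hypothetical class, and the buffer is assumed to be row-major, which the documentation above does not state.

```java
import java.util.Arrays;

// Hypothetical CPU reference for C = t(A) on a flat buffer (row-major layout assumed).
class DenseTransposeSketch {
    static double[] transpose(double[] a, int numRowsA, int numColsA) {
        if (a.length != numRowsA * numColsA) {
            throw new IllegalArgumentException("buffer length must equal numRowsA * numColsA");
        }
        double[] c = new double[a.length];
        for (int i = 0; i < numRowsA; i++) {
            for (int j = 0; j < numColsA; j++) {
                // Element (i, j) of A becomes element (j, i) of C, which has numRowsA columns.
                c[j * numRowsA + i] = a[i * numColsA + j];
            }
        }
        return c;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3, 4, 5, 6}; // 2 x 3
        System.out.println(Arrays.toString(transpose(a, 2, 3)));
    }
}
```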
 
 - 
transpose
public static void transpose(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in, String outputName)
Transposes the input matrix using cublasDgeam.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in - input matrix
- outputName - output matrix name
 
 - 
toInt
public static int toInt(long num)
 - 
sliceOperations
public static void sliceOperations(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, IndexRange ixrange, String outputName)
Performs a rightIndex operation for given lower and upper bounds in the row and column dimensions.
- Parameters:
- ec - current execution context
- gCtx - current GPU context
- instName - name of the instruction for maintaining statistics
- in1 - input matrix object
- ixrange - index range (0-based)
- outputName - name of the output matrix object
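
A right-indexing operation over a 0-based range can be sketched on the CPU as below. The class SliceSketch and its four bound parameters are hypothetical stand-ins for the index range, and the sketch assumes both bounds are inclusive, which the description above does not confirm.

```java
import java.util.Arrays;

// Hypothetical CPU reference for right indexing with 0-based, inclusive bounds (assumed).
class SliceSketch {
    // Copies rows rl..ru and columns cl..cu of the input into a new matrix.
    static double[][] slice(double[][] in, int rl, int ru, int cl, int cu) {
        if (rl < 0 || rl > ru || ru >= in.length || cl < 0 || cl > cu || cu >= in[0].length) {
            throw new IndexOutOfBoundsException("index range outside the input matrix");
        }
        double[][] out = new double[ru - rl + 1][];
        for (int i = rl; i <= ru; i++) {
            // copyOfRange takes an exclusive upper bound, hence cu + 1.
            out[i - rl] = Arrays.copyOfRange(in[i], cl, cu + 1);
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2, 3}, {4, 5, 6}, {7, 8, 9}};
        System.out.println(Arrays.deepToString(slice(m, 1, 2, 0, 1)));
    }
}
```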
 
 - 
cbind
public static void cbind(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
 - 
rbind
public static void rbind(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
 - 
exp
public static void exp(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "exp" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
sqrt
public static void sqrt(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sqrt" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
round
public static void round(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "round" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
abs
public static void abs(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "abs" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
log
public static void log(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "log" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
floor
public static void floor(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "floor" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
ceil
public static void ceil(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "ceil" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
sin
public static void sin(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sin" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
cos
public static void cos(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "cos" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
tan
public static void tan(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "tan" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
sinh
public static void sinh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sinh" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
cosh
public static void cosh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "cosh" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
tanh
public static void tanh(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "tanh" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
asin
public static void asin(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "asin" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
acos
public static void acos(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "acos" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
atan
public static void atan(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs an "atan" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
sign
public static void sign(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sign" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
 
 - 
sigmoid
public static void sigmoid(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, String outputName)
Performs a "sigmoid" operation on a matrix on the GPU.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix
- outputName - output matrix name
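
The elementwise unary operations in this group (exp through sigmoid) share one shape: a scalar function mapped over every cell. A CPU-side sketch of that pattern follows; UnaryOpSketch is a hypothetical class, and the sigmoid formula 1 / (1 + exp(-x)) is the standard definition rather than something taken from the kernels.

```java
import java.util.Arrays;
import java.util.function.DoubleUnaryOperator;

// Hypothetical CPU reference for elementwise unary operations; not the GPU kernels.
class UnaryOpSketch {
    // Standard logistic sigmoid.
    static final DoubleUnaryOperator SIGMOID = x -> 1.0 / (1.0 + Math.exp(-x));

    // Applies op to every element and returns a new array.
    static double[] apply(double[] in, DoubleUnaryOperator op) {
        double[] out = new double[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = op.applyAsDouble(in[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        double[] x = {-2.0, 0.0, 4.0};
        System.out.println(Arrays.toString(apply(x, SIGMOID)));
        System.out.println(Arrays.toString(apply(x, Math::signum)));
        System.out.println(Arrays.toString(apply(x, Math::abs)));
    }
}
```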
 
 - 
cumulativeScan
public static void cumulativeScan(ExecutionContext ec, GPUContext gCtx, String instName, String kernelFunction, MatrixObject in, String outputName)
Cumulative scan.
- Parameters:
- ec - valid execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- kernelFunction - the name of the CUDA kernel to call
- in - input matrix
- outputName - output matrix name
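
A cumulative scan can be pictured as a running reduction down each column. The sketch below is a sequential CPU version under assumptions: CumulativeScanSketch is a hypothetical class, the scan direction is assumed to be column-wise, and the binary operator stands in for whatever the selected kernel computes (sum, product, min, max).

```java
import java.util.Arrays;
import java.util.function.DoubleBinaryOperator;

// Hypothetical sequential reference for a column-wise cumulative scan.
class CumulativeScanSketch {
    // out[i][j] = op(out[i-1][j], in[i][j]), with out[0][j] = in[0][j].
    static double[][] scan(double[][] in, DoubleBinaryOperator op) {
        int rows = in.length;
        int cols = rows == 0 ? 0 : in[0].length;
        double[][] out = new double[rows][cols];
        for (int j = 0; j < cols; j++) {
            double acc = 0;
            for (int i = 0; i < rows; i++) {
                // The first row seeds the accumulator, so no identity element is needed.
                acc = (i == 0) ? in[i][j] : op.applyAsDouble(acc, in[i][j]);
                out[i][j] = acc;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] m = {{1, 2}, {3, 4}, {5, 6}};
        System.out.println(Arrays.deepToString(scan(m, Double::sum)));
        System.out.println(Arrays.deepToString(scan(m, (a, b) -> a * b)));
    }
}
```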
 
 - 
cumulativeSumProduct
public static void cumulativeSumProduct(ExecutionContext ec, GPUContext gCtx, String instName, String kernelFunction, MatrixObject in, String outputName)
Cumulative sum-product kernel cascade invocation.
- Parameters:
- ec - valid execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- kernelFunction - the name of the CUDA kernel to call
- in - input matrix
- outputName - output matrix name
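
The description above does not spell out the recurrence. The sketch below assumes the semantics of the DML cumsumprod built-in as commonly described: the input is an n x 2 matrix and the output follows Y[i] = X[i][0] + X[i][1] * Y[i-1] with Y[-1] = 0. Both that assumption and the class name CumulativeSumProductSketch are illustrative, and the sequential loop does not reflect the kernel cascade used on the GPU.

```java
import java.util.Arrays;

// Hypothetical sequential reference; the recurrence is an assumption, see the lead-in.
class CumulativeSumProductSketch {
    // x: n rows of two values (addend, multiplier); returns the n running results.
    static double[] cumSumProd(double[][] x) {
        double[] y = new double[x.length];
        double prev = 0.0;
        for (int i = 0; i < x.length; i++) {
            if (x[i].length != 2) {
                throw new IllegalArgumentException("input must have exactly two columns");
            }
            prev = x[i][0] + x[i][1] * prev;
            y[i] = prev;
        }
        return y;
    }

    public static void main(String[] args) {
        double[][] x = {{1, 2}, {3, 4}, {5, 6}};
        System.out.println(Arrays.toString(cumSumProd(x)));
    }
}
```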
 
 - 
axpy
public static void axpy(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName, double constant)
Performs the daxpy operation.
- Parameters:
- ec - execution context
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix 1
- in2 - input matrix 2
- outputName - output matrix name
- constant - scalar constant
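
In BLAS terms, daxpy computes y = alpha * x + y. The sketch below assumes that this method maps onto it as out = in1 + constant * in2 for equally sized inputs; that mapping is an assumption, and AxpySketch is a hypothetical CPU-side class.

```java
import java.util.Arrays;

// Hypothetical CPU reference; assumes out = in1 + constant * in2.
class AxpySketch {
    static double[] axpy(double[] in1, double[] in2, double constant) {
        if (in1.length != in2.length) {
            throw new IllegalArgumentException("inputs must have the same length");
        }
        double[] out = new double[in1.length];
        for (int i = 0; i < in1.length; i++) {
            out[i] = in1[i] + constant * in2[i];
        }
        return out;
    }

    public static void main(String[] args) {
        double[] a = {1, 2, 3};
        double[] b = {10, 20, 30};
        System.out.println(Arrays.toString(axpy(a, b, 2.0)));
    }
}
```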
 
 - 
solve
public static void solve(ExecutionContext ec, GPUContext gCtx, String instName, MatrixObject in1, MatrixObject in2, String outputName)
Implements the "solve" function for SystemDS: Ax = B, where A is of size m*n, B is of size m*1, and x is of size n*1.
- Parameters:
- ec - a valid ExecutionContext
- gCtx - a valid GPUContext
- instName - the invoking instruction's name, used to record Statistics
- in1 - input matrix A
- in2 - input matrix B
- outputName - name of the output matrix
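
Because A is m*n and x is n*1, the system is in general a least-squares problem. The sketch below solves it on the CPU through the normal equations t(A) A x = t(A) b with Gaussian elimination. It is a substitute technique chosen for brevity, not the method this class uses on the GPU, and SolveSketch is a hypothetical class. It assumes A has full column rank.

```java
import java.util.Arrays;

// Hypothetical CPU reference: least squares via normal equations (not the GPU method).
class SolveSketch {
    static double[] solve(double[][] a, double[] b) {
        int m = a.length;
        int n = a[0].length;
        if (b.length != m) {
            throw new IllegalArgumentException("A and b must have the same number of rows");
        }
        // Build the augmented system [t(A) A | t(A) b], of size n x (n + 1).
        double[][] g = new double[n][n + 1];
        for (int i = 0; i < n; i++) {
            for (int r = 0; r < m; r++) {
                for (int j = 0; j < n; j++) {
                    g[i][j] += a[r][i] * a[r][j];
                }
                g[i][n] += a[r][i] * b[r];
            }
        }
        // Forward elimination with partial pivoting.
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(g[r][col]) > Math.abs(g[pivot][col])) {
                    pivot = r;
                }
            }
            if (Math.abs(g[pivot][col]) < 1e-12) {
                throw new ArithmeticException("A does not have full column rank");
            }
            double[] tmp = g[col];
            g[col] = g[pivot];
            g[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = g[r][col] / g[col][col];
                for (int c = col; c <= n; c++) {
                    g[r][c] -= f * g[col][c];
                }
            }
        }
        // Back substitution.
        double[] x = new double[n];
        for (int i = n - 1; i >= 0; i--) {
            double s = g[i][n];
            for (int j = i + 1; j < n; j++) {
                s -= g[i][j] * x[j];
            }
            x[i] = s / g[i][i];
        }
        return x;
    }

    public static void main(String[] args) {
        double[][] a = {{2, 0}, {0, 4}};
        System.out.println(Arrays.toString(solve(a, new double[]{2, 8})));
    }
}
```

Forming t(A) A squares the condition number, so an orthogonal factorization is usually preferred for ill-conditioned inputs.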
 
 - 
getDenseMatrixOutputForGPUInstruction
public static MatrixObject getDenseMatrixOutputForGPUInstruction(ExecutionContext ec, String instName, String name, long numRows, long numCols)
Helper method to get the output block (allocated on the GPU). Also records performance information into Statistics.
- Parameters:
- ec - active ExecutionContext
- instName - the invoking instruction's name, used to record Statistics
- name - name of input matrix (that the ExecutionContext is aware of)
- numRows - number of rows of output matrix object
- numCols - number of columns of output matrix object
- Returns:
- the matrix object
 
 - 
getDenseMatrixOutputForGPUInstruction
public static MatrixObject getDenseMatrixOutputForGPUInstruction(ExecutionContext ec, String instName, String name, long numRows, long numCols, boolean initialize)
 - 
computeNNZ
public static int computeNNZ(GPUContext gCtx, jcuda.Pointer densePtr, int length)
Utility to compute the number of non-zeroes on the GPU.
- Parameters:
- gCtx - the associated GPUContext
- densePtr - device pointer to the dense matrix
- length - length of the dense pointer
- Returns:
- the number of non-zeroes
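
What the non-zero count means for a dense buffer can be shown with a plain loop. This is a CPU-side sketch with a hypothetical class name, ComputeNnzSketch; the GPU version would typically use a parallel reduction instead.

```java
// Hypothetical CPU reference for counting non-zero cells in a dense buffer.
class ComputeNnzSketch {
    // Counts the first 'length' entries of dense that differ from zero.
    static int computeNNZ(double[] dense, int length) {
        if (length < 0 || length > dense.length) {
            throw new IllegalArgumentException("length outside the buffer");
        }
        int nnz = 0;
        for (int i = 0; i < length; i++) {
            // Both 0.0 and -0.0 compare equal to zero and are not counted.
            if (dense[i] != 0.0) {
                nnz++;
            }
        }
        return nnz;
    }

    public static void main(String[] args) {
        double[] d = {0.0, 1.5, 0.0, -2.0, -0.0};
        System.out.println(computeNNZ(d, d.length));
    }
}
```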
 
 