Package org.apache.sysds.hops
Class OptimizerUtils
- java.lang.Object
  - org.apache.sysds.hops.OptimizerUtils

public class OptimizerUtils extends Object
Nested Class Summary
- static class OptimizerUtils.MemoryManager: Memory managers (static partitioned, unified).
- static class OptimizerUtils.OptimizationLevel: Optimization types for compilation. O0 STATIC: decisions for scheduling operations on CP/MR are based on a predefined set of rules, which check whether the dimensions are below a fixed/static threshold (the old method of choosing between CP and MR).
 - 
Field Summary
- static boolean ALLOW_ALGEBRAIC_SIMPLIFICATION
- static boolean ALLOW_AUTO_VECTORIZATION
- static boolean ALLOW_BRANCH_REMOVAL: Enables if-else branch removal for constant predicates (original literals or results of constant folding).
- static boolean ALLOW_CODE_MOTION: Enables a specific rewrite for code motion, i.e., hoisting loop invariant code out of while, for, and parfor loops.
- static boolean ALLOW_COMBINE_FILE_INPUT_FORMAT: Enables the use of CombineSequenceFileInputFormat with splitsize = 2x hdfs blocksize, if the sort buffer size is large enough and parallelism is not hurt.
- static boolean ALLOW_COMMON_SUBEXPRESSION_ELIMINATION: Enables common subexpression elimination in dags.
- static boolean ALLOW_COMPRESSION_REWRITE: Boolean specifying whether compression rewrites are allowed.
- static boolean ALLOW_CONSTANT_FOLDING: Enables constant folding in dags.
- static boolean ALLOW_EVAL_FCALL_REPLACEMENT: Replaces eval second-order function calls with normal function calls if the function name is a known string (after constant propagation).
- static boolean ALLOW_FOR_LOOP_REMOVAL: Enables the removal of (par)for-loops when from, to, and increment are constants (original literals or results of constant folding) and lead to an empty sequence, i.e., (par)for-loops without a single iteration.
- static boolean ALLOW_INTER_PROCEDURAL_ANALYSIS: Enables interprocedural analysis between main script and functions as well as functions and other functions.
- static boolean ALLOW_LOOP_UPDATE_IN_PLACE: Enables a specific rewrite that enables update in place for loop variables that are only read/updated via cp leftindexing.
- static boolean ALLOW_OPERATOR_FUSION
- static boolean ALLOW_RAND_JOB_RECOMPILE
- static boolean ALLOW_RUNTIME_PIGGYBACKING: Enables parfor runtime piggybacking of MR jobs into the packed jobs for scan sharing.
- static boolean ALLOW_SCRIPT_LEVEL_COMPRESS_COMMAND: Allows the user to insert compress and decompress commands in the DML script.
- static boolean ALLOW_SCRIPT_LEVEL_LOCAL_COMMAND: Allows the use of an explicit local command that forces a spark block to be executed and returned as a local block.
- static boolean ALLOW_SIZE_EXPRESSION_EVALUATION: Enables simple expression evaluation for datagen parameters 'rows', 'cols'.
- static boolean ALLOW_SPLIT_HOP_DAGS: Enables a specific hop dag rewrite that splits hop dags after csv persistent reads with unknown size in order to allow for recompile.
- static boolean ALLOW_SUM_PRODUCT_REWRITES: Enables sum product rewrites such as mapmultchains.
- static boolean ALLOW_TRANSITIVE_SPARK_EXEC_TYPE: Enables transitive spark execution type selection.
- static boolean ALLOW_UNARY_UPDATE_IN_PLACE: Enables the update-in-place for all unary operators with a single consumer.
- static boolean ALLOW_WORSTCASE_SIZE_EXPRESSION_EVALUATION: Enables simple expression evaluation for datagen parameters 'rows', 'cols'.
- static boolean ASYNC_TRIGGER_RDD_OPERATIONS: Enables prefetch and broadcast.
- static long BOOLEAN_SIZE
- static long BUFFER_POOL_SIZE: Buffer pool size in bytes.
- static long CHAR_SIZE
- static int DEFAULT_BLOCKSIZE: Default blocksize if unspecified or for testing purposes.
- static int DEFAULT_FRAME_BLOCKSIZE: Default frame blocksize.
- static double DEFAULT_MEM_UTIL_FACTOR: Default buffer pool sizes for static (15%) and unified (85%) memory.
- static OptimizerUtils.OptimizationLevel DEFAULT_OPTLEVEL: Default optimization level if unspecified.
- static double DEFAULT_SIZE: Default memory size, which is used if the actual estimate cannot be computed, e.g., when input/output dimensions are unknown.
- static double DEFAULT_UMM_UTIL_FACTOR
- static long DOUBLE_SIZE
- static boolean FEDERATED_COMPILATION: Compile federated instructions based on input federation state and privacy constraints.
- static Map<Integer,FEDInstruction.FederatedOutput> FEDERATED_SPECS
- static long INT_SIZE
- static double INVALID_SIZE
- static int IPA_NUM_REPETITIONS: Number of inter-procedural analysis (IPA) repetitions.
- static long MAX_NNZ_CP_SPARSE
- static long MAX_NUMCELLS_CP_DENSE
- static double MEM_UTIL_FACTOR: Utilization factor used in deciding whether an operation is to be scheduled on CP or MR.
- static OptimizerUtils.MemoryManager MEMORY_MANAGER: Indicates the current memory manager in effect.
- static double PARALLEL_CP_READ_PARALLELISM_MULTIPLIER: Specifies a multiplier for computing the degree of parallelism of parallel text read/write from the available degree of parallelism.
- static double PARALLEL_CP_WRITE_PARALLELISM_MULTIPLIER
- static long SAFE_REP_CHANGE_THRES
 - 
Constructor Summary
- OptimizerUtils()
 - 
Method Summary
- static boolean allowsToFilterEmptyBlockOutputs(Hop hop)
- static boolean checkSparkBroadcastMemoryBudget(double size)
- static boolean checkSparkBroadcastMemoryBudget(long rlen, long clen, long blen, long nnz)
- static boolean checkSparkCollectMemoryBudget(DataCharacteristics dc, long memPinned)
- static boolean checkSparkCollectMemoryBudget(DataCharacteristics dc, long memPinned, boolean checkBP)
- static boolean checkSparseBlockCSRConversion(DataCharacteristics dcIn)
- static CompilerConfig constructCompilerConfig(CompilerConfig cconf, DMLConfig dmlconf)
- static CompilerConfig constructCompilerConfig(DMLConfig dmlconf)
- static void disableUMM(): Disable unified memory manager and fall back to static partitioning.
- static void enableUMM(): Enable unified memory manager and initialize with the default size (85%).
- static long estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, double sp): Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with dimensions=(rlen,clen) and sparsity=sp.
- static long estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, double sp, boolean outputEmptyBlocks)
- static long estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, long nnz): Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with dimensions=(rlen,clen) and number of non-zeros nnz.
- static long estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, long nnz, boolean outputEmptyBlocks)
- static long estimatePartitionedSizeExactSparsity(Hop hop): Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with the hop's dimensions and number of non-zeros nnz.
- static long estimatePartitionedSizeExactSparsity(DataCharacteristics dc): Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with the given matrix characteristics.
- static long estimatePartitionedSizeExactSparsity(DataCharacteristics dc, boolean outputEmptyBlocks)
- static long estimateSize(long nrows, long ncols): Similar to estimate() except that it provides worst-case estimates when the optimization type is ROBUST.
- static long estimateSize(DataCharacteristics dc)
- static long estimateSizeEmptyBlock(long nrows, long ncols)
- static long estimateSizeExactSparsity(long nrows, long ncols, double sp): Estimates the footprint (in bytes) for an in-memory representation of a matrix with dimensions=(nrows,ncols) and sparsity=sp.
- static long estimateSizeExactSparsity(long nrows, long ncols, long nnz): Estimates the footprint (in bytes) for an in-memory representation of a matrix with dimensions=(nrows,ncols) and number of non-zeros nnz.
- static long estimateSizeExactSparsity(DataCharacteristics dc)
- static long estimateSizeTextOutput(int[] dims, long nnz, Types.FileFormat fmt)
- static long estimateSizeTextOutput(long rows, long cols, long nnz, Types.FileFormat fmt)
- static boolean exceedsCachingThreshold(long dim2, double outMem): Indicates if the given matrix characteristics exceed the threshold for caching, i.e., the matrix should be cached.
- static double getBinaryOpSparsity(double sp1, double sp2, Types.OpOp2 op, boolean worstcase): Estimates the result sparsity for matrix-matrix binary operations (A op B).
- static double getBinaryOpSparsityConditionalSparseSafe(double sp1, Types.OpOp2 op, LiteralOp lit)
- static long getBufferPoolLimit(): Returns buffer pool size as set in the config.
- static int getConstrainedNumThreads(int maxNumThreads)
- static Types.ExecMode getDefaultExecutionMode()
- static int getDefaultFrameSize()
- static org.apache.log4j.Level getDefaultLogLevel()
- static long getDefaultSize()
- static double getLeftIndexingSparsity(long rlen1, long clen1, long nnz1, long rlen2, long clen2, long nnz2)
- static double getLocalMemBudget(): Returns memory budget (according to util factor) in bytes.
- static long getMatMultNnz(double sp1, double sp2, long m, long k, long n, boolean worstcase)
- static double getMatMultSparsity(double sp1, double sp2, long m, long k, long n, boolean worstcase): Estimates the result sparsity for matrix multiplication A %*% B.
- static long getNnz(long dim1, long dim2, double sp)
- static long getNumIterations(ForStatementBlock fsb, long defaultValue)
- static long getNumIterations(ForProgramBlock fpb, long defaultValue)
- static long getNumIterations(ForProgramBlock fpb, LocalVariableMap vars, long defaultValue)
- static int getNumMappers()
- static int getNumReducers(boolean configOnly): Returns the number of reducers that potentially run in parallel.
- static OptimizerUtils.OptimizationLevel getOptLevel()
- static long getOuterNonZeros(long n1, long n2, long nnz1, long nnz2, Types.OpOp2 op)
- static int getParallelBinaryReadParallelism()
- static int getParallelBinaryWriteParallelism()
- static int getParallelTextReadParallelism(): Returns the degree of parallelism used for parallel text read.
- static int getParallelTextWriteParallelism(): Returns the degree of parallelism used for parallel text write.
- static double getSparsity(long[] dims, long nnz)
- static double getSparsity(long dim1, long dim2, long nnz)
- static double getSparsity(Hop hop)
- static double getSparsity(DataCharacteristics dc)
- static double getTotalMemEstimate(Hop[] in, Hop out)
- static double getTotalMemEstimate(Hop[] in, Hop out, boolean denseOut)
- static int getTransformNumThreads()
- static String getUniqueTempFileName(): Wrapper over internal filename construction for external usage.
- static boolean isBinaryOpConditionalSparseSafe(Types.OpOp2 op): Determines if a given binary op is potentially conditional sparse safe.
- static boolean isBinaryOpConditionalSparseSafeExact(Types.OpOp2 op, LiteralOp lit): Determines if a given binary op with scalar literal guarantees an output sparsity which is exactly the same as its matrix input sparsity.
- static boolean isBinaryOpSparsityConditionalSparseSafe(Types.OpOp2 op, LiteralOp lit)
- static boolean isHybridExecutionMode()
- static boolean isIndexingRangeBlockAligned(long rl, long ru, long cl, long cu, long blen): Indicates if the given indexing range is block aligned, i.e., it does not require global aggregation of blocks.
- static boolean isIndexingRangeBlockAligned(IndexRange ixrange, DataCharacteristics mc): Indicates if the given indexing range is block aligned, i.e., it does not require global aggregation of blocks.
- static boolean isMaxLocalParallelism(int k)
- static boolean isMemoryBasedOptLevel()
- static boolean isOptLevel(OptimizerUtils.OptimizationLevel level)
- static boolean isSparkExecutionMode()
- static boolean isTopLevelParFor()
- static boolean isUMMEnabled(): Checks if the unified memory manager is in effect.
- static boolean isValidCPDimensions(long rows, long cols): Returns false if the dimensions are known to be invalid; otherwise true.
- static boolean isValidCPDimensions(Types.ValueType[] schema, String[] names): Returns false if schema and names are not properly specified; otherwise true. Both arrays must have length > 0 and equal lengths.
- static boolean isValidCPDimensions(DataCharacteristics mc)
- static boolean isValidCPMatrixSize(long rows, long cols, double sparsity): Determines whether the matrix size is valid for representation in CP data structures.
- static void resetDefaultSize()
- static void resetStaticCompilerFlags()
- static double rEvalSimpleDoubleExpression(Hop root, HashMap<Long,Double> valMemo)
- static double rEvalSimpleDoubleExpression(Hop root, HashMap<Long,Double> valMemo, LocalVariableMap vars)
- static long rEvalSimpleLongExpression(Hop root, HashMap<Long,Long> valMemo): Evaluates simple size expressions over literals and nrow/ncol.
- static long rEvalSimpleLongExpression(Hop root, HashMap<Long,Long> valMemo, LocalVariableMap vars)
- static String toMB(double inB)
 
- 
- 
- 
Field Detail- 
MEM_UTIL_FACTOR
public static double MEM_UTIL_FACTOR
Utilization factor used in deciding whether an operation is to be scheduled on CP or MR. NOTE: it is important that MEM_UTIL_FACTOR+CacheableData.CACHING_BUFFER_SIZE < 1.0
 - 
DEFAULT_MEM_UTIL_FACTORpublic static double DEFAULT_MEM_UTIL_FACTOR Default buffer pool sizes for static (15%) and unified (85%) memory
 - 
DEFAULT_UMM_UTIL_FACTORpublic static double DEFAULT_UMM_UTIL_FACTOR 
 - 
MEMORY_MANAGERpublic static OptimizerUtils.MemoryManager MEMORY_MANAGER Indicate the current memory manager in effect
 - 
BUFFER_POOL_SIZEpublic static long BUFFER_POOL_SIZE Buffer pool size in bytes
 - 
DEFAULT_BLOCKSIZEpublic static final int DEFAULT_BLOCKSIZE Default blocksize if unspecified or for testing purposes- See Also:
- Constant Field Values
 
 - 
DEFAULT_FRAME_BLOCKSIZEpublic static final int DEFAULT_FRAME_BLOCKSIZE Default frame blocksize- See Also:
- Constant Field Values
 
 - 
DEFAULT_OPTLEVELpublic static final OptimizerUtils.OptimizationLevel DEFAULT_OPTLEVEL Default optimization level if unspecified
 - 
DEFAULT_SIZE
public static double DEFAULT_SIZE
Default memory size, which is used if the actual estimate cannot be computed, e.g., when input/output dimensions are unknown. The default is set to a large value so that operations are scheduled on MR while also avoiding overflows.
 - 
DOUBLE_SIZEpublic static final long DOUBLE_SIZE - See Also:
- Constant Field Values
 
 - 
INT_SIZEpublic static final long INT_SIZE - See Also:
- Constant Field Values
 
 - 
CHAR_SIZEpublic static final long CHAR_SIZE - See Also:
- Constant Field Values
 
 - 
BOOLEAN_SIZEpublic static final long BOOLEAN_SIZE - See Also:
- Constant Field Values
 
 - 
INVALID_SIZEpublic static final double INVALID_SIZE - See Also:
- Constant Field Values
 
 - 
MAX_NUMCELLS_CP_DENSEpublic static final long MAX_NUMCELLS_CP_DENSE - See Also:
- Constant Field Values
 
 - 
MAX_NNZ_CP_SPARSEpublic static final long MAX_NNZ_CP_SPARSE 
 - 
SAFE_REP_CHANGE_THRESpublic static final long SAFE_REP_CHANGE_THRES - See Also:
- Constant Field Values
 
 - 
ALLOW_COMMON_SUBEXPRESSION_ELIMINATION
public static boolean ALLOW_COMMON_SUBEXPRESSION_ELIMINATION
Enables common subexpression elimination in dags. There is, however, a potential tradeoff between computation redundancy and data transfer between MR jobs. Since we do not reason about transferred data yet, this rewrite rule is enabled by default.
 - 
ALLOW_CONSTANT_FOLDINGpublic static boolean ALLOW_CONSTANT_FOLDING Enables constant folding in dags. Constant folding computes simple expressions of binary operations and literals and replaces the hop sub-DAG with a new literal operator.
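
As a rough illustration of constant folding, the following self-contained sketch replaces a binary operation over two literals with a single literal. It is not SystemDS code: the `Node`, `Lit`, `Var`, and `BinOp` types are hypothetical stand-ins for hops, chosen only to show the idea.

```java
import java.util.function.DoubleBinaryOperator;

// Hypothetical mini expression tree; these types are NOT SystemDS classes.
final class ConstantFoldingSketch {
    interface Node {}

    static final class Lit implements Node {
        final double value;
        Lit(double value) { this.value = value; }
    }

    static final class Var implements Node {
        final String name;
        Var(String name) { this.name = name; }
    }

    static final class BinOp implements Node {
        final Node left, right;
        final DoubleBinaryOperator op;
        BinOp(Node left, Node right, DoubleBinaryOperator op) {
            this.left = left;
            this.right = right;
            this.op = op;
        }
    }

    // Folds bottom-up: a BinOp whose (already folded) inputs are both
    // literals is replaced by a new literal holding the computed value.
    static Node fold(Node n) {
        if (!(n instanceof BinOp))
            return n;
        BinOp b = (BinOp) n;
        Node l = fold(b.left);
        Node r = fold(b.right);
        if (l instanceof Lit && r instanceof Lit)
            return new Lit(b.op.applyAsDouble(((Lit) l).value, ((Lit) r).value));
        return new BinOp(l, r, b.op);
    }
}
```

With this sketch, folding `2 + (3 * 4)` yields a single literal, while a tree that contains a `Var` keeps the operation that depends on it.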
 - 
ALLOW_ALGEBRAIC_SIMPLIFICATIONpublic static boolean ALLOW_ALGEBRAIC_SIMPLIFICATION 
 - 
ALLOW_OPERATOR_FUSIONpublic static boolean ALLOW_OPERATOR_FUSION 
 - 
ALLOW_BRANCH_REMOVALpublic static boolean ALLOW_BRANCH_REMOVAL Enables if-else branch removal for constant predicates (original literals or results of constant folding).
 - 
ALLOW_FOR_LOOP_REMOVALpublic static boolean ALLOW_FOR_LOOP_REMOVAL Enables the removal of (par)for-loops when from, to, and increment are constants (original literals or results of constant folding) and lead to an empty sequence, i.e., (par)for-loops without a single iteration.
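
The condition "constants that lead to an empty sequence" can be sketched as a small check on from, to, and increment. The helper below is hypothetical (not part of SystemDS) and assumes integer bounds, an inclusive upper bound, and an explicitly given non-zero increment.

```java
// Hypothetical helper, not part of SystemDS.
final class EmptyLoopSketch {
    // Number of iterations of a loop over the sequence from, from+incr, ..., up to "to" (inclusive).
    static long numIterations(long from, long to, long incr) {
        if (incr == 0)
            throw new IllegalArgumentException("increment must be non-zero");
        long span = to - from;
        // The increment points away from "to": the sequence is empty.
        if ((incr > 0 && span < 0) || (incr < 0 && span > 0))
            return 0;
        return span / incr + 1;
    }

    // A loop without a single iteration is a candidate for removal.
    static boolean isRemovable(long from, long to, long incr) {
        return numIterations(from, to, incr) == 0;
    }
}
```

For example, under these assumptions a loop from 10 to 1 with increment 1 has no iterations and could be removed, whereas the same bounds with increment -3 iterate four times.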
 - 
ALLOW_AUTO_VECTORIZATIONpublic static boolean ALLOW_AUTO_VECTORIZATION 
 - 
ALLOW_SIZE_EXPRESSION_EVALUATIONpublic static boolean ALLOW_SIZE_EXPRESSION_EVALUATION Enables simple expression evaluation for datagen parameters 'rows', 'cols'. Simple expressions are defined as binary operations on literals and nrow/ncol. This applies only to exact size information.
 - 
ALLOW_WORSTCASE_SIZE_EXPRESSION_EVALUATIONpublic static boolean ALLOW_WORSTCASE_SIZE_EXPRESSION_EVALUATION Enables simple expression evaluation for datagen parameters 'rows', 'cols'. Simple expressions are defined as binary operations on literals and b(+) or b(*) on nrow/ncol. This applies also to worst-case size information.
 - 
ALLOW_RAND_JOB_RECOMPILEpublic static boolean ALLOW_RAND_JOB_RECOMPILE 
 - 
ALLOW_RUNTIME_PIGGYBACKINGpublic static boolean ALLOW_RUNTIME_PIGGYBACKING Enables parfor runtime piggybacking of MR jobs into the packed jobs for scan sharing.
 - 
ALLOW_INTER_PROCEDURAL_ANALYSIS
public static boolean ALLOW_INTER_PROCEDURAL_ANALYSIS
Enables interprocedural analysis between main script and functions as well as functions and other functions. This includes, for example, propagating statistics into functions if safe to do so (e.g., if called once).
 - 
IPA_NUM_REPETITIONSpublic static int IPA_NUM_REPETITIONS Number of inter-procedural analysis (IPA) repetitions. If set to >=2, we apply IPA multiple times in order to allow scalar propagation over complex function call graphs and various interactions between constant propagation, constant folding, and other rewrites such as branch removal and the merge of statement block sequences.
 - 
ALLOW_SUM_PRODUCT_REWRITESpublic static boolean ALLOW_SUM_PRODUCT_REWRITES Enables sum product rewrites such as mapmultchains. In the future, this will cover all sum-product related rewrites.
 - 
ALLOW_SPLIT_HOP_DAGSpublic static boolean ALLOW_SPLIT_HOP_DAGS Enables a specific hop dag rewrite that splits hop dags after csv persistent reads with unknown size in order to allow for recompile.
 - 
ALLOW_LOOP_UPDATE_IN_PLACEpublic static boolean ALLOW_LOOP_UPDATE_IN_PLACE Enables a specific rewrite that enables update in place for loop variables that are only read/updated via cp leftindexing.
 - 
ALLOW_UNARY_UPDATE_IN_PLACEpublic static boolean ALLOW_UNARY_UPDATE_IN_PLACE Enables the update-in-place for all unary operators with a single consumer. In this case we do not allocate the output, but directly write the output values back to the input block.
 - 
ALLOW_EVAL_FCALL_REPLACEMENTpublic static boolean ALLOW_EVAL_FCALL_REPLACEMENT Replace eval second-order function calls with normal function call if the function name is a known string (after constant propagation).
 - 
ALLOW_CODE_MOTIONpublic static boolean ALLOW_CODE_MOTION Enables a specific rewrite for code motion, i.e., hoisting loop invariant code out of while, for, and parfor loops.
 - 
FEDERATED_COMPILATIONpublic static boolean FEDERATED_COMPILATION Compile federated instructions based on input federation state and privacy constraints.
 - 
FEDERATED_SPECSpublic static Map<Integer,FEDInstruction.FederatedOutput> FEDERATED_SPECS 
 - 
PARALLEL_CP_READ_PARALLELISM_MULTIPLIER
public static final double PARALLEL_CP_READ_PARALLELISM_MULTIPLIER
Specifies a multiplier for computing the degree of parallelism of parallel text read/write from the available degree of parallelism. Set it to 1.0 to get a number of threads equal to the number of virtual cores.
- See Also:
- Constant Field Values
 
 - 
PARALLEL_CP_WRITE_PARALLELISM_MULTIPLIERpublic static final double PARALLEL_CP_WRITE_PARALLELISM_MULTIPLIER - See Also:
- Constant Field Values
 
 - 
ALLOW_COMBINE_FILE_INPUT_FORMAT
public static final boolean ALLOW_COMBINE_FILE_INPUT_FORMAT
Enables the use of CombineSequenceFileInputFormat with splitsize = 2x hdfs blocksize, if the sort buffer size is large enough and parallelism is not hurt. This solves two issues: (1) it combines small files (depending on producers), and (2) it reduces task latency of large jobs with many tasks by a factor of 2.
- See Also:
- Constant Field Values
 
 - 
ALLOW_SCRIPT_LEVEL_LOCAL_COMMANDpublic static boolean ALLOW_SCRIPT_LEVEL_LOCAL_COMMAND This variable allows for use of explicit local command, that forces a spark block to be executed and returned as a local block.
 - 
ALLOW_SCRIPT_LEVEL_COMPRESS_COMMAND
public static boolean ALLOW_SCRIPT_LEVEL_COMPRESS_COMMAND
Allows the user to insert compress and decompress commands in the DML script. It was added to provide a way to test and verify the correct placement of compress and decompress commands.
 - 
ALLOW_COMPRESSION_REWRITE
public static boolean ALLOW_COMPRESSION_REWRITE
Boolean specifying whether compression rewrites are allowed. This is disabled at run time if the IPA for workload-aware compression is activated.
 - 
ALLOW_TRANSITIVE_SPARK_EXEC_TYPE
public static boolean ALLOW_TRANSITIVE_SPARK_EXEC_TYPE
Enables transitive spark execution type selection. This refines the exec-type selection logic of unary aggregates by pushing the unary aggregates, whose inputs are created by spark instructions, to spark execution type as well.
 - 
ASYNC_TRIGGER_RDD_OPERATIONS
public static boolean ASYNC_TRIGGER_RDD_OPERATIONS
Enables prefetch and broadcast. Prefetch asynchronously calls acquireReadAndRelease() to trigger a chain of spark transformations, which would otherwise make the next instruction wait until completion. Broadcast allows asynchronously transferring the data to all the nodes.
 
- 
 - 
Method Detail- 
getOptLevelpublic static OptimizerUtils.OptimizationLevel getOptLevel() 
 - 
isMemoryBasedOptLevelpublic static boolean isMemoryBasedOptLevel() 
 - 
isOptLevelpublic static boolean isOptLevel(OptimizerUtils.OptimizationLevel level) 
 - 
constructCompilerConfigpublic static CompilerConfig constructCompilerConfig(DMLConfig dmlconf) 
 - 
constructCompilerConfigpublic static CompilerConfig constructCompilerConfig(CompilerConfig cconf, DMLConfig dmlconf) 
 - 
resetStaticCompilerFlagspublic static void resetStaticCompilerFlags() 
 - 
getDefaultSizepublic static long getDefaultSize() 
 - 
resetDefaultSizepublic static void resetDefaultSize() 
 - 
getDefaultFrameSizepublic static int getDefaultFrameSize() 
 - 
getLocalMemBudgetpublic static double getLocalMemBudget() Returns memory budget (according to util factor) in bytes- Returns:
- local memory budget
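
A minimal sketch of a budget "according to util factor", assuming the budget is simply the JVM's maximum heap size scaled by a utilization factor. The class name and the factor value 0.7 are illustrative assumptions; the actual value of MEM_UTIL_FACTOR is not stated in this document.

```java
// Hypothetical sketch, not the SystemDS implementation.
final class MemBudgetSketch {
    // Assumed utilization factor, for illustration only.
    static final double ASSUMED_UTIL_FACTOR = 0.7;

    // Budget in bytes for a given maximum heap size and utilization factor.
    static double localMemBudget(long maxHeapBytes, double utilFactor) {
        return maxHeapBytes * utilFactor;
    }

    // Budget in bytes for the current JVM, using the assumed factor.
    static double localMemBudget() {
        return localMemBudget(Runtime.getRuntime().maxMemory(), ASSUMED_UTIL_FACTOR);
    }
}
```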
 
 - 
getBufferPoolLimitpublic static long getBufferPoolLimit() Returns buffer pool size as set in the config- Returns:
- buffer pool size in bytes
 
 - 
isUMMEnabledpublic static boolean isUMMEnabled() Check if unified memory manager is in effect- Returns:
- boolean
 
 - 
disableUMM
public static void disableUMM()
Disable unified memory manager and fall back to static partitioning. Initialize LazyWriteBuffer with the default size (15%).
 - 
enableUMMpublic static void enableUMM() Enable unified memory manager and initialize with the default size (85%).
 - 
isMaxLocalParallelismpublic static boolean isMaxLocalParallelism(int k) 
 - 
isTopLevelParForpublic static boolean isTopLevelParFor() 
 - 
checkSparkBroadcastMemoryBudgetpublic static boolean checkSparkBroadcastMemoryBudget(double size) 
 - 
checkSparkBroadcastMemoryBudgetpublic static boolean checkSparkBroadcastMemoryBudget(long rlen, long clen, long blen, long nnz)
 - 
checkSparkCollectMemoryBudgetpublic static boolean checkSparkCollectMemoryBudget(DataCharacteristics dc, long memPinned) 
 - 
checkSparkCollectMemoryBudgetpublic static boolean checkSparkCollectMemoryBudget(DataCharacteristics dc, long memPinned, boolean checkBP) 
 - 
checkSparseBlockCSRConversionpublic static boolean checkSparseBlockCSRConversion(DataCharacteristics dcIn) 
 - 
getNumReducerspublic static int getNumReducers(boolean configOnly) Returns the number of reducers that potentially run in parallel. This is either just the configured value (SystemDS config) or the minimum of configured value and available reduce slots.- Parameters:
- configOnly- true if configured value
- Returns:
- number of reducers
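
The two cases described above can be sketched as follows. The helper is hypothetical and takes the configured value and the available reduce slots as plain parameters, whereas the real method obtains them from the configuration and the cluster.

```java
// Hypothetical sketch, not the SystemDS implementation.
final class NumReducersSketch {
    // configOnly=true: return the configured value as is;
    // otherwise cap it by the number of available reduce slots.
    static int numReducers(int configured, int availableSlots, boolean configOnly) {
        return configOnly ? configured : Math.min(configured, availableSlots);
    }
}
```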
 
 - 
getNumMapperspublic static int getNumMappers() 
 - 
getDefaultExecutionModepublic static Types.ExecMode getDefaultExecutionMode() 
 - 
isSparkExecutionModepublic static boolean isSparkExecutionMode() 
 - 
isHybridExecutionModepublic static boolean isHybridExecutionMode() 
 - 
getParallelTextReadParallelism
public static int getParallelTextReadParallelism()
Returns the degree of parallelism used for parallel text read. This is computed as the number of virtual cores scaled by the PARALLEL_CP_READ_PARALLELISM_MULTIPLIER. If PARALLEL_READ_TEXTFORMATS is disabled, this method returns 1.
- Returns:
- degree of parallelism
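
A sketch of this computation, with the number of virtual cores, the multiplier, and the enabled flag passed in as parameters. The helper is hypothetical; rounding up and the lower bound of one thread are assumptions, not behavior confirmed by this page.

```java
// Hypothetical sketch, not the SystemDS implementation.
final class ReadParallelismSketch {
    static int textReadParallelism(int vcores, double multiplier, boolean parallelReadEnabled) {
        // Parallel read disabled: single-threaded.
        if (!parallelReadEnabled)
            return 1;
        // Scale the virtual cores by the multiplier (rounding up is an assumption).
        return Math.max(1, (int) Math.ceil(vcores * multiplier));
    }
}
```

With a multiplier of 1.0 the result equals the number of virtual cores, which matches the note on the multiplier field.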
 
 - 
getParallelBinaryReadParallelismpublic static int getParallelBinaryReadParallelism() 
 - 
getParallelTextWriteParallelism
public static int getParallelTextWriteParallelism()
Returns the degree of parallelism used for parallel text write. This is computed as the number of virtual cores scaled by the PARALLEL_CP_WRITE_PARALLELISM_MULTIPLIER. If PARALLEL_WRITE_TEXTFORMATS is disabled, this method returns 1.
- Returns:
- degree of parallelism
 
 - 
getParallelBinaryWriteParallelismpublic static int getParallelBinaryWriteParallelism() 
 - 
estimateSizepublic static long estimateSize(DataCharacteristics dc) 
 - 
estimateSizeExactSparsitypublic static long estimateSizeExactSparsity(DataCharacteristics dc) 
 - 
estimateSizeExactSparsity
public static long estimateSizeExactSparsity(long nrows, long ncols, long nnz)
Estimates the footprint (in bytes) for an in-memory representation of a matrix with dimensions=(nrows,ncols) and number of non-zeros nnz.
- Parameters:
- nrows- number of rows
- ncols- number of cols
- nnz- number of non-zeros
- Returns:
- memory footprint
 
 - 
estimateSizeExactSparsity
public static long estimateSizeExactSparsity(long nrows, long ncols, double sp)
Estimates the footprint (in bytes) for an in-memory representation of a matrix with dimensions=(nrows,ncols) and sparsity=sp. This function can be used directly in Hops when the actual sparsity is known, i.e., sp is guaranteed to give a worst-case estimate (e.g., Rand with a fixed sparsity). In all other cases, estimateSize() must be used so that worst-case estimates are computed whenever applicable.
- Parameters:
- nrows- number of rows
- ncols- number of cols
- sp- sparsity
- Returns:
- memory footprint
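
To give a feel for how sparsity affects such an estimate, here is a deliberately simplified sketch that compares a dense layout (one double per cell) with a CSR-like sparse layout (value and column index per non-zero, plus row pointers) and takes the smaller one. This is an illustrative model under those assumptions, not the formula SystemDS uses; object headers and other overheads are ignored.

```java
// Hypothetical, simplified size model; NOT the SystemDS formula.
final class SizeEstimateSketch {
    static final long DOUBLE_BYTES = 8;
    static final long INT_BYTES = 4;

    // Dense layout: one double per cell.
    static long denseBytes(long nrows, long ncols) {
        return nrows * ncols * DOUBLE_BYTES;
    }

    // CSR-like layout: value + column index per non-zero, plus row pointers.
    static long sparseBytes(long nrows, long ncols, double sp) {
        long nnz = (long) Math.ceil(sp * nrows * ncols);
        return nnz * (DOUBLE_BYTES + INT_BYTES) + (nrows + 1) * INT_BYTES;
    }

    // Estimate for the cheaper of the two layouts.
    static long estimate(long nrows, long ncols, double sp) {
        return Math.min(denseBytes(nrows, ncols), sparseBytes(nrows, ncols, sp));
    }
}
```

In this model a fully dense 1000 x 1000 matrix costs 8,000,000 bytes, and the sparse layout only pays off once the sparsity drops below roughly two thirds.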
 
 - 
estimatePartitionedSizeExactSparsitypublic static long estimatePartitionedSizeExactSparsity(DataCharacteristics dc) Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with the given matrix characteristics- Parameters:
- dc- matrix characteristics
- Returns:
- memory estimate
 
 - 
estimatePartitionedSizeExactSparsitypublic static long estimatePartitionedSizeExactSparsity(DataCharacteristics dc, boolean outputEmptyBlocks) 
 - 
estimatePartitionedSizeExactSparsity
public static long estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, long nnz)
Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with dimensions=(rlen,clen) and number of non-zeros nnz.
- Parameters:
- rlen- number of rows
- clen- number of cols
- blen- rows/cols per block
- nnz- number of non-zeros
- Returns:
- memory estimate
 
 - 
estimatePartitionedSizeExactSparsitypublic static long estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, long nnz, boolean outputEmptyBlocks)
 - 
estimatePartitionedSizeExactSparsity
public static long estimatePartitionedSizeExactSparsity(Hop hop)
Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with the hop's dimensions and number of non-zeros nnz.
- Parameters:
- hop- The hop to extract dimensions and nnz from
- Returns:
- the memory estimate
 
 - 
estimatePartitionedSizeExactSparsity
public static long estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, double sp)
Estimates the footprint (in bytes) for a partitioned in-memory representation of a matrix with dimensions=(rlen,clen) and sparsity=sp.
- Parameters:
- rlen- number of rows
- clen- number of cols
- blen- rows/cols per block
- sp- sparsity
- Returns:
- memory estimate
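
For a partitioned representation, the matrix is split into blocks of blen x blen cells (boundary blocks may be smaller). The sketch below counts the blocks and adds an assumed fixed bookkeeping cost per block to a dense payload. Both the overhead constant and the dense-only payload are illustrative assumptions, not SystemDS behavior.

```java
// Hypothetical, simplified model; NOT the SystemDS formula.
final class PartitionedSizeSketch {
    // Assumed per-block bookkeeping overhead in bytes (illustrative only).
    static final long ASSUMED_BLOCK_OVERHEAD = 64;

    // Number of blocks along one dimension (ceiling division).
    static long numBlocks(long len, long blen) {
        return (len + blen - 1) / blen;
    }

    // Dense payload (8 bytes per cell) plus the assumed overhead per block.
    static long partitionedDenseBytes(long rlen, long clen, long blen) {
        long blocks = numBlocks(rlen, blen) * numBlocks(clen, blen);
        return rlen * clen * 8 + blocks * ASSUMED_BLOCK_OVERHEAD;
    }
}
```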
 
 - 
estimatePartitionedSizeExactSparsitypublic static long estimatePartitionedSizeExactSparsity(long rlen, long clen, long blen, double sp, boolean outputEmptyBlocks)
 - 
estimateSizepublic static long estimateSize(long nrows, long ncols)Similar to estimate() except that it provides worst-case estimates when the optimization type is ROBUST.- Parameters:
- nrows- number of rows
- ncols- number of cols
- Returns:
- memory estimate
 
 - 
estimateSizeEmptyBlockpublic static long estimateSizeEmptyBlock(long nrows, long ncols)
 - 
estimateSizeTextOutputpublic static long estimateSizeTextOutput(long rows, long cols, long nnz, Types.FileFormat fmt)
 - 
estimateSizeTextOutputpublic static long estimateSizeTextOutput(int[] dims, long nnz, Types.FileFormat fmt)
 - 
isIndexingRangeBlockAlignedpublic static boolean isIndexingRangeBlockAligned(IndexRange ixrange, DataCharacteristics mc) Indicates if the given indexing range is block aligned, i.e., it does not require global aggregation of blocks.- Parameters:
- ixrange- indexing range
- mc- matrix characteristics
- Returns:
- true if indexing range is block aligned
 
 - 
isIndexingRangeBlockAlignedpublic static boolean isIndexingRangeBlockAligned(long rl, long ru, long cl, long cu, long blen)Indicates if the given indexing range is block aligned, i.e., it does not require global aggregation of blocks.- Parameters:
- rl- rows lower
- ru- rows upper
- cl- cols lower
- cu- cols upper
- blen- rows/cols per block
- Returns:
- true if indexing range is block aligned
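
One simplified way to think about block alignment: if the lower bounds of the range start at block boundaries, every input block maps to exactly one output block, so no blocks need to be merged. The sketch below checks only this sufficient condition, assuming 1-based inclusive bounds; the actual method may accept additional cases.

```java
// Hypothetical sketch of a sufficient condition; the real check may be broader.
final class BlockAlignmentSketch {
    // Bounds are assumed to be 1-based and inclusive.
    static boolean startsAtBlockBoundary(long rl, long ru, long cl, long cu, long blen) {
        if (rl < 1 || cl < 1 || ru < rl || cu < cl)
            return false; // unknown or invalid range
        return (rl - 1) % blen == 0 && (cl - 1) % blen == 0;
    }
}
```

For blen=1000, the range rows 1..1000, cols 1001..2000 passes this check, while a range starting at row 2 does not.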
 
 - 
isValidCPDimensionspublic static boolean isValidCPDimensions(DataCharacteristics mc) 
 - 
isValidCPDimensions
public static boolean isValidCPDimensions(long rows, long cols)
Returns false if the dimensions are known to be invalid; otherwise true.
- Parameters:
- rows- number of rows
- cols- number of cols
- Returns:
- true if dimensions valid
 
 - 
isValidCPDimensions
public static boolean isValidCPDimensions(Types.ValueType[] schema, String[] names)
Returns false if schema and names are not properly specified; otherwise true. Both arrays must have length > 0 and equal lengths.
- Parameters:
- schema- the schema
- names- the names
- Returns:
- false if schema and names are not properly specified
 
 - 
isValidCPMatrixSize
public static boolean isValidCPMatrixSize(long rows, long cols, double sparsity)
Determines whether the matrix size is valid for representation in CP data structures. Note that sparsity needs to be specified as rows*cols if unknown.
- Parameters:
- rows- number of rows
- cols- number of cols
- sparsity- the sparsity
- Returns:
- true if valid matrix size
 
 - 
exceedsCachingThresholdpublic static boolean exceedsCachingThreshold(long dim2, double outMem)Indicates if the given matrix characteristics exceed the threshold for caching, i.e., the matrix should be cached.- Parameters:
- dim2- dimension 2
- outMem- ?
- Returns:
- true if the given matrix characteristics exceed threshold
 
 - 
getUniqueTempFileNamepublic static String getUniqueTempFileName() Wrapper over internal filename construction for external usage.- Returns:
- unique temp file name
 
 - 
allowsToFilterEmptyBlockOutputspublic static boolean allowsToFilterEmptyBlockOutputs(Hop hop) 
 - 
getConstrainedNumThreadspublic static int getConstrainedNumThreads(int maxNumThreads) 
 - 
getTransformNumThreadspublic static int getTransformNumThreads() 
 - 
getDefaultLogLevelpublic static org.apache.log4j.Level getDefaultLogLevel() 
 - 
getMatMultNnzpublic static long getMatMultNnz(double sp1, double sp2, long m, long k, long n, boolean worstcase)
 - 
getMatMultSparsitypublic static double getMatMultSparsity(double sp1, double sp2, long m, long k, long n, boolean worstcase)Estimates the result sparsity for Matrix Multiplication A %*% B.- Parameters:
- sp1- sparsity of A
- sp2- sparsity of B
- m- nrow(A)
- k- ncol(A), nrow(B)
- n- ncol(B)
- worstcase- true if worst case
- Returns:
- the sparsity
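
A common way to estimate the output sparsity of A %*% B is sketched below. The average case assumes uniformly and independently distributed non-zeros, so an output cell is non-zero unless all k products are zero. The worst case bounds the output by the fraction of possibly non-empty rows of A times the fraction of possibly non-empty columns of B. This is a sketch under those assumptions and is not claimed to be the exact SystemDS code.

```java
// Hypothetical sketch under a uniform-sparsity assumption.
final class MatMultSparsitySketch {
    static double estimate(double sp1, double sp2, long m, long k, long n, boolean worstcase) {
        if (worstcase) {
            // At most min(m, nnz1) rows of A and min(n, nnz2) columns of B are non-empty.
            double nnz1 = sp1 * m * k;
            double nnz2 = sp2 * k * n;
            return Math.min(1, nnz1 / m) * Math.min(1, nnz2 / n);
        }
        // P(output cell != 0) = 1 - P(all k products are zero).
        return 1 - Math.pow(1 - sp1 * sp2, k);
    }
}
```

Note how quickly the average-case estimate approaches 1 as the common dimension k grows, even for fairly sparse inputs.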
 
 - 
getLeftIndexingSparsitypublic static double getLeftIndexingSparsity(long rlen1, long clen1, long nnz1, long rlen2, long clen2, long nnz2)
 - 
isBinaryOpConditionalSparseSafepublic static boolean isBinaryOpConditionalSparseSafe(Types.OpOp2 op) Determines if a given binary op is potentially conditional sparse safe.- Parameters:
- op- the HOP OpOp2
- Returns:
- true if potentially conditional sparse safe
 
 - 
isBinaryOpConditionalSparseSafeExact
public static boolean isBinaryOpConditionalSparseSafeExact(Types.OpOp2 op, LiteralOp lit)
Determines if a given binary op with scalar literal guarantees an output sparsity which is exactly the same as its matrix input sparsity.
- Parameters:
- op- the HOP OpOp2
- lit- literal operator
- Returns:
- true if output sparsity same as matrix input sparsity
 
 - 
isBinaryOpSparsityConditionalSparseSafepublic static boolean isBinaryOpSparsityConditionalSparseSafe(Types.OpOp2 op, LiteralOp lit) 
 - 
getBinaryOpSparsityConditionalSparseSafepublic static double getBinaryOpSparsityConditionalSparseSafe(double sp1, Types.OpOp2 op, LiteralOp lit)
 - 
getBinaryOpSparsitypublic static double getBinaryOpSparsity(double sp1, double sp2, Types.OpOp2 op, boolean worstcase)Estimates the result sparsity for matrix-matrix binary operations (A op B)- Parameters:
- sp1- sparsity of A
- sp2- sparsity of B
- op- binary operation
- worstcase- true if worst case
- Returns:
- result sparsity for matrix-matrix binary operations
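
For cell-wise binary operations, typical estimates distinguish operations whose result is non-zero only if both inputs are non-zero (such as multiplication) from those whose result can be non-zero if either input is (such as addition). The sketch below uses a local two-value `Op` enum as a hypothetical stand-in for Types.OpOp2 and assumes independent, uniformly distributed non-zeros in the average case.

```java
// Hypothetical sketch; Op is a local stand-in, not Types.OpOp2.
final class BinaryOpSparsitySketch {
    enum Op { PLUS, MULT }

    static double estimate(double sp1, double sp2, Op op, boolean worstcase) {
        switch (op) {
            case MULT:
                // Non-zero only where both inputs are non-zero.
                return worstcase ? Math.min(sp1, sp2) : sp1 * sp2;
            case PLUS:
                // Non-zero where at least one input is non-zero.
                return worstcase ? Math.min(1, sp1 + sp2) : sp1 + sp2 - sp1 * sp2;
            default:
                throw new IllegalArgumentException("unsupported op: " + op);
        }
    }
}
```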
 
 - 
getOuterNonZerospublic static long getOuterNonZeros(long n1, long n2, long nnz1, long nnz2, Types.OpOp2 op)
 - 
getNnzpublic static long getNnz(long dim1, long dim2, double sp)
 - 
getSparsitypublic static double getSparsity(DataCharacteristics dc) 
 - 
getSparsitypublic static double getSparsity(long dim1, long dim2, long nnz)
 - 
getSparsitypublic static double getSparsity(Hop hop) 
 - 
getSparsitypublic static double getSparsity(long[] dims, long nnz)
 - 
toMBpublic static String toMB(double inB) 
 - 
getNumIterationspublic static long getNumIterations(ForProgramBlock fpb, long defaultValue) 
 - 
getNumIterationspublic static long getNumIterations(ForStatementBlock fsb, long defaultValue) 
 - 
getNumIterationspublic static long getNumIterations(ForProgramBlock fpb, LocalVariableMap vars, long defaultValue) 
 - 
rEvalSimpleLongExpression
public static long rEvalSimpleLongExpression(Hop root, HashMap<Long,Long> valMemo)
Evaluates simple size expressions over literals and nrow/ncol. It returns the exact result of the expression if known, otherwise Long.MAX_VALUE.
- Parameters:
- root- the root high-level operator
- valMemo- ?
- Returns:
- size expression
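
The evaluation scheme described above (recursive evaluation with Long.MAX_VALUE as the "unknown" marker) can be sketched on a toy expression tree. The `Node` type is hypothetical, and treating the map as a memo table keyed by node ID is an assumption made for this sketch; the page itself does not document the valMemo parameter. Overflow handling is omitted.

```java
import java.util.HashMap;

// Hypothetical sketch; Node is NOT a SystemDS Hop.
final class SizeExprSketch {
    static final class Node {
        final long id;
        final char kind;   // 'L' literal, 'U' unknown, '+' or '*'
        final long value;  // only used for literals
        final Node left, right;

        Node(long id, char kind, long value, Node left, Node right) {
            this.id = id;
            this.kind = kind;
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static Node lit(long id, long v) { return new Node(id, 'L', v, null, null); }
    static Node unknown(long id) { return new Node(id, 'U', 0, null, null); }
    static Node bin(long id, char op, Node l, Node r) { return new Node(id, op, 0, l, r); }

    // Returns the exact value if all inputs are known, else Long.MAX_VALUE.
    static long eval(Node n, HashMap<Long, Long> memo) {
        Long cached = memo.get(n.id);
        if (cached != null)
            return cached;
        long ret = Long.MAX_VALUE;
        if (n.kind == 'L') {
            ret = n.value;
        } else if (n.kind == '+' || n.kind == '*') {
            long l = eval(n.left, memo);
            long r = eval(n.right, memo);
            if (l != Long.MAX_VALUE && r != Long.MAX_VALUE)
                ret = (n.kind == '+') ? l + r : l * r;
        }
        memo.put(n.id, ret);
        return ret;
    }
}
```

In this sketch an unknown input anywhere in the tree propagates upward, so the caller only has to compare the final result against Long.MAX_VALUE.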
 
 - 
rEvalSimpleLongExpressionpublic static long rEvalSimpleLongExpression(Hop root, HashMap<Long,Long> valMemo, LocalVariableMap vars) 
 - 
rEvalSimpleDoubleExpressionpublic static double rEvalSimpleDoubleExpression(Hop root, HashMap<Long,Double> valMemo) 
 - 
rEvalSimpleDoubleExpressionpublic static double rEvalSimpleDoubleExpression(Hop root, HashMap<Long,Double> valMemo, LocalVariableMap vars) 
 
- 
 
-