Class CacheableData<T extends CacheBlock>
- java.lang.Object
  - org.apache.sysds.runtime.instructions.cp.Data
    - org.apache.sysds.runtime.controlprogram.caching.CacheableData<T>

- All Implemented Interfaces: Serializable
- Direct Known Subclasses: FrameObject, MatrixObject, TensorObject
 
public abstract class CacheableData<T extends CacheBlock> extends Data

Each object of this class is a cache envelope for some large piece of data called a "cache block". For example, the body of a matrix can be the cache block. The term "cache block" refers strictly to the cacheable portion of the data object, often excluding metadata and auxiliary parameters, as defined in the subclasses. Under the protection of the envelope, the data blob may be evicted to the file system; the subclass must then set its reference to null to allow Java garbage collection. If other parts of the system continue to keep references to the cache block, its eviction will not release any memory.

- See Also: Serialized Form
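
The envelope idea described above can be illustrated with a minimal, self-contained sketch. The class `ToyEnvelope` and all of its members are hypothetical and exist only for this illustration; they are not part of SystemDS, and the real eviction logic is more involved. The sketch shows the one point the description stresses: eviction only frees memory if the envelope holds the last strong reference to the block.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical toy envelope; not the SystemDS implementation.
class ToyEnvelope {
    private byte[] block;      // strong reference to the "cache block"
    private Path evictedFile;  // non-null while the block lives on disk

    ToyEnvelope(byte[] block) { this.block = block; }

    // Writes the block to a temp file and drops the strong reference.
    void evict() {
        if (block == null) return;
        try {
            evictedFile = Files.createTempFile("toy-cache-", ".dat");
            Files.write(evictedFile, block);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        block = null; // eligible for GC only if nobody else references it
    }

    // Restores the block into memory if it was evicted, then returns it.
    byte[] read() {
        if (block == null && evictedFile != null) {
            try {
                block = Files.readAllBytes(evictedFile);
                Files.delete(evictedFile);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            evictedFile = null;
        }
        return block;
    }

    boolean isInMemory() { return block != null; }
}

class ToyEnvelopeDemo {
    public static void main(String[] args) {
        byte[] data = new byte[] {1, 2, 3};
        ToyEnvelope env = new ToyEnvelope(data);
        env.evict();
        // 'data' still references the array, so evicting released no memory here.
        System.out.println(env.isInMemory());   // prints false
        System.out.println(env.read().length);  // prints 3
    }
}
```

In the demo, the local variable `data` plays the role of "other parts of the system" that keep a reference: the envelope considers the block evicted, yet the array stays reachable.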
 
- 
- 
Nested Class Summary
Modifier and Type, Class: Description
- static class CacheableData.CacheStatus: Defines all possible cache status types for a data blob.
 - 
Field Summary
Modifier and Type, Field
- static String cacheEvictionLocalFilePath
- static String cacheEvictionLocalFilePrefix
- static boolean CACHING_ASYNC_FILECLEANUP
- static boolean CACHING_ASYNC_SERIALIZE
- static boolean CACHING_BUFFER_PAGECACHE
- static LazyWriteBuffer.RPolicy CACHING_BUFFER_POLICY
- static String CACHING_COUNTER_GROUP_NAME
- static String CACHING_EVICTION_FILEEXTENSION
- static long CACHING_THRESHOLD
- static boolean CACHING_WRITE_CACHE_ON_READ
 - 
Method Summary
Modifier and Type, Method: Description
- T acquireModify(T newData): Acquires the exclusive "write" lock for a thread that wants to throw away the old cache block data and link up with new cache block data.
- T acquireRead(): Acquires a shared "read-only" lock, produces the reference to the cache block, restores the cache block to main memory, reads from HDFS if needed.
- T acquireReadAndRelease()
- static void addBroadcastSize(long size)
- static void cleanupCacheDir()
- static void cleanupCacheDir(boolean withDir): Deletes the DML-script-specific caching working dir.
- void clearData()
- void clearData(long tid): Sets the cache block reference to null, abandons the old block.
- static void disableCaching()
- static void enableCaching()
- void enableCleanup(boolean flag): Enables or disables the cleanup of the associated data object on clearData().
- void exportData()
- void exportData(int replication): Writes, or flushes, the cache block data to HDFS.
- void exportData(String fName, String outputFormat)
- void exportData(String fName, String outputFormat, int replication, FileFormatProperties formatProperties): Synchronized because there might be parallel threads (parfor local) that access the same object (in case it was created before the loop).
- void exportData(String fName, String outputFormat, FileFormatProperties formatProperties)
- void freeEvictedBlob(): Low-level cache I/O method that deletes the file containing the evicted data blob, without reading it.
- long getBlocksize()
- BroadcastObject<T> getBroadcastHandle()
- static long getBroadcastSize()
- LineageItem getCacheLineage()
- long getCompressedSize()
- DataCharacteristics getDataCharacteristics()
- long getDataSize()
- String getDebugName()
- long getDim(int dim)
- FederationMap getFedMapping(): Gets the mapping of indices ranges to federated objects.
- FileFormatProperties getFileFormatProperties()
- String getFileName()
- GPUObject getGPUObject(GPUContext gCtx)
- MetaData getMetaData()
- long getNumColumns()
- long getNumRows()
- RDDObject getRDDHandle()
- CacheableData.CacheStatus getStatus()
- long getUniqueID()
- boolean hasValidLineage()
- static void initCaching(): Inits caching with the default uuid of DMLScript
- static void initCaching(String uuid): Creates the DML-script-specific caching working dir.
- static boolean isBelowCachingThreshold(CacheBlock data)
- boolean isCached(boolean inclCachedNoWrite)
- static boolean isCachingActive()
- boolean isCleanupEnabled(): Indicates if cleanup of the associated data object is enabled on clearData().
- boolean isCompressed()
- boolean isDirty(): true if the in-memory or evicted matrix may be different from the matrix located at _hdfsFileName; false if the two matrices are supposed to be the same.
- boolean isFederated(): Check if object is federated.
- boolean isFederated(FTypes.FType type)
- boolean isFederatedExcept(FTypes.FType type)
- boolean isHDFSFileExists()
- boolean isPendingRDDOps()
- boolean moveData(String fName, String outputFormat)
- abstract void refreshMetaData()
- void release(): Releases the shared ("read-only") or exclusive ("write") lock.
- void removeGPUObject(GPUContext gCtx)
- void removeMetaData()
- void setBroadcastHandle(BroadcastObject bc)
- void setCacheLineage(LineageItem li)
- void setCompressedSize(long size)
- void setDirty(boolean flag)
- void setEmptyStatus()
- void setFedMapping(FederationMap fedMapping): Sets the mapping of indices ranges to federated objects.
- void setFileFormatProperties(FileFormatProperties props)
- void setFileName(String file)
- void setGPUObject(GPUContext gCtx, GPUObject gObj)
- void setHDFSFileExists(boolean flag)
- void setMetaData(MetaData md)
- void setRDDHandle(RDDObject rdd)
- String toString()
Methods inherited from class org.apache.sysds.runtime.instructions.cp.Data: getDataType, getPrivacyConstraint, getValueType, setPrivacyConstraints, updateDataCharacteristics
 
- 
 
- 
- 
- 
Field Detail

CACHING_THRESHOLD
public static final long CACHING_THRESHOLD

CACHING_BUFFER_POLICY
public static final LazyWriteBuffer.RPolicy CACHING_BUFFER_POLICY

CACHING_BUFFER_PAGECACHE
public static final boolean CACHING_BUFFER_PAGECACHE
- See Also: Constant Field Values

CACHING_WRITE_CACHE_ON_READ
public static final boolean CACHING_WRITE_CACHE_ON_READ
- See Also: Constant Field Values

CACHING_COUNTER_GROUP_NAME
public static final String CACHING_COUNTER_GROUP_NAME
- See Also: Constant Field Values

CACHING_EVICTION_FILEEXTENSION
public static final String CACHING_EVICTION_FILEEXTENSION
- See Also: Constant Field Values

CACHING_ASYNC_FILECLEANUP
public static final boolean CACHING_ASYNC_FILECLEANUP
- See Also: Constant Field Values

CACHING_ASYNC_SERIALIZE
public static final boolean CACHING_ASYNC_SERIALIZE
- See Also: Constant Field Values

cacheEvictionLocalFilePath
public static String cacheEvictionLocalFilePath

cacheEvictionLocalFilePrefix
public static String cacheEvictionLocalFilePrefix
 
- 
 - 
Method Detail

enableCleanup
public void enableCleanup(boolean flag)
Enables or disables the cleanup of the associated data object on clearData().
- Parameters:
- flag - true to enable cleanup, false to disable it

isCleanupEnabled
public boolean isCleanupEnabled()
Indicates if cleanup of the associated data object is enabled on clearData().
- Returns:
- true if cleanup enabled

getStatus
public CacheableData.CacheStatus getStatus()

isHDFSFileExists
public boolean isHDFSFileExists()

setHDFSFileExists
public void setHDFSFileExists(boolean flag)

getFileName
public String getFileName()

getUniqueID
public long getUniqueID()

setFileName
public void setFileName(String file)

isDirty
public boolean isDirty()
true if the in-memory or evicted matrix may be different from the matrix located at _hdfsFileName; false if the two matrices are supposed to be the same.
- Returns:
- true if dirty

setDirty
public void setDirty(boolean flag)

getFileFormatProperties
public FileFormatProperties getFileFormatProperties()

setFileFormatProperties
public void setFileFormatProperties(FileFormatProperties props)

setMetaData
public void setMetaData(MetaData md)
- Overrides:
- setMetaData in class Data

setCompressedSize
public void setCompressedSize(long size)

isCompressed
public boolean isCompressed()

getCompressedSize
public long getCompressedSize()

getMetaData
public MetaData getMetaData()
- Overrides:
- getMetaData in class Data

removeMetaData
public void removeMetaData()
- Overrides:
- removeMetaData in class Data

getDataCharacteristics
public DataCharacteristics getDataCharacteristics()

getDim
public long getDim(int dim)

getNumRows
public long getNumRows()

getNumColumns
public long getNumColumns()

getBlocksize
public long getBlocksize()

refreshMetaData
public abstract void refreshMetaData()

getCacheLineage
public LineageItem getCacheLineage()

setCacheLineage
public void setCacheLineage(LineageItem li)

hasValidLineage
public boolean hasValidLineage()

isFederated
public boolean isFederated()
Checks if the object is federated.
- Returns:
- true if federated, else false

isFederated
public boolean isFederated(FTypes.FType type)

isFederatedExcept
public boolean isFederatedExcept(FTypes.FType type)

getFedMapping
public FederationMap getFedMapping()
Gets the mapping of indices ranges to federated objects.
- Returns:
- the federation mapping (fedMapping)

setFedMapping
public void setFedMapping(FederationMap fedMapping)
Sets the mapping of indices ranges to federated objects.
- Parameters:
- fedMapping - mapping

getRDDHandle
public RDDObject getRDDHandle()

setRDDHandle
public void setRDDHandle(RDDObject rdd)

getBroadcastHandle
public BroadcastObject<T> getBroadcastHandle()

setBroadcastHandle
public void setBroadcastHandle(BroadcastObject bc)

getGPUObject
public GPUObject getGPUObject(GPUContext gCtx)

setGPUObject
public void setGPUObject(GPUContext gCtx, GPUObject gObj)

removeGPUObject
public void removeGPUObject(GPUContext gCtx)

acquireReadAndRelease
public T acquireReadAndRelease()
 - 
acquireRead
public T acquireRead()
Acquires a shared "read-only" lock, produces the reference to the cache block, restores the cache block to main memory, and reads from HDFS if needed. Synchronized because there might be parallel threads (parfor local) that access the same object (in case it was created before the loop).
In-Status: EMPTY, EVICTABLE, EVICTED, READ; Out-Status: READ(+1).
- Returns:
- cacheable data

acquireModify
public T acquireModify(T newData)
Acquires the exclusive "write" lock for a thread that wants to throw away the old cache block data and link up with new cache block data. Abandons the old data without reading it and sets the new data reference.
In-Status: EMPTY, EVICTABLE, EVICTED; Out-Status: MODIFY.
- Parameters:
- newData - new data
- Returns:
- cacheable data

release
public void release()
Releases the shared ("read-only") or exclusive ("write") lock. Updates size information, last-access time, metadata, etc. Synchronized because there might be parallel threads (parfor local) that access the same object (in case it was created before the loop).
In-Status: READ, MODIFY; Out-Status: READ(-1), EVICTABLE, EMPTY.
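
The In-Status/Out-Status notes for acquireRead, acquireModify, and release can be read as a small state machine. The following is a minimal sketch of that reading only. `ToyCacheState` is a hypothetical class, the status names are copied from the notes above (the constants of the actual `CacheableData.CacheStatus` enum may differ), and the toy never evicts, so EVICTED appears only as a permitted in-status.

```java
// Hypothetical toy model of the documented status transitions; not the SystemDS implementation.
class ToyCacheState {
    enum Status { EMPTY, EVICTABLE, EVICTED, READ, MODIFY }

    private Status status = Status.EMPTY;
    private int readers = 0;          // number of active shared read locks
    private boolean hasData = false;  // whether a block is linked to the envelope

    // In-Status: EMPTY, EVICTABLE, EVICTED, READ; Out-Status: READ(+1)
    synchronized void acquireRead() {
        if (status == Status.MODIFY)
            throw new IllegalStateException("acquireRead in status " + status);
        hasData = true; // stand-in for restoring the block or reading it from storage
        status = Status.READ;
        readers++;
    }

    // In-Status: EMPTY, EVICTABLE, EVICTED; Out-Status: MODIFY
    synchronized void acquireModify(Object newData) {
        if (status == Status.READ || status == Status.MODIFY)
            throw new IllegalStateException("acquireModify in status " + status);
        hasData = (newData != null); // old data is abandoned without reading it
        status = Status.MODIFY;
    }

    // In-Status: READ, MODIFY; Out-Status: READ(-1), EVICTABLE, EMPTY
    synchronized void release() {
        if (status == Status.READ) {
            readers--;
            if (readers > 0) return; // other readers still hold the shared lock
        } else if (status != Status.MODIFY) {
            throw new IllegalStateException("release in status " + status);
        }
        status = hasData ? Status.EVICTABLE : Status.EMPTY;
    }

    synchronized Status status() { return status; }
    synchronized int readers() { return readers; }
}

class ToyCacheStateDemo {
    public static void main(String[] args) {
        ToyCacheState s = new ToyCacheState();
        s.acquireRead();
        s.acquireRead();
        System.out.println(s.status() + " readers=" + s.readers()); // prints READ readers=2
        s.release();
        s.release();
        System.out.println(s.status()); // prints EVICTABLE
    }
}
```

The reader counter models the "READ(+1)" and "READ(-1)" notation: several threads may hold the read lock at once, and the status only leaves READ when the last reader releases.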
 - 
clearData
public void clearData()

clearData
public void clearData(long tid)
Sets the cache block reference to null and abandons the old block. Makes the "envelope" empty. Run it to finalize the object (otherwise the evicted cache block file may remain undeleted).
In-Status: EMPTY, EVICTABLE, EVICTED; Out-Status: EMPTY.
- Parameters:
- tid - thread ID

exportData
public void exportData()

exportData
public void exportData(int replication)
Writes, or flushes, the cache block data to HDFS.
In-Status: EMPTY, EVICTABLE, EVICTED, READ; Out-Status: EMPTY, EVICTABLE, EVICTED, READ.
- Parameters:
- replication - ?

exportData
public void exportData(String fName, String outputFormat, FileFormatProperties formatProperties)

exportData
public void exportData(String fName, String outputFormat, int replication, FileFormatProperties formatProperties)
Synchronized because there might be parallel threads (parfor local) that access the same object (in case it was created before the loop). If all threads export the same data object concurrently, errors result because they all write to the same file. Efficiency for loops and parallel threads is achieved by checking whether the in-memory block is dirty.
NOTE: MB: we do not use dfs copy from local (evicted) to HDFS because this would ignore the output format and, most importantly, would bypass reblocking during write (which affects the potential degree of parallelism). However, we copy files on HDFS if certain criteria are met.
- Parameters:
- fName - file name
- outputFormat - format
- replication - ?
- formatProperties - file format properties
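
The combination of synchronization and a dirty check described for exportData can be sketched as follows. This is an illustration under assumptions, not the SystemDS code: `ToyExporter`, its fields, and the rule "skip when the target is the object's own file, the file exists, and the block is clean" are hypothetical simplifications, and the write itself is replaced by a counter.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical toy model; names and fields are illustrative, not SystemDS internals.
class ToyExporter {
    private final String fileName;       // the object's own backing file
    private boolean dirty = true;        // in-memory block may differ from the file
    private boolean fileExists = false;
    private final AtomicInteger writes = new AtomicInteger();

    ToyExporter(String fileName) { this.fileName = fileName; }

    // synchronized: concurrent callers must not write the same file at the same time
    synchronized void export(String fName) {
        boolean ownFile = fName.equals(fileName);
        if (ownFile && !dirty && fileExists)
            return; // file is already up to date, skip the redundant write
        writes.incrementAndGet(); // stand-in for the actual write
        if (ownFile) {
            dirty = false;
            fileExists = true;
        }
    }

    synchronized void markDirty() { dirty = true; }

    int writeCount() { return writes.get(); }
}

class ToyExporterDemo {
    // Runs nThreads concurrent exports to the same file and returns the number of writes.
    static int runConcurrentExports(int nThreads) {
        ToyExporter exp = new ToyExporter("out.bin");
        Thread[] threads = new Thread[nThreads];
        for (int i = 0; i < nThreads; i++) {
            threads[i] = new Thread(() -> exp.export("out.bin"));
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return exp.writeCount();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentExports(4)); // prints 1
    }
}
```

Because `export` is synchronized, the first thread performs the write and clears the dirty flag; every later thread sees a clean block and returns immediately, which is the efficiency argument made in the description.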
 
 - 
freeEvictedBlob
public final void freeEvictedBlob()
Low-level cache I/O method that deletes the file containing the evicted data blob, without reading it. Must be defined by a subclass, never called by users.

isBelowCachingThreshold
public static boolean isBelowCachingThreshold(CacheBlock data)

getDataSize
public long getDataSize()

getDebugName
public String getDebugName()
- Specified by:
- getDebugName in class Data

isCached
public boolean isCached(boolean inclCachedNoWrite)

setEmptyStatus
public void setEmptyStatus()

isPendingRDDOps
public boolean isPendingRDDOps()

addBroadcastSize
public static void addBroadcastSize(long size)

getBroadcastSize
public static long getBroadcastSize()

cleanupCacheDir
public static void cleanupCacheDir()

cleanupCacheDir
public static void cleanupCacheDir(boolean withDir)
Deletes the DML-script-specific caching working dir.
- Parameters:
- withDir - if true, delete directory

initCaching
public static void initCaching() throws IOException
Initializes caching with the default UUID of DMLScript.
- Throws:
- IOException - if IOException occurs

initCaching
public static void initCaching(String uuid) throws IOException
Creates the DML-script-specific caching working dir. Takes the UUID as a parameter to allow a custom UUID, e.g., for remote parfor caching.
- Parameters:
- uuid - ID
- Throws:
- IOException - if IOException occurs
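
The pairing of initCaching(String uuid) and cleanupCacheDir(boolean withDir) amounts to creating and later removing a per-script working directory. A minimal sketch of that lifecycle follows. `ToyCacheDir`, the `cache_` directory prefix, and the helper methods are assumptions made for this illustration; the actual directory layout used by SystemDS is not stated in this page.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical toy working-directory manager; not the SystemDS implementation.
class ToyCacheDir {
    private final Path dir;

    // Creates a working directory whose name embeds the given UUID (layout is an assumption).
    ToyCacheDir(Path base, String uuid) {
        Path created;
        try {
            created = Files.createDirectories(base.resolve("cache_" + uuid));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        dir = created;
    }

    static Path newTempBase() {
        try {
            return Files.createTempDirectory("toy-cache-base-");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    Path dir() { return dir; }

    // Creates an empty file in the working directory, standing in for an evicted block.
    void touch(String name) {
        try {
            Files.createFile(dir.resolve(name));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Deletes all contents; deletes the directory itself only if withDir is true.
    void cleanup(boolean withDir) {
        try (Stream<Path> paths = Files.walk(dir)) {
            paths.sorted(Comparator.reverseOrder()) // children before parents
                 .filter(p -> withDir || !p.equals(dir))
                 .forEach(p -> {
                     try {
                         Files.delete(p);
                     } catch (IOException e) {
                         throw new UncheckedIOException(e);
                     }
                 });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

class ToyCacheDirDemo {
    public static void main(String[] args) {
        ToyCacheDir cache = new ToyCacheDir(ToyCacheDir.newTempBase(), "custom-uuid");
        cache.touch("block1.dat");
        cache.cleanup(false);
        System.out.println(Files.isDirectory(cache.dir())); // prints true
        cache.cleanup(true);
        System.out.println(Files.exists(cache.dir()));      // prints false
    }
}
```

Passing the UUID explicitly, as in the constructor here, lets separate processes (for example remote workers) use distinct directories without colliding.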
 
 - 
isCachingActive
public static boolean isCachingActive()

disableCaching
public static void disableCaching()

enableCaching
public static void enableCaching()
 
- 
 
-