Class RDDConverterUtilsExt
- java.lang.Object
- org.apache.sysds.runtime.instructions.spark.utils.RDDConverterUtilsExt
 
public class RDDConverterUtilsExt extends Object

NOTE: These are experimental converter utils. Once thoroughly tested, they can be moved to RDDConverterUtils.

Nested Class Summary
- static class RDDConverterUtilsExt.AddRowID
- static class RDDConverterUtilsExt.RDDConverterTypes

Constructor Summary
- RDDConverterUtilsExt()

Method Summary
- static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> addIDToDataFrame(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df, org.apache.spark.sql.SparkSession sparkSession, String nameOfCol)
  Add element indices as new column to DataFrame
- static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> coordinateMatrixToBinaryBlock(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.mllib.linalg.distributed.CoordinateMatrix input, DataCharacteristics mcIn, boolean outputEmptyBlocks)
- static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> coordinateMatrixToBinaryBlock(org.apache.spark.SparkContext sc, org.apache.spark.mllib.linalg.distributed.CoordinateMatrix input, DataCharacteristics mcIn, boolean outputEmptyBlocks)
- static void copyRowBlocks(MatrixBlock mb, int rowIndex, MatrixBlock ret, int numRowsPerBlock, int rlen, int clen)
- static void copyRowBlocks(MatrixBlock mb, int rowIndex, MatrixBlock ret, long numRowsPerBlock, long rlen, long clen)
- static void copyRowBlocks(MatrixBlock mb, long rowIndex, MatrixBlock ret, int numRowsPerBlock, int rlen, int clen)
- static void copyRowBlocks(MatrixBlock mb, long rowIndex, MatrixBlock ret, long numRowsPerBlock, long rlen, long clen)
- static void postProcessAfterCopying(MatrixBlock ret)
- static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> projectColumns(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df, ArrayList<String> columns)
- static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> stringDataFrameToVectorDataFrame(org.apache.spark.sql.SparkSession sparkSession, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> inputDF)
  Convert a dataframe of comma-separated string rows to a dataframe of ml.linalg.Vector rows.
 
Method Detail

coordinateMatrixToBinaryBlock
public static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> coordinateMatrixToBinaryBlock(org.apache.spark.api.java.JavaSparkContext sc, org.apache.spark.mllib.linalg.distributed.CoordinateMatrix input, DataCharacteristics mcIn, boolean outputEmptyBlocks)

coordinateMatrixToBinaryBlock
public static org.apache.spark.api.java.JavaPairRDD<MatrixIndexes,MatrixBlock> coordinateMatrixToBinaryBlock(org.apache.spark.SparkContext sc, org.apache.spark.mllib.linalg.distributed.CoordinateMatrix input, DataCharacteristics mcIn, boolean outputEmptyBlocks)

projectColumns
public static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> projectColumns(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df, ArrayList<String> columns)

copyRowBlocks
public static void copyRowBlocks(MatrixBlock mb, int rowIndex, MatrixBlock ret, int numRowsPerBlock, int rlen, int clen)

copyRowBlocks
public static void copyRowBlocks(MatrixBlock mb, long rowIndex, MatrixBlock ret, int numRowsPerBlock, int rlen, int clen)

copyRowBlocks
public static void copyRowBlocks(MatrixBlock mb, int rowIndex, MatrixBlock ret, long numRowsPerBlock, long rlen, long clen)

copyRowBlocks
public static void copyRowBlocks(MatrixBlock mb, long rowIndex, MatrixBlock ret, long numRowsPerBlock, long rlen, long clen)

postProcessAfterCopying
public static void postProcessAfterCopying(MatrixBlock ret)

addIDToDataFrame
public static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> addIDToDataFrame(org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> df, org.apache.spark.sql.SparkSession sparkSession, String nameOfCol)
Add element indices as new column to DataFrame
Parameters:
- df - input data frame
- sparkSession - the Spark Session
- nameOfCol - name of index column
Returns:
- new data frame
 
stringDataFrameToVectorDataFrame
public static org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> stringDataFrameToVectorDataFrame(org.apache.spark.sql.SparkSession sparkSession, org.apache.spark.sql.Dataset<org.apache.spark.sql.Row> inputDF)
Convert a dataframe of comma-separated string rows to a dataframe of ml.linalg.Vector rows.
Example input rows:
 ((1.2, 4.3, 3.4))
 (1.2, 3.4, 2.2)
 [[1.2, 34.3, 1.2, 1.25]]
 [1.2, 3.4]
Parameters:
- sparkSession - Spark Session
- inputDF - dataframe of comma-separated row strings to convert to dataframe of ml.linalg.Vector rows
Returns:
- dataframe of ml.linalg.Vector rows
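
The example input rows for stringDataFrameToVectorDataFrame can be read as comma-separated numbers wrapped in optional round or square brackets. The following is a minimal sketch of that row format in plain Java, with no Spark dependency. The class RowStringSketch and the method parseRow are hypothetical names introduced here for illustration; they are not part of SystemDS, and the actual parsing inside stringDataFrameToVectorDataFrame may differ.

```java
import java.util.Arrays;

// Hypothetical helper, NOT part of SystemDS: it only illustrates the
// row-string format shown in the example input rows.
class RowStringSketch {

    // Remove every round and square bracket, split on commas,
    // and parse each remaining token as a double.
    static double[] parseRow(String row) {
        String stripped = row.replaceAll("[\\[\\]()]", "").trim();
        if (stripped.isEmpty()) {
            return new double[0];
        }
        String[] tokens = stripped.split(",");
        double[] values = new double[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            values[i] = Double.parseDouble(tokens[i].trim());
        }
        return values;
    }

    public static void main(String[] args) {
        String[] rows = {
            "((1.2, 4.3, 3.4))",
            "(1.2, 3.4, 2.2)",
            "[[1.2, 34.3, 1.2, 1.25]]",
            "[1.2, 3.4]"
        };
        // Print the parsed values of each example row.
        for (String row : rows) {
            System.out.println(Arrays.toString(parseRow(row)));
        }
    }
}
```

In a Spark job, an array like the one returned by parseRow could presumably be wrapped with org.apache.spark.ml.linalg.Vectors.dense to obtain an ml.linalg.Vector, which is the row type this method is documented to return.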
 
 