Interface SampleEstimatorFactory
 public interface SampleEstimatorFactory
Nested Class Summary
- static class SampleEstimatorFactory.EstimationType
Field Summary
- static org.apache.commons.logging.Log LOG
Method Summary
- static int distinctCount(int[] frequencies, int nRows, int sampleSize, SampleEstimatorFactory.EstimationType type): Estimates the number of distinct values based on frequencies.
- static int distinctCount(int[] frequencies, int nRows, int sampleSize, SampleEstimatorFactory.EstimationType type, HashMap<Integer,Double> solveCache): Estimates the number of distinct values based on frequencies.
 
Method Detail
distinctCount
static int distinctCount(int[] frequencies, int nRows, int sampleSize, SampleEstimatorFactory.EstimationType type)
Estimates the number of distinct values based on frequencies.
- Parameters:
- frequencies - A list of frequencies of unique values. NOTE: all values should be greater than zero.
- nRows - The total number of rows to consider. NOTE: should always be greater than or equal to sum(frequencies).
- sampleSize - The size of the sample. NOTE: this should ideally be scaled to match sum(frequencies) and should always be less than or equal to nRows.
- type - The type of estimator to use.
- Returns:
- An estimated number of unique values
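
The parameter notes above describe constraints on the inputs rather than how to build them. The following is a minimal sketch, assuming the frequencies are obtained by counting how often each distinct value occurs in a sample. `DistinctCountInputs`, `frequenciesOf`, and `satisfiesContract` are hypothetical helper names for illustration and are not part of this interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of SampleEstimatorFactory:
// prepares a frequencies array and checks the documented constraints.
final class DistinctCountInputs {

    // Count how often each distinct value occurs in the sample.
    // Every returned entry is at least 1, so all frequencies are greater than zero.
    static int[] frequenciesOf(int[] sample) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (int v : sample)
            counts.merge(v, 1, Integer::sum);
        int[] frequencies = new int[counts.size()];
        int i = 0;
        for (int c : counts.values())
            frequencies[i++] = c;
        return frequencies;
    }

    // Check the constraints from the parameter notes:
    // frequencies > 0, nRows >= sum(frequencies), sampleSize <= nRows.
    static boolean satisfiesContract(int[] frequencies, int nRows, int sampleSize) {
        long sum = 0;
        for (int f : frequencies) {
            if (f <= 0)
                return false;
            sum += f;
        }
        return nRows >= sum && sampleSize <= nRows;
    }

    public static void main(String[] args) {
        int[] sample = {7, 7, 3, 9, 3, 7};
        int[] frequencies = frequenciesOf(sample);
        // Three distinct values were observed in a sample of six rows.
        System.out.println(frequencies.length);
        System.out.println(satisfiesContract(frequencies, 100, sample.length));
    }
}
```

An array that passes this check could then be handed to distinctCount together with nRows, sampleSize, and the chosen EstimationType.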
 
distinctCount
static int distinctCount(int[] frequencies, int nRows, int sampleSize, SampleEstimatorFactory.EstimationType type, HashMap<Integer,Double> solveCache)
Estimates the number of distinct values based on frequencies.
- Parameters:
- frequencies - A list of frequencies of unique values. NOTE: all values should be greater than zero.
- nRows - The total number of rows to consider. NOTE: should always be greater than or equal to sum(frequencies).
- sampleSize - The size of the sample. NOTE: this should ideally be scaled to match sum(frequencies) and should always be less than or equal to nRows.
- type - The type of estimator to use.
- solveCache - A solve cache to avoid repeated calculations.
- Returns:
- An estimated number of unique values
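
The solveCache parameter is described only as a way to avoid repeated calculations; what it is keyed on and what values it stores are not stated here. As a rough illustration of the general idea, the sketch below memoizes a placeholder computation in a HashMap<Integer,Double>. `SolveCacheSketch`, `expensiveSolve`, and `cachedSolve` are hypothetical names, and the square root merely stands in for whatever the estimator actually solves.

```java
import java.util.HashMap;

// Hypothetical illustration of a solve cache; not the library's implementation.
final class SolveCacheSketch {

    // Counts how often the expensive path actually runs.
    static int solves = 0;

    // Placeholder for an expensive numeric solve (assumption: keyed by an int).
    static double expensiveSolve(int key) {
        solves++;
        return Math.sqrt(key);
    }

    // Return the cached result if present, otherwise compute and store it.
    static double cachedSolve(int key, HashMap<Integer, Double> solveCache) {
        return solveCache.computeIfAbsent(key, SolveCacheSketch::expensiveSolve);
    }

    public static void main(String[] args) {
        HashMap<Integer, Double> solveCache = new HashMap<>();
        cachedSolve(16, solveCache);
        cachedSolve(16, solveCache); // served from the cache, no second solve
        System.out.println(solves);
    }
}
```

Under this reading, sharing one cache instance across several distinctCount calls would let later calls reuse earlier results.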
 
 