public class ChiSquareTest extends Object
This implementation handles both known and unknown distributions.
Two-sample tests can be used when the distribution is not known a priori but is provided by one sample, or when the hypothesis under test is that the two samples come from the same underlying distribution.
| Constructor and Description | 
|---|
| `ChiSquareTest()` Constructs a ChiSquareTest. |
| Modifier and Type | Method and Description |
|---|---|
| double | `chiSquare(double[] expected, long[] observed)` Computes the Chi-Square statistic comparing observed and expected frequency counts. |
| double | `chiSquare(long[][] counts)` Computes the Chi-Square statistic associated with a chi-square test of independence based on the input counts array, viewed as a two-way table. |
| double | `chiSquareDataSetsComparison(long[] observed1, long[] observed2)` Computes a Chi-Square two sample test statistic comparing bin frequency counts in observed1 and observed2. |
| double | `chiSquareTest(double[] expected, long[] observed)` Returns the observed significance level, or p-value, associated with a Chi-square goodness of fit test comparing the observed frequency counts to those in the expected array. |
| boolean | `chiSquareTest(double[] expected, long[] observed, double alpha)` Performs a Chi-square goodness of fit test evaluating the null hypothesis that the observed counts conform to the frequency distribution described by the expected counts, with significance level alpha. |
| double | `chiSquareTest(long[][] counts)` Returns the observed significance level, or p-value, associated with a chi-square test of independence based on the input counts array, viewed as a two-way table. |
| boolean | `chiSquareTest(long[][] counts, double alpha)` Performs a chi-square test of independence evaluating the null hypothesis that the classifications represented by the counts in the columns of the input 2-way table are independent of the rows, with significance level alpha. |
| double | `chiSquareTestDataSetsComparison(long[] observed1, long[] observed2)` Returns the observed significance level, or p-value, associated with a Chi-Square two sample test comparing bin frequency counts in observed1 and observed2. |
| boolean | `chiSquareTestDataSetsComparison(long[] observed1, long[] observed2, double alpha)` Performs a Chi-Square two sample test comparing two binned data sets. |
public double chiSquare(double[] expected,
               long[] observed)
                 throws NotPositiveException,
                        NotStrictlyPositiveException,
                        DimensionMismatchException
Computes the Chi-Square statistic comparing observed and expected
 frequency counts.
 This statistic can be used to perform a Chi-Square test evaluating the null hypothesis that the observed counts follow the expected distribution.
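Preconditions:
 Expected counts must all be strictly positive.
 Observed counts must all be non-negative.
 The observed and expected arrays must have the same length, and their common length must be at least 2.
 
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.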
Note: This implementation rescales the
 expected array if necessary to ensure that the sums of the
 expected and observed counts are equal.
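The computation described above, including the rescaling of the expected counts, can be sketched in plain Java. This is an illustrative sketch and not the library's source: the class name `GoodnessOfFitSketch` is hypothetical, it always rescales rather than only "if necessary", and it throws a plain IllegalArgumentException instead of the specific exception types listed for this method.

```java
// Hypothetical helper class, not part of Apache Commons Math.
final class GoodnessOfFitSketch {
    private GoodnessOfFitSketch() {}

    /** Sum over bins of (observed - scaledExpected)^2 / scaledExpected. */
    static double chiSquare(double[] expected, long[] observed) {
        if (expected.length < 2 || expected.length != observed.length) {
            throw new IllegalArgumentException("arrays must share a length of at least 2");
        }
        double sumExpected = 0.0;
        double sumObserved = 0.0;
        for (int i = 0; i < observed.length; i++) {
            if (expected[i] <= 0.0) {
                throw new IllegalArgumentException("expected counts must be strictly positive");
            }
            if (observed[i] < 0) {
                throw new IllegalArgumentException("observed counts must be non-negative");
            }
            sumExpected += expected[i];
            sumObserved += observed[i];
        }
        // Rescale the expected counts so that both arrays have the same total.
        double ratio = sumObserved / sumExpected;
        double statistic = 0.0;
        for (int i = 0; i < observed.length; i++) {
            double scaledExpected = ratio * expected[i];
            double deviation = observed[i] - scaledExpected;
            statistic += deviation * deviation / scaledExpected;
        }
        return statistic;
    }
}
```

Because of the rescaling, passing expected proportions such as {1, 1, 1} gives the same statistic as passing expected counts such as {10, 10, 10} for the same observed data.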
Parameters:
 observed - array of observed frequency counts
 expected - array of expected frequency counts
Throws:
 NotPositiveException - if observed has negative entries
 NotStrictlyPositiveException - if expected has entries that are not strictly positive
 DimensionMismatchException - if the length of the arrays is less than 2

public double chiSquareTest(double[] expected,
                   long[] observed)
                     throws NotPositiveException,
                            NotStrictlyPositiveException,
                            DimensionMismatchException,
                            MaxCountExceededException
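Returns the observed significance level, or p-value, associated with a
 Chi-square goodness of fit test comparing the observed
 frequency counts to those in the expected array.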
 The number returned is the smallest significance level at which one can reject the null hypothesis that the observed counts conform to the frequency distribution described by the expected counts.
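Preconditions:
 Expected counts must all be strictly positive.
 Observed counts must all be non-negative.
 The observed and expected arrays must have the same length, and their common length must be at least 2.
 
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.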
Note: This implementation rescales the
 expected array if necessary to ensure that the sums of the
 expected and observed counts are equal.
Parameters:
 observed - array of observed frequency counts
 expected - array of expected frequency counts
Throws:
 NotPositiveException - if observed has negative entries
 NotStrictlyPositiveException - if expected has entries that are not strictly positive
 DimensionMismatchException - if the length of the arrays is less than 2
 MaxCountExceededException - if an error occurs computing the p-value

public boolean chiSquareTest(double[] expected,
                    long[] observed,
                    double alpha)
                      throws NotPositiveException,
                             NotStrictlyPositiveException,
                             DimensionMismatchException,
                             OutOfRangeException,
                             MaxCountExceededException
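Performs a Chi-square goodness of fit test evaluating the null hypothesis that the
 observed counts conform to the frequency distribution described by the expected
 counts, with significance level alpha.  Returns true iff the null
 hypothesis can be rejected with 100 * (1 - alpha) percent confidence.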
 
 Example:
 To test the hypothesis that observed follows
 expected at the 99% level, use 
 chiSquareTest(expected, observed, 0.01) 
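Preconditions:
 Expected counts must all be strictly positive.
 Observed counts must all be non-negative.
 The observed and expected arrays must have the same length, and their common length must be at least 2.
 0 < alpha <= 0.5
 
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.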
Note: This implementation rescales the
 expected array if necessary to ensure that the sums of the
 expected and observed counts are equal.
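The relationship between the p-value and the boolean result can be illustrated with a small plain-Java sketch. It assumes that the test rejects when the p-value is strictly below alpha and that alpha must lie in (0, 0.5], as the OutOfRangeException description for this method states. The class and method names are hypothetical and not part of the library.

```java
// Hypothetical helper class, not part of Apache Commons Math.
final class SignificanceDecisionSketch {
    private SignificanceDecisionSketch() {}

    /**
     * Returns true when the null hypothesis is rejected at significance level alpha.
     * The strict comparison (pValue < alpha) is an assumption of this sketch.
     */
    static boolean rejectNull(double pValue, double alpha) {
        // Written as a negated condition so that NaN is also rejected.
        if (!(alpha > 0.0 && alpha <= 0.5)) {
            throw new IllegalArgumentException("alpha must be in the range (0, 0.5]");
        }
        return pValue < alpha;
    }
}
```

With alpha = 0.01, a p-value of 0.004 leads to rejection at the 99% level, while a p-value of 0.2 does not.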
Parameters:
 observed - array of observed frequency counts
 expected - array of expected frequency counts
 alpha - significance level of the test
Throws:
 NotPositiveException - if observed has negative entries
 NotStrictlyPositiveException - if expected has entries that are not strictly positive
 DimensionMismatchException - if the length of the arrays is less than 2
 OutOfRangeException - if alpha is not in the range (0, 0.5]
 MaxCountExceededException - if an error occurs computing the p-value

public double chiSquare(long[][] counts)
                 throws NullArgumentException,
                        NotPositiveException,
                        DimensionMismatchException
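Computes the Chi-Square statistic associated with a
 chi-square test of independence based on the input counts
 array, viewed as a two-way table.
 
 The rows of the 2-way table are
 counts[0], ... , counts[counts.length - 1]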
Preconditions:
counts must have at
  least 2 columns and at least 2 rows.
 
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.
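For orientation, the textbook statistic for a test of independence on a two-way table can be sketched in plain Java as follows. This is a sketch under assumptions, not the library's source: it uses the standard formula in which the expected count of a cell is rowSum * columnSum / total, the class name `IndependenceSketch` is hypothetical, and it throws a plain IllegalArgumentException for every violated precondition.

```java
// Hypothetical helper class, not part of Apache Commons Math.
final class IndependenceSketch {
    private IndependenceSketch() {}

    /** Sum over cells of (count - expected)^2 / expected, expected = rowSum * colSum / total. */
    static double chiSquare(long[][] counts) {
        int rows = counts.length;
        if (rows < 2 || counts[0].length < 2) {
            throw new IllegalArgumentException("table needs at least 2 rows and 2 columns");
        }
        int cols = counts[0].length;
        double[] rowSum = new double[rows];
        double[] colSum = new double[cols];
        double total = 0.0;
        for (int r = 0; r < rows; r++) {
            if (counts[r].length != cols) {
                throw new IllegalArgumentException("table must be rectangular");
            }
            for (int c = 0; c < cols; c++) {
                if (counts[r][c] < 0) {
                    throw new IllegalArgumentException("counts must be non-negative");
                }
                rowSum[r] += counts[r][c];
                colSum[c] += counts[r][c];
                total += counts[r][c];
            }
        }
        double statistic = 0.0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                // Expected cell count if row and column classifications were independent.
                double expected = rowSum[r] * colSum[c] / total;
                double deviation = counts[r][c] - expected;
                statistic += deviation * deviation / expected;
            }
        }
        return statistic;
    }
}
```

A table whose rows are proportional, such as {{10, 20}, {20, 40}}, gives a statistic of 0, since every observed cell equals its expected count. The sketch does not guard against an all-zero row or column, which would make an expected count zero.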
Parameters:
 counts - array representation of 2-way table
Throws:
 NullArgumentException - if the array is null
 DimensionMismatchException - if the array is not rectangular
 NotPositiveException - if counts has negative entries

public double chiSquareTest(long[][] counts)
                     throws NullArgumentException,
                            DimensionMismatchException,
                            NotPositiveException,
                            MaxCountExceededException
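Returns the observed significance level, or
 p-value, associated with a
 chi-square test of independence based on the input counts
 array, viewed as a two-way table.
 
 The rows of the 2-way table are
 counts[0], ... , counts[counts.length - 1]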
Preconditions:
counts must have at least 2
     columns and at least 2 rows.
 
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.
Parameters:
 counts - array representation of 2-way table
Throws:
 NullArgumentException - if the array is null
 DimensionMismatchException - if the array is not rectangular
 NotPositiveException - if counts has negative entries
 MaxCountExceededException - if an error occurs computing the p-value

public boolean chiSquareTest(long[][] counts,
                    double alpha)
                      throws NullArgumentException,
                             DimensionMismatchException,
                             NotPositiveException,
                             OutOfRangeException,
                             MaxCountExceededException
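Performs a chi-square test of independence evaluating the null hypothesis that the
 classifications represented by the counts in the columns of the input 2-way table
 are independent of the rows, with significance level alpha.
 Returns true iff the null hypothesis can be rejected with 100 * (1 - alpha) percent
 confidence.
 
 The rows of the 2-way table are
 counts[0], ... , counts[counts.length - 1]
 Example:
 To test the null hypothesis that the counts in
 counts[0], ... , counts[counts.length - 1]
 all correspond to the same underlying probability distribution at the 99% level, use
 chiSquareTest(counts, 0.01)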
Preconditions:
counts must have at least 2 columns and
     at least 2 rows.
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.
Parameters:
 counts - array representation of 2-way table
 alpha - significance level of the test
Throws:
 NullArgumentException - if the array is null
 DimensionMismatchException - if the array is not rectangular
 NotPositiveException - if counts has any negative entries
 OutOfRangeException - if alpha is not in the range (0, 0.5]
 MaxCountExceededException - if an error occurs computing the p-value

public double chiSquareDataSetsComparison(long[] observed1,
                                 long[] observed2)
                                   throws DimensionMismatchException,
                                          NotPositiveException,
                                          ZeroException
Computes a
 
 Chi-Square two sample test statistic comparing bin frequency counts
 in observed1 and observed2.  The
 sums of frequency counts in the two samples are not required to be the
 same.  The formula used to compute the test statistic is
 ∑[(K * observed1[i] - observed2[i]/K)² / (observed1[i] + observed2[i])]
  where
 K = √[∑(observed2) / ∑(observed1)]
 
 This statistic can be used to perform a Chi-Square test evaluating the null hypothesis that both observed counts follow the same distribution.
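The formula above translates directly into plain Java. The following is a minimal sketch, not the library's source: the class name `TwoSampleSketch` is hypothetical, and every violated precondition is reported with a plain IllegalArgumentException rather than the specific exception types listed for this method.

```java
// Hypothetical helper class, not part of Apache Commons Math.
final class TwoSampleSketch {
    private TwoSampleSketch() {}

    /** Sum over bins of (K * o1 - o2 / K)^2 / (o1 + o2), with K = sqrt(sum(o2) / sum(o1)). */
    static double chiSquareDataSetsComparison(long[] observed1, long[] observed2) {
        if (observed1.length < 2 || observed1.length != observed2.length) {
            throw new IllegalArgumentException("arrays must share a length of at least 2");
        }
        long sum1 = 0;
        long sum2 = 0;
        for (int i = 0; i < observed1.length; i++) {
            if (observed1[i] < 0 || observed2[i] < 0) {
                throw new IllegalArgumentException("counts must be non-negative");
            }
            sum1 += observed1[i];
            sum2 += observed2[i];
        }
        if (sum1 == 0 || sum2 == 0) {
            throw new IllegalArgumentException("each sample needs at least one non-zero count");
        }
        // K balances the two samples when their totals differ; K = 1 for equal totals.
        double k = Math.sqrt((double) sum2 / (double) sum1);
        double statistic = 0.0;
        for (int i = 0; i < observed1.length; i++) {
            long binTotal = observed1[i] + observed2[i];
            if (binTotal == 0) {
                throw new IllegalArgumentException("bin " + i + " is zero in both samples");
            }
            double deviation = k * observed1[i] - observed2[i] / k;
            statistic += deviation * deviation / binTotal;
        }
        return statistic;
    }
}
```

For observed1 = {10, 10} and observed2 = {10, 30} the sketch returns 3.75, which matches the independence statistic of the 2-way table formed by stacking the two arrays as rows.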
Preconditions:
observed1 and observed2 must have
 the same length and their common length must be at least 2.
 
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.
Parameters:
 observed1 - array of observed frequency counts of the first data set
 observed2 - array of observed frequency counts of the second data set
Throws:
 DimensionMismatchException - if the lengths of the arrays do not match
 NotPositiveException - if any entries in observed1 or observed2 are negative
 ZeroException - if all counts of observed1 or of observed2 are zero, or if the count at some index is zero for both arrays

public double chiSquareTestDataSetsComparison(long[] observed1,
                                     long[] observed2)
                                       throws DimensionMismatchException,
                                              NotPositiveException,
                                              ZeroException,
                                              MaxCountExceededException
Returns the observed significance level, or 
 p-value, associated with a Chi-Square two sample test comparing
 bin frequency counts in observed1 and
 observed2.
 
The number returned is the smallest significance level at which one can reject the null hypothesis that the observed counts conform to the same distribution.
See chiSquareDataSetsComparison(long[], long[]) for details
 on the formula used to compute the test statistic. The number of degrees
 of freedom used to perform the test is one less than the common length
 of the input observed count arrays.
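The p-value is the upper-tail probability of a chi-square distribution with that many degrees of freedom, evaluated at the test statistic. As an illustration only, the sketch below uses the closed form that exists when the number of degrees of freedom is even, that is, when the arrays have an odd common length. The class and method names are hypothetical, and a general implementation would need a chi-square distribution for arbitrary degrees of freedom, which this sketch does not attempt.

```java
// Hypothetical helper class, not part of Apache Commons Math.
final class EvenDfUpperTailSketch {
    private EvenDfUpperTailSketch() {}

    /**
     * P(X > x) for a chi-square variable X with an even number of degrees of freedom,
     * using exp(-x/2) * sum_{k=0}^{df/2 - 1} (x/2)^k / k!.
     */
    static double upperTail(double x, int df) {
        if (df <= 0 || df % 2 != 0) {
            throw new IllegalArgumentException("this sketch only supports positive even df");
        }
        if (x < 0.0) {
            throw new IllegalArgumentException("statistic must be non-negative");
        }
        double half = x / 2.0;
        double term = 1.0; // (x/2)^k / k! for k = 0
        double sum = 1.0;
        for (int k = 1; k < df / 2; k++) {
            term *= half / k;
            sum += term;
        }
        return Math.exp(-half) * sum;
    }
}
```

For two arrays of length 3, df is 2 and the p-value reduces to exp(-statistic / 2).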
 
Preconditions:
 observed1 and observed2 must
 have the same length and
 their common length must be at least 2.
 
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.
Parameters:
 observed1 - array of observed frequency counts of the first data set
 observed2 - array of observed frequency counts of the second data set
Throws:
 DimensionMismatchException - if the lengths of the arrays do not match
 NotPositiveException - if any entries in observed1 or observed2 are negative
 ZeroException - if all counts of observed1 or of observed2 are zero, or if the count at the same index is zero for both arrays
 MaxCountExceededException - if an error occurs computing the p-value

public boolean chiSquareTestDataSetsComparison(long[] observed1,
                                      long[] observed2,
                                      double alpha)
                                        throws DimensionMismatchException,
                                               NotPositiveException,
                                               ZeroException,
                                               OutOfRangeException,
                                               MaxCountExceededException
Performs a Chi-Square two sample test comparing two binned data
 sets. The test evaluates the null hypothesis that the two lists of
 observed counts conform to the same frequency distribution, with
 significance level alpha.  Returns true iff the null
 hypothesis can be rejected with 100 * (1 - alpha) percent confidence.
 
See chiSquareDataSetsComparison(long[], long[]) for
 details on the formula used to compute the Chi-Square statistic used
 in the test. The number of degrees of freedom used to perform the test is
 one less than the common length of the input observed count arrays.
 
Preconditions:
 observed1 and observed2 must
 have the same length and their common length must be at least 2.
 0 < alpha <= 0.5
 
 If any of the preconditions are not met, an
 IllegalArgumentException is thrown.
Parameters:
 observed1 - array of observed frequency counts of the first data set
 observed2 - array of observed frequency counts of the second data set
 alpha - significance level of the test
Throws:
 DimensionMismatchException - if the lengths of the arrays do not match
 NotPositiveException - if any entries in observed1 or observed2 are negative
 ZeroException - if all counts of observed1 or of observed2 are zero, or if the count at the same index is zero for both arrays
 OutOfRangeException - if alpha is not in the range (0, 0.5]
 MaxCountExceededException - if an error occurs performing the test

Copyright © 2003–2016 The Apache Software Foundation. All rights reserved.