stats/sample
src/stats/sample.tur
defn
bootstrap
(bootstrap [col :int stat-fn :int n-reps :int conf-level :float method :int rng :int] :int)
One-sample bootstrap confidence interval for a statistic of a single column.
Parameters
| col | :int | float64 column |
| stat-fn | :int | function handle (fn [col :int] :float) |
| n-reps | :int | number of bootstrap replicates |
| conf-level | :float | confidence level, e.g. 0.95 |
| method | :int | 0 = percentile, 1 = BCa |
| rng | :int | rng handle from stats/rng |
Returns
:int test-result with estimate = observed statistic and ci-low/ci-high set to the interval bounds.
Since: ST8
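
For orientation, the percentile method (method = 0) can be sketched in Python. This is a minimal illustration of the general technique, not the implementation behind bootstrap: the function name, the nearest-rank quantile rule, and the use of Python's random.Random in place of a stats/rng handle are all assumptions.

```python
import random
import statistics

def percentile_bootstrap_ci(values, stat_fn, n_reps, conf_level, rng):
    """Hypothetical helper: returns (estimate, ci_low, ci_high)."""
    n = len(values)
    estimate = stat_fn(values)  # statistic on the observed sample
    # Resample n values with replacement, n_reps times, and sort the replicates.
    reps = sorted(
        stat_fn([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_reps)
    )
    alpha = 1.0 - conf_level
    # Nearest-rank quantiles (assumed rule), clamped to valid indices.
    lo_idx = max(0, min(n_reps - 1, int((alpha / 2) * n_reps)))
    hi_idx = max(lo_idx, min(n_reps - 1, int((1 - alpha / 2) * n_reps) - 1))
    return estimate, reps[lo_idx], reps[hi_idx]

rng = random.Random(42)
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7]
est, lo, hi = percentile_bootstrap_ci(data, statistics.fmean, 2000, 0.95, rng)
print(est, lo, hi)
```

BCa (method = 1) additionally adjusts the two quantile levels for bias and skewness and is not shown here.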
defn
bootstrap-2samp
(bootstrap-2samp [a :int b :int stat-fn :int n-reps :int conf-level :float method :int rng :int] :int)
Two-sample bootstrap confidence interval for a statistic comparing two columns (mean difference or SD ratio, selected by stat-fn).
Parameters
| a | :int | float64 column |
| b | :int | float64 column |
| stat-fn | :int | tag (0 = mean-diff, 1 = sd-ratio) |
| n-reps | :int | number of bootstrap replicates |
| conf-level | :float | confidence level, e.g. 0.95 |
| method | :int | 0 = percentile, 1 = BCa (BCa uses a jackknife approximation) |
| rng | :int | rng handle |
Returns
:int test-result
Since: ST8
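
A minimal Python sketch of the two-sample case for the mean-diff statistic with the percentile method. It assumes each group is resampled independently with replacement, which is the common convention but is not confirmed by this reference; the function name and the quantile rule are hypothetical.

```python
import random
import statistics

def bootstrap_mean_diff_ci(a, b, n_reps, conf_level, rng):
    """Hypothetical helper: returns (observed, ci_low, ci_high) for mean(a) - mean(b)."""
    observed = statistics.fmean(a) - statistics.fmean(b)
    reps = []
    for _ in range(n_reps):
        # Assumption: groups are resampled independently, each at its own size.
        ra = rng.choices(a, k=len(a))
        rb = rng.choices(b, k=len(b))
        reps.append(statistics.fmean(ra) - statistics.fmean(rb))
    reps.sort()
    alpha = 1.0 - conf_level
    # Nearest-rank quantiles (assumed rule), clamped to valid indices.
    lo_idx = max(0, min(n_reps - 1, int((alpha / 2) * n_reps)))
    hi_idx = max(lo_idx, min(n_reps - 1, int((1 - alpha / 2) * n_reps) - 1))
    return observed, reps[lo_idx], reps[hi_idx]

rng = random.Random(7)
group_a = [5.1, 6.3, 5.8, 7.0, 6.1, 5.5]
group_b = [4.0, 4.8, 3.9, 5.2, 4.4, 4.1]
observed, lo, hi = bootstrap_mean_diff_ci(group_a, group_b, 2000, 0.95, rng)
print(observed, lo, hi)
```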
defn
permutation-test
(permutation-test [a :int b :int stat-fn :int n-reps :int alt-tag :int rng :int] :int)
Non-parametric two-sample permutation test.
Parameters
| a | :int | float64 column (group 1) |
| b | :int | float64 column (group 2) |
| stat-fn | :int | tag (0 = mean-diff, 1 = abs-mean-diff) |
| n-reps | :int | number of permutations |
| alt-tag | :int | 0 = two.sided, 1 = less, 2 = greater |
| rng | :int | rng handle |
Returns
:int test-result with statistic = observed statistic and p-value = permutation p-value.
Since: ST8
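
The procedure can be sketched in Python for the mean-diff statistic. This is an illustration under stated assumptions, not the implementation of permutation-test: the function name is hypothetical, alt_tag mirrors the alt-tag values above, and the (hits + 1) / (n_reps + 1) p-value is one common convention that this reference does not confirm.

```python
import random
import statistics

def permutation_test_mean_diff(a, b, n_reps, alt_tag, rng):
    """Hypothetical helper: returns (observed, p_value)."""
    observed = statistics.fmean(a) - statistics.fmean(b)
    pooled = list(a) + list(b)
    n_a = len(a)
    hits = 0
    for _ in range(n_reps):
        # Randomly reassign group labels by shuffling the pooled values.
        rng.shuffle(pooled)
        stat = statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:])
        if alt_tag == 0:      # two-sided
            extreme = abs(stat) >= abs(observed)
        elif alt_tag == 1:    # less
            extreme = stat <= observed
        else:                 # greater
            extreme = stat >= observed
        if extreme:
            hits += 1
    # Assumed convention: count the observed arrangement as one permutation.
    p_value = (hits + 1) / (n_reps + 1)
    return observed, p_value

rng = random.Random(123)
group_a = [10.0, 11.0, 12.0, 13.0, 14.0]
group_b = [1.0, 2.0, 3.0, 4.0, 5.0]
observed, p_value = permutation_test_mean_diff(group_a, group_b, 999, 0, rng)
print(observed, p_value)
```

With the + 1 correction the p-value can never be exactly zero, so the smallest reportable value is 1 / (n_reps + 1).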
defn
train-test-split
(train-test-split [f :int test-frac :float stratify-col :int rng :int] :int)
Split a frame into train and test subsets, optionally stratified on a column.
Parameters
| f | :int | frame handle |
| test-frac | :float | fraction of rows in the test set (e.g. 0.2) |
| stratify-col | :int | column handle to stratify on, or 0 for none |
| rng | :int | rng handle |
Returns
:int result<(cons train-frame test-frame)>
Since: ST8
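
A minimal Python sketch of the unstratified case (stratify-col = 0), working on row indices instead of a frame handle. The function name and the rounding of the test-set size are assumptions; the reference does not say how fractional row counts are resolved.

```python
import random

def train_test_split_indices(n_rows, test_frac, rng):
    """Hypothetical helper: returns (train_indices, test_indices)."""
    indices = list(range(n_rows))
    rng.shuffle(indices)  # random assignment of rows to subsets
    # Assumption: the test size is rounded to the nearest whole row.
    n_test = round(n_rows * test_frac)
    test = sorted(indices[:n_test])
    train = sorted(indices[n_test:])
    return train, test

rng = random.Random(0)
train, test = train_test_split_indices(10, 0.2, rng)
print(train, test)
```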
defn
cv-folds
(cv-folds [n :int k :int shuffle? :int rng :int] :int)
K-fold cross-validation index sets.
Parameters
| n | :int | total number of observations |
| k | :int | number of folds |
| shuffle? | :int | 1 to shuffle indices before splitting |
| rng | :int | rng handle (ignored if shuffle? = 0) |
Returns
:int cons list of (cons train-indices test-indices) pairs. Each element is a cons cell; car = cons list of train row indices, cdr = cons list of test row indices.
Since: ST8
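
The shape of the result can be illustrated with a short Python sketch that uses tuples of lists in place of cons cells. The function name and the handling of a remainder (the first n mod k folds receive one extra row) are assumptions; this reference does not state how uneven splits are distributed.

```python
import random

def kfold_indices(n, k, shuffle, rng=None):
    """Hypothetical helper: returns a list of (train_indices, test_indices) pairs."""
    indices = list(range(n))
    if shuffle:
        rng.shuffle(indices)  # rng is only needed when shuffling
    base, extra = divmod(n, k)
    folds = []
    start = 0
    for i in range(k):
        # Assumption: the first `extra` folds get one additional row.
        size = base + (1 if i < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds

folds = kfold_indices(10, 3, shuffle=False)
for train, test in folds:
    print(train, test)

shuffled = kfold_indices(10, 3, shuffle=True, rng=random.Random(1))
```

Every row index appears in exactly one test set, and each train set is the complement of its test set.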
defn
cv-folds-stratified
(cv-folds-stratified [col :int k :int shuffle? :int rng :int] :int)
Stratified K-fold cross-validation index sets, with each fold approximately preserving the class proportions of col.
Parameters
| col | :int | float64 or int column of class labels |
| k | :int | number of folds |
| shuffle? | :int | 1 to shuffle within each stratum before assigning |
| rng | :int | rng handle |
Returns
:int same structure as cv-folds.
Since: ST8
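
One way to build stratified folds is to deal the rows of each class round-robin across the k folds. The Python sketch below shows that approach; it is an assumption about how such a routine could work, not a description of cv-folds-stratified itself, and the function name is hypothetical.

```python
import random
from collections import defaultdict

def stratified_kfold_indices(labels, k, shuffle, rng=None):
    """Hypothetical helper: returns a list of (train_indices, test_indices) pairs."""
    by_class = defaultdict(list)
    for row, label in enumerate(labels):
        by_class[label].append(row)
    test_sets = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        members = by_class[label]
        if shuffle:
            rng.shuffle(members)  # shuffle within the stratum only
        # Deal rows round-robin; the running offset keeps fold sizes balanced.
        for j, row in enumerate(members):
            test_sets[(offset + j) % k].append(row)
        offset += len(members)
    folds = []
    for test in test_sets:
        held_out = set(test)
        train = [row for row in range(len(labels)) if row not in held_out]
        folds.append((train, sorted(test)))
    return folds

labels = [0, 0, 0, 0, 1, 1, 1, 1]
folds = stratified_kfold_indices(labels, 2, shuffle=True, rng=random.Random(5))
for train, test in folds:
    print(train, test)
```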