stats::sample -- the domain of statistical samples
Introduction
A sample represents a collection of statistical data, organized as a matrix. Usually, each row refers to an individual of the population described by the sample, and each column represents an attribute.
stats::sample([[a11, ..., a1n], ..., [am1, ..., amn]]) creates a sample with m rows and n columns, aij being the entry in the i-th row and j-th column.
stats::sample([a11, ..., am1]) creates a sample with m rows and one column.
Creating Elements
stats::sample([[a11, a12, ...], [a21, a22, ...], ...])
stats::sample([a11, a21, ...])
Parameters
a11, a12, ... - arithmetical expressions or strings.
Categories
Cat::Set
Details
All rows [ai1, ..., ain] must contain the same number of entries.
Entries of type DOM_COMPLEX, DOM_EXPR, DOM_FLOAT, DOM_IDENT, DOM_INT, or DOM_RAT are regarded as ``data'' and are stored in a sample as on input. All other types of input parameters are converted to strings (DOM_STRING).
If one element in a column is a string or is converted to a string, then all elements of that column are converted to strings. This produces two kinds of columns: data columns and string columns.
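The conversion rule can be checked on a small input. The following is a minimal sketch, assuming a hypothetical sample whose first column holds Boolean values, a type that is not among the data types listed above. If the rule applies as described, the first entry should be reported as DOM_STRING and the second as DOM_INT; the output is not reproduced here.

```mupad
// Hypothetical sample: the first column holds Boolean values (DOM_BOOL),
// which is not one of the listed data types
s := stats::sample([[TRUE, 1], [FALSE, 2]]):

// Inspect the domain types of the two entries in the first row
domtype(s[1, 1]), domtype(s[1, 2]);

delete s:
```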
equal(dom s1, dom s2)
Tests whether s1 and s2 are equal. Returns TRUE or FALSE, respectively.
convert(list x)
Converts x to a sample. Returns FAIL, if this is not possible.
convert_to(dom s, type T)
Converts s to the type T. Only T = DOM_LIST is implemented; a list of all rows of s is returned as a list of lists. All other target types T yield FAIL.
expr(dom s)
Converts s to a list of lists. All entries are converted to expressions.
col2list(dom s, positive integer or range c, ...)
Returns the c-th column of the sample s as a list. It is possible to specify more than one column index or range of column indices.
append(dom s, list row)
Appends row as a row to the sample s. The length of the row has to coincide with the number of columns of the sample s.
_concat(dom s, dom or list s1, ...)
Concatenates s with the samples or lists s1, ...; a list is regarded as a sample with one row (see Example 4).
delCol(dom s, positive integer or range c, ...)
Deletes the column(s) of s specified by the argument c. NIL is returned, if all columns of s are deleted. It is possible to specify more than one column index or range of column indices.
delRow(dom s, positive integer or range r, ...)
Deletes the row(s) of s specified by the argument r. NIL is returned, if all rows of s are deleted. It is possible to specify more than one row index or range of row indices.
has(dom s, list or set or expression e)
Tests whether e is among the entries of s. Returns TRUE or FALSE, respectively. If e is a list or a set, then this method tests whether at least one of its elements is among the entries of s.
_index(dom s, positive integer i, positive integer j)
Returns the j-th entry of the i-th row of the sample s. This method overloads _index; calls of the form s[i, j] call this method.
set_index(dom s, positive integer i, positive integer j, any x)
Sets the (i, j)-th element of s to x. Note that no conversion to strings occurs, even if the type of x is not one of the ``data types'' described in the `Details' section. This method is called by assignments of the form s[i, j] := x.
map(dom s, any f)
Maps the function f onto the rows of the sample s. Note that rows are internally represented by lists. The function must accept a list as input parameter and must return a list of the same length. This method overloads map.
nops(dom s)
Returns the number of rows of s. This method overloads nops.
op(dom s, positive integer i)
op(dom s, [positive integer i, positive integer j])
Returns the i-th row of s or the j-th element of the i-th row, respectively. This method overloads op.
subsop(dom s, integer i = list newrow, ...)
Replaces the i-th row of s by newrow. The length of the new row has to match the number of columns of s. It is possible to replace several rows simultaneously. This method overloads subsop.
row2list(dom s, positive integer or range r, ...)
Returns the r-th row of the sample s as a list. It is possible to specify more than one row index or range of row indices.
print(dom s)
Prints s. This method is called by the system for displaying samples. This method overloads print.
fastprint(dom s)
Prints s. The usual print command may be slow for large samples. This method provides a somewhat faster alternative.
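The column, row, and conversion methods described above have no counterpart in the examples below, so a short sketch of how they might be called is given here. It assumes that the methods are reached through the usual domain slot syntax stats::sample::name(...), and that append, delCol, and delRow return a new sample rather than modifying s; neither assumption is stated explicitly on this page. The names s and t are chosen for illustration only, and no output is shown.

```mupad
// A small sample with one string column and two data columns
s := stats::sample([["m", 26, 180], ["f", 22, 160], ["f", 48, 155]]):

// Extract the second column, then columns 2 through 3, as lists
stats::sample::col2list(s, 2);
stats::sample::col2list(s, 2..3);

// Extract the first row as a list
stats::sample::row2list(s, 1);

// Append a row; its length must match the number of columns (3 here)
t := stats::sample::append(s, ["m", 30, 172]):

// Delete the third column, then the first two rows
stats::sample::delCol(t, 3);
stats::sample::delRow(t, 1..2);

// Convert the sample to a plain list of rows (DOM_LIST is the only
// target type that is implemented)
stats::sample::convert_to(t, DOM_LIST);

delete s, t:
```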
Example 1
A sample is created from a list of rows:
>> stats::sample([[5, a], [b, 7.534], [7/4, c+d]])
5 a
b 7.534
7/4 c + d
For a sample with only one column one can use a flat list instead of a list of rows:
>> stats::sample([5, 3, 8])
5
3
8
Example 2
The following input creates a small sample with columns for ``gender'', ``age'' and ``height'', respectively:
>> stats::sample([["m", 26, 180], ["f", 22, 160],
["f", 48, 155], ["m", 30, 172]])
"m" 26 180
"f" 22 160
"f" 48 155
"m" 30 172
Note that all entries in a column are automatically converted to strings, if one entry of that column is a string:
>> stats::sample([[m, 26, 180], [f, 22, 160],
["f", 48, 155], [m, 30, 172]])
"m" 26 180
"f" 22 160
"f" 48 155
"m" 30 172
Example 3
The functions float, has, map, nops, op, and subsop are overloaded to work on samples as on lists of lists:
>> s := stats::sample([[a, 1], [b, 2], [c, 3]])
a 1
b 2
c 3
>> float(s), has(s, a), map(s, list -> [list[1], list[2]^2]), nops(s), subsop(s, 1 = [d, 4]), op(s, [1, 2])
a 1.0 , TRUE, a 1 , 3, d 4 , 1
b 2.0         b 4      b 2
c 3.0         c 9      c 3
Indexing works as it does for arrays:
>> s[1, 2] := x : s
a x
b 2
c 3
>> delete s:
Example 4
The dot operator may be used to concatenate samples and lists (regarded as samples with one row):
>> s := stats::sample([[1, a], [2, b]]): s.[X, Y].s
1 a
2 b
X Y
1 a
2 b
>> delete s:
Super-Domain
Ax::canonicalRep