Stat
Stat module for DataFrame, providing basic statistical metrics for numeric columns.
Parameters
df
DataFrame An instance of DataFrame.
sum
Compute the sum of a numeric column.
Parameters
columnName
String The column to evaluate, containing Numbers.
Examples
df.stat.sum('column1')
Returns Number The sum of the column.
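As a rough illustration of what a column sum involves, the sketch below reduces over plain row objects. It does not use the DataFrame class: `sumColumn` and `isNumeric` are hypothetical helpers, and skipping non-numeric values is an assumption, not documented behavior of `df.stat.sum`.

```javascript
// Hypothetical helper: true only for finite JavaScript numbers.
const isNumeric = (value) => typeof value === "number" && Number.isFinite(value);

// Hypothetical helper: adds up the numeric values found under columnName.
// Assumption: non-numeric entries are skipped rather than raising an error.
function sumColumn(rows, columnName) {
  return rows.reduce(
    (total, row) => (isNumeric(row[columnName]) ? total + row[columnName] : total),
    0
  );
}

const rows = [{ column1: 3 }, { column1: 6 }, { column1: "n/a" }, { column1: 1 }];
console.log(sumColumn(rows, "column1")); // 10
```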
max
Compute the maximum value of a numeric column.
Parameters
columnName
String The column to evaluate, containing Numbers.
Examples
df.stat.max('column1')
Returns Number The maximum value of the column.
min
Compute the minimum value of a numeric column.
Parameters
columnName
String The column to evaluate, containing Numbers.
Examples
df.stat.min('column1')
Returns Number The minimum value of the column.
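For intuition, the extremes that max and min report can be found in a single pass over the values. The sketch below works on a plain array; `columnExtent` is a hypothetical helper and is not part of the Stat module.

```javascript
// Hypothetical helper: returns the smallest and largest value of an array.
// A reduce is used instead of Math.min(...values) so that very long arrays
// do not hit argument-count limits.
function columnExtent(values) {
  return values.reduce(
    (acc, v) => ({ min: Math.min(acc.min, v), max: Math.max(acc.max, v) }),
    { min: Infinity, max: -Infinity }
  );
}

console.log(columnExtent([4, -2, 9, 0])); // { min: -2, max: 9 }
```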
mean
Compute the mean value of a numeric column.
Parameters
columnName
String The column to evaluate, containing Numbers.
Examples
df.stat.mean('column1')
Returns Number The mean value of the column.
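The mean is the sum of the values divided by their count. A minimal sketch on a plain array follows; `meanOf` is a hypothetical helper, and returning `NaN` for an empty input is an assumption rather than documented behavior of `df.stat.mean`.

```javascript
// Hypothetical helper: arithmetic mean of an array of numbers.
function meanOf(values) {
  if (values.length === 0) return NaN; // assumption: no values, no mean
  const total = values.reduce((acc, v) => acc + v, 0);
  return total / values.length;
}

console.log(meanOf([2, 4, 6, 8])); // 5
```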
average
Compute the mean value of a numeric column. Alias of mean.
Parameters
columnName
String The column to evaluate, containing Numbers.
Examples
df.stat.average('column1')
Returns Number The mean value of the column.
var
Compute the variance of a numeric column.
Parameters
columnName
String The column to evaluate, containing Numbers.
population
Boolean Population mode. If true, compute the population variance instead of the sample variance. (optional, default false)
Examples
df.stat.var('column1')
Returns Number The variance of the column.
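The population flag changes only the divisor: the sample variance divides the sum of squared deviations by n - 1, while the population variance divides by n. The sketch below shows this on a plain array; `variance` is a hypothetical standalone function, not the module's implementation.

```javascript
// Hypothetical helper: variance of an array of numbers.
// population = false -> sample variance (divide by n - 1)
// population = true  -> population variance (divide by n)
function variance(values, population = false) {
  const n = values.length;
  const mean = values.reduce((acc, v) => acc + v, 0) / n;
  const squaredDeviations = values.reduce((acc, v) => acc + (v - mean) ** 2, 0);
  return squaredDeviations / (population ? n : n - 1);
}

const data = [2, 4, 4, 4, 5, 5, 7, 9]; // mean 5, squared deviations sum to 32
console.log(variance(data));       // 32 / 7, about 4.571
console.log(variance(data, true)); // 32 / 8 = 4
```

On small columns the two results differ noticeably, so it is worth choosing the mode deliberately.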
sd
Compute the standard deviation of a numeric column.
Parameters
columnName
String The column to evaluate, containing Numbers.
population
Boolean Population mode. If true, compute the population standard deviation instead of the sample standard deviation. (optional, default false)
Examples
df.stat.sd('column1')
Returns Number The standard deviation of the column.
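The standard deviation is the square root of the variance, so the population flag has the same effect here as it has for var. A self-contained sketch on a plain array follows; `standardDeviation` is a hypothetical helper, not the module's code.

```javascript
// Hypothetical helper: standard deviation as the square root of the variance.
// population = true divides by n, otherwise by n - 1 (sample mode).
function standardDeviation(values, population = false) {
  const n = values.length;
  const mean = values.reduce((acc, v) => acc + v, 0) / n;
  const squaredDeviations = values.reduce((acc, v) => acc + (v - mean) ** 2, 0);
  return Math.sqrt(squaredDeviations / (population ? n : n - 1));
}

const data = [2, 4, 4, 4, 5, 5, 7, 9]; // population variance is 4
console.log(standardDeviation(data, true)); // 2
console.log(standardDeviation(data));       // sqrt(32 / 7), about 2.138
```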
stats
Compute all the stats available with the Stat module on a numeric column.
Parameters
columnName
String The column to evaluate, containing Numbers.
Examples
df.stat.stats('column1')
Returns Object A dictionary containing all available statistical metrics.
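To give an idea of what such a dictionary might look like, the sketch below gathers the metrics described in this section into one object. The function name `describe` and the key names (`sum`, `mean`, `min`, `max`, `var`, `sd`) are assumptions made for illustration. The keys actually returned by `df.stat.stats` are not listed here and may differ.

```javascript
// Hypothetical helper: collects basic metrics for an array of numbers.
// Key names are illustrative only; var and sd use the sample (n - 1) form.
function describe(values) {
  const n = values.length;
  const sum = values.reduce((acc, v) => acc + v, 0);
  const mean = sum / n;
  const squaredDeviations = values.reduce((acc, v) => acc + (v - mean) ** 2, 0);
  const sampleVariance = squaredDeviations / (n - 1);
  return {
    sum,
    mean,
    min: Math.min(...values),
    max: Math.max(...values),
    var: sampleVariance,
    sd: Math.sqrt(sampleVariance),
  };
}

const summary = describe([1, 2, 3, 4]);
console.log(summary.sum, summary.mean, summary.min, summary.max); // 10 2.5 1 4
```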