/** This file contains functions for finding the averages (mean and mode(s))
    along with the standard deviation of an array or enumerating expression
    of data.

    These functions are now part of Frink's standard library, so this file
    does not need to be included.

    These use Welford's algorithm for calculating the "Corrected Sum of
    Squares", which is the "sum of squares of the deviations of the values
    about their mean."

    Welford, B. P. (1962). Note on a Method for Calculating Corrected Sums of
    Squares and Products. Technometrics, 4(3), 419–420.
    doi:10.1080/00401706.1962.10490022

    https://sci-hub.se/https://doi.org/10.1080/00401706.1962.10490022
*/

/** Calculates the mean and standard deviation of an array or enumerating
    expression. This uses Welford's algorithm as cited in Knuth, The Art of
    Computer Programming, Vol. 2, 3rd edition, page 232.

    This gets the units of measurement right and can even be used
    symbolically.

    Arguments:
      [list, sample]

    where
      list:   an array or enumerating expression of the items to average.
      sample: a boolean flag indicating whether the standard deviation should
              be the sample standard deviation (=true) or the population
              standard deviation (=false). If in doubt, it is probably safer
              and more conservative to set this to true (sample standard
              deviation), which gives a larger standard deviation. With
              sample = true the divisor is (number - 1), so the list needs at
              least two elements.

    Returns:
      [mean, sd, number]

    where
      mean   is the mean of the sequence of data
      sd     is the (population or sample) standard deviation
      number is the number of elements in the list.
*/
meanAndSD[list, sample] :=
{
   M = undef          // Running mean; takes its units from the first value
   S = undef          // Running corrected sum of squares
   k = 1              // 1-based index of the value being processed
   for v = list
   {
      if M == undef
      {
         M = v
         S = 0 v^2    // Zero with the units of v^2, so units come out right
      } else
      {
         oldM = M
         diff = v - oldM
         M = M + diff / k
         S = S + diff * (v - M)
      }
      k = k+1
   }

   // After the loop, k-1 is the number of elements seen.
   if sample == true
      sub = 1   // Sample standard deviation: divide by (number - 1)
   else
      sub = 0   // Population standard deviation: divide by number

   return [M, sqrt[S/(k-1-sub)], k-1]
}

/** Calculates the mean of an array or enumerating expression. This uses
    Welford's algorithm as cited in Knuth, The Art of Computer Programming,
    Vol. 2, 3rd edition, page 232.

    This gets the units of measurement right and can even be used
    symbolically.

    Arguments:
      [list]

    where
      list: an array or enumerating expression of the items to average.

    Returns:
      the mean of the list.
*/
mean[list] :=
{
   M = undef          // Running mean; takes its units from the first value
   k = 1              // 1-based index of the value being processed
   for v = list
   {
      if M == undef
         M = v        // First value: the mean starts with the right units
      else
         M = M + (v - M) / k

      k = k+1
   }

   return M
}

/** Returns the mode(s) of a list, that is, the value(s) that occur the most
    times. Each element of the list must be a hashing expression, that is,
    one that can be used as a key in a dictionary.

    This always returns an array because a distribution can have several
    equally common modes. This uses the more general mostCommon[list]
    function to do the work.

    The modes are not returned in any specific order (because they might
    not be comparable to each other).

    For example,
      modes[[1, 1, 1, 2, 4, 4, 4]]

    returns
      [1, 4]

    because 1 and 4 both occur the same number of times.
*/
modes[list] :=
{
   return mostCommon[list]@0
}
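
/** Usage sketch (an illustration, not part of the original file).

    exampleStatistics is a hypothetical helper name chosen for this example.
    The calls are wrapped in a function so that nothing runs when this file
    is loaded. It shows how the three functions above might be called,
    assuming they are defined as in this file.

    For the data [2, 4, 4, 4, 5, 5, 7, 9] the mean is 5 and the population
    standard deviation is 2 (the squared deviations sum to 32, and
    32 / 8 = 4). How these values are formatted when printed may vary.
*/
exampleStatistics[] :=
{
   data = [2, 4, 4, 4, 5, 5, 7, 9]

   // false requests the population standard deviation (divide by number).
   [avg, stdDev, num] = meanAndSD[data, false]
   println["mean = $avg, population sd = $stdDev, number = $num"]

   // Values with units keep their units in the result.
   avgLength = mean[[1 m, 2 m, 3 m]]
   println["mean length = $avgLength"]

   // Both 1 and 4 occur three times; the order of the result is unspecified.
   common = modes[[1, 1, 1, 2, 4, 4, 4]]
   println["modes = $common"]
}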