macOS Running the Average and Std Dev on a data set - Newbie

chown33 · Jan 15, 2010

Awk program and simple data-sets follow.

Note 1: It uses the default precision for output, which is 6 significant digits (awk's default OFMT is "%.6g"). This could be changed pretty easily.
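
As a sketch of how the output precision in Note 1 might be changed: the print statement can be swapped for printf with an explicit format. The function name running_stats_fmt and the choice of 3 fractional digits are assumptions for illustration, not part of the original script.

```shell
# Hypothetical wrapper: same running statistics as awklet.awk,
# but printf fixes the number of fractional digits in the output.
running_stats_fmt() {
	awk '
	{
		++N
		sum += $1
		sumOfSquares += ($1 * $1)
		mean = sum / N
		variance = (sumOfSquares / N) - (mean * mean)
		if (variance < 0) variance = 0   # guard against rounding error
		# 3 fractional digits is an arbitrary choice for this example
		printf "%s %.3f %.3f\n", $1, mean, sqrt(variance)
	}' "$@"
}

# Usage: reads the named files, or standard input when none are given.
printf '2\n4\n4\n' | running_stats_fmt
```

Another option that leaves the script untouched is to set awk's built-in OFMT variable on the command line, e.g. awk -v OFMT='%.10g' -f awklet.awk in1.txt, which changes how many significant digits print uses for non-integer numbers.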

Note 2: There are no checks. Input is not validated as numeric. Overflow and underflow are unchecked. Loss of precision is uncertain (i.e. I have not done an error analysis).
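
For the missing validation mentioned in Note 2, one possible check is sketched here: lines whose first field does not look like a plain decimal number are reported on stderr and skipped. The function name running_stats_checked and the regular expression are assumptions for illustration; the pattern accepts an optional sign, digits with an optional fraction, and an optional exponent, and rejects everything else. Overflow and underflow are still unchecked.

```shell
# Hypothetical wrapper: awklet.awk logic plus a simple numeric check.
running_stats_checked() {
	awk '
	# Skip blank lines silently.
	NF == 0 { next }

	# Reject anything that is not a plain decimal number.
	# The pattern is an assumption made for this sketch.
	$1 !~ /^[-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?$/ {
		print "line " NR ": skipping non-numeric input: " $1 > "/dev/stderr"
		next
	}

	{
		++N
		sum += $1
		sumOfSquares += ($1 * $1)
		mean = sum / N
		variance = (sumOfSquares / N) - (mean * mean)
		if (variance < 0) variance = 0   # guard against rounding error
		print $1, mean, sqrt(variance)
	}' "$@"
}

# Usage: the middle line is reported on stderr and left out of the statistics.
printf '2\nabc\n4\n' | running_stats_checked
```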

Example command-lines in Terminal:

Code:

awk -f awklet.awk in1.txt >out1.txt

awk -f awklet.awk in2.txt >out2.txt

awk -f awklet.awk in1.txt in2.txt >outCombined.txt

Contents of awklet.awk:

Code:

# Calculate running mean and stddev at each input line.
# Each input line is a single number.
# Each output line is the input number,
# followed by the mean and stddev calculated up to that point.
# This has not been optimized at all.

BEGIN {
	N = 0
	sum = 0
	sumOfSquares = 0
}

{
	++N;
	sum += $1;
	sumOfSquares += ($1 * $1);

	mean = sum/N;

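	## population std dev = sqrt( (avg of value^2) - (avg of value)^2 )
	## see http://coding.derkeiler.com/Archive/Ada/comp.lang.ada/2004-01/0872.html
	variance = (sumOfSquares / N) - (mean * mean);
	## rounding error can push variance slightly below zero; clamp it so sqrt() stays defined
	if (variance < 0) variance = 0;
	stddev = sqrt( variance );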

	print $1, mean, stddev
}
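
On the loss-of-precision question: the script subtracts the squared mean from the mean of squares, and when the values are large relative to their spread those two quantities are nearly equal, so digits can be lost. A commonly used alternative is Welford's online algorithm, sketched below. This is a swapped-in technique, not the method awklet.awk uses; the function name running_stats_welford and the variable names delta and M2 are assumptions for illustration. It reports the same population standard deviation (dividing by N).

```shell
# Hypothetical wrapper: running mean and population std dev,
# updated incrementally instead of from sum and sum of squares.
running_stats_welford() {
	awk '
	{
		++N
		delta = $1 - mean           # distance from the previous mean
		mean += delta / N           # updated running mean
		M2 += delta * ($1 - mean)   # running sum of squared deviations
		print $1, mean, sqrt(M2 / N)
	}' "$@"
}

# Usage: reads the named files, or standard input when none are given.
printf '2\n4\n' | running_stats_welford
```

The per-line output has the same three columns as awklet.awk, so it can be dropped into the same command lines.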

in1.txt

Code:

in2.txt

Code:

jared_kipe · Jan 20, 2010

You WIN!!
I've never heard of awk, but it seems like an amazingly loosely typed language. Seems great for these little one-time-use things.

Also, don't you just love when you provide actual useful examples or code and the OP doesn't bother replying with a "thank you" or, "hey yeah that worked".

tyrant · Jan 21, 2010

jared_kipe said:
Also, don't you just love when you provide actual useful examples or code and the OP doesn't bother replying with a "thank you" or, "hey yeah that worked".

Wow, sorry guys (and gals), I was traveling and wasn't able to get back to y'all. I DO want to say thank you for all of the great information and help from everyone in this thread.
