[SWLUG] adding up columns

Bryn Reeves hagbard at nildram.co.uk
Fri Dec 12 21:41:22 UTC 2003


On 19:37 Fri 12 Dec     , nat wrote:
> evening swlug'ers!
> 
> i have some csv files fairly huge ones, with data that i've accumulated 
> from various boxes around the place. i was wondering if there was a quick 
> way using shell tools to add up columns, my original files are like:
> 

hi nat,
You can do this with sed & awk; it takes about 30s on my P4 Celeron to 
process a 60MB file. Just put the script in a file (sum.sh here) & make it 
executable:


	#!/bin/sh
	#
	# sum tables of the form:
	# locationN, COUNT
	#
	# usage: sum.sh <input files>

	AWK_SUM='BEGIN {
	        NUM=1
	}

	{
	  # record each key once, in order of first appearance
	  if (!($1 in SUM)) {
	        KEYS[NUM]=$1
	        NUM++
	  }
	  SUM[$1] += $2
	}

	END {
	  for (N=1; N<NUM; N++) {
	        printf("%s,%d\n",KEYS[N],SUM[KEYS[N]])
	  }
	}'

	# turn commas into spaces so awk splits the fields on whitespace
	SED_STRIP="s/,/ /g"

	# quote "$@" so file names containing spaces survive
	cat "$@" | sed "$SED_STRIP" | awk "$AWK_SUM"
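
If the location names might contain spaces, the sed step would split them into extra fields. A possible variant, as a rough sketch only: let awk split on commas itself with -F, and skip sed and cat. The function name sum_csv and the file name sum2.sh are just made up for illustration, and I haven't timed this against the version above:

```shell
#!/bin/sh
# Hypothetical variant: sum the second column per key using awk alone.
# -F, tells awk to split fields on commas, so no sed pass is needed.
sum_csv() {
    awk -F, '
    NF >= 2 {
        # remember each key once, in order of first appearance
        if (!($1 in sum)) keys[++n] = $1
        sum[$1] += $2
    }
    END {
        for (i = 1; i <= n; i++) printf("%s,%d\n", keys[i], sum[keys[i]])
    }' "$@"
}

# Only run when input files are given, e.g.: sh sum2.sh data.csv
if [ "$#" -gt 0 ]; then
    sum_csv "$@"
fi
```

Lines with fewer than two comma-separated fields (blank lines, for instance) are skipped, and totals are printed as integers, same as the script above.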

Python or Perl might give better performance, I'm not sure. Here's a sample 
run on a small file, then a timed run on the big one:

	$ ./sum.sh data.csv
	location1,18
	location2,8
	location3,4

	bryn at metatron stuff $ time ./sum.sh big_data.csv
	location1,15845280
	location2,7042464
	location3,3521232
 
	real    0m33.075s
	user    0m32.058s
	sys     0m0.563s

Cheers

Bryn




