[Gllug] Stories of using filters
Steve Cobrin
steve.cobrin at highbury.net
Fri Mar 18 11:46:35 UTC 2005
Here's a script I use frequently to find duplicate files, snarfed off the net
and tweaked a little. Of course it would run far faster if recoded in Perl, but
it's quite a nice (sic) example of lots of piping!
[best viewed with fixed pitch font.]
-- Steve
#!/bin/sh
# September 2000 * Padraig at Brady001.iol.ie
#
# Customised 9 Dec 2001 Steve Cobrin <cobrin at highbury.net>
#
# BUGS
# Doesn't handle files with funny characters in the filename, e.g. backspace
#
CMD=`basename "$0" .sh`
USAGE="usage: $CMD [-s|--size size] [path...]"
#
################################################################################
args=
size=1024c # 1k bytes
while [ $# -gt 0 ]
do
    case $1 in
    -s | --size )
        if [ $# -gt 1 ]
        then
            shift
            size=$1
        else
            echo "$CMD: missing parameter" 1>&2
            echo "$USAGE" 1>&2
            exit 1
        fi
        ;;
    -*) echo "$CMD: unknown option \"$1\"" 1>&2
        echo "$USAGE" 1>&2
        exit 1
        ;;
    *)  break
        ;;
    esac
    shift
done
if [ $# -gt 0 ]
then
    args=$*
else
    args="."
fi
################################################################################
# find  -- find all files bigger than $size, outputting "filename<nul>inode<nul>size"
# tr    -- protect embedded tabs and spaces, then replace nulls with spaces so other commands can process output
# sort  -- sort on size (largest first) then inode, dropping repeated inode/size pairs (hard links)
# uniq  -- keep only files whose size is shared with at least one other file
# cut   -- cut out all but filename part
# sort  -- sort on filename
# tr    -- put back spaces and tabs, and replace newline with null
# xargs -- generate md5sums
# sort  -- sort on checksum
# tr    -- protect spaces and tabs
# sed   -- swap checksum and filename round
# uniq  -- only show duplicate entries of checksum
# sed   -- switch back checksum and filename
# tr    -- swap back spaces and tabs
find $args -xdev -size +$size -type f ! -type l -printf "%p\0%i\0%s\n" \
| tr ' \t\0' '\0\1 ' \
| sort -k3nr -k2 -u \
| uniq -f 2 -D \
| cut -f1 -d' ' \
| sort \
| tr '\0\1\n' ' \t\0' \
| xargs -0 md5sum \
| sort -k1,1 \
| tr ' \t' '\1\2' \
| sed -e 's/\(^.\{32\}\)..\(.*\)/\2 \1/' \
| uniq -D -f 1 \
| sed -e 's/\(^.*\) \(.*\)/\2 \1/' \
| tr '\1\2' ' \t' \
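| (
    # Collect each run of identical checksums onto one line: "size file file ..."
    psum='no match'
    line=''
    while read sum file; do
        if [ "$sum" != "$psum" ]; then
            if [ -n "$line" ]; then
                printf '%s\n' "$line"
            fi
            #line="`du -b "$file"`"
            line=$(wc -c < "$file")
            psum="$sum"
        fi
        # Escape spaces inside the filename as "\ " so they survive the later split
        line="$line $(printf '%s\n' "$file" | sed -e 's/ /\\ /g')"
    done
    if [ -n "$line" ]; then
        printf '%s\n' "$line"
    fi
) \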
| sed -e 's/^ *//' \
| sort -k1,1 -brn \
| cut -d" " -f2- \
| sed -e 's/\([^\\]\) /\1\
/g' \
| while read files      # read without -r strips the "\ " escapes again
do  : files=\"$files\"
    ls -l "$files"
done
: END of script
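
To try the size pre-filter without touching a real directory tree, the first
half of the pipeline can be fed canned records. This is only a sketch, not part
of the script above: the records and candidates functions, the file names, the
inode numbers and the sizes are all made up for illustration. It assumes GNU
coreutils (uniq -D is a GNU extension) and uses the -k/-f option spellings in
place of the older +N/-N ones.

```shell
#!/bin/sh
# Sketch of the size pre-filter stage, run on invented records instead of
# real find output. Assumes GNU tr, sort, uniq and cut.

# Emit records shaped like find's -printf "%p\0%i\0%s\n" output.
records() {
    printf '%s\0%s\0%s\n' 'my file.txt' 11 2048   # name contains a space
    printf '%s\0%s\0%s\n' 'copy.txt'    12 2048   # same size, different inode
    printf '%s\0%s\0%s\n' 'link.txt'    11 2048   # hard link to the first file
    printf '%s\0%s\0%s\n' 'other.txt'   13 4096   # no other file has this size
}

# Print the names that are worth checksumming.
candidates() {
    records \
    | tr ' \t\0' '\0\1 ' \
    | sort -k3,3nr -k2,2n -u \
    | uniq -f 2 -D \
    | cut -d' ' -f1 \
    | sort \
    | tr '\0\1' ' \t'
    # tr (first):  space -> NUL, tab -> \001, NUL -> space, so fields split on spaces
    # sort -u:     one line per (size, inode) pair, which drops hard links
    # uniq -f 2:   skip name and inode, keep lines whose size repeats
    # tr (last):   put the protected spaces and tabs back into the names
}

candidates
```

It should print two names: copy.txt plus whichever of the two hard-linked names
sort -u happens to keep. other.txt is dropped because nothing else shares its
size, so it never reaches md5sum.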
--
Gllug mailing list - Gllug at gllug.org.uk
http://lists.gllug.org.uk/mailman/listinfo/gllug