Browse Tag: awk

Enumerate columns for awk

I’m bad at counting, so when I’m using awk to print specific fields, I end up with greasy fingerprints on my screen as I manually count out each field. Thanks to my colleague James, here’s a script that counts for you!

awk 'NR == 1 { for (i=1;i<=NF;i++) {printf i " "} print ""} {print}' | column -t

Works with STDIN as is, assuming default field separator (space):

[kale@superhappykittymeow log]# tail -n 1 xferlog |awk 'NR == 1 { for (i=1;i<=NF;i++) {printf i " "} print ""} {print}' | column -t
1    2    3   4         5     6  7          8    9                          10  11  12  13  14    15   16  17  18
Sat  Jun  19  13:19:25  2010  1  127.0.0.1  220  /var/www/poop/wp-rss2.php  b   _   i   r   root  ftp  0   *   c

Or, if you're lazy like myself, encapsulate it in an alias:

alias count='awk 'NR == 1 { for (i=1;i<=NF;i++) {printf i " "} print ""} {print}' | column -t'

[kale@superhappykittymeow log]# tail -n 1 xferlog | count
1    2    3   4         5     6  7          8    9                          10  11  12  13  14    15   16  17  18
Sat  Jun  19  13:19:25  2010  1  127.0.0.1  220  /var/www/poop/wp-rss2.php  b   _   i   r   root  ftp  0   *   c

Practical awk (for Apache logs)

Who is hotlinking?
[code lang=”bash”]awk -F\” ‘($2 ~ /\.(jpg|gif)/ && $4 !~ /^http:\/\/www\.yourdomain\.com/){print $4}’ access_log.processed \
| sort | uniq -c | sort[/code]

Blank referrers (usually indicates direct hits, such as a user typing in yourdomain.com, or a script):
[code lang=”bash”]awk -F\” ‘($6 ~ /^-?$/)’ access_log.processed | awk ‘{print $1}’ | sort | uniq[/code]

How many different IPs visited on a specific day (and how often they visited):
[code lang=”bash”]grep ’12/Dec/2008′ access_log.processed | \
awk ‘{cnt[$1]++;} END{for (ip in cnt){printf(“%-15s visited: %04d time(s).\n”, ip, cnt[ip])}}'[/code]

Amount of data transferred for a specific date:
[code lang=”bash”]grep ’12/Dec/2008’ access_log.processed | awk ‘{ SUM += $10} END { print SUM/1024/1024 }'[/code]