**NOTE** The following only works with FTP daemons that log full paths in xferlog, i.e., not vsftpd with its default configuration. It works like a charm on Plesk and fails terribly on non-Plesk. For non-Plesk servers, please scroll to the bottom of this post.
I made an earlier post about this subject, but the script provided there has too many holes. Instead, I've found that this simple awk recipe does the trick quite well.
awk '$12 == "i" {print $9}' xferlog | egrep "\.php|\.htm|\.shtm|\.js" | sort -u > ftp_modified.out
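For anyone unsure what the field numbers refer to, here is a minimal sketch run against a made-up log. It assumes the standard xferlog layout, where the timestamp takes up the first five whitespace-separated fields, field 9 is the filename and field 12 is the direction (`i` for an incoming upload). The function name `uploaded_web_files` and the sample entries are hypothetical, not taken from a real server.

```shell
# Hypothetical helper: list each uploaded web file in an xferlog once.
# Assumes the standard layout: $9 = filename, $12 = direction ("i" = upload).
uploaded_web_files() {
    awk '$12 == "i" {print $9}' "$1" | grep -E '\.php|\.htm|\.shtm|\.js' | sort -u
}

# Made-up sample log: index.php uploaded twice, one download, one image upload.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
Mon Jun  1 10:00:00 2009 1 203.0.113.5 2048 /var/www/vhosts/example.com/httpdocs/index.php b _ i r bob ftp 0 * c
Mon Jun  1 10:00:05 2009 1 203.0.113.5 2048 /var/www/vhosts/example.com/httpdocs/index.php b _ i r bob ftp 0 * c
Mon Jun  1 10:01:00 2009 1 203.0.113.5 4096 /var/www/vhosts/example.com/httpdocs/about.html b _ o r bob ftp 0 * c
Mon Jun  1 10:02:00 2009 1 203.0.113.5 512 /var/www/vhosts/example.com/httpdocs/logo.png b _ i r bob ftp 0 * c
EOF

# Only the uploaded index.php should be listed, and only once.
uploaded_web_files "$sample_log"
```

Downloads (`o`) are skipped here on the assumption that a file someone merely fetched was not modified over FTP.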
Note that the output it prints is not definitive, but it certainly gives you something to start with. Now, roll a grep:
while IFS= read -r line; do grep -H iframe "$line" >> iframe.out; done < ftp_modified.out
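To get a quick overview of what that grep collected, it can help to tally the distinct iframe sources; the injected ones tend to repeat across many files. This is only a rough sketch: the helper name `tally_iframe_sources` and the sample lines are made up, and the pattern assumes the `src` value contains no spaces.

```shell
# Hypothetical helper: count each iframe src value found in grep -H output
# (file:matching line), most frequent first.
tally_iframe_sources() {
    grep -o '<iframe[^>]*src=[^ >]*' "$1" \
        | sed 's/.*src=//' \
        | tr -d "\"'" \
        | sort | uniq -c | sort -rn
}

# Made-up stand-in for iframe.out: two injected iframes, one legitimate embed.
sample_out=$(mktemp)
cat > "$sample_out" <<'EOF'
/var/www/a/index.php:<iframe src="http://bad.example/in.cgi?2" width=1 height=1></iframe>
/var/www/a/about.html:<iframe src="http://bad.example/in.cgi?2" width=1 height=1></iframe>
/var/www/a/video.html:<iframe src="http://player.example/embed/1" width=640></iframe>
EOF

tally_iframe_sources "$sample_out"
```

A source that shows up in dozens of unrelated files is a strong candidate for the cleanup regex; a one-off is more likely something the site owner put there.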
**You will need to review this output to find the actual string and distinguish between legitimate iframes and the baddies.** The following sed will usually take care of about 80% of them:
awk -F: '{print $1}' iframe.out | sort -u | while IFS= read -r line; do sed -i 's/<iframe src=.*\/in\.cgi?.*<\/iframe>//g' "$line"; done
Of course, there are also JavaScript-obfuscated redirects to clean up:
while IFS= read -r line; do grep -H eval "$line" >> eval.out; done < ftp_modified.out
This will catch *most* of them. Unfortunately, with the JS ones you need to develop a sed regex on a per-exploit basis, and there are tons. Look over the results in eval.out and craft a sed that is tailored enough to the JS exploits that it won't affect legit code. I usually end up with something like this:
awk -F: '{print $1}' eval.out | sort -u | while IFS= read -r line; do sed -i 's/function.*String\.fromCharCode.*document\.write.*));//g' "$line"; done
But of course, use your brain and, most importantly, *always* test the sed with the -e switch (and without -i) on one of the infected files first, so the result goes to stdout and you can confirm it works before running it with -i against the whole list! These cleanups are a good way to fine-tune your practical regex skills. Remember not to be too broad, or too specific!
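One way to do that dry run is to pipe the sed output into diff, so you see exactly which lines would go away while the file itself stays untouched. A minimal sketch, assuming a sed and diff that accept `-e` and `-u`; the helper name `preview_sed` and the sample page are hypothetical.

```shell
# Hypothetical helper: show what a sed expression would change in one file,
# as a unified diff, without modifying the file.
preview_sed() {
    # diff exits non-zero when it finds differences; that is expected here.
    sed -e "$1" "$2" | diff -u "$2" - || true
}

# Made-up infected page to try the expression on.
sample_page=$(mktemp)
cat > "$sample_page" <<'EOF'
<html><body>
<p>Hello</p>
<iframe src="http://bad.example/in.cgi?2" width=1 height=1></iframe>
</body></html>
EOF

preview_sed 's/<iframe src=.*\/in\.cgi?.*<\/iframe>//g' "$sample_page"
```

If the diff shows anything beyond the injected markup, tighten the regex before reaching for -i.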
If the server does not have Plesk, or chroots its FTP users so that xferlog shows relative rather than absolute paths, we'll skip the xferlog bit and just look at our docroots for recently modified files:
awk '$1 == "DocumentRoot" {gsub(/"/, "", $2); print $2}' /etc/httpd/conf/httpd.conf | sort -u > docroots.out
while IFS= read -r line; do find "$line" -type f -mtime -180; done < docroots.out | egrep "\.php|\.htm|\.shtm|\.js" | sort -u > ftp_modified.out
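To see what the `-mtime -180` filter keeps and drops, here is a small sketch against a throwaway directory. The function name `recent_web_files`, the temp directory and the file names are all made up for illustration; the 180-day window is the one used above.

```shell
# Hypothetical helper: list web files under a docroot modified in the last N days.
recent_web_files() {
    find "$1" -type f -mtime "-$2" | grep -E '\.php|\.htm|\.shtm|\.js' | sort -u
}

# Throwaway docroot: one fresh PHP file, one fresh image, one PHP file from 2000.
docroot=$(mktemp -d /tmp/docroot_XXXXXX)
touch "$docroot/new.php" "$docroot/new.png"
touch -t 200001010000 "$docroot/old.php"

# Only new.php should be listed: old.php is too old, new.png is not a web script.
recent_web_files "$docroot" 180
```

Bear in mind that mtime can be reset by whoever modified the file, so an empty result is not proof that a docroot is clean.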
To be quite honest, these aren't "ftp-modified" files, but the list drops straight into the rest of my snippets here. The entire purpose of generating these file lists is to narrow down the sheer number of files we have to look through to something more manageable, as opposed to grepping through everything in the server's DocumentRoots.
The above snippets are the fastest ways I've developed to deal with this stuff; you'll spend most of your time reviewing the output and writing regexes to clean up the matches. Absolutely remember to change the FTP passwords for at least the FTP users exploited, and have the end user scan all computers that may have connected to the server via FTP for viruses and trojans.
I haven't the faintest idea how to deal with the new Google Analytics-esque variant yet… I hope that doesn't become more popular.