4 Feb 2007
I blogged a while ago about using FuzzyOcr for detecting image spam. My FuzzyOcr isn’t working and at the moment I haven’t got time to fix it, so I wrote a procmail recipe to solve the problem instead:
# test if body contains gif, html, etc, and get procmail score
:0 Bc
* 3.5^0 Content-Type: image/gif
* 2^0 Content-Type: text/html
/dev/null
SCORE_PM=$=

# pull out SA score and required; if 2 scores > SA req'd, ISGT = 1 (true)
SCORE_SA=`formail -c -xX-Spam-Status: | awk '{print $2}' | awk -F= '{print $2}'`
REQD_SA=`formail -c -xX-Spam-Status: | awk '{print $3}' | awk -F= '{print $2}'`
ISGT=`echo "${SCORE_SA} + ${SCORE_PM} > ${REQD_SA}" | bc -l`

# test if ISGT = 1, if so, spam prob
:0 :
* ISGT ?? ^^1^^
.y_spam_probable/
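If you want to check the extraction and the comparison without sending mail through procmail, something like the following sketch should do. The header value and the procmail score are made-up samples, not output from a real message, and I've swapped `bc` for `awk` in the comparison so the script needs nothing beyond a POSIX shell and awk.

```shell
#!/bin/sh
# Hypothetical X-Spam-Status value; a real SpamAssassin header will differ.
STATUS='No, score=3.1 required=5.0 tests=HTML_MESSAGE autolearn=no'

# Hypothetical procmail score: 3.5 (image/gif) + 2 (text/html).
SCORE_PM=5.5

# Same field extraction as the recipe: word 2 is score=..., word 3 is required=...
SCORE_SA=$(printf '%s\n' "$STATUS" | awk '{print $2}' | awk -F= '{print $2}')
REQD_SA=$(printf '%s\n' "$STATUS" | awk '{print $3}' | awk -F= '{print $2}')

# awk stands in for bc here: prints 1 if the combined score
# exceeds the required threshold, otherwise 0.
ISGT=$(awk -v sa="$SCORE_SA" -v pm="$SCORE_PM" -v req="$REQD_SA" \
    'BEGIN { print ((sa + pm > req + 0) ? 1 : 0) }')

echo "SCORE_SA=$SCORE_SA REQD_SA=$REQD_SA ISGT=$ISGT"
```

With these sample numbers the SpamAssassin score alone is under the threshold, but adding the procmail score pushes the total over it, which is exactly the case the recipe is meant to catch.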