• Lots of Spam tagged as good

    By John Q Parker 2 decades ago

    As much as 50% of my spam is getting through the Bayesian filters. Most of these are one-liners, which should be really easy to spot.



    Do I need to run the PREP phase again? Why isn't the Bayesian filter getting any more accurate? I always copy and paste spam from my mailbox into the kSpam Ham db, and then move to the MailSpam db and confirm. Is this correct?



    I see quite a few identical spams getting through time after time…

    • one liners

      By Tom Lyne 2 decades ago

      Unfortunately, that's a limitation of the kSpam filter: one-line spams don't provide enough information in the body of the message to classify them accurately. You could turn on the token listing so you can see which tokens are being used to calculate the probability; it's likely they are all quite innocent.



      Lots of word classification algorithms suffer from the same problem: there's just not enough information to go on.
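
      To illustrate why a short message gives a word-based filter so little to work with, here is a minimal sketch. It assumes the per-token probabilities are combined with the naive Bayes formula popularised by Paul Graham's "A Plan for Spam"; I am not claiming this is the exact formula kSpam uses, and the token probabilities and the 0.9 threshold are made up for the example.

```python
from math import prod


def combine(token_probs):
    """Combine per-token spam probabilities into one score.

    Graham-style naive Bayes combination (an assumption, not
    necessarily what kSpam does). Probabilities must lie strictly
    between 0 and 1, otherwise the denominator can become zero.
    """
    spam = prod(token_probs)
    ham = prod(1.0 - p for p in token_probs)
    return spam / (spam + ham)


# Hypothetical values: a one-liner yields only a few, near-neutral tokens.
one_liner = [0.55, 0.45, 0.60]

# A longer spam has the same weak tokens plus several strongly spammy ones.
long_spam = one_liner + [0.95, 0.97, 0.99, 0.90]

SPAM_THRESHOLD = 0.9  # hypothetical cut-off for tagging a message as spam

short_score = combine(one_liner)
long_score = combine(long_spam)

print(f"one-liner: {short_score:.3f}")
print(f"long spam: {long_score:.3f}")
```

      With only three near-neutral tokens the combined score stays close to 0.5, well under the threshold, while a handful of strongly spammy tokens pushes the longer message almost to 1.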



      -tom

      • Not All One-Liners

        By John Q Parker 2 decades ago

        They are certainly not all one-liners, and there are a few I must have seen a hundred times.



        "Another Update to fix Windows File Errors in registry" for example. This is a multi-line email, and the "ALLOW REASON" is garbage characters such as "ï" with an unprintable character as well.



        Here is the body of the email:

        --------------------------------------------------



        File Error Notification - Instructions To fix File Errors in your Registry:

        Your PC may be suffering from serious file errors in your WINDOWS registry which may be the reason why your PC is running so slow, or crashing and freezing from time to time. Also, these can lead to major system problems and possible memory leaks.

        Below are instructions that will enable you to Increase Your Computer's Speed, Power, Stability and Reliability in just a few minutes.

        Press below to launch the Diagnostics Test download for no cost at all:
        http://buyournest.com/c/ZNE7eooFlW9BL5-cMKaSg.html?0

        Once again, there are NO OBLIGATIONS for this FREE OFFER that includes our FREE Software, FREE Analysis, FREE Report and 24 Hour Support !

        If after completing the free Diagnostic Test it is brought to your attention that your computer's registry does contain file "errors", then it may be in your computer's best interest to fix the potentially harmful file errors in your registry.

        Press below to launch the Diagnostics Test download now:
        http://buyournest.com/c/ZNE7eooFlW9BL5-cMKaSg.html?0

        Copyright © 2002 - 2004 All Rights Reserved

        To not receive future offers/promotions from "Error Nuker" please press on the below link:
        http://buyournest.com/c/ZNE7eooFlW9BL5-cMKaSg.html?1

        Or send us a letter at:
        100 E. San Marcos Blvd.
        San Marcos, CA 92069

        To remove yourself from this list, click here
        http://buyournest.com/u/ZNE7eooFlW9BL5-cMKaSg.html
        or write to us at:
        #50038
        PO Box 025250
        Miami, Florida 33102-5250



        --------------------------------------------------



        Surely there is enough here to go on to learn that this is SPAM?

        • Re: sorry..

          By Tom Lyne 2 decades ago

          ..I thought it was only one-liners you had a problem with.



          There should be enough there to go on. Can you set kSpam to add the token list to the email? Then you can see which tokens kSpam is using to decide the final probability.
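
          As a rough idea of what to look for in a token list, the sketch below ranks tokens by how far their spam probability sits from the neutral value 0.5, since those are the ones that move the final score most. The function name `token_report` and the probabilities are hypothetical, and kSpam's actual token list may be laid out quite differently.

```python
def token_report(token_probs, limit=5):
    """Return the (token, probability) pairs farthest from neutral 0.5.

    token_probs maps each token to its learned spam probability.
    Hypothetical helper for illustration, not part of kSpam.
    """
    ranked = sorted(
        token_probs.items(),
        key=lambda item: abs(item[1] - 0.5),
        reverse=True,
    )
    return ranked[:limit]


# Made-up probabilities for a few words from the message above.
probs = {
    "registry": 0.52,
    "free": 0.91,
    "windows": 0.48,
    "errors": 0.60,
    "click": 0.88,
}

report = token_report(probs, limit=3)
for token, p in report:
    print(f"{token:10s} {p:.2f}")
```

          If the top of a list like this is full of values near 0.5 for a message that is obviously spam, the filter has probably not been trained on enough similar messages yet.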



          -tom