• Bayesian filtering

    By Peter Gloor 2 decades ago

    Assuming the KS_BL_IGNORE line in Notes.INI is limited to 256 bytes, I'm looking for suggestions on how to deal with the following situation.



    After implementing kSpam and getting the first common problems solved, I started to love kSpam for its fantastic results. After a month or so I got some user complaints about "good mail" that had been copied into the "spam mail" box. Looking into KS_BL_TOKENS, I found that the cause was some common header and formatting words like "Content-Type", "multipart", "NextPart", "charset" etc.



    I did some further analysis, entered the most critical words into a KS_BL_IGNORE line in Notes.ini and deleted all documents in the "spam mail" database that had any of these tokens in KS_BL_TOKENS.



    This solved the problem, and after a week or so I had the excellent results back again. Now, after another month, the problem has reappeared. Good mail is sent to the "spam mail" box. It is the same problem again, just with another set of tokens.



    If I add the minimal set of tokens to KS_BL_IGNORE, the line becomes 183 characters long. If I enter the whole set of tokens I would like to see ignored, the KS_BL_IGNORE line will be longer than 256 characters.
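
    To see how many tokens fit, the line length can be checked before editing Notes.ini. This is only a rough sketch: the 256-byte limit is my assumption from above, and the comma separator and the helper names (ini_line, fit_tokens) are made up for illustration, not taken from the kSpam documentation.

```python
LIMIT = 256  # assumed limit from this post, not a confirmed value


def ini_line(tokens, key="KS_BL_IGNORE", sep=","):
    """Build the Notes.ini line; the separator is an assumption."""
    return f"{key}={sep.join(tokens)}"


def fit_tokens(tokens, limit=LIMIT, key="KS_BL_IGNORE", sep=","):
    """Keep tokens in priority order while the whole line stays within limit bytes.

    Returns (kept, dropped). A token that does not fit is skipped, and
    shorter tokens further down the list may still be kept.
    """
    kept, dropped = [], []
    for tok in tokens:
        candidate = ini_line(kept + [tok], key, sep)
        if len(candidate.encode("utf-8")) <= limit:
            kept.append(tok)
        else:
            dropped.append(tok)
    return kept, dropped


if __name__ == "__main__":
    wanted = ["Content-Type", "multipart", "NextPart", "charset"]
    kept, dropped = fit_tokens(wanted)
    print(len(ini_line(kept)), "bytes used;", "dropped:", dropped)
```

    Listing the tokens in order of importance means the most critical ones are the ones that survive if the line runs out of room.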



    Now I'm asking myself: am I doing something wrong, am I misunderstanding something, or is there another option I'm not aware of to deal with this situation?



    I should add that we are only five users here and get a lot of spam to our info account and to my own, both of which are published on the web in several places. On bad days we receive up to 80 spam mails but only 15 or fewer good mails. The ratio of mails in the "spam mail" box to mails in the "good mail" box is 7:1. Do I need to get more good mails copied into the "good mail" box?



    Any suggestions/ideas are welcome!



    Peter

    • Re:

      By Tom Lyne 2 decades ago

      You really want to reduce the number of messages in your spammail database. I have about a 2:1 ratio which seems to work pretty well.
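
      A rough illustration of why the ratio can matter. I don't know how kSpam scores tokens internally, so this assumes the simplest case where a token's score comes from raw document counts; the function names are made up for the example.

```python
def raw_count_spam_score(spam_docs, ham_docs, spam_rate, ham_rate):
    """Score a token from raw counts: spam hits / all hits.

    spam_rate and ham_rate are the fractions of spam and ham documents
    that contain the token.
    """
    spam_hits = spam_docs * spam_rate
    ham_hits = ham_docs * ham_rate
    return spam_hits / (spam_hits + ham_hits)


def normalized_spam_score(spam_rate, ham_rate):
    """Score the same token from per-corpus frequencies instead of raw counts."""
    return spam_rate / (spam_rate + ham_rate)


if __name__ == "__main__":
    # A header word such as "Content-Type" that shows up in 90% of ALL mail.
    print(raw_count_spam_score(7000, 1000, 0.9, 0.9))  # 7:1 corpus
    print(raw_count_spam_score(2000, 1000, 0.9, 0.9))  # 2:1 corpus
    print(normalized_spam_score(0.9, 0.9))             # corpus size ignored
```

      Under that assumption a word that is equally common in good and bad mail scores 0.875 with a 7:1 corpus and about 0.67 with a 2:1 corpus, while a filter that normalizes by corpus size would leave it at a neutral 0.5.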



      -tom

      • I am having the same problem

        By Seni Budi 2 decades ago

        I am facing the same problem: good mail is being sent to the spammail file.



        Can you elaborate on the above statement:



        What do you mean by reduce the number of messages in your spammail database?

        What do you mean by 2:1 ratio?



        How do I go about accomplishing the above?



        Thanks

        • Re:

          By Tom Lyne 2 decades ago

          What do you mean by reduce the number of messages in your spammail database? - have fewer documents in the mailspam.nsf database



          What do you mean by 2:1 ratio? - spam to ham (good) emails





          Set up an agent to remove old email from the spam and good email databases, watch the number of documents in the databases closely, and try to keep it below 10000 in each.
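
          The selection logic for such an agent could look roughly like this. It is only a sketch of the idea in Python; a real agent would run against the Notes databases, and the 30-day age limit and the function name are assumptions for illustration (only the 10000 cap comes from the advice above).

```python
from datetime import datetime, timedelta


def select_for_removal(docs, now, max_age_days=30, max_docs=10000):
    """Pick the documents an agent should delete from one database.

    docs is a list of (doc_id, received) tuples. Everything older than
    max_age_days is removed; if more than max_docs remain, the oldest of
    the remaining documents are removed as well.
    """
    cutoff = now - timedelta(days=max_age_days)
    by_age = sorted(docs, key=lambda d: d[1])  # oldest first
    expired = [d for d in by_age if d[1] < cutoff]
    fresh = [d for d in by_age if d[1] >= cutoff]
    overflow = max(0, len(fresh) - max_docs)
    return [doc_id for doc_id, _ in expired + fresh[:overflow]]


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    docs = [
        ("a", datetime(2023, 12, 1)),
        ("b", datetime(2024, 1, 20)),
        ("c", datetime(2024, 1, 25)),
    ]
    print(select_for_removal(docs, now))
```

          Running the same selection on both databases keeps each one under the cap; keeping the ratio between them near 2:1 would need a lower cap on the spam side.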



          -tom