30 Jan 2022, 11:13 AM   #11
BritTim
Quote:
Update: the Spam/non-Spam processing is going WAY too slow.

After 4 days the spam counters show 6k emails have been processed. That's a rate of ~1.5k per day, and the counters suggest the daily rate may be _slowing down_. I currently have 120k marked-as-spam-and-not-spam emails in the queue to process, and this will most likely grow every day (possibly dramatically) as I mass-add emails to my "Ham / non-Spam" folder. This will take months at the current rate. I have created a ticket with Fastmail on this.
The global Bayes database already provides a good baseline. If the messages are well selected (i.e. emails that were originally mischaracterised), processing 1k spam and 1k ham to fine-tune your personal Bayes database will usually produce great results.
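
To make the selection idea concrete, here is a rough Python sketch. It is only an illustration: the `Message` record, its field names, and both helper functions are hypothetical and have nothing to do with how Fastmail stores or processes mail. It keeps only the messages the filter originally got wrong, caps each class at 1k, and compares processing times using the ~1.5k per day rate quoted above.

```python
from dataclasses import dataclass


@dataclass
class Message:
    # Hypothetical record; the field names are assumptions for illustration.
    msg_id: str
    filter_said_spam: bool  # what the filter originally decided
    actually_spam: bool     # what you decided after looking at it


def pick_training_set(messages, per_class=1000):
    """Keep only messages the filter originally got wrong, capped per class."""
    spam, ham = [], []
    for m in messages:
        if m.filter_said_spam == m.actually_spam:
            continue  # already classified correctly, so skip it
        bucket = spam if m.actually_spam else ham
        if len(bucket) < per_class:
            bucket.append(m)
    return spam, ham


def days_to_process(count, per_day=1500):
    """Days needed to work through `count` messages at `per_day` messages a day."""
    return count / per_day


if __name__ == "__main__":
    print(days_to_process(120_000))  # the full 120k backlog
    print(days_to_process(2_000))    # 1k spam + 1k ham
```

At the quoted rate the full 120k backlog works out to 80 days, while a selected set of 1k spam plus 1k ham is done in under two days.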

I hope Bill comes by and adds his own thoughts. He has tuned his own account so he can safely discard virtually all spam (no false positives) while allowing almost no spam to reach the Inbox.

I agree that it is important to spend some time to get this right, but careful selection is more important than throwing massive amounts of data at the problem.