|
FastMail Forum All posts relating to FastMail.FM should go here: suggestions, comments, requests for help, complaints, technical issues etc. |
28 Mar 2006, 09:59 AM | #1 |
Essential Contributor
Join Date: Aug 2004
Location: Japan
Posts: 226
|
too many false positives
I have a POP account for work which I access via FastMail's POP links. The address is public, so it does get some spam, but a huge number of legitimate messages are also flagged as spam (false positives). This means I have to go through my junk folder every day to find the ham. It also means that when I reply to a message and remove the spam score from the subject line, my message is not recognized as a response to the incoming message. I can't whitelist the incoming mail because most of it is first-time inquiries from potential customers.
Most of these mails are in Japanese. I mentioned before on this forum that Japanese messages seem to get an extra-high spam score, but someone from FastMail said that wasn't true. I still wonder, because as far as I can tell there's nothing spammy about them. Any suggestions for how to solve this problem? Thanks in advance. Jeremy |
28 Mar 2006, 11:20 PM | #2 |
The "e" in e-mail
Join Date: Oct 2002
Location: Holon, Israel.
Posts: 4,857
|
You can look for "X-spam-hits" in the headers and see if there's something common that raises the spam scores. In Hebrew I also get quite a lot of legitimate mail that gets a relatively high spam score.
You can use "advanced" spam filtering, opt not to use spam filtering automatically, and then set rules that combine the spam score with other headers, so that mail coming from different pop-links or forwarded to different aliases is treated differently for the same score.
Example: I use
allof( header :value "ge" :comparator "i;ascii-numeric" ["X-Spam-score"] ["7"] , header :contains "X-LinkName" "netvision" )
in the filing rules (with "look in" set to "advanced") to file the email I pull from my Netvision account with spam score 7 or above to a spam folder.
Another slightly more complicated rule:
allof( header :value "ge" :comparator "i;ascii-numeric" ["X-Spam-score"] ["7"] , anyof( header :contains "X-Delivered-to" "member", header :contains "X-Sneakemail-Label" "slashdot") )
deals with email that has spam score 7 or above and was either forwarded by a forwarding service to an alias containing the word "member", or forwarded by sneakemail after being received at one of the addresses I published on Slashdot.
It's not a perfect solution, but it can save you time by putting most legitimate mail pulled from the pop account in one folder and all of the spam in another, with a few false positives and a few false negatives. |
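For anyone who wants to try the two conditions above outside the "advanced" filing-rule field, here is a rough sketch of how they might look as a standalone Sieve script. It is only an illustration under some assumptions: the folder name "INBOX.Spam" is hypothetical, the `require` line assumes the server supports the relational and numeric-comparator extensions, and the header names are simply the ones quoted in this post.

```sieve
# Extensions needed for fileinto and for the numeric ":value" comparison.
require ["fileinto", "relational", "comparator-i;ascii-numeric"];

# Mail pulled through the "netvision" pop-link with spam score 7 or above.
# "INBOX.Spam" is a hypothetical folder name; use whatever folder you file spam into.
if allof(
    header :value "ge" :comparator "i;ascii-numeric" "X-Spam-score" "7",
    header :contains "X-LinkName" "netvision"
) {
    fileinto "INBOX.Spam";
    stop;
}

# Mail with spam score 7 or above that arrived via a "member" alias
# or via a sneakemail address labelled "slashdot".
if allof(
    header :value "ge" :comparator "i;ascii-numeric" "X-Spam-score" "7",
    anyof(
        header :contains "X-Delivered-to" "member",
        header :contains "X-Sneakemail-Label" "slashdot"
    )
) {
    fileinto "INBOX.Spam";
    stop;
}
```

The point of combining the score with a second header is that each source can have its own threshold: a noisy public address could be filed at 7, while a private one could use a higher number in a separate rule.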
30 Mar 2006, 05:27 PM | #3 |
Essential Contributor
Join Date: Aug 2004
Location: Japan
Posts: 226
|
thank you
Thank you, Hasaso. I'll try the advanced spam filtering.
As for checking X-spam-hits, I glanced at a few and nothing jumped out at me except that most of them say something about bayes. What does that mean? And is there anything I can do about it? As individual users, we can't change the way SpamAssassin judges our messages, can we? If the advanced spam filtering thing doesn't work, I'm not sure what I'll do. Something's got to change, because the way things are, I have to look through my junk folder as often and as carefully as my inbox. All my messages get forwarded to Gmail, and the filter there allows the occasional spam into the inbox but very rarely gives false positives. |
31 Mar 2006, 02:55 PM | #4 |
Intergalactic Postmaster
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 6,102
Representative of:
Fastmail.FM |
Can you PM me your account name, and leave some messages in the Junk Mail folder that I can look at? I'll see if there's an obvious problem...
Rob |
4 Apr 2006, 07:13 AM | #5 | |
Essential Contributor
Join Date: Aug 2004
Location: Japan
Posts: 226
|
Rob looked into it
and just for people who may have been following this thread, I quote (with permission) what he said in a PM:
Quote:
Jeremy |
|
23 Jun 2006, 04:41 PM | #6 |
Essential Contributor
Join Date: Aug 2004
Location: Japan
Posts: 226
|
still too many false positives
Most of them are in Japanese. Most are very short messages. Many come from cell phones. I don't know what else they could have in common that leads to them being classified as spam.
Rob, could you take another look in my junk folder? I left eight mails in there (the read ones). Just in case you don't have my user ID from before, I'll PM it to you again. Thanks. -Jeremy |
23 Jun 2006, 05:34 PM | #7 | |
Cornerstone of the Community
Join Date: Mar 2004
Location: London, UK
Posts: 834
|
Re: still too many false positives
Quote:
|
|
23 Jun 2006, 06:06 PM | #8 | |
Essential Contributor
Join Date: Aug 2004
Location: Japan
Posts: 226
|
Re: Re: still too many false positives
Quote:
Or does FastMail have its own Bayes database? That would give non-English messages a hard time getting through, since I assume most FastMail customers use English and would classify messages in other languages as spam. Jeremy |
|
23 Jun 2006, 09:49 PM | #9 |
Intergalactic Postmaster
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 6,102
Representative of:
Fastmail.FM |
The bayes DB is not currently user controlled in any way; it's built up automatically from the spam score of messages. It's not ideal, but it does actually help.
Part of the thing with the new servers is that we'll be freeing up some other servers to become DB servers for logging + per-user bayes DB. I've been waiting for that for a while... Rob |
23 Jun 2006, 09:56 PM | #10 | |
Essential Contributor
Join Date: Aug 2004
Location: Japan
Posts: 226
|
Quote:
I'm looking forward to the per-user database. Should I be saving spam to train it on once it's in place? |
|
24 Jun 2006, 05:37 AM | #11 | |
The "e" in e-mail
Join Date: Oct 2002
Location: Holon, Israel.
Posts: 4,857
|
Re: Re: Re: still too many false positives
Quote:
I do consider all the mail I get in Japanese/Chinese/Spanish etc. as spam, and it really is: it's quite easy to recognize even if I don't speak the language. I don't get any legitimate email in these languages because people who know me don't send me email in languages I don't speak. So even if a spam/ham corpus used to train a system to recognize spam contains only Japanese spam, that would really be spam and would not include any ham classified as spam. However, if there is no Japanese ham to go with it, a statistical model would just learn to associate Japanese patterns with spam.
When there is a per-user bayes DB, I wonder how easy it would be to make it work "per destination" (e.g., pop-link, alias, email address). It seems that different "destinations" tend to get different patterns of spam, so taking that into account in the statistical model could produce better results. |
|
26 Jun 2006, 10:12 AM | #12 | |
Intergalactic Postmaster
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 6,102
Representative of:
Fastmail.FM |
Quote:
The "train to bayes db" level is very high, so I think it's currently only messages with scores above 15 or below -10 which are going to the training DB. Rob |
|
26 Jun 2006, 06:30 PM | #13 |
Essential Contributor
Join Date: Aug 2004
Location: Japan
Posts: 226
|
That's good to hear. Do I have to actually click the "report as spam" button, or does moving it to the junk folder in my client do the trick? And likewise, will simply moving the messages falsely marked as spam out of the junk folder serve to report them as non-spam?
Also thanks for the suggestions in the PM, Rob. I'll raise my SA score threshold and add addresses to my address book. I get a lot of inquiries from potential customers, which of course I can't whitelist ahead of time. Also, I usually use a client, so without a way to sync my Thunderbird address book and my FM address book, it's kind of a pain to remember which addresses need to be added to FM. Jeremy
Edit: Just a thought. Can I whitelist all Japanese-encoded email, maybe using Sieve? Probably a topic for another post. |
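On the Sieve idea in the edit above, one possible shape for such a rule is sketched below. This is untested and rests on assumptions: that the Japanese mail declares one of the usual Japanese charsets in its top-level Content-Type header, and that the rule runs before any rule that files by spam score. It would not catch Japanese mail sent as UTF-8 or multipart messages that only declare the charset on an inner part, and it would also let through any spam that happens to use these charsets.

```sieve
# No "require" line is needed: header, keep and stop are all core Sieve.

# Assumption: Japanese mail carries one of these charsets in the
# top-level Content-Type header. The match is case-insensitive by default.
if header :contains "Content-Type" ["iso-2022-jp", "shift_jis", "euc-jp"] {
    keep;   # deliver to the inbox as normal
    stop;   # skip later rules, including any spam-score filing rule
}
```

A softer variant would be to keep the spam-score rule but give mail matching this test a higher threshold, rather than exempting it completely.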
27 Jun 2006, 04:53 PM | #14 | |
Intergalactic Postmaster
Join Date: Oct 2001
Location: Melbourne, Australia
Posts: 6,102
Representative of:
Fastmail.FM |
Quote:
Rob |
|