|
Runbox Forum Everything related to Runbox should go here: suggestions, comments, complaints, questions, technical issues, etc. |
22 Jul 2004, 06:56 AM | #16 |
Member
Join Date: Jul 2004
Location: Portland, Oregon, USA
Posts: 98
|
It seems like an awesome idea linking them together like that for now. Is that planned to be permanent, or are you going to eventually separate the two of them?
One observation, though. I turned off junk mail filtering in Thunderbird so that I could train the DSPAM filter on the server level instead, and yes, I've gotten a few spams in my inbox today. No big deal. But looking in my Spam folder, I see lots of spams appended with the *SPAM* tag in the subject, which apparently just shows up in the web interface. Looking at the headers, though, they all say "DSPAM: Innocent," even though they're obviously being tagged as spam and labeled as such. How is that possible? Is the header just not updating when SA tells DSPAM that it's spam, or does the header only change when DSPAM itself recognizes an email as spam? |
22 Jul 2004, 08:00 AM | #17 | |
Intergalactic Postmaster
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606
Representative of:
Runbox.com |
Quote:
Have you noticed any messages without the "DSPAM" headers added? I'm getting quite a few (36 today).
Rich |
|
22 Jul 2004, 08:20 AM | #18 |
Essential Contributor
Join Date: Nov 2003
Location: somewhere
Posts: 297
|
Hey Trond, I forgot to ask about the beta testing period: how long will it last before the trainable mode is officially launched?
|
22 Jul 2004, 07:41 PM | #19 |
Essential Contributor
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
|
We don't really know yet.
It depends on how resource intensive it turns out to be. We suspect that the database can become very large very fast, and we just don't know yet if this is a problem or not. So, after a while, we should have an idea of whether we need more CPU or more disks or anything like that. |
22 Jul 2004, 07:46 PM | #20 | ||
Essential Contributor
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
|
Quote:
Quote:
Do these messages have anything in common? Are they retrieved by POP? Forwarded from another Runbox account? Did they pass through the same mail server? |
||
22 Jul 2004, 09:28 PM | #21 | |
Intergalactic Postmaster
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606
Representative of:
Runbox.com |
Quote:
I haven't been able to identify anything in particular in common.
Not the same server: lassie, fetch, fifi.
They are not POP retrieved.
They are not forwarded from another Runbox account.
I would say all of them are forwarded to my Runbox account from another service, because I use my own domain names. Also, most of my emails are forwarded from my domain names to the Runbox NO or US domains and not the COM domain.
Rich |
|
22 Jul 2004, 10:09 PM | #22 | |
Essential Contributor
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
|
Quote:
I don't think there's any good reason for this, probably just a misconfiguration. I'll contact Linpro and have them look at it. |
|
22 Jul 2004, 10:33 PM | #23 |
Essential Contributor
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
|
Now it's fixed. All your mail should get filtered through DSPAM now.
|
23 Jul 2004, 09:59 AM | #24 | |
Intergalactic Postmaster
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606
Representative of:
Runbox.com |
Quote:
Rich |
|
23 Jul 2004, 10:11 AM | #25 | |
Member
Join Date: May 2004
Location: Caracas, Venezuela
Posts: 63
|
Storage space for learning filter
Quote:
|
|
23 Jul 2004, 07:14 PM | #26 | |
Junior Member
Join Date: Jan 2004
Posts: 22
|
Re: Storage space for learning filter
Quote:
Tore |
|
23 Jul 2004, 08:06 PM | #27 | |
Essential Contributor
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
|
Re: Storage space for learning filter
Quote:
DSPAM works by breaking the mail down into tokens: usually single words, but it also looks at pairs of words and certain headers. Then it checks whether each token has been used in spam messages or innocent messages before. The ratio of spam to innocent tokens decides whether DSPAM thinks the message is spam or not.
So this list of tokens can grow HUGE in a very short time, and I'm really not sure what that translates to in disk space. But it's mostly performance I worry about.
Anyway, I love statistics, so I thought I'd share the numbers from the first couple of days of beta testing:
Code:
messages:          3131
% spam:            70.9
spam:              2221
false negatives:    558
spam accuracy:     74.9
ham:                910
false positives:     46
ham accuracy:      94.9
First of all, DSPAM usually starts catching spam on its own after about 50-75 spams. These numbers represent the total accuracy, which means that the first spams it lets through will drag down the accuracy for a while.
Second, the "spam accuracy" number does not really reflect what the user sees, because SpamAssassin catches most of the spam DSPAM misses to begin with. So it will feel like it's a lot more accurate than it really is at first.
Once the filter has learned about 1000 messages of each sort, it is considered to be mature, and accuracy should be above 99.9%.
I think I'll reset the statistics for my user, and see what my stats are for the weekend.
BTW: This is how I'm doing so far:
Code:
messages:           650
% spam:            52.9
spam:               344
false negatives:     19
spam accuracy:     94.5
innocent:           306
false positives:      5
innocent accuracy: 98.4 |
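For anyone curious how the token idea in the post above might look in practice, here is a rough Python sketch. It is a toy, not DSPAM's actual algorithm: the class name ToyTokenFilter, the tokenizer regex, the simple averaging of per-token ratios, and the 0.5 threshold are all assumptions made up for illustration. It only shows the general shape: split mail into words and word pairs, count how often each token appeared in spam versus innocent mail, and score new mail from those ratios.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase single words plus adjacent word pairs."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


class ToyTokenFilter:
    """Hypothetical illustration only; real DSPAM is far more elaborate."""

    def __init__(self):
        self.spam_counts = Counter()      # token -> number of spam mails containing it
        self.innocent_counts = Counter()  # token -> number of innocent mails containing it

    def train(self, text, is_spam):
        counts = self.spam_counts if is_spam else self.innocent_counts
        # Count each token once per message.
        counts.update(set(tokenize(text)))

    def token_ratio(self, token):
        """Fraction of sightings of this token that were in spam, or None if unseen."""
        s = self.spam_counts[token]
        h = self.innocent_counts[token]
        if s + h == 0:
            return None
        return s / (s + h)

    def score(self, text):
        """Average spam ratio over the known tokens; 0.5 if nothing is known."""
        ratios = [self.token_ratio(t) for t in set(tokenize(text))]
        known = [r for r in ratios if r is not None]
        if not known:
            return 0.5
        return sum(known) / len(known)

    def is_spam(self, text, threshold=0.5):
        return self.score(text) > threshold


f = ToyTokenFilter()
f.train("cheap pills buy now", is_spam=True)
f.train("meeting notes for monday", is_spam=False)
print(f.is_spam("buy cheap pills"))
print(f.is_spam("monday meeting notes"))
```

Note that the two Counter objects gain an entry for every new word and word pair seen in training, which is the same growth the post worries about for the token database.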
|
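The accuracy figures in the stats above can be reproduced with simple arithmetic. A small Python sketch follows, assuming that "spam accuracy" means the share of spam that was not a false negative and "ham/innocent accuracy" means the share of innocent mail that was not a false positive. That reading is inferred from the posted numbers (it matches both tables), not taken from the DSPAM docs, and the function name filter_stats is made up for the example.

```python
def filter_stats(spam, false_negatives, innocent, false_positives):
    """Recompute the summary figures from the four raw counts.

    Assumed formulas (inferred from the posted numbers):
      % spam            = spam / (spam + innocent)
      spam accuracy     = (spam - false_negatives) / spam
      innocent accuracy = (innocent - false_positives) / innocent
    """
    total = spam + innocent
    return {
        "messages": total,
        "pct_spam": round(100 * spam / total, 1),
        "spam_accuracy": round(100 * (spam - false_negatives) / spam, 1),
        "innocent_accuracy": round(100 * (innocent - false_positives) / innocent, 1),
    }


# Counts from the two tables in the post above.
all_testers = filter_stats(spam=2221, false_negatives=558, innocent=910, false_positives=46)
one_user = filter_stats(spam=344, false_negatives=19, innocent=306, false_positives=5)
print(all_testers)
print(one_user)
```

Because these are running totals, the early misses (before the filter has seen its first 50-75 spams) stay in the false-negative count and keep pulling the percentage down for a while.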
23 Jul 2004, 09:47 PM | #28 | |
Essential Contributor
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
|
Re: Re: Storage space for learning filter
Quote:
That turned out to be a bad idea. After I reset my stats, the filter started letting spam through again. Guess I should read the docs more carefully |
|
24 Jul 2004, 07:11 AM | #29 |
Intergalactic Postmaster
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606
Representative of:
Runbox.com |
WOW .. after just a couple of days of training DSPAM seems to be doing a pretty good job. I'm impressed. Now I must say my SPAM influx today seems to be a bit lower than normal. We'll see how it goes over the next few days.
Rich |
24 Jul 2004, 07:33 AM | #30 |
Intergalactic Postmaster
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606
Representative of:
Runbox.com |
DSPAM is catching stuff that would otherwise be slipping through: messages that SpamAssassin gave scores of 3.
Rich |