EmailDiscussions.com  

Old 22 Jul 2004, 06:56 AM   #16
AGSHender
Member
 
Join Date: Jul 2004
Location: Portland, Oregon, USA
Posts: 98
It seems like an awesome idea linking them together like that for now. Is that a planned permanence or are you going to eventually separate the two of them?

One observation, though. I turned off junk mail filtering in Thunderbird so that I could train the DSPAM filter on the server level instead, and yes, I've gotten a few spams in my inbox today. No big deal. But looking in my Spam folder, I see lots of spams appended with the *SPAM* tag in the subject, which apparently just shows up in the web interface. Looking at the headers, though, they all say "DSPAM: Innocent," even though they're obviously being tagged as spam and labeled as such. How is that possible? Is the header just not updating when SA tells DSPAM that it's spam, or does the header only change when DSPAM itself recognizes an email as spam?
Old 22 Jul 2004, 08:00 AM   #17
carverrn
Intergalactic Postmaster
 
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606

Representative of:
Runbox.com
Quote:
Originally posted by AGSHender
It seems like an awesome idea linking them together like that for now. Is that a planned permanence or are you going to eventually separate the two of them?

One observation, though. I turned off junk mail filtering in Thunderbird so that I could train the DSPAM filter on the server level instead, and yes, I've gotten a few spams in my inbox today. No big deal. But looking in my Spam folder, I see lots of spams appended with the *SPAM* tag in the subject, which apparently just shows up in the web interface. Looking at the headers, though, they all say "DSPAM: Innocent," even though they're obviously being tagged as spam and labeled as such. How is that possible? Is the header just not updating when SA tells DSPAM that it's spam, or does the header only change when DSPAM itself recognizes an email as spam?
The web interface shows the *SPAM* flag when the header of the message contains "X-Spam-Flag: YES", which means SpamAssassin flagged it as SPAM.
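As a rough illustration of that check (a sketch only, not Runbox's actual web interface code), the test for the flag could look like this in Python. The function name `shows_spam_tag` and the sample message are made up for the example; only the "X-Spam-Flag: YES" header comes from the description above.

```python
from email import message_from_string


def shows_spam_tag(raw_message: str) -> bool:
    """Return True when the message carries an 'X-Spam-Flag: YES' header."""
    msg = message_from_string(raw_message)
    # Header lookup is case-insensitive; a missing header yields "".
    flag = msg.get("X-Spam-Flag", "")
    return flag.strip().upper() == "YES"


# Hypothetical sample message, for illustration only.
sample = "From: a@example.com\nSubject: hi\nX-Spam-Flag: YES\n\nbody\n"
print(shows_spam_tag(sample))
```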

Have you noticed any messages without the "DSPAM" headers added? I'm getting quite a few (36 today).

Rich
Old 22 Jul 2004, 08:20 AM   #18
BLuRReD
Essential Contributor
 
Join Date: Nov 2003
Location: somewhere
Posts: 297
Hey trond, I forgot to ask about the beta testing period: how long will it last before the trainable mode is officially launched?
Old 22 Jul 2004, 07:41 PM   #19
trond
Essential Contributor
 
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
We don't really know yet.

It depends on how resource intensive it turns out to be. We suspect that the database can become very large very fast, and we just don't know yet if this is a problem or not.

So, after a while, we should have an idea of whether we need more CPU or more disks or anything like that.
Old 22 Jul 2004, 07:46 PM   #20
trond
Essential Contributor
 
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
Quote:
Originally posted by carverrn
The web interface shows the *SPAM* flag when the header of the message contains "X-Spam-Flag: YES", which means SpamAssassin flagged it as SPAM.
We intend to show *SPAM* on messages caught by DSpam as well; we just haven't configured the mail server properly yet.

Quote:

Have you noticed any messages without the "DSPAM" headers added? I'm getting quite a few (36 today).
That's very odd. As far as I can tell, all my mail passes through Dspam.

Do these messages have anything in common? Are they retrieved by POP? Forwarded from another Runbox account? Did they pass through the same mail server?
Old 22 Jul 2004, 09:28 PM   #21
carverrn
Intergalactic Postmaster
 
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606

Representative of:
Runbox.com
Quote:
Originally posted by trond
That's very odd. As far as I can tell, all my mail passes through Dspam.

Do these messages have anything in common? Are they retrieved by POP? Forwarded from another Runbox account? Did they pass through the same mail server?
I set up a filter to check for the DSPAM headers, and as of now 172 messages do not show these headers.
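For anyone who wants to run the same kind of check outside the webmail filters, here is a minimal Python sketch that counts messages lacking any DSPAM header. It is an assumption-laden example: the `X-DSPAM-` prefix is assumed (the thread only quotes the header as "DSPAM: Innocent"), and `has_dspam_header` and `count_missing` are hypothetical helper names.

```python
from email import message_from_string


def has_dspam_header(raw_message: str, prefix: str = "X-DSPAM-") -> bool:
    """True if any header name starts with the (assumed) DSPAM prefix."""
    msg = message_from_string(raw_message)
    return any(name.upper().startswith(prefix.upper()) for name in msg.keys())


def count_missing(raw_messages) -> int:
    """Count the messages that carry no DSPAM header at all."""
    return sum(1 for raw in raw_messages if not has_dspam_header(raw))


# Hypothetical sample messages, for illustration only.
with_header = "Subject: a\nX-DSPAM-Result: Innocent\n\nbody\n"
without_header = "Subject: b\nX-Spam-Flag: YES\n\nbody\n"
print(count_missing([with_header, without_header, without_header]))
```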

I haven't been able to identify anything particular in common.

Not the same server ... lassie, fetch, fifi

They are not POP retrieved.

They are not forwarded from another Runbox account.

I would say all are forwarded to my Runbox account from another service because I use my own domain names.
Also, most of my emails are forwarded from my domain names to the Runbox NO or US domains and not the COM domain.

Rich
Old 22 Jul 2004, 10:09 PM   #22
trond
Essential Contributor
 
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
Quote:
Originally posted by carverrn
Also, most of my emails are forwarded from my domain names to the Runbox NO or US domains and not the COM domain.
Ah! That's it. It seems that only mails to runbox.com are filtered through DSpam.

I don't think there's any good reason for this, probably just a misconfiguration. I'll contact Linpro and have them look at it.
Old 22 Jul 2004, 10:33 PM   #23
trond
Essential Contributor
 
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
Now it's fixed. All your mail should get filtered through dspam now.
Old 23 Jul 2004, 09:59 AM   #24
carverrn
Intergalactic Postmaster
 
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606

Representative of:
Runbox.com
Quote:
Originally posted by trond
Now it's fixed. All your mail should get filtered through dspam now.
Yep ... looks like it all started going to DSPAM about 8:10am CDT. Thanks!

Rich
Old 23 Jul 2004, 10:11 AM   #25
jbonet
Member
 
Join Date: May 2004
Location: Caracas, Venezuela
Posts: 63
Storage space for learning filter

Quote:
Originally posted by trond
We don't really know yet.

It depends on how resource intensive it turns out to be. We suspect that the database can become very large very fast, and we just don't know yet if this is a problem or not.

So, after a while, we should have an idea of whether we need more CPU or more disks or anything like that.
Surely the amount of storage will be small compared to 1 GB, no?
Old 23 Jul 2004, 07:14 PM   #26
tore
Junior Member
 
Join Date: Jan 2004
Posts: 22
Re: Storage space for learning filter

Quote:
Originally posted by jbonet
Surely the amount of storage will be small compared to 1 GB, no?
Yes. Bear in mind, though, that the average user isn't using his whole GB. Also, if the DSPAM database gets a lot of traffic, its storage - unlike the mail storage - has to be blazing fast. In other words - expensive storage.

Tore
Old 23 Jul 2004, 08:06 PM   #27
trond
Essential Contributor
 
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
Re: Storage space for learning filter

Quote:
Originally posted by jbonet
Surely the amount of storage will be small compared to 1 GB, no?
Well, we tried another system first, and that would require almost as much space as people are actually using for their mail, so we decided against that.

DSpam works by breaking the mail down into tokens. These are usually single words, but it also looks at pairs of words and certain headers. Then it checks whether a token has been used in spam messages or innocent messages before. The ratio of spam to innocent tokens decides whether DSpam thinks the message is spam or not.
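To make the idea concrete, here is a toy Python sketch of token counting. It is not DSpam's actual algorithm or code: the class `ToyTokenFilter`, the tokenizer, the simple hit ratio and the 0.5 threshold are all assumptions chosen for illustration, and header tokens are left out.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into single words plus adjacent word pairs."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    pairs = [f"{a} {b}" for a, b in zip(words, words[1:])]
    return words + pairs


class ToyTokenFilter:
    def __init__(self):
        # Number of spam / innocent messages each token has appeared in.
        self.spam_counts = Counter()
        self.innocent_counts = Counter()

    def train(self, text, is_spam):
        target = self.spam_counts if is_spam else self.innocent_counts
        target.update(set(tokenize(text)))

    def spam_ratio(self, text):
        """Share of known token sightings that came from spam."""
        spam_hits = innocent_hits = 0
        for token in set(tokenize(text)):
            spam_hits += self.spam_counts[token]
            innocent_hits += self.innocent_counts[token]
        total = spam_hits + innocent_hits
        # With no known tokens there is no evidence either way.
        return 0.5 if total == 0 else spam_hits / total

    def is_spam(self, text, threshold=0.5):
        return self.spam_ratio(text) > threshold


toy = ToyTokenFilter()
toy.train("cheap pills buy now", is_spam=True)
toy.train("meeting notes for monday", is_spam=False)
print(toy.is_spam("buy cheap pills"))
print(toy.is_spam("notes for the meeting"))
```

Even this toy version shows why the token list grows quickly: every new word and every new word pair gets its own entry.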

So - this list of tokens can grow HUGE in a very short time, and I'm really not sure what that translates to in disk space. But it's mostly performance I worry about.

Anyway - I love statistics, so I thought I'd share the numbers from the first couple of days of beta testing:

Code:
       messages: 3131
         % spam: 70.9
           spam: 2221
false negatives: 558
  spam accuracy: 74.9
            ham: 910
false positives: 46
   ham accuracy: 94.9
Just a few things to note.
First of all, DSpam usually starts catching spam on its own after about 50-75 spams. These numbers represent the total accuracy, which means that the first spams it lets through will drag down the accuracy for a while.
Second, the "spam accuracy" number does not really reflect what the user sees, because SpamAssassin catches most of the spam DSpam misses to begin with. So it will feel like it's a lot more accurate than it really is at first.
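For reference, the accuracy figures in the table above can be reproduced from the raw counts. The formula in this Python sketch is inferred from the posted numbers, not taken from the DSpam documentation, and the helper name `accuracy` is made up for the example.

```python
def accuracy(total, misclassified):
    """Percentage of messages in a class that were classified correctly."""
    return round(100.0 * (total - misclassified) / total, 1)


# Counts from the first couple of days of beta testing.
spam, false_negatives = 2221, 558
ham, false_positives = 910, 46

print(accuracy(spam, false_negatives))        # 74.9
print(accuracy(ham, false_positives))         # 94.9
print(round(100.0 * spam / (spam + ham), 1))  # 70.9
```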

Once the filter has learned about 1000 messages of each sort, it is considered to be mature, and accuracy should be above 99.9%.

I think I'll reset the statistics for my user, and see what my stats are for the weekend.

BTW: This is how I'm doing so far
Code:
         messages: 650
           % spam: 52.9
             spam: 344
  false negatives: 19
    spam accuracy: 94.5
         innocent: 306
  false positives: 5
innocent accuracy: 98.4
Old 23 Jul 2004, 09:47 PM   #28
trond
Essential Contributor
 
Join Date: Oct 2003
Location: Oslo, Norway
Posts: 344
Re: Re: Storage space for learning filter

Quote:
Originally posted by trond
I think I'll reset the statistics for my user, and see what my stats are for the weekend.
Yikes!

That turned out to be a bad idea. After I reset my stats, the filter started letting spam through again. Guess I should read the docs more carefully.
Old 24 Jul 2004, 07:11 AM   #29
carverrn
Intergalactic Postmaster
 
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606

Representative of:
Runbox.com
WOW .. after just a couple of days of training DSPAM seems to be doing a pretty good job. I'm impressed. Now I must say my SPAM influx today seems to be a bit lower than normal. We'll see how it goes over the next few days.

Rich
Old 24 Jul 2004, 07:33 AM   #30
carverrn
Intergalactic Postmaster
 
Join Date: Jan 2002
Location: Chicago, IL
Posts: 5,606

Representative of:
Runbox.com
DSPAM is catching stuff that would have slipped through otherwise; SpamAssassin gave those messages scores of 3.

Rich
All times are GMT +9. The time now is 10:22 PM.

 

Copyright EmailDiscussions.com 1998-2022. All Rights Reserved. Privacy Policy