30 Jan 2022, 12:05 PM   #12
hydrostarr
Member
 
Join Date: Jul 2003
Posts: 55
Effort of "selecting emails" might be much worse than a temp SpamAssassin server

Thanks BritTim for your continued excellent feedback!

First a disclaimer about my long note below: I'm getting all wordy and lengthy here for one main reason: I asked Fastmail tech support to read and stay updated on this EMD thread. I have confidence that BritTim and others at EMD already get where I'm coming from without having to detail everything. I do not yet have that confidence with Fastmail tech support or their systems.

Second, fyi: I switched my team's email domains from Tuffmail.net to Fastmail.com in Dec 2021.

Comments on BritTim's excellent points:

Quote:
Originally Posted by BritTim
I agree that it is important to spend some time to get this right, but careful selection is more important than throwing massive amounts of data at the problem.
Roger that. My problem: I'm a _brand new_ Fastmail user. I have few to no Fastmail-generated "false positive or negative" emails (i.e., emails that were originally misclassified).

What I do have is a *massive* number of emails (over ~14 years or so) that were categorized (many of them to undo the false positive/negative) over the years by me when Tuffmail.net hosted my email domains and service.

Further: I have little desire to take the time to figure out which email sets (from these 120k+ Tuffmail-spam-and-nonspam-trained emails) represent a better selection ("well-selected emails"), if that's what you mean. That's a huge effort (to select 1k ham and 1k spam emails from 120k+ total ham-and-spam emails), or so it seems (maybe I'm missing something? pls advise if I am).
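
For scale, the cheapest possible version of "selecting 1k ham and 1k spam" is a plain random sample, sketched below in Python. This is only a rough illustration: the function name and the folder layout (one file per message) are assumptions, and a uniform random pick is a crude stand-in for the careful selection BritTim describes, not a substitute for it.

```python
import random
from pathlib import Path


def sample_messages(folder, k, seed=0):
    """Return up to k message files picked uniformly at random from folder.

    Assumes (hypothetically) that each message is stored as its own file
    somewhere under `folder`, as in a Maildir-style export.
    """
    # Sort first so the same seed always yields the same sample.
    files = sorted(p for p in Path(folder).rglob("*") if p.is_file())
    if len(files) <= k:
        return files
    return random.Random(seed).sample(files, k)
```

Calling it once on an exported spam folder and once on a ham folder with k=1000 would give the two training sets; whether a random sample is "well selected" enough is exactly the open question.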

It seems much easier for me to whip up my own temporary SpamAssassin server, process the already-categorized emails, and hand the resulting database over to Fastmail (if Fastmail is willing to do this). Further, I can rerun this process whenever I get a large, new influx of new email characterizations (mostly to mark large sets of existing email folders from my Tuffmail days as "ham"/not-spam)... again if Fastmail is willing to play ball, or simply speed up their spam-processing a bit.
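
A minimal sketch of that temporary-SpamAssassin workflow, assuming SpamAssassin's `sa-learn` tool is installed locally. The helper names and all paths are hypothetical, and whether Fastmail could accept the resulting dump at all is unknown; this only shows the local training and export steps.

```python
import subprocess


def sa_learn_commands(spam_dir, ham_dir, db_path):
    """Build (but do not run) the sa-learn calls for training and export."""
    return [
        # Train the Bayes database on messages already classified as spam.
        ["sa-learn", "--dbpath", db_path, "--spam", spam_dir],
        # Train on messages already classified as ham (not spam).
        ["sa-learn", "--dbpath", db_path, "--ham", ham_dir],
        # Dump the trained database to stdout for backup or hand-off.
        ["sa-learn", "--dbpath", db_path, "--backup"],
    ]


def run_training(spam_dir, ham_dir, db_path, backup_file):
    """Run the training steps, then write the database dump to backup_file."""
    train_spam, train_ham, backup = sa_learn_commands(spam_dir, ham_dir, db_path)
    subprocess.run(train_spam, check=True)
    subprocess.run(train_ham, check=True)
    with open(backup_file, "wb") as out:
        subprocess.run(backup, check=True, stdout=out)
```

Rerunning `run_training` after each new batch of reclassified folders would cover the "rerun whenever I get a large influx" case, since `sa-learn` adds to an existing database.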

Quote:
The global Bayes database already provides a good baseline. If well selected (i.e. ) processing 1k spam and 1k ham to fine tune your personal Bayes will usually produce great results.
<rant #1: Fastmail has eroded my trust in them, starting immediately with the first test cases I ran against their systems>

I'm not sure if it was Bayes related or something else, but there's been a potentially-big problem I've had with existing spam classifications (on Fastmail) and/or email-delivery delays... or something. It all seems ambiguous (Fastmail tech thinks they have it under control; I do _not_ think that). This problem also happened almost immediately when I started testing my Fastmail-served domains (the first few emails I tested broke things and it's still not been "fixed"--it's been a baaaaaad experience). More on this later if the problems/symptoms remain relevant. There may be good explanations for this... or not. It depends. I've not yet decided. It's a deeper topic, and I don't have enough time to properly introduce and detail it right now. (I have interacted with "level 2 tech support" at Fastmail on this. I'm not yet satisfied. They're doing their best to assist me, I'm sure.)

The point: this Fastmail experience of mine has put a big, fat question mark in my mind on the trustworthiness of the Fastmail mx/spam/whatever-is-going-on filters.

And since 2nd-level Fastmail tech support failed to tell me -precisely- what was going on, I do not trust their explanation. Their answer seemed flippant and possibly carried a tone suggesting I was an inexperienced user. And while I'm confident it was not their intent, I felt like they blew me off (subjective assessment, granted); this came after I waited over a week to get a response from their "senior tech." I truly appreciate that they are working to do their job the best that they can. Each tech handles hundreds to possibly thousands of these inquiries a week; they do not want to have to linger or spend any more time on any point than what's needed.

Instead, what I ask is that some manager at Fastmail recognize that I'm a special user, and that they need to get me on the phone with the smartest tech-operations/developer person they've got. Please enable me to blast past all the bureaucracy and red tape. This will solve this issue with max efficiency and minimal fuss. I'm happy to pay whatever extra fees this incurs, within reason. (I've already maxed out the user account to 3 years of "Professional.") I've already offered these "extra payments."

Granted, I do recognize I'm a VERY hard-to-please customer with respect to these issues. I'm not Fastmail's average user. But I'm picky for what I think is a darn good reason: I want my email-communication systems to WORK and be reliable, else business and projects can fail. And I do not like to have to consistently revisit the question of "can I trust my email service provider to not throw my good email away." I want to kill the problem dead, once, and be done with it.

In my teams' computing worlds: there's no such thing as "mostly working." In high-level practical terms, it works or it fails.

Digital-computing systems can be treated this way if you design, test, and implement them correctly. I say this with confidence given decades of experience with all manner of implementations, whether or not the core technology was designed by my team or others. And we've designed some of the-most-complex-and-impactful technology ever built. Please do not "hand wave" over important points and details when trying to gain my trust with computing systems that you provide that effectively might be "eating my data" without my knowing it. (Again, I'm talking to you, Fastmail.)

</rant #1>


<rant #2 = comparing Tuffmail vs Fastmail filtering configurability... granted, not a fair comparison>

With Tuffmail.net: I had confidence that I knew exactly what was happening when John and Derek were running Tuffmail. eg, I knew _exactly_ which mx filter was running for every domain/email.address, because the Tuffmail management interface allowed me to program that entire configuration. I could look at the daily report log for _every single mx filter action_ and easily spot problems.

I also managed our own Sieve inbound scripts--it wasn't hard. (Fastmail's Sieve stuff seems harder; there's a more-complex existing configuration where it's less clear to me where I should input my Sieve programming, or not. Or maybe it's just "new" and I don't want to have to take the effort to figure out Fastmail's Sieve base config. ;-) ).

Tuffmail also allowed more granular control of the Bayesian spam filters (separate from the mx-level filters).

Sieve, spam-filter, mx configuration and logging, several other config options: all of this gave me tremendous confidence in Tuffmail's system behavior.

I do not yet have that confidence with Fastmail. The only filter-configuration control I seem to have is the "selected folder marked as spam or ham/non-spam" stuff on top of the "zero/small/medium/large"-ish spam-control radio buttons. Add this to the big, unexplained, "ghost" of a problem mentioned in my rant #1 (above).... and...

I'm hammering on this spam config--since it seems to be the only thing I can control with respect to filtering at Fastmail--to at least get it to the point where I'm more comfortable with its Bayesian spam-filtering, so that I can trust my email service once again (now that I've switched from Tuffmail to Fastmail in December).

</rant #2>


In short, the bottom line: I'm not yet trusting the global Bayes data running at Fastmail.

Quote:
I hope Bill comes by and adds his own thoughts. He has tuned his own account so he can safely discard virtually all spam (no false positives) while allowing almost no spam to reach the Inbox.
That sounds quite interesting, I too hope to hear from Bill. :-)

Last edited by hydrostarr : 30 Jan 2022 at 01:39 PM.