|
FastMail Forum All posts relating to FastMail.FM should go here: suggestions, comments, requests for help, complaints, technical issues etc. |
|
Thread Tools |
24 Aug 2020, 01:21 PM | #46 | ||
Essential Contributor
Join Date: May 2018
Posts: 478
|
Quote:
Update: The other day, just for fun, I reopened my old ticket on the removal of the Spam Score just to tell them screw it and I had long since had a workaround. Just got a reply form them and here's what they said: Quote:
Last edited by xyzzy : 24 Aug 2020 at 01:29 PM. |
||
17 Jun 2023, 01:33 AM | #47 |
Essential Contributor
Join Date: Jan 2008
Posts: 378
|
@xyzzy
Thanks so much for providing this code. Just for us mere mortals can understand too, do I have to insert my spam threshold somewhere in the code or is that pulled directly from my spam settings in my FM account? So, currently I have my spam settings on "custom" and "move messages with score of 4.5 or higher to spam". Or do I just leave the code as is? And sorry if this is a stupid question |
17 Jun 2023, 06:37 PM | #48 | |
Essential Contributor
Join Date: Dec 2017
Location: Scotland
Posts: 492
|
Quote:
The disadvantage is that if they ever change the name of their variable, then your code would then be referring to a no-longer-set variable. It would be safer for you to invent your own variable. Except ... that then you have to /remember/ that your sieve code uses your variable which has nothing to do with the one in the GUI. Personally I'd use my own variable. I also wouldn't give any of my own variables such "obvious" names as eg "subject" or "spam_score" but instead names like "jn_saved_subject" and "jn_saved_spamscore" and so on, as it's unlikely FM would ever name any of their variables with a "jn_" prefix (those are my initials). If you look at FM's generated code you'll see that they don't have a sensible naming convention for all of their variables - some of them have names starting L0_ L1_ L2_ ...L99_ etc but they also use eg "deletetotrash", "flagged", "hasmailbox", "processimip_outcome", "read", "skipinbox", "spam" and "stop". Or even, if I had many pieces of sieve code I might name them by the date I wrote the code, so eg "jn_20230617_saved_subject" and "jn_20230617_saved_spamscore", or instead of date put a "change number" into the varnames (if I had a change history describing all the changes I'd made to the code and wanted to find all the variables related, say, to my 93rd change to that code). As well as reducing the chance of a nasty accident, using variables with my own prefixes means I can find them all (in a long script) by searching for "jn_" in my text editor. |
|
17 Jun 2023, 07:44 PM | #49 |
Essential Contributor
Join Date: Dec 2017
Location: Scotland
Posts: 492
|
I've just raised a support ticket asking Fastmail to add a fixed prefix of some sort to all their generated variable names.
|
17 Jun 2023, 07:46 PM | #50 | |
Essential Contributor
Join Date: Jan 2008
Posts: 378
|
Quote:
As I'm not a coder I just try to follow examples and instructions to find out by trial and error how things work, which can be frustrating a lot of times I currently have this in the FM generated part: Code:
if anyof(header :value "gt" :comparator "i;ascii-numeric" "X-Spam-score" "4", header :regex "X-Spam-score" "^4\.[2-9]$") { fileinto "\\Junk"; stop; Code:
if allof(string :is "${ADD_SCORE}" "true", anyof(header :value "gt" :comparator "i;ascii-numeric" "X-Spam-score" "${ST1}", header :regex "X-Spam-score" "^${ST1}\.[${ST2}]$")) { # X-Spam-score >= SPAM_THRESHOLD if header :matches "Subject" "{SPAM*} *" { # "Add score" is working if string :is "${known_sender}" "true" { # if known sender... deleteheader "Subject"; # ...strip "{SPAM xx.x}" from Subject line addheader "Subject" "${2}"; # X-Spam-score still shows the spam score } } elsif allof(string :is "${known_sender}" "false", # FM's "Add score" is not working or off header :matches "Subject" "*") { # don't have to worry about known senders set "Subject" "${1}"; if header :matches "X-Spam-score" "*" { set "spam_score" "${1}"; if header :value "lt" :comparator "i;ascii-numeric" "X-Spam-score" "4" { set "spam_score" "0${1}"; # add leading 0 for spam scores < 10 } } deleteheader "Subject"; # prefix Subject with "{SPAM xx.x}" addheader "Subject" "{SPAM ${spam_score}} ${subject}"; } else { # known sender && spam && no {SPAM*} prefix # as in the 1st case X-Spam-score still shows the actual spam score } } # ADD_SCORE && spam Code:
"X-Spam-score" "4" Code:
"^4\.[2-9]$") { |
|
17 Jun 2023, 08:35 PM | #51 |
The "e" in e-mail
Join Date: May 2003
Location: mostly in Thailand
Posts: 3,095
|
I have not carefully looked at your full code, but that regular expression is carefully allowing for a spam score that is is a decimal number between 4.2 and 4.9. I am surprised it is necessary, but presumably testing found that the comparison against 4 was only reliable for an X-Spam score of 5 and above. It suggests you need to be careful with your own comparisons. I think my own approach would be try to use Fastmail's comparisons, using them to set a flag that I use later rather than trying to change between GT and LT comparisons.
|
17 Jun 2023, 08:54 PM | #52 | |
Essential Contributor
Join Date: Jan 2008
Posts: 378
|
Quote:
Code:
"^4\.[2-9]$") { |
|
17 Jun 2023, 10:17 PM | #53 |
Essential Contributor
Join Date: Dec 2017
Location: Scotland
Posts: 492
|
I think before we go too much further it would help if @xyzzy commented.
I think there's a (small) bug (or a bad comment perhaps) in the code posted on 24 Aug 2020. To: qwertz123456 - the "gt" or "lt" things indicate if a comparison is being made of one value "gt" ie greater than another, or "lt" less than. The sieve code that has stuff like :value "gt" :comparator "i;ascii-numeric" "X-Spam-score" "4" in it compares a value extracted from a header with another value. In this case it compares the value from the header "X-Spam-score", with 4, and the test succeeds if the value in the header is gt (ie greater than) 4. However this test only looks at the integer part (the bit to the left of a decimal point) of a number. So if it compares "4.7" with 4 it would think they were the same (because it only extracts the "4" from "4.7", and then tests that 4 with the 4 specified at the end of the test. And a value like that 4.7 certainly wouldn't be considered greater than 4. So the tests also use a different technique to look at anything that might have been to the right of a decimal point. This is where I think xyzzy may have made an error. He(?) describes code that he didn't show that chops up a threshold value, described as "x.yy" (but shown as an example as 6.2) into two parts. He puts the "x" part into a variable named ST1 and the "yy" part (modified a bit) into ST2. ST2 is described as then containing "yy-9". This is fine if "yy" was just a single digit, so eg chopping "4.7" up would produce "4" in ST1 and "7-9" in ST2. But if the bit after the dot had 2 or more digits, eg if the threshold value was "4.75", then ST2 would end up as "75-9" ... and I'm not sure what that would do (because it's syntactically invalid in the regex it's later used in, I think). The reason there's a dash in ST2 is that in the test that uses its value later on the character after the decimal point in a number in the "X-Spam-score" header is tested to see if it matches (say) "5-9" and that is a shorthand way of saying if it is a 5 or a 6 or a 7 or an 8 or a 9. That aside, to use this code you also have to (a) set, probably at the very start of your code, the variable SPAM_THRESHOLD (b) have some code that chops SPAM_THRESHOLD up into two variables ST1 and ST2. (c) have a variable named ADD_SCORE and have that set true if you want this code to do anything (d) you need to correct an error you introduced in two lines; Code:
if header :value "lt" :comparator "i;ascii-numeric" "X-Spam-score" "4" { set "spam_score" "0${1}"; # add leading 0 for spam scores < 10 |
17 Jun 2023, 10:26 PM | #54 |
Essential Contributor
Join Date: Dec 2017
Location: Scotland
Posts: 492
|
Also ...the first thing that the snippet of FM-generated code shows is that Fastmail do NOT put a variable in their script that contains your spam threshold. Instead they generate that first test, knowing what your threshold is. Those lines:
Code:
if anyof(header :value "gt" :comparator "i;ascii-numeric" "X-Spam-score" "4", header :regex "X-Spam-score" "^4\.[2-9]$") { fileinto "\\Junk"; stop; - if the INTEGER part of the number in "X-Spam-score" is greater than 4 and - if the value in "X-Spam-score", examined character by character (ie not as a number) is a 4, then a dot, then a single digit in range 2-9 inclusive. This implies a threshold of 4.1 I think, BUT DOESN'T MAKE SENSE in terms of what you said earlier - that you wish mails with a score > 4.5 to be moved to the spam folder. |
17 Jun 2023, 10:33 PM | #55 | |
Essential Contributor
Join Date: Dec 2017
Location: Scotland
Posts: 492
|
Oops, and also ... When qwertz123456 asked
Quote:
YES it is important. But you don't need to put it anywhere. It is already where it needs to be, in the FM-generated code. It is part of the sieve code that does what you configured spam processing to do, ie to move some mails to the spam folder. That logic really has nothing to do with the separate piece of sieve code that adds in or in some cases removes the "SPAM n.nn" text to/from a Subject header. The only reason it was relevant was that IF that FM-generated code had contained your current threshold value in a variable, you could have chosen to use its value in the second block of code ... and the only way to find out if such a variable existed was to look at this generated code. Does that make sense? |
|
17 Jun 2023, 11:00 PM | #56 |
Essential Contributor
Join Date: Jan 2008
Posts: 378
|
Thanks. Yeah, I think it would make sense to wait for xyzzy to respond. I thought I could just copy the code insert one or two numbers and it would work. But the code seems to be much more complex than I thought.
And your explanations do make sense, but putting it all together and understanding the whole code is way beyond my pea-sized brain |
18 Jun 2023, 12:05 PM | #57 |
Essential Contributor
Join Date: May 2018
Posts: 478
|
I'll try to summarize generally on these recent posts to this thread. And I apologize in advance for the length of this post since I'm including my code and some associated explanation.
Unfortunately the UI does not have any predefined Sieve variables for any of the UI settings. The funky names generated in the Sieve code are I think generally sufficient to reduce the odds of clashing with user variable names. Regex terms like "^4\.[2-9]$" are checking for fractional values in the range specified, 4.2 through 4.9 in that example (from X-Spam-score value start, ^, to end, $). You will see this whenever the range test is for a fractional value. The comparator only tests the integer portion of the score. If the threshold is a integer (e.g., 4.0) then the fractional test is not generated because the range then would be [0-9]. It would still always work but FM doesn't generate it in their Sieve code since the comparator is sufficient. They know the value they're testing for. Since these recent posts are resurrecting an old thread some things have changed since my post #44.The primary thing is that since FM no longer supports generating the {SPAM x.xx) because they add the score in a badge to messages sent to the Spam folder, then testing for it is not necessary and simplifies my code a little. So below is my actual Sieve code related to the threshold handling. This time I am including the initialization code. This is all copy/pasted as-is directly from my Sieve script. My code is organized into a global initialization section (define constants and lists) at the start of the Script making stuff easier to find, initialized code section, and finally the code itself in the various Sieve sections provided for that. Note, my coding style is to comment a lot so I hope that helps in understanding it. First the global initialization... Code:
/*--------------------------------* | Spam Protection level settings | *--------------------------------* The following settings REPLACE FM's UI Protection level "Move" and "Permanently delete". DISABLE THOSE PROTECTION LEVEL TESTS. NOTE HOWEVER THAT PROTECTION LEVEL'S CUSTOM RADIO BUTTON MUST STILL BE CHECKED TO ENSURE THE SERVER GENERATES THE SPAM SCORE. */ set "SPAM_THRESHOLD" "5.0"; # SPAM_DISCARD > spam score >= SPAM_THRESHOLD set "SPAM_FLAGS" ""; # fileinto "\\Junk" flags # \\Seen ==> read, \\Flagged ==> pinned # "\\Seen" | "\\Flagged" | "\\Seen \\Flagged" | "" # this spam is always filed in \\Junk (same as UI) # DISABLE UI's PROTECTION LEVEL's "MOVE" set "SPAM_DISCARD" "9.2"; # spam score >= SPAM_DISCARD set "SPAM_DISCARD_FOLDER" "INBOX.Discard"; # folder for msgs in that range (null ==> discard) set "SPAM_DISCARD_FLAGS" ""; # fileinto SPAM_DISCARD_FOLDER flags # set SPAM_DISCARD to 100.0 to disable this test # DISABLE PROTECTION LEVEL's "PERMANENTLY DELETE" # The effect of these settings is identical to Spam Protection's "Move" and "Permanently # delete" when the SPAM_DISCARD_FOLDER is set to null. If it specifies a folder then the # messages are filed into there to give the opportunity to review. # Note as stated these settings replace the UI "Spam protection" settings. If it is also # desired to set the Backscatter Advanced settings that is still done from there as usual. /*------------------------------------------------------------* | ADD_SCORE - Prefix Subject with spam score: "{SPAM xx.x} " | *------------------------------------------------------------* This is a legacy setting FM used to have (it's been removed form the FM UI) but is still considered wortwhile supporting. ADD_SCORE determines whether to prefix the Subject line with the spam score as "{SPAM xx.x} ". Three possibilities are supported as shown below. */ set "ADD_SCORE" "1"; # 0 ==> do NOT prefix ANY spam subject lines # 1 ==> prefix spam that's NOT filed into Spam # 2 ==> prefix ALL SPAM with with "{SPAM xx.x} " Unlike the UI I support a SPAM_DISCARD_FOLDER. I don't trust the UI's Permanently delete where messages are simply discarded. I want to possibly review them first. So if SPAM_DISCARD_FOLDER is specified such messages are sent to that folder (presumably set to hold message a short time) giving you time to review them. Setting SPAM_DISCARD_FOLDER to null will however cause a discard just like the Permanently delete UI setting would have caused. Both folders have their respective flags setting too. For example setting SPAM_DISCARD_FLAGS to "\\Seen" would have the same effect as the UI's Spam protection "Mark spam as read" setting. Finally note that my ADD_SCORE has three settings. 0 is basically FM's way of handling things. No spam score except for showing the flag on stuff filed into the Spam folder. 1 allows adding "{SPAM xx.x} " to stuff anywhere else, specifically spam that's going into SPAM_DISCARD. Finally, for the sake of completeness, show the spam score even on those filed into the Spam folder. Never used that one but what the heck! Next the initialization code that breaks the threshold values in to variables we can test the spam score values. Code:
# # For places where the spam thresholds need to be explicitly checked define the variables # needed to do those checks as a function of the threshold values. # if string :matches "${SPAM_THRESHOLD}" "*.*" { set "ST1" "${1}"; set "ST2" "${2}-9"; } else { # should never happen if SPAM_THRESHOLD is properly formatted as "x.y" set "ST1" "0"; # set so all X-Spam-score's will always be greater than this set "ST2" "0-0"; addheader "Debug" "????????? Miss-formatted SPAM_THRESHOLD value! ?????????"; } if string :matches "${SPAM_DISCARD}" "*.*" { set "SD1" "${1}"; set "SD2" "${2}-9"; } else { # should never happen if SPAM_DISCARD is properly formatted as "x.y" set "SD1" "0"; # set so all X-Spam-score's will always be greater than this set "SD2" "0-0"; addheader "Debug" "????????? Miss-formatted SPAM_DISCARD value! ?????????"; } /* Example uses: anyof(header :value "gt" :comparator "i;ascii-numeric" "X-Spam-score" "${ST1}", header :regex "X-Spam-score" "^${ST1}\.[${ST2}]$") # X-Spam-score >= SPAM_THRESHOLD anyof(header :value "gt" :comparator "i;ascii-numeric" "X-Spam-score" "${SD1}", header :regex "X-Spam-score" "^${SD1}\.[${SD2}]$") # X-Spam-score >= SPAM_DISCARD For some reason FM only uses a single back slash to escape the dot. It should really be two back slashes but use only single back slash as illustrated to be consistent with FM's "mistake". Doesn't matter here because X-Spam-score is always generated syntactically correct. */ Code:
# # For unknown senders prefix Subject with "{SPAM xx.y} " if X-Spam-score >= SPAM_THRESHOLD... # # FM used to do this under the Spam Protection's "Add Score" option but they have since # removed that option. [8/24/20] # # Note that the spam subject lines for unknown senders are always prefixed here if ADD_SCORE # calls for it. Later, at appropriate points in the code, ADD_SCORE may still call for taking # it back out, e.g., for messages being filed into the Spam folder when ADD_SCORE is 0. But # that's not known at this point. # if allof(not string :is "${ADD_SCORE}" "0", # spam score prefix is to be added string :is "${known_sender}" "false", # but only for unknown senders anyof(header :value "gt" :comparator "i;ascii-numeric" "X-Spam-score" "${ST1}", header :regex "X-Spam-Score" "^${ST1}\.[${ST2}]$"), header :matches "Subject" "*") { set "subject" "${1}"; # X-Spam-score >= SPAM_THRESHOLD if header :matches "X-Spam-score" "*" { # prefix Subject with "{SPAM xx.y} " set "spam_score" "${1}"; if header :value "lt" :comparator "i;ascii-numeric" "X-Spam-score" "10" { set "spam_score" "0${1}"; # add leading 0 for spam scores < 10 } deleteheader "Subject"; addheader "Subject" "{SPAM ${spam_score}} ${subject}";# prefix Subject with "{SPAM xx.x} " } } # ADD_SCORE == 0 && spam to an unknown sender I hope all this helps clarify everything. If you really want the code I have to set known_sender let me know. Last edited by xyzzy : 18 Jun 2023 at 12:16 PM. |
18 Jun 2023, 05:16 PM | #58 | |
Essential Contributor
Join Date: May 2018
Posts: 478
|
Quote:
Looking at "normal" code that Sieve checks X-Spam-score the spam scores are always only in the form of xx.y or x.y, i.e., only one decimal place. There cannot be a 2-digit fractional part. Values range from 0.0 to 99.9 (Spam Protection's UI settings enforce this too). Admittedly I am a little lax in the internal error checks. Thinking about this a bit I probably should change the test that extracts the values from the score to a more rigorous regex as follows: Code:
if string :regex "${SPAM_THRESHOLD}" "^([0-9]{1,2})\\.([0-9])$" SPAM_DISCARD is similar but allow 3 digit integers (e.g., 100.0) to disable discards since spam scores are always <100.0. Last edited by xyzzy : 18 Jun 2023 at 07:29 PM. |
|
18 Jun 2023, 07:29 PM | #59 |
Essential Contributor
Join Date: Dec 2017
Location: Scotland
Posts: 492
|
Thanks to @xyzzy for such thorough answers.
I am pleased to see someone else who writes the sort of comments I put in my own code. I don't class this as "a lot"; I class nearly every other programmer's code as "inadequately commented". I do have an observation and some questions though. The first is that for code for one's own use one doesn't have to be terribly careful. I would fully expect that if xyzzy changes this code, he(?) pays full attention to the implications of what gets changed. But when code is made available for other people to use one can't be certain that they will concentrate as much, and/or understand what they are doing. That is, I think this publicly-shared code might need to be written so it doesn't assume parameters are configured correctly. In the code that splits up a configured SPAM_THRESHOLD or SPAM_DISCARD value, the "else" action is to set values to 0. While for SPAM-THRESHOLD that means that later on every email will satisfy the "is my spam score greater than 0" test and thus have the spam score added to its subject, that's unlikely to matter. But the same thing would happen with SPAM_DISCARD and thus every one of a user's emails would be sent to the discard folder. I think that the score values set here should be SD1/ST1 as 9999 (ie much larger than likely to occur) and SD2/ST2 as a regex that will never match. I'd probably also set an error flag (which I'd have initialised to a default error message, clear only if parameters were ok, and set to a specific error message if specific errors were found), and only execute the following code if the error flag had been cleared. I /would/ set both the shouldn't match SD1/ST1/SD2/ST2 values AND the error message even if in theory only the test values or error message needs set, just to be on the safe side (if someone else changes later code without thinking or I change it without paying enough attention). It's also not enough to check that individual parameters have valid values. You need to use them only if the combinations of parameters make sense. Something like not string :is "${ADD_SCORE}" "0" is safe for xyzzy's own code (because he will be careful when setting ADD_SCORE's value in the first place), but not safe (albeit the risk of the condition accidentally being true is low for this specific action) for a user who might not have set ADD_SCORE at all or mis-set it. This "not string" is a shorthand for ADD_SCORE being set to 1 or 2, but really you should explicitly check that the value IS 1 or 2, not that it's not 0. Lots of things are not 0, but only two things are either 1 or 2. I understand completely why you "addheader" error messages into the mails - but a user who never looks at headers won't ever see them (until, maybe, directed to look for them by you). For a user who screws up using code like this, a safer action might be the addheader plus stopping the sieve script completely (so that it can't take any unintended action). Also... irrelevant in a way bearing in mind what I posted above about the "else" values of SD2 and ST2, but I was curious why you set those to "0-0". That makes no sense in regex terms; why not just code "0"? |
18 Jun 2023, 08:49 PM | #60 | ||||||||
Essential Contributor
Join Date: May 2018
Posts: 478
|
Quote:
Quote:
By the way, it's "he"! Quote:
Quote:
I think your 9999 idea is a good one. In fact 100.0 is sufficient since spam scores never exceed 99.9. In the comments for SPAM_DISCARD I already say setting it to 100.0 effectively disables it. Same idea would work for SPAM_THRESHOLD. So I can also use that fact for the error recovery setting ST1/SD1 to 100. Doesn't matter what ST2/SD2 are, Quote:
Quote:
Quote:
Be it all bypassed or doing what I suggest still leaves the problem on how to error report. Short of adding a header, thinking out loud, I guess I could add a prefix to the subject just like the the spam score. That would show up on the subject line. Not sure that's very good either. Quote:
|
||||||||