EmailDiscussions.com


JamesHenderson 2 Sep 2019 10:28 PM

Sieve Rule weirdness
 
Hi all,

Can anyone please tell me what I am doing wrong, such that this email:
notifications-noreplyATlinkedin.com
goes into my subscriptions folder despite these rules:

elsif
anyof(
address :matches "From" "*[.@]linkedin.com",
)
{
fileinto "INBOX.work";
}
elsif
anyof(
exists "List-Unsubscribe",
body :text :contains "unsubscribe",
body :text :contains "Newsletter",
)
{
fileinto "INBOX.subscriptions";
}

I thought the use of elsif meant that a subsequent elsif isn't tested if the previous if or elsif is successful.

thanks,
James.

SideshowBob 3 Sep 2019 02:36 AM

Firstly I don't think you can have a trailing comma on the end of a list.

Secondly, I don't see anything in RFC 5228 to suggest that [.@] is supported.

JamesHenderson 3 Sep 2019 03:28 AM

Quote:

Originally Posted by SideshowBob (Post 611361)
Firstly I don't think you can have a trailing comma on the end of a list.

Thanks for replying. This is the sieve generated by Fastmail themselves - I put the rule into their GUI and they generated that comma.


Quote:

Originally Posted by SideshowBob (Post 611361)
Secondly, I don't see anything in RFC 5228 to suggest that [.@] is supported.

yeah, maybe that's the problem, but their website says they support the regex extension. I read it here, but I guess that's the wrong paper as I can now see it was a draft. I'll look up RFC 5228 and have a read - thanks!

SideshowBob 3 Sep 2019 03:40 AM

Quote:

Originally Posted by JamesHenderson (Post 611362)
Thanks for replying. This is the sieve generated by Fastmail themselves - I put the rule into their GUI and they generated that comma.

I just put such a comma into my script and it created a syntax error.

Quote:

yeah, maybe that's the problem, but their website says they support the regex extension. I read it here, but I guess that's the wrong paper as I can now see it was a draft. I'll look up RFC 5228 and have a read - thanks!

:matches doesn't use a regular expression, it's a basic glob. You need :regex.
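
A minimal sketch of that difference, assuming the regex extension is available as discussed above; the pattern is only illustrative and is not what Fastmail's GUI generates. In :matches only * and ? are wildcards, so [.@] is compared as those literal characters, whereas under :regex it is a character class:

```sieve
require ["fileinto", "regex"];

# With :matches, "*[.@]linkedin.com" would only match an address that
# literally contains the four characters "[.@]".
# With :regex, "[.@]" means "a dot or an at-sign".
# The dot in "linkedin.com" is escaped; the backslash is doubled because
# it sits inside a Sieve quoted string.
if address :regex "From" ".*[.@]linkedin\\.com$" {
    fileinto "INBOX.work";
}
```

With the :matches version the first test fails for an address such as notifications-noreply@linkedin.com, so the following elsif is still evaluated, which would explain the message landing in INBOX.subscriptions.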

JamesHenderson 3 Sep 2019 03:52 AM

Hi,

Thanks for the fast replies.

Quote:

Originally Posted by SideshowBob (Post 611363)
I just put such a comma into my script and it created a syntax error.

Commas are only needed when you have more than one line. The last line of my real list does not have a comma, but I removed that line for privacy, thereby creating the error. I should have written:

anyof(
address :matches "From" "*[.@]linkedin.com" #no comma here
)
and

anyof(
exists "List-Unsubscribe", #comma needed to separate
body :text :contains "unsubscribe", #comma needed to separate
body :text :contains "Newsletter" #no comma here
)

Quote:

Originally Posted by SideshowBob (Post 611363)
:matches doesn't use a regular expression, it's a basic glob. You need :regex.

Yes, just worked that out - thanks!

xyzzy 3 Sep 2019 07:08 AM

Commas are only needed to separate multiple tests, i.e., a list of tests. It has nothing to do with multiple lines. You can put the list items on multiple lines, single lines, any combination.

I keep looking at the construct,

:matches "From" "*[.@]linkedin.com"

So is the intent here to look for anything or nothing followed by a dot or @ followed by linkedin.com? What's wrong with just "*@linkedin.com"? Personally I never used square brackets within a globbing expression so that's what threw me off.

For matches, brackets are also for specifying a list of alternatives, i.e., ["a", "b", "c"]. In regex brackets enclose a list of characters or character ranges. But as shown that is not a valid regex expression.

The equivalent regex to a matches "*@linkedin.com" would be ".*@linkedin.com" or if you're a purist maybe ".+@linkedin.com".

SideshowBob 3 Sep 2019 10:11 AM

Quote:

Originally Posted by xyzzy (Post 611368)
So is the intent here to look for anything or nothing followed by a dot or @ followed by linkedin.com? What's wrong with just "*@linkedin.com"?

Presumably the point is to match any address on linkedin.com including subdomain addresses (without matching other domains that end in linkedin.com).

Quote:

The equivalent regex to a matches "*@linkedin.com" would be ".*@linkedin.com".

The equivalent would be: "@linkedin\\.com$"

Note the double backslash. An extended regular expression would only need one to escape the dot, but the backslash character itself needs to be escaped in sieve strings if it's not escaping a double quote.
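
A minimal sketch of the two layers of escaping described above, based on the quoted-string rules in RFC 5228. The rule is only an illustration: it reuses the folder name from the original post and assumes, as the post above implies, that the regex may match anywhere in the address rather than having to cover all of it:

```sieve
require ["fileinto", "regex"];

# As written in the script:      "@linkedin\\.com$"
# After Sieve string parsing:     @linkedin\.com$    <- what the regex engine sees
#
# With a single backslash ("@linkedin\.com$"), "\." is an undefined escape
# in a Sieve quoted string and is read as a plain ".", so the regex engine
# would see an unescaped dot that matches any character.
if address :regex "From" "@linkedin\\.com$" {
    fileinto "INBOX.work";
}
```

So the doubled backslash is not regex syntax at all: the first backslash belongs to the Sieve string, and only the second one reaches the regular expression.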

xyzzy 3 Sep 2019 12:03 PM

Quote:

Originally Posted by SideshowBob (Post 611370)
Presumably the point is to match any address on linkedin.com including subdomain addresses (without matching other domains that end in linkedin.com).

Ok, that may be the intent, but it doesn't work. Shell globbing may support a set of alternative characters in square brackets, but sieve does not (or at least sieve tester does not; I can't be sure anymore that sieve tester is a true representation of FM's sieve these days, see below). To sieve, [.@] is what I originally thought, i.e., the literal sequence of characters [.@]. And if you are correct that it should only match on linkedin.com domains, then the matches argument needs to be,

["*@linkedin.com", "*@*.linkedin.com"]

Quote:

The equivalent would be: "@linkedin\\.com$"

As a regex that will never work. That does not test for an email address local part (part before the @). So the complete regex would be

".+@(.*\\.)?linkedin\\.com$"

taking into account that case you mentioned for the matches, i.e., any email address whose parent domain is linkedin.com.

At this point then the regex is probably more concise than the matches.

Of course in the real world, IMO, a matches on "*@*linkedin.com" is probably "good enough".;)

Note, I created the following sieve tester script for these cases:

Code:

require ["regex", "fileinto"];

#if header :regex "From" ".+@(.*\\.)?linkedin\\.com$" {
if header :matches "From" ["*@linkedin.com", "*@*.linkedin.com"] {
  fileinto "match";
}

And created a From: xxx line in the email section to test out various cases.

JamesHenderson 3 Sep 2019 05:02 PM

Thanks for your reply!

Quote:

Originally Posted by xyzzy (Post 611368)
Commas are only needed to separate multiple tests, i.e., a list of tests. It has nothing to do with multiple lines. You can put the list items on multiple lines, single lines, any combination.

You are quite right - I put each test on a separate line to keep it tidy; I meant test when I wrote line.

Quote:

Originally Posted by xyzzy (Post 611368)
:matches "From" "*[.@]linkedin.com"

So is the intent here to look for anything or nothing followed by a dot or @ followed by linkedin.com? What's wrong with just "*@linkedin.com"? Personally I never used square brackets within a globbing expression so that's what threw me off.

Yes, I am looking at either of the following:
name@domain.tld
name@subdomain.domain.tld

Quote:

Originally Posted by xyzzy (Post 611368)
For matches, brackets are also for specifying a list of alternatives, i.e., ["a", "b", "c"]. In regex brackets enclose a list of characters or character ranges. But as shown that is not a valid regex expression.

Oh. This worked in regexr.com and I am sure that's how Bron wrote it in a piece of sieve code he gave me some years ago that I no longer have.

Quote:

Originally Posted by xyzzy (Post 611368)
The equivalent regex to a matches "*@linkedin.com" would be ".*@linkedin.com" or if you're a purist maybe ".+@linkedin.com".

"." and "@" are my alternatives.

JamesHenderson 3 Sep 2019 05:12 PM

Quote:

Originally Posted by xyzzy (Post 611372)

".+@(.*\\.)?linkedin\\.com$"

I get that I forgot to escape the ".", but don't get either "\\" - could you please explain that as I would have thought the following would have worked:
Code:

.+@(.+\.)?linkedin\.com$

cheers,
James.

btw, I am not actually looking for subdomains of LinkedIn - it was just an example to use.

[edit: added my proposed version]

JamesHenderson 3 Sep 2019 05:21 PM

Quote:

Originally Posted by xyzzy (Post 611372)

Code:

require ["regex", "fileinto"];

#if header :regex "From" ".+@(.*\\.)?linkedin\\.com$" {
if header :matches "From" ["*@linkedin.com", "*@*.linkedin.com"] {
  fileinto "match";
}

And created a From: xxx line in the email section to test out various cases.

Why header, not address?

cheers,
James.

xyzzy 3 Sep 2019 06:51 PM

Quote:

Originally Posted by JamesHenderson (Post 611373)
Oh. This worked in regexr.com and I am sure that's how Bron wrote it in a piece of sieve code he gave me some years ago that I no longer have.

If that is in reference to using that [.@] in matches then try it in that sieve tester. Apparently sieve's support of globbing is more restrictive. It didn't work when I tried it in sieve tester.

Quote:

Originally Posted by JamesHenderson (Post 611374)
I get that I forgot to escape the ".", but don't get either "\\" - could you please explain that as I would have thought the following would have worked:
Code:

.+@(.+\.)?linkedin\.com$

My ".+@(.*\\.)?linkedin\\.com$" means either nothing between the @ and the linkedin.com or something ending in a dot before the linkedin.com. This was because the conversation diverted into allowing subdomains before the linkedin.com. Now that I look at mine again I suppose I should have used .+@(.+\\.)?linkedin\\.com. You would end up with this in generated sieve code if you typed yours (single backslash) as an organize rule. I tend to think of it as it appears in the actual sieve code.

And yes, I really meant address, not header. My mind has a tendency to mean one thing and write another too.
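
A sketch of the tester script with that correction applied, i.e. address in place of header. The address test compares only the address itself, whereas header compares the whole field value, which may also contain a display name and angle brackets. The folder name "match" is the tester's own, as above, and the commented-out regex uses the revised (.+\\.)? form:

```sieve
require ["regex", "fileinto"];

# "address" looks only at the address part of From, e.g. user@linkedin.com,
# so patterns ending in "linkedin.com" are not defeated by a trailing ">".
#if address :regex "From" ".+@(.+\\.)?linkedin\\.com$" {
if address :matches "From" ["*@linkedin.com", "*@*.linkedin.com"] {
    fileinto "match";
}
```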

JamesHenderson 3 Sep 2019 07:04 PM

Quote:

Originally Posted by xyzzy (Post 611376)
If that is in reference to using that [.@] in matches then try it in that sieve tester. Apparently sieve's support of globbing is more restrictive. It didn't work when I tried it in sieve tester.



My ".+@(.*\\.)?linkedin\\.com$" means either nothing between the @ and the linkedin.com or something ending in a dot before the linkedin.com. This was because the conversation diverted into allowing subdomains before the linkedin.com. Now that I look at mine again I suppose I should have used .+@(.+\\.)?linkedin\\.com. You would end up with this in generated sieve code if you typed yours (single backslash) as an organize rule. I tend to think of it as it appears in the actual sieve code.

And yes, I really meant address, not header. My mind has a tendency to mean one thing and write another too.

Thanks, xyzzy.

I got the gist of your code (thanks), but my question was specifically why two backslashes were needed in succession. I can see that Fastmail translates my single backslash into two, but why? ...I cannot see any reference to double backslashes having a special meaning in regex (it seems to me that the first backslash escapes the second).

thanks for being so helpful :-)

hbs 3 Sep 2019 09:20 PM

Quote:

I got the gist of your code (thanks), but my question was specifically why two backslashes were needed in succession. I can see that Fastmail translates my single backslash into two, but why? ...I cannot see any reference to double backslashes having a special meaning in regex (it seems to me that the first backslash escapes the second).

Can't help with the why, but FM needs the double backslashes. This means any regex created in a regex tool has to be modified in order to work in sieve. And vice versa when taking a regex from FM's sieve.

That's why I stay away from regex in sieve whenever possible.

What about this alternative approach?

Code:

require ["fileinto", "body"];

if anyof(
        address :domain :matches "From" ["linkedin.com", "*.linkedin.com"]
){
        fileinto "INBOX.work";
}
elsif anyof(
        exists "List-Unsubscribe",
        body :text :contains "unsubscribe",
        body :text :contains "Newsletter"
){
        fileinto "INBOX.subscriptions";
}


JamesHenderson 3 Sep 2019 09:26 PM

thanks, hbs.

I was not aware of :domain, but that is certainly a robust way of doing it. Whilst I was trying to implement a rule, I was also trying to use it as a way to further my regex/sieve knowledge.

Knowing that "this is how fastmail does it" is a bit of a relief as I could not find documentation anywhere for the double backslash.

By the way, would the line not be:
Code:

        address :domain :matches "From" ["@linkedin.com", "*.linkedin.com"]
(I added an @ before the first LinkedIn)

cheers,
James.
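
On that last question: RFC 5228 defines :domain as the part of the address to the right of the "@", so the value being compared never contains an "@" and a pattern of "@linkedin.com" would not be expected to match. A minimal sketch, keeping hbs's patterns as posted (the subdomain named in the comment is a made-up example):

```sieve
require ["fileinto"];

# For user@linkedin.com,   :domain yields "linkedin.com".
# For user@e.linkedin.com, :domain yields "e.linkedin.com" (hypothetical subdomain).
# Neither value includes the "@", so the patterns do not include it either.
if address :domain :matches "From" ["linkedin.com", "*.linkedin.com"] {
    fileinto "INBOX.work";
}
```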


All times are GMT +9.