EmailDiscussions.com  

Old 11 Jul 2011, 06:43 AM   #91
seph
Junior Member
 
Join Date: Jul 2011
Posts: 3
Thank you for the information about the outage, and I'm glad to hear you're adding a management network. (It's generally one of the first things to get built out.)

I notice that the status blog post says:

Quote:
We don’t believe any email was lost or bounced
I know the MX servers were not accepting mail for a period this morning. Whether this caused bounces or merely delayed delivery is hard to know. Somewhere around 9:00 EST this morning, I tried connecting. in1.smtp.messagingengine.com was hanging and in2.smtp.messagingengine.com was returning "451 4.3.5 Server configuration problem"
Old 11 Jul 2011, 06:46 AM   #92
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Melbourne, Australia
Posts: 2,696

Representative of:
Fastmail.fm
Quote:
Originally Posted by seph View Post
Thank you for the information about the outage, and I'm glad to hear you're adding a management network. (It's generally one of the first things to get built out)

I notice that the status blog post says:



I know the MX servers were not accepting mail for a period this morning. Whether this caused bounces or merely delayed delivery is hard to know. Somewhere around 9:00 EST this morning, I tried connecting. in1.smtp.messagingengine.com was hanging and in2.smtp.messagingengine.com was returning "451 4.3.5 Server configuration problem"
A compliant server at the other end would queue the mail and retry, hence our "not lost email" comment.
Old 11 Jul 2011, 08:29 AM   #93
Berenburger
The "e" in e-mail
 
Join Date: Sep 2004
Location: The Netherlands
Posts: 2,908
Quote:
Originally Posted by brong View Post
Sorry I haven't looked in on this thread yet. I'm supposed to be on holiday this week (first holiday in about 5 months!) and was spending the day with the kids, but Rob called me urgently needing help, and I dumped the kids on my wife for 4 hours while I helped Rob deal with it, so I haven't come back to chat on the forum until after the kids were in bed.
After more than one year there is no one from the Opera team to assist you two?

Quote:
Originally Posted by brong View Post
I'm flying to Oslo in a few weeks to meet the sysadmin team over there, just making connections at this stage - but with an eye to figuring out what sort of machines we want and how we're going to share the administration load. It will certainly be nice to have timezone coverage, so someone in Oslo deals with issues first before they wake me. Especially the easy things like changing failed disks.
Old 11 Jul 2011, 03:28 PM   #94
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Melbourne, Australia
Posts: 2,696

Representative of:
Fastmail.fm
Quote:
Originally Posted by Berenburger View Post
After more than one year there is no one from the Opera team to assist you two?
Yeah, we do have people from Opera sysadmin, and they were helping us with suggestions during this outage. In terms of knowing the architecture inside out - it's not that easy to hand over the full system to another group. We can hand over small jobs, and indeed we have - but a major outage is almost always caused by nasty interactions between systems that we work on every day, so we can deal with them quicker.

I also had a couple of people from the Opera sysadmin team working with me after the fact on trying to track down the cause, and on seeing if we can reconfigure the switches to mitigate this risk.
Old 11 Jul 2011, 06:30 PM   #95
Terry
The "e" in e-mail
 
Join Date: Jul 2002
Location: VK4
Posts: 3,029
In the old days when a server went down our mail was still forwarded, but on this recent outage no mail was forwarded to my backup mail account.

There does not seem to be the backup that FastMail once had... do they only have 2 servers now?
Old 11 Jul 2011, 07:01 PM   #96
solenoid
Junior Member
 
Join Date: Oct 2010
Posts: 8
Quote:
Originally Posted by Terry View Post
In the old days when a server went down our mail was still forwarded, but on this recent outage no mail was forwarded to my backup mail account.

There does not seem to be the backup that FastMail once had... do they only have 2 servers now?
They have two backup servers in Iceland for accepting incoming mail, presumably in case there is a problem in New York, but, for whatever reason, they could not accept mail. I believe this happened on another occasion, as well, although my understanding of this may not be correct.
Old 11 Jul 2011, 09:27 PM   #97
robert@fm
The "e" in e-mail
 
Join Date: Feb 2002
Location: London, UK
Posts: 4,681
Quote:
Originally Posted by eftertanke View Post
I just got this today:
HTML Code:
[...]
<meta http-equiv="Expires" content="Tue, 01 Jan 1981 01:00:00 GMT">
[...]
The above was already 30 years out of date when it was posted?

Quote:
Originally Posted by seph View Post
I know the MX servers were not accepting mail for a period this morning. Whether this caused bounces or merely delayed delivery is hard to know.
It's also beyond the control of FM (or of any other receiving service, as FM (usually Brong) has explained several times) -- it depends on whether the sending service handles temporary bounces correctly.
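To illustrate the sender-side behaviour this depends on: in SMTP, a 4xx reply (such as the 451 reported earlier in this thread) is a transient failure that a compliant sender should queue and retry, while a 5xx reply is permanent and leads to a bounce. A connection that hangs with no reply is normally treated as transient too. Here is a minimal Python sketch of that decision; the function name and return labels are made up for illustration and are not taken from any real mail server:

```python
from typing import Optional


def classify_smtp_reply(code: Optional[int]) -> str:
    """Return the action a compliant sending server takes for a reply code.

    `code` is the three-digit SMTP reply code, or None when the
    connection timed out before any reply arrived (hypothetical convention).
    """
    if code is None:
        return "retry"      # no reply at all: treat as transient, keep queued
    if 200 <= code < 300:
        return "accepted"   # 2xx: the receiver took responsibility for the mail
    if 300 <= code < 400:
        return "continue"   # 3xx: intermediate reply, e.g. 354 after DATA
    if 400 <= code < 500:
        return "retry"      # 4xx: transient failure, queue and try again later
    if 500 <= code < 600:
        return "bounce"     # 5xx: permanent failure, notify the original sender
    raise ValueError(f"not a valid SMTP reply code: {code}")


if __name__ == "__main__":
    for reply in (250, 451, 550, None):
        print(reply, "->", classify_smtp_reply(reply))
```

A sender that returns mail to its author on a 4xx reply instead of retrying is the misbehaving case, and the receiving side cannot prevent it.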
Old 11 Jul 2011, 09:35 PM   #98
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Melbourne, Australia
Posts: 2,696

Representative of:
Fastmail.fm
Quote:
Originally Posted by solenoid View Post
They have two backup servers in Iceland for accepting incoming mail, presumably in case there is a problem in New York, but, for whatever reason, they could not accept mail. I believe this happened on another occasion, as well, although my understanding of this may not be correct.
They collect email for a while, but at some point they realise they have been out of contact with the mothership for a while and their database is getting stale, so they stop accepting. Iceland is still not 100% viable by itself - though there are full replicas of everyone's email there. If we lost NYI fully, the only major loss would be filestorage, which is not yet replicated. We're working on making the filestorage work too.
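The "stop accepting once the database is stale" behaviour described above could be sketched roughly like this. It is only an illustration: the function name, the idea of tracking a last-sync timestamp, and the four-hour limit are all assumptions, not FastMail's actual code or threshold.

```python
from datetime import datetime, timedelta

# Assumed limit for illustration only; the real threshold is not stated.
MAX_STALENESS = timedelta(hours=4)


def backup_mx_should_accept(
    last_sync: datetime,
    now: datetime,
    max_staleness: timedelta = MAX_STALENESS,
) -> bool:
    """Accept mail only while the local copy of the user database is fresh.

    `last_sync` is when this backup MX last heard from the primary site.
    Once that is too long ago, refuse new mail rather than make delivery
    decisions from stale data.
    """
    return now - last_sync <= max_staleness


if __name__ == "__main__":
    synced = datetime(2011, 7, 11, 0, 0)
    print(backup_mx_should_accept(synced, synced + timedelta(hours=1)))
    print(backup_mx_should_accept(synced, synced + timedelta(hours=5)))
```

When the check fails, the server would presumably answer with a temporary (4xx) error, so compliant senders keep the mail queued and retry.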
Old 12 Jul 2011, 12:52 AM   #99
seph
Junior Member
 
Join Date: Jul 2011
Posts: 3
Quote:
Originally Posted by brong View Post
A compliant server at the other end would queue the mail and retry, hence our "not lost email" comment.
I've now started receiving notifications from various mailing list managers that mail was bounced. I agree mail was probably not lost, but that's not something I can tell. The blog post understates this.
Old 12 Jul 2011, 01:40 AM   #100
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Melbourne, Australia
Posts: 2,696

Representative of:
Fastmail.fm
Quote:
Originally Posted by seph View Post
I've now started receiving notifications from various mailing list managers that mail was bounced. I agree mail was probably not lost, but that's not something I can tell. The blog post understates this.
Can you forward a couple of these (including full headers) to brong at fastmail.fm, please?
Old 15 Jul 2011, 02:06 AM   #101
hmh
Junior Member
 
Join Date: Jul 2011
Posts: 2
Not subnet. Separate network.

Quote:
Originally Posted by AlexMorris View Post
You're running servers of any type, never mind blades, and you don't have the management interfaces on a completely different subnet?
That might not have made much of a difference in a packet flood scenario. It depends on what got saturated (switch/inter-switch links/NICs/operating system).

You usually run the management on an entirely separate network because of that. Even VLANs won't help you if the switch itself or the inter-switch links cannot handle the high PPS storm. Even a non-blocking switch can go down or drop too many packets when the packet flood goes through its CPU for some reason (e.g. BPDU storms, or packets not hardware-routed in a L3 switch).
Old 18 Jul 2011, 07:02 AM   #102
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Melbourne, Australia
Posts: 2,696

Representative of:
Fastmail.fm
FYI - purchase order for a separate switch per cabinet was approved while I was on vacation last week - I'm away again this week (home for a day in between, so catching up on emails!) - so we'll get the switches wired up this week and probably next week will move all the management interfaces over. First we need to get a network range routed through all the Opera VPN links for the new management network, and then redo the address allocations for all the management connections - that's going to be "fun".

So it will probably take a while of carefully reconfiguring things and then putting in tickets to get the cables moved. But at least it's a one-time job.
Old 18 Jul 2011, 09:41 PM   #103
rblon
Essential Contributor
 
Join Date: Jun 2009
Posts: 340
Quote:
Originally Posted by brong View Post
We aren't sure what caused it yet. The only thing we have that does any multicast traffic on our network is the NTP (time protocol) system. This has only been set up fairly recently to use multicast (last few weeks) but Opera uses it fine elsewhere. It's quite hard to believe that it's responsible, though I don't know what else it could be either. It's almost certainly some piece of networking hardware or software getting into an infinite loop.
Have you been able to find out what caused it?
Old 18 Jul 2011, 09:44 PM   #104
brong
The "e" in e-mail
 
Join Date: Jul 2004
Location: Melbourne, Australia
Posts: 2,696

Representative of:
Fastmail.fm
Quote:
Originally Posted by rblon View Post
Have you been able to find out what caused it?
No, we haven't. There was nothing in the logs. We're looking at how to protect the switch against broadcast and multicast packet storms. It supposedly has the ability, but it's not turned on by default, and the manual is 400 pages long! Still reading to make sure we won't break anything first.
Old 19 Jul 2011, 04:16 PM   #105
gardenweed
Cornerstone of the Community
 
Join Date: Jun 2008
Location: Perth
Posts: 664
Down again?
19-Jul-11 3:15 pm Perth

Edit
3:18pm back again.

Last edited by gardenweed : 19 Jul 2011 at 04:18 PM. Reason: Change in status
All times are GMT +9.

 
