|
FastMail Forum All posts relating to FastMail.FM should go here: suggestions, comments, requests for help, complaints, technical issues etc. |
15 Aug 2002, 05:05 AM | #16 |
Cornerstone of the Community
Join Date: May 2002
Location: California
Posts: 617
|
Looks like it is back up.
|
15 Aug 2002, 05:06 AM | #17 |
Senior Member
Join Date: Oct 2001
Location: Maryland, USA
Posts: 146
|
Everything seems to be back up.
|
15 Aug 2002, 05:06 AM | #18 |
Cornerstone of the Community
Join Date: Apr 2002
Location: Muscat, Oman
Posts: 551
|
It works again, suddenly.
Rakhesh |
15 Aug 2002, 05:10 AM | #19 |
Moderator
Join Date: Dec 2001
Location: Long Island, NY
Posts: 2,654
|
Yippee! It might have done one of those self-correcting things. I guess we'll find out later... or in my case, tomorrow. My computer is very, very sick and I won't have it at home tonight.
They have a few different people in different time zones to page them if things are down. I think the system will automatically page them under certain circumstances, too. |
15 Aug 2002, 05:57 AM | #20 |
Ultimate Contributor
Join Date: Sep 2001
Location: Australia
Posts: 11,501
|
The problem was caused by a horrible web-bot from AltaVista. It bounced from the login page to the signup page and back again in an infinite loop. It did this from 128 different IPs, so our normal IP throttling was defeated. The enormous number of accesses caused our session storage to fill up (we use a RAM disk for session storage for speed, but as a result it doesn't have much room). Once that filled up, no new sessions could be created, so our test script failed. The test script tried to rectify the problem, but couldn't, because the persistent AltaVista 'bot kept filling up the session storage.
We ended up getting paged 5 times: once by each front-end server, and twice by our helpful server monitors. We blocked the IP range of AV's 'bot, cleaned up the session storage, and restarted the web server. Rob is looking into how we can improve our IP tracking to handle distributed attacks better, and we've added AV's 'bot to our banned-useragents list. Oh BTW, our new server just arrived! |
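For anyone who wants to see why per-IP throttling doesn't help here, this is a rough Python sketch of the failure mode described above. It is not FastMail's actual code. Only the figure of 128 IPs comes from the post; the class names, the per-IP limit of 50, the store capacity of 5000 and the 10.0.0.x addresses are all made up for illustration.

```python
from collections import defaultdict


class SessionStore:
    """Fixed-capacity session store, standing in for a small RAM disk (assumed model)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.sessions = []

    def create(self, ip):
        if len(self.sessions) >= self.capacity:
            return False  # store is full: no new sessions can be created
        self.sessions.append(ip)
        return True


class PerIpThrottle:
    """Allows at most `limit` requests per IP (no time window, to keep the sketch short)."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, ip):
        if self.counts[ip] >= self.limit:
            return False
        self.counts[ip] += 1
        return True


def simulate(num_ips, requests_per_ip, per_ip_limit, capacity):
    """Return (sessions_created, store_full) after a crawl from num_ips addresses."""
    store = SessionStore(capacity)
    throttle = PerIpThrottle(per_ip_limit)
    for _ in range(requests_per_ip):
        for n in range(num_ips):
            ip = f"10.0.0.{n}"  # hypothetical addresses
            if throttle.allow(ip):
                store.create(ip)
    return len(store.sessions), len(store.sessions) >= store.capacity


# One misbehaving client is capped by the throttle long before the store fills.
single = simulate(num_ips=1, requests_per_ip=1000, per_ip_limit=50, capacity=5000)
# The same crawl spread over 128 IPs stays under each per-IP limit but fills the store.
spread = simulate(num_ips=128, requests_per_ip=1000, per_ip_limit=50, capacity=5000)
print(single)  # (50, False)
print(spread)  # (5000, True)
```

Each address stays within its own limit, so the throttle never sees anything wrong, yet the total is more than the store can hold. Tracking by IP range or by user agent, as mentioned in the post, counts the crawler as one client again.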
15 Aug 2002, 06:54 AM | #21 | |
Essential Contributor
Join Date: Apr 2002
Posts: 326
|
Quote:
Cheers, Mike |
|
15 Aug 2002, 07:50 AM | #22 |
Cornerstone of the Community
Join Date: May 2002
Location: California
Posts: 617
|
Jeremy,
Just curious, why would this have hung the IMAP servers, and not just the web proxies? Also, how did server2 remain immune? Thx. |
15 Aug 2002, 08:21 AM | #23 |
Cornerstone of the Community
Join Date: Apr 2002
Location: Germany
Posts: 693
|
oOo, I think I can make an educated guess here:
They probably have the whole of /var (apart from some subdirs, most notably /var/spool, which they probably have extra disks for) as the RAM disk. Basically nothing works anymore if /var is full, as this is where most applications store their status information. |
15 Aug 2002, 08:34 AM | #24 |
Ultimate Contributor
Join Date: Sep 2001
Location: Australia
Posts: 11,501
|
Nice try, but you guessed wrong!...
The problem was that the corrective action taken by the test script included trying to restart the IMAP server. This failed to complete within the timeout because so much other stuff was going on, so the system just paged us instead of going any further. We left IMAP down for a few minutes while we checked the source of the problem. So IMAP wasn't down as long as the web interface: only between the test script trying and failing and paging us, and us deciding it was safe to bring it up again. |
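The "try to fix it, page a human if the fix overruns" behaviour described above can be sketched roughly like this in Python. This is only a guess at the general shape, not the real test script. The function name `monitor`, the `restart_imap` label, the 30 second timeout and the fake clock are all assumptions. The sketch also measures how long a fix took after the fact instead of killing it when the timeout expires, which a real script would probably do.

```python
import time


def monitor(check, fixes, page, timeout_s, clock=time.monotonic):
    """Run check; on failure try each fix in order, paging if one fails or overruns."""
    if check():
        return "ok"
    for name, fix in fixes:
        started = clock()
        succeeded = fix()
        elapsed = clock() - started
        if not succeeded or elapsed > timeout_s:
            # Stop here and hand over to a human rather than going any further.
            page(f"{name} did not complete within {timeout_s}s")
            return "paged"
    if check():
        return "recovered"
    page("service still failing after corrective action")
    return "paged"


pages = []
ticks = iter([0.0, 45.0])  # fake clock: the restart "takes" 45 seconds

result = monitor(
    check=lambda: False,                     # the login test keeps failing
    fixes=[("restart_imap", lambda: True)],  # hypothetical corrective step
    page=pages.append,
    timeout_s=30,
    clock=lambda: next(ticks),
)
print(result, pages)  # paged ['restart_imap did not complete within 30s']
```

Because the script stops at the first step that overruns, whatever that step was in the middle of restarting stays down until someone answers the page, which matches IMAP being left down for a few minutes.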
15 Aug 2002, 10:59 AM | #25 |
Cornerstone of the Community
Join Date: Apr 2002
Location: Germany
Posts: 693
|
I can't always be right *grin*
|