EmailDiscussions.com  

FastMail Forum All posts relating to FastMail.FM should go here: suggestions, comments, requests for help, complaints, technical issues etc.
Old 23 Apr 2002, 05:44 PM   #1
Jeremy Howard
Ultimate Contributor
 
Join Date: Sep 2001
Location: Australia
Posts: 11,501
HOORAY: IMAP bug identified!!!

<Executive Summary: IMAP server problem is diagnosed. We will have a fix within a few hours. Remainder of this post is boring technical detail...>

Fantastic--Rob and I have tracked down the IMAP bug where it stops accepting connections! We've slowly been adding more and more diagnostics, so that when our server stops responding, a huge suite of tests is run automatically before it is restarted. Today, we had a POP server outage, and the test suite provided the key information at last!

It turns out that the problem is not with the IMAP server at all, and not even with Linux, but with a bug in the TCP/IP protocol itself! Here is a response to another person's question from famous kernel hacker Alan Cox: http://www.uwsg.iu.edu/hypermail/lin...09.2/0007.html . As you can see, people were dealing with this problem back in 1995!

This problem was exploited in a notorious incident in 1996: http://cr.yp.to/syncookies.html . This incident and DJ Bernstein's solution are well known; however, the syncookie solution is recommended only for use under a denial of service attack (there are various downsides to using it all the time). In our case, we were not under a DoS attack: only some ports were impacted, only intermittently, and always at our busiest times rather than at the random times you would expect from a DoS attack.

In our case, the problem is actually due to high network load on our server, along with connections to our site from broken network clients. A description from someone else who saw this problem is here: http://groups.google.com/groups?hl=e...tel.com&rnum=4

Basically, someone with a broken network client (some old versions of Windows, I believe) tries to connect and fails to send the final "ACK" packet following our "SYN-ACK": http://groups.google.com/groups?hl=e...%40cpmsnbbsa07 . The broken client tries again, and after a few goes fills up our SYN queue for that port. It is then impossible for our server to create new connections on that port, since every new connection's SYN needs a slot in that queue while the connection is being negotiated.
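
A minimal sketch of how the half-open connections described above could be counted per port. It assumes `netstat -tn`-style output (protocol, two queue columns, local address, foreign address, state) and IPv4 `address:port` notation. The helper name `count_syn_recv` is hypothetical, and this is not the diagnostic suite mentioned in the post.

```bash
#!/bin/bash
# Count half-open (SYN_RECV) connections per local port.
# Reads netstat -tn style lines on stdin:
#   proto recv-q send-q local-address foreign-address state
# count_syn_recv is a hypothetical helper name used for illustration.
count_syn_recv() {
    awk '$6 == "SYN_RECV" {
             n = split($4, parts, ":")   # local address is ip:port; keep the port
             count[parts[n]]++
         }
         END { for (port in count) print port, count[port] }' | sort -n
}
```

Typical use would be `netstat -tn | count_syn_recv`. A persistently large count on a single port would be consistent with the full-queue symptom, though it does not by itself prove the cause.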

There is a range of fixes we can try, including decreasing the SYN timeout, increasing the SYN queue, and turning on syncookies. Fingers crossed!
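
To see why a shorter SYN timeout helps, here is a rough arithmetic sketch. It assumes that the server's first SYN-ACK timeout is 3 seconds, that the timeout doubles on every retransmission, and that the queue entry is dropped once the last retry has timed out. Those figures are assumptions for illustration and may not match a given kernel; `synack_hold_seconds` is a hypothetical helper name. On Linux the retry count corresponds to the `tcp_synack_retries` setting.

```bash
#!/bin/bash
# Rough estimate of how long one unanswered SYN occupies a queue slot.
# Assumptions (not taken from any kernel source): the first timeout is
# $2 seconds (default 3), it doubles on each retry, and the entry is
# dropped when the final retry times out.
synack_hold_seconds() {
    local retries=$1 timeout=${2:-3} total=0 i
    for ((i = 0; i <= retries; i++)); do
        total=$((total + timeout))   # wait for this attempt to time out
        timeout=$((timeout * 2))     # exponential backoff for the next one
    done
    echo "$total"
}
```

Under these assumptions `synack_hold_seconds 5` prints 189 and `synack_hold_seconds 2` prints 21, so each stuck entry would free its slot about nine times sooner with two retries than with five.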

Old 23 Apr 2002, 06:00 PM   #2
spiderman
Cornerstone of the Community
 
Join Date: Feb 2002
Location: Cork, Ireland
Posts: 802

Representative of:
Fastcheck.org
Excellent news!

As you use connection pooling between the web server and the IMAP server, it might explain why the web server was not subject to this bug as much as 'real' clients were...

SM
Old 23 Apr 2002, 06:04 PM   #3
ady
Essential Contributor
 
Join Date: Apr 2002
Location: Singapore
Posts: 482
Hey, that's good news.

No wonder I never encountered the problem: my local time is +7 while, I think, most of your users are located in the US or in nearby time zones. I am usually already in bed when they get the problem.

ady
Old 23 Apr 2002, 06:25 PM   #4
Jeremy Howard
Ultimate Contributor
 
Join Date: Sep 2001
Location: Australia
Posts: 11,501
Quote:
Originally posted by spiderman
As you use connection pooling between the web server and the IMAP server, it might explain why the web server was not subject to this bug as much as 'real' clients were...
Very true. Also, the local interface can only be accessed by our own services, which do not have a broken network stack.

I've now implemented a number of protection mechanisms, and while I was at it I implemented some network optimisations. Here's the list we are now using--any TCP/IP or Linux guru comments most welcome!:
Code:
#!/bin/bash
echo 1 > /proc/sys/net/ipv4/icmp_echo_ignore_broadcasts
echo 0 > /proc/sys/net/ipv4/conf/all/accept_source_route
echo 0 > /proc/sys/net/ipv4/tcp_timestamps
echo 1 > /proc/sys/net/ipv4/icmp_ignore_bogus_error_responses
echo 100 > /proc/sys/net/ipv4/icmp_ratelimit
echo 1 > /proc/sys/net/ipv4/ip_dynaddr
echo 1 > /proc/sys/net/ipv4/tcp_rfc1337

# NB: Only for DOS recovery: causes SMTP problems
# echo 1 > /proc/sys/net/ipv4/tcp_syncookies

echo 2048 > /proc/sys/net/ipv4/tcp_max_syn_backlog
echo 2 > /proc/sys/net/ipv4/tcp_synack_retries
echo 2 > /proc/sys/net/ipv4/tcp_syn_retries

# Only use if listening daemon can not be tuned
echo 1 > /proc/sys/net/ipv4/tcp_abort_on_overflow

# Takes care of interface-based settings
echo 1 > /proc/sys/net/ipv4/conf/all/log_martians
echo 0 > /proc/sys/net/ipv4/conf/all/accept_redirects
echo 1 > /proc/sys/net/ipv4/conf/all/rp_filter

#Reduce DoS'ing ability by reducing timeouts
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
echo 1800 > /proc/sys/net/ipv4/tcp_keepalive_time
echo 30 > /proc/sys/net/ipv4/tcp_keepalive_intvl
echo 4 > /proc/sys/net/ipv4/tcp_keepalive_probes
echo 0 > /proc/sys/net/ipv4/tcp_window_scaling
echo 0 > /proc/sys/net/ipv4/tcp_sack

# Disable ECN so that connections from broken Solaris machines are accepted
echo 0 > /proc/sys/net/ipv4/tcp_ecn

# Non-default TTL, to hide from fingerprinting tools such as nmap
echo 220 > /proc/sys/net/ipv4/ip_default_ttl

# Maximum number of packets, queued on the INPUT side, when
# the interface receives pkts faster than it can process them.
echo 4096 > /proc/sys/net/core/netdev_max_backlog

# echo 2000000 > /proc/sys/net/ipv4/tcp_max_tw_buckets

# Send and receive buffer sizes
echo 512000 > /proc/sys/net/core/wmem_default
echo 512000 > /proc/sys/net/core/wmem_max
echo 512000 > /proc/sys/net/core/rmem_default
echo 512000 > /proc/sys/net/core/rmem_max
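
Values written into /proc/sys with `echo` are lost on reboot. The same settings can usually be expressed as sysctl keys (for example in /etc/sysctl.conf), where the key is the path below /proc/sys with slashes replaced by dots. A small sketch of that mapping follows; `proc_to_sysctl` is a hypothetical helper, and it does not handle interface names that themselves contain dots.

```bash
#!/bin/bash
# Convert a /proc/sys path and a value into a sysctl.conf style line.
# proc_to_sysctl is a hypothetical helper name used for illustration.
proc_to_sysctl() {
    local path="${1#/proc/sys/}"               # drop the /proc/sys/ prefix
    printf '%s = %s\n' "${path//\//.}" "$2"    # replace every "/" with "."
}
```

For example, `proc_to_sysctl /proc/sys/net/ipv4/tcp_max_syn_backlog 2048` prints `net.ipv4.tcp_max_syn_backlog = 2048`.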
Old 23 Apr 2002, 06:46 PM   #5
jhs
Essential Contributor
 
Join Date: Jan 2002
Location: Zurich, Switzerland
Posts: 350
Great news.
Jeremy, do you think this will also solve the problem with Opera using SSL where the connection seems to be cut at more or less reproducible patterns? See for example http://www.emaildiscussions.com/...?threadid=2174

- Jan
Old 23 Apr 2002, 08:03 PM   #6
pobelly
Cornerstone of the Community
 
Join Date: Nov 2001
Posts: 586
heh... TCP/IP - designed to withstand nuclear devastation, but not 'crank calls'... (do they still call them that?)

once again, thanks for some interesting reading... this is a fun way to get an education.
Old 23 Apr 2002, 08:15 PM   #7
Jeremy Howard
Ultimate Contributor
 
Join Date: Sep 2001
Location: Australia
Posts: 11,501
Quote:
Originally posted by jhs
Jeremy, do you think this will also solve the problem with Opera using SSL where the connection seems to be cut at more or less reproducible patterns?
No. My understanding at this stage is that this is due to a bug in how Opera handles multipart encoded forms, and cannot be fixed at our end. However, I don't claim to understand this problem well enough to be definitive about this...

The change in our TCP buffer size could conceivably work around the problem--that would be a nice coincidence if it turned out to be the case...
Old 23 Apr 2002, 08:17 PM   #8
Jeremy Howard
Ultimate Contributor
 
Join Date: Sep 2001
Location: Australia
Posts: 11,501
Quote:
Originally posted by pobelly
heh... TCP/IP - designed to withstand nuclear devastation, but not 'crank calls'...
It's amazing how many core internet protocols work on the assumption that everyone plays nice...
Quote:
once again, thanks for some interesting reading... this is a fun way to get an education.
Oh dear. That kind of curiosity is meant to get killed off by the school system. Obviously you were an undisciplined child...
Old 23 Apr 2002, 08:44 PM   #9
kirill
Cornerstone of the Community
 
Join Date: Jun 2001
Posts: 879
What's the problem with having SYN cookies enabled? They are on by default on *BSDs (at least on OpenBSD and FreeBSD).

--
Kirill
Old 23 Apr 2002, 08:44 PM   #10
pobelly
Cornerstone of the Community
 
Join Date: Nov 2001
Posts: 586
Quote:
Obviously you were an undisciplined child...
well... less school-damaged than most people i know, anyway... just lucky, i guess.

Quote:
What's the problem with having SYN cookies enabled? They are on by default on *BSDs (at least on OpenBSD and FreeBSD).
yeah, finish the lesson! it's past my bedtime, and i don't want to start a research mission myself...

Last edited by pobelly : 23 Apr 2002 at 08:48 PM.
Old 23 Apr 2002, 09:06 PM   #11
Jeremy Howard
Ultimate Contributor
 
Join Date: Sep 2001
Location: Australia
Posts: 11,501
From the Linux docs:
Code:
tcp_syncookies - BOOLEAN
        Only valid when the kernel was compiled with CONFIG_SYNCOOKIES
        Send out syncookies when the syn backlog queue of a socket
        overflows. This is to prevent against the common 'syn flood attack'
        Default: FALSE

        Note, that syncookies is fallback facility.
        It MUST NOT be used to help highly loaded servers to stand
        against legal connection rate. If you see synflood warnings
        in your logs, but investigation shows that they occur
        because of overload with legal connections, you should tune
        another parameters until this warning disappear.
        See: tcp_max_syn_backlog, tcp_synack_retries, tcp_abort_on_overflow.

        syncookies seriously violate TCP protocol, do not allow
        to use TCP extensions, can result in serious degradation
        of some services (f.e. SMTP relaying), visible not by you,
        but your clients and relays, contacting you. While you see
        synflood warnings in logs not being really flooded, your server
        is seriously misconfigured.
Pobelly--aren't you joining us on IRC?
Old 23 Apr 2002, 09:21 PM   #12
pobelly
Cornerstone of the Community
 
Join Date: Nov 2001
Posts: 586
no... you'll be safe from me there.

i have a hard enough time keeping up here...
Old 23 Apr 2002, 10:04 PM   #13
kirill
Cornerstone of the Community
 
Join Date: Jun 2001
Posts: 879
Quote from http://cr.yp.to/syncookies.html:

Quote:
A few people (notably Alexey Kuznetsov, Wichert Akkerman, and Perry Metzger) have been spreading misinformation about SYN cookies. Here are some of their bogus claims:

SYN cookies ``present serious violation of TCP protocol.'' Reality: SYN cookies are fully compliant with the TCP protocol. Every packet sent by a SYN-cookie server is something that could also have been sent by a non-SYN-cookie server.

SYN cookies ``do not allow to use TCP extensions'' such as large windows. Reality: SYN cookies don't hurt TCP extensions. A connection saved by SYN cookies can't use large windows; but the same is true without SYN cookies, because the connection would have been destroyed.

SYN cookies cause ``massive hanging connections.'' Reality: With or without SYN cookies, connections occasionally hang because a computer or network is overloaded. Applications deal with this by simply dropping idle connections.

SYN cookies cause ``serious degradation of service.'' Reality: SYN cookies improve service. They do take a small amount of CPU time to compute, but that CPU time has to be spent anyway for hard-to-predict sequence numbers; see RFC 1948.

SYN cookies cause ``magic resets.'' Reality: SYN cookies never cause resets.
--
Kirill
Old 23 Apr 2002, 10:19 PM   #14
mklose
Cornerstone of the Community
 
Join Date: Apr 2002
Location: Germany
Posts: 693
If I remember rightly, SYN cookies were the default in Linux 2.0.xx and 2.2 kernels too. I am not 100% sure, but I think I remember seeing it when compiling.

As I mentioned before, another thing you could do is play a little with iptables to rate limit incoming connections. But try this out in a lab first.
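
A hedged sketch of what such a rate limit might look like. The function only prints the iptables commands so that they can be reviewed before anything is applied; `syn_limit_rules` is a hypothetical helper name, and the port, rate and burst values in the usage note are illustrative assumptions, not tested recommendations.

```bash
#!/bin/bash
# Print (do not apply) iptables rules that rate-limit new TCP connections
# to one port. syn_limit_rules is a hypothetical helper; pick the rate
# and burst values from lab testing.
syn_limit_rules() {
    local port=$1 rate=$2 burst=$3
    # Accept new connections (SYN packets) up to the given rate...
    echo "iptables -A INPUT -p tcp --syn --dport $port -m limit --limit $rate --limit-burst $burst -j ACCEPT"
    # ...and drop any SYN to that port that exceeds it.
    echo "iptables -A INPUT -p tcp --syn --dport $port -j DROP"
}
```

For example, `syn_limit_rules 143 20/second 40` prints two rules for the IMAP port. Note that the `limit` match counts all sources together, so well-behaved clients share the same budget as broken ones.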
Old 23 Apr 2002, 11:53 PM   #15
Tango
Cornerstone of the Community
 
Join Date: Mar 2002
Posts: 529
That's great news! Didn't think it would be so fast! Can I uncross my toes at least?

Last edited by Tango : 24 Apr 2002 at 12:41 AM.
All times are GMT +9.

 

Copyright EmailDiscussions.com 1998-2022. All Rights Reserved. Privacy Policy