Redundant Mail Relays with Pine, Eudora and Outlook

We have two WAN links and are attempting to achieve some redundancy for our outgoing SMTP mail. We have two relay servers; call them mailserver and mailbackup:

mailserver    IN  A   192.168.1.1
              IN  MX  10 mailserver
              IN  MX  20 mailbackup
mailbackup    IN  A   192.168.1.2

First Try, with MX Records

mailserver has an MX record pointing to itself and a less preferred MX record pointing to mailbackup. Mail clients are instructed to use mailserver as the relay for outgoing mail.

The idea is that if mailserver goes down, the less preferred MX record will be used to send mail out via mailbackup. This works well for Unix mail programs such as Pine 4.10, which hands off its outgoing mail to sendmail, which in turn switches from mailserver to mailbackup with only a five-minute delay.
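The failover that sendmail performs here can be sketched roughly as follows. This is only an illustration in Python, not sendmail's actual logic: the MX records are hard-coded rather than looked up in DNS, a bare TCP connection stands in for a real SMTP delivery attempt, and the names order_by_preference, first_reachable and MX_RECORDS are invented for the example.

```python
import socket


def order_by_preference(mx_records):
    """Return host names sorted so the lowest MX preference value comes first."""
    return [host for _, host in sorted(mx_records, key=lambda rec: rec[0])]


def first_reachable(hosts, port=25, timeout=10.0, connect=socket.create_connection):
    """Try each host in order; return the first that accepts a TCP connection."""
    for host in hosts:
        try:
            conn = connect((host, port), timeout)
        except OSError:
            continue  # this relay is down or unreachable, so try the next one
        conn.close()
        return host
    return None  # no relay answered


# Hypothetical (preference, host) pairs mirroring the zone data above.
MX_RECORDS = [(20, "mailbackup"), (10, "mailserver")]
```

Calling first_reachable(order_by_preference(MX_RECORDS)) would try mailserver first and fall back to mailbackup only if the connection fails. The timeout value is an assumption and is unrelated to the five-minute delay observed with sendmail.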

But the PC programs don't work so well. Pine 4.32, Eudora Pro 5.0, and Outlook Express 5.0 all ignore the MX records. Microsoft Outlook 5.0 is even worse. In our tests, messages sent while mailserver was down stay in the outbox and are not sent later when it recovers. This is true even if one hits "send/receive". If you right-click on the outbox entry you are told "This message did not get sent". You can forward the message, and it will go out promptly, suggesting that Outlook isn't even trying to send the original.

The Eudora web site says that the Mac version has a "UseMX" option that is off by default, but that page doesn't specify the treatment of MX records in the PC version.

The Microsoft web site says that Outlook Express 4.0 follows MX records and uses the A record only if no MX record is usable, but that OE 4.5 reverses the selection and looks for MX records only if the A record is unusable. As far as I can tell there is no mention of how 5.0 uses MX records, but it looks to us like it ignores them entirely.

Both web sites suggest that waiting for MX records that may not exist is tiresome and that the vendor is doing the user a favor by ignoring them. The documented behavior of OE 4.5 would seem to be an excellent compromise, but it did not carry over to the later version.

At the Pine web site there is a comment from Mark Crispin (one of the Pine developers):

The brief answer [is] that the requirement for MX records has to do with MTA (Mail Transfer Agent, a.k.a. "mailer") interchange with other MTAs. Pine's SMTP code is not, and is not intended to be, an MTA. It is merely mechanism for queueing a message to an MTA. There are no RFC requirements on a mechanism for queueing to an MTA.

This describes PC-Pine, but we don't have problems with Pine on Unix. This is because of the favorable interaction with sendmail acting as the MTA (which would obey the MX directive) before the message is transferred to the relay server. In any case, with Unix Pine, messages do seem to get through even if the least cost mail relay is down.

Given the content of the quote from Crispin, I was surprised to learn from an anonymous posting to comp.mail.pine (Dec 11, 2000) that Pine users can specify a list of SMTP servers (such as "mailserver,mailbackup") in the smtp-server parameter field of the Pine setup, but that failover takes one minute, which is too long for interactive users.
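The idea behind such a server list, trying each relay in turn with a timeout short enough for an interactive user, can be sketched in Python with the standard smtplib module. This is not Pine's code; the function names, the 5-second timeout and the smtp_factory parameter (which exists only so the function can be exercised without a live server) are assumptions made for illustration.

```python
import smtplib


def parse_server_list(value):
    """Split a comma-separated list such as "mailserver,mailbackup"."""
    return [name.strip() for name in value.split(",") if name.strip()]


def submit_with_failover(servers, sender, recipients, message,
                         timeout=5.0, smtp_factory=smtplib.SMTP):
    """Send through the first server that accepts the message; return its name."""
    last_error = None
    for host in servers:
        try:
            # The timeout bounds how long a dead server can stall the user.
            with smtp_factory(host, 25, timeout=timeout) as smtp:
                smtp.sendmail(sender, recipients, message)
            return host
        except (OSError, smtplib.SMTPException) as exc:
            last_error = exc  # remember the failure and try the next server
    raise RuntimeError(f"no SMTP server accepted the message: {last_error}")
```

A client built this way would fail over after roughly one timeout period per dead server, so the choice of timeout is a trade-off between responsiveness and tolerance for a slow but working relay.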

A brief search through RFC 821 and RFC 974 does not provide any basis for distinguishing MUAs from MTAs. If an MUA uses POP or IMAP to send mail, I wouldn't expect it to conform to RFC 974. But when it uses port 25 to transfer mail to an SMTP server, I do not think it unreasonable to expect it to obey the rules in RFC 974 for SMTP clients. Nevertheless, the trend is against my expectation, as illustrated by this text from RFC 2821, the newly accepted revision of RFC 821 that also obsoletes RFC 974:

Many mail-sending clients exist, especially in conjunction with facilities that receive mail via POP3 or IMAP, that have limited capability to support some of the requirements of this specification, such as the ability to queue messages for subsequent delivery attempts. For these clients, it is common practice to make private arrangements to send all messages to a single server for processing and subsequent distribution. SMTP, as specified here, is not ideally suited for this role, and work is underway on standardized mail submission protocols that might eventually supercede the current practices. In any event, because these arrangements are private and fall outside the scope of this specification, they are not described here.

This would not prevent a user agent from following MX records, but I accept that there are disadvantages to that approach.

In private correspondence Brian Stafford has suggested to me that RFC 2476 (Message Submission) is intended to cover this. That document describes an ESMTP-like agent running at port 587 but does not mention any procedure for interacting with redundant servers. In any case, while RFC 2476 is supported by Sendmail (since 8.10) on the server side, I am unable to locate any client support at all. [Note added 8/6/2002: It seems that most MUAs will accept an SMTP server specification including a port number, e.g. mailserver.example.com:587, and that is sufficient for a client to support RFC 2476.]
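A server specification of the form mentioned in the note can be split into host and port with a few lines of Python. This is a minimal sketch, not taken from any particular mail client: the function name and the default of port 25 are assumptions, and IPv6 literals are not handled.

```python
def parse_server_spec(spec, default_port=25):
    """Split "host" or "host:port" into a (host, port) tuple."""
    host, sep, port_text = spec.rpartition(":")
    if not sep:
        # No colon present: the whole string is the host name.
        return spec, default_port
    return host, int(port_text)
```

With this, "mailserver.example.com:587" yields the submission port 587, while a bare "mailserver.example.com" falls back to port 25.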

John C. Klensin, one of the authors of RFC 2476, writes:

MX records, with a different set of rules about same-preference and different-preference servers, and ways to specify those preferences, really are the right way to handle this parallel and redundant server situations. But, as you have noticed, many mail clients don't do MX lookups for submission servers.

Second Try, with Multiple A Records

Reorganize the DNS zone file so that there is a single hostname with two A records:

mailserver    IN  A   192.168.0.1
mailserver    IN  A   192.168.0.2

Now clients will receive both numeric addresses when they look up mailserver.example.org. The order will vary, since BIND "round-robins" the addresses. The Windows "ping" command confirms that both addresses are sent and available to applications on the PC. According to RFC 1123, Section 2.3, "Applications on Multihomed hosts":

When the remote host is multihomed, the name-to-address translation will return a list of alternative IP addresses. As specified in Section 6.1.3.4, this list should be in order of decreasing preference. Application protocol implementations SHOULD be prepared to try multiple addresses from the list until success is obtained. More specific requirements for SMTP are given in Section 5.3.4.

Of course it only says "SHOULD", and we are abusing this section since our multiple A records point to different hosts rather than a single multihomed host, but it was worth testing. As long as both hosts are functioning, everything is fine. What happens if one address points to a nonfunctioning server? In that case, Eudora seems to send the message without noticeable delay to the remaining server, but both Microsoft products simply hold the mail in their outbox forever.
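The behavior that RFC 1123 recommends, trying each returned address until one succeeds, can be sketched in Python as below. This is an illustration only and says nothing about how Eudora implements it; the function names and the 5-second timeout are assumptions. Python's own socket.create_connection already loops over the resolved addresses when given a host name, so the explicit loop here is just to make the technique visible.

```python
import socket


def resolve_all(host, port):
    """Return every distinct socket address the resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canonname, sockaddr in infos:
        if sockaddr not in addresses:
            addresses.append(sockaddr)
    return addresses


def connect_to_any(addresses, timeout=5.0, connect=socket.create_connection):
    """Try each address in turn; return the first successful connection."""
    last_error = None
    for sockaddr in addresses:
        try:
            # Only the (ip, port) part of the socket address is needed here.
            return connect(sockaddr[:2], timeout)
        except OSError as exc:
            last_error = exc  # dead server: fall through to the next address
    raise OSError(f"all addresses failed: {last_error}")
```

A client would call connect_to_any(resolve_all("mailserver.example.org", 25)) and then speak SMTP over the returned socket. Because the order of addresses varies with round-robin DNS, roughly half of the attempts would hit the dead server first and pay one timeout before succeeding.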

Conclusion

At the current time, it seems that Eudora for the client together with multiple A records for the server is the only working redundant combination. At our site, users get to pick their mail client, so this helps only a fraction of our users.

Should all clients support multiple A records? I don't see any disadvantage myself. However, Klensin writes:

The difficulty with doing this sort of thing with multiple A RRs associated with the same label is that such a setup is assumed to be multiple addresses for the same host, i.e., a multihomed setup. The rules for what one does when a Qtype=A request is made and produces multiple addresses are in the TCP spec and are further discussed in RFC 1122. My assumption is that, were we to repeat that discussion --or, worse, vary from it for the SMTP submit case-- for a revision of 2476, the IESG would make us take the text back out.

The major stumbling block in requiring mail clients to support a redundancy feature is that requirements for Mail User Agents don't belong in standards for Mail Transfer Agents or Mail Submission Agents. I can see that point, but of course those standards don't preclude support either, and it would not be a feature that would ever interfere with any mandated behavior.

libESMTP, a library for mail submission, apparently will support multiple A records in the future, and I salute Brian Stafford for responding so positively to my posted plea.

Andrew Filip has noted in comp.mail.sendmail that sendmail allows a "fallback list" in the mailertable. If delivery to the first element fails, then sendmail tries the remaining elements at once. The syntax is:

domain1.com smtp:relay1.:relay2.:relay3.

This offers another way to specify inter-MTA redundancy, but doesn't help with MUA-to-MTA reliability.

For any developers who are listening, we note that there is a new DNS record type, "SRV", documented in RFC 2782, that would be an alternative for providing redundancy in this situation. SRV records have been supported in BIND since at least 8.2.1, so any client supporting them would be immediately useful.
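To give a sense of what an SRV-aware client would have to do, here is a simplified Python sketch of ordering SRV records for connection attempts: lower priority values are tried first, and within one priority the choice is random in proportion to weight. It is not a complete implementation of the RFC 2782 selection rules, the records are hard-coded rather than fetched from DNS, and the record values and names below are hypothetical.

```python
import random


def order_srv(records, rng=random):
    """Order (priority, weight, port, target) tuples for connection attempts."""
    ordered = []
    for priority in sorted({rec[0] for rec in records}):
        group = [rec for rec in records if rec[0] == priority]
        while group:
            # Pick one record at random, with probability proportional to weight.
            total = sum(rec[1] for rec in group)
            pick = rng.uniform(0, total)
            running = 0
            for rec in group:
                running += rec[1]
                if running >= pick:
                    break
            group.remove(rec)
            ordered.append(rec)
    return ordered


# Hypothetical records: (priority, weight, port, target).
SRV_RECORDS = [
    (20, 0, 25, "mailbackup.example.org"),
    (10, 0, 25, "mailserver.example.org"),
]
```

A client would then attempt each target in the returned order, which gives the same primary and backup behavior as the MX records in the first try, while equal priorities with nonzero weights would spread load across servers.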

Daniel Feenberg
feenberg of nber dot org


A comment on this piece:

From alex at alex.org.uk Thu Sep 6 06:54:56 2001
Date: Sun, 26 Aug 2001 17:09:52 +0100
From: Alex Bligh <alex at alex.org.uk>
To: Daniel Feenberg <feenberg at nber.org>
Cc: Alex Bligh <alex at alex.org.uk>
Subject: Re: Nanog posting

Daniel,

This is a different (and simpler) problem than the current NANOG flame war, because your outbound SMTP relays are likely to be topologically close to your users, and certainly are likely to be in the same AS number.

You can achieve redundancy of stateless services (like caching DNS for example) by using the 'anycast hack'. IE give each server an (identical) secondary address (say A.B.C.D), and insert this into the routing table (locally - do NOT export) as a /32, with the next hop pointing to each of the real IP addresses. Make your BGP agent insert the /32's only if the DNS service is up. Hey presto, instant redundancy, and also (to boot) localization of caching nameservers, and no problems about configuring user addresses.

This is slightly harder for stateful services, including anything running over TCP (like outbound MX as you describe), because servers going down will kill the session (not too bad), but also servers coming UP can kill other sessions (very bad).

Two techniques I've achieved with success here are to delegate the zone outbound.smtp.foo.net to a custom DNS server, serving RR's with low TTLs in a round robin manner, but only including those servers which are up at the time. Sadly some client stacks ignore the TTL, meaning they can get a transient failure. Some client s/w responds badly to a transient failure in a manner where it can become a permanent failure (as you discovered). However, there is only a certain amount of client stupidity one can work around.

The other, more typical solution is to use some form of L3 switch, that tests the service is up. Nice and simple, and no custom hardware. But your servers need to be local. You can buy n>1 L3 switches, and have them in m<<n separate redundant clusters with m different outbound IP addresses again rotating in round robin.

Further, you can combine the above two techniques. This gets around L3 switch failures. If you really like tweaking stuff, you can tweak the DNS server to give a result (by default) giving the cluster closest to the user looking things up.

Alex

Last Modified 8 June 2002