Distribution of Milter responsibility ===================================== Milters look at the SMTP commands as well as the message content. In Postfix these are handled by different processes: - smtpd(8) (the SMTP server) focuses on the SMTP commands, strips the SMTP encapsulation, and passes envelope information and message content to the cleanup server. - the cleanup(8) server parses the message content (it understands headers, body, and MIME structure), and creates a queue file with envelope and content information. The cleanup server adds additional envelope records, such as when to send a "delayed mail" notice. If we want to support message modifications (add/delete recipient, add/delete/replace header, replace body) then it pretty much has to be implemented in the cleanup server, if we want to avoid extra temporary files. Network versus local submission =============================== As of Sendmail 8.12, all mail is received via SMTP, so all mail is subject to Miltering (local submissions are queued in a submission queue and then delivered via SMTP to the main MTA, or appended to $HOME/dead.letter). In Postfix, local submissions are received by the pickup server, which feeds the mail into the cleanup server after doing basic sanity checks. How do we set up the Milters with SMTP mail versus local submissions? - SMTP mail: smtpd creates Milter contexts, and sends them, including their sockets, to the cleanup server. The smtpd is responsible for sending the Milter abort and close messages. Both smtpd and cleanup are responsible for closing their Milter socket. Since smtpd and cleanup inspect mail at different times, there is no conflict with access to the Milter socket. - Local submission: the cleanup server creates Milter contexts. The cleanup server provides dummy connect and helo information, or perhaps none at all, and provides sender and recipient events. The cleanup server is responsible for sending the Milter abort and close messages, and for closing the Milter socket. A special case of local submission is "sendmail -t". This creates a record stream in which recipients appear after content. However, Milters expect to receive envelope information before content, not after. This is not a problem: just like a queue manager, the cleanup-side Milter client can jump around through the queue file and send the information to the Milter in the expected order. Interaction with XCLIENT, "postsuper -r", and external content filters ====================================================================== Milter applications expect that the MTA supplies context information in the form of Sendmail-like macros (j=hostname, {client_name}=the SMTP client hostname, etc.). Not all these macros have a Postfix equivalent. Postfix 2.3 makes a subset available. If Postfix does not implement a specific macro, people can usually work around it. But we should avoid inconsistency. If Postfix can make macro X available at Milter protocol stage Y, then it must also be able to make that macro available at all later Milter protocol stages, even when some of those stages are handled by a different Postfix process. Thus, when adding Milter support for a specific Sendmail-like macro to the SMTP server: - We may have to update the XCLIENT protocol, so that Milter applications can be tested with XCLIENT. If not, then we must prominently document everywhere that XCLIENT does not provide 100% accurate simulation for Milters. An additional complication is that the SMTP command length is limited, and that each XCLIENT command resets the SMTP server to the 220 stage and generates "connect" events for anvil(8) and for Milters. - The SMTP server has to send the corresponding attribute to the cleanup server. The cleanup server then stores the attribute in the queue file, so that Milters produce consistent results when mail is re-queued with "postsuper -r". But wait, there is more. If mail is filtered by an external content filter, then it needs to preserve all the Milter attributes so that after "postsuper -r", Milters produce the exact same result as when mail was received originally by Postfix. Specifically, after "postsuper -r" a signing Milter must not sign mail that it did not sign on the first pass through Postfix, and it must not reject mail that it accepted on the first pass through Postfix. Instead of trying to re-create the Milter execution environment after "postsuper -r" we simply disable Milter processing. The rationale for this is: if mail was Miltered before it was written to queue file, then there is no need to Milter it again. We might want to take a similar approach with external (signing or blocking) content filters: don't filter mail that has already been filtered, and don't filter mail that didn't need to be filtered. Such mail can be recognized by the absence of a "content_filter" record. To make the implementation efficient, the cleanup server would have to record the presence of a "content_filter" record in the queue file header. Message envelope or content modifications ========================================= Milters can send modification requests after receiving the end of the message body. If we can implement all the header/body-related Milter operations in the cleanup server, then we can try to edit the queue file in place, without ever having to make a temporary copy. Once a Milter is done editing, the queue file can be used as input for the next Milter, and so on. Finally, the cleanup server calls fsync() and waits for successful return. To implement in-place queue file edits, we need to introduce surprisingly little change to the existing Postfix queue file structure. All we need is a way to specify a jump from one place in the file to another. Postfix does not store queue files as plain text files. Instead all information is stored in records with an explicit type and length for sender, recipient, arrival time, and so on. Even the content that makes up the message header and body is stored as records with an explicit type and length. This organization makes it very easy to introduce pointer records, which is what we will use to jump from one place in a queue file to another place. - Deleting a recipient or header record is easy - just mark the record as killed. When deleting a recipient, we must kill all recipient records that result from virtual alias expansion of the original recipient address. When deleting a very long header or body line, multiple queue file records may need to be killed. We won't try to reuse the deleted space for other purposes. - Replacing header or body records involves pointer records. Basically, a record is replaced by overwriting it with a forward pointer to space after the end of the queue file, putting the new record there, followed by a reverse pointer to the record that follows the replaced information. If the replaced record is shorter than a pointer record, we relocate the records that follow it to the new area, until we have enough space for the forward pointer record. See below for a discussion on what it takes to make this safe. Postfix queue files are segmented. The first segment is for envelope records, the second for message header and body content, and the third segment is for information that was extracted or generated from the message header and body content. Each segment is terminated by a marker record. For now we don't want to change their location. In particular, we want to avoid moving the start of a segment. To ensure that we can always replace a header or body record by a pointer record, without having to relocate a marker record, the cleanup server always places a dummy pointer record at the end of the headers and at the end of the body. When a Milter wants to replace an entire body, we have the option to overwrite existing body records until we run out of space, and then writing a pointer to space at the end of the queue file, followed by the remainder of the body, and a pointer to the marker that ends the message content segment. - Appending a recipient or header record involves pointer records as well. This requires that the queue file already contains a dummy pointer record at the place where we want to append recipient or header content (Milters currently do not replace individual body records, but we could add this if need be). To append, change the dummy pointer into a forward pointer to space after the end of a message, put the new record there, followed by a reverse pointer to the record that follows the forward pointer. To append another record, replace the reverse pointer by a forward pointer to space after the end of a message, put the new record there, followed by the value of the reverse pointer that we replace. Thus, there is no one-to-one correspondence between forward and backward pointers! In fact, there can be multiple forward pointers for one reverse pointer. When relocating a record we must not relocate the target of a jump ================================================================== As discussed above, when replacing an existing record, we overwrite it with a forward pointer to the new information. If the old record is too small we relocate one or more records that follow the record that's being replaced, until we have enough space for the forward pointer record. Now we have to become really careful. Could we end up relocating a record that is the target of a forward or reverse pointer, and thus corrupt the queue file? The answer is NO. - We never relocate end-of-segment marker records. Instead, the cleanup server writes dummy pointer records to guarantee that there is always space for a pointer. - When a record is the target of a forward pointer, it is "edited" information that is preceded either by the end-of-queue-file marker record, or it is preceded by the reverse pointer at the end of earlier written "edited" information. Thus, the target of a forward pointer will not be relocated to make space for a pointer record. - When a record is the target of a reverse pointer, it is always preceded by a forward pointer record (or by a forward pointer record followed by some unused space). Thus, the target of a reverse pointer will not be relocated to make space for a pointer record. Could we end up relocating a pointer record? Yes, but that is OK, as long as pointers contain absolute offsets. Pointer records introduce the possibility of loops ================================================== When a queue file is damaged, a bogus pointer value may send Postfix into a loop. This must not happen. Detecting loops is not trivial: - A sequence of multiple forward pointers may be followed by one legitimate reverse pointer to the location after the first forward pointer. See above for a discussion of how to append a record to an appended record. - We do know, however, that there will not be more reverse pointers than forward pointers. But this does not help much. Perhaps we can include a record count at the start of the queue file, so that the record walking code knows that it's looking at some records more than once, and return an error indication. How many bytes do we need for a pointer record? =============================================== A pointer record would look like this: type (1 byte) offset (see below) Postfix uses long for queue file size/offset information, and stores them as %15ld in the SIZE record at the start of the queue file. This is somewhat less than a 64-bit long, but it is enough for a some time to come, and it is easily changed without breaking forward or backward compatibility. It does mean, however, that a pointer record can easily exceed the length of a header record. This is why we go through the trouble of record relocation and dummy records. In Postfix 2.4 we fixed this by adding padding to short message header records so that we can always write a pointer record over a message header. This immensly simplifies the code.