POP3 and MIME
In the project I work on, there is a server-side component that communicates with a POP3 server. It checks for new e-mails and processes them automatically. The trickier part of the code is not the POP3 handling, but the parsing of the MIME content. There are quite a few RFCs related to MIME (see: http://lesnikowski.fm.interia.pl/Mail/mail_rfc.html). Fully supporting this format is far from trivial.
I got into the e-mail processing part of the project without planning to do so. The current implementation is based on some open-source code which a fellow developer adapted into our solution. But now the e-mail component has no specific owner since the guy moved to another project. I was the "lucky" one to notice that the processing of some e-mails caused an exception to be raised. So I began exploring the classes and later started making fixes. But three times in a row, after I made a fix, another bug was introduced. So I decided to (1) get a deeper understanding of what the code did and (2) acquaint myself better with MIME to know how the code should work. I discovered that those two didn't match quite well :) The code was just oversimplified, trying to do its job in the easiest way. Yes, it is simpler to search for "To: " directly in the entire header string, but you will not always get what you want: the same text also appears inside "Reply-To: ", and a long header value may be folded across several lines. The RFCs tell exactly what can be expected. Making bigger assumptions is just inappropriate.
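To illustrate the "To: " problem, here is a minimal sketch. It is written in Python with the standard email package only because that keeps the example short; our project is C#, and this is not our actual code. The message and the function names naive_to and parsed_to are made up for the example.

```python
from email import message_from_string
from email.utils import getaddresses

# Hypothetical message: "Reply-To" comes before "To", and "To" is folded
# over two lines, both of which the RFCs allow.
RAW = (
    "Reply-To: support@example.com\r\n"
    "To: Alice <alice@example.com>,\r\n"
    "\tBob <bob@example.com>\r\n"
    "\r\n"
    "Body text\r\n"
)

def naive_to(raw):
    """Oversimplified approach: take whatever follows the first 'To: '."""
    start = raw.index("To: ") + len("To: ")
    end = raw.index("\r\n", start)
    return raw[start:end]

def parsed_to(raw):
    """Let a real header parser find the field and unfold its value."""
    msg = message_from_string(raw)
    return [addr for _name, addr in getaddresses(msg.get_all("To", []))]

print(naive_to(RAW))   # matches inside "Reply-To: " and returns the wrong address
print(parsed_to(RAW))  # both real recipients
```

The naive version picks up the Reply-To address and would lose the second recipient even if the headers were reordered, because it stops at the first line break.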
I guess the implementation was tested with just a few Outlook e-mails with similar content. And the testing process did not include carefully prepared data (which would have enabled easily repeatable unit tests). I wonder what the best way to test such functionality is. RFC compliance is a good thing to aim for, but the more crucial goal is to understand the stuff generated by real-world e-mail clients. In our case MS Outlook will probably account for the greatest share of the received e-mails, but who knows... And since different e-mail clients most likely interpret MIME a bit differently, especially when sending more complex content (e.g. e-mail with attachments), one has to be sure that MIME parsing is done properly.
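One possible shape for such repeatable tests, again sketched in Python rather than C# and under stated assumptions: each test feeds a fixed raw message to the parser and checks exactly what comes out. Here the fixture is generated in code to keep the example self-contained; in practice it would more likely be a raw message captured from a real client and saved to a file. The names build_fixture and extract_attachments are hypothetical.

```python
import unittest
from email import message_from_bytes, policy
from email.message import EmailMessage

def build_fixture():
    """Build a fixed message with one attachment and return its raw bytes."""
    msg = EmailMessage()
    msg["From"] = "sender@example.com"
    msg["To"] = "inbox@example.com"
    msg["Subject"] = "Report"
    msg.set_content("See attached.")
    msg.add_attachment(
        b"a,b\n1,2\n",
        maintype="application",
        subtype="octet-stream",
        filename="report.csv",
    )
    return msg.as_bytes()

def extract_attachments(raw):
    """Code under test: map each attachment's filename to its decoded bytes."""
    msg = message_from_bytes(raw, policy=policy.default)
    return {part.get_filename(): part.get_content()
            for part in msg.iter_attachments()}

class AttachmentTest(unittest.TestCase):
    def test_single_attachment(self):
        self.assertEqual(
            extract_attachments(build_fixture()),
            {"report.csv": b"a,b\n1,2\n"},
        )
```

Saved as a module, this can be run with python -m unittest. Each e-mail that once broke the parser can then be added as one more fixture, so a fix cannot silently reintroduce an old bug.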
So now we have to decide whether it is worth continuing to fix and extend what we currently have, or to go find a more reliable, fault-tolerant and easily maintainable POP3 + MIME processing library. I did some research on C# code providing such functionality and found these wonderful projects:
cpSphere - GotDotNet project with a lot of contributors. Very well written, flexible, well documented. I saw that there is a GUI for manual testing. And much more... Definitely worth checking out. Looks like the leader in this field.
OpenPOP.NET - SourceForge project. Looks promising, but there has been no release since June 2004, which seems a bit dangerous to me. Has the project turned out just perfect or was it abandoned?
I found, of course, many other open-source projects and articles on the topic, but they were a bit too simplistic (I guess, much like our current implementation...)
http://lesnikowski.fm.interia.pl/Mail/mail.html - a good piece of work, but no source code is provided. Still, I like that the API is well structured and well documented, and there is some sample code. I've read some nice references about this library...
There are, of course, a few commercial products. Just to name a few:
http://www.emailarchitect.net/webapp/popcom/
http://www.chilkatsoft.com/dotNetEmail.asp
Incorporating third-party software is always preceded by the question "To build or to buy?". The problem, I think, is that not many developers ask this question, let alone research what the world has to offer :) There is another possible obstacle - when the management is asked to approve a buy decision they might not sit down and do the math, but may say: "Aren't you paid to develop software?" :) These days it is often more rational for programming to consist more of integrating components than of writing things from scratch.
P.S. If I have missed some other notable POP3/MIME project, please recommend it...
UPDATE 2005-07-11: I just realized that the sources we have been using are from a very early version of Lesnikowski's work that can be found here. That's a surprise! In fact, we kept clinging to this code, and after an accumulated total of 2 days of refactoring and fixes it seemed to work in most of the cases we tested against (our own MS Outlook mailboxes :) ). It would definitely have been wiser to switch immediately to a more mature solution... but it is never too late to do so.


