MS Outlook to Unix Mailbox Conversion mini HOWTO

Greg Lindahl, lindahl@pbm.com

v1.4, 2004-01-08
This MiniHowto covers conversion of old email in Microsoft Outlook (not Outlook Express!) to typical Unix file formats.

1. Introduction

While several programs exist to convert some formats such as Microsoft Outlook Express to Unix formats, Outlook users have a bit more of a challenge. One way to convert uses Mozilla Mail under Windows; another involves a more complicated method. Both are explained in this miniHOWTO.

The database format that Outlook uses for .PST files, called Jet, is documented at:

http://msdn.microsoft.com/library/techart/olexcoutlk.htm

1.1 Copyright

Copyright (c) 2001-2004 by Greg Lindahl

Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.1 or any later version published by the Free Software Foundation; with no Invariant Sections, with no Front-Cover Texts, and with no Back-Cover Texts. A copy of the license may be found at:

http://www.gnu.org/copyleft/fdl.html

I request that corrections and/or comments be forwarded to the document maintainer. If you're considering making a derived work other than a translation, I request that you discuss your plans with the current maintainer.

1.2 Disclaimer

Use the information in this document at your own risk. I disavow any potential liability for the contents of this document. Use of the concepts, examples, and/or other content of this document is entirely at your own risk.

All copyrights are owned by their owners, unless specifically noted otherwise. Use of a term in this document should not be regarded as affecting the validity of any trademark or service mark.

Naming of particular products or brands should not be seen as endorsements.

You are strongly recommended to take a backup of your system before major installations, and backup your system at regular intervals.

Do not place your cat in a running microwave oven.

1.3 News

1.01: Fixed minor typo in inetd/xinetd startup instructions.

1.1: Added information about Kmailcvt, Mozilla, and how to get Outlook to talk to IMAP servers if it's in Exchange mode.

1.2: Added details about using Mozilla to do this same task

1.2.1: Fixed formatting typo.

1.2.2: Relicensed under the GFDL, more minor typo fixes.

1.3: Yet more minor fixes.

1.4: Information about TNEF stuff from Scott Phelps (thanks!)

1.4 Other ways of doing this

A list of conversion utilities, many commercial, may be found at:

http://www.emailman.com/conversion/index.html

I've had a claim that the program Kmailcvt now converts Outlook mailboxes. However, I'm not 100% sure that this claim is true, since Kmailcvt definitely converts Outlook Express mailboxes, which are completely different from Outlook mailboxes. So, if you figure this out, please let me know.

2. Converting using Mozilla Mail

I've also heard that Mozilla Mail under Windows can convert Outlook mailboxes -- but the only solid report I've gotten was that attachments were not converted, so if they matter to you, don't use this method.

There is some documentation on the Ximian Evolution website. I haven't tried this since I don't have a Windows machine anymore. But, here's what they say to do:

Run Mozilla Mail

Go to "File > Import" and select that you wish to import mail from Outlook. When that's done, you're almost there.

I believe that Mozilla stores all its mailboxes as Unix mbox format files, even under Windows. So, all you have to do is transfer those files to your Linux box. You can find them in:

C:/windows/Application Data/Mozilla/Profiles/default/XXX/Mail/imported.mail/, where XXX will be some collection of digits. If there are multiple users on your Windows machine, "default" will instead be your username.

As I mentioned earlier, I've never tried this method, so I don't know how well it works. In particular, the issues raised in section 3.4 probably also apply to this method of conversion.

By the way, the reason that Mozilla can only read Outlook mailboxes under Windows is because it calls a Microsoft DLL to do it.

3. Converting using IMAP

3.1 Preparation

First, be sure you want to do things this way. In particular, section 2 explains how to use Netscape Mail under Windows to do the coversion. That's easier than doing it this way.

If you decide to do it this way, you need to make sure that your copy of Outlook can talk to IMAP servers. When I first wrote this HOWTO, I assumed that Outlook was Outlook was Outlook. Nope. If you are using Outlook in Corporate/Workgroups mode (which talks to Exchange) instead of Internet Mode (which talks to POP/IMAP servers), you'll have to change modes.

To test to see if you're OK, look at the Outlook "Tools" menu, and see if there is an "Accounts" item. If there is, you're OK, and you can go on to section 3.2. If there isn't, go find your towel, and keep reading.

Changing modes is not trivial, but a helpful reader (Matt Huyck) sent in the following instructions, which look dangerous, and which I have not tested:

Select "Options..." from the "Tools" menu, and then go to the "Mail Services" tab in the resulting Options dialog box. At the bottom of that tab there is a button labeled "Reconfigure Mail Support..." Hold your breath and then click it. A new window comes up with "Outlook 2000 Startup" in the title bar and an "E-mail Service Options" heading. There are two relevant radio buttons: "Internet Only" and "Corporate or Workgroup". Change over to "Internet Only" and click "Next >". You are then prompted with a very long message box which you should read carefully because you're about to make a significant (but reversible) change to the Exchange client configuration. If you haven't passed out already, you can stop holding your breath now. Although it doesn't explicitly say so, you will want to be sure you have a copy of the Microsoft Office install CD before you click "Yes". Click "Yes" and let Outlook do its thing for a few seconds until it has quit completely. Open up Outlook again. This is where you may be prompted for the install CD. After the re-configuration is complete you're ready to proceed with step 3.2 of the HOWTO.

To get back to your original Outlook configuration, follow the same directions, but you'll obviously be clicking the *other* radio button.

One other thing is different if you've been using Exchange. As you point out at the end of section 3.4, "the original 'From ' line" is not preserved. For Exchange users, however, the comment "Fortunately you don't actually need that information" doesn't quite apply. The "From" header that's missing is the only one that contains the identity of the sender in messages that were sent on an internal Exchange server, i.e. messages that didn't pass through an Internet gateway anywhere. I have preserved my "From" headers by saving copies of my mail folders as text files through the "Import and Export..." command from the "File" menu. I plan to hack out some twisted Perl/Grep code that will re-insert those "From" headers into my Linux mbox files. If I get that to work I'll let you know.

3.2 Install an IMAP server (temporarily!) on your Linux box

Installing things varies from Linux distribution to distribution, so I will use RedHat 7.0 as an example. First you need to install the correct package, which generally is named "imap". or "uw-imap" or something like that:

 cd /home/redhat-7.0-cd/RedHat/RPMS
 ls *imap*
 rpm -i imap*

Actually, since I had a "workstation" install, I also had to install the xinetd package; rpm gave me an error which told me to do this. And, of course, it was on the second CD of RedHat 7.0. Debian users using "apt-get" don't have to worry about such issues.

Next, we need to enable the imap server. On my Gnome-basekd desktop this is done using the graphical tool at Start -> System Settings -> Server Settings -> Services, or you can do it by hand. This is usually controlled by a line in the file /etc/inetd.conf:

 #imap    stream  tcp     nowait  root    /usr/sbin/tcpd        /usr/sbin/imapd

The above line is commented out; remove the leading # sign. Alternately, if your system uses xinetd, instead edit /etc/xinetd.d/imap and change "disable=yes" to "disable=no".

Then restart inetd or xinetd by doing:

 /etc/rc.d/init.d/inetd restart

or

 /etc/rc.d/init.d/xinetd restart

If all else fails, reboot.

You don't actually want to leave the IMAP server enabled for that long. This server runs as root and has had security bugs in the past. For this reason, you shouldn't leave it enabled unless you wish to use it permanently. We will disable this server in section 3.5.

In order to connect Outlook to this IMAP server, you will need to know the name or IP address of the Linux box.

3.3 Connect your Outlook client to the server

In order to copy over all our email to the server, we need to tell your Outlook client about this new server. Select "Accounts..." from the "Tools" menu, and then "Add" a new account "Mail...". The important items are that the server uses IMAP to download email, that the incoming mail server is the name or IP address of your Linux box from section 3.2, and the username and password should be your username and password on the Linux box. (As usual, it's a bad idea to use the root account on Linux for this purpose.)

Once you've hit "Finish", set this new account to be the default by highlighting it and clicking on "Set as Default". Outlook should connect to your IMAP server, and the name of your IMAP server should appear at the bottom of your folder list. Click on it; you should see an Inbox folder. (Note that if /var/mail/yourusername doesn't exist on your Linux box, you won't be able to drag-and-drop any messages into your INBOX... and the error message will be confusing. However, that's not what we're going to do.)

3.4 Copy over all your email

At this point you can drag and drop entire folders of email from Outlook onto the IMAP server name. This will copy the email, including all attachments, to the Linux box. Unfortunately it also immediately deletes it from Outlook. In order to copy items without deleting them, right-click on the folder name and select the "Copy" option. For the destination, pick your Linux server at the bottom of the list.

However, life isn't quite that simple. Outlook supports folders containing folders which also contain messages. Linux IMAP servers do not support that (at least those using Mailbox format); a folder is either a regular file containing messages, or a directory containing subdirectories and files. So if you have folders in Outlook with both messages and subfolders, you can't copy the entire tree over to the Linux IMAP server. Another incompatibility of the Linux IMAP server is that you have to tell it in advance if a new folder will contain subfolders or messages. You do this by appending a slash (/) to the folder name when you create it. This slash will disappear when the folder is created.

So, in order to copy a tree of folders to the Linux IMAP server, first you need to create a replica of the structure of your existing folders on the Linux IMAP server. While you're doing this, note which of the existing folders contain both subfolders and messages. You will need to move these messages elsewhere. Once you have the overall tree created, then you can copy or move groups of folders to the Linux IMAP server.

One final incompatibility to note is that the Linux IMAP server doesn't allow folders with slashes (/) in their name. You'll need to rename such folders before copying or moving them.

On the Linux box, folders appear as files and directories in your home directory. The format of these files is the usual Unix mail format, which most Unix/Linux mail tools either use directly or can convert to/from. Files with attachments will have MIME attachments; there is also one extra message per folder which is a (useless) header.

(One piece of data which doesn't get preserved is the original "From " line, which contains the envelope address of the email. Fortunately you don't actually need that information.)

An additional wrinkle applies regarding attachments. Microsoft sometimes bundles together several attachments into a ms-tnef attachment; TNEF stands for Transport Neutral Encapsulation Format. This attachment contains several mime-encoded attachments. I was lucky that my mail folders didn't contain any of these -- they seem to be created when people send you "Rich Text Format" email -- it contains two alternatives, one in plain text and the other a TNEF-encapsulated version of the message in rich text plus any attachments.

Fortunately, there are some ways to unpack TNEF attachments. One is the tnef project hosted on sourceforge, the other is ktnef, which is a part of KDE.

3.5 Deinstall IMAP from your Linux box

Once you've transferred all of your email, you will want to deinstall the IMAP server from your Linux box, for the security reasons mentioned earlier. This involves the same 2 steps you took to install the server:

  1. Remove the RPMs:
      rpm -e imap
    
  2. Remove the line in /etc/inetd.conf or /etc/xinetd.d/imap
  3. Restart inetd or xinetd, or reboot.

Voila! You have taken another step towards a Microsoft-free lifestyle.