Recently I needed to parse Outlook emails to extract some values so that automated tests could pass multifactor authentication. I was hoping for a naïve implementation in JavaScript but could not find a reliable solution there, so I searched for a good library in Java. I was not even surprised that there were several solutions for parsing Outlook msg files. Java truly has a library for everything.
I chose the Auxilii msgparser library, as it seemed like the easiest one to use.
I added it via Maven:
<dependency>
    <groupId>com.auxilii.msgparser</groupId>
    <artifactId>msgparser</artifactId>
    <version>1.1.15</version>
</dependency>
Usage is then straightforward:
Message parsedMessage = new MsgParser().parseMsg(msgFile.getInputStream());
String body = parsedMessage.getBodyText();
List<Attachment> attachments = parsedMessage.getAttachments();
Please be aware that Outlook on macOS does not use the msg format for its emails. Emails exported on a Mac are eml files. Those are exported in plain text, so they can be parsed via regex just by reading the file.
The whole code supporting both formats would look like this:
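To make the regex idea concrete, here is a minimal sketch of pulling a verification code out of the text. It assumes the code is a standalone run of exactly six digits; that format, the class name `VerificationCodeExtractor`, the method `extractCode`, and the sample text are all hypothetical, so adjust the pattern to whatever your emails actually contain.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class VerificationCodeExtractor {

    // Assumption: the code is exactly six digits, not part of a longer number.
    private static final Pattern CODE = Pattern.compile("(?<!\\d)(\\d{6})(?!\\d)");

    // Returns the first six-digit run found in the text, if there is one.
    static Optional<String> extractCode(String body) {
        Matcher matcher = CODE.matcher(body);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Hypothetical sample text standing in for the content of an eml file.
        String sample = "Subject: Sign-in\r\n\r\n"
                + "Your verification code is 482913. It expires in 10 minutes.";
        System.out.println(extractCode(sample).orElse("no code found")); // prints 482913
    }
}
```

Keep in mind that a raw eml file also contains the mail headers, and the body may be encoded (for example as base64 or quoted-printable), so a pattern anchored on the surrounding sentence is likely safer than a bare digit match.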
String body = "";
if (file.getName().endsWith(".msg")) {
    // Outlook msg file: parsed by msgparser
    Message parsedMessage = new MsgParser().parseMsg(file);
    body = parsedMessage.getBodyText();
} else if (file.getName().endsWith(".eml")) {
    // Outlook on macOS: plain-text eml file, read as-is
    body = new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
}
// here parse your body
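Checking only the file extension can go wrong when a file is named msg but does not contain the OLE2 compound format that msgparser reads through Apache POI. As a rough sketch, you could look at the first eight bytes instead. The class `MailFileSniffer` and its methods are hypothetical names for illustration, not part of msgparser, and `readNBytes` assumes Java 11 or newer.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class MailFileSniffer {

    // OLE2 compound file signature: the first eight bytes of a binary msg file.
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // True when the given bytes start with the OLE2 signature.
    static boolean hasOle2Signature(byte[] header) {
        return header.length >= OLE2_MAGIC.length
                && Arrays.equals(Arrays.copyOf(header, OLE2_MAGIC.length), OLE2_MAGIC);
    }

    // Reads only the first eight bytes of the file and checks them.
    static boolean looksLikeMsg(Path path) throws IOException {
        try (InputStream in = Files.newInputStream(path)) {
            return hasOle2Signature(in.readNBytes(OLE2_MAGIC.length)); // Java 11+
        }
    }

    public static void main(String[] args) throws IOException {
        // A text file that only pretends to be msg by its extension.
        Path sample = Files.createTempFile("sample", ".msg");
        Files.write(sample, "--_004_plain text content".getBytes(StandardCharsets.US_ASCII));
        System.out.println(looksLikeMsg(sample)); // prints false
        Files.delete(sample);
    }
}
```

With a check like this, the branch that calls `parseMsg` could be taken only when `looksLikeMsg` returns true, and everything else could fall back to being read as plain text.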
If this is interesting to you, you can follow me on Twitter.
Top comments (1)
I am getting the error below, even though I am using the same code snippet you mentioned above. Any idea why?
org.apache.poi.poifs.filesystem.NotOLE2FileException: Invalid header signature; read 0x615F3430305F2D2D, expected 0xE11AB1A1E011CFD0 - Your file appears not to be a valid OLE2 document
Invalid header signature; read 0x615F3430305F2D2D, expected 0xE11AB1A1E011CFD0 - Your file appears not to be a valid OLE2 document
inside load messages
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:151)
at org.apache.poi.poifs.storage.HeaderBlock.<init>(HeaderBlock.java:117)
at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:285)
at com.auxilii.msgparser.MsgParser.parseMsg(MsgParser.java:159)
at com.auxilii.msgparser.MsgParser.parseMsg(MsgParser.java:138)
at com.aspose.email.examples.email.Email_Parse.loadMessages(Email_Parse.java:36)
at com.aspose.email.examples.email.Email_Parse.getMessages(Email_Parse.java:114)
at com.aspose.email.examples.email.Email_Parse.main(Email_Parse.java:25)