Part 4 in a series of articles on implementing a notification system using Gmail and Line Bot
Hi. Back again for another instalment.
Today I will work through accessing the parts of an email. In part 3 we got to the stage where I had created an email.message.EmailMessage
object. It may not look ready to use: if you played with it, you might have found that it still contained a lot of encoded, unreadable text where the non-ASCII characters should be. That's OK. Once we start using the methods provided by an email.message.EmailMessage
object, things will become clear.
So a quick review:
We did our search and boiled the results down to just the message ids, without the threadIds.
We then used get_message(service, msg_id)
to return an email.message.EmailMessage
object.
single_email = get_message(service, some_id)
If you print this with print(single_email)
you will get the string representation of the entire email. If it is not in ASCII, you might see a subject line that looks like this:
Subject: =?ISO-2022-JP?B?GyRCRn5CYDw8Pn//yROJCpDTiRpJDsbKEI=?=
The email body will be just as confusing. But that's OK. We will use the methods provided by email.message.EmailMessage
to get these strings returned in a readable form.
Here is a list of some of the methods we can use:
single_email.add_alternative() single_email.get_params()
single_email.add_attachment() single_email.get_payload()
single_email.add_header() single_email.get_unixfrom()
single_email.add_related() single_email.is_attachment()
single_email.as_bytes() single_email.is_multipart()
single_email.as_string() single_email.items()
single_email.attach() single_email.iter_attachments()
single_email.clear() single_email.iter_parts()
single_email.clear_content() single_email.keys()
single_email.defects single_email.make_alternative()
single_email.del_param() single_email.make_mixed()
single_email.epilogue single_email.make_related()
...
...
Headers can simply be accessed using single_email.get('headername')
Examples:
sender = single_email.get("from")  # "from" is a Python keyword, so it cannot be used as a variable name
subject = single_email.get("subject")
To check if an email is multipart, single_email.is_multipart()
will return True
or False.
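To make the header access above concrete, here is a minimal sketch. The sample message, its addresses and the variable names are made up for illustration; in the series the object would come from get_message().

```python
from email.message import EmailMessage

# Hypothetical stand-in for the single_email object returned by get_message().
sample = EmailMessage()
sample["From"] = "Door System <noreply@example.com>"
sample["Subject"] = "Entry and exit notice"
sample.set_content("Plain text body.")

# Header names are case-insensitive. get() returns None, or the default
# you pass as the second argument, when the header is missing.
sender = sample.get("from")
subject = sample.get("SUBJECT")
missing = sample.get("x-not-there", "no such header")

print(sender)
print(subject)
print(missing)
print(sample.is_multipart())  # a simple text/plain message is not multipart
```

Passing a default to get() is handy when a header such as Reply-To may or may not be present.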
There are lots of methods for deconstructing an email. Fortunately for me, the emails I am dealing with are system generated and very simple: plain text and non-multipart.
Let's look at the subject of the email.
Subject: =?ISO-2022-JP?B?GyRCRn5CYDw8Pn//yROJCpDTiRpJDsbKEI=?=
Using the get
method:
subject = single_email.get('subject')
print(subject)
I get:
入退室情報のお知らせ
Note that I didn't actually need to know the character encoding. This is because of the way the parser object was set up, using the argument policy=policy.default
in the previous post.
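A small sketch of the difference the policy makes, assuming the raw message is parsed with email.message_from_bytes(). The raw message here is built by hand (the address and the body are invented), and the subject is encoded in the code itself rather than copied from a real email.

```python
import base64
from email import message_from_bytes, policy

# Build a tiny raw message with an RFC 2047 encoded subject.
# The text is encoded as ISO-2022-JP and then base64, like the example header.
text = "入退室情報のお知らせ"
encoded = base64.b64encode(text.encode("iso2022_jp")).decode("ascii")
raw = (
    "From: noreply@example.com\r\n"
    f"Subject: =?ISO-2022-JP?B?{encoded}?=\r\n"
    "\r\n"
    "body\r\n"
).encode("ascii")

# With policy.default the header is decoded when it is read.
decoded_msg = message_from_bytes(raw, policy=policy.default)
# Without a policy argument the parser uses the older compat32 policy.
legacy_msg = message_from_bytes(raw)

print(decoded_msg["subject"])  # human-readable text
print(legacy_msg["subject"])   # still in the =?ISO-2022-JP?B?...?= form
```

With the legacy policy you would have to decode the header yourself, for example with email.header.decode_header(), which is why passing policy=policy.default saves so much work.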
So as you can see, getting the header details is pretty easy. How about getting the body of the email? Again this is pretty simple when dealing with a single non-multipart email. I will simply use get_content()
body = single_email.get_content()
print(body)
Redacted Redacted
[redacted] 様の入退室情報をお知らせします。
【セーフティメール情報】
2021-02-01 19:08:26 に退室しました。
※なお、このメールに返信することはできませんのでご注意ください
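If you want to try get_content() without a mailbox, the following sketch may help. It builds a hypothetical plain text message, serialises it, and parses it back with policy.default, which roughly mimics receiving the raw bytes; the subject and body are sample values only.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical sample standing in for a system-generated notification.
original = EmailMessage()
original["Subject"] = "Exit notice"
original.set_content("2021-02-01 19:08:26 に退室しました。")

# Serialise and re-parse to mimic working from the raw bytes of an email.
parsed = message_from_bytes(original.as_bytes(), policy=policy.default)

if not parsed.is_multipart():
    # get_content() returns a str, decoded using the charset declared
    # in the Content-Type header of the message.
    body = parsed.get_content()
    print(body)
```

The returned string ends with a newline, so a strip() is useful before comparing or matching against it.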
If you are dealing with a multipart email, you will need to use the walk()
method, combined with get_content_maintype()
and get_content_subtype()
to identify or find things like plain text, HTML or binary attachments.
Good Python documentation for this already exists, so I won't go into it in detail here.
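For reference only, here is a short sketch of what that can look like. The message is a hypothetical one built in memory with a plain text body, an HTML alternative and a small binary attachment; the variable names are my own.

```python
from email.message import EmailMessage

# Hypothetical multipart message: plain text, an HTML alternative,
# and a small binary attachment.
msg = EmailMessage()
msg["Subject"] = "Multipart example"
msg.set_content("Plain text version.")
msg.add_alternative("<p>HTML version.</p>", subtype="html")
msg.add_attachment(b"\x00\x01\x02", maintype="application",
                   subtype="octet-stream", filename="data.bin")

plain_text = None
html_text = None
attachments = []

for part in msg.walk():
    # Container parts (multipart/*) only hold other parts, so skip them.
    if part.get_content_maintype() == "multipart":
        continue
    if part.is_attachment():
        attachments.append(part.get_filename())
    elif part.get_content_maintype() == "text":
        if part.get_content_subtype() == "plain":
            plain_text = part.get_content()
        elif part.get_content_subtype() == "html":
            html_text = part.get_content()

print(plain_text)
print(html_text)
print(attachments)
```

walk() visits the top-level message first and then every nested part, which is why the container parts need to be skipped.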
That's it for this article. Next I will give some information on regex for dealing with Japanese. But you can also take a look at this earlier post.