(Cover photo by Noémi Macavei-Katócz on Unsplash)
A gphotospy tutorial.
Recap
In the first tutorial we saw how to set up API Keys and Authorization. After this we could see some actual code:
- List albums: we saw how to list the albums inside our account, using Album.list()
- List elements in an album: we saw how to list all the media inside an album, given an album id, with the method Media.search_album(album_id)
- Show an image with Tk and PIL: we saw how to use Tkinter, the UI toolkit shipped with Python, to show an image retrieved from an album (using the PIL image processing library to construct the image)
In this tutorial we are going to see some ways to retrieve media inside our account.
List media
When we first open our account, media are shown by date (usually), independently of album categorization.
This corresponds to the Photos section you can see under the hamburger menu, and inside the menu as well (admittedly a bit confusing); it's called Photos but it contains both photos and videos (yes, it's confusing).
On the far right there is also a draggable timeline to search by date.
We can list the content of this section with Media.list().
This method, like Album.list(), which lists the albums, is paginated in the API. However, in gPhotoSpy this pagination happens in the background: the method simply returns an iterator.
First things first: we get the service again (using the key file and the token file, both saved in the first tutorial), and we construct the Media manager:
>>> from gphotospy import authorize
>>> from gphotospy.media import *
We select the secrets file saved beforehand (in the first tutorial):
>>> CLIENT_SECRET_FILE = "gphoto_oauth.json"
If we did not delete the .token file we will still have the authorization, so we won't need to authorize our app again.
So, we get the authorization and obtain a service object:
>>> service = authorize.init(CLIENT_SECRET_FILE)
Now, we can construct the media manager:
>>> media_manager = Media(service)
Finally, we can get an iterator over the list of Media (all the media):
>>> media_iterator = media_manager.list()
It is important to remember that if the account is connected to an Android device, the number of items is usually big, so it's better not to consume the whole iterator (with list(), for example).
Instead, let's check the first item with next():
>>> first_media = next(media_iterator)
>>> first_media
{'id': 'AB1M5bKJueYgZdoz1c5Us..... # cut out
OK, one thing is bugging me: the result is always mapped to a Python dictionary, although in reality it's JSON. When it's printed, though, it's not so pretty.
We can remedy this by mapping the media to a MediaItem object:
>>> media_obj = MediaItem(first_media)
>>> media_obj
<gphotospy.media.MediaItem object at 0x7f2de21953d0>
Now this object has interesting properties that we will see later on, but one of these is pretty-print:
>>> print(media_obj)
{
    "id": "...",
    "productUrl": "https://photos.google.com/lr/photo/...",
    "baseUrl": "https://lh3.googleusercontent.com/lr/...",
    "mimeType": "image/jpeg",
    "mediaMetadata": {
        "creationTime": "2020-05-24T11:39:32Z",
        "width": "720",
        "height": "720",
        "photo": {
            "cameraMake": "...",
            "cameraModel": "...",
            "focalLength": "...",
            "apertureFNumber": "...",
            "isoEquivalent": "...",
            "exposureTime": "..."
        }
    },
    "filename": "...jpg"
}
Internally it uses Python's json module, with these settings:
json.dumps(media, indent=4)
Beware that some information is not always present: for example, for pictures taken with a smartphone camera, the photo field inside mediaMetadata is usually an empty object.
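Since optional fields can be missing or empty, it pays to read them defensively. The following is a minimal sketch on a plain dictionary shaped like the item printed above; the helper name camera_info is hypothetical and not part of gphotospy.

```python
# Plain dictionary shaped like the media item printed above (values shortened).
sample_media = {
    "id": "...",
    "mimeType": "image/jpeg",
    "mediaMetadata": {
        "creationTime": "2020-05-24T11:39:32Z",
        "width": "720",
        "height": "720",
        "photo": {},  # empty, as it usually is for smartphone pictures
    },
    "filename": "...jpg",
}


def camera_info(media, field, default="unknown"):
    """Hypothetical helper: read a field under mediaMetadata.photo, tolerating gaps."""
    metadata = media.get("mediaMetadata", {})
    photo = metadata.get("photo", {})
    return photo.get(field, default)


camera_model = camera_info(sample_media, "cameraModel")
width = int(sample_media["mediaMetadata"]["width"])  # sizes arrive as strings
print(camera_model, width)
```

Chaining .get() with a default avoids a KeyError when the photo object is empty or absent altogether.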
Filters
Let's say we want to show all media taken today, as in the account interface. We could retrieve a list of all media and search them by creation date. However, Google's API allows us a better way: we can search media with filters.
Of course, date is one such kind of filter, however filtering is not limited to date.
Here's the list of filters available to us:
- Content Categories
- Dates and date ranges
- Media types
- Features
- Archived state
All these filters are available through Media.search(); however, there are some gotchas.
We will try them all, starting with dates.
Dates
The basics of search:
Media.search(filter, exclude=None)
For dates we have to construct a Date object (really a Val object of type DATE), and pass it as a filter to the search method.
The date() function is contained in gphotospy.media, with the following signature:
date(year=0, month=0, day=0)
year, month, and day are integers:
- year must be expressed as a 4-digit year, for example 1998 or 2020
- month must be expressed as a 1- or 2-digit integer, for example 3 for March or 11 for November
- day must be expressed as a 1- or 2-digit integer, and must be a valid day for the month (30 for February is a no-go)
Additionally, some fields can be left out for recurring dates: for example, you can leave out the year to match the same day in every year:
>>> xmas = date(month=12, day=25)
This way you can get all pictures for all Christmases.
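To make the rules above concrete, here is a small sketch that checks a partial date before it is sent anywhere. It is only an illustration: is_valid_partial_date is a hypothetical helper, not part of gphotospy, and it assumes that 0 means "any" for each field, as in the date() signature.

```python
import calendar


def is_valid_partial_date(year=0, month=0, day=0):
    """Hypothetical check (not part of gphotospy): 0 means "any"."""
    if year and not 1 <= year <= 9999:
        return False
    if month and not 1 <= month <= 12:
        return False
    if not day:
        return True
    if day < 1:
        return False
    if not month:
        return day <= 31
    # Without a year, assume a leap year so that February 29 is accepted.
    reference_year = year or 2000
    return day <= calendar.monthrange(reference_year, month)[1]


print(is_valid_partial_date(month=12, day=25))  # True
print(is_valid_partial_date(2020, 2, 30))       # False
```

Accepting February 29 when the year is left out is a design choice of this sketch; the real API may be stricter or looser.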
At the time of writing it is June 1st, 2020, so let's get today (update freely):
>>> today = date(2020, 6, 1)
Now we have two dates to play with; let the quest (...search...) begin!
>>> xmas_iterator = media_manager.search(xmas)
>>> next(xmas_iterator)
Let's see how many new media are in our account as of today:
>>> today_iterator = media_manager.search(today)
>>> today_media = list(today_iterator)
>>> len(today_media)
3
For me only 3, but it's still early in the morning
Date Ranges
We can also filter by a range of dates. We need the date_range() function:
date_range(start_date=date1, end_date=date2)
For example
>>> festivities = date_range(xmas, date(0, 12, 31))
>>> festivities_iterator = media_manager.search(festivities)
>>> next(festivities_iterator)
Media Types and Media Mapping
After dates, we move on to categorical filters, for which there are three specific classes to use as filters.
The API lets us search for only videos or only photos, for which we use MEDIAFILTER.
There are three available types:
- MEDIAFILTER.ALL_MEDIA: all media types included (default, not really needed)
- MEDIAFILTER.PHOTO: media is a photo
- MEDIAFILTER.VIDEO: media is a video
For example, to get photos only:
>>> photo_iterator = media_manager.search(MEDIAFILTER.PHOTO)
>>> next(photo_iterator)
The same principle applies to MEDIAFILTER.VIDEO.
Mapping
In a while we will see how to apply multiple filters (spoiler: just use a list of filters). However, once we have a media item, in order to know whether it is a photo or a video we first need to map it to the class MediaItem, and then check its type.
For example, let's use again the list today_media we created earlier:
>>> a_media = MediaItem(today_media[0])
We mapped the first element of today_media to a MediaItem object.
Now we can obtain several things besides the pretty-print of the JSON object: we can do a type check!
>>> a_media.is_photo()
True
Let's get a metadata object
>>> a_media.metadata()
{'creationTime': '2020-05-26T12:38:24Z', 'width': '480', 'height': '341', 'photo': {}}
We will see later on how to use it a little more.
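For the curious, the same kind of check can be sketched on the raw dictionary, without MediaItem. This is not necessarily how is_photo() is implemented in gphotospy; it assumes that mediaMetadata carries either a photo or a video object, with mimeType as a fallback. The name media_kind and the mimeType value in the sample are made up for illustration.

```python
def media_kind(media):
    """Hypothetical helper: guess 'photo' or 'video' from a raw media dict."""
    metadata = media.get("mediaMetadata", {})
    if "video" in metadata:
        return "video"
    if "photo" in metadata:
        return "photo"
    # Fall back on the MIME type when the metadata gives no hint.
    mime_type = media.get("mimeType", "")
    if mime_type.startswith("video/"):
        return "video"
    if mime_type.startswith("image/"):
        return "photo"
    return "unknown"


# Metadata copied from the output above; the mimeType is an assumed value.
sample = {
    "mimeType": "image/jpeg",
    "mediaMetadata": {
        "creationTime": "2020-05-26T12:38:24Z",
        "width": "480",
        "height": "341",
        "photo": {},
    },
}
kind = media_kind(sample)
print(kind)
```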
Featured filters
We can search featured media too, if any. Those are the 'starred' media; let's see if we have any.
- FEATUREFILTER.NONE: not featured media (default)
- FEATUREFILTER.FAVORITES: media marked as favourite (starred)
Google Photos does not encourage starring media, so it is highly probable that any given account has no favorite at all (mine didn't until I had to make some tests for this feature!)
>>> favourite_iterator = media_manager.search(FEATUREFILTER.FAVORITES)
>>> next(favourite_iterator)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
...
TypeError: 'NoneType' object is not iterable
Yep, empty iterator!
Let's talk a bit about guards, because in a regular script you'll find a lot of use for the following pattern:
try:
    print(next(favourite_iterator))
except (StopIteration, TypeError) as e:
    print("No featured media.")
I got a terse answer now:
No featured media.
What we did was to try our search and catch two exceptions: StopIteration and TypeError. These guards are needed when using iterators (and all searches in gPhotoSpy are done through iterators).
TypeError occurs on an empty result (no item is present at all, as in the traceback above), while StopIteration signals the end of the iteration when cycling through a search (no more items present).
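If the guard is needed in many places, it can be wrapped in a small function. The sketch below is self-contained: first_or_default is a hypothetical helper (not part of gphotospy), and empty_search is just a stand-in that fails the same way as the empty search in the traceback above.

```python
def first_or_default(iterator, default=None):
    """Return the first item of an iterator, or `default` when there is none."""
    try:
        return next(iterator)
    except (StopIteration, TypeError):
        return default


def empty_search():
    """Stand-in that fails like the empty search in the traceback above."""
    for item in None:  # raises TypeError: 'NoneType' object is not iterable
        yield item


no_favourite = first_or_default(empty_search())
exhausted = first_or_default(iter([]))
first_item = first_or_default(iter([{"filename": "a.jpg"}, {"filename": "b.jpg"}]))
print(no_favourite, exhausted, first_item)
```

Keep in mind that catching TypeError this broadly can also hide genuine bugs, so it is best kept as close as possible to the next() call.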
Just to verify that the filter is working, I starred an item. We have to fetch again from the API, otherwise it will cache the earlier result:
>>> favourite_iterator = media_manager.search(FEATUREFILTER.FAVORITES)
>>> len(list(favourite_iterator))
1
Now there's at least an element to show!
Filter by Content
Each time we save media to Google Photos, it passes through some machine learning algorithms that classify the media content. We can search through these categories with the class CONTENTFILTER. These are the categories available:
CONTENTFILTER.NONE
CONTENTFILTER.LANDSCAPES
CONTENTFILTER.RECEIPTS
CONTENTFILTER.CITYSCAPES
CONTENTFILTER.LANDMARKS
CONTENTFILTER.SELFIES
CONTENTFILTER.PEOPLE
CONTENTFILTER.PETS
CONTENTFILTER.WEDDINGS
CONTENTFILTER.BIRTHDAYS
CONTENTFILTER.DOCUMENTS
CONTENTFILTER.TRAVEL
CONTENTFILTER.ANIMALS
CONTENTFILTER.FOOD
CONTENTFILTER.SPORT
CONTENTFILTER.NIGHT
CONTENTFILTER.PERFORMANCES
CONTENTFILTER.WHITEBOARDS
CONTENTFILTER.SCREENSHOTS
CONTENTFILTER.UTILITY
CONTENTFILTER.ARTS
CONTENTFILTER.CRAFTS
CONTENTFILTER.FASHION
CONTENTFILTER.HOUSES
CONTENTFILTER.GARDENS
CONTENTFILTER.FLOWERS
CONTENTFILTER.HOLIDAYS
Let's give it a try, shall we? The following will take time to perform on most accounts today
>>> selfies_iterator = media_manager.search(CONTENTFILTER.SELFIES)
>>> selfies = list(selfies_iterator)
>>> len(selfies) # never got to this part... CTRL+C !!!
I left it running for a while, then interrupted it with CTRL+C. The problem is that if there are many items, the list() function wants to consume the whole iterator, performing many requests to the server (50 items at a time). All these requests are slooooow.
Instead, let's do it like this:
>>> houses_iterator = media_manager.search(CONTENTFILTER.HOUSES)
>>> next(houses_iterator).get("filename")
'IMG-20200522-WA0005.jpg'
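When one item is not enough but the whole list is too much, itertools.islice from the standard library can cap how many items are pulled from a lazy iterator. The sketch below uses a fake paginated search (fake_search is a made-up stand-in, no request is sent) to show that only the first page gets "fetched".

```python
from itertools import islice

pages_requested = []


def fake_search(total_items, page_size=50):
    """Stand-in for a paginated search: items are produced lazily, page by page."""
    for start in range(0, total_items, page_size):
        pages_requested.append(start // page_size)  # pretend this is one HTTP request
        for n in range(start, min(start + page_size, total_items)):
            yield {"filename": f"IMG-{n:04d}.jpg"}


houses_iterator = fake_search(total_items=10_000)
first_three = list(islice(houses_iterator, 3))  # stops after three items
print([item["filename"] for item in first_three])
print(len(pages_requested))
```

With a real search the same call would look like islice(media_manager.search(CONTENTFILTER.HOUSES), 3), assuming the returned iterator is lazy, as described earlier.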
Combining filters and excluding some
We can combine filters at will
>>> combined_search_iterator = media_manager.search([CONTENTFILTER.HOUSES, CONTENTFILTER.SPORT])
WATCH OUT, BE CAREFUL: contrary to what you might think, combining filters does not slim down the search!!!
COMBINING FILTERS IS MEANT TO WORK AS A LOGICAL "OR", SO IT SUMS UP THE CATEGORIES
In sum, if you combine CONTENTFILTER.FOOD and CONTENTFILTER.TRAVEL, this operation does not return only pictures of exotic food you got on that trip at the seafood market in China (bats, eh?). It returns ALL food pictures (and videos), including grandma's porridge, and ALL travel pictures, including those stupid pictures taken in the washrooms in Italy (yes, we knew all along!).
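If a logical "AND" is really needed, one possible workaround is to run two separate searches and intersect the results locally by id. The following is only a sketch on made-up data; intersect_by_id is a hypothetical helper, and with real searches it means consuming at least one iterator completely, which can be slow on big accounts.

```python
def intersect_by_id(first, second):
    """Keep the items of `first` whose id also appears in `second`."""
    second_ids = {item["id"] for item in second}
    return [item for item in first if item["id"] in second_ids]


# Made-up results standing in for two separate searches (FOOD and TRAVEL).
food = [
    {"id": "a", "filename": "porridge.jpg"},
    {"id": "b", "filename": "market.jpg"},
]
travel = [
    {"id": "b", "filename": "market.jpg"},
    {"id": "c", "filename": "beach.jpg"},
]

food_and_travel = intersect_by_id(food, travel)
print([item["filename"] for item in food_and_travel])
```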
Filter out
As we have seen, combining just adds up; it does not slim down a search.
However, you can exclude categories (meager consolation).
If you remember the search signature, there is an exclude argument:
>>> exclude_some_iterator = media_manager.search(CONTENTFILTER.ARTS, CONTENTFILTER.CRAFTS)
So yes, you have to remember that the second argument is the exclusion, not another filter to add (to pass several filters you must put them all in a list).
Search: Putting it all together
We can put it all together, for example:
>>> combined_iterator = media_manager.search(
... filter=[
... FEATUREFILTER.NONE,
... CONTENTFILTER.TRAVEL,
... CONTENTFILTER.SELFIES,
... MEDIAFILTER.PHOTO,
... date(2020, 4, 24),
... date_range(
... start_date=date(2020, 4, 19),
... end_date=date(2020, 4, 21)
... )
... ],
... exclude=[
... CONTENTFILTER.PEOPLE,
... CONTENTFILTER.GARDENS])
Still very complicated, but I got some results:
>>> combined = list(combined_iterator)
>>> len(combined)
9
Nine pictures. Contrast this with the following, which searches for videos only:
>>> combined_iterator = media_manager.search(
... filter=[
... FEATUREFILTER.NONE,
... CONTENTFILTER.TRAVEL,
... CONTENTFILTER.SELFIES,
... MEDIAFILTER.VIDEO,
... date(2020, 4, 24),
... date_range(
... start_date=date(2020, 4, 19),
... end_date=date(2020, 4, 21)
... )
... ],
... exclude=[
... CONTENTFILTER.PEOPLE,
... CONTENTFILTER.GARDENS])
>>> combined = list(combined_iterator)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
...
You guessed it, no video
Same search, but without content filters or exclusions, only dates; still searching for videos
>>> combined_videos = media_manager.search(
... filter=[
... FEATUREFILTER.NONE,
... MEDIAFILTER.VIDEO,
... date(2020, 4, 24),
... date_range(
... start_date=date(2020, 4, 19),
... end_date=date(2020, 4, 21)
... )
... ])
>>> combined = list(combined_videos)
>>> len(combined)
8
Eight videos in total
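For orientation, the underlying Google Photos Library API receives these filters as a JSON body on its mediaItems.search endpoint. The sketch below assembles such a body by hand for the photo search above. How gphotospy translates its filter objects internally is an assumption here, and api_date and build_search_body are hypothetical helpers; only the field names come from the REST API.

```python
import json


def api_date(year=0, month=0, day=0):
    """Date as the REST API expects it; 0 marks a field that is left out."""
    return {"year": year, "month": month, "day": day}


def build_search_body(dates=(), ranges=(), include=(), exclude=(),
                      media_types=(), page_size=50):
    """Hypothetical helper: assemble the JSON body for mediaItems.search."""
    filters = {}
    date_filter = {}
    if dates:
        date_filter["dates"] = list(dates)
    if ranges:
        date_filter["ranges"] = [
            {"startDate": start, "endDate": end} for start, end in ranges
        ]
    if date_filter:
        filters["dateFilter"] = date_filter
    content_filter = {}
    if include:
        content_filter["includedContentCategories"] = list(include)
    if exclude:
        content_filter["excludedContentCategories"] = list(exclude)
    if content_filter:
        filters["contentFilter"] = content_filter
    if media_types:
        filters["mediaTypeFilter"] = {"mediaTypes": list(media_types)}
    return {"pageSize": page_size, "filters": filters}


body = build_search_body(
    dates=[api_date(2020, 4, 24)],
    ranges=[(api_date(2020, 4, 19), api_date(2020, 4, 21))],
    include=["TRAVEL", "SELFIES"],
    exclude=["PEOPLE", "GARDENS"],
    media_types=["PHOTO"],
)
print(json.dumps(body, indent=4))
```

Seeing the body laid out like this also shows why included categories add up: they simply travel together in one list.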
Downloading media
Downloading time!
Let's get a media item, for example a video:
>>> video_iterator = media_manager.search(MEDIAFILTER.VIDEO)
>>> media = MediaItem(next(video_iterator))
Now downloading it is as trivial as opening a file and saving the raw_download data!
>>> with open(media.filename(), 'wb') as output:
...     output.write(media.raw_download())
...
With media.filename() we got the filename, and with media.raw_download() the raw data read from the "baseUrl" with the proper flags.
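About those flags: in the Google Photos Library API a base URL is not meant to be used bare; a parameter is appended to it, such as =d to download a photo, =dv to download a video, or =w...-h... to ask for a given size. Whether raw_download() builds its URL exactly this way is an assumption; the sketch below only shows the string manipulation, with hypothetical helper names and a placeholder URL.

```python
def download_url(base_url, is_video=False):
    """Append the download parameter: '=dv' for videos, '=d' for photos."""
    return base_url + ("=dv" if is_video else "=d")


def resized_url(base_url, width, height):
    """Append the sizing parameter to request an image within width x height."""
    return f"{base_url}=w{width}-h{height}"


# Placeholder, not a real base URL.
example_base = "https://lh3.googleusercontent.com/lr/EXAMPLE"

photo_url = download_url(example_base)
video_url = download_url(example_base, is_video=True)
thumbnail_url = resized_url(example_base, 256, 256)
print(photo_url)
print(video_url)
print(thumbnail_url)
```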
We could as easily save a picture:
>>> photo_iterator = media_manager.search(MEDIAFILTER.PHOTO)
>>> media = MediaItem(next(photo_iterator))
>>> with open(media.filename(), 'wb') as output:
...     output.write(media.raw_download())
...
43463
If you want to sync the info of the downloaded picture with the original, you need to copy the info obtained with media.metadata() to the downloaded picture, because they differ. For example, downloaded pictures have a metadata field called Software with the value Picasa set. Maybe not many people know this, but Google Photos is a 'remake' of Google Picasa; you can see here what happened to Picasa.
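As a first, partial step towards such a sync, the file's timestamp can be aligned with the creationTime reported in the metadata. This sketch does not touch the embedded picture metadata at all (that would need a dedicated library); it only sets the file's modification time, and it assumes the creationTime format seen in the outputs above. apply_creation_time is a hypothetical helper.

```python
import os
import tempfile
from datetime import datetime, timezone


def apply_creation_time(path, creation_time):
    """Set a file's access/modification time from an API creationTime string.

    Assumes the "YYYY-MM-DDTHH:MM:SSZ" shape seen in the outputs above.
    """
    moment = datetime.strptime(creation_time, "%Y-%m-%dT%H:%M:%SZ")
    timestamp = moment.replace(tzinfo=timezone.utc).timestamp()
    os.utime(path, (timestamp, timestamp))
    return timestamp


# Demo on a throwaway file standing in for a downloaded picture.
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as handle:
    handle.write(b"not really a jpeg")
    demo_path = handle.name

applied = apply_creation_time(demo_path, "2020-05-24T11:39:32Z")
mtime_after = os.path.getmtime(demo_path)
os.remove(demo_path)  # clean up the throwaway file
print(datetime.fromtimestamp(mtime_after, timezone.utc).isoformat())
```

With a real item, the string would come from media.metadata()["creationTime"] and the path from media.filename().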
View a video (with a trick)
Last time we saw how to view a picture with Tkinter in Python.
This time we will see how to view a video.
First things first, we have to install opencv:
pip install opencv-python
We have already installed pillow; otherwise we would need it as well.
Then we import Tkinter:
>>> import tkinter
Next, I have created a file (in a gist) with two classes:
- ImgVideoCapture: This class wraps opencv's cv2.VideoCapture(video_url), and provides a method to extract frames as PIL images. This is needed because Tkinter has native functions only to show images, through ImageTk. Moreover, as we have seen last time, it's even better to use pillow's wrapper for the same class, PIL.ImageTk
- VideoApp: This is a class that creates a Tkinter window, with a canvas on which to show the PIL.ImageTk. Moreover, it has an update function that continually calls the frame extractor in ImgVideoCapture and calls itself again and again (every 15 milliseconds, but it can be configured) with window.after(). This class also contains a call to window.mainloop() so that it keeps itself alive during the video and after the end of it.
So let's download the gist and save it in the current directory as video_show.py.
Next, we import the two classes from it
>>> from video_show import ImgVideoCapture, VideoApp
Let's fetch a video to watch:
>>> video_iterator = media_manager.search(MEDIAFILTER.VIDEO)
>>> media = MediaItem(next(video_iterator))
We create a Tkinter root and we get the popcorn:
>>> root = tkinter.Tk()
>>> VideoApp(root, media)
Relax and enjoy the show.
How meta is this? A video showing a video...
Conclusions
Well, we did so many things today, I really need to relax.
Stay tuned for the next tutorial, to take full charge of your Google Photos account with Python.
Code
You can find the whole code in this gist except for the video_show.py, which is in this other gist.