I recently took the updated Google Cloud Certified Professional Data Engineer exam. Studying for the test is a great way to learn the data engineering process with Google Cloud.
I recommend studying for the exam if you want to use Google Cloud products and:
- are a data engineer
- want to become a data engineer
- want to build a tech company
- are a data scientist and want to understand the whole data pipeline
In this article I’ll share the what, why, and how to help you take your best shot at the exam. 🎯
Let’s tackle the why first. I decided to take the Google Cloud Certified Professional Data Engineer exam for two reasons. First, I wanted to learn more about Google Cloud products for data engineering and machine learning. Second, I wanted to pass the exam and demonstrate that I’d learned the information. 😃
I chose a Google exam over offerings from AWS and Microsoft Azure for a few reasons. First, Google is the leading cloud provider in terms of machine learning and AI. Second, Google Cloud is the platform I would use if I were starting a company in the space.
Compared to the other major cloud services, Google has the clearest help docs and the best UX. They also have the lowest prices for GPUs and the most powerful machines for training deep learning models.
Additionally, the Google exam has good study materials available, which we’ll dig into below. It’s also a professional-level exam, which means that it’s difficult, but passing it signifies the highest level of mastery. Finally, the Professional Data Engineer test was updated in March 2019, so I figured it should be more relevant than an older exam that hasn’t been refreshed.
If you’re into Microsoft Azure, they have two exams that must be passed to attain the Microsoft Certified: Azure Data Engineer Associate designation. The Azure exams have a revamp date of June 21, 2019.
As context, I’d used a number of Google Cloud products, but didn’t know the difference between BigQuery and Bigtable before I started studying for the exam. I also hadn’t done much data engineering work.
This isn’t the kind of test you can cram for in a day or two. I doubt anyone is prepared to take this exam without a good bit of studying, because the number of Google products and their options changes so fast.
Helpfulness: 7.5/10
Linux Academy’s Google Cloud Certified Professional Data Engineer course had good content. The course has videos, quizzes, a Lucidchart e-book, and a final exam. Linux Academy provides free GCP practice time. It also has a helpful community Slack channel.
I took a legal pad worth of notes as I studied — and most of them came from the Linux Academy videos.
The course wasn’t updated for the new test as of early June 2019, so it wasn’t as helpful as it could have been. The instructor said the materials will probably be totally updated in late June 2019.
The Linux Academy final exam took a number of questions from the official Google practice exam. Don’t put much faith in the final exam results if you are taking the test in mid-June 2019. The test isn’t totally updated and the actual exam questions felt more difficult.
Overall, the UX isn’t bad, but there are some minor annoying issues (for example, the video is either full screen or tiny).
Bottom line: Linux Academy makes a great base, but you might want to wait until their training materials are updated to start studying for the exam.
Linux Academy is $49 a month, paid monthly, with a 7-day free trial.
The Qwiklabs exercises aren’t focused on the exam. I found this nice for overall learning but not that helpful if you’re trying to figure out what you need to learn for the test.
Like Linux Academy, Qwiklabs provides a Google Cloud sandbox for practice. Qwiklabs checks your progress in the sandbox, which is nice. It doesn’t have videos.
The UX is alright. The countdown timer for each lesson is a bit distracting and pressure inducing. However, there is a countdown timer on the actual Google exam, too. The Qwiklabs timer is quite large, so I suggest moving that part of the window offscreen if it’s distracting.
When doing interactive exercises, I recommend setting up your windows side-by-side — one for instruction and one for your work in GCP.
I recommend doing Linux Academy first and then using Qwiklabs for more practice.
This resource consists of just three 50-question practice exams with a timer. The practice exams had a few updated questions, but still had old case study questions. They used the same official Google practice exam questions as Linux Academy. Several questions had grammatical issues. Also, several answers were out of date. For example, BigQuery ML now supports k-means clustering.
I did learn things by taking the exam and reviewing the answers. The answers were detailed and linked to source documents. Just don’t put much faith in the score. The real exam feels far harder. 😄
Overall, these exams aren’t great, but I found them worth the time and money because there were few good options.
$9.99 for a one-time purchase (price may change — I saw it for $10.99 first).
Google recommends taking the Coursera Data Engineering, Big Data, and Machine Learning on GCP Specialization. This specialization consists of five Coursera courses. I decided not to take it because it looked like it hadn’t been updated for the revised exam — it referenced the old exam case studies. In hindsight, I would have taken these courses because they look quite thorough.
The official Google practice exam is available online as a mini-version of the real exam. The questions are the most relevant; I just wish there were more of them. As noted above, the questions are also used by several other folks in their practice exams.
You have to fill out a form to take the practice exam, but it’s free.
Here are the cheat sheets, blog posts, and other resources I used to study for the exam.
- Maverick Lin’s cheatsheet here is very good, but it predates the March 2019 exam refresh.
- Guang X’s cheatsheet here also predates the updated exam.
- Dmitri Lerko’s post here reflects the updated exam.
- Chetan Sharma’s post here also reflects the updated exam.
- The official Google Cloud docs are expansive. You’ll certainly want to spend some time taking notes from them. Not all the latest material is on the exam, but it’s all good to learn. 😃 Here are the BigQuery docs, for example.
- The official Google Cloud blog is here. It’s worth spending some time with it to help you understand topics you might find challenging.
Do you have other resources that you found helpful? Please share them in the comments or send them to me on Twitter @discdiver.
One thing I found unnecessarily difficult was determining how up to date the study materials were. To make this easier, I suggested to Google that they version their certification exams, just as most software follows semantic versioning. A version label like 1.1 could make it easy for training material providers to indicate which test version their materials match. This could save test-takers time and avoid frustration. If you think this is a good idea, please let Google know. You can tweet to them @GCPcloud. 😃
For what it’s worth, I generally take tests well and am confident in my ability to learn with self-directed study. If self-directed study isn’t your thing, and your budget allows, you might want to take in-person courses.
Now let’s turn to the test.
The exam consists of 50 multiple-choice questions. You have two hours to complete it. You’re able to mark questions for later review and revisit all questions before submitting the test.
Rumor has it that you need about 70% correct to pass the exam. However, there is not an official published passing score. Google says:
- Not all questions may be scored.
At any given time, a small number of questions on our exams may be unscored. These are newly developed questions that are being evaluated for their effectiveness. This is a standard practice in the testing industry.
- The score needed to pass is confidential.
The passing score for each exam is confidential. It is determined by a panel of internal and external subject matter experts, following an industry-accepted standard setting process. The passing score is applied equally to all examinees. It is re-evaluated when changes are made to the exam content.
You never learn your score, just whether you passed or failed. If you pass the test, your certification is good for two years.
The exam will cost you $200. If you don’t pass, you can take it again after 14 days for another $200. If you don’t pass on your second try, you need to wait 60 days and pay again.
Here’s the official test overview.
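To make those numbers concrete, here is a rough back-of-the-envelope sketch in Python. It uses only the figures mentioned above (50 questions, two hours, a $200 fee) plus the rumored 70% pass mark, which is an unofficial assumption. The function names are my own, purely for illustration.

```python
# Exam figures as reported in this article (2019). The 70% pass mark is a
# rumor, not an official number, so treat the output as a rough guide only.
NUM_QUESTIONS = 50
TIME_LIMIT_MINUTES = 120
EXAM_FEE_USD = 200
RUMORED_PASS_PERCENT = 70  # assumption: unofficial figure


def minutes_per_question(review_buffer_minutes=0):
    """Average minutes available per question after reserving review time."""
    return (TIME_LIMIT_MINUTES - review_buffer_minutes) / NUM_QUESTIONS


def rumored_questions_to_pass():
    """Correct answers needed under the rumored threshold (ceiling division)."""
    return -(-NUM_QUESTIONS * RUMORED_PASS_PERCENT // 100)


def total_fees(attempts):
    """Total exam fees in USD for a given number of attempts."""
    return attempts * EXAM_FEE_USD


if __name__ == "__main__":
    print(f"No review buffer: {minutes_per_question():.1f} min per question")
    print(f"30 min review buffer: {minutes_per_question(30):.1f} min per question")
    print(f"Rumored correct answers needed: {rumored_questions_to_pass()}")
    print(f"Fees for two attempts: ${total_fees(2)}")
```

Keep in mind that some questions may be unscored and the real passing score is confidential, so the "questions needed" figure is only a ballpark.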
If you decide to study for the Google Cloud Certified Professional Data Engineer exam, it’s hard to know when you’re ready to take the test. It’s tricky because there are few good test simulations and you don’t even know what you need to pass!
As with most things in life, practice improves your chances of performing well. Take as many practice exams as you can and review the results. You want to feel confident that you know the concepts, pitfalls, and best practices.
I originally planned to study for a month or so, but I decided to push it hard. On the sixth day I tried to register to take the exam the next day, but the testing center was booked. I decided to take a few more days to study and spend time with family in town over the weekend.
I ended up with 10 days of pretty intense study and a few days break in the middle. I felt decently prepared on test day. I hadn’t memorized every IAM role for every resource, but I had a good understanding of best practices with key products.
You take the exam on a computer at a testing center. You’ll have to leave your phone and other personal belongings with the proctor. You’ll be video recorded during the test. Other people will probably be in the same room taking other exams.
Earplugs, scratch paper, and pencils are provided. It sounds silly, but if you’re not an earplug wearer, you may want to practice with them ahead of time. I suggest you don’t press start until they are firmly in your ears.
I had read that the test would be difficult. It was still way harder than I thought it would be. It felt like the hardest test I’ve ever taken, and I’ve taken the SAT, ACT, GMAT, GRE, LSAT and several certification exams. For what it’s worth, this was my first exam from a cloud provider.
The test is difficult for several reasons:
- The breadth of material is vast. There are lots of Google products and lots of potential questions about each product and how they work together. There are over 200 Google Cloud APIs. This exam doesn’t cover all of them, but it covers a bunch.
- The exam also tests your knowledge of several Apache open source products related to Google’s offerings.
- It’s not even clear exactly how many Google products could be on the exam because new products are always being added and products are being changed.
- The questions are often multi-line, requiring consideration of multiple variables and intense concentration.
- Some questions require multiple answers (when they do, the number of answers is specified).
- Many answers are somewhat correct. You need to choose the best answer.
The exam will test you in more ways than one. When I took the exam I just tried to stay focused and not let the voice of self-doubt enter my head.
I had about 30 minutes left after my first pass through the questions. I marked seven answers for review. After reviewing, I had 10 minutes to spare. I clicked submit knowing I had tried my best and the chips would fall where they may.
On the next screen I saw I had provisionally passed. 😃 I collected my belongings from the proctor and headed out.
I received an email from Google the next day that I had officially passed. It included a code for some free swag. I would have preferred a less expensive test, but now I’ve got some humiliswag.
I plan to write about Google tools for data ingestion, processing, storage, and machine learning in a future article. Follow me to make sure you don’t miss it. Now I’ll mention what I didn’t see on the exam.
- As many IAM questions as I expected. There were a bunch on the various practice tests.
- Questions on exact product costs. Just know what makes sense if you’re more cost sensitive or less cost sensitive.
- Firestore questions.
- AI Hub questions.
- Many ML concept questions. I went into the test knowing ML concepts better than Google database products, so perhaps this explains why this part of the test didn’t loom large to me.
- Many questions with code samples.
It makes sense to study for this exam if you want to learn more about Google’s data science and engineering products and you have the time to devote to it. This exam doesn’t have you writing actual queries or cleaning data, so you’ll want to look elsewhere to develop those skills.
If you aren’t already a GCP pro, I guarantee you’ll learn things if you put the time in to study for the exam.
The way I look at it, if you pass the test, great. If you don’t, that’s okay. Either way, you’ll learn a bunch, and that’s most important. 😃
Speaking of learning, I hope you found this article helpful for your learning. If you did, please share it on your favorite social media channel. 👍
I help folks learn about cloud computing, data science, and other tech topics. Check out my other articles if you’re into that stuff.
Happy studying! 📙