DEV Community

Max
Max

Posted on

Replacing Google Analytics with CloudFront metrics

Replacing Google Analytics with CloudFront metrics

A detailed look at AWS website monitoring tools built into AWS CloudFront

I had to build a landing page to validate a free proofreading community idea a few days ago. It was just a single page with 2 images and an email sign up form. The easiest hosting option was AWS S3 + CloudFront + Route53. I decided not to add the obligatory Google Analytics JS tracker and rely on the analytics provided by CloudFront.

The site was running for a few days and attracted a small number of visitors, mainly bots. That was enough to conclude that:

  1. CloudFront metrics is not a replacement for a client-side tracker
  2. CloudFront metrics are useful in their own right to improve performance, troubleshoot or reduce your AWS bill.

Read on to dive deeper into live examples of CloudFront logs and reports.

CloudFront Logging

The logs were written into a separate S3 bucket in W3C format. There were standard W3C fields like date, time, sc-bytes, c-ip and cs(Referer) as well as AWS-specific fields like x-edge-location or x-edge-result-type. For example, the 3rd column in the snippet below is an extended field with a name of the AWS edge location the content was served from. It's kind of cool to know, but it doesn't tell us much about our visitors.

#Version: 1.0
#Fields: date time x-edge-location sc-bytes c-ip cs-method cs(Host) cs-uri-stem sc-status cs(Referer) cs(User-Agent) cs-uri-query cs(Cookie) x-edge-result-type x-edge-request-id x-host-header cs-protocol cs-bytes time-taken x-forwarded-for ssl-protocol ssl-cipher x-edge-response-result-type cs-protocol-version fle-status fle-encrypted-fields c-port time-to-first-byte x-edge-detailed-result-type sc-content-type sc-content-len sc-range-start sc-range-end
2020-04-19  05:33:36    IAD89-C3    100094  34.227.59.34    GET dvvs5hy178tti.cloudfront.net    /intro.png  200 https://feedback.farm/  Mozilla/5.0%20(compatible;%20redditbot/1.0;%20+http://www.reddit.com/feedback)  -   -   Miss    8ipYDNH0CvV2lqANkkL9hkCf8fqEwImPf5eXkplC92knzEvPk2OhqA==    feedback.farm   https   190 0.060   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss    HTTP/1.1    -   -   25329   0.057   Miss    image/png   99641   -   -
2020-04-19  05:33:43    IAD89-C3    6493    54.211.21.183   GET dvvs5hy178tti.cloudfront.net    /   200 -   Mozilla/5.0%20(compatible;%20redditbot/1.0;%20+http://www.reddit.com/feedback)  -   -   Hit OH_XrRls6xYVxO85em0bS5g8Q1OUjgo6OZjew0SEZnPk01oV3pa17g==    feedback.farm   https   148 0.000   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Hit HTTP/1.1    -   -   23242   0.000   Hit text/html   6034    -   -
2020-04-19  05:33:43    IAD89-C3    4780    54.211.21.183   GET dvvs5hy178tti.cloudfront.net    /feedback-farm-logo.svg 200 https://feedback.farm/  Mozilla/5.0%20(compatible;%20redditbot/1.0;%20+http://www.reddit.com/feedback)  -   -   Hit hnJ4onLxGD95WKnlattdD5dtTZvQm6jw8QrlhOr4_-2UChLFIOj2LQ==    feedback.farm   https   207 0.001   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Hit HTTP/1.1    -   -   23242   0.000   Hit image/svg+xml   4317    -   -
2020-04-19  05:33:43    IAD89-C3    100101  54.211.21.183   GET dvvs5hy178tti.cloudfront.net    /intro.png  200 https://feedback.farm/  Mozilla/5.0%20(compatible;%20redditbot/1.0;%20+http://www.reddit.com/feedback)  -   -   Hit JeD6TX99dgXzCWRqJorXQQKMCNNNKVKvch96_zI9SbPywMc5j63Trg==    feedback.farm   https   190 0.001   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Hit HTTP/1.1    -   -   23242   0.000   Hit image/png   99641   -   -
2020-04-19  05:33:44    IAD89-C1    6486    52.91.40.117    GET dvvs5hy178tti.cloudfront.net    /   200 -   Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_11_6)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/55.0.2883.95%20Safari/537.36Mozilla/5.0%20(Macintosh;%20Intel%20Mac%20OS%20X%2010_11_6)%20AppleWebKit/602.1.50%20(KHTML,%20like%20Gecko)%20Version/10.0%20Safari/602.1.50   -   -   Miss    ncpijEZfVmXmsv2p-1lfHARFojMe6r6GO_oeCdCdptd2-sTKz73GbQ==    feedback.farm   https   359 0.067   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss    HTTP/1.1    -   -   21925   0.067   Miss    text/html   6034    -   -
2020-04-19  05:33:36    IAD89-C3    6486    34.227.59.34    GET dvvs5hy178tti.cloudfront.net    /   200 -   Mozilla/5.0%20(compatible;%20redditbot/1.0;%20+http://www.reddit.com/feedback)  -   -   Miss    5yy5mvCOVahhak4WS211xpAuuBU8mOiPtPWys76VtLRoXTRGEdccag==    feedback.farm   https   148 0.090   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss    HTTP/1.1    -   -   23294   0.090   Miss    text/html   6034    -   -
2020-04-19  05:33:36    IAD89-C3    4773    34.227.59.34    GET dvvs5hy178tti.cloudfront.net    /feedback-farm-logo.svg 200 https://feedback.farm/  Mozilla/5.0%20(compatible;%20redditbot/1.0;%20+http://www.reddit.com/feedback)  -   -   Miss    l4I3HRzYzQMO-L9jjPmnwU4KN3gxRH3TofGhSVSe_BL5ZQMsxsD91A==    feedback.farm   https   207 0.062   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss    HTTP/1.1    -   -   23294   0.061   Miss    image/svg+xml   4317    -   -
2020-04-19  05:33:36    IAD89-C3    100094  34.227.59.34    GET dvvs5hy178tti.cloudfront.net    /intro.png  200 https://feedback.farm/  Mozilla/5.0%20(compatible;%20redditbot/1.0;%20+http://www.reddit.com/feedback)  -   -   Miss    FCkjm18IZ9wLRYrLkVZa1H0eWShTKlzk2XWxy0nOZbqEIF01Twx7Bw==    feedback.farm   https   194 0.080   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss    HTTP/1.1    -   -   23294   0.078   Miss    image/png   99641   -   -
2020-04-19  05:33:44    SEA19-C1    574 52.42.250.87    GET dvvs5hy178tti.cloudfront.net    /   301 -   Dispatch/0.11.1-SNAPSHOT    -   -   Redirect    XeBnr1zJn88qxAWu5NzqPwugYnmpyO2LsRSqR4Pn1Iw2mczK4FO2VA==    feedback.farm   http    109 0.000   -   -   -   Redirect    HTTP/1.1    -   -   17264   0.000   Redirect    text/html   183 -   -
2020-04-19  05:33:43    ORD53-C3    447 208.79.208.172  HEAD    dvvs5hy178tti.cloudfront.net    /   200 -   Mozilla/5.0%20(X11;%20Linux%20x86_64)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Ubuntu%20Chromium/72.0.3626.121%20Chrome/72.0.3626.121%20Safari/537.36   -   -   Miss    DRujIjx9qV_VXh_h3CJhvsh6MVSNxeN5VtpXhEZwxd6ND1C72VMWlw==    feedback.farm   https   208 0.165   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss    HTTP/1.1    -   -   42355   0.165   Miss    text/html   6034    -   -
2020-04-19  05:33:43    LHR50-C1    6486    185.20.6.41 GET dvvs5hy178tti.cloudfront.net    /   200 -   Mozilla/5.0%20(TweetmemeBot/4.0;%20+http://datasift.com/bot.html)%20Gecko/20100101%20Firefox/31.0   -   -   Miss    jRkF9NJeHpzk8HIybbEop-XKcFL92P_evRF0UKcgZl1CydJCe1h84A==    feedback.farm   https   243 0.437   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Miss    HTTP/1.1    -   -   41400   0.437   Miss    text/html   6034    -   -
2020-04-19  05:33:36    IAD89-C3    6485    18.234.93.156   GET dvvs5hy178tti.cloudfront.net    /   200 -   Mozilla/5.0%20(Windows%20NT%206.1;%20WOW64)%20AppleWebKit/537.36%20(KHTML,%20like%20Gecko)%20Chrome/45.0.2454.85%20Safari/537.36    -   -   Hit YOfmAegVa1Kz-KdhRIXcqhx5yzjyyPZTZY84Ja2rUUKpABTSh6b-Hw==    feedback.farm   https   301 0.001   -   TLSv1.2 ECDHE-RSA-AES128-GCM-SHA256 Hit HTTP/1.1    -   -   52041   0.000   Hit text/html   6034    -   -
Enter fullscreen mode Exit fullscreen mode

CloudFront Reports

Log-based reports go into more technical details (caching and usage) of how a resource was served, if it was in the cache, bytes transferred and other info of interest to sysadmins. Still, not much information about the visitors, but at least we can see the number of hits and some errors.

AWS updates the graphs every hour or so, but they warn us there may be delays up to 24hrs.

Caching graphs

cache-hits

cache-status

cache-total

cache-bytes

cache-aborted

Usage graphs

Usage reports look very similar to caching reports, but have a different level of detail. They are good for matching the usage data with AWS billing to optimize the monthly cost.

usage

Did you notice HTTP Status Codes graph had a surprisingly high number of 404 errors for such a small website?

Popular Objects report contained the answer: a missing robots.txt file and bots trying to hack the non-existent WordPress.

Popular Objects

Real-time monitoring

CloudFront connects to CloudWatch to display real-time performance of the distribution (i.e. the website), which is useful for troubleshooting, but again, not much info about our visitors.

real time monitoring

Alarms

CloudFront links directly to CloudWatch alarms. I configured an alarm to remind me if 404 errors go beyond a certain threshold in case I update the page with a broken image or script reference.

cloudwatch alarm

Viewer metrics

Top Referrers report shows where the viewers came from, but you can't drill any further into that data. For example, I would like to know what pages they came from, not just from what domain.

referrer report

There are only four charts with visitor information: Devices, Browsers, Operating Systems and Locations. They are very superficial - you cannot drill down into the data. What you see in the screenshots here is the limit of how granular it gets.

devices

browsers

operating systems

locations

Google Analytics dashboard

I'd guess that most readers here are familiar with Google Analytics (GA), so I will only remind you what that dashboard looks like with a single screenshot.

google analytics homepage

  • The depth of detail in GA is many times better than what we get from CloudFront charts and reports.
  • GA has many more metrics and insights than CloudFront.

For example, these GA menus list only a subset of all the GA views, compared to only 4 views in CloudFront:

google analytics menus

Conclusion

CloudFront metrics are a poor replacement choice for Google Analytics. It was a mistake to omit the client-side tracker and I will be adding GA to my Feedback Farm experiment shortly.

Saying that, I would still recommend enabling CloudFront metrics alongside a JS tracker, of which there is now quite a choice.

Top comments (0)