I am working on a custom website that relies on a third-party API for a lot of its information. The API associates users with products tied to their account, and it can give you more information about each product in separate API calls.
To fully build this project, I need information about all users, the products they own, and then additional info about each product to produce an informative website. We have N users and M products, and there are a number of connections between the users and products.
To start bootstrapping this project, we iterate over the users, then iterate over all the products they own.
(for ([user all-users])
  (define users-products ($fetch-products (user-id user)))
  (for ([pid users-products])
    (define product-info ($fetch-product pid))
    (do-something-with product-info)))
This isn't good, though: if we have 500 users and each user has 1000 products, we would be making somewhere around 500,000 network calls. I don't think the API provider would care much for that. The thing to notice is that while user information can change regularly, product info often does not.
Let's instead store product information on our file system as flat files. This way, when we rebuild the website, the product data is already set aside on disk, and we can still fetch live user data from the API, since that can change frequently.
(for ([user all-users])
  (define users-products ($fetch-products (user-id user)))
  (for ([pid users-products])
    (define cached (build-path "cache" (format "~a.txt" pid)))
    (define product-info
      (if (file-exists? cached)
          (file->string cached)
          (let ([resp ($fetch-product pid)])
            (call-with-output-file cached #:exists 'replace
              (lambda (out)
                (write resp out)))
            resp)))
    (do-something-with product-info)))
This is kind of a mouthful. The program as it stands isn't very modular, because caching is now tangled up with our main program logic. Making changes to the main program would mean ducking in and out of caching code, and the code doesn't reflect our true intent very well. We should move the caching logic into a separate function for modularity and readability.
(define (Cachify path-check code-to-run)
  (if (file-exists? path-check)
      (file->string path-check)
      (let ([resp code-to-run])
        (call-with-output-file path-check
          #:exists 'replace
          (lambda (out)
            (write resp out)))
        resp)))
This looks right to me. We check a file path, and if there is no file, we execute our code payload. Seems right so far.
But herein lies our next problem: because Racket evaluates a function's arguments before it calls the function, using this would still execute a network request every time, whether or not the result was already saved to a file.
> (define product-info (Cachify "product.txt" ($fetch-product 5)))
; ... still runs the network request ...
> product-info
"{\"pid\":\"5\"}"
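To see the eager evaluation concretely, here is a small sketch. The names fake-fetch and call-count are hypothetical stand-ins for $fetch-product (they are not part of the real API), the function is renamed cachify-fn, and the cache file is a throwaway temp file.

```racket
#lang racket

;; Hypothetical stand-in for $fetch-product: counts how often it runs.
(define call-count 0)
(define (fake-fetch pid)
  (set! call-count (add1 call-count))
  (format "{\"pid\":\"~a\"}" pid))

;; Function version of the cache. By the time the body runs,
;; code-to-run is already a value, not an expression.
(define (cachify-fn path-check code-to-run)
  (if (file-exists? path-check)
      (file->string path-check)
      (let ([resp code-to-run])
        (call-with-output-file path-check
          #:exists 'replace
          (lambda (out) (write resp out)))
        resp)))

;; Throwaway cache location. make-temporary-file creates the file,
;; so delete it to start from a cache miss.
(define tmp (make-temporary-file "cachify-fn-~a.txt"))
(delete-file tmp)

(define first-result (cachify-fn tmp (fake-fetch 5)))  ; miss: the fetch is needed
(define second-result (cachify-fn tmp (fake-fetch 5))) ; hit: the fetch runs anyway
(define calls-after-two-lookups call-count)            ; counts both fetches
(delete-file tmp)
```

Both lookups trigger the fetch, because the argument expression is evaluated before cachify-fn ever gets a chance to look at the file system.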
This obviously doesn't work for us. We need to turn this into a macro. A macro use looks like a normal function call, but Racket hands the macro its arguments as unevaluated code instead of evaluating them first, and the macro decides where that code ends up. Fortunately, it doesn't look much different.
(define-syntax-rule (Cachify path-check code-to-run)
  (if (file-exists? path-check)
      (file->string path-check)
      (let ([resp code-to-run])
        (call-with-output-file path-check
          #:exists 'replace
          (lambda (out)
            (write resp out)))
        resp)))
This macro works by rewriting the code it receives, substituting each pattern variable with the expression that was passed in. When we use it, the network request ends up inside the else branch, so it only runs if the file isn't on the file system, and never before the macro has had a chance to check. We can view the macro transformation using expand and syntax->datum to get a better picture.
> (syntax->datum (expand '(Cachify "1.txt" (println "Hello"))))
'(if (#%app file-exists? '"1.txt")
   (let-values (((temp16) '"1.txt"))
     (if (#%app
          variable-reference-constant?
          (#%variable-reference file->string101))
       (#%app file->string 'binary temp16)
       (#%app file->string101 temp16)))
   (let-values (((resp) (#%app println '"Hello")))
     (let-values (((string:5:8) call-with-output-file36)
                  ((temp17) '"1.txt")
                  ((temp18) 'replace)
                  ((temp19) (lambda (out) (#%app write resp out))))
       ; ...
It looks a bit garbled, but this is the fully expanded code that Racket actually compiles and runs. It first checks whether the file exists with the (if (#%app file-exists? '"1.txt") line, and only if that check fails does it move down to run the expression we gave it, on the (let-values (((resp) (#%app println '"Hello"))) line.
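A quick way to confirm this behavior is to run the macro twice against the same file and count how often the wrapped expression executes. This is only a sketch: fake-fetch and call-count are hypothetical stand-ins for $fetch-product, and the cache file is a throwaway temp file. The macro itself is copied from the article.

```racket
#lang racket

;; The macro from the article, unchanged.
(define-syntax-rule (Cachify path-check code-to-run)
  (if (file-exists? path-check)
      (file->string path-check)
      (let ([resp code-to-run])
        (call-with-output-file path-check
          #:exists 'replace
          (lambda (out)
            (write resp out)))
        resp)))

;; Hypothetical stand-in for $fetch-product: counts how often it runs.
(define call-count 0)
(define (fake-fetch pid)
  (set! call-count (add1 call-count))
  (format "{\"pid\":\"~a\"}" pid))

;; Throwaway cache location; delete the file so the first use is a miss.
(define tmp (make-temporary-file "cachify-macro-~a.txt"))
(delete-file tmp)

(define miss (Cachify tmp (fake-fetch 5))) ; no file yet: runs the fetch
(define hit (Cachify tmp (fake-fetch 5)))  ; file exists: fetch is skipped
(define calls call-count)                  ; only the first use fetched

;; Reading the file back with file->value (which uses read) recovers
;; the value that write stored.
(define round-tripped (file->value tmp))
(delete-file tmp)
```

One detail this surfaces: because the value is saved with write but loaded with file->string, a cache hit returns the written form of the string (with its surrounding quotes and escapes) rather than the original string. If both paths should return the same value, one option is to read the file back with file->value, as the last definition does; another is to save with write-string, assuming the response is always a string.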
Not only will this work with network requests; frankly, any computation you do can be wrapped with this macro. The nice part is that you can adjust the macro itself to use a different backing store, say a database or something like Redis as a memory store. You can modify the macro and keep using it without having to change caching code all over your program. Now let's go build our website.
(for ([user all-users])
  (define users-products ($fetch-products (user-id user)))
  (for ([pid users-products])
    (define product-info
      (Cachify (build-path "cache" (format "~a.txt" pid))
               ($fetch-product pid)))
    (do-something-with product-info)))
On all incremental builds, you will have saved your API host thousands of calls, so long as their product info doesn't change. If it ever does, you might have to invalidate your cache by cleaning out old files, or set up an update cycle where cached entries are refreshed only on certain days or intervals. But that's up to you, of course.
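As one possible shape for interval-based invalidation, here is a sketch of a variant that treats a cache file as stale once it is older than a given number of seconds. Cachify/max-age and its max-age-seconds parameter are hypothetical names, not something from the macro above, and fake-fetch again stands in for $fetch-product. The sketch also binds the path once and reads the file back with file->value so that hits and misses return the same value.

```racket
#lang racket

;; Hypothetical variant of Cachify with an age limit on cache files.
(define-syntax-rule (Cachify/max-age path-expr max-age-seconds code-to-run)
  (let ([path path-expr]) ; evaluate the path expression only once
    (if (and (file-exists? path)
             ;; fresh if last modified less than max-age-seconds ago
             (< (- (current-seconds)
                   (file-or-directory-modify-seconds path))
                max-age-seconds))
        (file->value path)
        (let ([resp code-to-run])
          (call-with-output-file path
            #:exists 'replace
            (lambda (out) (write resp out)))
          resp))))

;; Hypothetical stand-in for $fetch-product: counts how often it runs.
(define call-count 0)
(define (fake-fetch pid)
  (set! call-count (add1 call-count))
  (format "{\"pid\":\"~a\"}" pid))

;; Throwaway cache location; delete the file so the first use is a miss.
(define tmp (make-temporary-file "cachify-ttl-~a.txt"))
(delete-file tmp)

(define a (Cachify/max-age tmp 3600 (fake-fetch 5))) ; miss: fetches
(define b (Cachify/max-age tmp 3600 (fake-fetch 5))) ; fresh: served from file
(define c (Cachify/max-age tmp 0 (fake-fetch 5)))    ; max age 0: treated as stale, refetches
(define total-calls call-count)
(delete-file tmp)
```

Because the age check lives inside the macro, call sites only gain one extra argument, and the rest of the program stays the same.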
Thanks for reading!