Last Reviewed: July 2024
Introduction
File upload to the server host is a common webapp requirement. For instance, users of a blog webapp might want to add graphic or video files to their posts.
Previously, in this post series, you've seen how the Firebase deploy procedure can be used to upload static assets to the server. But here we're talking about dynamic assets, and a Firebase deploy isn't going to be of any use in this situation. Google's answer to this requirement is a service called "Cloud Storage".
You might have caught sight of this when you first saw the Firebase Console in Firebase project configuration. At the time, attention was focused on Authentication, Firestore and Hosting but, included in the list of "Build" tabs in the left-hand column, you might have spotted a service labelled "Storage".
Open the Firebase Console for your project, find the "Storage" tab and click it.
Cloud Storage is actually a huge part of the Google Cloud system - you can get a glimpse of Google's overall vision for the service at Cloud Storage for Firebase. I'll be using just a fraction of its capabilities in this post, but you'll quickly see that this is a facility that you can rely upon to meet all of your requirements for robust, scalable and secure storage.
Cloud storage is organised around storage buckets. This is how Google itself describes the system:
Buckets are the basic containers that hold your data. Everything that you store in Cloud Storage must be contained in a bucket. You can use buckets to organize your data and control access to your data, but unlike directories and folders, you cannot nest buckets. While there is no limit to the number of buckets you can have in a project or location, there are limits to the rate you can create or delete buckets.
When you create a bucket, you give it a globally-unique name and a geographic location where the bucket and its contents are stored. The name and location of the bucket cannot be changed after creation, though you can delete and re-create the bucket to achieve a similar result. There are also optional bucket settings that you can configure during bucket creation and change later.
The first time you open the Console's Storage page, Google will ask you to initialise the default Storage "bucket" that has been allocated to your project (you can see the name of this if you open Project Settings and look for "storageBucket").
Initialisation is generally quite straightforward but you may be slightly thrown when you are asked whether you want to start your project in Test or Production mode. You might remember, however, that there was something similar when you were initialising your project's Firestore settings - it has to do with storage rules. At this stage, you should select "test" - more on this later. Select a suitable geographical location too - somewhere reasonably close should be your aim. Once you're through all this, the Storage page should look something like the following:
Uploading a file to Cloud Storage
If you're coming to this post with your mind full of Firestore collection and code deployment concepts, you might imagine that one way of uploading a file might be to store its content in a Firestore collection. So, for example, you might wonder if you could store it in a data field formatted as some sort of encoded string. Well, maybe, but the maximum size of a Firestore document is 1MB and this won't go very far with most of the content you're likely to want to upload. So the plan must be to upload your files into Cloud storage.
Let's create some code to upload a file directly into a project's default bucket. Here's an example. If you've arrived at this post as a result of following an earlier episode of this series, the code below is intended to replace the index.html and index.js files in the firexptsapp project described at Post 3.1. First, some HTML to solicit a filename:
<body style="text-align: center;">
<input type="file" id="fileitem">
<script src="packed_index.js" type="module"></script>
</body>
If you're not familiar with the HTML file input type, check out Mozilla's documentation at <input type="file"> - it provides a very neat way of launching a file-selection window and storing the user's choice in the DOM.
And here's an index.js file to upload the selected file to Cloud Storage:
import {
initializeApp
} from 'firebase/app';
import {
getAuth,
GoogleAuthProvider,
signInWithPopup
} from 'firebase/auth';
import {
getStorage,
ref,
uploadBytes,
deleteObject,
getDownloadURL
} from 'firebase/storage';
const firebaseConfig = {
apiKey: "AIzaSyAPJ44X28c .... 6FnKK5vQje6qM",
authDomain: "fir-expts-app.firebaseapp.com",
projectId: "fir-expts-app",
storageBucket: "fir-expts-app.appspot.com",
messagingSenderId: "1070731254062",
appId: "1:10707312540 ..... 61bd95caeacdbc2bf",
measurementId: "G-Q87QDR1F9T"
};
const app = initializeApp(firebaseConfig);
const storage = getStorage(app);
window.onload = function() {
document.getElementById('fileitem').onchange = function() {
uploadFile()
};
}
function uploadFile() {
const sourceFileObject = document.getElementById('fileitem').files[0];
const sourceFileName = sourceFileObject.name;
const targetFileName = sourceFileName;
const storageRef = ref(storage, targetFileName);
uploadBytes(storageRef, sourceFileObject).then((snapshot) => {
alert('Successful upload');
});
}
The uploadFile() function in index.js is triggered when it sees the contents of index.html's fileitem field change. This signals that the user has selected a file. Prior to that, you'll see that the code initialises and authorises the webapp exactly as in previous posts, but has also imported a few new functions from the firebase/storage module.
The first new action is to create a storage object using the new getStorage function. The storageBucket property in the preceding firebaseConfig declaration tells getStorage that I want my uploaded file to end up in my "fir-expts-app.appspot.com" default bucket.
The uploadFile() function I've created to actually upload the selected file is very simple. First it creates a sourceFileObject variable containing the selected file's details derived from the DOM entry for the HTML input field (files[0] means "get the details for the first in the list of selected files" - in this case, there's only one anyway, but we have to go through the motions). It then recovers the input file's name and creates a storageRef variable that combines this with the specification of the target bucket from the storage variable. The sourceFileObject (ie source) and storageRef (ie target) variables are then all that the SDK's uploadBytes function needs to upload my file. When the upload completes, the code displays an alert message on the browser screen.
To put this into operation, I simply open my build_for_development.ps1 script file (see Post 3.1) and select "Run Active File" on the Terminal tab. Once the code has been successfully "webpacked" (essential now because I'm using the modular V9 Firebase SDK) and deployed, I can launch the webapp with the supplied url (https://fir-expts-app.web.app in this case). If I then select a random file, the browser should respond with a 'Successful upload' alert message, and if I refresh the Firebase Console's Storage page for my project I should see that it now contains a copy of my original file. I can check that it's the correct file by clicking on its console entry and noting the useful thumbnail and metadata that are then revealed.
You're now probably thinking "this is fine, but I need to put some structure on my storage. Do I need to create additional buckets to achieve this?". The answer is "not necessarily". The Cloud Storage system is very happy for you to include folder structures in your specification of your storageRef variable. So, for example, if I changed the code above to read
const storageRef = ref(storage, "myFolder/" + targetFileName);
my uploaded 'myFile' would be created in a 'myFolder' folder inside my default bucket. Note that I don't have to create this folder explicitly - if it doesn't exist, uploadBytes will automatically create it. Likewise, if a file already exists with the supplied name, it will be overwritten.
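To keep this path-building in one place, you might use a small helper like the following. This is a hypothetical convenience function, not part of the SDK, and the folder and file names are purely illustrative:

```javascript
// Hypothetical helper: build the path string passed to ref().
// Cloud Storage has no real folders - a "/" in the object name is
// simply displayed as a folder boundary by the console.
function makeStoragePath(folder, fileName) {
    // Strip stray leading/trailing slashes so we never produce
    // empty segments like "myFolder//myFile.txt"
    const cleanFolder = folder.replace(/^\/+|\/+$/g, "");
    return cleanFolder ? cleanFolder + "/" + fileName : fileName;
}

// In uploadFile() you might then write:
// const storageRef = ref(storage, makeStoragePath("myFolder", targetFileName));
```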
But be prepared for a surprise here. If you reference the file you've just "overwritten", you may be alarmed to find that it hasn't changed at all! This is because, by default, Google sets "cache control" metadata that has the effect of serving files from cloud cache for 60 minutes before referring back to the source. For details of how you can override this behaviour, you may find it interesting to read a curious tale at 'Flattening' your code with Google Cloud Storage Metadata
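The upload side of this can also be influenced at source: uploadBytes accepts an optional metadata argument whose cacheControl property sets the object's Cache-Control header. Here's a sketch of the idea with the SDK call injected as a parameter so the helper can be exercised without a live Firebase project (uploadWithCacheControl is a hypothetical name):

```javascript
// Hypothetical helper: attach cache-control metadata to an upload.
// uploadFn stands in for the SDK's uploadBytes(ref, data, metadata).
function uploadWithCacheControl(uploadFn, storageRef, fileObject, cacheControl) {
    // e.g. "no-store" asks browsers and caches not to keep a copy
    const metadata = { cacheControl: cacheControl };
    return uploadFn(storageRef, fileObject, metadata);
}

// In the webapp you would call something like:
// uploadWithCacheControl(uploadBytes, storageRef, sourceFileObject, "no-store");
```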
Your next question might be "so why would I ever want more than my default bucket?". To which you might also add "especially since additional buckets are only available to paid project plans".
One good reason is that file permissions are applied on a bucket rather than a file or folder basis. So, supposing that you wanted some files to be available to the general public while others needed to remain "secret", the two sets would have to be stored in separate buckets. Google's Make data public will be helpful here if you want further details. To allow public access, you'll need to use the Google Cloud console to assign the "Firebase Viewer" role to the "allUsers" principal for your bucket. If your project is registered with Firebase, you'll also need to use the Firebase Console to set appropriate rules for "Storage" (noting that these are separate from the rules for "Firestore Database"). While you're just noodling around, you might as well set these to:
allow read, write: if true
But be prepared for Google to email you to warn you that the whole world can read your data. In the longer term you will probably want to add a login and reset the rules to something like:
allow read: if true
allow write: if request.auth != null
In the event that you do want to store your files in a bucket named "my-bucket", you would change the storageBucket reference in firebaseConfig to read
storageBucket: "my-bucket",
Referencing a file in Cloud Storage
Once you've created a file "myfile" in a bucket "mybucket", you can reference it via its url at "https://storage.googleapis.com/mybucket/myfile". You might use this, for example, as the src of an img tag, or the href of a window.open(). A casual user might also open the address in a browser tab, but this is the point at which you would need to use the Google console to set public read permissions as mentioned earlier - the default on a new bucket is "private".
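If you find yourself building these addresses in several places, a tiny helper keeps the pattern in one spot (publicStorageUrl is a hypothetical name; the address form is the one shown above):

```javascript
// Hypothetical helper: compose the public download address of an object.
function publicStorageUrl(bucketName, objectPath) {
    return "https://storage.googleapis.com/" + bucketName + "/" + objectPath;
}

// e.g. an img tag's src could be publicStorageUrl("mybucket", "myfolder/myfile.png")
```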
Deleting a file from Cloud Storage
Deletion of a cloud file with name myFile in folder myFolder is just a variation on the theme you've already seen.
function deleteFile(target) {
    const storageRef = ref(storage, target);
    deleteObject(storageRef).then(() => {
        // File deleted successfully
    }).catch((error) => {
        console.log("Oops - System error - code is " + error);
    });
}

deleteFile(myFolder + "/" + myFile);
I've added a "catch" to this one so you can see how error-handling works.
Note that it is not possible to rename a file in Cloud Storage using the JavaScript Storage API. You need to delete and recreate it.
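If you wanted to wrap that delete-and-recreate dance up as a pseudo-rename, the shape might be as follows. The three SDK operations are injected as parameters so the sketch stays self-contained; renameFile and the parameter names are all hypothetical:

```javascript
// Hypothetical "rename" assembled from the primitives that do exist:
// read the old object, write it under the new name, delete the original.
async function renameFile(fetchFn, uploadFn, deleteFn, oldName, newName) {
    const content = await fetchFn(oldName); // download the existing content
    await uploadFn(newName, content);       // re-create it under the new name
    await deleteFn(oldName);                // only now remove the original
    return newName;
}
```

Note the ordering: deleting last means that a failed upload leaves the original intact.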
Fetching a file from Cloud Storage
If you need to fetch the content of myFile so that you can process it within the body of your webapp you need to make an XMLHttpRequest call. The following piece of Google-supplied code will recover the content of myCloudFile.txt into local variable myCloudFileContent:
const target = "myCloudFile.txt";
var myCloudFileContent;
function fetchFile(target) {
    return getDownloadURL(ref(storage, target))
        .then((url) => new Promise((resolve, reject) => {
            const xhr = new XMLHttpRequest();
            xhr.responseType = 'text';
            xhr.onload = (event) => {
                myCloudFileContent = xhr.response;
                resolve(xhr.response);
            };
            xhr.onerror = () => reject(new Error('Network error'));
            xhr.open('GET', url);
            xhr.send();
        }))
        .catch((error) => {
            alert('Oops - fetch failed : error = ' + error);
        });
}
Note that fetchFile is an asynchronous function. If you need to know when myCloudFileContent has arrived, you should create an async function and await the completion of fetchFile(target) like so:
var myCloudFileContentLoaded = false;
async function fetchMyFile(target) {
await fetchFile(target)
myCloudFileContentLoaded = true;
}
You can now use the myCloudFileContentLoaded variable to direct code further down the line - for example to show a "file loading" spinner.
But how would you arrange to fetch a long list of files - typically the entire content of a Cloud Storage folder? Knowing that fetches are asynchronous operations, you'll be looking for a way of performing these in parallel in order to maximise efficiency. Also, you'll need to find a way of finding out when they've all finished so that you can then get on with processing the results. Here's some code that will do the trick:
import { initializeApp } from 'firebase/app';
import { getStorage, ref, listAll, getDownloadURL } from 'firebase/storage';
const app = initializeApp(firebaseConfig);
const storage = getStorage(app);
const myCloudFolderStorageRef = ref(storage, "myCloudFolder/");
const promisesArray = [];
let myCloudFilesContent = [];
let myCloudFilesLoaded = false;
fetchMyFiles();
async function fetchMyFiles() {
let result = await listAll(myCloudFolderStorageRef)
result.items.forEach((file) => (promisesArray.push(fetchMyFile(file))));
myCloudFilesContent = await Promise.all(promisesArray);
myCloudFilesLoaded = true;
// file content now in the myCloudFilesContent array
}
async function fetchMyFile(file) {
return getDownloadURL(ref(storage, file))
.then(response => fetch(response))
.then(response => response.text());
}
This script starts by invoking the async Firebase listAll function to get details of all the files in myCloudFolder/. These are then forwarded individually to a fetchMyFile function to recover their content. The fetchMyFile function uses Javascript's modern fetch function on the url returned by getDownloadURL rather than the classic XMLHttpRequest introduced earlier. (See a simple guide to Javascript's Fetch function for an explanation of the difference and why you might find fetch preferable. Note that the fetchMyFile code uses .then methods to handle the results rather than awaits.) The consequence is that fetchMyFile returns a promise to provide a value sometime in the future rather than the value itself.
Launching a set of fetchMyFile calls for the files in your folder will get you your parallel fetches, but what's missing is some way of detecting when these have all finished.
In the parent fetchMyFiles function above, this is deftly provided by Javascript's Promise.all function. If you store a reference to each of your promises in a promisesArray, Promise.all(promisesArray) will tell you when these have all finished. It will also return their results as the outcome of the Promise.all's own promise. The forEach statement in fetchMyFiles raises a fetchMyFile promise for each of your files and places references to these in the elements of promisesArray. This mighty statement therefore has the effect of both launching parallel execution of your fetches and storing the associated promises in useful array form.
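The pattern is easier to see stripped of the Firebase plumbing. In this toy model (all names hypothetical), setTimeout stands in for the network, so each mock fetch resolves after a different delay, yet Promise.all still delivers the results in input order:

```javascript
// Mock fetch: resolves with the file's "content" after a short, varying delay
function mockFetch(fileName) {
    return new Promise((resolve) =>
        setTimeout(() => resolve("content of " + fileName), Math.random() * 20));
}

async function fetchAll(fileNames) {
    // Launch every fetch immediately - this is where the parallelism comes from
    const promisesArray = fileNames.map((name) => mockFetch(name));
    // Wait for them all; results arrive in the same order as fileNames
    return Promise.all(promisesArray);
}

fetchAll(["a.txt", "b.txt", "c.txt"]).then((contents) => console.log(contents));
```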
By applying an await to the Promise.all call you can then arrange for progress in fetchMyFiles to halt until everything has been done. As explained above, at this point the content of each file will also have been stored in an element of the myCloudFilesContent array that forms the target of the Promise.all promise. The fetchMyFiles function can then continue on its merry way and set the myCloudFilesLoaded field to true.
A typical use of a flag field like myCloudFilesLoaded would be to trigger the disappearance of a "spinner" field that had been activated when the associated fetch button was clicked.
Note that this sort of code is likely to trigger "Cross Origin" errors - security measures invoked by the browser to protect the website that you are accessing from a potential security breach. This is a deeply technical subject, but a quick way of bypassing the issue in this case is to create a cors.json file in the root of your project with the following content:
[
{
"origin": ["*"],
"method": ["GET"],
"maxAgeSeconds": 3600
}
]
and then to run the following command in a terminal session for your project:
gsutil cors set cors.json gs://myProject.appspot.com
See Google's Download Files document for a description of the procedure.
For further background on all this you may find James Sinclair's How to run Async Javascript functions in sequence or parallel useful.
Downloading a file from Cloud Storage
Returning a copy of a Cloud Storage file to local file storage is quite a bit trickier than anything you've seen so far, but the following code builds on the arrangement described above and does the job very satisfactorily.
const target = "myCloudFile.txt";
function downloadFile(target) {
const file = getDownloadURL(ref(storage, target))
.then((url) => {
const xhr = new XMLHttpRequest();
const a = document.createElement("a");
xhr.responseType = 'text';
xhr.onload = (event) => {
const blob = xhr.response;
a.href = window.URL.createObjectURL(new Blob([blob], {
type: "text/plain"
}));
a.download = "myLocalFile.txt";
a.click();
alert('Successful download');
};
xhr.open('GET', url);
xhr.send();
})
.catch((error) => {
alert('Oops - download failed : error = ' + error);
});
}
This code will retrieve the "myCloudFile.txt" file from Cloud Storage and save it in your download folder as "myLocalFile.txt". To get the file into local storage, the code creates an anchor element pointing at the Cloud Storage address of myCloudFile.txt and activates the "download" action of the anchor dynamically with a "click()" call.
For background on these techniques, see Google's Download files with Cloud Storage on Web and Codeboxx's handy 5 Ways To Create & Save Files In Javascript page.
Checking for File Existence
There are times when you need to check for the existence of a file independently of Upload, Delete, Fetch operations etc (where, naturally, you will handle exceptions like the unexpected non-existence of a file via error-handling arrangements). Here's a function that returns a promise telling you whether a file exists or not:
function fileExists(filename) {
return getDownloadURL(ref(storage, filename)).then(response => true).catch(error => false);
}
As you'll expect by now, checking for existence is going to be another asynchronous operation. To get the value of the fileExists promise you'll need to await it in an asynchronous function. The section on fetching files above gives examples of how you would do this, both for single file-instances and for a whole array-full.
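As a sketch of that usage (guardedAction and its parameters are hypothetical names), the existence check might guard some follow-on work like this:

```javascript
// Hypothetical wrapper: run an action only when the file exists.
// existsFn stands in for the fileExists function above, injected
// so the wrapper can be exercised without a live storage bucket.
async function guardedAction(existsFn, filename, action) {
    const exists = await existsFn(filename);
    return exists ? action(filename) : "skipped: " + filename + " not found";
}
```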
Storage Rules
There's one final wrinkle to be ironed out. Because project API keys are essentially public, Google Cloud Storage needs to be protected by the same sort of "rule" arrangement that you saw earlier with regard to Firestore document security.
You can see the rules currently in force by clicking the "rules" tab on the Firebase Console's Storage page. Because I initialised storage for the fir-expts-app project in "test" mode they'll look something like the following:
rules_version = '2';
service firebase.storage {
match /b/{bucket}/o {
match /{allPaths=**} {
allow read, write: if
request.time < timestamp.date(2022, 1, 17);
}
}
}
These rules say "allow anybody to do anything while the run-date is prior to 17th Jan 2022". I ran my initialisation on 18th Dec 2021, so Google was giving me a month to get myself sorted out. After this date, unless I changed the rules myself, they would deny access entirely. So, when you're setting things up initially, the "test" setting is fine, but in the longer term you'll probably want to add a "login" feature to allow you to replace the rule with something like
allow read, write: if request.auth!=null;
Copying and renaming files - working with Node.js in Cloud functions
Thus far, this post has concentrated on operations that you can perform using Javascript in a webapp linked to the Google Firebase Storage API. But sadly, if you want to copy or rename files, you'll find that this API doesn't support these actions directly at all - for example, as commented earlier, the only way you can rename a file using the API is to download it and upload it again under a different name.
Fortunately, an alternative approach is available, courtesy of the Google Node.js Cloud Storage Library. Unfortunately, since this is a Node.js library, you have to do this in a Cloud function.
Obviously, this opens up quite a can of worms for the novice developer. But cheer up. Once you've got the hang of writing and testing Google Cloud functions, the Storage Library enables you to write some satisfyingly neat and powerful code.
There is, however, a distinct difference between the style of the two libraries that may cause you some grief until you have got them clear in your mind.
In the Firebase Storage API library, you can use the storageRef feature to address your bucket storage as a hierarchy of folders and subfolders. To the Cloud Storage Library, however, the bucket is just a long list of files with names like "my-folder-name/my-filename". There's no concept of folders and sub-folders here at all. If you try to reference a storage location with a name like "mybucket/myfolder" you will just get a "this bucket doesn't exist" error message.
Supposing that you'd used the cloud console to build a bucket structure like this:
my-bucket-name
├───file1.txt
├───file2.txt
├───my-folder-name
│ ├───filea.txt
│ ├───fileb.txt
Then the following code:
const { Storage } = require('@google-cloud/storage');

const bucketName = 'my-bucket-name';
const storage = new Storage();
async function listFiles() {
// Lists files in the bucket
const [files] = await storage.bucket(bucketName).getFiles();
console.log('Files:');
files.forEach(file => {
console.log(file.name);
});
}
would produce the following output:
file1.txt
file2.txt
my-folder-name/filea.txt
my-folder-name/fileb.txt
To target files in a particular folder, you need to either "parse" the filename output from listFiles yourself or obscure your code by using the listFilesByPrefix version of listFiles with "prefix" and "delimiter" parameters (see the "samples" link above to view the full list of options available to you).
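The "parse it yourself" route is straightforward. Here's a sketch of a filter (filesInFolder is a hypothetical helper) that, given a flat name list like the one shown above, returns just the entries sitting directly inside one folder:

```javascript
// Hypothetical helper: from a flat list of object names, keep only the
// ones directly inside `folder` (no deeper sub-folder levels).
function filesInFolder(names, folder) {
    const prefix = folder.endsWith("/") ? folder : folder + "/";
    return names.filter((name) =>
        name.startsWith(prefix) && !name.slice(prefix.length).includes("/"));
}

// With the listing above, filesInFolder(names, "my-folder-name") keeps
// "my-folder-name/filea.txt" and "my-folder-name/fileb.txt"
```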
So functions are a bit scary - if you've not used them at all before, you might find it useful to check out an earlier post in this series at Background processing with Cloud Functions. But once you've got the hang of them you'll find that effort expended here will reap big rewards because:
- Many third-party SDK libraries can only be accessed from a server environment - examples would be the mailchimp and postmark libraries which you might use to deliver mailing arrangements. They're written this way because the Cloud servers in which functions run provide security for the keys that guard access accounts.
- The Cloud servers are "closer" to your data, so operations conducted here run much more quickly than they would in browser Javascript.
- The "style" of both the cloud storage file-handling and Firestore collection interfaces is much tighter and more intuitive in the Node.js library than in the browser version.
A particularly useful way of configuring a Cloud function is the onCall model. As an example, here's an onCall function that copies a bucket full of Cloud Storage files with structured [title].pdf names into a new "indexed_files" folder as copies of the original files, now renamed as "anonymous" [docId].pdf files. At the same time, it creates a [title] index for these by adding a document to an associated 'fileTitle-index' Firestore collection.
exports.indexMyFiles = functions.region('my-storage-region').https.onCall(async () => {
    const storage = new Storage();
    try {
        const [files] = await storage.bucket(bucketName).getFiles();
        // Use for...of rather than files.forEach(async ...) so that each
        // await really is honoured and the function doesn't return before
        // the copies have finished
        for (const file of files) {
            // Get the raw title of the file by stripping the ".pdf" extension from its filename
            const fileTitle = file.name.slice(0, -4);
            // Get an autoId to provide the name of the file when it's copied into the "indexed_files" folder
            const uuid = crypto.randomUUID();
            // Add a record to the fileTitle-index collection
            const fileData = {
                anonymisedFilename: uuid + ".pdf",
                fileTitle: fileTitle,
            };
            await db.collection('fileTitle-index').add(fileData);
            // Copy the file to the indexed_files storage folder under its anonymised name
            await storage
                .bucket(bucketName)
                .file(file.name)
                .copy("indexed_files/" + uuid + ".pdf");
        }
        return "foo";
    } catch (error) {
        throw new functions.https.HttpsError('internal', "baa");
    }
});
See Google's Call functions from your app documentation for background information on onCall functions.
Note the use of the .region('my-storage-region') method in the indexMyFiles function's declaration. This is used simply to override the default selection of the "us-central1" location. In order to avoid problems with CORS, it is important to ensure that functions are deployed and called in the same region. I've found it best if this is always declared explicitly. A typical value for my-storage-region would be "europe-west2".
You'll appreciate that some preliminary setup in functions/index.js is required to provide the above with the necessary libraries and parameters. The following is required in this instance:
const functions = require("firebase-functions/v1");
const admin = require("firebase-admin");
const { initializeApp, cert } = require('firebase-admin/app');
const { getStorage } = require('firebase-admin/storage');
const { Storage } = require('@google-cloud/storage');
const crypto = require('crypto');
const serviceAccount = {
... serviceAccount json for your project...
}
const bucketName = "my-bucket-name";
initializeApp({
credential: cert(serviceAccount),
storageBucket: bucketName
});
const bucket = getStorage().bucket();
const db = admin.firestore();
Here, the "serviceAccount" json referenced above is expected to have been obtained by selecting the "Service accounts" tab in your project's Firebase settings and then clicking the "generate new key" button. The file download that this generates contains the necessary settings. Simply cut and paste this into your code (or set it up as a library component in your project if you think that it may be required in multiple locations).
Once your function code has been incorporated into your project's functions/index.js file and deployed to the Cloud, you might then call your function using HTML or JSX in your webapp along the following lines:
<button onClick={() => {
const indexMyFiles = httpsCallable(functions, 'indexMyFiles');
indexMyFiles()
.then(success => {
window.alert("Yay - success! " + success.data);
})
.catch(error => {
window.alert("Oops - something went wrong : " + error)
})
}}>Launch Function</button>
where preliminary setup has initialised your webapp as follows:
import { initializeApp } from 'firebase/app';
import { getFunctions, httpsCallable } from 'firebase/functions';
const firebaseConfig = {
... firebaseConfig json for your project ...
};
const app = initializeApp(firebaseConfig);
const functions = getFunctions(app, 'my-storage-region');
Once again, the code for the firebaseConfig json can be obtained from your project's settings in the Firebase console.
Use console.log statements in your code to debug this. You will be able to see output from these, together with full details of any errors that Firebase itself has picked up, by referencing the logs page in the Cloud Functions section of the Cloud console. For a complex function, you might like to give the Firebase emulator a try, since this will save you the trouble of re-deploying your function every time you make a change to your code.
Other posts in this series
If you've found this post interesting and would like to find out more about Firebase you might find it worthwhile having a look at the Index to this series.