
Rick Delpo


Enabling AWS S3 to behave more like a Web Server

CloudFront Functions (as of Nov 2021) can be used to mimic Apache htaccess behavior when CloudFront sits in front of S3, giving our AWS configuration more server-like behavior. I have 4 things in mind: 1. handle pretty URLs (files without extensions) using redirects, 2. enable default index.html behavior for our sub folders, 3. adjust our response security headers, and 4. do some URL rewriting. In this article we will cover 1 through 3.

Remember back in the day before https (circa 2018), when hosting in S3 was as simple as enabling web hosting right in the bucket? AWS S3 is not really meant to be a web server, so putting CloudFront in front of S3 gets us closer, but more is still needed. With that extra piece I get really close to mimicking Apache httpd behavior. Beware: adding CloudFront brings its own challenges, because some traditional S3 behavior stops working once CloudFront is in the way.

Part 1 - Pretty Urls and the need for Redirects

One vexing issue is how to deal with html pages without extensions. My previous Apache setup had some files with html extensions and some without, and a couple of lines in htaccess solved the extensionless issue. With S3 I ended up having to keep the html extension on all the files. My first inclination was to upload html files and then rename them without the extension. Then I tried to redirect my extensionless file in the metadata section of the S3 object. I even tried using the AWS CLI to upload my files. After struggling for a while it dawned on me that I do not need the extensionless file at all, because CloudFront (at least in my setup) does not honor any of the traditional S3 manipulation rules. Hence the CloudFront function to remedy this. Wait, what? Does not honor the rules, really? No one tells us about this up front. You can study AWS up and down and nothing works, because with CloudFront in the way there is an exception. This is a real gotcha, made worse by how much misleading material is out there, so we waste our time in Google research hell. Now that I know the rules are not honored, here is how we deal with it.

After much overthinking, the solution is simply to include a 301 redirect in a CloudFront function that points to a newly created html file. This approach also protects our SEO link equity, also known as link juice. We simply ask whether the request URI is the old file name without an extension. There is no need to create that file; we just check the URI of the incoming request. If the condition is true, we redirect the user to the html version of the file. So we create the html version of the file, and we must not forget to change our sitemap and canonicals to agree. Any previous links to the no-extension syntax will continue to work. This is key.

PS. To clarify: be sure that all pages agree with the sitemap, because some have the extension and some don't. Also don't forget to change your canonicals to include the .html after your migration.

So here we go, this is what it looks like.
VERY IMPORTANT: since the writing of this article, Part 1 of this code is no longer working. Go to the end of this article for the new code. **

function handler(event) {
    var request = event.request;
    var uri = request.uri;
    var requestURI = event.request.uri; // same value as uri; used for the redirect checks

    var newurl5 = `https://howtolearnjava.com/JDBC.html`;
    var newurl6 = `https://howtolearnjava.com/Java-Servlet.html`;

    // The uri is everything after the .com in our path, including the leading slash and any sub folders

    if (requestURI === "/JDBC") {
        var response = {
            statusCode: 301,
            // "Moved Permanently" is the reason phrase for 301; "Found" belongs to 302
            statusDescription: 'Moved Permanently',
            headers: { "location": { "value": newurl5 } }
        };
        return response;
    }

    if (requestURI === "/Java-Servlet") {
        var response = {
            statusCode: 301,
            statusDescription: 'Moved Permanently',
            headers: { "location": { "value": newurl6 } }
        };
        return response;
    }

    // Part 2 - the code below lets sub folders serve index.html by default.
    // If the URI ends in a slash it is a sub folder, so we append index.html;
    // when the user types in the sub folder, they get its index by default.
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    }
    // Check whether the URI is missing a file extension.
    else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }
    return request;
}

Open the AWS CloudFront console, click Functions in the left menu, create and name the function, paste in the code, test it, publish it, then associate it with a CloudFront distribution.
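Before publishing, the function can also be exercised locally with Node. The sketch below is only a rough stand-in for the test step in the console: `makeEvent` is a hypothetical helper, `example.com` is a placeholder domain, and the event object fills in only the fields this handler reads.

```javascript
// Compact stand-in for the request function above (placeholder domain and path).
function handler(event) {
    var request = event.request;
    var uri = request.uri;

    if (uri === '/JDBC') {
        return {
            statusCode: 301,
            statusDescription: 'Moved Permanently',
            headers: { location: { value: 'https://example.com/JDBC.html' } }
        };
    }
    if (uri.endsWith('/')) {
        request.uri += 'index.html';
    } else if (!uri.includes('.')) {
        request.uri += '/index.html';
    }
    return request;
}

// Hypothetical helper: build a minimal viewer-request style event for a given path.
function makeEvent(uri) {
    return { request: { method: 'GET', uri: uri, querystring: {}, headers: {}, cookies: {} } };
}

var redirect = handler(makeEvent('/JDBC'));
var rewritten = handler(makeEvent('/blog/'));

console.log(redirect.statusCode, redirect.headers.location.value);
console.log(rewritten.uri);
```

A redirect comes back as an object with a statusCode, while a rewritten request comes back with a changed uri, so the two outcomes are easy to tell apart in a quick check like this.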

Part 2 - Sub Folders

Since S3 is just simple storage, it has no knowledge of a sub folder or its index file. Luckily, when we first web-enable an S3 bucket, that setup gives our index file its default behavior. But sub folders are generally never discussed. In Apache we use an htaccess file in each sub folder to deal with default indexes. S3 and CloudFront do not have an htaccess file, so we use a CloudFront Function to mimic htaccess. In the example above, the CloudFront Function acts like a listener for http requests (similar to Apache httpd). When the user request comes in at the CDN edge, our program asks whether it is for a sub folder and, if so, appends index.html to the URI. Voila, we have the behavior we want.
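To see what the sub folder rule does to different paths, here is a small sketch that pulls the same two checks into a standalone helper. The name `withDefaultIndex` and the sample paths are made up for illustration; the logic mirrors the Part 2 code above.

```javascript
// Hypothetical helper mirroring the Part 2 rule: return the URI that S3 will be asked for.
function withDefaultIndex(uri) {
    if (uri.endsWith('/')) {
        return uri + 'index.html';   // trailing slash: treat as a sub folder
    }
    if (!uri.includes('.')) {
        return uri + '/index.html';  // no dot anywhere: assume a sub folder without the slash
    }
    return uri;                      // contains a dot: assumed to be a file, left unchanged
}

var samples = ['/', '/blog/', '/blog', '/about.html', '/v1.2/docs'];
samples.forEach(function (uri) {
    console.log(uri + ' -> ' + withDefaultIndex(uri));
});
```

Two things stand out. A path like /v1.2/docs is left alone because it contains a dot somewhere, so folder names with dots need their trailing slash. And an extensionless page such as /JDBC would be turned into /JDBC/index.html, which is why the redirect checks from Part 1 have to run before this rule.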

Part 3 - Response Security Headers

In our 2 examples above we are dealing with the request, but now we need to write a separate function to handle the response. Here we give our CloudFront config some additional headers for security purposes. Normally this is done, once again, in our Apache htaccess file, but here we do it like this below. A separate function is needed because when we test it we check the box that says response, and we do the same when we associate the published function. Likewise, with the request function we check the box that says request.

Here we go

function handler(event) {
    var response = event.response;
    var headers = response.headers;

    // Configure security headers
    headers['strict-transport-security'] = { value: 'max-age=63072000; includeSubdomains; preload' };

    // Every quoted keyword needs both single quotes, including the final 'unsafe-inline'
    headers['content-security-policy'] = { value: "script-src-elem 'self' 'unsafe-eval' https://javasqlweb.org/js_new.js 'unsafe-inline' https://ipinfo.io/ 'unsafe-inline' https://www.googletagmanager.com/ 'unsafe-inline'; style-src 'self' 'unsafe-inline'" };

    headers['x-content-type-options'] = { value: 'nosniff' };
    headers['x-xss-protection'] = { value: '1; mode=block' };
    headers['referrer-policy'] = { value: 'no-referrer-when-downgrade' };
    headers['x-frame-options'] = { value: 'DENY' };

    return response;
}


The content security policy header is the challenging one. In my case I have a javascript file called js_new.js to allow, and I also have 2 other js connections, one to ipinfo and the other to Google Tag Manager. My last directive, style-src, allows inline styles. It is best to google all these headers to gain a full understanding of their importance.
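Because the policy is one long string, a missing quote is easy to overlook. One possible way to keep it readable, sketched here as an assumption rather than anything CloudFront requires, is to list the sources per directive and join them. The `policy` object and `buildCsp` name are hypothetical; the sources are the ones from the header above with the repeats removed.

```javascript
// Hypothetical policy table: directive name -> list of allowed sources.
var policy = {
    'script-src-elem': [
        "'self'",
        "'unsafe-eval'",
        "'unsafe-inline'",
        'https://javasqlweb.org/js_new.js',
        'https://ipinfo.io/',
        'https://www.googletagmanager.com/'
    ],
    'style-src': ["'self'", "'unsafe-inline'"]
};

// Join each directive with its sources, then join the directives with "; ".
function buildCsp(directives) {
    return Object.keys(directives).map(function (name) {
        return name + ' ' + directives[name].join(' ');
    }).join('; ');
}

var csp = buildCsp(policy);
console.log(csp);
```

The resulting string could then be assigned in the response function with headers['content-security-policy'] = { value: csp };.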

To test all of the above before and after, I use Screaming Frog, which complains when the security headers, or some of them, are missing. Frog is a powerful tool to run frequently to test your website for a number of issues. You can also open Chrome dev tools (ctrl-shift-j), go to the network tab, click on your page, and view the response headers returned there.
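For a quick check without a crawler, the response headers seen in the network tab can be compared against the list the function is supposed to set. This is only a sketch: `missingSecurityHeaders` is a hypothetical helper, and the `sample` object stands in for headers copied by hand from dev tools.

```javascript
// Header names the response function above sets (lowercase).
var expected = [
    'strict-transport-security',
    'content-security-policy',
    'x-content-type-options',
    'x-xss-protection',
    'referrer-policy',
    'x-frame-options'
];

// Return the expected headers that are absent from a { name: value } map.
// Header names are compared case-insensitively.
function missingSecurityHeaders(responseHeaders) {
    var present = Object.keys(responseHeaders).map(function (name) {
        return name.toLowerCase();
    });
    return expected.filter(function (name) {
        return present.indexOf(name) === -1;
    });
}

// Made-up sample, as if copied from the network tab before the function was associated.
var sample = {
    'Content-Type': 'text/html',
    'Strict-Transport-Security': 'max-age=63072000; includeSubdomains; preload',
    'X-Frame-Options': 'DENY'
};

console.log(missingSecurityHeaders(sample));
```

An empty list means every header from the function showed up in the response.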

Conclusion

With all these little tweaks we slowly get S3 to behave more like a real web server even though it is just simple storage. Just as we enhance Apache using htaccess, we enhance S3 with a CloudFront distribution and a CloudFront function for the tweaks. Now I feel much better about using S3 to host my website.

2024 update on 301 redirects, from above **
Copy the following code into a new CloudFront function.

function handler(event) {
    var request = event.request;
    var uri = request.uri;
    var requestURI = event.request.uri;
    // The redirect checks only need the request URI, so no other event fields are read here
    var newurl5 = `https://howtolearnjava.com/JDBC.html`;
    var newurl6 = `https://howtolearnjava.com/Java-Servlet.html`;
    var newurl7 = `https://javasqlweb.org/about.html`;
    var newurl8 = `https://javasqlweb.org/home.html`;

    if (requestURI === "/JDBC") {
        return {
            statusCode: 301,
            statusDescription: "Moved Permanently",
            headers: {
                location: { "value": newurl5 },
            },
        };
    }

    if (requestURI === "/Java-Servlet") {
        return {
            statusCode: 301,
            statusDescription: "Moved Permanently",
            headers: {
                location: { "value": newurl6 },
            },
        };
    }


    // Note: this redirect would loop if the function were attached to the javasqlweb.org distribution itself
    if (requestURI === "/about.html") {
        return {
            statusCode: 301,
            statusDescription: "Moved Permanently",
            headers: {
                location: { "value": newurl7 },
            },
        };
    }

    /*
    if (requestURI === "/") {
        return {
            statusCode: 301,
            statusDescription: "Moved Permanently",
            headers: {
                location: { "value": newurl8 },
            },
        };
    }
    */

    return request;  //only needed once at end of all if statements
}
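
As the list of redirects grows, the repeated if blocks can get long. One possible variation, sketched here and not tested inside CloudFront itself, keeps the old paths and new URLs in a lookup table. The `redirects` table name is hypothetical; the two entries are taken from the code above.

```javascript
// Hypothetical lookup table: old path -> new absolute URL.
var redirects = {
    '/JDBC': 'https://howtolearnjava.com/JDBC.html',
    '/Java-Servlet': 'https://howtolearnjava.com/Java-Servlet.html'
};

function handler(event) {
    var request = event.request;

    // Only own keys count, so inherited object properties can never match a path.
    if (Object.prototype.hasOwnProperty.call(redirects, request.uri)) {
        return {
            statusCode: 301,
            statusDescription: 'Moved Permanently',
            headers: {
                location: { value: redirects[request.uri] }
            }
        };
    }

    return request; // no match: pass the request through unchanged
}

// Local check with hand-built events.
var hit = handler({ request: { uri: '/Java-Servlet', headers: {} } });
var miss = handler({ request: { uri: '/index.html', headers: {} } });
console.log(hit.statusCode, hit.headers.location.value);
console.log(miss.uri);
```

Adding a redirect then means adding one line to the table instead of another if block.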



Happy Coding!!
