I'm about to rewrite my personal site (for hopefully the last time for a long while). I have been paralyzed more than I should be about what I'm calling "URL Architecture". Because I care about Cool URIs, this is ideally not a reversible decision.
After some agonizing I think I have decided on a URL Architecture I like. In true fashion, I'm blogging about it.
"URL Architecture" (I don't know the proper word for this, if there is one pls correct me) is how you set up your URLs for each page of your site. For example, netlify.com
has these:
- netlify.com: generic landing page
- netlify.com/blog: index of blogs
- netlify.com/blog/year/month/day/slug: individual posts
- other pages
I call that /blog a namespace. The reason you do this is simple: it GREATLY reduces the risk of clashing with any other page you could possibly put on your site. It is "future proof".
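To make the clash risk concrete, here is a minimal sketch. The reserved page names and post slugs are made up for illustration; the point is only that a root-level slug can collide with a page you might add later, while the same slug under a namespace cannot.

```python
# Hypothetical top-level pages a personal site might want now or later.
RESERVED_PAGES = {"about", "blog", "talks", "uses", "now"}


def find_clashes(post_slugs, namespace=""):
    """Return the post slugs whose URL path collides with a reserved page."""
    clashes = []
    for slug in post_slugs:
        # With no namespace the path is just the slug; otherwise "blog/slug".
        path = f"{namespace}/{slug}".strip("/")
        if path in RESERVED_PAGES:
            clashes.append(slug)
    return clashes


posts = ["learn-in-public", "about", "now"]
print(find_clashes(posts))                    # root-level: ['about', 'now']
print(find_clashes(posts, namespace="blog"))  # namespaced: []
```

Without a namespace you have to run a check like this (or keep it in your head) every time you publish.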
Note: There is some debate about whether or not putting year/month/day in the URL impacts SEO. Some think it doesn't matter, some think it's a mild negative, some like it because you avoid clashes at the slug level.
Personal sites of frequent speakers often opt for TWO namespaces: /writing and /speaking. My current site (as of Sept 2020) uses this, but I have grown to dislike it: I will often speak about what I write, and write about related things I speak about. Why separate them? It is more effort to make my talks discoverable to my readers and vice versa.
The other thing that bothers me about my current setup is that it makes for ugly URLs. My most popular essay ever regularly gets linked to in Slacks, Discords, YouTube chats, Tweets and presentation slides. Its URL, https://www.swyx.io/writing/learn-in-public, weighs in at 43 characters all told.
If I shortened it to the shortcode that I set up for myself, swyx.io/LIP, it is only 11 characters. This looks much better.
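A shortcode like this is basically a lookup table plus a redirect. Here is a rough sketch, not how my site actually does it: the `SHORTCODES` table and `resolve` function are hypothetical, and only the LIP entry and the two URLs come from this post and its comments.

```python
LONG_URL = "https://www.swyx.io/writing/learn-in-public"
SHORT_URL = "swyx.io/LIP"

# Hypothetical shortcode table mapping a code to the canonical path.
SHORTCODES = {"LIP": "/writing/learn-in-public"}


def resolve(path):
    """Return (status, location) for a request path like '/LIP'."""
    code = path.strip("/")
    target = SHORTCODES.get(code)
    if target is None:
        return 404, None
    # A permanent redirect keeps the long URL as the canonical one.
    return 301, target


print(len(LONG_URL), len(SHORT_URL))  # 43 11
print(resolve("/LIP"))                # (301, '/writing/learn-in-public')
```

The catch is that the shortcode lives at the root, so it competes for the same space as any un-namespaced page.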
I've noticed that a lot of prolific bloggers don't namespace:
- CSS Tricks: https://css-tricks.com/how-css-perspective-works/
- Derek Sivers: https://sive.rs/cons
- Paul Graham: http://www.paulgraham.com/good.html
These people have blogged for decades and somehow managed to find their way around possible URL clashes. Why can't I?
Top comments (20)
I like namespaces. If I'm on a blog post with a /blog/ namespace then I know I can just chop off the slug on the URL to go to all posts. A post is a document within the blog so it makes sense to me that the URL should reflect as such.
https://www.swyx.io/writing/learn-in-public tells me that it is a blog post called learn in public before I even click it. What does /lip have going for it other than being shorter?
I'll grant that "lip" may be less readable. but that's a different issue than whether or not to namespace with /writing or /blog!
I have abc.com/blog for all the blogs (like archives), but for single post I like it simple as abc.com/xyz. As long as you provide categorization and search on your blog, I don't think it's a big deal for url, from a user's perspective?
If xyz is a blog post, there's no sense in using it without /blog. You break your own site structure yourself...
xyz is either a blog post or a page. Originally I thought I would use /blog/xyz for posts and just /xyz for pages, but my current site (expanding to be more than a blog) was migrated from another blog which has some popular links with good ranks, so I kept the original URLs. So strictly speaking /blog is more like /allposts or /archivesforblogposts in function, rather than structure.
Ok, so it's just semantically incorrect, and the better approach would be naming it archives instead of blog. I understand.
Yeah, now thinking back I should have done that. The same for categories, as I used "tags" instead. >.< Lessons learned, with new website I would be more careful about those.
well, tags for categories is not that bad, semantically a tag points to a category anyway, isn't it right? :)
Yeah... but traditionally people see "categories" as bigger groups/topics, and tags are subgroups, points... you can have 20-30 popular tags and usually less than 10 categories for a general blog. I guess I meant to have more at the beginning but later got lazy and decided to use them as categories instead. But I don't think the users care. As long as on navigation they get to find what they want, they will be happy. :P More important is how to provide good content and market it ... which can be more challenging than coding the blog itself.
Hahahah true, you also can let the user multi-select tags to filter content which is more accurate than Searching for category
it's not, haha. just wanted to be a little thoughtful about it in my rewrite
This is called information architecture formally.
The IA of a site is fairly hard to design. That said it is not impossible to change. I just deployed a site today that redirects 48,000 urls.
We decided after 3 years to adjust how we handle some sections of our site. Additionally, we deprecated some classes of products and thoughtfully redirected those to the closest match.
It does take effort. We wrote scripts to get all unique URLs on the original site and crawl those on the new site ensuring zero 404s.
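A check like that can be sketched roughly as follows. This is not our actual script: the function names and example.com-style hosts are made up, and `fetch_status` stands in for whatever HTTP client you use (it should follow redirects and return the final status code).

```python
from urllib.parse import urlsplit


def to_new_site(old_url, new_origin):
    """Rebuild an old URL's path and query string on the new site's origin."""
    parts = urlsplit(old_url)
    path = parts.path or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{new_origin.rstrip('/')}{path}{query}"


def find_broken(old_urls, new_origin, fetch_status):
    """Return (old_url, status) pairs that fail to resolve on the new site."""
    broken = []
    for old_url in sorted(set(old_urls)):  # de-duplicate before crawling
        status = fetch_status(to_new_site(old_url, new_origin))
        if status >= 400:
            broken.append((old_url, status))
    return broken


# Offline stand-in for a real HTTP client, so the sketch runs as-is.
fake_site = {"https://new.example/a": 200, "https://new.example/b?x=1": 200}


def fake_fetch(url):
    return fake_site.get(url, 404)


old = [
    "https://old.example/a",
    "https://old.example/a",
    "https://old.example/b?x=1",
    "https://old.example/gone",
]
print(find_broken(old, "https://new.example", fake_fetch))
# [('https://old.example/gone', 404)]
```

An empty list is the "zero 404s" result you want before shipping.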
I personally like the blog namespace on my personal site, because it’s descriptive. But seeing that my site is only a blog, it makes sense for me to deprecate that in the future. And I plan to soon.
One way to handle this is to ask: what is the main focus? If it’s a blog, make that the root. Leave the namespacing for secondary actions.
If you have competing interests, then maybe namespace both.
I for sure am moving towards using more namespaces for topical areas, but using more root level names for singular site focus.
The site we released today, with tens of thousands of pages, uses a lot more namespaces, because the URL helps to inform the intent. Drug names are hard to understand, and we wanted it to read easier.
thanks for the thoughts! I didn't want to call it IA since to me IA goes deeper into structure of content 🤷♂️
yeah I guess personal sites primary focus is showing work (talks and blogposts included). not sure what else I'd put under a namespace. as css tricks shows, it's possible to scale this really really far.
Yeah, I just feel it always starts with and includes the URLs. Everyone I have worked with has always had a top-down strategy, once they get past some ideation. Great write-up though.
The other thing I was going to say is that I see two approaches. On some sites, URLs go from less specific to more specific.
And then on other sites, short URLs are more important and longer namespaces are less important.
I feel we started with the former, but have begun migrating to the latter.
URLs used to have a larger impact on SEO, but now Google and other search engines are downplaying them and relying on page content a lot more.
It's been called pretty links for some time.
The reason some sites don't add /blog/ into the URI is because the site itself is a blog.
On the other hand, adding year, month and day into the URL means you don't understand the basics. I mean, you can set your URLs like this on WordPress and many other sites, and it made some kind of sense on WordPress, as it worked by generating files for each entry/post.
Each part of the URL must be either a directory or a file, so what's the point of adding a folder called 2020, another one called september and another one called 10, just to store what in there?
Of course, you won't store anything there. Your data will be in the database, and there's only a config in .htaccess and some controller using this URI to show one piece of content or another (or some other kind of mapping between this false URI and your content).
The way the major part of the reliable web is managed (domain/directory/content) is the best for users too, who can quickly tell what they are about to open. These links are simple and clean; that's why they're called "pretty links" or similar.
Of course there's no point in adding year/month/day to your URI for SEO; it works by matching the URI against the content, apart from context and many other factors.
Where to add a directory into your URL? When it's needed:
if your entire website is a blog, there's no point in adding a subdomain or directory called blog, for example.
If your site is about something else but you have a blog, you must add either a subdomain blog.domain.com or a directory domain.com/blog.
In this case it has SEO implications: adding a directory makes the main domain stronger, while a subdomain is treated more like a different tool (which can be true and realistic on many sites).
I strongly disagree.
The 2020 "folder" will show you all posts from 2020. The 2020/10/ path shows you all posts from October 2020 and 2020/10/11 retrieves all posts from 11th October 2020. It's a great way to get an overview of posts from a specific time in the history of a blog.
I may not have explained the concept well.
I mean that you don't have a directory called 2020; you will have N DB rows of posts whose date_add or publish date is a datetime-type field.
That lets you show all posts from 2020 by simply querying on that field.
Same for specific month, day, hour, minute...
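As a rough illustration of that kind of query, here is a minimal sketch using SQLite from Python. The `posts` table, its sample rows and the `posts_between` helper are assumptions made up for the example; only the date_add column name comes from the description above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per post, date_add stored as ISO 8601 text.
conn.execute("CREATE TABLE posts (slug TEXT, date_add TEXT)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [
        ("old-post", "2019-12-31 23:59:00"),
        ("first-2020", "2020-01-01 00:00:00"),
        ("autumn", "2020-09-12 10:30:00"),
    ],
)


def posts_between(conn, start, end):
    """Return slugs with start <= date_add < end, oldest first."""
    rows = conn.execute(
        "SELECT slug FROM posts "
        "WHERE date_add >= ? AND date_add < ? ORDER BY date_add",
        (start, end),
    )
    return [slug for (slug,) in rows]


print(posts_between(conn, "2020-01-01", "2021-01-01"))  # ['first-2020', 'autumn']
print(posts_between(conn, "2020-09-01", "2020-10-01"))  # ['autumn']
```

Narrowing to a month or a day only changes the two bounds, not the query.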
There's no data stored in physical or logical folders; all data usually lives inside the database instead, so why would you want to add and map a nonexistent server directory?
You can also think about media stored on the server. Well, that's another story: you'll also store the URL of that resource in the DB and, of course, the date_add, so the same applies here. Moreover, if you want to distribute your blog you'll need to store your media in something more "open", such as a CDN-like storage service (look at Google Cloud Storage for example), so that will not be an issue, as you will only need to update your DB rows with the new URLs and that's all. If you rely on server directories it will be a massive mess to distribute.
Well it's not a server directory, it's a rewrite rule that generates a DB query.
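To sketch what such a rule could look like (this is an illustration, not how any particular blog engine does it): a pattern matches /YYYY[/MM[/DD]]/ and turns it into a half-open date range that a query can filter on. The `ARCHIVE` pattern and `date_range` function are names invented for the example.

```python
import re
from datetime import date, timedelta

# Matches /2020/, /2020/10/ and /2020/10/11 (trailing slash optional).
ARCHIVE = re.compile(r"^/(\d{4})(?:/(\d{2}))?(?:/(\d{2}))?/?$")


def date_range(path):
    """Map an archive path to a (start, end) date pair, end exclusive.

    Returns None when the path is not an archive path or names an
    impossible date.
    """
    match = ARCHIVE.match(path)
    if not match:
        return None
    year, month, day = (int(g) if g else None for g in match.groups())
    try:
        if month is None:  # whole year
            return date(year, 1, 1), date(year + 1, 1, 1)
        if day is None:  # whole month, rolling over in December
            if month == 12:
                return date(year, 12, 1), date(year + 1, 1, 1)
            return date(year, month, 1), date(year, month + 1, 1)
        start = date(year, month, day)  # single day
        return start, start + timedelta(days=1)
    except ValueError:
        return None


print(date_range("/2020/10/"))
# (datetime.date(2020, 10, 1), datetime.date(2020, 11, 1))
```

The resulting pair can go straight into a `date_add >= start AND date_add < end` style filter.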
The original purpose of this style of URL structures is to have a human navigable path (which has the side effect of also helping search engines, especially 15-20 years ago).
By "human navigable" I mean that a human can easily decompose the URI back to the site root and get meaningful content along the way.
- /2020/09/12/some-blog-post: a blog post posted today
- /2020/09/12/: remove the post slug and now you've got all posts from today
- /2020/09/: remove the day and now you've got all posts from this month
- /2020/: remove the month and now you've got all posts from the year
- /: remove the year and now you've got all posts on the blog
Without this structure a human would have to perform a search, typically with an ugly URL with a bunch of query params that are difficult for a human to type easily. Sure, you could dynamically generate a link to archives that fills in the params, but you don't really want to pollute your pages with links for every year, month and day you've posted content.
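The decomposition above is mechanical enough to sketch in a few lines. The `ancestors` function is a made-up name for illustration; it just chops one segment off the end of the path at a time until it reaches the root.

```python
def ancestors(path):
    """List a path followed by each of its parents, ending at the root."""
    segments = [s for s in path.split("/") if s]
    result = []
    for end in range(len(segments), 0, -1):
        prefix = "/" + "/".join(segments[:end])
        # The post itself has no trailing slash; archive levels do.
        result.append(prefix if end == len(segments) else prefix + "/")
    result.append("/")
    return result


for url in ancestors("/2020/09/12/some-blog-post"):
    print(url)
# /2020/09/12/some-blog-post
# /2020/09/12/
# /2020/09/
# /2020/
# /
```

Every line it prints is meant to be a page a human could land on and find meaningful content.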
This URL structure is an artifact from the early days of blogging when we had much debate exactly like this.
Yup, of course I know and understand the origin. Moreover, there were some blogging tools that generated physical/logical server directories with text files to store the data.
Nowadays you don't need those params, and the average user will hardly ever edit the URL, so you build usable UIs with nice options to search and filter through your data to provide content to your users/customers. That implies providing links that show, correctly and in a non-verbose way, what the page is all about, especially because most of the time the URL string is shown truncated so you cannot read it all.
Moreover, if you provide a good search across your data (title, content, category, tags, media alts and so on), it will be much faster than only giving options to filter (by year, category and tags).
Also for RSS sharing getting a link like:
domain/post-title
Will be preferred in comparison to:
domain/year/month/day/post-title
Yeah I’m thinking about this more than I should. I really don’t know what to do, let alone how to structure my site. I’ll be uncool and 301 to wherever I end up, lol.