Andrew

SEO Part1: Help Search Engine Understand Our Website

This is the second article in this SEO series. In this article, I will explain how to help the search engine understand our website better.

Search engine robots analyze our website very differently from human visitors. The most important criterion for a good website is still the user experience, but since the search engine has to automate its analysis, it relies on a lot of extra information to understand our website. It's important to make sure that this extra information matches our content. Otherwise, the search engine might get confused about the content we want to provide to our users.
Now, let's dive into the details.

robots.txt

robots.txt is a guideline for search engine robots. It tells them which parts of our website they should not crawl, so we can use it to keep crawlers away from some of our pages.
If we provide the wrong setting, we might accidentally block the search engine and keep our website from showing up on the SERP (Search Engine Results Page).

We can provide the robots.txt file for the entire website.

User-agent: *
Disallow:
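
Before publishing a robots.txt file, it can help to check what a rule really allows. The following is a minimal sketch using Python's standard urllib.robotparser module. The page URL and the helper name is_allowed are only assumptions for illustration.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# An empty Disallow value blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# A single slash blocks the whole site, which is the accidental setting to watch for.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Hypothetical page URL, used only for illustration.
PAGE = "https://www.oahehc.com/articles/seo"

print(is_allowed(ALLOW_ALL, PAGE))  # True
print(is_allowed(BLOCK_ALL, PAGE))  # False
```

The only difference between the two settings is one slash, which is why this mistake is easy to make.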

Or we can use a meta tag to specify the setting for a specific page.

<meta name="robots" content="noindex">

sitemap

We can provide a sitemap to tell the search engine which pages we want it to index.

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"> 
  <url>
    <loc>https://www.oahehc.com</loc>
    <priority>0.90</priority>
    <changefreq>always</changefreq>
    <lastmod>2020-07-18T12:59:15.983Z</lastmod>
  </url>
  ...
</urlset>
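
For a website with many pages, we would usually generate the sitemap instead of writing it by hand. Here is a rough sketch with Python's standard xml.etree.ElementTree module. The function name build_sitemap and the shape of the input dicts are assumptions, not part of any sitemap tooling.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages: list[dict[str, str]]) -> str:
    """Build sitemap XML from dicts with a 'loc' key and optional extra fields."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        url = ET.SubElement(urlset, "url")
        # Only write the fields that the page actually provides.
        for field in ("loc", "lastmod", "changefreq", "priority"):
            if field in page:
                ET.SubElement(url, field).text = page[field]
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([{"loc": "https://www.oahehc.com", "priority": "0.90"}]))
```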

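Once we finish our sitemap, we can submit it to the search engine. Taking Google as an example, we can handle it in Google Search Console or send an HTTP request. Note that Google has since deprecated the ping endpoint shown here, so Search Console is the safer choice today.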

http://www.google.com/ping?sitemap=<complete_url_of_sitemap>

Another thing worth mentioning is that the search engine has a limited crawl budget for each website. So if we have tens of thousands of pages on our website, we can list the most important pages first or set the priority properly.

URL

The URL is another criterion that the search engine might reference. Keeping our URLs in a clean directory structure not only makes them easier to manage, but also makes them easier for the search engine to understand.
If possible, include keywords and context in the URL so that it matches our website content.
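
One common way to get keywords into the URL is to derive the last path segment from the page title. Below is a small sketch of that idea. The helper names slugify and article_url, and the /category/title layout, are assumptions for illustration only.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a page title into a lowercase, hyphen-separated URL segment."""
    # Strip accents so that "Café" becomes "Cafe".
    ascii_text = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    # Replace every run of non-alphanumeric characters with a single hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")


def article_url(category: str, title: str) -> str:
    """Build a directory-style path such as /seo/some-title (hypothetical layout)."""
    return f"/{slugify(category)}/{slugify(title)}"


print(article_url("SEO", "Help Search Engine Understand Our Website"))
```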

HTTP status code

Now that the search engine has our URL, it can send a request and get the content. But before it starts analyzing the content, it checks the HTTP status code that our server responds with. Providing the correct status code keeps the search engine from getting confused. There are two common mistakes:

  1. 301 (permanent redirect) vs 302 (temporary redirect)
    • Replacing an old website with a new one should use 301.
    • If we have different URLs for the desktop and mobile websites and redirect our users based on their device, we should use 302.
  2. A 404 (not found) page should actually return the 404 status code.
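
The two rules above can be sketched as a small routing function. This is only an illustration: the paths, the mobile host, and the function name respond are made up, and a real server would plug this logic into its own framework.

```python
from http import HTTPStatus
from typing import Optional

# Hypothetical routing data, used only for illustration.
MOVED_PERMANENTLY = {"/old-blog": "/blog"}  # old URLs that were replaced for good
MOBILE_HOST = "https://m.oahehc.com.tw"
KNOWN_PATHS = {"/", "/blog"}


def respond(path: str, is_mobile: bool) -> tuple[HTTPStatus, Optional[str]]:
    """Return the status code and the redirect target (if any) for a request."""
    if path in MOVED_PERMANENTLY:
        # The old page is gone for good, so the redirect is permanent (301).
        return HTTPStatus.MOVED_PERMANENTLY, MOVED_PERMANENTLY[path]
    if path not in KNOWN_PATHS:
        # A missing page must not answer with 200 or a redirect.
        return HTTPStatus.NOT_FOUND, None
    if is_mobile:
        # The desktop URL is still valid, only this visitor is sent elsewhere (302).
        return HTTPStatus.FOUND, MOBILE_HOST + path
    return HTTPStatus.OK, None
```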

meta tag

In this step, the search engine already has the content of our website. It will check the HTML head first.

title and description

  <title>title</title>
  <meta name="description" content="description">

The search engine reads the title and description to get a basic understanding of our website, so it's important to provide proper information. For example:
(a) avoid stop words
(b) match the website content
(c) include the keywords, but don't overuse them
(d) on a page with multiple tabs, provide a different title/description for each tab
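
Guidelines like these are easier to follow when we check them automatically. As a starting point, here is a sketch that only detects a missing title or description, using Python's standard html.parser module. The class and function names are assumptions, and the content rules (a) to (d) would still need a human review.

```python
from html.parser import HTMLParser


class HeadInfo(HTMLParser):
    """Collect the <title> text and the description meta tag of one page."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attributes.get("name") == "description":
            self.description = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def check_head(html: str) -> list[str]:
    """Return a list of problems found in the title and description."""
    info = HeadInfo()
    info.feed(html)
    problems = []
    if not info.title.strip():
        problems.append("missing title")
    if not info.description.strip():
        problems.append("missing description")
    return problems
```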

social media

If we want to optimize how our website appears on social media, we should provide the proper meta tags. Taking Facebook as an example, we should include the following information inside the head tag.

  <meta property="og:url" content="https://www.oahehc.com.tw" />
  <meta property="og:type" content="article" />
  <meta property="og:title" content="title" />
  <meta property="og:description" content="description" />
  <meta property="og:image" content="https://cdn.oahehc.com.tw/logo.png" />
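
If these tags are generated from page data, the values should be escaped so that a quote in a title cannot break the attribute. A minimal sketch with Python's standard html.escape follows. The function name open_graph_tags and the input dict are assumptions for illustration.

```python
from html import escape


def open_graph_tags(page: dict[str, str]) -> str:
    """Render one <meta property="og:..."> line per entry in page."""
    lines = []
    for key, value in page.items():
        # quote=True also escapes quotes, which would otherwise end the attribute.
        safe_key = escape(key, quote=True)
        safe_value = escape(value, quote=True)
        lines.append(f'<meta property="og:{safe_key}" content="{safe_value}" />')
    return "\n".join(lines)


print(open_graph_tags({"type": "article", "title": 'A "quoted" title'}))
```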

And don't forget to use their tool, the FB Sharing Debugger, to make sure everything works as we expect.

link tag

It's important to prevent duplicate content on our website, because duplicates make it hard for the search engine to decide which page it should show to its users. Therefore, another piece of information we might need to provide in the head tag is the link tag.

If we have different URLs for the desktop and mobile websites, we can provide a canonical link and an alternate link to tell the search engine to treat them as the same page.

<link rel="canonical" href="https://www.oahehc.com.tw">
<link rel="alternate" href="https://m.oahehc.com.tw" media="only screen and (max-width: 640px)">
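
The two tags above belong in the head of the desktop page. The sketch below also covers the mobile page, under the assumption that it should carry a canonical link pointing back to the desktop URL. The function names are hypothetical.

```python
from html import escape

# Media query copied from the example above.
MOBILE_MEDIA = "only screen and (max-width: 640px)"


def desktop_head_links(desktop_url: str, mobile_url: str) -> list[str]:
    """Link tags for the desktop page: itself as canonical, mobile as alternate."""
    return [
        f'<link rel="canonical" href="{escape(desktop_url)}">',
        f'<link rel="alternate" href="{escape(mobile_url)}" media="{MOBILE_MEDIA}">',
    ]


def mobile_head_links(desktop_url: str) -> list[str]:
    """Link tag for the mobile page: it points back to the desktop URL."""
    return [f'<link rel="canonical" href="{escape(desktop_url)}">']
```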

Semantic HTML tags

<div>
  <div>1. xxx</div>
  <div>2. ooo</div>
  <div>3. ...</div>
</div>

<ol>
  <li>xxx</li>
  <li>ooo</li>
  <li>...</li>
</ol>

With proper CSS styling, the two examples above look exactly the same to the human eye, but they are quite different from the search engine's point of view.
Here are a few basic guidelines for choosing HTML tags properly:

  1. Use h1~h6 for headings, and keep only one h1 tag on each page
  2. Structure the page with semantic elements like header, main, aside, footer, section, nav, ...
  3. Use table/tr/th/td for tables
  4. Use ul/ol/li for lists
  5. Don't mix up button & a: use a for navigation and button for actions
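
Some of these guidelines can be checked by a script. The sketch below counts start tags with Python's standard html.parser module and covers only two of the rules (exactly one h1, and a main element). The names TagCounter and semantic_warnings are assumptions for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Count how often each start tag appears in a document."""

    def __init__(self) -> None:
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def semantic_warnings(html: str) -> list[str]:
    """Return warnings for the heading and landmark rules this sketch covers."""
    parser = TagCounter()
    parser.feed(html)
    warnings = []
    if parser.counts["h1"] != 1:
        warnings.append(f"expected exactly one h1, found {parser.counts['h1']}")
    if parser.counts["main"] == 0:
        warnings.append("no main element")
    return warnings
```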

Using semantic HTML tags not only helps the search engine understand our webpage better; sometimes we also gain an extra bonus from it.
Google Search shows a featured snippet if there is a proper answer to the search question. If our content is summarized in a table or list, it has a higher chance of being chosen as one.

HTML tags attributes

Besides choosing proper HTML tags, sometimes we have to provide extra attributes to add more information.

For example, adding alt to the img tag to describe the image not only helps the search engine understand it better, but also makes our website friendlier for people who browse it with a screen reader.

<img src="https://cdn.oahehc.com.tw/dog.png" alt="dog" />

When we use an image as the content of a hyperlink, adding a title attribute to provide extra information is also good practice.

<a href="./dog" title="dog list">
  <img src="https://cdn.oahehc.com.tw/dog.png" alt="dog" />
</a>
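
Missing alt attributes are easy to overlook, so a small check can help. The sketch below lists every img tag without alt text, again using Python's standard html.parser module. As a simplification it also reports an empty alt, which can be a deliberate choice for purely decorative images. The names are assumptions for illustration.

```python
from html.parser import HTMLParser


class MissingAltFinder(HTMLParser):
    """Collect the src of every <img> that has no non-empty alt attribute."""

    def __init__(self) -> None:
        super().__init__()
        self.missing = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and not (attributes.get("alt") or "").strip():
            self.missing.append(attributes.get("src") or "")


def images_missing_alt(html: str) -> list[str]:
    """Return the src values of images that lack alt text."""
    finder = MissingAltFinder()
    finder.feed(html)
    return finder.missing
```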

Structured Data

If we want to provide more information to the search engine, we can add structured data to our website.
There are several categories of structured data that Google displays as richer features in search results.
To learn more, we can check this article - Explore the search gallery.

To add structured data to our website, we can add it directly to the HTML tags:

<ol itemscope itemtype="http://schema.org/BreadcrumbList">
  <li itemprop="itemListElement" itemscope
      itemtype="http://schema.org/ListItem">
    <a itemprop="item" href="https://example.com/dresses">
    <span itemprop="name">Dresses</span></a>
    <meta itemprop="position" content="1" />
  </li>
  <li itemprop="itemListElement" itemscope
      itemtype="http://schema.org/ListItem">
    <a itemprop="item" href="https://example.com/dresses/real">
    <span itemprop="name">Real Dresses</span></a>
    <meta itemprop="position" content="2" />
  </li>
</ol>

Or we can create a script tag and put all the information in JSON format:

<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "item": {
        "@id": "https://example.com/dresses",
        "name": "Dresses"
      }
    },
    {
      "@type": "ListItem",
      "position": 2,
      "item": {
        "@id": "https://example.com/dresses/real",
        "name": "Real Dresses"
      }
    }
  ]
}
</script>
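
Writing this JSON by hand for every page gets tedious, so we might generate it from the breadcrumb trail. Here is a minimal sketch with Python's standard json module. The function name breadcrumb_json_ld and the (name, url) input format are assumptions for illustration.

```python
import json


def breadcrumb_json_ld(trail: list[tuple[str, str]]) -> str:
    """Render (name, url) pairs as a BreadcrumbList script tag."""
    data = {
        "@context": "http://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {
                "@type": "ListItem",
                "position": position,
                "item": {"@id": url, "name": name},
            }
            # Positions start at 1, following the example above.
            for position, (name, url) in enumerate(trail, start=1)
        ],
    }
    body = json.dumps(data, indent=2)
    # Escape "</" so that text inside the data cannot close the script tag early.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


print(breadcrumb_json_ld([("Dresses", "https://example.com/dresses")]))
```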

Once we finish the structured data, don't forget to test it with the Structured Data Testing Tool to make sure the format is correct.

Tools

You might feel overwhelmed after seeing how many extra tasks we have to deal with. Luckily, there are tools that can help us.

  • ESLint
    ESLint not only helps us prevent syntax errors; with the proper plugin added, it can also point out missing attributes on our HTML tags.

  • Lighthouse
    Lighthouse is a built-in feature of Chrome that we can use to identify common problems on our website. For now, we can focus on the Accessibility & SEO categories. We just need to click Generate report, and Lighthouse will tell us how to fix the problems on our website.

  • Google Search Console
    Once we publish our website, we should definitely start using Google Search Console. It not only provides information for measuring search traffic, but also points out the problems Google runs into when it tries to crawl our website. Fixing those problems helps the search engine understand our website better.


Conclusion

Now we know how to make our website easier for the search engine to understand. In the next article, I will focus on the most important question - how to improve the user experience for our real users - SEO Part2: Improve User Experience & SEO
