Anish Kumar

Posted on Aug 31, 2021

Applying tree traversal algorithms to DOM

#javascript #datastructures #algorithms #html

We've looked through few binary tree traversal techniques so far:

1- Traversing through binary tree using recursive and iterative algorithms

2- Traversing through binary tree using parent pointers

In this article we'll put those learnings to use for an n-ary tree i.e. DOM. We'll see how we can locate DOM elements using various CSS selectors without using inbuilt APIs like getElementById, getElementsByClassname or querySelector/querySelectorAll. The article would thus also throw light on how these APIs might be working under the hood.

DOM traversal overview

Borrowing the idea from first article, let's come up with the preOrder traversal algorithm for DOM:

function walkPreOrder(node){
  if(!node) return

  // do something here
  console.log(node)

  for(let child of node.children){
     walkPreOrder(child)
  }
}

We can modify this algorithm to return an iterator instead:

function* walkPreOrder(node){
  if(!node) return

  // do something here
  yield node
  for(let child of node.children){
    yield* walkPreOrder(child)
  }
}

// USAGE
for(let node of walkPreOrder(root)){
  console.log(node)
}

We can use any of breadth first or depth first algorithms (discussed in previous articles) to traverse the DOM. For the sake of this article, we'll stick with the above approach though.

Let's also assume we're working on a document having following HTML:

<html>
  <head>
    <title>DOM selection algorithm</title>
  </head>
<body>

  <div class="container">
    <div class="body">
      <div class="row">
        <img id="profile" src="xyz.jpg" alt="">
      </div>
      <div class="row"></div>
      <div class="row"></div>
    </div>
  </div>

</body>
</html>

Locating a node by ID

Browsers offer document.getElementById() API to achieve this result. Using the walkPreOrder() helper it becomes really simple to achieve this. Let's see:

function locateById(nodeId){
  // iterate through all nodes in depth first (preOrder) fashion
  // return the node as soon as it's found
  for(let node of walkPreOrder(document.body)){
     if(node.id === nodeId){
        return node
     }
  }
   return null
}

We can use the locateById() function as follows:

const img = locateById('profile')
// returns the image node

Locating nodes by className

Browsers offer document.getElementsByClassName() API to achieve this result. Let's see how we can implement something similar:

function locateAllByClassName(className){
   const result = []
   for(let node of walkPreOrder(document.body)){
      if(node.classList.contains(className)){
        result.push(node)
      }
   }
   return result
}

// USAGE
const elements = locateAllByClassName('row')

How browser optimizes the selection queries

Selecting DOM node is a fairly common operation for web applications. Traversing through the tree multiple times for the same selector doesn't seem optimal. Browser optimizes the selection by using memoization.

Looking at mozilla parser's source, namely an excerpt from the function startTag:

 // ID uniqueness
 @IdType String id = attributes.getId();
 if (id != null) {
      LocatorImpl oldLoc = idLocations.get(id);
      if (oldLoc != null) {
            err("Duplicate ID \u201C" + id + "\u201D.");
            errorHandler.warning(new SAXParseException(
                  "The first occurrence of ID \u201C" + id
                  + "\u201D was here.", oldLoc));
       } else {
            idLocations.put(id, new LocatorImpl(tokenizer));
       }
 }

We can see that node IDs are kept in a simple hash map. It's done to ensure repeated queries for the same ID do not require full traversal, instead we can just look it up from hashMap and return it.

Here's how our solution would look like post memoization:

function getSelectors(){
  const idLocations = {}
  const classLocations = {}

  // updated selector functions  
  function locateById(nodeId){
    if(idLocations.hasOwnProperty(nodeId)) 
       return idLocations[nodeId]

    for(let node of walkPreOrder(document.body)){
       if(node.id === nodeId){
          idLocations[nodeId]= node //memoize
          return node
       }
     }
    idLocations[nodeId]= null // memoize
    return null
  }

  function locateAllByClassName(className){
    if(classLocations.hasOwnProperty(className)) 
         return classLocations[className]

    const result = []
    for(let node of walkPreOrder(document.body)){
       if(node.classList.contains(className)){
          result.push(node)
        }
     }
     classLocations[nodeId]= result
     return result
  }

  return {
       locateById,
       locateAllByClassName
    }

} 

  // USAGE
  const {locateById, locateAllByClassName} = getSelectors();
  const result = locateAllByClassName('row') // returns array of elements
  const img = locateById('profile') // returns an element, if found

Dealing with more complex selectors

Let's try to implement something like element.querySelector. Here's how MDN describes it:

The querySelector() method of the Element interface returns the first element that is a descendant of the element on which it is invoked that matches the specified group of selectors.

Example:

const firstRow = document.querySelector('.container .row:first-child')

In this case we can pass any CSS selector to the function and it should be able to traverse the DOM to find that element for us. Let's see it we how it can be implemented:

function myQuerySelector(selector){
  const path = selector.split(' ').map(str => str.trim())

  let currentNode = document.body
  while(path.length && currentNode){

    const currentSelector = path.shift()
    let found = false

    for(let node of walkPreOrder(currentNode)){
      if(node.matches(currentSelector)){
        currentNode = node
        found = true
        break
      }
    }

    if(!found) currentNode = null
  }
  return currentNode
}

// USAGE:
const firstRow = myQuerySelector('.container .row:first-child')

Implementation of myQuerySelectorAll (similar to element.querySelectorAll) also follows the same approach with slight modification:

function myQuerySelectorAll(selector){
  const path = selector.split(' ').map(str => str.trim())
  const result = []

  let currentNode = document.body
  while(path.length && currentNode){

    const currentSelector = path.shift()

    for(let node of walkPreOrder(currentNode)){
      if(node.matches(currentSelector)){
        currentNode = node
        result.push(currentNode)
      }
    }
  }
  return result
}

Bonus

We can use the recursive preOrder traversal approach, describe at the start of this article, to clone any tree. Let's see how we can use it to clone any DOM tree, similar to what element.cloneNode(true) does:

Create a clone of source node, by create a new node with same tagName and then copying over the attributes.
Recursively call cloneTree method on all children of the source node, and append the returned nodes as children to cloned node.

function cloneTree(node){
  if(!node) return

  const clonedNode = document.createElement(node.tagName.toLowerCase())
  const attributes = node.getAttributeNames()

  attributes.forEach(attribute => {
     clonedNode.setAttribute(attribute, node.getAttribute(attribute))
  })

  for(const child of node.children){
      clonedNode.append(cloneTree(child))
  }

  return clonedNode
}

This article has been originally published at StackFull.dev. If you enjoyed reading this, you may want to opt for my newsletter. It would let me reach out to you whenever I publish a new thought!

DEV Community

Applying tree traversal algorithms to DOM

DOM traversal overview

Locating a node by ID

Locating nodes by className

How browser optimizes the selection queries

Dealing with more complex selectors

Bonus

Top comments (0)

Read next

State of JS 2024

Why Code Reuse is Important in the Age of AI

2924. Find Champion II

Sortie de TypeScript 5.7 RC, nouvelle alternative à npm, avenir des frameworks JavaScript - Cette semaine en JS