Antonello Zanini for AppSignal

Posted on • Originally published at blog.appsignal.com

How to Perform Data Validation in Node.js

Data validation is essential to avoid unexpected behavior, prevent errors, and improve security. It can be performed both on a web page — where data is entered — and on the server, where the data is processed.

In this tutorial, we'll explore data validation in the Node.js backend. Then, you'll learn how to implement it in Express using the express-validator library.

Get ready to become a Node.js data validation expert!

What Is Data Validation?

Data validation ensures that data (whether entered or provided) is correct, consistent, and useful for its intended purpose. This process is typically performed in frontend applications, such as when dealing with forms.

It is just as important to validate data in the backend. There, data validation involves checking path and query parameters, as well as the body data sent to the server. This ensures that the data received by each API endpoint meets the specified criteria, which prevents errors, reduces vulnerabilities, and keeps your application running smoothly.
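To make the idea of checking body data concrete before any library is involved, here is a minimal hand-written sketch. The validateNewUser() helper, its field names, and the deliberately loose email pattern are assumptions made for illustration only; they are not part of any library.

```javascript
// Hypothetical helper: validates the body of a "create user" request by hand.
// Returns an array of error messages, which is empty when the body is valid.
function validateNewUser(body) {
  const errors = [];
  const data = body ?? {};

  if (typeof data.fullName !== "string" || data.fullName.trim() === "") {
    errors.push("fullName must be a non-empty string");
  }
  // deliberately loose check: real email validation is far more thorough
  if (typeof data.email !== "string" || !/^\S+@\S+\.\S+$/.test(data.email)) {
    errors.push("email must look like an email address");
  }
  if (!Number.isInteger(data.age) || data.age < 18) {
    errors.push("age must be an integer greater than or equal to 18");
  }

  return errors;
}

// a valid body produces no errors
console.log(validateNewUser({ fullName: "John Doe", email: "john.doe@example.com", age: 30 }));
// an invalid body produces one error per broken rule
console.log(validateNewUser({ fullName: "  ", email: "nope", age: 12 }));
```

Writing checks like these by hand for every endpoint quickly becomes repetitive, which is the gap a validation library is meant to fill.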

Main Benefits of Validating Incoming Requests in Node.js

Validating incoming requests in Node.js offers several key benefits:

  • Enhancing security: Mitigate some threats, such as injection attacks and data breaches. Proper validation prevents attackers from exploiting vulnerabilities in your application by sending it malformed or malicious data.
  • Improving reliability: Ensure that only valid and sanitized data is processed and stored in the backend application. That enhances the overall integrity and reliability of the data, leading to a more robust and trustworthy server.
  • Maintaining compliance: Make sure that the data handled by the server adheres to specific data format requirements or meets internal coding standards.

Now that you know what data validation is and why you should enforce it in your Node.js application, let's see how to do it in this step-by-step tutorial!

Prerequisites

To follow this tutorial, you need a Node.js 18+ application with a few endpoints. For example, the following Express server is perfect:

const express = require("express");

// initialize an Express server
const app = express();
app.use(express.json());

// an array to use as an in-memory database
const users = [
  { id: 1, email: "john.doe@example.com", fullName: "John Doe", age: 30 },
  { id: 2, email: "jane.smith@example.com", fullName: "Jane Smith", age: 25 },
  { id: 3, email: "bob.johnson@example.com", fullName: "Bob Johnson", age: 40 },
  {
    id: 4,
    email: "alice.williams@example.com",
    fullName: "Alice Williams",
    age: 35,
  },
  { id: 5, email: "mike.brown@example.com", fullName: "Mike Brown", age: 28 },
  { id: 6, email: "sara.taylor@example.com", fullName: "Sara Taylor", age: 33 },
  { id: 7, email: "chris.lee@example.com", fullName: "Chris Lee", age: 22 },
  { id: 8, email: "emily.davis@example.com", fullName: "Emily Davis", age: 45 },
  {
    id: 9,
    email: "alex.johnson@example.com",
    fullName: "Alex Johnson",
    age: 27,
  },
  {
    id: 10,
    email: "lisa.wilson@example.com",
    fullName: "Lisa Wilson",
    age: 38,
  },
];

// define three sample endpoints
app.get("/api/v1/users/:userId", (req, res) => {
  const userId = req.params.userId;
  // find a user by id
  const user = users.find((user) => user.id == userId);

  if (!user) {
    res.status(404).send("User not found!");
  } else {
    res.send({
      user: user,
    });
  }
});

app.get("/api/v1/users", (req, res) => {
  // select all users by default
  let filteredUsers = users;

  const search = req.query.search;
  if (search !== undefined) {
    // filter users by fullName with a case-insensitive search
    filteredUsers = users.filter((user) => {
      return user.fullName.toLowerCase().includes(search.toLowerCase());
    });
  }

  res.send({
    users: filteredUsers,
  });
});

app.post("/api/v1/users", (req, res) => {
  const newUser = req.body;
  const maxId = users.reduce((max, user) => (user.id > max ? user.id : max), 0);

  // add a new user with an auto-incremented id
  users.push({
    id: maxId + 1,
    ...newUser,
  });

  res.status(201).send();
});

// start the server locally on port 3000
const port = 3000;
app.listen(port, () => {
  console.log(`Server listening at http://localhost:${port}`);
});

This defines a local variable named users as an in-memory database. Then, it initializes an Express application with the following three endpoints:

  • GET /api/v1/users/:userId: To retrieve a single user from the users array based on its id.
  • GET /api/v1/users: To get the list of users in the database. It accepts an optional search query parameter to filter users by their full name.
  • POST /api/v1/users: To add a new user to the users array.

Next, you'll need a library to perform data validation in Node.js. express-validator is one of the most popular options for Express applications.

express-validator provides a set of Express middleware to validate and sanitize incoming data to server APIs. Behind the scenes, these middleware functions are powered by validator.js, one of the most widely used string validation libraries in the JavaScript ecosystem.

What makes express-validator so successful is its rich set of features and intuitive syntax for validating Express endpoints.
It also provides tools for determining whether a request is valid, functions for accessing sanitized data, and more.

Add the express-validator npm package to your project's dependencies with:

npm install express-validator

Perfect! You now have everything you need to perform data validation in an Express application.

For a faster setup, clone the GitHub repository supporting this guide:

git clone https://github.com/Tonel/nodejs-express-validator-demo

You'll find the Express server above and further implementations in dedicated branches.

express-validator supports two equivalent ways to implement data validation:

  1. Validation Chain: Define the validation rules by calling one method after another through method chaining.
  2. Schema Validation: Define the validation rules in an object-based schema to match against the incoming data.

Let's dive into both!

Data Validation in Node.js With Validation Chains

Learn how to implement data validation through validation chains in express-validator.

Understand Validation Chains

In express-validator, validation chains always begin with one of the following middleware functions:

  • check(): Creates a validation chain for the selected fields in any of the req.body, req.cookies, req.headers, req.query, or req.params locations. If the specified fields are present in more than one location, the validation chain processes all instances of that field's value.
  • body(): Same as check(), but it only checks fields in req.body.
  • cookie(): Same as check(), but it only checks fields in req.cookies.
  • header(): Same as check(), but it only checks fields in req.headers.
  • param(): Same as check(), but it only checks fields in req.params.
  • query(): Same as check(), but it only checks fields in req.query.

These middleware functions accept one or more field names to select from incoming data. Because JavaScript functions are first-class objects, the returned middleware can also expose methods. Each of these methods returns the chain itself, which is what enables the method chaining pattern.
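To illustrate how a middleware function can also carry chainable methods, here is a toy sketch of the pattern. It is not how express-validator is implemented: createChain(), its two rules, and the req.toyErrors property are all hypothetical names invented for this example.

```javascript
// Toy illustration only: NOT express-validator's real implementation.
// createChain() returns a middleware-shaped function that also carries
// chainable methods, each of which returns that same function.
function createChain(field) {
  const rules = [];

  // the function itself has the Express middleware signature
  const chain = (req, res, next) => {
    const value = req.body?.[field];
    req.toyErrors = rules
      .filter((rule) => !rule.test(value))
      .map((rule) => ({ path: field, msg: rule.msg }));
    next();
  };

  // functions are objects, so methods can be attached to them
  chain.notEmpty = () => {
    rules.push({
      test: (v) => typeof v === "string" && v.length > 0,
      msg: "must not be empty",
    });
    return chain; // returning the chain enables method chaining
  };
  chain.isLength = ({ min }) => {
    rules.push({
      test: (v) => typeof v === "string" && v.length >= min,
      msg: `must have at least ${min} characters`,
    });
    return chain;
  };

  return chain;
}

const middleware = createChain("name").notEmpty().isLength({ min: 4 });

// simulate Express calling the middleware with a too-short name
const req = { body: { name: "Bob" } };
middleware(req, {}, () => console.log(req.toyErrors));
// prints one error: "must have at least 4 characters" for path "name"
```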

Let's assume that you want to ensure that the name body parameter contains at least 4 characters when it is present. This is how you can specify that with an express-validator validation chain:

body("name").optional().isLength({ min: 4 });

Each method chain will return a valid Express middleware function you can use for validation in a route handler.
A single route handler can have one or more validation middleware, each referring to different data fields.

Check out the express-validator documentation for all validation chain methods.

Time to see validation chains in action!

Validate Route Parameters

Suppose you want to ensure that the userId parameter in GET /api/v1/users/:userId is an integer. This is what you may end up writing:

app.get("/api/v1/users/:userId", param("userId").isInt(), (req, res) => {
  // business logic...
});

While the validation chain defined above is correct, it's not enough, as express-validator doesn't report validation errors to users automatically. Why? Because it's better if developers always manually define how to handle invalid data!

You can access the result object of data validation in an Express endpoint through the validationResult() function.

Import it along with the validation middleware functions you need from express-validator:

const {
  check,
  body,
  param,
  // ...
  validationResult,
} = require("express-validator");

Then, you can define the validated route handler for GET /api/v1/users/:userId as below:

app.get("/api/v1/users/:userId", param("userId").isInt(), (req, res) => {
  // extract the data validation result
  const result = validationResult(req);
  if (result.isEmpty()) {
    const userId = req.params.userId;
    // find a user by id
    const user = users.find((user) => user.id == userId);

    if (!user) {
      res.status(404).send("User not found!");
    } else {
      res.send({
        user: user,
      });
    }
  } else {
    res.status(400).send({ errors: result.array() });
  }
});

When userId is not an integer, the endpoint will return a 400 response with validation error messages. In production, you should override that response with a generic message to avoid providing useful information to potential attackers.
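One possible way to return a generic message in production is sketched below. The toClientErrorBody() helper is hypothetical, and relying on the NODE_ENV environment variable to detect production is an assumption about your deployment, not something express-validator does for you.

```javascript
// Hypothetical helper: decides which validation details to expose to clients.
// "errors" is expected to be the array returned by result.array().
// Assumption: NODE_ENV is set to "production" on production servers.
function toClientErrorBody(
  errors,
  isProduction = process.env.NODE_ENV === "production"
) {
  if (isProduction) {
    // hide field names, values, and messages from potential attackers
    return { message: "Invalid request" };
  }
  // expose the full details while developing
  return { errors };
}

// sample error object, made up for this example
const sampleErrors = [
  { type: "field", value: "edw", msg: "Invalid value", path: "userId", location: "params" },
];

console.log(toClientErrorBody(sampleErrors, true)); // { message: 'Invalid request' }
console.log(toClientErrorBody(sampleErrors, false)); // { errors: [ ... ] } with full details
```

In the route handler, the 400 branch would then become res.status(400).send(toClientErrorBody(result.array())).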

Validate Query Parameters

In this case, you want to ensure that the search query parameter in the GET /api/v1/users endpoint is not empty or blank when it is present.

This is the validation chain you need to define:

query("search").optional().trim().notEmpty();

Note that the order of the method calls in the chain is important. If you switch trim() with notEmpty(), blank strings will pass the validation rule.

trim() is a sanitizer method: it modifies the value of search that the rest of the chain, including notEmpty(), operates on.

To access the validated and sanitized data in the route handler, call the matchedData() function, which you can import from express-validator alongside validationResult(). The validated and sanitized endpoint will now be specified as follows:

app.get(
  "/api/v1/users",
  query("search").optional().trim().notEmpty(),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      // select all users by default
      let filteredUsers = users;

      // read the matched query data from "req"
      const data = matchedData(req);
      const search = data.search;
      if (search !== undefined) {
        // filter users by fullName with a case-insensitive search
        filteredUsers = users.filter((user) => {
          return user.fullName.toLowerCase().includes(search.toLowerCase());
        });
      }

      res.send({
        users: filteredUsers,
      });
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);

Notice that the handler reads search from matchedData() rather than from req.query.search. matchedData() returns only the fields that were validated, with their sanitized values, so it is the safest place to read request data from.

Validate Body Data

Now, suppose you want the body of POST /api/v1/users to follow these rules:

  • fullName must not be empty or blank.
  • email must be a valid email. If there's an invalid email, the validation error message should be “Not a valid e-mail address.”
  • age must be an integer greater than or equal to 18.

This is how you can implement the desired data validation:

app.post(
  "/api/v1/users",
  body("fullName").trim().notEmpty(),
  body("email").isEmail().withMessage("Not a valid e-mail address"),
  body("age").isInt({ min: 18 }),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      // read the matched body data from "req"
      const newUser = matchedData(req);
      const maxId = users.reduce(
        (max, user) => (user.id > max ? user.id : max),
        0
      );

      // add a new user with an auto-incremented id
      users.push({
        id: maxId + 1,
        ...newUser,
      });

      res.status(201).send();
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);

There are a couple of critical aspects to emphasize in this example. First, a single route handler can have multiple validation middlewares referring to the same data source. Second, the methods offered by the middleware functions not only specify how to validate the data, but also allow you to customize error messages and more.
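The route handlers above all repeat the same validationResult() check. One possible way to factor it out is a small guard middleware placed after the validation chains. This is only a sketch: createValidationGuard() is a hypothetical name, and validationResult is passed in as an argument so that the helper does not depend on how express-validator is imported.

```javascript
// Sketch of a reusable guard middleware.
// "validationResult" is injected, so the helper works with whatever
// function your application imports from express-validator.
function createValidationGuard(validationResult) {
  return function validationGuard(req, res, next) {
    const result = validationResult(req);
    if (result.isEmpty()) {
      // no validation errors: continue to the route handler
      return next();
    }
    // stop the request here and report the validation errors
    res.status(400).send({ errors: result.array() });
  };
}

// Wiring it up (assumes the Express app and imports from this tutorial):
//
//   const rejectInvalid = createValidationGuard(validationResult);
//
//   app.post(
//     "/api/v1/users",
//     body("fullName").trim().notEmpty(),
//     body("email").isEmail().withMessage("Not a valid e-mail address"),
//     body("age").isInt({ min: 18 }),
//     rejectInvalid,
//     (req, res) => {
//       // only runs for valid requests
//     }
//   );
```

With a guard like this, each route handler can focus on business logic, at the cost of every endpoint sharing the same error response format.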

Test the Validated Endpoints

Launch your Node.js application and verify that the data validation logic works as expected. Alternatively, check out the chain-validation branch from the repository that supports this article:

git checkout chain-validation

Install the project dependencies and launch an Express development server:

npm install
npm run start

Your Node.js application should now be listening locally on port 3000. Open your favorite HTTP client and make a GET request to /api/v1/users/edw.

Since edw is not an integer, you'll get a 400 response.
Specifically, the error array generated by express-validator contains one or more error objects in the following format:

{
    "type": "field",
    "value": "edw",
    "msg": "Invalid value",
    "path": "userId",
    "location": "params"
}
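If a client needs the messages organized per field, error objects in the format shown above can be regrouped with a few lines of plain JavaScript. The groupErrorsByPath() helper and the sample data are hypothetical, and the sketch assumes every error is a "field" error with a path property.

```javascript
// Hypothetical helper: groups validation messages by the field they refer to.
// Assumes each error object has "path" and "msg" properties.
function groupErrorsByPath(errors) {
  const grouped = {};
  for (const { path, msg } of errors) {
    // create the array on first use, then append the message
    (grouped[path] ??= []).push(msg);
  }
  return grouped;
}

// sample input, made up for this example
const sampleValidationErrors = [
  { type: "field", value: "", msg: "Invalid value", path: "fullName", location: "body" },
  { type: "field", value: "foo", msg: "Not a valid e-mail address", path: "email", location: "body" },
  { type: "field", value: 15, msg: "Invalid value", path: "age", location: "body" },
];

console.log(groupErrorsByPath(sampleValidationErrors));
// {
//   fullName: [ 'Invalid value' ],
//   email: [ 'Not a valid e-mail address' ],
//   age: [ 'Invalid value' ]
// }
```

The result object returned by validationResult() also offers a mapped() method, which builds a similar object but keeps only the first error for each field.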

You can simulate another validation error by calling the GET /api/v1/users endpoint with a blank search query parameter, such as ?search=%20.

Similarly, trigger a validation error by calling the POST /api/v1/users API with an invalid body, for example one with an age below 18.

If you call the three endpoints with valid data instead, they will return successful responses.
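If you prefer a script over a graphical HTTP client, the invalid requests described above can be sent with the fetch() function built into Node.js 18+. This is a sketch that assumes the tutorial's server is running on http://localhost:3000; the invalidRequests list, its sample values, and the sendInvalidRequests() function are hypothetical names chosen for this example.

```javascript
// Requests that should each be rejected with a 400 response.
const invalidRequests = [
  // userId is not an integer
  { method: "GET", path: "/api/v1/users/edw" },
  // search is a single space, which is empty after trimming
  { method: "GET", path: "/api/v1/users?search=%20" },
  // every body field breaks one of the validation rules
  {
    method: "POST",
    path: "/api/v1/users",
    body: { fullName: "", email: "not-an-email", age: 15 },
  },
];

// Sends each request in turn and collects the response status codes.
// Assumption: the server from this tutorial is listening on baseUrl.
async function sendInvalidRequests(baseUrl = "http://localhost:3000") {
  const statuses = [];
  for (const { method, path, body } of invalidRequests) {
    const response = await fetch(baseUrl + path, {
      method,
      headers: body ? { "Content-Type": "application/json" } : undefined,
      body: body ? JSON.stringify(body) : undefined,
    });
    statuses.push({ method, path, status: response.status });
  }
  return statuses;
}

// With the server running, call:
//   sendInvalidRequests().then(console.table);
```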

Great, you just learned how to perform data validation in Node.js! All that remains is to explore the equivalent schema-based approach to data validation.

Data Validation in Node.js With Schema Validation

Let's now see how to define data validation through schema objects with express-validator.

Understand Schema Validation

In express-validator, schemas are an object-based way of defining validation and/or sanitization rules on a request.

While their syntax differs from validation chains, they offer exactly the same functionality. Under the hood, express-validator translates schemas into validation chain functions.
Thus, you can choose between one syntax or the other, depending on your preference.
The same Express application can have some endpoints validated with chains and others validated with schemas.

Schemas are simple JavaScript objects whose keys represent the fields to validate. Schema values contain validation rules in the form of objects.
Pass a schema object to the checkSchema() function to get an Express validation middleware.

For example, this is how you can use a schema to ensure that the name body parameter contains at least 4 characters when it is present:

checkSchema(
  {
    name: {
      optional: true,
      isLength: { options: { min: 4 } },
    },
  },
  ["body"]
);

By default, checkSchema() behaves like check(). To specify which input data sources it should check, you can pass them in an array as the second parameter. In the example above, the validation schema object will only be applied to the body data.

Sometimes, you may need to validate a body's inner field. Check out the documentation to see what express-validator offers in field selection.

Validate Route, Query, and Body Data With Schemas

Here's how you can specify the validation rules shown in the method chaining section above through schema validation:

  • GET /api/v1/users/:userId:
app.get(
  "/api/v1/users/:userId",
  checkSchema(
    {
      userId: { isInt: true },
    },
    ["params"]
  ),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      const userId = req.params.userId;
      // find a user by id
      const user = users.find((user) => user.id == userId);

      if (!user) {
        res.status(404).send("User not found!");
      } else {
        res.send({
          user: user,
        });
      }
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);
  • GET /api/v1/users:
app.get(
  "/api/v1/users",
  checkSchema(
    {
      search: { optional: true, trim: true, notEmpty: true },
    },
    ["query"]
  ),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      // select all users by default
      let filteredUsers = users;

      // read the matched query data from "req"
      const data = matchedData(req);
      const search = data.search;
      if (search !== undefined) {
        // filter users by fullName with a case-insensitive search
        filteredUsers = users.filter((user) => {
          return user.fullName.toLowerCase().includes(search.toLowerCase());
        });
      }

      res.send({
        users: filteredUsers,
      });
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);

Note that the order of the attributes in the schema object matters.
Placing the trim attribute after notEmpty will result in a different validation rule.

  • POST /api/v1/users:
app.post(
  "/api/v1/users",
  checkSchema(
    {
      fullName: {
        trim: true,
        notEmpty: true,
      },
      email: {
        errorMessage: "Not a valid e-mail address",
        isEmail: true,
      },
      age: {
        isInt: { options: { min: 18 } },
      },
    },
    ["body"]
  ),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      // read the body data from the matched data
      const newUser = matchedData(req);
      const maxId = users.reduce(
        (max, user) => (user.id > max ? user.id : max),
        0
      );

      // add a new user with an auto-incremented id
      users.push({
        id: maxId + 1,
        ...newUser,
      });

      res.status(201).send();
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);

As you can see, not much changes from chain validation. The two approaches are completely equivalent.
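To see why the order of attributes in a schema object can matter, consider the toy interpreter below. It is not how express-validator is implemented: the rules table and the applyFieldSchema() function are hypothetical and exist only to show that rules applied in key order give different outcomes for trim and notEmpty.

```javascript
// Toy interpreter, for illustration only: NOT express-validator's code.
// Each rule returns the (possibly transformed) value and a validity flag.
const rules = {
  trim: (value) => ({ value: String(value).trim(), valid: true }),
  notEmpty: (value) => ({ value, valid: String(value).length > 0 }),
};

function applyFieldSchema(fieldSchema, input) {
  let value = input;
  let valid = true;
  // Object.keys() lists string keys in insertion order,
  // so the rules run in the order they were written
  for (const ruleName of Object.keys(fieldSchema)) {
    const outcome = rules[ruleName](value);
    value = outcome.value;
    valid = valid && outcome.valid;
  }
  return { value, valid };
}

// trim first: the blank string becomes "" and then fails notEmpty
console.log(applyFieldSchema({ trim: true, notEmpty: true }, "   "));
// → { value: '', valid: false }

// notEmpty first: "   " has length 3, so it passes before being trimmed
console.log(applyFieldSchema({ notEmpty: true, trim: true }, "   "));
// → { value: '', valid: true }
```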

Test the Validated Endpoints

Start your Express application and test your schema-based data validation.
Alternatively, check out the schema-validation branch from the repository supporting this article:

git checkout schema-validation

Install the project's dependencies, start the local server, and repeat the API calls made in the method chaining validation section.
You should get the exact same results!

Wrapping Up: Protect Your Node.js Server From Unexpected Incoming Data

In this post, we defined Node.js data validation and explored its benefits for a backend application.

You now know:

  • The definition of data validation
  • Why you should check the data received by an endpoint before feeding it to business logic
  • How to implement data validation in Node.js with two different approaches: method chaining and schema validation

Thanks for reading!

P.S. If you liked this post, subscribe to our JavaScript Sorcery list for a monthly deep dive into more magical JavaScript tips and tricks.

P.P.S. If you need an APM for your Node.js app, go and check out the AppSignal APM for Node.js.
