Data validation is essential to avoid unexpected behavior, prevent errors, and improve security. It can be performed both on a web page — where data is entered — and on the server, where the data is processed.
In this tutorial, we'll explore data validation in the Node.js backend. Then, you'll learn how to implement it in Express using the express-validator library.
Get ready to become a Node.js data validation expert!
What Is Data Validation?
Data validation ensures that data (whether entered or provided) is correct, consistent, and useful for its intended purpose. This process is typically performed in frontend applications, such as when dealing with forms.
Likewise, it is essential to validate data in the backend as well. In this case, data validation involves checking path and query parameters, as well as the body data sent to servers. This ensures that the data received by each API meets the specified criteria, preventing errors and vulnerabilities and ensuring the smooth functionality of your application.
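To make the idea concrete, here is a minimal hand-written sketch of a body check in plain JavaScript. The function name validateUserBody and the two rules it enforces are assumptions chosen for illustration only; the rest of this tutorial relies on a library rather than on code like this.

```javascript
// Hypothetical helper: returns a list of human-readable problems found in a
// request body. An empty list means the body passed every check.
function validateUserBody(body) {
  if (typeof body !== "object" || body === null) {
    return ["body must be a JSON object"];
  }

  const errors = [];
  if (typeof body.fullName !== "string" || body.fullName.trim() === "") {
    errors.push("fullName must be a non-blank string");
  }
  if (!Number.isInteger(body.age) || body.age < 0) {
    errors.push("age must be a non-negative integer");
  }
  return errors;
}

console.log(validateUserBody({ fullName: "John Doe", age: 30 })); // []
console.log(validateUserBody({ fullName: "   ", age: "30" })); // two problems reported
```

Writing checks like these by hand for every endpoint quickly becomes repetitive, which is the main reason to reach for a dedicated validation library.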
Main Benefits of Validating Incoming Requests in Node.js
Validating incoming requests in Node.js offers several key benefits:
- Enhancing security: Mitigate some threats, such as injection attacks and data breaches. Proper validation prevents attackers from exploiting vulnerabilities in your application by sending it malformed or malicious data.
- Improving reliability: Ensure that only valid and sanitized data is processed and stored in the backend application. That enhances the overall integrity and reliability of the data, leading to a more robust and trustworthy server.
- Maintaining compliance: Make sure that the data handled by the server adheres to specific data format requirements or meets internal coding standards.
Now that you know what data validation is and why you should enforce it in your Node.js application, let's see how to do it in this step-by-step tutorial!
Prerequisites
To follow this tutorial, you need a Node.js 18+ application with a few endpoints. For example, the following Express server is perfect:
const express = require("express");

// initialize an Express server
const app = express();
app.use(express.json());

// an array to use as an in-memory database
const users = [
  { id: 1, email: "john.doe@example.com", fullName: "John Doe", age: 30 },
  { id: 2, email: "jane.smith@example.com", fullName: "Jane Smith", age: 25 },
  { id: 3, email: "bob.johnson@example.com", fullName: "Bob Johnson", age: 40 },
  {
    id: 4,
    email: "alice.williams@example.com",
    fullName: "Alice Williams",
    age: 35,
  },
  { id: 5, email: "mike.brown@example.com", fullName: "Mike Brown", age: 28 },
  { id: 6, email: "sara.taylor@example.com", fullName: "Sara Taylor", age: 33 },
  { id: 7, email: "chris.lee@example.com", fullName: "Chris Lee", age: 22 },
  { id: 8, email: "emily.davis@example.com", fullName: "Emily Davis", age: 45 },
  {
    id: 9,
    email: "alex.johnson@example.com",
    fullName: "Alex Johnson",
    age: 27,
  },
  {
    id: 10,
    email: "lisa.wilson@example.com",
    fullName: "Lisa Wilson",
    age: 38,
  },
];

// define three sample endpoints
app.get("/api/v1/users/:userId", (req, res) => {
  const userId = req.params.userId;
  // find a user by id
  const user = users.find((user) => user.id == userId);
  if (!user) {
    res.status(404).send("User not found!");
  } else {
    res.send({
      user: user,
    });
  }
});

app.get("/api/v1/users", (req, res) => {
  // select all users by default
  let filteredUsers = users;
  const search = req.query.search;
  if (search !== undefined) {
    // filter users by fullName with a case-insensitive search
    filteredUsers = users.filter((user) => {
      return user.fullName.toLowerCase().includes(search.toLowerCase());
    });
  }
  res.send({
    users: filteredUsers,
  });
});

app.post("/api/v1/users", (req, res) => {
  const newUser = req.body;
  const maxId = users.reduce((max, user) => (user.id > max ? user.id : max), 0);
  // add a new user with an auto-incremented id
  users.push({
    id: maxId + 1,
    ...newUser,
  });
  res.status(201).send();
});

// start the server locally on port 3000
const port = 3000;
app.listen(port, () => {
  console.log(`Server listening at http://localhost:${port}`);
});
This defines a local variable named users as an in-memory database. Then, it initializes an Express application with the following three endpoints:
- GET /api/v1/users/:userId: To retrieve a single user from the users array based on its id.
- GET /api/v1/users: To get the list of users in the database. It accepts an optional search query parameter to filter users by their full name.
- POST /api/v1/users: To add a new user to the users array.
Next, you'll need a library to perform data validation in Node.js.
With thousands of weekly downloads, express-validator is the most popular option.
express-validator provides a set of Express middleware to validate and sanitize incoming data to server APIs. Behind the scenes, these middleware functions are powered by validator.js. If you are unfamiliar with this package, validator.js is the most widely used data validation library in the entire JavaScript ecosystem.
What makes express-validator so successful is its rich set of features and intuitive syntax for validating Express endpoints.
It also provides tools for determining whether a request is valid, functions for accessing sanitized data, and more.
Add the express-validator npm package to your project's dependencies with:
npm install express-validator
Perfect! You now have everything you need to perform data validation in an Express application.
For a faster setup, clone the GitHub repository supporting this guide:
git clone https://github.com/Tonel/nodejs-express-validator-demo
You'll find the Express server above and further implementations in dedicated branches.
express-validator supports two equivalent ways to implement data validation:
- Validation Chain: Define the validation rules by calling one method after another through method chaining.
- Schema Validation: Define the validation rules in an object-based schema to match against the incoming data.
Let's dive into both!
Data Validation in Node.js With Validation Chains
Learn how to implement data validation through validation chains in express-validator.
Understand Validator Chains
In express-validator, validation chains always begin with one of the following middleware functions:
- check(): Creates a validation chain for the selected fields in any of the req.body, req.cookies, req.headers, req.query, or req.params locations. If the specified fields are present in more than one location, the validation chain processes all instances of that field's value.
- body(): Same as check(), but it only checks fields in req.body.
- cookie(): Same as check(), but it only checks fields in req.cookies.
- header(): Same as check(), but it only checks fields in req.headers.
- param(): Same as check(), but it only checks fields in req.params.
- query(): Same as check(), but it only checks fields in req.query.
These middleware functions accept one or more field names to select from incoming data. They also provide some methods, which is possible because JavaScript functions are actually first-class objects. Their methods always return themselves, leading to the method chaining pattern.
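The pattern described above, a function that carries methods which return the function itself, can be sketched in a few lines of plain JavaScript. This is a toy illustration and not how express-validator is implemented; createChain, failedRules, and the two rules are hypothetical names made up for the example.

```javascript
// Toy illustration only: this is NOT express-validator's implementation.
// createChain() returns a function (usable as a middleware) that also carries
// methods, and every method returns that same function to allow chaining.
function createChain(field) {
  const rules = [];

  // the function itself plays the role of the Express middleware
  const chain = (req, res, next) => {
    const value = req.body[field];
    req.failedRules = rules
      .filter((rule) => !rule.test(value))
      .map((rule) => rule.name);
    next();
  };

  // methods attached to the function object; each one returns `chain`
  chain.notEmpty = () => {
    rules.push({
      name: "notEmpty",
      test: (v) => typeof v === "string" && v.length > 0,
    });
    return chain;
  };
  chain.isLength = ({ min }) => {
    rules.push({
      name: "isLength",
      test: (v) => typeof v === "string" && v.length >= min,
    });
    return chain;
  };

  return chain;
}

const middleware = createChain("name").notEmpty().isLength({ min: 4 });

// call it the way Express would, with a fake request object
const req = { body: { name: "Al" } };
middleware(req, {}, () => {});
console.log(req.failedRules); // [ 'isLength' ]
```

Because every method hands back the same function, the end of the chain is still something Express can call as a middleware.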
So, let's assume that you want to ensure that the name body parameter contains at least 4 characters when it is present.
This is how you can specify that with an express-validator validation chain:
body("name").optional().isLength({ min: 4 });
Each method chain will return a valid Express middleware function you can use for validation in a route handler.
A single route handler can have one or more validation middleware, each referring to different data fields.
Check out the express-validator documentation for all validation chain methods.
Time to see validation chains in action!
Validate Route Parameters
Suppose you want to ensure that the userId parameter in GET /api/v1/users/:userId is an integer. This is what you may end up writing:
app.get("/api/v1/users/:userId", param("userId").isInt(), (req, res) => {
  // business logic...
});
While the validation chain defined above is correct, it's not enough, as express-validator doesn't report validation errors to users automatically. Why? Because it's better if developers always manually define how to handle invalid data!
You can access the result object of data validation in an Express endpoint through the validationResult() function.
Import it along with the validation middleware function from express-validator:
const {
  check,
  body,
  param,
  query,
  // ...
  matchedData,
  validationResult,
} = require("express-validator");
Then, you can define the validated route handler for GET /api/v1/users/:userId as below:
app.get("/api/v1/users/:userId", param("userId").isInt(), (req, res) => {
  // extract the data validation result
  const result = validationResult(req);
  if (result.isEmpty()) {
    const userId = req.params.userId;
    // find a user by id
    const user = users.find((user) => user.id == userId);
    if (!user) {
      res.status(404).send("User not found!");
    } else {
      res.send({
        user: user,
      });
    }
  } else {
    res.status(400).send({ errors: result.array() });
  }
});
When userId is not an integer, the endpoint will return a 400 response with validation error messages. In production, you should override that response with a generic message to avoid providing useful information to potential attackers.
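One possible way to follow that advice is to decide the response body in a small helper. The sketch below is an assumption-laden example: buildErrorBody and the "Invalid request" message are hypothetical, and the sample error object simply mirrors the shape express-validator produces for a failed field.

```javascript
// Hypothetical helper: decides what to send back when validation fails.
// `errors` is expected to be the array returned by result.array().
function buildErrorBody(errors, isProduction) {
  if (isProduction) {
    // hide field names, values, and messages from the client
    return { message: "Invalid request" };
  }
  // during development, the full details help with debugging
  return { errors };
}

// sample error in the shape produced for a failed field (illustrative data)
const sampleErrors = [
  {
    type: "field",
    value: "edw",
    msg: "Invalid value",
    path: "userId",
    location: "params",
  },
];

console.log(buildErrorBody(sampleErrors, true)); // { message: 'Invalid request' }
console.log(buildErrorBody(sampleErrors, false)); // full error details
```

In a route handler, this could replace the literal response body, for instance res.status(400).send(buildErrorBody(result.array(), process.env.NODE_ENV === "production")), assuming NODE_ENV is how your deployment marks production.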
Validate Query Parameters
In this case, you want to ensure that the search query parameter in the GET /api/v1/users endpoint is not empty or blank when it is present.
This is the validation chain you need to define:
query("search").optional().trim().notEmpty();
Note that the order of the method calls in the chain is important. If you switch trim() with notEmpty(), blank strings will pass the validation rule.
trim() is a sanitizer method and it affects the values stored in search.
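To see why the order matters, consider this simplified model of a chain in plain JavaScript. It is only a sketch of the idea that steps run in sequence, not express-validator's code; runSteps and the two step objects are hypothetical.

```javascript
// Toy model of a chain: each step is either a sanitizer (changes the value)
// or a validator (checks the current value). Steps run in the listed order.
const trim = { kind: "sanitizer", run: (v) => v.trim() };
const notEmpty = { kind: "validator", run: (v) => v.length > 0 };

function runSteps(steps, input) {
  let value = input;
  let valid = true;
  for (const step of steps) {
    if (step.kind === "sanitizer") {
      value = step.run(value);
    } else if (!step.run(value)) {
      valid = false;
    }
  }
  return { value, valid };
}

// trim first: the blank string becomes empty before it is checked
console.log(runSteps([trim, notEmpty], "   ")); // { value: '', valid: false }

// notEmpty first: the three spaces count as content, so the check passes
console.log(runSteps([notEmpty, trim], "   ")); // { value: '', valid: true }
```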
If you need to access the sanitized data directly in the route handler, you can call the matchedData() function. The validated and sanitized endpoint will now be specified as follows:
app.get(
  "/api/v1/users",
  query("search").optional().trim().notEmpty(),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      // select all users by default
      let filteredUsers = users;
      // read the matched query data from "req"
      const data = matchedData(req);
      const search = data.search;
      if (search !== undefined) {
        // filter users by fullName with a case-insensitive search
        filteredUsers = users.filter((user) => {
          return user.fullName.toLowerCase().includes(search.toLowerCase());
        });
      }
      res.send({
        users: filteredUsers,
      });
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);
Notice that req.query.search will contain the original value of the search query parameter, but you want to use its sanitized value.
Validate Body Data
Now, suppose you want the body of POST /api/v1/users to follow these rules:
- fullName must not be empty or blank.
- email must be a valid email. If there's an invalid email, the validation error message should be “Not a valid e-mail address”.
- age must be an integer greater than or equal to 18.
This is how you can implement the desired data validation:
app.post(
  "/api/v1/users",
  body("fullName").trim().notEmpty(),
  body("email").isEmail().withMessage("Not a valid e-mail address"),
  body("age").isInt({ min: 18 }),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      // read the matched body data from "req"
      const newUser = matchedData(req);
      const maxId = users.reduce(
        (max, user) => (user.id > max ? user.id : max),
        0
      );
      // add a new user with an auto-incremented id
      users.push({
        id: maxId + 1,
        ...newUser,
      });
      res.status(201).send();
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);
There are a couple of critical aspects to emphasize in this example. First, a single route handler can have multiple validation middlewares referring to the same data source. Second, the methods offered by the middleware functions not only specify how to validate the data, but also allow you to customize error messages and more.
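The three handlers above all repeat the same check on the validation result. One way to avoid that repetition, sketched here under stated assumptions, is a small middleware that rejects invalid requests before the handler runs. createValidationGuard is a hypothetical name, and the function that reads the result is passed in as a parameter so the sketch can be tried without installing anything; the stand-in objects below only imitate the isEmpty() and array() methods used earlier.

```javascript
// Hypothetical factory: builds a middleware that answers with a 400 response
// when validation failed, and hands over to the next handler otherwise.
// `getResult` is expected to behave like express-validator's validationResult.
function createValidationGuard(getResult) {
  return (req, res, next) => {
    const result = getResult(req);
    if (result.isEmpty()) {
      return next();
    }
    res.status(400).send({ errors: result.array() });
  };
}

// quick check with stand-ins for validationResult and the Express response
const fakeResult = (errors) => () => ({
  isEmpty: () => errors.length === 0,
  array: () => errors,
});
const fakeRes = {
  statusCode: 200,
  body: undefined,
  status(code) {
    this.statusCode = code;
    return this;
  },
  send(payload) {
    this.body = payload;
    return this;
  },
};

createValidationGuard(fakeResult([{ msg: "Invalid value" }]))({}, fakeRes, () => {});
console.log(fakeRes.statusCode); // 400
```

With the real library, the guard would sit between the validation chains and the handler, for example app.post("/api/v1/users", body("fullName").trim().notEmpty(), createValidationGuard(validationResult), handler), leaving the handler with only the business logic.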
Test the Validated Endpoints
Launch your Node.js application and verify that the data validation logic works as expected. Otherwise, check out the chain-validation branch from the repository that supports this article:
git checkout chain-validation
Install the project dependencies and launch an Express development server:
npm install
npm run start
Your Node.js application should now be listening locally on port 3000. Open your favorite HTTP client and try to make a GET request to /api/v1/users/edw.
Since edw is not a number, you'll get a 400 response.
Specifically, the error array generated by express-validator contains one or more error objects in the following format:
{
  "type": "field",
  "value": "edw",
  "msg": "Invalid value",
  "path": "userId",
  "location": "params"
}
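Error objects in this format are easy to reshape for a client. As a rough sketch, the helper below groups messages by field; groupMessagesByPath, the "_other" fallback key, and the sample data are all hypothetical and only assume the object shape shown above.

```javascript
// Hypothetical helper: groups error messages by the field they refer to.
function groupMessagesByPath(errors) {
  const grouped = {};
  for (const error of errors) {
    // fall back to a generic key if an error object has no `path`
    const key = error.path ?? "_other";
    if (!grouped[key]) {
      grouped[key] = [];
    }
    grouped[key].push(error.msg);
  }
  return grouped;
}

// made-up sample data following the format shown above
const sampleErrorList = [
  { type: "field", value: "", msg: "Invalid value", path: "fullName", location: "body" },
  { type: "field", value: "x", msg: "Not a valid e-mail address", path: "email", location: "body" },
  { type: "field", value: 12, msg: "Invalid value", path: "age", location: "body" },
];

console.log(groupMessagesByPath(sampleErrorList));
```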
You can simulate another validation error by calling the GET /api/v1/users endpoint with a blank search query parameter, such as /api/v1/users?search=%20.
Again, trigger a validation error by calling the POST /api/v1/users API with an invalid body, for example one with an age below 18.
If you call the three endpoints with the expected data instead, they will return a successful response as expected.
Great, you just learned how to perform data validation in Node.js! All that remains is to explore the equivalent schema-based approach to data validation.
Data Validation in Node.js with Schema Validation
Let's now see how to define data validation through schema objects with express-validator.
Understand Schema Validation
In express-validator, schemas are an object-based way of defining validation and/or sanitization rules on a request.
While their syntax differs from validation chains, they offer exactly the same functionality. Under the hood, express-validator translates schemas into validation chain functions.
Thus, you can choose between one syntax or the other, depending on your preference.
The same Express application can have some endpoints validated with chains and others validated with schemas.
Schemas are simple JavaScript objects whose keys represent the fields to validate. Schema values contain validation rules in the form of objects.
Pass a schema object to the checkSchema() function to get an Express validation middleware.
For example, this is how you can use a schema to ensure that the name body parameter contains at least 4 characters when it is present:
checkSchema(
  {
    name: {
      optional: true,
      isLength: { options: { min: 4 } },
    },
  },
  ["body"]
);
By default, checkSchema() behaves like check(). To specify which input data sources it should check, you can pass them in an array as the second parameter. In the example above, the validation schema object will only be applied to the body data.
Sometimes, you may need to validate a body's inner field. Check out the documentation to see what express-validator offers in field selection.
Validate Route, Query, and Body Data With Schemas
Here's how you can specify the validation rules shown in the method chaining section above through schema validation:
- GET /api/v1/users/:userId:
app.get(
  "/api/v1/users/:userId",
  checkSchema(
    {
      userId: { isInt: true },
    },
    ["params"]
  ),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      const userId = req.params.userId;
      // find a user by id
      const user = users.find((user) => user.id == userId);
      if (!user) {
        res.status(404).send("User not found!");
      } else {
        res.send({
          user: user,
        });
      }
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);
- GET /api/v1/users:
app.get(
  "/api/v1/users",
  checkSchema(
    {
      search: { optional: true, trim: true, notEmpty: true },
    },
    ["query"]
  ),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      // select all users by default
      let filteredUsers = users;
      // read the matched query data from "req"
      const data = matchedData(req);
      const search = data.search;
      if (search !== undefined) {
        // filter users by fullName with a case-insensitive search
        filteredUsers = users.filter((user) => {
          return user.fullName.toLowerCase().includes(search.toLowerCase());
        });
      }
      res.send({
        users: filteredUsers,
      });
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);
Note that the order of the attributes in the schema object matters. Placing the trim attribute after notEmpty will result in a different validation rule.
- POST /api/v1/users:
app.post(
  "/api/v1/users",
  checkSchema(
    {
      fullName: {
        trim: true,
        notEmpty: true,
      },
      email: {
        errorMessage: "Not a valid e-mail address",
        isEmail: true,
      },
      age: {
        isInt: { options: { min: 18 } },
      },
    },
    ["body"]
  ),
  (req, res) => {
    // extract the data validation result
    const result = validationResult(req);
    if (result.isEmpty()) {
      // read the body data from the matched data
      const newUser = matchedData(req);
      const maxId = users.reduce(
        (max, user) => (user.id > max ? user.id : max),
        0
      );
      // add a new user with an auto-incremented id
      users.push({
        id: maxId + 1,
        ...newUser,
      });
      res.status(201).send();
    } else {
      res.status(400).send({ errors: result.array() });
    }
  }
);
As you can see, not much changes from chain validation. The two approaches are completely equivalent.
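The note about attribute order in schemas follows from how JavaScript objects work: string keys are enumerated in the order they were written. The sketch below is a toy model of turning a field schema into ordered steps. It is not express-validator's actual translation logic, and applyFieldSchema and knownSteps are hypothetical names.

```javascript
// Toy model only: the library's real schema translation is more involved.
// Each known attribute maps to a step that updates a { value, valid } state.
const knownSteps = {
  trim: (state) => ({ ...state, value: state.value.trim() }),
  notEmpty: (state) => ({
    ...state,
    valid: state.valid && state.value.length > 0,
  }),
};

function applyFieldSchema(fieldSchema, input) {
  let state = { value: input, valid: true };
  // Object.keys() lists string keys in insertion order, so the order
  // written in the schema is the order in which the steps run
  for (const key of Object.keys(fieldSchema)) {
    if (fieldSchema[key] === true && knownSteps[key]) {
      state = knownSteps[key](state);
    }
  }
  return state;
}

console.log(applyFieldSchema({ trim: true, notEmpty: true }, "   ")); // { value: '', valid: false }
console.log(applyFieldSchema({ notEmpty: true, trim: true }, "   ")); // { value: '', valid: true }
```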
Test the Validated Endpoints
Start your Express application and prepare to test your schema-based data validation.
Otherwise, check out the schema-validation branch from the repository supporting this article:
git checkout schema-validation
Install the project's dependencies, start the local server, and repeat the API calls made in the method chaining validation section.
You should get the exact same results!
Wrapping Up: Protect Your Node.js Server From Unexpected Incoming Data
In this post, we defined Node.js data validation and explored its benefits for a backend application.
You now know:
- The definition of data validation
- Why you should check the data received by an endpoint before feeding it to business logic
- How to implement data validation in Node.js with two different approaches: method chaining and schema validation
Thanks for reading!