What Are Form Validation and Sanitization?

When creating web forms and processing them with Node, it is essential to validate and sanitize all user-generated content. Proper validation and sanitization keep your forms user-friendly and secure. They help protect against common attacks such as cross-site scripting (XSS) and SQL injection, and they help prevent your application from malfunctioning when it processes safe but unexpected input.

In this tutorial we’ll:

Learn about form input validation for Node.js
Learn about sanitizing user-generated content
Explain why validation is essential for functionality and usability
Explain why sanitization is crucial for security

By the end of this tutorial, you will understand the consequences of not validating and sanitizing user input, and why both practices are important.

This tutorial is part 2 of 7 tutorials that walk through using Express.js for user authentication.

Goal

Understand what validation and sanitization are — and why they are essential to your application.

Prerequisites

None

What is form validation?

A web form is a particular type of web page that allows users to enter input. Common examples include registration, login, and contact forms. Each form requires the user to enter specific types of data.

Example User Registration Form:

[Image: example registration form with username, password, and email address fields. The username field is highlighted in red, indicating an error.]

Look at the example user registration form above. What types of data is the user required to input?

Username
Password
Email

When you validate something, you are checking to make sure it is accurate or appropriate relative to pre-defined rules. The developer who created the form above has defined the rules for each input field. Here are some examples:

The username must be at least eight characters.
The password must be at least eight characters and include an uppercase letter, a number, and a special character.
The email must have email address formatting.
No input field may be blank.

Form validation ensures the user input does indeed follow the rules defined for each input field.
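As a minimal sketch, the example rules above could be checked in plain JavaScript as follows. The function name validateRegistration, the error messages, and the simplified email pattern are assumptions made for illustration; a production form would normally rely on a validation library.

```javascript
// Simplified email pattern (an assumption for this sketch, not a full RFC check)
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;

// Hypothetical helper: returns a list of rule violations (empty when the input is valid).
// All three fields are assumed to be strings.
function validateRegistration({ username = '', password = '', email = '' }) {
  const errors = [];

  // Rule: no input field may be blank
  if (!username || !password || !email) {
    errors.push('No input field may be blank.');
  }

  // Rule: username must be at least eight characters
  if (username.length < 8) {
    errors.push('The username must be at least eight characters.');
  }

  // Rule: password needs eight characters, an uppercase letter, a number, and a special character
  const strongPassword =
    password.length >= 8 &&
    /[A-Z]/.test(password) &&
    /[0-9]/.test(password) &&
    /[^A-Za-z0-9]/.test(password);
  if (!strongPassword) {
    errors.push('The password does not meet the complexity rules.');
  }

  // Rule: email must have email address formatting
  if (!EMAIL_PATTERN.test(email)) {
    errors.push('The email must have email address formatting.');
  }

  return errors;
}

console.log(validateRegistration({ username: 'bob', password: 'password', email: 'nope' }));
```

Returning a list of messages, rather than stopping at the first failure, lets the form report every broken rule at once.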

Front-end validation

As you may be aware, HTML has some front-end validation built in. For example, adding “required” to an input field ensures it gets filled in before submission. You may also add a minlength attribute to specify the minimum number of characters allowed.

<input type="text" name="username" placeholder="Username" minlength="8" required>

Defining an input field type as “email” ensures the input’s formatting is that of an email address.

<input type="email" name="email" placeholder="your_email@mail.com" required/>

HTML cannot validate every possible rule, so front-end developers also use JavaScript. In the past, form validation mainly occurred on the server, which frustrated users. Imagine not being aware of the rules, or not being alerted that you broke one, until the entire form was submitted! Front-end form validation makes forms more user-friendly. User input rules should be made clear, and the user should be alerted as soon as they enter invalid input.

Back-end validation

Front-end validation is essential, but it is not enough. Back-end developers must also validate data. Front-end developers validate data to make the form user-friendly, but back-end developers validate data to make the application work.

Server-side validation helps prevent malicious attacks as well as benign user errors that could break your program. If you have a function that expects to take in a number, but the user entered a letter, your application could malfunction. Even when validation takes place on the client side, you still need to check the data again on the server, because client-side checks can be bypassed.
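The number-versus-letter case can be sketched like this. The helper name parseQuantity and the choice to accept only non-negative whole numbers are assumptions for illustration.

```javascript
// Hypothetical helper: converts a submitted form value to a whole number,
// or returns null when the value is not a plain non-negative integer.
function parseQuantity(input) {
  const text = String(input).trim();

  // Accept digits only, so values such as "abc", "1e3", or "" are rejected
  if (!/^\d+$/.test(text)) {
    return null;
  }

  const value = Number(text);

  // Guard against numbers too large to be represented exactly
  return Number.isSafeInteger(value) ? value : null;
}

console.log(parseQuantity('42'));  // 42
console.log(parseQuantity('abc')); // null
```

The calling code can then respond to null with an error message instead of passing an unexpected value further into the application.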

What is form sanitization?

Form sanitization removes, or escapes, code in the user input. Many common security attacks can easily be prevented by sanitizing user input.

Common security attacks

SQL injection

SQL injection attacks occur when a hacker puts SQL code into an input field.

Think back to our registration form. Where will the user data go? Probably right into the application’s database. That means the values inputted by the user get inserted into query statements:

INSERT INTO users (username,password,email)
VALUES ('username', 'password', 'email');

An experienced hacker is aware of this and can taint your insert statement with malicious code. Let’s say for username, our hacker entered: hacker_man', 'Password1!', 'got@hacked.com'); DROP TABLE USERS; --

Now your SQL statement looks like this:

INSERT INTO users (username,password,email)
VALUES ('hacker_man', 'Password1!', 'got@hacked.com'); DROP TABLE USERS; --', 'password', 'email');

And the entire user table can be deleted!

SQL injection also gets used to access usernames and passwords. You must protect your users, your organization, and your code.
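To see how the tainted statement comes about, here is a small sketch that builds the query by string concatenation and compares it with a parameterized form. The function names are hypothetical, and no database is involved; the code only constructs the statements.

```javascript
// Unsafe: user input is pasted directly into the SQL text
function buildInsertUnsafe(username, password, email) {
  return (
    "INSERT INTO users (username,password,email) VALUES ('" +
    username + "', '" + password + "', '" + email + "');"
  );
}

// Safer shape: the SQL text is fixed and the values travel separately
function buildInsertParameterized(username, password, email) {
  return {
    text: 'INSERT INTO users (username,password,email) VALUES ($1, $2, $3)',
    values: [username, password, email],
  };
}

const malicious = "hacker_man', 'Password1!', 'got@hacked.com'); DROP TABLE USERS; --";

// The DROP TABLE command becomes part of the statement itself
console.log(buildInsertUnsafe(malicious, 'password', 'email'));

// The statement text is unchanged; the malicious string is just a value
console.log(buildInsertParameterized(malicious, 'password', 'email').text);
```

In the second form, the database driver receives the malicious string as data, so it can never change the structure of the statement.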

Cross-site scripting (XSS)

One of the most common web attacks is cross-site scripting (XSS), which occurs when the hacker injects JavaScript into a website. Remember, anything placed in script tags <script>...</script> will be executed as JavaScript.

The most common vulnerability occurs when sites accept user input that will display on a page. Posting a comment is a prime example. The hacker may add a script to the page if the site does not escape or encode the user input.

Instead of targeting the website’s data, XSS attacks typically target the user’s data. They may:

Read the user’s cookies.
Direct the user to a malicious site.
Trick the user into downloading malware.
Steal the user’s username and password.

There are several types of cross-site scripting attacks. The malicious JavaScript may be stored in the website’s database (stored XSS), reflected back to the user from a request (reflected XSS), or injected entirely within the client-side code (DOM-based XSS).
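Escaping user input before it is displayed can be sketched as below. The helper name escapeHtml is an assumption for illustration; template engines and sanitization libraries usually provide an equivalent.

```javascript
// Characters with special meaning in HTML and their entity replacements
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: replaces each special character with its entity,
// so the browser displays the text instead of interpreting it as markup
function escapeHtml(input) {
  return String(input).replace(/[&<>"']/g, (character) => HTML_ESCAPES[character]);
}

const comment = '<script>alert("xss")</script>';
console.log(escapeHtml(comment));
// prints: &lt;script&gt;alert(&quot;xss&quot;)&lt;/script&gt;
```

Once escaped, the comment shows up on the page as literal text and the script never runs.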

Command injection

Hackers can gain control of your website by injecting operating system commands. Once a hacker gains administrative access to the operating system on your web server, they can access sensitive data and alter your website at will.
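One way to reduce this risk, sketched below under stated assumptions, is to validate the input and pass it to the operating system as a separate argument rather than as part of a shell command string. The helper names, the hostname pattern, and the use of the nslookup command are illustrative assumptions.

```javascript
const { execFile } = require('child_process');

// Assumed rule for this sketch: letters, digits, dots, and hyphens only
const HOSTNAME_PATTERN = /^[a-zA-Z0-9.-]{1,253}$/;

// Hypothetical helper: rejects anything that could be read as shell syntax or an option flag
function isSafeHostname(host) {
  return typeof host === 'string' && HOSTNAME_PATTERN.test(host) && !host.startsWith('-');
}

// Hypothetical helper: looks up a host only after the input passes validation
function lookupHost(host, callback) {
  if (!isSafeHostname(host)) {
    return callback(new Error('Invalid hostname'));
  }

  // execFile passes arguments as an array and does not start a shell by default,
  // so characters such as ";" or "&&" are not interpreted as commands
  execFile('nslookup', [host], callback);
}

console.log(isSafeHostname('example.com'));                   // true
console.log(isSafeHostname('example.com; cat /etc/passwd'));  // false
```

Combining validation with an argument array gives two layers of protection: bad input is rejected early, and anything that slips through is still treated as data.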

Uploading malicious files

Any user input fields used to upload files to your website must get checked for code files. If you do not check file extensions, a hacker can upload any code they want. A vulnerable upload box is a perfect way for a hacker to upload malicious server code to your site.
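A basic extension check might look like the sketch below. The helper name isAllowedUpload and the list of permitted image extensions are assumptions for illustration.

```javascript
const path = require('path');

// Assumed allow-list for this sketch: image uploads only
const ALLOWED_EXTENSIONS = new Set(['.png', '.jpg', '.jpeg', '.gif']);

// Hypothetical helper: accepts a file only if its final extension is on the allow-list
function isAllowedUpload(filename) {
  const extension = path.extname(String(filename)).toLowerCase();
  return ALLOWED_EXTENSIONS.has(extension);
}

console.log(isAllowedUpload('avatar.PNG')); // true
console.log(isAllowedUpload('shell.php'));  // false
```

An allow-list is used instead of a block-list because it is easier to name the few file types you expect than every type you want to refuse. An extension check is only a first line of defense, since a file name can be chosen freely by the uploader.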

How to validate and sanitize ExpressJS forms?

Now that you understand the importance of validation and sanitization, your next question is probably, “How can I validate and sanitize user input?” Although you can write your own vanilla JavaScript code, it will be quite extensive, and it is easy to leave out an important check by mistake. Thankfully, there are libraries already created for you. Express Validator is the perfect tool for validating and sanitizing your ExpressJS web forms.

In conjunction with Express Validator, it is imperative to use your database library properly. Most NodeJS SQL libraries have built-in features that prevent SQL injection. When a SQL statement depends on user input, do not use plain string concatenation to make the statement. Instead, use built-in functions to “parameterize” or “prepare” query statements. These built-in functions either send user input separately from the SQL text or escape it automatically, so the input is always treated as data.

The PostgreSQL for NodeJS package allows you to parameterize your values:

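// Assumes the node-postgres (pg) package; connection settings are
// assumed to come from the standard PG* environment variables
const { Pool } = require('pg');
const db = new Pool();

// Parameterized Query Statement
const statement = 'INSERT INTO users(name, email, password) VALUES($1, $2, $3) RETURNING *';

// Values To Insert
const values = ['starbuck', 'starbuck@galactica.com', 'dsh&*7**jkHJ8&0'];

// Query Function: the values are sent separately from the statement text
db.query(statement, values, function (err, res) {
  if (err) {
    // code to execute if there is an error
  } else {
    // code to execute if no error
  }
});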

The MySQL package has an escape function built in to handle user input:

// `connection` is assumed to be an open connection created
// with the mysql package's createConnection() function

// User Input
const username = 'starbuck';

// Statement with Escape Function
const statement = 'SELECT * FROM users WHERE username = ' + connection.escape(username);

// Query Function
connection.query(statement, function (err, res, fields) {
  if (err) {
    // code to execute if there is an error
  } else {
    // code to execute if no error
  }
});

Recap

It is necessary to validate and sanitize all user input. Form validation ensures the user input follows the rules created for each input field. Including front-end form validation will make your form user-friendly, and including back-end validation will make your form work. Form sanitization removes or escapes any code included in the user input. Hackers often infiltrate websites by injecting code into the database, server, and client-side code. Express Validator is an excellent tool for validating and sanitizing user input. Proper use of database libraries will also help prevent SQL injection.

Keep going with the next tutorial in this set: Process a User Login Form with ExpressJS.

Further your understanding

What type of code injection might be used in a website’s search box?
What type of SQL injection might be used on a login form?
Can unintended spaces break a web form?

Additional resources

How to validate and sanitize an ExpressJS Form using Express-Validator (HeyNode.com)
Cross-site Scripting (wikipedia.org)
SQL Injection (wikipedia.org)