Input Validation vulnerabilities and how to fix them

Table of Contents

Input Validation overview
- Syntactical and Semantic validation
- Client-side and Server-side validation
Vulnerabilities and Countermeasures
Conclusions

In this post we'll provide an overview of the main vulnerabilities (known to date) that exploit two common programming errors that often affect web applications: incorrect handling of user input and erroneous or absent checks during the allocation of the memory areas used to contain the data.

Such vulnerabilities open the door to a number of different attack techniques, usually aimed at executing remote commands or viewing classified data; these threats can be effectively countered or mitigated by implementing the Input Validation countermeasures that we'll try to summarize.

IMPORTANT: the countermeasures suggested for each vulnerability should be implemented by senior software developers with a good background in internet security: if you don't have an internal team of coding experts capable of doing this, you can try to outsource these aspects to a good software development company (read this software development outsourcing guide for details).

Input Validation overview

Before digging into the actual threats, let's spend a couple minutes to understand what Input Validation actually is and why it's a fundamental security asset in any web (and non-web) application.

The best definition of Input Validation comes from the Input Validation Cheat Sheet page at the OWASP web site, which we strongly suggest reading:

Input validation is performed to ensure only properly formed data is entering the workflow in an information system, preventing malformed data from persisting in the database and triggering malfunction of various downstream components.

As we can easily see, we are talking about a security countermeasure which aims to verify that the data received from users - as well as from any involved third party - are correctly formatted, correspond to what is expected and do not contain malicious elements, characters or instructions. OWASP advises that, although Input Validation techniques should not be used as the primary method of preventing XSS, SQL Injection and other attacks, they can significantly contribute to reducing their impact.

It's important to understand that Input Validation shouldn't be limited to the data sent by end users, but applied to all potentially untrusted sources, including internet-facing web clients as well as backend feeds over extranets (suppliers, partners, vendors or regulators). In a nutshell, any data sent to our application should be subject to validation in order to minimize the risk of receiving malformed data. Furthermore, input validation should happen as early as possible in the data flow, preferably as soon as the data is received from the external party.

Syntactical and Semantic validation

Input Validation can be applied on two levels:

Syntactic validation, which checks the proper syntax of structured fields (SSN, date, currency symbol).
Semantic validation, which checks the correctness of each input value in the specific business context (start date must be earlier than end date, price must be greater than zero, and so on).

It goes without saying that both levels should be checked whenever possible; moreover, both checks should be executed as early as possible in the application lifecycle, so that invalid input is blocked before the application processes it (for example, before it is stored in the database).
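The two levels described above can be sketched in a few lines of Python. This is only a minimal illustration, not a complete validator: the function name validate_booking, the field names and the accepted formats (ISO dates, prices with up to two decimals) are assumptions made up for this example.

```python
import re
from datetime import date
from decimal import Decimal

# Formats assumed for this example: ISO calendar dates and plain decimal prices.
DATE_PATTERN = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")
PRICE_PATTERN = re.compile(r"[0-9]+(\.[0-9]{1,2})?")


def validate_booking(start: str, end: str, price: str) -> list[str]:
    """Return a list of validation errors; an empty list means the input is valid."""
    errors = []

    # Syntactic validation: is each field well formed?
    for name, value in (("start", start), ("end", end)):
        if not DATE_PATTERN.fullmatch(value):
            errors.append(f"{name}: expected YYYY-MM-DD")
    if not PRICE_PATTERN.fullmatch(price):
        errors.append("price: expected a number such as 12.50")
    if errors:
        return errors

    # Still syntactic: the pattern accepts 2024-13-45, so let the date parser decide.
    try:
        start_date = date.fromisoformat(start)
        end_date = date.fromisoformat(end)
    except ValueError:
        return ["start/end: not a real calendar date"]

    # Semantic validation: do the values make sense in the business context?
    if start_date >= end_date:
        errors.append("start: must be earlier than end")
    if Decimal(price) <= 0:
        errors.append("price: must be greater than zero")
    return errors
```

Semantic checks run only after the syntactic ones succeed, because comparing two dates is meaningless until both are known to be well formed.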

Client-side and Server-side validation

Most client-side JavaScript libraries (such as jQuery) and frameworks (such as Angular and React) include strong client-based validation features that can prevent malformed or erroneous input from being submitted to the server: in other words, the input validation is performed on the client (usually the web browser) before the data is sent. This can be great in terms of performance and grants an additional layer of security, but it's not a validation that we should rely upon: as a matter of fact, any JavaScript-based input validation performed on the client can be bypassed by an attacker who disables JavaScript or uses a web proxy. For this very reason, any input validation performed on the client must also be performed on the server: regardless of the client-side framework we're using, we always need to implement comprehensive server-side validation.

Server-side validation is the process of checking for errors (and handling them accordingly) on the server side, that is, after the data has been sent to the back-end. As we can clearly understand, this is a whole different approach from client-side validation, where the data is checked by the front-end, that is, before the data is sent to the server. Handling errors on the client side has a lot of advantages in terms of speed and performance because the user immediately knows whether the input data is valid or not without having to query the server. However, server-side validation is a required feature of any decent web application because it prevents a lot of potentially harmful scenarios, such as the following:

Implementation errors of the client-side validation process, which can fail to block badly-formatted data
Client-side hacks performed by experienced users, browser extensions, or plugins that might want to allow the user to send unsupported input values to the back-end
Request forgery, that is, false HTTP requests containing incorrect or malicious data

All of these techniques are based upon circumventing the client-side validators, which is always possible because we have no way to prevent our users (or hackers) from skipping, altering, or eliminating them; conversely, server-side validators cannot be avoided because they will be performed by the same back-end that will process the input data.

Therefore, in a nutshell, we could reasonably say that client-side validation is an optional and convenient feature, while server-side validation is a requirement, at least for any decent web application that cares about the quality of the input data.

To avoid confusion, it is important to understand that server-side validation, although being implemented on the back-end, also requires a front-end implementation, such as calling the back-end and then showing the validation results to the user. The main difference between client-side validation and server-side validation is that the former only exists on the client-side and never calls the back-end, while the latter relies upon a front-end and back-end coordinated effort, thus being more complex to implement and test.
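As a rough sketch of that coordinated effort, the back-end can validate every request on its own and return structured errors that the front-end only has to display. The handler below is framework-agnostic and purely illustrative: handle_signup, the field names, the limits and the status codes are assumptions chosen for this example, not part of any specific framework.

```python
import json

MAX_USERNAME_LENGTH = 32  # hypothetical limit chosen for this example


def handle_signup(raw_body: bytes) -> tuple[int, dict]:
    """Validate a request body on the server, whatever the client did or did not check."""
    try:
        payload = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, {"errors": {"_body": "malformed JSON"}}
    if not isinstance(payload, dict):
        return 400, {"errors": {"_body": "expected a JSON object"}}

    errors = {}
    username = payload.get("username")
    if not isinstance(username, str) or not username.isalnum():
        errors["username"] = "letters and digits only"
    elif len(username) > MAX_USERNAME_LENGTH:
        errors["username"] = f"at most {MAX_USERNAME_LENGTH} characters"

    age = payload.get("age")
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not isinstance(age, int) or isinstance(age, bool) or not 0 < age < 150:
        errors["age"] = "must be an integer between 1 and 149"

    if errors:
        # The front-end shows these messages next to the matching fields.
        return 422, {"errors": errors}
    return 200, {"status": "ok"}
```

A forged request that skips the browser entirely goes through exactly the same checks, which is the property client-side validation cannot offer.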

Vulnerabilities and Countermeasures

Now that we've learned the basics, let's review a list of some common input-based Vulnerabilities and the Countermeasures we can implement to either prevent them from happening or minimize their impact.

Shell Execution Command

If the most frightening vulnerability for desktop applications is the Stack Overflow, in Web applications we could say that it undoubtedly is the Shell Execution Command. This threat arises when the attacker's input data is passed to the shell interpreter without being filtered, thus causing a direct or indirect execution of a command on the host system. It's worth noting that SQL Injection attacks could sometimes be classified as "indirect" Shell Execution Commands, since they can execute various actions on data (including deleting them): however, since they rely upon different vulnerabilities and can be countered by adopting specific strategies, they will be treated in a dedicated section.

Countermeasures

Write the code so that no command shell is executed.
Avoid direct invocation of system commands, especially if it uses user input; to access the functions of the operating system, always and only use the APIs made available by the libraries of the programming language.
Analyze the user input by filtering potentially harmful words and characters; even better, compare the user input in advance against a white list of allowed values.
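A minimal Python sketch of these countermeasures follows. It assumes a hypothetical feature that pings a host chosen by the user on a Unix-like system (the -c flag belongs to the Unix ping command); the function names and the host name pattern are illustrative choices, not a definitive rule set.

```python
import re
import subprocess

# Allowed characters for a host name or IPv4 address (an assumption for this
# example); the first character must be alphanumeric, so "-f" is rejected too.
HOSTNAME = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]{0,251}[A-Za-z0-9])?")


def build_ping_argv(host: str) -> list[str]:
    """Check the input against the allowed pattern and build an argument list."""
    if not HOSTNAME.fullmatch(host):
        raise ValueError("host contains characters that are not allowed")
    return ["ping", "-c", "1", host]


def ping(host: str) -> bool:
    # The arguments are passed as a list and shell=True is never used, so no
    # shell interprets characters such as ; | & $ or backticks.
    result = subprocess.run(
        build_ping_argv(host), capture_output=True, timeout=10, check=False
    )
    return result.returncode == 0
```

Whenever the language offers a native API for the task (for example listing a directory or deleting a file), using it instead of an external command avoids the problem altogether.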

File Inclusion

File Inclusion issues are quite common in web applications; they have spread in recent years with the boom of scripting languages and technologies (ASP, PHP, Python, Perl, etc.) and occur when the parameters passed to a vulnerable script are not properly verified before being used to include files at certain points on a portal.

File Inclusion issues are usually divided into two categories:

Local File Inclusion: this vulnerability occurs when an attacker passes locally resident files as parameters of a vulnerable script: their content is thus displayed on the screen at the exact point of the portal where the inclusion occurs. In this way, an attacker can obtain system password hashes or access confidential information located outside the Web Server's document root. Local File Inclusion issues can also be exploited to execute remote commands if the attacker has the ability to locally place a file containing malicious code, which can be targeted by the vulnerable script. The file can be transmitted using the classic network services (ftp, ssh, cifs, etc.) or using any upload procedure that can be called up from the Web.
Remote File Inclusion: this vulnerability allows an attacker to pass, as a parameter of a vulnerable script, a file that resides on another web server (for example one controlled by the attacker). The attacker can place scripting code (such as malicious PHP code) within this file to execute remote commands on the system.

As we can easily understand, Local File Inclusion vulnerabilities can be exploited to access or expose reserved/classified data, while Remote File Inclusion threats can be even more dangerous since they might allow the attacker to directly manipulate the data (including, but not limited to, exfiltration tasks).

Countermeasures

Avoid using external files whose content is difficult to verify. If you can't do that, it is wise to prepare a white list of allowed files: only those files will be selectable by the user, for example through a numerical index. This approach is very easy to implement in the case of local files. In the case of remote files there is no other solution than to check the content or the hash of the file before using it in any way.
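Both ideas, the indexed white list for local files and the hash check for remote ones, can be sketched as follows. The file names, the function names and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib

# Hypothetical white list: the user only ever submits the numeric key,
# never a file name or a path.
ALLOWED_PAGES = {1: "home.html", 2: "about.html", 3: "contact.html"}


def resolve_page(index: str) -> str:
    """Map a user-supplied index to a file name taken from the white list."""
    try:
        return ALLOWED_PAGES[int(index)]
    except (ValueError, KeyError):
        raise ValueError("unknown page index") from None


def verify_remote_content(content: bytes, expected_sha256: str) -> bytes:
    """Accept downloaded content only if it matches a hash known in advance."""
    digest = hashlib.sha256(content).hexdigest()
    if digest != expected_sha256.lower():
        raise ValueError("remote file does not match the expected hash")
    return content
```

Since the user never supplies a name or a path, a value such as ../../etc/passwd simply fails the integer conversion and is rejected.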

XML external entity (XXE) injection

XML external entity injection, also known as XXE, is a vulnerability that allows an attacker to manipulate XML data processed by a web application. The attacker may be able to access the application server's file system and interact with any external system that the application itself is authorized to access. In some situations, the attack can lead to extreme consequences, up to compromising the underlying server or other back-end infrastructures, by leveraging the XXE vulnerability to perform server-side request forgery (SSRF).

Some applications use the XML format to transmit data between the browser and the server; those applications almost always use a standard library or platform API to process XML data on the server (XML parser); XXE vulnerabilities arise because the XML specification contains various potentially dangerous features which are "supported" by those XML parsers, thus leaving the system open to an attack.

Countermeasures

All XXE vulnerabilities arise because the XML parsing library used by the application supports potentially dangerous XML functionality; the simplest and most effective way to prevent XXE attacks is to disable these features.
Disable automatic resolution of external entities and disable support for XInclude, a part of the XML specification that allows you to create an XML document from subdocuments. This can usually be done via configuration options or by programmatically overriding the default behavior.
Consult the documentation for the library or API that deals with XML parsing for details on how to disable dangerous and unnecessary features.
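How these features are disabled depends on the parser in use. As one conservative illustration in Python, the sketch below uses the standard xml.parsers.expat module and refuses any document that declares a DOCTYPE; this is stricter than only disabling external entities, and is an assumption of this example rather than a requirement. The names DoctypeForbidden and parse_without_dtd are hypothetical.

```python
import xml.parsers.expat


class DoctypeForbidden(ValueError):
    """Raised when a document declares a DOCTYPE (and possibly entities)."""


def parse_without_dtd(xml_text: str) -> list[str]:
    """Parse XML and return the element names, rejecting any DOCTYPE declaration."""
    parser = xml.parsers.expat.ParserCreate()
    elements: list[str] = []

    def on_doctype(name, system_id, public_id, has_internal_subset):
        # Entities can only be declared inside a DTD, so stopping here means
        # that no internal or external entity is ever defined or resolved.
        raise DoctypeForbidden("DOCTYPE declarations are not accepted")

    def on_start_element(name, attributes):
        elements.append(name)

    parser.StartDoctypeDeclHandler = on_doctype
    parser.StartElementHandler = on_start_element
    parser.Parse(xml_text, True)  # True marks this as the final chunk
    return elements
```

Applications that genuinely need DTDs cannot use this shortcut and should instead rely on the specific options their XML library provides for disabling external entity resolution.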

Insecure Deserialization

When data organized into structures such as arrays, records, graphs, classes, or other configurations must be stored or transmitted to another location, for example across a network, it has to pass through a process called serialization; this process converts the data into a linear format that is easy to transmit and to archive on storage devices.

Conversely, deserialization converts linear data into structured data, instantiating the object for use by the target process. The formats of the serialized objects are standardized so that they can be read by different platforms. Some of the platforms that support serialization include Python, Perl, PHP, Ruby and Java. The Microsoft .NET platform also supports serialization with the XmlSerializer and DataContractSerializer classes, as well as the BinaryFormatter and NetDataContractSerializer classes, which are more powerful but also more vulnerable. XML, YAML, and JSON are among the most commonly used serialization formats.

The insecure deserialization vulnerability arises the moment an attacker is able to inject malicious data into serialized data. The exploitation of this attack takes place when the destination process creates an active instance from serialized data.

Countermeasures

Minimize the usage of deserialization techniques and/or use them with extreme care, reducing unnecessary data transfers between applications / systems, as well as the amount of files written to disk.
Disable potentially harmful deserialization features (such as "recursive deserialization") and check the deserialized data for potentially harmful content.
Adhere to the principle of least privilege, minimizing or disabling access to administrative privileges to reduce the impact of a possible successful attack.
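In Python, for instance, the documentation of the pickle module warns that unpickling untrusted data can execute arbitrary code. A common way to apply the first two countermeasures is to exchange plain data formats and check the result before using it; the sketch below does this with JSON. The session structure, the field names and the allowed roles are assumptions made up for this example.

```python
import json

# Expected shape of the deserialized data (hypothetical for this example).
EXPECTED_FIELDS = {"user_id": int, "role": str}
ALLOWED_ROLES = {"reader", "editor"}


def load_session(serialized: str) -> dict:
    """Deserialize a session from JSON and validate it before it is used."""
    # json.loads only builds dicts, lists, strings, numbers, booleans and None:
    # it never instantiates arbitrary classes or runs code found in the input.
    data = json.loads(serialized)

    if not isinstance(data, dict) or set(data) != set(EXPECTED_FIELDS):
        raise ValueError("unexpected structure")
    for field, expected_type in EXPECTED_FIELDS.items():
        # An exact type check also rejects booleans where integers are expected.
        if type(data[field]) is not expected_type:
            raise ValueError(f"{field}: unexpected type")
    if data["role"] not in ALLOWED_ROLES:
        raise ValueError("role: value not allowed")
    return data
```

Unknown fields are rejected rather than ignored, so an attacker cannot smuggle extra attributes into the object the application will work with.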

Cross Site Scripting (XSS)

Cross Site Scripting (XSS) is a typical vulnerability in Web applications and consists of the ability to insert HTML or client-side scripting (commonly JavaScript) code within a page viewed by other users: by exploiting this technique the attacker can force the execution of JavaScript code inside the browser used by the visitor.

The most common use of Cross Site Scripting is aimed at intercepting the cookies and / or the token of an authenticated user and stealing their web session, thus operating on their behalf (session hijacking / session spoofing).

In some cases an attacker has the ability to inject persistent code into the vulnerable web page, for example if the malicious code is stored by the server (such as in a database): when this happens, the malicious code can attack every client that loads the page (stored XSS). In other circumstances the injected code is not stored, and its execution is only possible by enticing the user, often through Social Engineering techniques, to click on a link that points to the vulnerable web page (reflected XSS). In this last scenario, the malicious URL is usually encoded in hexadecimal (or other "cloaked" forms) to prevent the user from noticing the JavaScript code passed as a parameter.

Cross Site Scripting (XSS) vulnerabilities can be exploited by an attacker to:

take remote control of a browser;
obtain a cookie;
change the link to a page;
redirect the user to a URI different from the original;
force the entry of important data in non-trusted forms (phishing);

These and other malicious activities should be prevented by adopting some useful countermeasures.

Countermeasures

In order to avoid Cross Site Scripting, it's very important to verify the input that comes from the outside before using it within the web application (storing and/or showing it in web pages); such verification involves the use of escaping functions, which detect characters considered dangerous (such as ampersands, HTML delimiters, escaping characters, and so on) and replace them with normal, safe text. There are a lot of built-in and third-party libraries that can help developers to "sanitize" HTML tags and JavaScript code.
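As a small illustration of such an escaping function, Python's standard library provides html.escape, which replaces the characters &, <, > and quotes with their HTML entities. The render_comment helper below is a hypothetical example of applying it right before user-supplied text is placed in a page.

```python
import html


def render_comment(author: str, text: str) -> str:
    """Build an HTML fragment, escaping every user-supplied value."""
    # html.escape turns & < > " ' into entities, so the browser displays
    # the input as text instead of interpreting it as markup or script.
    return f"<p><b>{html.escape(author)}</b>: {html.escape(text)}</p>"
```

This kind of escaping is suitable for text placed in the body of an HTML page; values inserted into JavaScript code or URLs need an encoding specific to that context, which most template engines and dedicated libraries take care of.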

Directory traversal

Also known as "path traversal" or "dot-dot vulnerability", this issue occurs when the attacker can supply input that the application uses to access a file for reading and/or writing. For obvious security reasons, applications usually do not allow arbitrary paths (for example /etc/ , /var/www , c:\winnt\system32\ and so on) or do not use them "as is": when they do, the attacker might be able to reach and read a file that resides outside the intended directory by prefixing its name with a sequence of parent-directory references (such as ../../../../ ).

Directory traversals have been used by attackers since the development of the first Web Servers, yet they are still widely used today: for this very reason, all modern web applications and frameworks are specifically designed to mitigate the risk of their exploitation by enforcing different approaches that don't support the usage of user-defined file names and/or arbitrary paths.

Countermeasures

Avoid using user-defined file system names and/or paths.
If the user has to choose a file, limit the selection to a restricted set of allowed files (white list), for example through a numerical index. If you need to use a path provided by the user, it should be verified and / or escaped.
Create a chroot jail, i.e. do not allow the web application to escape its own root directory, so as to safeguard critical operating system directories. The same result could be achieved by running the application as a user with limited privileges, whose home directory coincides with the document root.
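When a user-provided path cannot be avoided, the verification step can be sketched as follows: resolve the requested path and make sure the result still lies inside the allowed directory. The function name safe_join is hypothetical, and Path.is_relative_to requires Python 3.9 or later.

```python
from pathlib import Path


def safe_join(base_dir: str, user_path: str) -> Path:
    """Resolve user_path inside base_dir, refusing anything that escapes it."""
    base = Path(base_dir).resolve()
    # resolve() collapses ".." components and follows symbolic links, so the
    # check below is made on the real location, not on the raw string.
    candidate = (base / user_path).resolve()
    if not candidate.is_relative_to(base):
        raise ValueError("path escapes the allowed directory")
    return candidate
```

An absolute path such as /etc/passwd replaces the base directory when joined, so it fails the same check as a sequence of ../ components.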

SQL Injection

Last but not least, we come to SQL Injection, a dreadful problem that affects most Web Applications that rely upon a back-end layer that uses a relational database; however, it's important to specify that SQL Injection issues can affect all kinds of applications, including client / server ones, as long as they use a Database.

In general terms, a SQL Injection vulnerability occurs when a script or other application component does not appropriately filter the input passed by the user, making it possible for an attacker to alter the original structure of the SQL query through the use of special characters (for example single and double quotes) or by concatenating multiple constructs (for example using the SQL UNION keyword). Depending on the circumstances and the type of database server with which the application interfaces, the attacker can exploit a SQL Injection issue to:

bypass the authentication mechanisms of a portal (for example by forcing the conditions evaluated by the login checks to always be true);
reconstruct the contents of a database (for example by locating the tables containing the tokens of active sessions, or by viewing encrypted / unencrypted user passwords or other information of a critical nature);
add, alter or remove data already present in the Database;
execute stored procedures.

Countermeasures

To prevent a SQL Injection attack, you must avoid concatenating query strings and rely on stored procedures and parametric queries (prepared statements). It may be useful to use an ORM library such as EntityFramework, Hibernate, or iBatis, but this technology - by itself - does not prevent SQL Injection unless coupled with other development best practices.
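A parametric query can be sketched with Python's built-in sqlite3 module as follows. The users table and the find_user function are assumptions made for this example; other database drivers follow the same principle, although the placeholder syntax may differ.

```python
import sqlite3


def find_user(conn: sqlite3.Connection, username: str) -> list[tuple]:
    """Look up a user with a parametric query instead of string concatenation."""
    # The ? placeholder keeps the value separate from the SQL text: whatever
    # the user typed is treated as data and can never change the query structure.
    cursor = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchall()
```

With this approach, an input such as ' OR '1'='1 is simply searched for as a literal user name and matches nothing.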

For additional info regarding SQL Injection threats and countermeasures, take a look at our dedicated article, SQL Injection: Security Best Practices & Guidelines.

Conclusions

That's it, at least for the time being: we hope that this post will help most software developers and system administrators to prevent some of the most common Input Validation issues affecting modern web applications and apply the appropriate security countermeasures to prevent them.

