Validating Untrusted String Inputs

Tuesday, November 11th, 2008

Alright!! In my last post about untrusted inputs, we talked about validating the data of “integer” input parameters, checking the out parameters, et cetera. This time, we’ll talk about other types of inputs. If you have written a program that takes multiple lines of strings as input from the user, you need to make sure that the input is not tainted: that it is clean and matches your expectations. For example, if your program requires an answer to a question that can be subjective, you need to provide a string buffer large enough to hold a complete answer but not so large that it can crash your system or make it run out of memory. You also need to protect your system against malicious scripts being inserted. Strings are a very risky area for inputs because there is no pre-defined rule for this type of validation. So, the following are the points to ponder to make your code safe and secure.

  1. Firstly, do use regular expressions to validate the string input. For example, ^[A-Za-z0-9]+$ specifies that the string must be at least one character long and that it can only include upper-case letters, lower-case letters, and the digits 0 through 9 (in any order). You can use regular expressions to limit which characters are allowed and to be more specific (for example, you can often limit even further what the first character can be). If you use regular expressions, be sure to indicate that you want to match the beginning (usually symbolized by ^) and the end (usually symbolized by $) of the data in your match. If you forget to include ^ or $, an attacker could wrap an attack around a piece of legal text and bypass your check.
  2. If your program needs a wider variety of input and the pattern above doesn’t meet the requirements, you will need somewhat more complicated regular expressions. If the data is a filename (or will be used to create one), be very restrictive. Ideally, don’t let users choose filenames; if that won’t work, limit the characters to small patterns such as ^[A-Za-z0-9][A-Za-z0-9._\-]*$. You should consider omitting from the legal patterns characters like “/”, control characters (especially newline), and a leading “.”. Similarly, you need to take care with email strings, locale-specific strings, UTF-8 encoded characters, et cetera. In most programs, complex regular expressions are good enough to validate a string. But in certain cases, a malicious input containing some script code can spoil the fun.
  3. If your program encounters HTML tags or script-related instructions in the input, the input should be rejected immediately; otherwise your program may end up carrying self-executing malicious code. This technique is generally used in Cross-Site Scripting (XSS) attacks, which are especially a problem for Web applications. So, again, take care not to accept any input that looks like an HTML tag. The easiest way is to use the regular expressions mentioned above, which won’t allow the ‘<’ or ‘>’ characters. But if you must support some HTML tags, like <a href=>, validate them explicitly, for example by matching the whole URL string against a regex like ^(http|ftp|https)://[-A-Za-z0-9._/]+$. A pattern that allows some more complex URLs is: ^(http|ftp|https)://[-A-Za-z0-9._]+(\/([A-Za-z0-9\-\_\.\!\~\*\'\(\)\%\?]+))*/?$
  4. For more complex inputs, such as a data file, regular expressions again prove useful, but the ideal way is to break the file into multiple chunks rather than reading it in as one complete string.
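
The patterns from points 1 to 3 can be tried out with a minimal Python sketch like the one below. The names ALNUM, FILENAME, URL, UNANCHORED and is_valid are made up for this illustration and are not from the post. One assumption worth flagging: in Python, $ also matches just before a trailing newline, so this sketch uses fullmatch, which only succeeds when the pattern covers the entire string.

```python
import re

# Patterns quoted from the points above; the constant names are illustrative.
ALNUM = re.compile(r"^[A-Za-z0-9]+$")
FILENAME = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._\-]*$")
URL = re.compile(r"^(http|ftp|https)://[-A-Za-z0-9._/]+$")

# The same character class without ^ and $, to show the pitfall from point 1.
UNANCHORED = re.compile(r"[A-Za-z0-9]+")


def is_valid(pattern, text):
    """Return True only if the whole of text matches pattern."""
    # fullmatch rejects a trailing newline that a bare "$" would let through.
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    print(is_valid(ALNUM, "abc123"))               # plain alphanumeric input
    print(is_valid(ALNUM, "<script>x</script>"))   # contains '<' and '>'
    print(is_valid(FILENAME, "../etc/passwd"))     # leading "." and "/"
    print(is_valid(URL, "javascript:alert(1)"))    # scheme not in the allowed list
    # The unanchored search finds "script" inside the tag and reports a match.
    print(UNANCHORED.search("<script>x</script>") is not None)
```

The last line is the case point 1 warns about: without the anchors, the check only proves that some legal text appears somewhere in the input, not that the input consists of nothing else.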

To keep your string input within a well-defined range or buffer, make sure that your program copies no more characters than the buffer can hold and always terminates the result with a NUL character. This ensures that even if a very large input is supplied, the string gets truncated as soon as the buffer is full, and the program is protected from a buffer overflow.
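
Python strings are not NUL-terminated and the runtime manages their memory, so the buffer advice above does not carry over literally. As a rough analogue, the sketch below reads input in fixed-size chunks (as in point 4) and enforces an upper bound on the total length. The limits and the names read_bounded and InputTooLong are assumptions made for this example, and the sketch rejects oversized input instead of truncating it, which is a deliberate swap from the behaviour described above.

```python
import io

MAX_INPUT_CHARS = 4096  # assumed upper bound for one answer
CHUNK_CHARS = 512       # assumed chunk size


class InputTooLong(ValueError):
    """Raised when the input exceeds the configured limit."""


def read_bounded(stream, max_chars=MAX_INPUT_CHARS, chunk_chars=CHUNK_CHARS):
    """Read text from stream chunk by chunk, refusing more than max_chars."""
    parts = []
    total = 0
    while True:
        chunk = stream.read(chunk_chars)  # returns at most chunk_chars characters
        if not chunk:                     # empty string means end of input
            break
        total += len(chunk)
        if total > max_chars:
            # Stop reading as soon as the limit is crossed.
            raise InputTooLong(f"input longer than {max_chars} characters")
        parts.append(chunk)
    return "".join(parts)


if __name__ == "__main__":
    print(read_bounded(io.StringIO("a short answer")))
    try:
        read_bounded(io.StringIO("x" * 5000))
    except InputTooLong as err:
        print("rejected:", err)
```

Because the check runs after every chunk, an oversized input is refused after reading at most one chunk past the limit, so it is never held in memory in full.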

© Safer Code | Validating Untrusted String Inputs
