Archive for November, 2008

Generic Function Pointers In C And Void *

Tuesday, November 25th, 2008

People often ask me which keyword or type they should use to declare a generic function pointer in C, or worse still, they don’t ask and steam ahead using “void *”. C does not have a dedicated generic function pointer type, but it does give us a way to get a generic function pointer. We’ll see why void * cannot be used to denote generic function pointers and how we can declare them instead, but first a brief word on why someone would need a generic function pointer in the first place.

Why do we need Generic Function Pointers?

Let’s explain this with a slightly advanced example: a module M1 that supplies information to many other modules M2, M3, M4…Mn. M1 delivers this information through callbacks, but each of these modules needs a different kind of information and therefore a different prototype for its callback function. The modules register with M1 using an API, say M1_register(Mn_callback_ptr). Now, we could either have a separate registration API for each “type/class” of subscriber module, depending on the kind of callback it supplies, or we could have a single generic function pointer type, to which each module typecasts its actual callback before calling the registration API. M1, in turn, typecasts each callback pointer back to its original type before calling it.
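As a rough sketch of that registration scheme (not the declaration this post goes on to give), the code below assumes `void (*)(void)` as the generic type. The `generic_fp` typedef, the `cb_kind` tag, the two callback prototypes and `M1_notify_all` are all hypothetical names introduced for illustration, and the extra `kind` argument to `M1_register` is an assumption so that M1 knows which type to cast back to.

```c
#include <stddef.h>
#include <string.h>

/* Assumed generic function pointer type; hypothetical name. */
typedef void (*generic_fp)(void);

/* Hypothetical callback prototypes for two subscriber modules. */
typedef int  (*m2_callback)(int value);
typedef void (*m3_callback)(const char *text, int *out_len);

/* Tag telling M1 which real prototype a registered pointer has. */
enum cb_kind { CB_M2, CB_M3 };

struct subscriber {
    enum cb_kind kind;
    generic_fp   fn;
};

#define MAX_SUBSCRIBERS 8
static struct subscriber subscribers[MAX_SUBSCRIBERS];
static size_t subscriber_count;

/* Single registration API: returns 0 on success, -1 on bad input or full table. */
int M1_register(enum cb_kind kind, generic_fp fn)
{
    if (fn == NULL || subscriber_count >= MAX_SUBSCRIBERS)
        return -1;
    subscribers[subscriber_count].kind = kind;
    subscribers[subscriber_count].fn = fn;
    subscriber_count++;
    return 0;
}

/* M1 casts each pointer back to its original type before calling it. */
void M1_notify_all(void)
{
    for (size_t i = 0; i < subscriber_count; i++) {
        switch (subscribers[i].kind) {
        case CB_M2:
            ((m2_callback)subscribers[i].fn)(42);
            break;
        case CB_M3: {
            int len = 0;
            ((m3_callback)subscribers[i].fn)("hello", &len);
            break;
        }
        }
    }
}

/* Example subscribers; the globals only record what each callback received. */
int m2_last_value;
int m3_last_len;

int m2_on_value(int value)
{
    m2_last_value = value;
    return 0;
}

void m3_on_text(const char *text, int *out_len)
{
    *out_len = (int)strlen(text);
    m3_last_len = *out_len;
}
```

A subscriber would register with something like `M1_register(CB_M2, (generic_fp)m2_on_value)`. The important part is that the pointer is only ever called after being cast back to the exact type it started as.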

Why can’t we use void* for a Generic Function Pointer?

This is because a void * is a pointer to a generic “data” type. A void * is used to denote pointers to objects, and on some systems pointers to functions can be larger than pointers to objects. So if you convert between them you can lose information, and the C standard does not define the result; whether it works depends on the implementation. Most compilers won’t even warn you when you convert between them, though some will report an error if you try to call through a void * directly. Even those may fail to alert you if you typecast the call carefully (enclosing the cast in parentheses before the function call brackets). And then, one fine day, you’ll compile and run your program on one of the aforementioned systems and keep wondering why it segfaults.
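To make the contrast concrete, here is a small sketch. It relies on one thing the C standard does guarantee: a function pointer converted to another function pointer type and back compares equal to the original. The `generic_fp` typedef and the helper names are assumptions made up for this example.

```c
#include <stdio.h>

/* Assumed generic function pointer type; hypothetical name. */
typedef void (*generic_fp)(void);

static int add_one(int x)
{
    return x + 1;
}

/* Round-trips a function pointer through generic_fp, then calls it. */
int call_via_generic(int x)
{
    generic_fp g = (generic_fp)add_one;        /* function pointer to function pointer */
    int (*restored)(int) = (int (*)(int))g;    /* back to the original type */
    return restored(x);
}

/* Prints both pointer sizes. They match on many platforms,
 * but the C standard does not require them to. */
void print_pointer_sizes(void)
{
    printf("sizeof(void *)     = %zu\n", sizeof(void *));
    printf("sizeof(generic_fp) = %zu\n", sizeof(generic_fp));
}
```

Storing `add_one` in a `void *` instead would be exactly the conversion the paragraph above warns about, even if it happens to work on your machine.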

Note: C++ does allow this “conditionally”, which means that such a conversion is permitted but a compiler is not bound to implement it, which again makes its usage suspect.

So, how exactly do we declare a Generic Function Pointer?

(more…)

© Safer Code | Generic Function Pointers In C And Void *

All Input is Evil

Tuesday, November 18th, 2008

In my previous posts, I have emphasized validating integer and string inputs by putting various checks in place. Now I suggest that you treat every type of input to your application or software as “evil”. Consider the following two rules for any input data:

  1. All input is evil until proven otherwise.
  2. Data must be validated as it crosses the boundary between untrusted and trusted environments.

Until now, I have explained how to validate integer and string data; today, I’ll explain what to validate in the input data. First things first: look for valid data and reject everything else. You should deny all access until you are sure that the input in the request is valid. You should look for valid data, rather than for invalid data, for two reasons:

  1. There might be more than one valid way to represent the data.
      • For example, the word “Rose” can be represented in many ways, like “ROSE”, “rose”, “R%6fse”, “RoSE” et cetera. All of these are valid variations of the single word “Rose”, but a check that looks for one specific form will miss the others, which can definitely be a problem for an application.
  2. You might miss an invalid data pattern.
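One way to picture “look for valid data and reject everything else” is an allow-list check. This is only a sketch and not the code the post goes on to discuss: the `allowed_words` table and the function name are hypothetical, and the only canonicalization done here is lower-casing.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical allow-list of canonical (lower-case) values. */
static const char *const allowed_words[] = { "rose", "lily", "tulip" };

/* Returns 1 if input, once lower-cased, exactly matches an allowed word;
 * returns 0 for everything else. */
int is_allowed_word(const char *input)
{
    char canonical[16];
    size_t len;

    if (input == NULL)
        return 0;
    len = strlen(input);
    if (len == 0 || len >= sizeof canonical)
        return 0;                          /* too long to be any allowed word */

    /* Canonicalize first, so "ROSE", "rose" and "RoSE" compare the same. */
    for (size_t i = 0; i < len; i++)
        canonical[i] = (char)tolower((unsigned char)input[i]);
    canonical[len] = '\0';

    for (size_t i = 0; i < sizeof allowed_words / sizeof allowed_words[0]; i++) {
        if (strcmp(canonical, allowed_words[i]) == 0)
            return 1;
    }
    return 0;                              /* default: reject */
}
```

Note that an encoded form such as “R%6fse” is rejected here, because this sketch does not URL-decode. If your application should accept it, decode before the comparison rather than adding special cases afterwards.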

Consider the following code: (more…)

© Safer Code | All Input is Evil

Validating Untrusted String Inputs

Tuesday, November 11th, 2008

Alright! In my last post about untrusted inputs, we talked about validating the data of “integer” input parameters, checking the out parameters et cetera. This time, we’ll talk about other types of inputs. If you have written a program that takes multiple lines of strings as input from the user, you need to make sure that the input is not tainted: that it is clean and matches your expectations. For example, if your program requires an answer to a question that can be subjective, then you need to provide a string buffer large enough to hold a complete answer but not so large that it crashes your system or makes it run out of memory. You also need to protect your system from having malicious scripts inserted. Strings are a very risky area for inputs, as there is no pre-defined rule for this type of validation. So, the following are points to ponder to make your code safe and secure.

  1. Firstly, do use regular expressions to validate the string input. For example, ^[A-Za-z0-9]+$ specifies that the string must be at least one character long and that it can only include upper-case letters, lower-case letters, and the digits 0 through 9 (in any order). You can use regular expressions to limit which characters are allowed and to be more specific (for example, you can often limit even further what the first character can be). If you use regular expressions, be sure to indicate that you want to match the beginning (usually symbolized by ^) and end (usually symbolized by $) of the data in your match. If you forget to include ^ or $, an attacker could include legal text inside their attack to bypass your check.
  2. Now, if your program needs a wider variety of input and the above point doesn’t fulfil the requirements, then you need somewhat more complicated regular expressions. If the data is a filename (or will be used to create one), be very restrictive. Ideally, don’t let users choose filenames, and if that won’t work, limit the characters to small patterns such as ^[A-Za-z0-9][A-Za-z0-9._\-]*$. You should consider omitting from the legal patterns characters like “/”, control characters (especially newline), and a leading “.”. Similarly, you need to take care with email strings, locale-specific strings, UTF-8 encoded characters et cetera. In most programs, complex regular expressions are good enough to validate a string. But in certain cases, a malicious input containing some script code can spoil the fun.
  3. If your program encounters HTML tags or script-related instructions in the input, the input should be rejected immediately, or your program might get infected with self-executing malicious code. This technique is generally used in cross-site scripting (XSS) attacks, which are especially a problem for Web applications. So, once again, take care not to accept any input that looks like an HTML tag. The easiest way is to use the regular expressions mentioned above, which won’t allow the ‘<’ or ‘>’ character. But if you must support some HTML tags like <a href=> etc., please validate the URLs they carry separately, by filtering the whole string using a regex like ^(http|ftp|https)://[-A-Za-z0-9._/]+$. A pattern that allows some more complex URLs is: ^(http|ftp|https)://[-A-Za-z0-9._]+(\/([A-Za-z0-9\-\_\.\!\~\*\'\(\)\%\?]+))*/?$
  4. For more complex strings, such as the contents of a data file, regular expressions again prove useful, but the ideal way is to break the file into multiple chunks rather than reading it as one complete string.
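The first two points can be tried out in C with the POSIX regex API (`regcomp`, `regexec`, `regfree` from `<regex.h>`), assuming a POSIX system. This is a minimal sketch; the helper names are made up for illustration. One detail differs from the pattern quoted above: inside a POSIX bracket expression a backslash is an ordinary character, so the filename pattern puts the hyphen last instead of writing `\-`.

```c
#include <regex.h>
#include <stddef.h>

/* Returns 1 if text matches the POSIX extended regex, 0 if it does not,
 * and -1 if the pattern itself fails to compile. */
int matches_pattern(const char *pattern, const char *text)
{
    regex_t re;
    int rc;

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&re, text, 0, NULL, 0);
    regfree(&re);
    return rc == 0 ? 1 : 0;
}

/* One or more letters or digits, anchored at both ends. */
int is_alnum_token(const char *s)
{
    return matches_pattern("^[A-Za-z0-9]+$", s) == 1;
}

/* Restrictive filename check: no '/', no control characters, no leading '.'.
 * The '-' is placed last in the bracket expression so it is taken literally. */
int is_safe_filename(const char *s)
{
    return matches_pattern("^[A-Za-z0-9][A-Za-z0-9._-]*$", s) == 1;
}
```

Dropping the anchors shows why they matter: `matches_pattern("[A-Za-z0-9]+", "<script>x")` reports a match, because the legal text “script” appears somewhere inside the attack string.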

To keep your string input within a well-defined range or buffer, make sure that your program reads no more than the buffer can hold and terminates it with a null character (‘\0’). This ensures that even if an oversized input is supplied, the string is truncated as soon as the buffer is full, which protects it from buffer overflow.
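Here is one way that advice might look in practice, sketched with the standard `fgets`, which never writes more than the size you pass and always null-terminates what it stores. The function name and its return convention are assumptions for this example.

```c
#include <stdio.h>
#include <string.h>

/* Reads one line from in into buf, which holds size bytes.
 * Returns  1 if the whole line fit,
 *          0 if the line was truncated (the excess is discarded),
 *         -1 on end of input, read error, or bad arguments.
 * On 1 or 0, buf is null-terminated and has no trailing newline. */
int read_line_bounded(FILE *in, char *buf, size_t size)
{
    size_t len;
    size_t discarded = 0;
    int ch;

    if (in == NULL || buf == NULL || size < 2)
        return -1;
    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    len = strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') {
        buf[len - 1] = '\0';               /* whole line fit: strip the newline */
        return 1;
    }

    /* No newline stored: throw away the rest of the line, so the next
     * call starts on a fresh line, and count what was dropped. */
    while ((ch = getc(in)) != EOF && ch != '\n')
        discarded++;
    return discarded == 0 ? 1 : 0;
}
```

The caller decides what a truncated line means. For untrusted input, rejecting it outright is often safer than working with half an answer.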

© Safer Code | Validating Untrusted String Inputs

Unsafe Functions In C And Their Safer Replacements: Strings Part I

Tuesday, November 4th, 2008

A string is a fundamental part of programs all around us. Data exchange in many forms happens in strings (e.g. user input, command line arguments, web forms, text protocols and what not). But most programs written in C are plagued by security issues because of their usage of unsafe functions. A string is not a built-in data type in C; instead, it is defined as a contiguous sequence of characters terminated by a null character (‘\0’). Now, many of the “standard” string manipulation functions written in the early part of C development took this definition to heart, assumed that a programmer always knows what he is doing (though I agree that this MUST be true), and put out code meant to be used in an everyone-is-good world. Subsequently, the shortcomings were noticed and stronger sibling functions were created, but the older ones are still supported because they are “standard”. This means that naive programmers continue to use them and put their programs’ security in jeopardy. This series will do an in-depth analysis of such unsafe functions, tell you why they are unsafe, and bring out what built-in alternatives you have and what alternatives you can create.

Our first candidate is the very famous “strcpy()”. Let’s see why it is unsafe.
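The post’s own analysis sits behind the link below. As a short illustration of the usual concern: strcpy() copies bytes until it meets the source’s terminating null character and is never told how big the destination is. The sketch below shows one possible length-checked alternative; `copy_if_fits` is a hypothetical helper, not one of the replacements this series proposes.

```c
#include <stddef.h>
#include <string.h>

/* Copies src into dst only if it fits, terminator included.
 * Returns 0 on success. Returns -1 on bad arguments or if src is too long;
 * in the too-long case dst is set to the empty string. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    len = strlen(src);
    if (len >= dst_size) {                 /* no room for the '\0' */
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);             /* len bytes plus the terminator */
    return 0;
}
```

With an 8-byte destination, `copy_if_fits(dst, sizeof dst, "12345678")` is refused, whereas `strcpy(dst, "12345678")` would write a ninth byte past the end of the buffer.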

(more…)

© Safer Code | Unsafe Functions In C And Their Safer Replacements: Strings Part I
