Author Archive
Tuesday, December 29th, 2009
Subscribe To Our Feed | Follow Us On Twitter | Get Updates on Email
Many times while going through some Perl code, you must have come across snippets like “select((select(fh), $|=1)[0])” and wondered what they mean, even though you might know that:
- $|=1 is used for setting autoflush (i.e. unbuffered data output) and that
- select is used to set the default output to a given file handle instead of STDOUT
Whenever I face any issue with a code fragment, I try to break it down into simpler terms to understand it from the beginning (like I did for my random number generation post). So, these are the steps in which I progressed:
- select(fh) makes fh the default output filehandle in place of STDOUT and returns the previously selected filehandle (STDOUT, unless something else had been selected earlier).
- (select(fh), $|=1) does the above and then sets autoflush on this new default output.
- From Perl’s online docs, I found that the result of the above is a list, the first element of which is the old filehandle.
- So, (select(fh), $|=1)[0] gives us STDOUT.
- Then select((select(fh), $|=1)[0]) simply sets the default output back to STDOUT.
So, what is the use of all this? Basically, this is nothing but a trick to set autoflush for any filehandle. Now, there is a much simpler way to do this. You just need to include the IO::Handle module (by writing “use IO::Handle;” in your script) and then call “fh->autoflush(1)” on your filehandle (use 0 as the parameter to disable autoflushing). This is much cleaner, although it costs a little extra startup time because your script now has to load and compile the code of the module you added.
© Safer Code | Weird Usage Of “select” in perl
Tags: autoflush, STDOUT, file-handle autoflush in perl, perl, select
Posted in General, Other Languages | No Comments »
Thursday, December 24th, 2009
A global or static structure is automatically initialized with all of its elements set to 0 (or to null, in the case of pointers). But what would you do if you wanted to initialize a local struct the same way? Many would say that they would use memset or calloc to set every byte of that structure instance to 0. But this is not portable, because a null pointer is not guaranteed to be represented by all-zero bits. So, the correct way to do this is:
struct a b = {0};
What this does is initialize the first element of the structure b to 0 (or to a null pointer if it is a pointer); the rest of the elements are initialized just as they would have been if the structure were global or static.
© Safer Code | Structure Initialization (All Elements to 0) In C
Tags: C, struct initialization, structure initialization
Posted in C/C++ | 2 Comments »
Tuesday, December 8th, 2009
This topic is quite easy and doesn’t need much explanation, but many people still manage to mess it up. Without indulging in hyperbole, I’ll get straight to the point. When asked, people can quickly tell you how to get a random number from 0 up to (but not including) b (b being any number below the maximum random number possible, which is defined as RAND_MAX by most C compilers). Using the function “rand” provided by most C libraries, this is as simple as: result = rand() % b;
Basically, given any random number, if you take its modulus with b, you will obviously get a number between 0 and b - 1. This is all fine and dandy, so now someone asks you to generate a random number between a and b (a < b). This one is also really simple, but some people still fumble it. Just think of it this way.
- If you add 1 to the right-hand side of the above equation, your random number will be between 1 and b. So, basically your “lower limit” is raised by one. In our case, the desired lower limit is a, so just raise it by a by adding a to the right-hand side, to arrive at this partial solution: result = rand() % b + a;
- To complete the equation, now think of the gap between the minimum and maximum results obtained from the original equation. The minimum is 0 and the maximum is (b-1). But your desired gap is (b-a). Since taking the modulus with respect to b gives you a gap of b-1, to get the desired gap you need to take the modulus with respect to ((b-a)+1). So, the minimum value this will give you is 0 and the maximum is (b-a) +1 -1 = b-a. So, your final equation becomes
result = rand() % (b - a + 1) + a;
This will give you a minimum value of 0 + a = a
and a maximum of b – a + a = b.
Note that the above solution includes the limits in the result. If you don’t want to include the limits (i.e. minimum result = a + 1 and maximum result = b - 1), then just add (a + 1) instead of a in step 1 and use (b - a - 1) for your modulus operation instead of (b - a + 1) in step 2 to make the equation:
result = rand() % (b - a - 1) + a + 1;
Note that in the above equation we used (b - a - 1), though on the surface it looks like we could have gone with just (b - a). After all, we only wanted to decrease the upper limit by 1. But we had to subtract an extra 1 because of the 1 we added to raise the lower limit to (a + 1).
© Safer Code | Random Number Between Two Integers
Tags: rand, Random Numbers, RAND_MAX
Posted in C/C++, General | 14 Comments »
Monday, August 24th, 2009
Most C programmers, even beginners, would claim that arrays are easy apart from those abundant off-by-one errors, and they are right. Arrays are easy indeed. However, here are a few points to consider when writing a program that needs a rather large array (by the way, the idea for this article came to mind while helping a colleague resolve an error a couple of days ago). When creating a large array (e.g. something like int a[1000][1000], which takes up around 3.8 MB on machines with 4-byte ints), you might not have any issue at all, or you might see a compile-time error or, worse, runtime errors.
Why do these errors happen? Depending on your machine’s limitations, you are most probably exhausting the available stack (if you used a local variable) or data memory. Depending on your compiler, this might be flagged at compile time or surface only when you try to run your program. Or it might be that your compiler isn’t able to work with large arrays. Here is what you can do to alleviate these problems:
- The first and best way is to minimize your array size (there is an amazing example of this in the book “Programming Pearls”: http://www.cs.bell-labs.com/cm/cs/pearls/cto.html )
- Use the “huge” memory model.
- Use a global variable (or a static variable if you are particular about its visibility to the rest of the program). The stack is generally quite limited as compared to the data memory available to a program. So, a variable with static storage would ensure that you use memory from the data segment.
- Don’t declare it as an array at all and instead use a pointer and dynamically allocate the memory required for it. You might have to use special allocation calls instead of normal malloc/calloc to get this working though (e.g. using farmalloc)
- Create a section in assembly with the “Area” directive (or whatever it is for your particular assembler) and reserve space for your array there and refer to that array as an extern variable.
The first approach is something that you should always look for. But if you can’t minimize your need any further, choose one of the remaining options. Keep in mind that these options can still fail, e.g., when you don’t have enough RAM/heap memory remaining at runtime (of course, you would have planned for a graceful exit in such a case instead of the random crash that would have occurred otherwise). Still, they can be useful when you don’t have any real RAM limitations and the only problem is that your compiler isn’t able to work with large objects.
© Safer Code | Large Arrays In C
Tags: BSS, C CPP, heap, Large Arrays, Programming Pearls, Stack Overflo
Posted in C/C++, Optimization, Security | 5 Comments »
Tuesday, May 12th, 2009
Many people would agree with me when I say that the hardest part of fixing a bug is finding where it originates. It has happened to me quite a number of times that a certain module exhibited particularly wild behaviour, making me mad at its developer, but as I dug deeper, the culprit turned out to be someone who had not even heard of that module.
Today I bring to you such an example of a nice bug, which I’ll term as (like many others):
OpenOffice.org Cannot Print On Tuesdays Bug
OpenOffice.org (also known as OOo or just Open Office) is a free and widely used MS Office alternative, and a lot of its users recently reported that it would just stop printing every Tuesday. Come Wednesday, everything would be fine and dandy, but only for just under a week, until Tuesday showed its face again. Now, before I continue to unravel the mystery behind this unique bug, let me make it clear to my Indian friends that OOo is not a devotee of Lord Hanuman deciding to fast on Tuesdays to offer its obeisances.
Well, after days of discussions, and people blaming everything from OOo to cups (the printer daemon), printer drivers, or the printer itself, one enterprising soul decided to investigate and found out that if he changed the “CreationDate” tag in the generated postscript file to replace the “Tue” of Tuesday with something else, the file happily printed.
(more…)
© Safer Code | A Bug Is Not Always Where It Seems To Be
Tags: 500 mile email, bugs, debugging, OOo, Open Office Bug, Open Office tuesday print bug, OpenOffice.org
Posted in General | No Comments »
Tuesday, May 5th, 2009
First of all, I apologize for the decreased frequency of updates. We have been quite busy with our offline lives and primary livelihoods lately, which has kept us away from posting much. But we intend not to let it remain like this for much longer. I’m posting a short article today about something that almost every one of us has had to do at some point, i.e., find the offset (or relative position in bytes) of an element in a structure. Let’s take the following structure as an example:
typedef struct example
{
    char a;
    int b;
    char c;
} example;
Now, if I were to ask you to find out the offset of element b in the above structure, you probably wouldn’t be able to answer with complete confidence unless I told you which compiler you are working with and whether packing has been turned on or not. The easiest way to find out is to let a small snippet of code do it for us, e.g.
struct example s1;
unsigned int offset;
offset = (unsigned int)&s1.b - (unsigned int)&s1;
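The above snippet will work, but not always: casting a pointer to unsigned int can truncate the address on platforms where pointers are wider than int. Many people use a much simplified form, which does not even need a structure instance: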
unsigned int offset;
offset = (unsigned int)(&(((example *)(0))->b));
The above code is much simpler and faster, but again, it might not be portable. So, what is the best way to do this portably? It’s quite simple really: just use the “offsetof” macro provided by every ANSI C compliant implementation. It is defined in stddef.h and can be used in the following way:
size_t offset;
offset = offsetof(example, b);
As you may have noticed, offsetof() shares another advantage with the second method: it does not require an extra structure instance to be defined. In fact, this macro is often defined in a form similar to our method 2, but the benefit is that the implementation guarantees it works, which keeps your code portable.
© Safer Code | Find The Offset Of An Element In A Structure In C- offsetof()
Tags: C, element offset, element offset in structure, offsetof, Portability, structure in c, structure offset
Posted in C/C++, General | 9 Comments »
Tuesday, March 10th, 2009
Writing portable code is very important, but it is one of the aspects that most people neglect until it is too late. Until a few years ago, most people writing code for personal computers were not worried about the data sizes on their machines. They didn’t even think about whether the machines on which their code would run would be 32 bit or 64 bit. But the recent arrival of 64 bit machines in normal everyday usage has them running helter-skelter to get their programs into shape. Many of these programs need to run the same code base on 32 bit as well as 64 bit machines. There are many ways to do this. Some approaches use data types that work as expected on both kinds of machine, while others explicitly check the architecture of the machine and act accordingly. Before I give you the code to run this check, let’s see a bit of the theory behind it.
First, let’s be clear that the discussion that follows will not always give you the “hardware architecture” of a machine. Rather, it will tell you the “Programming Model” (or Data Model) that the OS or your compiler enforces on you. What I mean is that if you run a 32 bit OS on top of your latest 64 bit processor based system, it will still mean a 32 bit programming model for you. In fact, if you were to compile your programs using the ancient Turbo C compiler, you’d be in for an even bigger surprise. That said, the programming model is ultimately what you need to know to make sure that your program compiles and runs correctly on that particular system. The most common programming models in use are as below:
| Datatype (size in bits) | LP64 | ILP64 | SILP64 | LLP64 | ILP32 | LP32 |
|-------------------------|------|-------|--------|-------|-------|------|
| char                    | 8    | 8     | 8      | 8     | 8     | 8    |
| short                   | 16   | 16    | 64     | 16    | 16    | 16   |
| int                     | 32   | 64    | 64     | 32    | 32    | 16   |
| long                    | 64   | 64    | 64     | 32    | 32    | 32   |
| long long               | 64   | 64    | 64     | 64    | 64    | 64   |
| pointer                 | 64   | 64    | 64     | 64    | 32    | 32   |
(more…)
© Safer Code | Portable Code: How To Check If A Machine Is 32 Bit Or 64 Bit
Tags: 32–bit, 64–bit, Architectue, C, check machine 64 bit, CPP, Data Model, data models, data sizes, data types, function pointers, Hardware Architecture, ILP32, ILP64, LLP64, LP32, LP64, pointers, Portable code, Programming Model, SILP64, writing portable code
Posted in C/C++, Portability | 11 Comments »
Tuesday, February 24th, 2009
Last time we explained the real meaning of the const keyword; this time it’s going to be volatile, the other sibling of this most misunderstood duo in C history. Let’s separate the myths from the facts first, and then we will discuss the hows and whys.
FACTS:
- A volatile qualifier should be used for automatic-storage variables that are modified between a setjmp call and the corresponding longjmp.
- A volatile qualifier must be used when reading the contents of a memory location whose value can change unknown to the current program.
- A volatile qualifier must be used for shared data modified in signal handlers or interrupt service routines.
MYTHS:
- All shared data in multi-threaded programs must be declared volatile.
Now, we’ll see how we arrived at the above classification.
(more…)
© Safer Code | Volatile: C Keyword Myths Dispelled
Tags: -O2, atomicity, C, const, CPP, gcc, keyword, memory barriers, memory fences, non-atomic access, Optimization, order of access, volatile, volatile keyword
Posted in C/C++, General, Optimization | 14 Comments »
Wednesday, February 4th, 2009
There are a few very basic things in C that are widely misunderstood by programmers of all levels. This is not because these things are very complex, but because books and teachers do not give them their due importance while teaching beginners, and the misconceptions stick even years later. Two of these are the keywords “volatile” (covered in the next post) and “const”.
Look at the following piece of code:
#include <stdio.h>
int main()
{
    const int a = 1;
    a = 2;
    printf("%d\n", a);
    return 0;
}
You’d be quick to point out (correctly) that the above code wouldn’t compile. It would fail with an error along the lines of “assignment of read-only variable a”, because a is now a “constant” thanks to the const type qualifier attached to it. And here comes the issue. Many start assuming (or might even be explicitly taught) that the value of a cannot be changed now. Did you think so too? Well, you thought wrong. Look at the following code:
(more…)
© Safer Code | "const" Keyword Explained
Tags: C, compiler, const, keyword, Optimization, read only, volatile
Posted in C/C++ | 18 Comments »
Tuesday, January 27th, 2009
One of the ways to optimize your code for speed is to unroll the loops you have in your program. If you don’t know what loop unrolling is, then see the following simple example, where we are trying to copy a 5 element long string:
A Simple Loop:
for (i = 0; i < 5; ++i)
{
    *j++ = *k++;
}
The same loop unrolled:
*j++ = *k++;
*j++ = *k++;
*j++ = *k++;
*j++ = *k++;
*j++ = *k++;
Now, the unrolled version is generally going to be faster, for the following reasons:
1. There are no branches any more. In a loop, the compiler has to insert a branch to jump back to the top of the loop.
2. There are no condition checks. In a loop, the value of i is checked on every iteration to decide what to do next.
3. A third, rather hidden but really important, advantage is that unrolled code can be pipelined more efficiently by the processor, resulting in faster execution.
The above code in itself might not show you any perceivable difference in execution time, but it does become crucial when the loops are of much higher magnitude and especially in embedded/real time scenarios.
Loop unrolling can be done by the compiler or manually by the coder. In this part, we will see how to help the compiler carry out loop unrolling efficiently. (more...)
© Safer Code | Tweak Your Code For Speed: Unroll Those Loops Part 1
Tags: C, iterations, Loop Unrolling, Loops, Optimization, Optimization techniques, Pipelining
Posted in C/C++, Optimization | 5 Comments »