scope of a variable declared in a for loop


dude_1967
July 3rd, 2002, 06:15 AM
Hello all code gurus,

Here is another issue which has bothered me for years.
One poor soul dared to ask a while ago and the thread turned ugly --- with authors radically defending one development environment over the other and such things.

So please here just try to stick to the C++ language specification.

What is the scope of a loop variable?
I was of the opinion that the scope of i (see code snippet below) must be limited to lines within the loop.

Some compilers exhibit different behavior regarding this syntax.
Can anyone verify the proper language specification here?

#include <iostream>
using namespace std;

int main(int argc, char* argv[])
{

for(register int i = 0; i < 10; i++)
{
cout << i << endl;
}

#ifdef __GNUC__

#else

// This line should not compile in my opinion.
cout << i << endl;

#endif

return 0;

}

Thanks. Don't get too rowdy...
Chris.

:confused:

cup
July 3rd, 2002, 06:38 AM
It really depends on which version of the C++ standard the compiler is based on. If it is the later standard, the scope is only within the loop. If it is the earlier standard, the scope is within the enclosing block.

The arguments were probably based on the different compilers which were based on different standards; hence the confusion.

dude_1967
July 3rd, 2002, 06:47 AM
Thanks for clearing it up, cup.

Nowadays I just use the following:

register int i;

for(i = 0; i < 10; i++)
{
cout << i << endl;
}

All compilers can handle this one. Nevertheless, I wish newer compilers were consistent with the newer standard.

Chris.

;)

Zeeshan
July 3rd, 2002, 06:50 AM
According to the C++ Standard, section 6.5.3, paragraph 3:


If the for-init-statement is a declaration, the scope of the name(s) declared extends to the end of the for-statement


So this is legal C++ code, although VC 6 reports an error for it:


for (int i = 0; i < 10; i++)
{
    // do something
}

for (int i = 0; i < 10; i++)
{
    // do something
}



In fact you can make your code portable this way


int i;

for (i = 0; i < 10; i++)
{
    // do something
}

for (i = 0; i < 10; i++)
{
    // do something
}



Hope it helps.

Chambers
July 3rd, 2002, 02:22 PM
I think the VC++ compiler only limits variables to a scope when they are declared within the body of a for, while, do or switch statement. So while:

for (int i=0; i<10; i++)
{
//Do something.
}

for (int i=0; i<10; i++)
{
//Do something.
}

is INVALID because both integers called i are declared OUTSIDE the statement's body { }, which means they are not limited to the scope of the loop (which makes sense when you think about it). Remember that the first param of a for loop is initialisation, and it is done BEFORE the program enters the loop (makes even more sense IMO). The other params are used IN THE STATEMENT'S BODY i.e. every time it loops. The statement above is trying to do this basically (in effect):

int i;
for (i=0; i<10; i++)
    ; //blah

int i;   // second declaration of i in the same scope
for (i=0; i<10; i++)
    ; //blah

which is obviously going to cause a compiler error.

Alan.

NigelQ
July 3rd, 2002, 02:47 PM
...Good question.

See MSDN article Q167748 which describes this very problem.

In fact it only became a problem when Microsoft introduced some language extensions back in version 5.0 that extended the scope of the variable beyond the for-loop.

You can disable this behaviour (and all language extensions) to limit the variable to the loop using the /Za compiler switch.

This (now default) behaviour does not comply with the ANSI/ISO C++ standard (there are many other things that don't comply too).

Hope this helps,

- Nigel

NigelQ
July 3rd, 2002, 02:49 PM
...I should also point out that using the /Za compiler option is not without its own problems.

See the Q167748 article for a specific workaround for the 'problem' you describe.

- Nigel
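
For anyone stuck on VC++ 6 who cannot use /Za, the workaround usually quoted for this problem is a macro that turns every for statement into the else branch of an if statement, so the loop variable goes out of scope when that if statement ends. The following is only a sketch under that assumption: the helper name sum_twice is made up for illustration, and redefining a keyword is a hack that should be kept out of any code that does not need it.

```cpp
// Legacy workaround for compilers that leak the for-init variable into the
// enclosing block. Each 'for' becomes the else branch of an if statement,
// so names declared in the for-init-statement end with that if statement.
#define for if (0) {} else for

// Hypothetical helper: both loops declare their own 'i'.
int sum_twice(int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += i;
    for (int i = 0; i < n; i++)   // intended to avoid a redefinition error
        total += i;
    return total;
}

#undef for   // keep the macro from affecting later code
```

On a conforming compiler the macro changes nothing, which is why it could be left in place while code was shared between VC++ 6 and other compilers.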

Paul McKenzie
July 3rd, 2002, 02:53 PM
Regardless of what VC++ does, this code is legal C++:

int main()
{
for (int i=0; i<10; i++) { }
for (int i=0; i<10; i++) { }
}

The scoping rules were clearly defined by Zeeshan's post. Also, go to http://www.comeaucomputing.com/tryitout and enter code in their on-line compiler, which adheres to the ANSI C++ specification. You will see that the VC++ way of doing this causes an error.

int main()
{
for (int i = 0; i < 10; i++ ) {}
//...
for (i = 0; i < 10; i++) { }
}

Regards,

Paul McKenzie

dude_1967
July 3rd, 2002, 03:40 PM
Thanks to everyone.

It is pleasing that so many experienced developers shared their knowledge.

When I was much younger, in the early 90's, I first learned C++. There was no real specification then. I learned that the scope of a loop variable was, in fact, confined to the lines within the loop. Furthermore, it is allowed to declare a variable of the same name in back-to-back loops, as many developers pointed out in the previous text.

I tend to agree that the answers of Paul McKenzie and Zeeshan are in conformance with the new language specification. However if portability is an issue, one might have to remain conservative and avoid relying on the tight scope.

The different behavior of various compilers, which are both of such incredibly high quality, is simply something that has to be watched out for (I'm, of course, talking about VC and GNU). Unfortunately, I still code more bugs than the compilers artificially create in the translation.

Thanks again for all the information exchange.

Chris.

:cool:

Anthony Mai
July 3rd, 2002, 03:40 PM
Regardless of what C++ standard says or whether the code can compile by a small potato compiler or not, it is NEVER a good idea to jam a variable declaration into a for statement.

Plus you really have nothing to gain. Whether the loop variable is declared together with other local variables at the beginning of the function, somewhere before the for statement, or within the for statement, the code is implemented exactly the same. The following two code snippets will compile into EXACTLY the SAME executable code:
int foo()
{
int result = 0;
//other variables
int i;

//...
for (i=0; i<16; i++)
{
//...
}
return result;
}

int foo()
{
int result = 0;
//other variables
//int i;

//...
for (int i=0; i<16; i++)
{
//...
}
return result;
}

Also, notice that the spot before the first ; within a for statement is reserved for a variable initialization list, so you can do something like this:
for (i=0,j=1,k=2; i<16; i++,j++,k++)
{
//...
}

If you insert a variable declaration there, you have deprived yourself of the ability to put an initialization list there for variables that are already declared.

I challenge any one to write a for statement in which you declare two variables of different types. Can any one do it? No one can.

I also challenge any one to write a for statement in which you declare variables, AND also initialize variables already declared previously. Can any one do this? No one can.
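
Both challenges can in fact be met in standard C++. The sketch below shows one way for each; every function name in it is made up for illustration. A single declaration can introduce an int and a pointer to int, because the * belongs to the declarator rather than to the type. An unnamed struct can carry unrelated types. A parenthesised comma expression in the initializer can assign to a variable that was declared earlier.

```cpp
// Two different types from one declaration: 'i' is an int, 'p' is an int*.
int sum_array(int* values, int n)
{
    int total = 0;
    for (int i = 0, *p = values; i < n; i++, p++)
        total += *p;
    return total;
}

// Unrelated types: an unnamed struct declared in the for-init-statement.
double weighted_sum(int n)
{
    double total = 0.0;
    for (struct { int i; double w; } s = { 0, 1.0 }; s.i < n; s.i++, s.w *= 0.5)
        total += s.i * s.w;
    return total;
}

// Declare 'i' and, in the same for-init-statement, assign to 'j',
// which was declared before the loop.
int count_steps(int n)
{
    int j;
    int steps = 0;
    for (int i = (j = n, 0); i < j; i++, j--)
        steps++;
    return steps;
}
```

Whether any of these is good style is a separate question; the point is only that the language permits them.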

Paul McKenzie
July 3rd, 2002, 04:25 PM
Originally posted by Anthony Mai
Regardless of what C++ standard says...

Of course the standard means nothing to you.

... or whether the code can compile by a small potato compiler or not...

That comment is so ridiculous, it deserves no response except one: What do you have against other compilers which do things correctly? Ever get a copy of "Windows Developers Journal" and hear of "VC++ Bug of the Month"?

The code can't be compiled on any UNIX compiler that I've come across. It doesn't matter whether the compiler is "small potatoes" or "big potatoes" or "medium sized onions". If it can't compile it, it is not ANSI standard. Plain and simple.

Also, the original post was whether the code is valid or not. That was answered (most adequately by Zeeshan). It was not a question about programming style or how many compiler sales Microsoft has made.

Regards,

Paul McKenzie

Paul McKenzie
July 3rd, 2002, 04:34 PM
Originally posted by dude_1967
Thanks to everyone.

It is pleasing that so many experienced developers shared their knowledge.

When I was much younger, in the early 90's, I first learned C++. There was no real specification then. I learned that the scope of a loop variable was, in fact, confined to the lines within the loop. Furthermore, it is allowed to declare a variable of the same name in back-to-back loops, as many developers pointed out in the previous text.

I tend to agree that the answers of Paul McKenzie and Zeeshan are in conformance with the new language specification. However if portability is an issue, one might have to remain conservative and avoid relying on the tight scope.

The different behavior of various compilers, which are both of such incredibly high quality, is simply something that has to be watched out for (I'm, of course, talking about VC and GNU). Unfortunately, I still code more bugs than the compilers artificially create in the translation.

Thanks again for all the information exchange.

Chris.

:cool:

You're right about the portability issue. Right now, I have to compile code with VC++, and under different versions of UNIX. I can't tell you how many times I've had to edit code to "factor out" the loop initialization so as to compile the code under UNIX.

So far, I've yet to come across a UNIX C++ compiler that can compile the VC++ (i.e. wrong) version of the for() loop.

Regards,

Paul McKenzie

zdf
July 3rd, 2002, 04:36 PM
The old book I have (TC++PL, 2nd edition) says like this:


iteration-statement:
...
for ( for-init-statement; condition(opt); expression(opt) ) statement
condition:
expression
type-specifier declarator = expression



That means one can write:

int condition( int ai )
{
// do something creative
return ai;
}
for ( int i = 0; int j = condition(i); i++ )
{
// use i and j
}

VC++ does not compile the code above (missing int before; ). It’ll be interesting to find out what other compilers say about this piece of code.
Is this of any practical use? Maybe…
One can say it looks ugly... Maybe...
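
A declaration in the condition is standard C++: the variable is initialised afresh before each iteration, and the loop stops as soon as its value converts to false. Note that the example above calls condition(0) first, which returns 0, so its body would never run. The sketch below uses a made-up helper, remaining(), that stays non-zero until the limit is reached.

```cpp
// Hypothetical helper standing in for condition(): how many steps are left.
int remaining(int i, int limit)
{
    return i < limit ? limit - i : 0;   // 0 converts to false and ends the loop
}

// Sums the value of 'j' seen on each iteration.
int sum_remaining(int limit)
{
    int total = 0;
    // 'i' is declared in the for-init-statement, 'j' in the condition;
    // both are visible in the loop body and nowhere after the loop.
    for (int i = 0; int j = remaining(i, limit); i++)
        total += j;
    return total;
}
```

With a limit of 3, the body sees j as 3, 2 and 1 before the condition yields 0.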

Anthony Mai
July 3rd, 2002, 05:53 PM
Paul:

When you follow up, make sure you read my message first.

Which part of the code I cited can not be compiled on UNIX?

I am against putting variable declarations inside for statements. If you do not declare variable within the for statement, every thing will work just fine and every compiler can compile it.

It is when you declare a variable in the for statement that ambiguity about what scope that variable belongs to arises, and you have to modify the code to cope with different compilers. It is a trouble that you invite in the first place by declaring something like for(int i=0;;).

BTW, I do work on both UNIX and Win32 at the same time, on a daily basis.

And finally it is never without a reason that big potato compilers sell much better than small potato ones. It's something developers voted on with their money.

Zeeshan
July 3rd, 2002, 11:47 PM
it is NEVER a good idea to jam a variable declaration into a for statement


I just want to say one thing in response to this. Take a look at Item 32 of Effective C++ 2nd edition by Scott Meyers.


Item 32: Postpone variable definitions as long as possible.


Hope it helps.

cup
July 4th, 2002, 05:01 AM
I didn't quite understand
it is NEVER a good idea to jam a variable declaration into a for statement
There are two meanings one can derive from this statement

for (int i = 0; i < 1000; i++)
{
int a = b[i];
...
}

Is "i" or "a" the jammed variable declaration? There is no reason why "i" is a bad idea since it is only created once and only exists for the duration of the loop. "a" is a bad idea because it calls the int constructor every time. Since the topic of conversation is the loop control variable and declarations with the loop body, why bring this up?

Anyway, in my younger days, the term jamming meant something else: it was a form of optimization i.e.

// unoptimized loop
for (i = 0; i < 1000; i++)
a[i] = 0;
for (i = 0; i < 1000; i++)
b[i] = 1;

// Jammed loop
for (i = 0; i < 1000; i++)
{
a[i] = 0;
b[i] = 1;
}

Chambers
July 4th, 2002, 06:52 AM
I think that the point of this matter is the code

for(int i=0; i<10; i++)
//blah

will only compile on VC++ because it doesn't conform to the ANSI standard. What people have to realise is that this standard was developed years ago and it was a godsend. I am just surprised that Microsoft haven't tried to alter the standard to accommodate this technique (or have they??). Although I think the VC++ behaviour does make sense, it is certainly not self explanatory. The ANSI standard ensures that code is self explanatory and portable. The first param (as I said earlier) is meant for initialisation under ANSI standards, not variable declaration AND initialisation, which are two different things. Since this technique is not standard practice, I think it is better to declare the variable right before the for loop and not within the statement, because it makes life easier for everybody - remember that is what the ANSI standard was designed to do. As ... pointed out earlier, the code compiles into the same executable independent of whether the variable was declared in the for statement or outside it; it's just probably more readable when declared outside the loop, because then nobody would be bothered about the scope of the variable in the first place.

Alan.

Graham
July 4th, 2002, 07:46 AM
The ANSI standard is saying that declaring a variable in the control statement is functionally equivalent to declaring it inside the loop body. I find that quite reasonable. MS didn't implement that because it would have broken a lot of code that came before the standard and that was developed using older versions of VC++.

Given that we live in a real world and, in that world, MS is a major player, I think that we need to subordinate our desire to conform to the standard to the reality of what is. To that end, my personal approach is:

1) Try to use one of the standard algorithms, if that's feasible and appropriate (e.g. for_each, transform, accumulate). These tend to increase self-documentation (unless you're shoe-horning them into an inappropriate use).

2) Don't use the same variable (name) to control multiple loops in the same scope unless the loops are trivial iterations (but see 1 above) using a fundamental type (like int) in which case, declare it before the first loop.

3) Where there is only one loop, or where the loops are fundamentally different (and 1 above doesn't apply), declare a variable dedicated to that loop in the loop control statement.
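
Point 1 can be sketched with transform and accumulate, which remove the hand-written loop variable altogether, so the scope question never comes up. The helper names twice and doubled_sum are made up for illustration.

```cpp
#include <algorithm>
#include <numeric>
#include <vector>

// Hypothetical element operation passed to std::transform.
int twice(int x)
{
    return 2 * x;
}

// Doubles every element, then returns the sum of the doubled values.
int doubled_sum(const std::vector<int>& input)
{
    std::vector<int> doubled(input.size());
    std::transform(input.begin(), input.end(), doubled.begin(), twice);
    return std::accumulate(doubled.begin(), doubled.end(), 0);
}
```

The same code should compile unchanged on compilers with either for-scope rule, since no name is declared in a for statement.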

Paul McKenzie
July 5th, 2002, 04:35 AM
Originally posted by Anthony Mai
Paul:

When you follow up, make sure you read my message first.

Which part of the code I cited can not be compiled on UNIX?

The question is very simple: "What are the current rules of the ANSI standard?" And yes, the code that is cited by the OP (which is what I'm talking about) does not compile using a UNIX compiler.

"It is when you declare a variable in the for statement that ambiguity about what scope that variable belongs to arises, and you have to modify the code to cope with different compilers. It is a trouble that you invite in the first place by declaring something like for(int i=0;;)."

What's troubling about it? That's the way C++ programmers naturally write loops. The troubling part is when a certain compiler doesn't follow the rules. You're a 'C' programmer, where you're forced to declare the loop variable outside the for statement, so you are used to writing things that way. Do you also declare your variables at the top of the function block?

"And finally it is never without a reason that big potato compilers sell much better than small potato ones. It's something developers voted on with their money."

And that is not what counts. There are rules to the language. The ANSI committee doesn't meet and decide on compiler issues because a lot of people buy VC++. This is a case where VC++ cannot boss ANSI around -- if anything, VC++ is scrambling to get their compiler *more* ANSI compliant. So your comment about "small potatoes" and "big potatoes" is totally irrelevant to the discussion of what the rules of ANSI C++ are. The question asked by the OP was an ANSI C++ question, the answer was given.

Regards,

Paul McKenzie

Zeeshan
July 5th, 2002, 05:33 AM
Well said, Paul, you get full marks. Now I want to add something about big vendors like Microsoft.

Let's take a closer look at VC 6 and GNU. I ran these programs on VC 6 (NT 4, SP 6a) and GNU 2.96 on Red Hat Linux 7.2.

Here is a valid C++ program, but VC gives a warning for it while gcc compiles it:


int main()
{
}


According to C++ Standard 3.6.1.5


If control reaches the end of main without encountering a return statement, the effect is that of executing

return 0;


This is valid code according to C++ Standard 3.4.2.2


namespace NS {
class T { };
void f(T) { }
};

int main()
{
NS::T param;
f(param);
}


But VC refuses to compile it.

Here is one more example: a valid program according to the C++ standard that compiles correctly, but that VC 6 refuses to compile.


#include <vector>
#include <list>
#include <algorithm>
#include <iterator>   // ostream_iterator
#include <iostream>
using namespace std;

int main()
{
vector<int> vec;
list<int> lst;

vec.push_back(1);
lst.push_back(2);
lst.push_back(3);

vec.insert(vec.begin(), lst.begin(), lst.end());
copy(vec.begin(), vec.end(), ostream_iterator<int>(cout, "\n"));

return 0;
}


This is because the insert member function of vector should be a template function according to section 23.2.4.3 of the C++ Standard, but it is not a template in VC 6.

The same is true of insert for list (section 23.2.2.3), the assign function of deque (section 23.2.1.1.6) and insert for deque (section 23.2.1.3).

Also, VC 6 doesn't support exception specifications or partial template specialization.

Now what about the big names?

Paul McKenzie
July 5th, 2002, 08:42 AM
Hello Zeeshan,

How about the opposite of your first example. This shouldn't compile whatsoever:

void main()
{
}

VC++ 6.0 compiles this, without error. The problem is that this code is invalid. Throw this code at another compiler, and you get the error that "main() must be defined as returning int", or something similar to that.

The C++ examples in MSDN are littered with this code. Also, a rule of thumb when buying C++ books is to look for the above. If the author uses it, get another book!

Of course, the fix is to do this:

int main()
{
}

You would think that VC++ would fix this, but so far, they haven't. (I haven't tried NET, but I bet it hasn't been fixed).

Regards,

Paul McKenzie

Zeeshan
July 5th, 2002, 09:04 AM
Yes Paul, you are right. I checked this code on Win 2000 Server SP 2 with VC.Net, and it compiles there too.


void main()
{
}


Also, VC.Net hasn't fixed Koenig lookup yet; this is still an error in VC.Net although it is legal code.


namespace NS {
class T { };
void f(T) { }
};

int main()
{
NS::T param;
f(param);
}


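In the same way, this code is not legal in Standard C++, because the call to f is ambiguous between ::f and NS::f (the latter found through Koenig lookup):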


namespace NS {
class T { };
void f(T) { }
};

void f(NS::T) { }

int main()
{
NS::T param;
f(param);
}


But VC.Net compiles it successfully. For further information take a look at Item 32 of Exceptional C++ by Herb Sutter. VC.Net has at least solved the insert and assign problem of the STL, though. Now this code compiles successfully with VC.Net:


#include <vector>
#include <list>
#include <algorithm>
#include <iterator>   // ostream_iterator
#include <iostream>
using namespace std;

int main()
{
vector<int> vec;
list<int> lst;

vec.push_back(1);
lst.push_back(2);
lst.push_back(3);

vec.insert(vec.begin(), lst.begin(), lst.end());
copy(vec.begin(), vec.end(), ostream_iterator<int>(cout, "\n"));

return 0;
}


and this one also compiles successfully on VC.Net


int main()
{
}


I haven't checked VC.Net carefully against the ANSI standard, so there may be more, but so far it doesn't support partial template specialization or exception specifications.

Chambers
July 5th, 2002, 02:55 PM
Yeah but this IS microsoft we're talking about, whenever have they given a rats a**s about good programming techniques and robust coding? I just hope that, having identified these obvious flaws, people don't use them (because they are aware it is not good practice). I certainly have never declared a program statement

void main()
{
}

or

int main()
{
}

(even though the latter complies with the ANSI C++ standard). I feel lonely without that return 0; statement at the end :)

Alan.

stober
July 5th, 2002, 06:08 PM
Just as a side note -- some new programmers I've encountered have the impression that variables are actually allocated on the stack when they are encountered in the source code. In the VC6 compiler, this is not true -- all variables are allocated on the stack when the function is entered.

For example: if you have a function like this:

int foo()
{
for(int i = 0; i < 5; i++)
{
char buf[16];
// some code that does something
}

for(int x = 0; x < 5; x++)
{
char buf[16];
// some code that does something
}
return 0;
}

In the above, buf actually uses 32 bytes of stack space because it is declared twice, so the compiler must allocate space for both of them. The compiler allocates all stack space at the very start of the function, in its prologue. (Look at the assembly list file to see how this is done.)

If you are concerned about stack space, you should rewrite the function like this:

int foo()
{
char buf[16];
for(int i = 0; i < 5; i++)
{
// some code that does something
}

for(int x = 0; x < 5; x++)
{
// some code that does something
}
return 0;
}

Part of the assembly output looks like this (my comments):

// Declare the function name
foo PROC NEAR ;
// save the current stack pointer
00000 55 push ebp
// Copy the stack pointer to a baseline pointer.
00001 8b ec mov ebp, esp
// Allocate space for all auto data objects declared within the function.
// This also includes space for the compiler's own temporary stack space.
00003 83 ec 54 sub esp, 84 ; 00000054H

Paul McKenzie
July 5th, 2002, 10:34 PM
Originally posted by stober
Just as a side note -- some new programmers I've encountered have the impression that variables are actually allocated on the stack when they are encountered in the source code.

Your "new programmers" should have stated that the ANSI C++ standard does not state where those variables should be stored. That would have been the proper answer. As long as the program performs the proper task (to ANSI specs), that is all that's required.

For example, everyone (or almost everyone) talks about the stack storing parameters when discussing C++ in general. The problem with this is that there are processors (and even compilers) that do not use a stack to store parameters. Computers based on RISC and Motorola processors have a boatload of registers, unlike the Intel processors. More than likely, these registers are used to store the parameters for these processors. On some other compilers, I believe there is a block of memory dedicated for arguments. If memory serves me correctly, the Watcom compiler uses registers if the parameter list is small. Same with the Borland compiler (a "fastcall" is what I believe Borland calls it).

Your code to "conserve stack space" may be the case for one version for a certain compiler, and maybe for a certain set of optimization switches for that compiler. If you used a different optimization, maybe the compiler would generate different code. Anyway, assembly listings become moot if you are using a different compiler or working in another OS.

Regards,

Paul McKenzie

stober
July 6th, 2002, 07:00 AM
Yes, you are correct about the parameters to the function. But the code and example I presented was about the auto data objects that are declared within the function. Also, my comments do not account for compiler optimizations when compiled for release mode. MSDN says VC6 may do "stack packing" for objects that are not within scope. By that, I assume that it will recognize the inefficiency in my example and allocate only 16 bytes for the buffer, not 32.

Anthony Mai
July 6th, 2002, 11:17 AM
Some clarifications and some disputes.

When I say "it is NEVER a good idea to jam a variable declaration into a for statement", I mean the for statement that starts with a "for" and ends with a closing parenthesis. I call the code block that starts and ends with {} following the for statement the "for loop". Is that clear?

I am against putting a variable declaration in the for statement but I am all for putting a variable statement INSIDE a for loop.

Quote from Graham: "The ANSI standard is saying that declaring a variable in the control statement is functionally equivalent to declaring it inside the loop body. I find that quite reasonable."

If that's the truth, I find it quite ridiculous!

If a variable declared IN a for statement has a defined scope, it is NOT the same scope as a variable declared INSIDE the for loop. It has got to be a scope bigger than the for loop scope and smaller than the scope of the function that it occurs in.

For one thing, variables declared WITHIN the for loop can NOT be used in the for statement.

For another, variables declared WITHIN the for loop may come into scope and out of scope (constructed and destructed) multiple times, and may also never even come into existence even once, depending on the condition of the for statement, whereas a variable declared in a for statement comes into scope exactly once, regardless of the for condition.

So a variable declared in a for statement is never the same thing as a variable declared within the for loop. They are not even close.
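
The difference in construction counts described above can be observed with a small counting type. This is only a sketch; Counter and both function names are made up for illustration.

```cpp
// Hypothetical type that bumps an external counter each time it is constructed.
struct Counter
{
    explicit Counter(int& count) { ++count; }
};

// Object declared in the for statement: constructed once, before the
// condition is tested for the first time.
int constructions_in_for_statement(int n)
{
    int made = 0;
    int i = 0;
    for (Counter c(made); i < n; i++)
    {
    }
    return made;
}

// Object declared in the loop body: constructed once per iteration,
// and never if the condition is false from the start.
int constructions_in_loop_body(int n)
{
    int made = 0;
    for (int i = 0; i < n; i++)
    {
        Counter c(made);
    }
    return made;
}
```

For n equal to 5 the first function reports one construction and the second reports five; for n equal to 0 they report one and zero.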

Commenting on "Item 32: Postpone variable definitions as long as possible."

That is a good rule only if you don't follow it to the extreme. I still prefer to declare any local variable at the top of the function body, rather than in the middle. There are two problems.

1. If you declare something in the middle of the function and later modify the code, you may suddenly find the need to move the declaration a few lines up. This need may occur again and again until you finally give up and just put it at the beginning of the function.

2. When looking up where and how a local variable is declared, you have to look through the whole function body to search for it, whereas I can jump to the beginning of the function body and find it immediately.

If you really need to limit the scope of a variable, I would rather put in an extra pair of {}, and declare the variable within the {}, at the top. Declaring a variable somewhere in the middle of a code block and letting its existence linger there until the end of the function call is really not the right thing to do.

In any case, if you feel that putting a variable declaration at the beginning of the function would make it too far away from where it is first used, and hence hurt readability, then the problem is that you have a function that is too long, and it is time to split it into several sub functions to reduce the size of each one.

Graham
July 6th, 2002, 11:55 AM
Gawd, you do take the 'ump easily, don't you? I said "functionally equivalent". I was putting into English a sense of the formal statement used in the standard. Essentially, what I meant was that a variable declared in the control statement will (should!) die at the end of the loop processing. Yes, I know that there is a difference in that variables declared within the controlled body die and regenerate with each iteration, but the point is that the control-statement-declared variable should not have a lifetime to the end of the function. This might be important if, for example, you wanted to lock a resource for the life of a loop. Using VC++, you would have to put a superfluous set of braces around the loop, as you would if you declared the variable ahead of the control statement.

for (lock l, ... /* etc */)
{
// whatever
}

would have to be:

{
for (lock l, /* etc */)
{
// whatever
}
}

// or

{
lock l;
for (/* etc */)
{
// whatever
}
}

Now, anyone may argue that that's actually more explicit and therefore better. OK, that's a point of view. What is a point of fact, however, is that VC++'s interpretation of a piece of standard C++ is wrong.
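
The lifetime point can be made concrete with a guard object that records when it is created and destroyed. This is a sketch only: ScopedLock is a made-up stand-in for the lock type above, and it writes to a string instead of locking anything.

```cpp
#include <string>

// Hypothetical stand-in for 'lock': logs acquisition and release.
class ScopedLock
{
public:
    explicit ScopedLock(std::string& log) : log_(log) { log_ += "[lock]"; }
    ~ScopedLock() { log_ += "[unlock]"; }
private:
    std::string& log_;
    ScopedLock(const ScopedLock&);              // non-copyable
    ScopedLock& operator=(const ScopedLock&);   // non-assignable
};

// Returns the order of events for a loop guarded from its for statement.
std::string run_guarded_loop(int n)
{
    std::string log;
    int i = 0;
    for (ScopedLock guard(log); i < n; i++)
        log += "work;";
    log += "after-loop;";   // under the standard rule the guard is already gone
    return log;
}
```

Under the standard scoping rule the release is logged before "after-loop;". A compiler that extends the variable's scope to the enclosing block would release it only when the function ends, which is why the extra braces are needed there.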

Chambers
July 6th, 2002, 03:09 PM
Anthony, I don't suppose you learnt to program in Pascal by any chance? Your programming ideals seem to be in-line with a Pascal nature of programming (not that that is a bad thing). However, I do hasten to add that speed and program efficiency is of utmost importance in many resource sucking applications. People tend to use larger functions to avoid that heavy call overhead (as I'm sure you know). In such functions I personally don't see a problem with declaring the variable further into the function, to aid reading. It's just my personal preference to declare a variable where I use it. I think the main thing is that people should remain consistent in their coding, and so if you declare your variables at the beginning of a function block, then good on you, so long as you always do it.

Alan.

Zeeshan
July 7th, 2002, 12:03 PM
Sorry to discuss the topic of the C++ Standard and VC again, but according to section 10.3.4 of the C++ Standard this is legal C++ code (i.e. a different return type in an overriding function), yet VC 6 refuses to compile it:


#include <iostream>
using namespace std;

class Base {
};

class Drive : public Base {
};

class base {
public:
virtual Base* fun() { cout << "base" << endl; return 0; }
};

class drive : public base {
public:
Drive* fun() { cout << "drive" << endl; return 0; }
};

int main()
{
base* pbase = new drive;
pbase->fun();

delete pbase;
return 0;
}


But at least VC.Net compiles it. As far as the standard is concerned, VC.Net is much better than VC 6.

stober
July 7th, 2002, 02:15 PM
remove the virtual keyword and it will compile ok with VC6.

Zeeshan
July 7th, 2002, 11:28 PM
But in that case (i.e. removing virtual from the function) we can't get polymorphic behaviour.

Graham
July 8th, 2002, 03:35 AM
It's just another one of the (very) many ways that VC++ 6 departs from the standard. Either you make sure you know the differences and live with them or get a new compiler. Some of us don't have the choice, so we have to live with it. It's annoying, but it's life.

Anthony Mai
July 8th, 2002, 08:13 AM
Chambers:

Please do NOT cite speed and programming efficiency as an excuse for writing long functions. For one thing, I spend a good part of my programming life in boosting code performance. For two, some people on this board could not care less about code performance. More than once when I showed examples where a plain C solution beats the standard C++ template solution by more than 10 times in speed, they said who cares, ANSI C++ standard compilation is all that counts.

And for three, it becomes irrelevant to performance when you are talking about functions so long that when a variable is declared at the top of the function instead of near where it is used, it hurts readability. We are not talking about a three line function, where calling overhead makes a difference, we are talking about 500 lines of code versus 50 lines of code. The function call overhead no longer makes a difference.

When your function becomes so large it is hard to look for variables at the beginning of the function, variable declarations will NOT be the only thing that's unreadable. The whole function becomes unreadable, and you really need to cut the function down into smaller pieces.

It is my experience that when improving code speed performance, usually only 1% of the code, and possibly less, is relevant. Trying to optimize the other 99% could only do a dis-service.

I do view code efficiency, robustness, and performance as the ultimate goal every programmer should strive for. If my product runs 10 times faster than a competitor's product, is 50% smaller in code size, and never throws any C++ exception, I could not care less that mine is written in assembly and plain C, mixed with some VB, while my competitor can boast full ANSI C++ compliance.

Paul McKenzie
July 8th, 2002, 08:51 AM
Originally posted by Anthony Mai
Chambers:

Please do NOT cite speed and programming efficiency as an excuse for writing long functions. For one thing, I spend a good part of my programming life in boosting code performance. For two, some people on this board could not care less about code performance. More than once when I showed examples where a plain C solution beats the standard C++ template solution by more than 10 times in speed, they said who cares, ANSI C++ standard compilation is all that counts.

I will only comment on this, since large functions are usually to be avoided in all languages.

I don't know what example you've shown to do this "speedup". It must be some super-contrived example that you must have come up with (you've done that before). It's hard to believe that unrolling a template causes a 10x speedup. Also C++ programmers have heard of something called a "profiler". They do use them, you know. Did you use a profiler on your code to see which parts were slow, or did you just (as always) knee-jerk blame C++ and go back to your 'C' coding? Second, (and I've said this many times), this is a C++ forum, where the solutions (whether you like them or not) will be C++ solutions. If someone posts to use try-catch, or use a template, or use a std::vector<>, is it your duty to say it's not a solution? If you went to the Java forum, would you post 'C' code, saying that Java is too slow?

And I know you don't want to hear this, but when C++ experts hear that "my program is slower in C++", the problem most of the time is that the programmer does not use the language or library properly, with the second (less of a reason) being that the compiler implementation is poor. It is not a problem with the C++ language in general. I can give you links that cite this, but why bother? Your "10x speed up" could be a "10x slowdown" on another compiler or on another platform.

Use whatever language you like. The problem is when you advocate other languages in a general C++ forum and with your advocacy, claim that the C++ solution is wrong.

Regards,

Paul McKenzie

Anthony Mai
July 8th, 2002, 10:28 AM
I only follow up on what Paul only commented.

I do not blame Paul for bad memory, but check my post just a few days ago, June 20th, at 4:50pm, in the thread about sorting an array of structures. I pointed out that a simple C solution which uses qsort to sort the index to the structure array easily beat a standard C++ solution which uses the sort() template function. The speed-up is 10-20 times.

You cannot have code that is both generic and the best in performance. You cannot have a pair of shoes that is "one size fits all" and which also fits your feet perfectly. You can't use C++ templates and still achieve code performance.

And am I not talking about C++ in this board? As long as I write any class/structure that has member functions at all, it's C++. I am against SOME stuff in the C++ standard, like templates. It is still C++, not C, when you take away templates, exceptions, etc., the stuff that is harmful to performance and robustness.

I challenge Paul or anyone to write a piece of code that can sort 1 million structures, each containing at least two strings, each of which is at least 1KB long, using std::sort() and other STL stuff, within ONE hour, on a regular PC. I will give you one week.

Meanwhile, in time, I will provide a solution which takes just a few minutes to do the same thing, using plain C and no assembly code will be used.

Paul McKenzie
July 8th, 2002, 11:49 AM
Originally posted by Anthony Mai
I only follow up on what Paul only commented.

I do not blame Paul for bad memory, but check my post just a few days ago, June 20th, at 4:50pm, in the thread about sorting an array of structures. I pointed out that a simple C solution which uses qsort to sort the index to the structure array easily beat a standard C++ solution which uses the sort() template function. The speed-up is 10-20 times.

Here is the quote from your supposed speedup solution:

A better thing to do is not to copy or move the objects at all, instead, create an integer array which contains indexes into the object array, and use qsort to sort just the index. After the indexes are sorted in proper order, access the objects through the indexes.

You keep moving the goalposts in the middle of the game. So you sorted indices using qsort(). You did not sort as std::sort() would, did you? You didn't do strcmp(), did you? Comparing apples to oranges, don't you think? Don't you think that indices can also be sorted using std::sort()? Why not use your very same technique, but use std::sort() instead of qsort()? Trying to pull a fast one?

Also if I were you, I wouldn't mention that thread. It kind of exposes a few things about your knowledge of C++.

And am I not talking about C++ in this board? As long as I write any class/structure that has member functions at all, it's C++. I am against SOME stuff in the C++ standard, like templates.

So basically, you are against C++. Templates are a big part of C++ programming, whether you accept them or not, as are exception handling and the standard library. When someone gives a correct answer pertaining to the C++ language, there is no reason to arrogantly chime in and say that the person is wrong (especially when they are not wrong).

It is still C++, not C, when you take away templates, exceptions, etc., the stuff that is harmful to performance and robustness.

This, of course, is your opinion.

I challenge Paul or anyone to write a piece of code that can sort 1 million structures, each containing at least two strings, each of which is at least 1KB long, using std::sort() and other STL stuff, within ONE hour, on a regular PC. I will give you one week.

Yeah, just sort the indices using std::sort(), just like you did with qsort(). People have quoted line and verse from the standard to you, they have given you links to the experts on the language that go contrary to whatever you have to say, they've suggested books, papers, etc. You've even claimed that the inventor of the language is wrong about his own language. Why waste any time showing you code?

Regards,

Paul McKenzie

Bob Davis
July 8th, 2002, 12:30 PM
I'm not seeing how the sorting of the indices would work. To sort the array of integers, which admittedly would be much faster than doing the structs themselves, you need to do some sort of analysis of the structs. How is it any more efficient to extrapolate the indices from the structs, then sorting, rather than just sorting the structs themselves? It's been a while since I've read much about sorting algorithms, so I have to plead ignorance on this one. :)

Graham
July 9th, 2002, 03:39 AM
The saving would come in not having to shuffle potentially large objects around. They all stay where they are and you copy and move simple data types.
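
For illustration, the qsort-based index technique quoted earlier in the thread might look roughly like the sketch below. It is not Anthony Mai's actual code, which is not shown here. The Record type, the g_records pointer, and the function names are hypothetical, and the file-scope pointer is only one way to let the qsort callback reach the records, since qsort passes no user data to it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical record type standing in for the large structures discussed above.
struct Record
{
    char name[1024];
    char note[1024];
};

// qsort's callback receives no context argument, so this sketch reaches
// the record array through a file-scope pointer (an assumption of this example).
static const Record* g_records = 0;

// Compares two index entries by the name of the records they refer to.
int CompareByName(const void* a, const void* b)
{
    int ia = *static_cast<const int*>(a);
    int ib = *static_cast<const int*>(b);
    return std::strcmp(g_records[ia].name, g_records[ib].name);
}

// Fills 'index' with 0..count-1 and orders it by Record::name.
// The records themselves are never copied or moved; only ints are swapped.
void SortIndexByName(const Record* records, int* index, int count)
{
    assert(records != 0 && index != 0 && count >= 0);
    for (int i = 0; i < count; ++i)
        index[i] = i;
    g_records = records;
    std::qsort(index, static_cast<std::size_t>(count), sizeof(int), CompareByName);
}
```

After the call, records[index[0]] is the record with the smallest name, records[index[1]] the next, and so on, while the record array keeps its original order.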

Paul McKenzie
July 9th, 2002, 05:54 AM
Originally posted by Bob Davis
I'm not seeing how the sorting of the indices would work. To sort the array of integers, which admittedly would be much faster than doing the structs themselves, you need to do some sort of analysis of the structs. How is it any more efficient to extrapolate the indices from the structs, then sorting, rather than just sorting the structs themselves? It's been a while since I've read much about sorting algorithms, so I have to plead ignorance on this one. :)

As Graham pointed out, you're moving ints rather than entire strings. Basically, it looks something like this using std::sort:

#include <algorithm>
#include <string>
#include <vector>
#include <iostream>

class SomeClass
{
public:
std::string fld1;
std::string fld2;
};

typedef std::vector<SomeClass> SomeClassVect;
typedef std::vector<int> IntVector;

struct SortFld
{
SortFld(const SomeClassVect& Vect, int which) : m_Vect(Vect), nw(which) { }
bool operator() (int first, int second)
{
if ( nw == 0)
return m_Vect[first].fld1 < m_Vect[second].fld1;
return m_Vect[first].fld2 < m_Vect[second].fld2;
}

private:
SomeClassVect m_Vect;
int nw;
};


using namespace std;

int main()
{
SomeClassVect SV;
SomeClass S;
IntVector IV(2);
IV[0] = 0;
IV[1] = 1;
S.fld1 = "ABC";
S.fld2 = "000";
SV.push_back(S);
S.fld1 = "000";
S.fld2 = "111";
SV.push_back(S);

cout << "Original fld1" << endl;
cout << SV[IV[0]].fld1 << " " << SV[IV[1]].fld1 << endl;

cout << "Sort by field 1" << endl;
sort(IV.begin(), IV.end(), SortFld(SV, 0));
cout << SV[IV[0]].fld1 << " " << SV[IV[1]].fld1 << endl;

cout << "Original fld2" << endl;
cout << SV[IV[0]].fld2 << " " << SV[IV[1]].fld2 << endl;

cout << "Sort by field 2" << endl;
sort(IV.begin(), IV.end(), SortFld(SV, 1));
cout << SV[IV[0]].fld2 << " " << SV[IV[1]].fld2 << endl;
}

So the only thing being done is the comparison, and not the moving of the data. Of course, you are not supposed to be able to do this using std::sort (according to one such soul here) ;)

Regards,

Paul McKenzie

Paul McKenzie
July 9th, 2002, 05:58 AM
Originally posted by Paul McKenzie
You didn't do strcmp() did you? Comparing apples to oranges, don't you think?

That should read that you didn't copy the structures, as a "normal" (i.e. 2 param) version of std::sort would do.

Regards,

Paul McKenzie

Anthony Mai
July 9th, 2002, 06:37 PM
Paul,

You are really embarrassing yourself by showing a piece of code that shows that you have no idea about how to write efficient code at all.

The code you show is far from meeting my challenge of sorting 1 million structures within one hour. Actually it will take probably one full day to do it.

I did a simple test with your code. Some modifications were made so it sorts 8192 elements instead of two. And it takes 107 seconds to sort just 8192 elements, each containing just two strings of just 3 characters!!!! Full source code will be posted in my next message so anyone can have a try.

Had you used std::sort directly on the structure array, you would have been better off. What's the point of sorting by index if you can not do it (do not know how to do it) faster?

What's your problem? Your problem is you create a whole copy of the original structure array, and sort on that copy. No wonder it is slow!!!

Now be serious. Are you up to the challenge at all to write a piece of code that can sort 1 million structures in less than one hour?
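
A plausible source of the copying described above is that the SortFld functor in the posted code stores the vector by value (SomeClassVect m_Vect), and std::sort takes its comparator by value and is allowed to copy it internally. The following is a minimal sketch, assuming the same SomeClass layout, of a comparator that holds a const reference instead. SortFldByRef and SortedIndex are hypothetical names introduced here, and no timing claim is made for this version.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct SomeClass
{
    std::string fld1;
    std::string fld2;
};

typedef std::vector<SomeClass> SomeClassVect;
typedef std::vector<int> IntVector;

// Comparator that refers to the record vector instead of owning a copy.
// Copying this functor copies one reference and one int, not the records.
struct SortFldByRef
{
    SortFldByRef(const SomeClassVect& vect, int which) : m_Vect(vect), nw(which)
    {
        assert(which == 0 || which == 1);
    }

    bool operator()(int first, int second) const
    {
        if (nw == 0)
            return m_Vect[first].fld1 < m_Vect[second].fld1;
        return m_Vect[first].fld2 < m_Vect[second].fld2;
    }

private:
    const SomeClassVect& m_Vect; // must not outlive the vector it refers to
    int nw;
};

// Returns indices into 'records' ordered by fld1 (which == 0) or fld2 (which == 1).
IntVector SortedIndex(const SomeClassVect& records, int which)
{
    IntVector index(records.size());
    for (std::size_t i = 0; i < index.size(); ++i)
        index[i] = static_cast<int>(i);
    std::sort(index.begin(), index.end(), SortFldByRef(records, which));
    return index;
}
```

The trade-off is lifetime: the referenced vector has to stay alive and unchanged for as long as the comparator is in use, which holds here because the functor only lives for the duration of the std::sort call.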

Anthony Mai
July 9th, 2002, 06:39 PM
// This is Paul's original code, modified slightly so that an array
// of 8192 structures is generated and then sorted.
// It takes nearly 2 minutes on my machine!!!

#include <windows.h>
#include <assert.h>
#include <cstdlib> // rand, srand
#include <algorithm>
#include <string>
#include <vector>
#include <iostream>

#define ELEMENT_COUNT 8192

class SomeClass
{
public:
std::string fld1;
std::string fld2;
};

typedef std::vector<SomeClass> SomeClassVect;
typedef std::vector<int> IntVector;

struct SortFld
{
SortFld(const SomeClassVect& Vect, int which) : m_Vect(Vect), nw(which) { }
bool operator () (int first, int second)
{
if ( nw == 0)
return m_Vect[first].fld1 < m_Vect[second].fld1;
return m_Vect[first].fld2 < m_Vect[second].fld2;
}

private:
SomeClassVect m_Vect;
int nw;
};


using namespace std;

string RandomAlphaString()
{
int i;
char tmp[4];
string result;

const char char_tbl1[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";

for (i=0; i<3; i++)
{
tmp[i] = char_tbl1[((unsigned int)rand())%(sizeof(char_tbl1)-1)];
}
tmp[sizeof(tmp)-1] = 0x00;

result = tmp;
return result;
}


string RandomNumberString()
{
int i;
char tmp[4];
string result;

const char char_tbl2[] = "0123456789";

for (i=0; i<3; i++)
{
tmp[i] = char_tbl2[((unsigned int)rand())%(sizeof(char_tbl2)-1)];
}
tmp[sizeof(tmp)-1] = 0x00;

result = tmp;
return result;
}

int main()
{
int i, StartTime, EndTime;
IntVector IV(ELEMENT_COUNT);
SomeClassVect SV;

srand(GetTickCount());

for (i=0; i<ELEMENT_COUNT; i++)
{
SomeClass S;

IV[i] = i;
S.fld1 = RandomAlphaString();
S.fld2 = RandomNumberString();

SV.push_back(S);
}

cout << "Original fld1" << endl;
cout << SV[IV[0]].fld1 << " " << SV[IV[1]].fld1 << endl;

cout << "Sort by field 1" << endl;

StartTime = GetTickCount();

sort(IV.begin(), IV.end(), SortFld(SV, 0));

EndTime = GetTickCount();

cout << "Sorting " << ELEMENT_COUNT << " elements takes " << (EndTime-StartTime) << " ms" << endl;

cout << SV[IV[0]].fld1 << " " << SV[IV[1]].fld1 << endl;

cout << "Original fld2" << endl;
cout << SV[IV[0]].fld2 << " " << SV[IV[1]].fld2 << endl;

cout << "Sort by field 2" << endl;
sort(IV.begin(), IV.end(), SortFld(SV, 1));
cout << SV[IV[0]].fld2 << " " << SV[IV[1]].fld2 << endl;
}

Paul McKenzie
July 9th, 2002, 08:51 PM
Originally posted by Anthony Mai
Paul,

You are really embarrassing yourself by showing a piece of code that shows that you have no idea about how to write efficient code at all.
Please, you know very little about C++. You should be ashamed just showing up here.
The code you show is far from meeting my challenge of sorting 1 million structures within one hour. Actually it will take probably one full day to do it.

Please go away.
That code was to show that std::sort can sort indices, i.e. for demonstration purposes only. That code was not for any challenge of "1 million elements".

I did a simple test with your code. Some modifications were made so it sorts 8192 elements instead of two. And it takes 107 seconds to sort just 8192 elements, each containing just two strings of just 3 characters!!!! Full source code will be posted in my next message so anyone can have a try.

With your track record, I highly doubt anyone would touch your code.

Had you used std::sort directly on the structure array, you would have been better off. What's the point of sorting by index if you can not do it (do not know how to do it) faster?

See second paragraph.

Now be serious. Are you up to the challenge at all to write a piece of code that can sort 1 million structures in less than one hour?

I have no time for your sophistry. The thread was originally about whether a piece of code is ANSI compliant. Zeeshan mentioned line and verse from the standard. Then you chime in with a totally irrelevant comment about style, completely missing the point of the original question. Now you want to take the advantage of throwing this entire thread over to some ego trip that you are on. I am not going to take part in this, and if I weren't a good sport, I could provide links to some of the most bizarre things you've stated, outright insults, and totally irrelevant, misleading, and incorrect things about basic C++ and its usage, just to make the others here have a chuckle.

You posting 'C' code "in your next message" is your "Hail Mary pass" to save yourself (for those who are not familiar with American Football, that's the long, high pass that the team that is losing must throw completely down the field with the clock running out of time). I won't be providing any code to show anything. Now go along and pick another thread (and CodeGuru poster) to pick on. So far, you are really a person on an ego trip, and you've targeted individuals here (some aren't even posting anymore to CodeGuru) to go out of your way to prove what -- I have no idea.

Regards,

Paul McKenzie

stober
July 9th, 2002, 09:55 PM
Ok you guys -- quit your squabbling! I'm sick of you filling up my e-mail box with your blasted quarrels. Send private e-mails to each other if you want to fight like children!

Paul McKenzie
July 9th, 2002, 10:00 PM
Originally posted by stober
Ok you guys -- quit your squabbling! I'm sick of you filling up my e-mail box with your blasted quarrels. Send private e-mails to each other if you want to fight like children!

I agree that this thread should have stopped on page 1, and it is clear who wants to keep this going. Higher-ups have been informed.

Regards,

Paul McKenzie