bitset seems the way to go then, but I would want to come up with more flags for the program, which shouldn't be hard.
For performance, it may or may not matter, but the class this is part of will be instantiated for many, possibly very many, agents in the system, and each instance may test this value on every other instance. With, say, 100 agents, each one checks the value on the other 99, giving 9,900 checks. The class is designed to make the check whenever another instance satisfies the check's requirements, say time of day, so if all instances are online at the same time, the worst case is 9,900 checks. Now assume the check is part of an overhead loop for the program and each instance makes it every second. In the worst case the check then happens 9,900 times per second, or 594,000 times per minute. That is why I want the most efficient data type.
I will look into bitset, but I would like to ask one more question on the subject: would inline assembly for C++ help me manipulate single bits better than C++ alone does? I already know assembly and enjoy it.
bitset seems the way to go then, but I would want to come up with more flags for the program, which shouldn't be hard.
OK. The bitset is a template class, so you only pay for the overhead at compile time (so testing or setting a single bit in a bitset<10000> is no different than in a bitset<1> in terms of runtime speed).
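As a rough sketch of what a handful of flags in a std::bitset could look like: the flag names and the helper functions below are made up for illustration and are not from the original program.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical flag positions; the real program would define its own.
enum AgentFlag { FlagOnline = 0, FlagBusy = 1, FlagVisible = 2, FlagCount = 3 };

// The bitset size is fixed at compile time.
typedef std::bitset<FlagCount> AgentFlags;

// Build a flag set with two bits switched on.
AgentFlags makeOnlineVisible() {
    AgentFlags flags;          // all bits start cleared
    flags.set(FlagOnline);     // set(pos) turns a single bit on
    flags.set(FlagVisible);
    return flags;
}

// Test individual bits by position.
bool isOnlineAndNotBusy(const AgentFlags& flags) {
    return flags.test(FlagOnline) && !flags.test(FlagBusy);
}
```

Adding another flag later only means adding an enumerator before FlagCount; the calling code keeps using test() and set().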
For performance, it may or may not matter, but the class this is part of will be instantiated for many, possibly very many, agents in the system, and each instance may test this value on every other instance. With, say, 100 agents, each one checks the value on the other 99, giving 9,900 checks. The class is designed to make the check whenever another instance satisfies the check's requirements, say time of day, so if all instances are online at the same time, the worst case is 9,900 checks. Now assume the check is part of an overhead loop for the program and each instance makes it every second. In the worst case the check then happens 9,900 times per second, or 594,000 times per minute. That is why I want the most efficient data type.
The check will come down to calling a function that tests a bool, an int, a bit, whatever.
The real issue is the higher-level design and maintainability. I think this is more important than the data type at this stage. Once you have a working design, you profile the code and determine where the bottlenecks occur (again, you can't optimize by sight). Changing the data type is the easiest task (and could even be made transparent if using templates).
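To illustrate the point about templates, here is a minimal sketch in which the storage type of the flag is a template parameter. The Agent class, its member names, and the choice of bit 0 are assumptions made up for the example, not the poster's actual design.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical agent class. FlagStore is whatever type ends up holding
// the flag; callers only ever see isOnline() and setOnline().
template <typename FlagStore>
class Agent {
public:
    Agent() : flags_() {}   // value-initializes: false for bool, all zero for bitset

    bool isOnline() const { return get(flags_); }
    void setOnline(bool on) { set(flags_, on); }

private:
    // Storage as a plain bool.
    static bool get(bool f) { return f; }
    static void set(bool& f, bool on) { f = on; }

    // Storage as a bitset: the flag is kept in bit 0 (an arbitrary choice).
    template <std::size_t N>
    static bool get(const std::bitset<N>& f) { return f.test(0); }
    template <std::size_t N>
    static void set(std::bitset<N>& f, bool on) { f.set(0, on); }

    FlagStore flags_;
};
```

Code written against Agent<bool> keeps working unchanged with Agent<std::bitset<8> >, so the storage type can be swapped after profiling without touching the callers.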
Also, "over-pointerizing" the code doesn't necessarily make the code faster, especially in this day and age of C++ optimizing compilers. There are cases where using pointers (where the coder is trying to be smarter than the compiler) leads to slower code. The reason is that the compiler's optimizer can't make heads or tails of the pointer manipulations, and the pointer aliasing renders the optimizer useless.
I will look into bitset, but I would like to ask one more question on the subject: would inline assembly for C++ help me manipulate single bits better than C++ alone does? I already know assembly and enjoy it.
Look into this after you have a running program and it has been profiled.
There are profilers built into Visual Studio, but I haven't used them since I have used commercial products similar to the ones above. Also see this link for free profilers (I don't know the quality of them though):
nuzzle posted before I got this post done. That makes a lot of sense. I am ok with using a bool because the data will never be interpreted in a way the human will see except for possibly one or two instances, where a simple conversion can be made.
Separation of concerns is another very general principle of good programming. Internal data handling is one thing and user display is another. Keeping them well separated will increase flexibility. The program becomes more modular.
"It doesn't matter how beautiful your theory is, it doesn't matter how smart you are. If it doesn't agree with experiment, it's wrong."
Richard P. Feynman
For performance, it may or may not matter, but the class this is part of will be instantiated for many, possibly very many, agents in the system, and each instance may test this value on every other instance. With, say, 100 agents, each one checks the value on the other 99, giving 9,900 checks
That sounds like you could get a massive performance improvement by finding a smarter algorithm than just comparing all pairs. When you are talking about optimization, the first thing to consider is algorithms and data structures. Making the right choices here can make your program several times as fast. Only then should you look at lower-level issues, such as how to avoid redundant copies (where the compiler doesn't do it for you already) or how to improve cache performance. The really low-level stuff is something you hardly ever have to touch, in my experience. Compilers nowadays are great at handling the low-level things to make your code as fast as it can be. The exception might be when you are targeting a specific platform.
Regarding the use of booleans, my 2 cents is that you should be careful about using booleans when the meaning of true or false may not be clear in the context where it is used, especially when using several in parallel. E.g. if you have the function
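What the smarter algorithm looks like depends on what each agent does with the result, which the thread doesn't say. As one possible sketch, assume each agent only needs to know how many other agents have the flag set; the Agent struct and both function names are hypothetical. The all-pairs version does n*(n-1) tests, while the single-pass version tests each flag once and derives every agent's answer from the total.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical agent with the single flag under discussion.
struct Agent {
    bool online;
};

// All pairs: every agent tests every other agent, n*(n-1) flag tests.
std::vector<std::size_t> onlineOthersAllPairs(const std::vector<Agent>& agents) {
    std::vector<std::size_t> result(agents.size(), 0);
    for (std::size_t i = 0; i < agents.size(); ++i)
        for (std::size_t j = 0; j < agents.size(); ++j)
            if (i != j && agents[j].online)
                ++result[i];
    return result;
}

// Single pass: count the online agents once, then subtract the agent itself.
std::vector<std::size_t> onlineOthersSinglePass(const std::vector<Agent>& agents) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < agents.size(); ++i)
        if (agents[i].online)
            ++total;

    std::vector<std::size_t> result(agents.size(), 0);
    for (std::size_t i = 0; i < agents.size(); ++i)
        result[i] = total - (agents[i].online ? 1 : 0);
    return result;
}
```

Both functions return the same counts; only the amount of work differs. If the agents need more than a count, a similar idea is to collect the online agents into a list once per tick and let every agent read that list.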
Code:
void DoSomething(bool option1, bool option2);
// that will be called like
void foo()
{
    DoSomething(true, false);
}
It is not clear what the true and false mean in the function call, even if you give the function and its arguments meaningful names. You can use enums instead to make the code self-documenting.
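A sketch of the enum version of that call follows. The option names are invented for the example, and DoSomething returns a string here only so the effect is visible; the original function returns void.

```cpp
#include <string>

// Hypothetical option types replacing the two bool parameters.
enum Logging { LoggingOff, LoggingOn };
enum Overwrite { KeepExisting, OverwriteExisting };

// Returns a description of the options purely for illustration.
std::string DoSomething(Logging logging, Overwrite overwrite) {
    std::string result = (logging == LoggingOn) ? "logging=on" : "logging=off";
    result += (overwrite == OverwriteExisting) ? " overwrite=yes" : " overwrite=no";
    return result;
}

// The call site now documents itself.
std::string foo() {
    return DoSomething(LoggingOn, KeepExisting);
}
```

A side benefit is that swapping the two arguments, or passing a bare true, no longer compiles, because C++ does not implicitly convert a bool or another enum type to these enums.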
I wanted to add: always, always be very, very careful with values "that can only hold two states". Unless your variable actually holds a representation of "true/false", more often than not it ends up being able to hold 3 or more states.
If you created an interface based on bools, you'll be f'ed beyond belief.
In these cases I recommend to always go for an enum. As long as you have only two states, it'll be just as efficient anyways. But the day your variable holds a third "grey" state, you'll have some hope that your program can handle it...
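For instance, a two-state value that later grows a third state might look like the sketch below. The state names and the describe function are made up for illustration.

```cpp
#include <string>

// Hypothetical agent state. It started out as Offline/Online; Idle is the
// "grey" third state added later without changing any function signature.
enum Availability { Offline, Online, Idle };

// Code that switches on the enum handles the new state in one place.
std::string describe(Availability state) {
    switch (state) {
        case Offline: return "offline";
        case Online:  return "online";
        case Idle:    return "idle";
    }
    return "unknown";  // only reached if an out-of-range value is forced in
}
```

Had the interface taken a bool, adding Idle would have meant changing every signature and every caller.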
Is your question related to IO?
Read this C++ FAQ LITE article at parashift by Marshall Cline. In particular points 1-6.
It will explain how to correctly deal with IO, how to validate input, and why you shouldn't count on "while(!in.eof())". And it always makes for excellent reading.
D_Drmmr and monarch_dodra, thank you for more clarification. I guess I was too concerned with the low level stuff due to my assembly education. So I can trust my Visual C++ 2008 Express Edition compiler to optimize as best it can for me? And now that you mention the enumeration for the states possibly having a third, I rethought my data and I could stretch it a bit for maybe a third state. It could cause a nice bit of confusion or lols in the program and it would make the class more modular. Thanks for your help.
D_Drmmr and monarch_dodra, thank you for more clarification. I guess I was too concerned with the low level stuff due to my assembly education. So I can trust my Visual C++ 2008 Express Edition compiler to optimize as best it can for me?
You can trust the optimizer to do a great job after you have fixed the project settings. In all its wisdom, Microsoft decided that it was a good idea to turn off optimizations in release builds by default. This was fixed in later versions, but if you use VS2008, you're stuck with the bad default settings. The most important ones are under 'C/C++ -> Optimization' and 'C/C++ -> Code Generation' in the project settings. Also, make sure you define _SECURE_SCL as 0 in release builds. See http://msdn.microsoft.com/en-us/libr...(v=vs.90).aspx.
Cheers, D Drmmr
Please put [code][/code] tags around your code to preserve indentation and make it more readable.
As long as man ascribes to himself what is merely a possibility, he will not work for the attainment of it. - P. D. Ouspensky
So I can trust my Visual C++ 2008 Express Edition compiler to optimize as best it can for me?
OK, but you should not trust the compiler passively; as Paul said, you should ask yourself how to write C++ code that enables the compiler to produce optimized code. In C++, this is obtained neither by excessively using pointers, reducing calls, packing data structures or whatever, nor by writing Java-like code; it is obtained by leveraging the type system.
For example, if you want to copy a range of data in STL you write something like a std::copy call over the begin()/end() of the source range and the begin() of the destination, independently of the nature of those ranges. Now, if those begin()/end() return pointers to PODs, then memcpy (or memmove) will typically be called; if they are iterators of some special kind, then some other copy strategy will be used, etc. Do you see it? The compiler is using the information embedded in the types to automatically select the most efficient method for you. And note that this happens at no readability/maintenance cost. And it's scalable in the sense that the more abstraction you add, the more information the compiler will have to optimize your code (for example, have you ever heard of linear algebra expression template libraries? These can make C++ code better performing than numerically-oriented languages like Fortran...).
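The code from the original post did not survive here, so the following is only a guess at the kind of call meant, using std::copy. The function names are invented for the example. The same line of code copies both ranges; which strategy the library picks underneath (a block memory move for contiguous ints, element-by-element assignment for strings in a list) is implementation-dependent.

```cpp
#include <algorithm>
#include <list>
#include <string>
#include <vector>

// Contiguous range of a trivially copyable type: the library is free to
// lower this to a single block move.
std::vector<int> copyInts(const std::vector<int>& src) {
    std::vector<int> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}

// Non-contiguous destination of a non-trivial type: the identical call
// falls back to assigning element by element.
std::list<std::string> copyStrings(const std::vector<std::string>& src) {
    std::list<std::string> dst(src.size());
    std::copy(src.begin(), src.end(), dst.begin());
    return dst;
}
```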
This is in contrast with low-level optimizations which, albeit useful in the specific context for which they were devised, don't scale with even innocuous code changes, resulting in high maintainability costs. Of course, you can/should use them wherever no type information is possible or required (for example, this can happen deep inside the implementations of those higher level constructs), but with monarch_dodra's remarks in mind, of course.
Last edited by superbonzo; November 24th, 2012 at 04:48 AM.