[RESOLVED] error resistant read and multithreading
superbonzo
January 28th, 2009, 04:03 AM
Suppose that two threads share the same variable, say, an array of ints (in general, a POD type); the first thread regularly reads from and writes to that variable, and the second thread only needs read access to it;
furthermore, the second thread doesn't need an "error free" view of that variable: I mean, it can estimate the consistency of the read and it doesn't matter if occasionally the read value is inaccurate or even totally inconsistent.
My question is: is doing this without any synchronization mechanism safe? (i.e. is there any side effect other than the possibly inconsistent read of the variable?)
MrViggy
January 28th, 2009, 10:54 AM
If I remember my multi-threaded programming correctly, it really kinda depends on the size of the int. In general, if the int is the natural size of the machine (i.e. 32 bit int on a 32 bit machine), then writing is atomic. You don't really need any synchronization. However, you've already stated that this isn't really a requirement.
The only side effect, in this situation, would be that a write to the array may be "incomplete" when you go to read from it. As long as the array size isn't changing, then you can probably get away without any synchronization.
Viggy
Codeplug
January 28th, 2009, 11:36 AM
>> ... then writing is atomic.
This is an architecture-dependent statement. I'm guessing x86 was in mind, which is summarized here (http://cboard.cprogramming.com/showpost.php?p=771803&postcount=6).
>> ... then ... You don't really need any synchronization.
An architecture- and algorithm-dependent statement (and very dangerous).
>> it doesn't matter if occasionally the read value is inaccurate or even totally inconsistent.
So why even read in the first place?
Word-tearing and atomicity aside, you still have to consider memory-model/cache-coherency and compiler optimization issues.
Trying to shortcut synchronization is just "playing with fire" - and even the "experts" get burnt.
gg
superbonzo
January 28th, 2009, 12:48 PM
>> it doesn't matter if occasionally the read value is inaccurate or even totally inconsistent.
>> So why even read in the first place?
There are two situations that come to my mind (and that I'm actually exploiting in code) where that assumption applies
(of course, there could be others, so my original question is also of interest per se (i.e. given the assumptions, is there any side effect other than the possibly inconsistent read of the variable?)):
1) the read/write thread is a worker thread doing some computation, and the read-only thread is supposed to give a visual representation of the partial result; this representation doesn't need to be accurate, so if the inconsistent reads are "sufficiently occasional" (what this means concretely depends on the specific computation and user interaction), the "peeking" thread can estimate consistency and drop a visual update or, if the user tolerates it, simply draw an inconsistent visual update (leaving it to the user to interpret what he actually sees on screen) (by the way, the "user" is me or a colleague of mine :) ).
2) you have a couple of worker threads doing computations that can in principle run in parallel without sharing any data (so, ignoring the beginning and the end, they don't need synchronization);
now, suppose that these calculations run faster if they get to know each other's partial results;
suppose also that if the partial result is inaccurate (of course, this happens anyway at the beginning) the calculation simply runs at "normal" speed. But if the result is accurate (and the peeking thread can recognize this condition) the calculation runs faster.
So, it doesn't matter whether the inaccuracy comes from the algorithm or from multithreading artifacts: again, if the inconsistent reads are "sufficiently occasional", the overall computation will run faster (as above, the meaning of "sufficiently occasional" depends on the algorithm and the actual measurements).
Of course, the reason for my concern is primarily to simplify the code (e.g. by avoiding synchronization).
Codeplug
January 28th, 2009, 01:50 PM
Well, your system shouldn't catch fire if you're doing unsynchronized reading from shared memory just for visualization - as long as your visualization code is ready for any possible data values.
If you want to speed up computation by utilizing results of prior (parallel) computations, then you should synchronize access. Trying to detect the correctness of the data seems futile when you have guarantees with proper synchronization.
Partitioning the work such that data isn't even shared is usually ideal when it comes to parallelized algorithms. Don't get sucked into thoughts of premature optimization. Also note that there are factors like "false (http://developer.amd.com/Membership/Print.aspx?ArticleID=32&web=http%3a%2f%2fdeveloper.amd.com) sharing (http://msdn.microsoft.com/en-us/magazine/cc850829.aspx)" that you should consider as well.
gg
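As a rough illustration of the false-sharing point above, here is a minimal sketch that assumes a C++11 compiler (newer than this thread) and a 64-byte cache line. The names `PaddedSum`, `kCacheLine`, `kWorkers` and `parallel_count` are hypothetical. Each worker owns one padded slot, so no two workers write to the same cache line, and the only synchronization is the final `join()`.

```cpp
#include <array>
#include <cstddef>
#include <thread>

constexpr std::size_t kCacheLine = 64;  // assumed cache-line size; check your CPU
constexpr std::size_t kWorkers = 4;     // hypothetical number of worker threads

// alignas pads each slot to a full cache line, so neighbouring slots
// never share a line and writes by one worker don't evict another's line.
struct alignas(kCacheLine) PaddedSum {
    long value = 0;
};

long parallel_count(long iterations_per_worker) {
    std::array<PaddedSum, kWorkers> partial{};
    std::array<std::thread, kWorkers> pool;

    for (std::size_t t = 0; t < kWorkers; ++t) {
        pool[t] = std::thread([&partial, t, iterations_per_worker] {
            // Each worker touches only its own slot: no data is shared.
            for (long i = 0; i < iterations_per_worker; ++i)
                partial[t].value += 1;
        });
    }

    // join() is the only synchronization point; results are read after it.
    for (auto& worker : pool) worker.join();

    long total = 0;
    for (const auto& slot : partial) total += slot.value;
    return total;
}
```

Removing the `alignas` should give the same result but may run noticeably slower on some machines, since all four counters would then sit in one cache line.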
superbonzo
January 28th, 2009, 03:15 PM
>> Well, your system shouldn't catch fire if you're doing unsynchronized reading from shared memory just for visualization - as long as your visualization code is ready for any possible data values.
ok, so can we say that the only possible side effect is data inconsistency (e.g. there will be no crash, memory corruption, or any other unexpected behaviour, independently of any compiler optimizations/settings)?
>> Partitioning the work such that data isn't even shared is usually ideal when it comes to parallelized algorithms. Don't get sucked into thoughts of premature optimization.
I would agree if the data sharing implied the need for synchronization; in that case the coding effort is not proportional to the hypothetical benefits. But in the case of a read-only POD variable (in my specific situation) it's only a matter of <10 lines of code.
>> Trying to detect the correctness of the data seems futile when you have guarantees with proper synchronization.
maybe my situation is rather atypical, but the only check needed is whether the variable (which is a float) is inside a specific interval.
Now, if a multithreading artifact corrupts the read, then if it lies outside the interval it is rejected; if it lies inside the interval (and I assume that it is uniformly distributed, although that is not an essential assumption) then the algorithm accepts it (and nothing special happens).
You could ask how such a thing can speed up the computation... well, that variable is used to "suggest" initial conditions to the algorithm. A "bad" suggestion (e.g. uniformly distributed in the interval) is roughly equivalent to the default initial condition; a "good" suggestion makes the algorithm converge many times faster.
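The interval check described above might look roughly like the following sketch. It is not the poster's actual code: it assumes C++11 `std::atomic<float>` (not yet standard when this was posted), and the names `shared_hint`, `publish_hint` and `initial_condition` are hypothetical. With a relaxed atomic the reader may see a stale hint but never a torn one, so the range check only has to filter out hints that are not useful.

```cpp
#include <atomic>

// Hypothetical shared "suggestion"; written by one worker, peeked at by another.
std::atomic<float> shared_hint{0.0f};

// Writer side: publish the current partial result with no ordering guarantees.
void publish_hint(float partial_result) {
    shared_hint.store(partial_result, std::memory_order_relaxed);
}

// Reader side: accept the hint only if it lies inside [lo, hi];
// otherwise fall back to the default initial condition.
float initial_condition(float lo, float hi, float fallback) {
    const float hint = shared_hint.load(std::memory_order_relaxed);
    // Both comparisons are false for NaN, so a NaN hint is rejected as well.
    return (hint >= lo && hint <= hi) ? hint : fallback;
}
```

A rejected or poor hint costs nothing beyond starting from the default, which matches the "runs at normal speed" case described in the post.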
STLDude
January 28th, 2009, 04:13 PM
I don't know if this is any use for your situation, but have you looked at lockless algorithms?
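One lock-free pattern that fits the "peek and drop inconsistent updates" visualization case from earlier in the thread is a sequence lock (seqlock). Nobody in the thread names this technique; the sketch below is only an illustration, assuming C++11 atomics and a single writer thread. `SeqLockedArray`, `Snapshot` and `kSize` are hypothetical names. The writer never blocks, and the reader can tell whether the copy it took is consistent.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

constexpr std::size_t kSize = 4;  // hypothetical array length
using Snapshot = std::array<int, kSize>;

// Single-writer seqlock: an odd sequence number means a write is in progress.
class SeqLockedArray {
public:
    // Must only ever be called from one thread.
    void write(const Snapshot& src) {
        const unsigned s = seq_.load(std::memory_order_relaxed);
        seq_.store(s + 1, std::memory_order_relaxed);  // odd: write started
        std::atomic_thread_fence(std::memory_order_release);
        for (std::size_t i = 0; i < kSize; ++i)
            data_[i].store(src[i], std::memory_order_relaxed);
        seq_.store(s + 2, std::memory_order_release);  // even: write finished
    }

    // Returns true if dst holds a consistent copy. On false, dst may hold
    // a mix of old and new elements and the caller should drop or retry.
    bool try_read(Snapshot& dst) const {
        const unsigned before = seq_.load(std::memory_order_acquire);
        if (before & 1u) return false;  // writer is in the middle of an update
        for (std::size_t i = 0; i < kSize; ++i)
            dst[i] = data_[i].load(std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_acquire);
        // Unchanged sequence number: no write overlapped the copy.
        return seq_.load(std::memory_order_relaxed) == before;
    }

private:
    std::atomic<unsigned> seq_{0};
    std::array<std::atomic<int>, kSize> data_{};
};
```

A viewer thread would call `try_read` once per frame and simply skip the redraw when it returns false.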
Codeplug
January 28th, 2009, 05:03 PM
>> there will be no crash, memory corruption, or any other unexpected behaviour
Typically, a read in and of itself won't do that - it's what you do with the read data and whether or not it means anything.
>> I would agree if the data sharing implied the need for synchronization
From a high-level perspective, access to shared memory always requires synchronization. From a POSIX perspective, it's non-standard without it. Once C++0x is finalized, it'll be undefined behavior. If you aren't using the synchronization primitives provided by your compiler/implementation, you should also consider unsynchronized access as invoking undefined behavior.
>> it's only a matter of <10 lines of code.
Prefer correctness over LOC.
>> Now, if a multithreading artifact corrupts the read...
There are other things to consider than just "artifacts". It may be completely valid for every read to return 0.
gg
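For reference, a minimal sketch of shared access that stays within defined behavior while still skipping locks, assuming the `std::atomic` interface from C++0x/C++11. `shared_cells`, `worker_step`, `peek` and `kCells` are hypothetical names. Each element is accessed with relaxed ordering: an individual read cannot tear and the compiler cannot assume the value never changes, but the array as a whole may still be observed half-updated.

```cpp
#include <array>
#include <atomic>
#include <cstddef>

constexpr std::size_t kCells = 8;  // hypothetical array size

// Static storage duration, so every element starts out zero.
std::array<std::atomic<int>, kCells> shared_cells;

// Worker thread: overwrite every element, one relaxed store at a time.
void worker_step(int value) {
    for (auto& cell : shared_cells)
        cell.store(value, std::memory_order_relaxed);
}

// Viewer thread: copy whatever is currently visible. The copy may mix
// elements from before and after a concurrent worker_step() call.
std::array<int, kCells> peek() {
    std::array<int, kCells> snapshot{};
    for (std::size_t i = 0; i < kCells; ++i)
        snapshot[i] = shared_cells[i].load(std::memory_order_relaxed);
    return snapshot;
}
```

On x86 a relaxed load or store of an `int` typically compiles to an ordinary move, so the cost over plain `int` access is usually small.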
superbonzo
January 28th, 2009, 05:43 PM
>> It may be completely valid for every read to return 0.
well, even in that case the algorithm would simply run at "normal" speed: as I said, the uniformity of the probability distribution of a corrupted read is not an essential assumption ...
anyway, if this is (or will be) undefined behaviour then I won't do it.
Thanks
codeguru.com