How to sample a node from a binary tree efficiently?
Hello,
I have a binary tree and I want to choose a node randomly from the tree, or from any subtree of this tree. The tree is changing during the algorithm as well. I mean nodes' position change in the tree.
The performance is very important in my program. I will be really appreciated if you have any idea about how to choose a node randomly from the tree.
Many thanks in advance.
Re: How to sample a node from a binary tree efficiently?
If all you want is random, just start at the root of the tree and go to the left or right child in a random fashion until you reach a leaf. Return that leaf. Note that this will not (I think) give you a uniform sample of the tree. To get something more uniform, you can probably do something like:
Keep track of the number of leaf nodes in the left and right subtrees of each node. Then choose which subtree to explore randomly, but based on their weights. That is, if there are L leaf nodes in the left subtree and R leaf nodes in the right tree, go left with probability L / (L+R). That is, if rng.NextDouble() < L/(L+R) go left, otherwise go right (where rng.NextDouble() is a function that returns a number randomly from a uniform distribution between 0 and 1).
Um, I think that will be random and uniform, but no proof is offered, so take it for what it's worth.
Re: How to sample a node from a binary tree efficiently?
Quote:
Originally Posted by
BioPhysEngr
Um, I think that will be random and uniform, but no proof is offered, so take it for what it's worth.
I think so too but if all nodes should have a chance of being selected you need to modify the probabilities like this. Probability that,
- a specific node is selected is 1 / (L+R+1)
- the node's left subtree is selected is L / (L+R+1)
- the node's right subtree is selected is R / (L+R+1)
According to definition, to be evenly distributed each of N nodes should be selected with an equal probability of 1/N.
Now say the above node is the root node. This means the whole tree has N = L+R+1 elements.
The root node itself is selected with 1/N probability.
The left subtree is selected with a probability of L/N. For its nodes to be evenly distributed they should be selected with a probability of 1/L and if they are the combined probability of subtree selection and node selection will be L/N*1/L = 1/N for each node in the subtree. The same is true for the right subtree. And the same will apply recursively to every subtree in the tree.
So all nodes in the entire tree will be selected with a probability of 1/N and thus be evenly distributed.