asbo, ShaunR:
How would you suggest I place clusters within clusters within clusters within... (dynamically)? I'm desperate for speed, and if it means sacrificing OOP design I'm willing to try it, but as far as I can tell LV specifies cluster contents pretty rigidly. Doing a prev/next doesn't work in a kd tree because we have a large number of dimensions... and implicit kd trees will give this type of structure but use huge amounts of memory (curse of dimensionality and all that). As you'll see in my project, I've already tried a simple brute force method with a simple array subtract/square/sum/min for finding the nearest point, and if it isn't fast enough with raw arrays it definitely won't be fast enough with clusters. Unless I've completely misunderstood you...
sam:
The recursive implementation is simply because I can't think of any other way to move into a tree of unknown depth in a data-flow environment without doing a copy of each subtree as we go. Sure,i could use a while loop, but it would be really slow as it would require copying each subtree onto the stack as I go.
Aristos Queue
Thanks for the feedback. I've gone through and implemented the ideas you've given, excepting the parent_kdtree class, and I've almost doubled my performance, but it's still 50x slower than the .Net. As I'm not really concerned about concurrent access, would you then recommend getting rid of the DVR-on-a-stack technique and instead go for a recursive unbundle/bundle without any DVRs at all then?
I thought the casting might be slow so I tried changing the data type in the queue to just straight kdtree DVRs and it slowed me down about 4x. Not sure why, but it seems that is the wrong thing to do.
mje: A kd-tree is a tree that works in more than one dimension, ie. for a k-dimensional point cloud. It is typically used for fast nearest-neighbor lookup. Check out http://en.wikipedia.org/wiki/K-d_tree. The one described on Wikipedia isn't quite what I'm using, as I don't store points on lower parts of the tree, just the leaves, and at each leaf I keep an array of points. I also have some optimizations such as keeping track of the hyperrect containing each subtree and only traversing that tree if that hyperrect could contain a closer point. A bit abstract, I know
I just tried profiling, and I spend more time in Exemplar:getSqDist than I do in the entire .Net lookup algorithm. So maybe I'll try a C++ .dll instead... sigh.