Daklu Posted March 21, 2008 Report Share Posted March 21, 2008 For my current project I have string data that I need to have organized into a tree structure. Since there is already a tree control I decided to use that and created a bunch of support VIs to do the operations I need, including decomposing the tree into a variant and vice versa. Now that I've done that, I've discovered it's much too slow. I suppose it shouldn't surprise me; there is an awful lot of overhead that isn't really necessary for my simple needs and there are tons of operations using references. So my question is, what is a good way to simulate a tree structure without using the tree control? Two solutions immediately come to mind: Store the Tree as an Array of Clusters - Each cluster in the array corresponds to a single tree node. The cluster contains the data string, the index to the parent node, and an array of indicies to the children nodes. It seems like there are lots of advantages using this method over the tree control: Easier (and faster?) conversions between the data type and variants (with an array I might be able to skip using variants altogether), no jockying with tags, can act on the data directly instead of using references, etc. Store the Tree as Variant Attributes - I found an .llb in the Code Repository written by John Lokanis that uses this type of implementation, although I haven't looked close enough to see how well it will fit my needs. Additional stream-of-thought questions: If I use an array, what's the best way to manage it? If I fill it sequentially I imagine it will get rather large and sparse after repeated add/remove operations. Hashing seems a bit overkill, assuming I could even come up with a decent algorithm. Perhaps preallocating an array and having it run through a compression routine when it fills up? Hmm, if I preallocate an array with default clusters how would I keep track of which element is the 'next empty' element as the array gets passed around? Reserve element 0 as a pointer? Encapsulate the whole thing as a variant and keep track of the pointer using an attribute? (ugh) If I implement it as variant attributes I'll have to convert it every time I want to do an operation on the tree. How does variant conversion compare to array access in terms of speed? Are there any other inherent advantages to using arrays or variants that make one "better" than the other for this? Thanks. Quote Link to comment
jdunham Posted March 22, 2008 Report Share Posted March 22, 2008 Have you checked out the Map Class? Quote Link to comment
crelf Posted March 22, 2008 Report Share Posted March 22, 2008 QUOTE (Daklu @ Mar 20 2008, 04:35 PM) For my current project I have string data that I need to have organized into a tree structure. Have you seen http://forums.lavag.org/Tree-Control-API-file71.html' target="_blank">Norm's Tree Control API? Quote Link to comment
Norm Kirchner Posted March 22, 2008 Report Share Posted March 22, 2008 I've actually recently made some updates to the tree control API (unsubmitted as of yet) that does something great! It now has an integrated variant based database which allows storage of data associated with each element of the tree. The only thing is that it's fully operational but I couldn't in good faith post it because I haven't tested it as well as the rest of the API. The only thing that really I'm concerned of is management of the data as elements are moved/removed/cleared and so on. If I get piinged hard enough I'll try to take some time to get something submitted. Quote Link to comment
John Lokanis Posted March 23, 2008 Report Share Posted March 23, 2008 QUOTE (Daklu @ Mar 20 2008, 02:35 PM) If I implement it as variant attributes I'll have to convert it every time I want to do an operation on the tree. How does variant conversion compare to array access in terms of speed?Are there any other inherent advantages to using arrays or variants that make one "better" than the other for this? The variant tree structure is fast for reading but slow for writting. So, if you build the data tree once and then read from it often, this is a good approach. If you do a lot of reading and writing, then I would look at the Map Class mentioned above. I have not tested it against my variant tree but I plan to someday if I ever get some spare time. Let us know how you end up solving the problem! -John Quote Link to comment
Daklu Posted March 25, 2008 Author Report Share Posted March 25, 2008 QUOTE (crelf @ Mar 20 2008, 04:11 PM) Have you seen Norm's Tree Control API? I did see that and looked at it a bit, but it also uses the tree control and carries the overhead associated with it that I'm trying to avoid. (I wish I had seen that before I cooked up my version though.) QUOTE Have you checked out the Map Class? Interesting, but... QUOTE (AQ) THE CODE IN THIS .zip FILE IS REALLY ADVANCED USE CASE OF LabVIEW CLASSES. Novices to classes may be scared off from LabVIEW classes thinking that all the code is this 'magickal'." ...I'm scared. I understand the concepts of classes but have little pratical experience with them. The discussion alone is over my head much less the code . Maintaining it could be difficult. I'll have to play around with it when I'm ready to rewrite the tree section. QUOTE (jlokanis) The variant tree structure is fast for reading but slow for writting. So, if you build the data tree once and then read from it often, this is a good approach. Lots of reading and writing. I guess I'll either have to go with AQ's map class or an array of clusters. Quote Link to comment
Norm Kirchner Posted March 25, 2008 Report Share Posted March 25, 2008 QUOTE (Daklu @ Mar 24 2008, 03:17 PM) I did see that and looked at it a bit, but it also uses the tree control and carries the overhead associated with it that I'm trying to avoid. (I wish I had seen that before I cooked up my version though.) Will the data ever be displayed or was your thought to use the tree to only store the data in a hierarchical format? The reason I ask is what is the overhead you're speaking of, because to grab values/properties from the tree is very fast depending on what you're doing, and w/ the Variant based DB that I have integrated, if you're extracting and modifying information and adding information, the only overhead is the usual time associated w/ the variant operations. Quote Link to comment
jdunham Posted March 26, 2008 Report Share Posted March 26, 2008 QUOTE (Daklu @ Mar 24 2008, 12:17 PM) ...I'm scared. I understand the concepts of classes but have little pratical experience with them. The discussion alone is over my head much less the code . Maintaining it could be difficult. I'll have to play around with it when I'm ready to rewrite the tree section. Don't be scared. You don't have to do anything with the classes. I modified part of that library and one of the things I did was to clean up the API so that you can use it right out of the box. Maintaining it should be extremely easy. The only thing is that initial reports (see the end of that forum thread) say that it doesn't run as quickly as expected, and AQ apparently hasn't yet had time to figure out why. Quote Link to comment
Daklu Posted March 28, 2008 Author Report Share Posted March 28, 2008 QUOTE (Norm Kirchner) Will the data ever be displayed or was your thought to use the tree to only store the data in a hierarchical format? It will never be displayed as a tree although I will display parts of the tree in combo boxes occasionally. 99% of the time I use the tree I don't care about displaying it. QUOTE (Norm Kirchner) The reason I ask is what is the overhead you're speaking of, because to grab values/properties from the tree is very fast depending on what you're doing, and w/ the Variant based DB that I have integrated, if you're extracting and modifying information and adding information, the only overhead is the usual time associated w/ the variant operations. [WARNING - Long boring application details follow] This will be a little easier to explain if I give you an example of the data heirarchy and explain a little bit about how I implemented the application. One particular data file type I work with is organized into the following 'filters.' (I'll explain the naming in a minute.): Test Number -> Speed -> Angle -> Offset Each data file may have multiple Test Numbers, each of which may contain multiple Speeds, each of which may contain multiple Angles, each of which may contain multiple Offsets. It is important that I am able to have cousins with the same name, since I will have (for example) data with the same angle but different speeds. There will be several data points associated with each unique filter combination. When I load the data I decode a 'tag' value recorded with each data point to determine what filter combination that data point belongs to. I group all the points with identical filter combinations into an array and bundle that array with a tree representation of the filter combination. This cluster I refer to as a 'Level 1 Element.' (Creative naming, I know.) I store all the data that has been loaded in an array of L1 elements. (Since multiple files of the same type of data are loaded, I prepend the filters above with a file alias filter to uniquely identify the data file, but we'll ignore that.) My application is for data analysis. For example, I'll take a set of data and calculate the min, max, mean, and st dev of the error. The data set may include a single L1 element, or it may include an entire data file. Users need to be able to choose how much data to include in the data set when displaying data and calculating metrics. I populate combo boxes with appropriate filter values and users use the combo boxes to select the data they wish to view, hence 'filters.' Undoubtedly some of the overhead has to do with poorly written code and bad architecture on my part. These may not apply to your code, but some of the inefficiencies I've encountered are: Since I couldn't store a tree as a native type in the L1 element I had to store it as a variant. But, since you can't convert a tree directly into a variant and maintain the structure, I built a pair of vis to convert between a variant and a tree control. Any time I want to operate on a tree I have to convert it into a tree before I do anything. Let's say I have a data file with 10,000 data points and there are 100 unique filter combinations in this data set. With each data point I have to search through my L1 elements to find the one this point should be added to. On average I will have to search through 50 L1 elements before I find the one I want. That means I have to convert a variant to a tree 500,000 times just to load that data. Every time the user changes a filter combo box I have to do a variant -> tree conversion for each L1 element as I search the data store for the right data. (My initial implementation attempt stored the data heirarchically instead of as an array with a unique filter combination. That would have improved data searching efficiency but it got very confusing very quickly.) All the operations I'm interested in require a reference to the control. As I understand it reference operations are inherently slower than operating directly on data. I might have to build thousands of individual tree branches for a given data set. It appears to add up quick. Simply deleting all the nodes in a tree seems to take an unusually long time. (Which I have to do often seeing as how 'Reinit to Default' only changes the selected value in the tree and doesn't alter the structure at all.) My tree nodes are defined by the path, not by the node value. Due to my kissing cousins (cousins with the same name) knowing a node string doesn't do me any good unless I know the entire path. Because the tree control doesn't allow identical tags they have no value to me beyond using them for add/delete/etc. operations. I spend a lot of time finding the tag <-> string relationships, especially when checking a tree to see if a branch exists. I'll be the first to admit I didn't spend a lot of time looking through your api. Perhaps I dismissed it too quickly without giving it a fair shake. I'll take another look at it. With the unique requirements for my tree my hunch is that a simple, application-specific tree will work better than shoehorning it into a multipurpose tree api. (Of course, my hunches often get me in trouble...) QUOTE (jdunham) Don't be scared. You don't have to do anything with the classes. I have looked at it since my last post and it clearly belongs in the deep end of the pool. (Where'd I put my floaties?) Assuming I could decipher it enough to use it, I'm not sure that implementation would work since it appears to be a binary tree. Features like auto-balancing would also really mess me up. QUOTE (jlokanis) Let us know how you end up solving the problem! This is what I've cooked up so far. It is far from a general tree implementation but I think it will do what I need. I haven't tested it much nor benchmarked it yet so I don't know if it is better than what I've got. Changing the tree implementation is fairly major application surgery; I'll need to test the various solutions before weilding the knife. Download File:post-7603-1206578444.zip Quote Link to comment
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.