Posts posted by drjdpowell

  1. I just noticed that, although the OpenG version of “Wait (ms)” has an optional input for an Occurrence to use to Abort the wait, the complementary version of "Wait Until Next ms Multiple” does not. I suggest modifying this VI to also accept an optional Abort Occurrence. Here is a modified version I just made:

    [attached image: post-18176-0-08404200-1341407753.png]

    [attached image: post-18176-0-71279200-1341407763_thumb.p]

    [attached VI: Modified Wait Until Next ms Multiple__ogtk.vi]
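    (For readers outside LabVIEW, here is a rough C analogue of what the modified VI does, with a pthread condition variable standing in for the LabVIEW Occurrence; all names here are hypothetical and not part of the OpenG API:)

        #include <errno.h>
        #include <pthread.h>
        #include <time.h>

        /* Sleep until the next multiple of `ms` on the wall clock, but wake
           early if the abort condition is signalled. Returns nonzero if the
           wait was aborted (like an "Aborted?" output on the VI). */
        int wait_next_ms_multiple(long ms, pthread_mutex_t *mu,
                                  pthread_cond_t *abort_cv, const int *aborted)
        {
            struct timespec now, due;
            clock_gettime(CLOCK_REALTIME, &now);
            long long t = (long long)now.tv_sec * 1000 + now.tv_nsec / 1000000;
            long long due_ms = (t / ms + 1) * ms;        /* next ms multiple */
            due.tv_sec  = due_ms / 1000;
            due.tv_nsec = (due_ms % 1000) * 1000000;

            pthread_mutex_lock(mu);
            int rc = 0;
            while (!*aborted && rc != ETIMEDOUT)
                rc = pthread_cond_timedwait(abort_cv, mu, &due);
            int was_aborted = *aborted;
            pthread_mutex_unlock(mu);
            return was_aborted;
        }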

    — James

  2. That's the point. IF you are going to make it viewable by other applications, they inherently assume the encoding given by the pragma call (PRAGMA encoding;). SQLite's default encoding is UTF-8, but you can set it to others, so text in the DB "should" be in one of the defined encodings (none of which LabVIEW supports natively). If, for example, Chinese characters are inserted (which, in LabVIEW, are MBCS) then they will not display correctly in other apps.

    Yes, but can I be certain that the string the User provides is actually meant to be interpreted in LabVIEW's standard encoding? Strings can be anything; LabVIEW only really applies an encoding for display purposes. The User could already be working with UTF-8 or any other encoding, and applying the so-called “String-to-UTF8” function would scramble that.

  3. I wouldn't worry too much about performance to begin with. Getting everything mapped out and functioning is (IMHO) more important, since the optimisation does not prevent its use and can take a while due to it being an iterative process (this can be achieved with each stable release).

    Hi Shaun,

    I agree (which is why I hadn’t spent much time on benchmarking till recently). Getting a working implementation in OpenG is more important, as optimization can happen later. And SQLite is very valuable even when less than 100% optimized; I’ve written a couple of applications with it so far and the speed is not an issue.

    If you are looking at making it directly compatible with other apps for viewing, you will need to insert using the "String to UTF8" and recover using the "UTF8 To String" VIs, as the methods Matt and I use do not honor this. UTF8 Conversion

    I’m not sure I want to make that conversion an implicit part of the API. Users may want the full UTF-8 (which I don’t think is recoverable once it goes through "UTF8 to String”). And if they are using regular LabVIEW text (ANSI, I think) then it is a subset of UTF-8. I think it is better to document that the SQLite character encoding is UTF-8 and that ANSI is a subset, and let the User deal with any issues explicitly. Perhaps I should include those conversion primitives in the palettes.

    — James

  4. If I saved the number with the variant interface it will not be stored as a null (that's why NaNs are flattened)...

    There are differences between blob and text, but I think they're more meaningful when your language uses C-style strings...

    I’m thinking about interoperability with other programs (admittedly, not the most common use case) that don’t use flattened NaNs and the like.

    When I read text, the CLN's return value is a string; if that string's length doesn't match the expected number of bytes (which can only happen if it contains \0) then I reread it using the MoveBlock method. So if the string doesn't contain \0 I can read it faster, but if it does, mine is slower. This optimization is the reason my select is faster than Shaun’s.

    I’m surprised that would be faster than the MoveBlock method (but I’ve not benchmarked it).
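    (If I follow, the check behind that optimization amounts to comparing strlen() against sqlite3_column_bytes(); a minimal C sketch:)

        #include <sqlite3.h>
        #include <string.h>

        /* The cheap C-string read stops at the first \0 byte, while
           sqlite3_column_bytes() reports the true length. If they differ,
           the value has embedded \0s and must be re-read in full (the
           MoveBlock path); otherwise the fast string read was complete. */
        int text_needs_reread(sqlite3_stmt *stmt, int col)
        {
            const unsigned char *p = sqlite3_column_text(stmt, col);
            int n = sqlite3_column_bytes(stmt, col);
            return p != NULL && (int)strlen((const char *)p) != n;
        }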

    I would suggest using Byte Array to String instead of Type Cast; they're the same speed, and Byte Array to String will error if its input type gets changed somehow.

    Good catch; I don’t know why I didn’t use Byte Array to String there.

    Just seems like a lot of work to fix a rare, non-critical bug. And locking every function seems like it'll have a performance hit. Personally I would just add something to the documentation that the error description can potentially be wrong if two errors occur nearly simultaneously, and not worry about it. The hard part in my mind is verifying that whatever you did actually fixed the bug. For now I would suggest adding the errmsg (it's really helpful with syntax errors) and making the race-condition fix a low priority.

    I think you’re right. I’ll get the error message added.

    Thanks,

    — James

  5. Mine handles null fine; how it handles it depends on the mode it's in. In string mode it gets zero-length strings, in variant mode the variants are null, and the newer typed readings depend on the particular type for that column.

    What I mean is, if you take your null variants from variant mode and try to cast them to a number, the “Variant to Data” node will throw an error. Your other two modes specify the type when getting the column, as mine does, allowing SQLite to do the Null conversion.

    SQLite text can hold binary data just like LabVIEW strings. In mine, blobs are typically used to hold flattened LabVIEW data (although they don't have to be). As far as SQLite is concerned there is very little difference between a blob and text; the only things that come to mind are type affinities and blob I/O.

    Yeah, but why does SQLite, which is very economical in numbers of types, bother making separate types for TEXT and BLOB? Must make a significant difference somewhere. Remember, I want to remain compatible with non-LabVIEW programs, which may have their own constraints on valid TEXT data. Binary data is NOT valid UTF-8 or UTF-16 data.

    I do have an eye towards eventually implementing BLOB I/O. Other differences between TEXT and BLOB are the collation functions and sort order.

    I have an optimization where I assume TEXT rarely contains \0 when reading columns, but that's not a functionality difference.

    Could you explain this? I don’t see how \0’s have any effect. I extract TEXT or BLOBs as strings with this code, which is unaffected by \0’s:

    [attached image: post-18176-0-59422300-1341135301_thumb.p]

    The only way I know of that might work is to have the VIs for savepoints/begin start the lock, and the VIs for commit/release release the lock. Otherwise the user cannot compose multiple SQL commands together without the risk of parallel work screwing it up. Since SQLite is by default in serialized threading mode, I'm not sure that setup would even gain any protection. There's only so much you can do to protect programmers from themselves. With yours, what would happen to the data output if someone made a copy of the statement object and tried to process the two of them in parallel? I think you'd get a really nasty race condition, and I'm not sure there's a good way to stop that from happening. I've been meaning to add non-copyable class wires to the idea exchange for stuff like that, but I never really fleshed out the design in my head.

    A good point about the Statement, but a User could be running multiple statements from the same connection. I only need to lock the connection from function execution to query of the extended error code in order to be sure I get the correct code.

    Shaun's and mine are both highly optimized, so it'll take some work to catch up to them. I would suggest either inlining the majority of your VIs (mine does this) or using subroutine priority (Shaun's does this) as the first optimization to try.

    To get in OpenG I have to be 2009 compatible, which means no inlining. And I think OpenG frowns on advanced optimizations (or even turning off debugging) so I may be stuck here.

    — James

  6. I use the type of the data within SQLite to determine how to read it. When you use "Variant to Data" with a variant containing a 64-bit int (as read from SQLite) it can be converted into a 32-bit int without error (as can a double to a single). So I store ints (all sizes), singles, doubles, and strings as their related SQLite types, empty variants as nulls, and everything else (including NaNs) as flattened data. As mine is written, anything saved via variant, when read back as a variant, will "Variant to Data" back to its original type without loss of data, which handles all the use cases I could think of. NaNs being flattened was the only iffy part about it. I don't think variant support is critical, but with the way my interface works it gives some advantages.

    If I’m imagining it right, your package goes the route of getting SQLite to serve as a LabVIEW data-type repository. I would guess you could abstract away the details of the SQLite loose typing system from the User entirely, making it simpler to learn.

    I went a different route of defining an explicit boundary between the two type systems, with the User dealing directly with the SQLite types. So my BIND VIs refer to SQLite3 types (Text, Blob, Real, Integer) while my GET COLUMN VIs refer to LabVIEW types (string, DBL, I64). On its side of the boundary, I let SQLite store things as it wants, including empty strings, NaNs, and NULLs. This, I think, is an advantage if I were to need to interact with an SQLite database created by another program; can your package handle reading data from columns where some rows are NULL?
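    (As an aside, the per-row NULL check is cheap at the C level via sqlite3_column_type(); a minimal sketch, where mapping NULL to NaN is just one possible choice, not something either package necessarily does:)

        #include <math.h>
        #include <sqlite3.h>

        /* Read a DBL column in which some rows may be NULL,
           mapping NULL to NaN rather than SQLite's default 0.0. */
        double read_dbl_or_nan(sqlite3_stmt *stmt, int col)
        {
            if (sqlite3_column_type(stmt, col) == SQLITE_NULL)
                return NAN;
            return sqlite3_column_double(stmt, col);  /* coerces if needed */
        }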

    How do you handle the fact that LabVIEW strings can represent either ANSI text or binary data? The former maps to SQLITE_TEXT, while the latter maps to SQLITE_BLOB. Do you store all strings as TEXT?

    Mine is a lower-level approach, I think, which has tradeoffs of greater flexibility but also greater required knowledge to use. Fortunately SQLite has excellent documentation.

    On one of my benchmarks where I write a bunch of string data to the database I'm 12% slower if I pass the path in on just the bind text CLN. It should be worse with numeric data (since those have far less overhead). I remember it being worse the last time I checked so I guess some optimizations were made since then.

    I need to recheck my benchmark, as 25 ns seems unbelievably fast even to me.

    I just assumed that the user (me in my case) would never use the same database connection in two places at once. The possible bugs from that are far worse than the rare incorrect error description, so I just considered it bad practice and don't try to deal with its problems.

    I’m trying to get this into OpenG, though, and don’t want rare race conditions. I could get around this issue using DVRs or some other lock, but that’s some effort, so I’m putting it off until I fully understand all issues.

    It's not the LVOOP; mine uses LVOOP and is faster than Shaun's. On Shaun's benchmark with 100,000 points, his is 181.95 (insert) and 174.82 (dump); mine is 155.43 and 157.77.

    I realized a problem in using my Example1 as a benchmark. Fixing that, I’m still slower than Shaun by about 40%. I need to see if I can improve that.

    — James

    In dealing with an oversight I found in the lvsound2 library recently, I was experimenting with forcing a DLL to unload unilaterally. Unfortunately, it does not seem this is possible.

    In the context help for the “Specify path..” checkbox, there is a “Tip” that seems to indicate the ability to unload a path-referenced DLL:

    [attached image: post-18176-0-64655900-1341085350.png]

  7. I’ve been benchmarking it (by just running a “Bind” in a loop), and using the path adds about 25 nanoseconds per CLN. Haven’t figured out yet why my code seems to be slower than Shaun’s (hope it’s not the LVOOP :) ).

  8. If I had to guess, using an unchanging diagram path adds maybe 100 µs, maybe a bit more, but compare that to the overhead of the Call Library Node itself, which is in the range of single µs. Comparison of paths for equality is the most expensive comparison operation, as you can only determine equality once you have compared every single element and character in them.

    A quick test using my “Example1” shows that I can INSERT 100,000 points, each involving 4 calls with a diagram path, in 0.75 seconds (this time does not include the “COMMIT” to disk). That’s less than 2 microseconds per CLN. So the overhead of the diagram path can’t be that much. Though if it is a significant fraction of the 2 microseconds, then it will be good to eventually get rid of it.

    — James

    Added later: I had a look at ShaunR’s “SQLite_Speed Example.vi” which INSERTs pairs of strings: he can INSERT 100,000 in 0.36 seconds, half my time. So perhaps I will look into statically specifying the library. Wish I could specify it in one place, though. One thing a User might want to do is have a different SQLite3 version (compiled with different options, for example) for different applications, and statically specifying the library for each CLN makes that problematic. Is there any way to specify the path at runtime, but do it only once? Or at compile time, but specify it in only one place?

  9. Hi Matt, thanks for bringing your experience to this.

    Handling Variants can be done (mine handles them) but there are several gotchas to deal with.

    SQLite's strings can contain binary data like LabVIEW strings. It looks like your functions are set up to handle the \0's in text, so that's not a problem. So you can just write strings as text and flattened data as blobs, then use the type of the column to determine how to read the data back. The next trick is how to handle NULLs. As your code is written now, NaNs, empty strings, and NULLs will all be saved as SQLite NULLs. The strings are null because the empty string is passed as a 0 pointer to bind text. So when you have an empty string you need to set the number of bytes to 0 but pass in a non-empty string. I never figured out an ideal solution for NaNs. Since I treat NULLs as empty variants I couldn't store NaNs as NULLs. The way I handled NaNs was to flatten and store them as blobs. I would also flatten empty variants with attributes instead of storing them as NULLs (otherwise the attribute information would be lost). Be aware of the type affinity since that can screw this up.

    It was my feeling that there is no clean way to directly connect SQLite3’s loose typing system with LabVIEW variants. One could make a system similar to the OpenG Variant Config VIs, where one inputs a cluster to define the datatypes to read in, but a straight “Get Column as Variant” seems to have too many gotchas to be worth it. If one did want to store arbitrary LabVIEW datatypes in SQLite, one could just flatten the data and store as BLOB, but I thought that option could be left outside the scope of the package.
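    (For reference, a minimal C sketch of the empty-string binding gotcha Matt describes above; the wrapper name is hypothetical:)

        #include <sqlite3.h>

        /* sqlite3_bind_text() treats a NULL pointer as SQL NULL, so an
           empty string must be bound with a valid pointer and zero bytes. */
        int bind_text_or_null(sqlite3_stmt *stmt, int idx,
                              const char *s, int nbytes)
        {
            if (s == NULL)
                return sqlite3_bind_null(stmt, idx);
            /* nbytes may be 0: "" with a non-NULL pointer binds empty TEXT */
            return sqlite3_bind_text(stmt, idx, s, nbytes, SQLITE_TRANSIENT);
        }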

    I like how you used property nodes to handle the binding and reading of different types. If you don't mind I might try to integrate that idea into my implementation.

    Please do. I have wondered if it is a good idea to make functions like Step or Finalize also available as Property nodes, as that would allow more compact code in many cases (though as these functions aren’t really “properties” that might be confusing).

    If you want to improve the performance, passing the DLL path to every CLN can add a lot of overhead to simple functions (at least when I last checked).

    Is that true? I wouldn’t have thought so, but I have never tested it. The advantage of passing the DLL path is that one can alter it easily. Do you have any performance data from your system that I could compare to?

    If you're executing multiple statements from one SQL string, you can avoid making multiple string copies by converting the SQL string to a pointer (DSNewPtr and MoveBlock; be sure to release with DSDisposePtr even if an error has occurred). Then you can just use prepare_v2 with pointers directly.

    I realized this after I did it. But I don’t want to introduce “pointers” into any public API function like “Prepare”. I am considering making an alternate, private version of “Prepare” that uses a pointer in this way to allow higher performance in VIs like “Execute SQL".
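    (For reference, the C-level pattern being described: sqlite3_prepare_v2's pzTail output points at the unparsed remainder of the SQL string, so multiple statements can be walked without substring copies. A sketch with error handling trimmed:)

        #include <sqlite3.h>

        /* Prepare and run every statement in one SQL string in turn. */
        int exec_all(sqlite3 *db, const char *sql)
        {
            const char *tail = sql;
            while (tail != NULL && *tail != '\0') {
                sqlite3_stmt *stmt = NULL;
                int rc = sqlite3_prepare_v2(db, tail, -1, &stmt, &tail);
                if (rc != SQLITE_OK)
                    return rc;
                if (stmt == NULL)            /* trailing comment/whitespace */
                    continue;
                while ((rc = sqlite3_step(stmt)) == SQLITE_ROW)
                    ;                        /* discard any rows */
                sqlite3_finalize(stmt);
                if (rc != SQLITE_DONE)
                    return rc;
            }
            return SQLITE_OK;
        }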

    You might want to add the output of sqlite3_errmsg to "SQLite Error.vi" I've found it helpful.

    On the “to do” list. Slightly tricky because of the need for mutexes described in the documentation:

    "When the serialized threading mode is in use, it might be the case that a second error occurs on a separate thread in between the time of the first error and the call to these interfaces. When that happens, the second error will be reported since these interfaces always report the most recent result. To avoid this, each thread can obtain exclusive use of the database connection D by invoking sqlite3_mutex_enter(sqlite3_db_mutex(D)) before beginning to use D and invokingsqlite3_mutex_leave(sqlite3_db_mutex(D)) after all calls to the interfaces listed here are completed."

    — James

  10. This is where the potential problem lies. Its API is too simple and loses robustness. I assume Redeem Future Tokens.vi blocks execution waiting until the Future is filled. What happens if a Future is never filled? Maybe an actor has an error and shuts down after the message is sent but before it has a chance to fill the Future. How do you tell the async subvi to quit waiting so your application has a chance to recover? Granted, the example is simple enough that the correct behavior can be verified visually, but in a larger application that won't necessarily be true.

    I was just going to use the Timeout, which would throw an error message.

    Off the top of my head I can think of two options if you don't want to poll the Future:

    1. Use a "fail and notify" system like I mentioned earlier. The async subvi waits for a finite amount of time for the data, and if the data isn't ready it automatically fails and alerts the Requestor of the failure.

    Simpler to throw the error message downstream to the Consumer. One could add another input for a queue to send the error messages, but I’m thinking of going the simple route. If the Consumer is a standard Actor design of mine, it will publish received error messages, and Requestor can register for error messages if it wants them.

    2. Promote the async subvi to an actor by implementing a message handler. Add an "Exit" message so it can accept an external shutdown message. Implement code tracking the progress of the Futures through the application so someone can figure out whether or not they need to send an Exit message to the helper actor.

    I have a “Cancel Future” VI that can be applied to invalidate the future tokens if one needs this. This immediately causes the helper to error out and shutdown, having the same effect as an “Exit” message. If the VI hierarchy that created the futures goes idle, that will also invalidate the queue references inside the futures and shut the helper down. So “Exit” functionality is already there if you want it and there is an automatic exit feature. Otherwise there is the timeout.

    Both options require adding more overhead code to manage the process. I think overall Option 1 requires roughly the same amount of code as having the Requestor receive by-value responses from the actors, while at the same time obscuring the code's intent and execution flow a bit more. Option 2 seems to require a lot more code for little benefit. I'd have to have a really good reason for choosing that option.

    But the helper is reusable. Once it works, I don’t care how complex it is internally because no-one needs to look inside it. And I only have to write it once; “Requestor” is code that needs to be written for each application. Instead of internal complexity, I care about the clarity and simplicity of the API.

    If you also route the "requestor" requests through the "Wait on Responses" (no need for your dotted line then) then you end up with the "Dispatcher" that I've been describing.

    I had meant to ask you if your framework supports replies to messages. I would imagine it would if your messages are of the form “Target->Sender…” and can easily be reversed. But can your dispatcher gather replies into ordered groups?

  11. The helper actor I’m thinking of would be fully generic and reusable; it would be dynamically launched and configured with an array of Futures, and index over them to get the array of messages. Its API would be very simple.

    So I took the time to actually do it. Reworked the prototype “Futures” implementation I mentioned at the start of this conversation so that it had a helper actor.

    [attached image: post-18176-0-16493400-1340204655_thumb.p]

    The above code implements this diagram (though I didn’t make the “Requestor” a message handler, it could be):

    [attached image: post-18176-0-79753100-1340204689_thumb.p]

    Note the random delays in the three Actors; the reply messages are sent in arbitrary order, yet the set of messages received by the Consumer are always ordered A, B, C.

    The “helper actor” (not really a full actor, just an async subVI) is quite simple (though I have yet to complete error handling):

    [attached image: post-18176-0-80263800-1340205280.png]

    “Redeem Future Tokens.vi” both waits for the futures to be filled, and destroys the Future Token (internally, the Future is a single-element queue). This deliberately makes it impossible to use polling on the Future.
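    (In text-language terms, a fill-once/redeem-once future like this might be sketched in C with a condition variable standing in for the single-element queue; purely illustrative, not the actual LabVIEW implementation:)

        #include <pthread.h>
        #include <stdlib.h>

        /* A single-slot future: filled once by the replying actor,
           redeemed once by the consumer (redeeming destroys the token). */
        typedef struct {
            pthread_mutex_t mu;
            pthread_cond_t  cv;
            int             filled;
            void           *value;
        } future_t;

        future_t *future_create(void)
        {
            future_t *f = calloc(1, sizeof *f);
            pthread_mutex_init(&f->mu, NULL);
            pthread_cond_init(&f->cv, NULL);
            return f;
        }

        void future_fill(future_t *f, void *value)
        {
            pthread_mutex_lock(&f->mu);
            f->value  = value;
            f->filled = 1;
            pthread_cond_signal(&f->cv);
            pthread_mutex_unlock(&f->mu);
        }

        void *future_redeem(future_t *f)    /* blocks until filled */
        {
            pthread_mutex_lock(&f->mu);
            while (!f->filled)
                pthread_cond_wait(&f->cv, &f->mu);
            void *v = f->value;
            pthread_mutex_unlock(&f->mu);
            pthread_mutex_destroy(&f->mu);  /* single-use: destroy the token */
            pthread_cond_destroy(&f->cv);
            free(f);
            return v;
        }

    A timed variant of the wait would give the Timeout behavior, and an array of such futures is what the helper indexes over to produce the ordered set of messages.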

    — James

  12. I would say the "real value" of Futures is in the abstract concept. It's the idea of creating and holding an object that doesn't have the value you need yet, but it will sometime in the future.

    Yeah, but how useful is that in LabVIEW?

    The basic use for “futures” in step-by-step text languages is very similar to the dataflow already present in LabVIEW. Only once we’re talking about message-handling loops does a “future” become interesting, and in that case it’s hard to see how useful they are when we’re already using asynchronous messaging. In your example application, it’s only the fact that you need multiple TransformData messages for only one DoCalc message that makes the futures solution interesting. It’s that you can pass an array of futures to DoCalc, and thus gather your four separate TransformData responses together, that is something you can’t otherwise do as easily.

    Other than that you still have to write all the same functional code that you would if you had the Consumer collecting the responses, and Futures become unnecessary. (Unless, as you pointed out, there was some metadata associated with the responses that couldn't be retained in the response itself or reasonably communicated to the helper via a message.)

    Not really. The helper actor I’m thinking of would be fully generic and reusable; it would be dynamically launched and configured with an array of Futures, and index over them to get the array of messages. Its API would be very simple.

    ...and messaging Futures is a way to get the functional equivalent without implementing query/response logic, in certain situations.

    I noticed that your futures were very similar to the “message reply” system I use. I attach a “return address” to the message, and you attach the future. Both allow responses to be directed to arbitrary recipients. Though with futures, the recipient has to be written to specifically handle futures, while with replies it’s just an ordinary message.

    — James

  13. I was thinking about this a while last night, and I wondered if the real value of “futures” is in defining an ordered grouping of otherwise independent asynchronous messages. Imagine, for example, that one process needs to make requests of several other processes, with responses to these requests being dealt with all at once.

    [attached image: post-18176-0-80164700-1340109411.png]

    The problem here is that the Response messages come individually and in any time order, meaning that “Consumer” needs to have logic to identify and store the messages, and determine when all are available.

    The advantage of using an array of Futures here (passed between Requestor and Consumer) is the very fact that it is an array; it is grouped and has a defined order. Thus Consumer need only index out the elements of this array and need not have any complex logic.

    The array of Futures serves to predefine a grouping of multiple asynchronous messages that have yet to be sent.

    As is, the Futures have the downside of requiring potential blocking or polling in Consumer. However, this can be avoided by using a small helper process that is dedicated to waiting on the array of Futures and forwarding the resulting array of messages:

    [attached image: post-18176-0-33235000-1340110037_thumb.p]

    Note that the “Wait on Responses” Actor is serving to group and order the messages, before passing them to the Consumer. Requestor makes a set of requests, and Consumer receives a corresponding set of responses.

    — James

  14. Hi Daklu,

    A comment:

    If your “Model” was a complex, multi-loop construct like your last post’s diagram, it is possible that you might put your future-filling logic (“TransformData”) in a different loop than the future-redeeming logic (“DoCalc”). It would then be possible for the future to be redeemed before it is filled, which for your DVR design would return default data, followed by an “invalid DVR" error message from “TransformData". A “future” based on a Notifier would instead just block momentarily if this happened, and would be a much more widely applicable construct because of that. Your DVR future can only be used in cases where it is filled and redeemed in the same loop, or where it can otherwise be assured of being filled before it is redeemed.

    — James

  15. 1) Isn’t the point of futures to block (not poll), just like synchronous operations, but to delay the block until the information is needed, rather than when it is requested? It’s “lazy blocking”. And very similar to standard LabVIEW data flow (blocking till all inputs available).

    The information I read about them described them like an IOU in a "I don't have the data now so I'll give you this instead, and later on you can turn it in to get the data" kind of way. The point was to avoid blocking, not just postpone it. That said, I've not used them in any other language nor seen how they are implemented, so what do I know? For all I know what I implemented aren't really futures.

    If something takes 10 ms and one delays blocking for 11 ms, then one has avoided blocking altogether. I hadn’t appreciated, though, that you are filling your futures in the same message handler that is redeeming them, and thus in your case there is no possibility of ever actually blocking on the redeeming of the futures. Clever, and I can’t think of a cleaner way of doing it.

  16. Re: 2) Yes, it is easier to code than watching for all the messages to come back. I wonder, though, if it might also be easier to design a "round robin" message: create a message with a list of processes to visit, send the message to the first one, it adds its info, then passes the message to the next process on the list, coming back to the original process when it is done. That would reduce the "do I have them all yet" bookkeeping and still be consistent with asynch messaging. I've never tried to build anything like that.

    A “round robin message” would work, but would be serial, rather than parallel. And I suspect a “Wait on all Futures” actor would be just as simple.

    The information I read about them described them like an IOU in a "I don't have the data now so I'll give you this instead, and later on you can turn it in to get the data" kind of way. The point was to avoid blocking, not just postpone it. That said, I've not used them in any other language nor seen how they are implemented, so what do I know? For all I know what I implemented aren't really futures.

    I suspect we all have somewhat different ideas about what “futures” might be. My first reading on futures was some webpage (which I can’t find again) that gave a pseudocode example like this:

    future A = FuncA()
    future B = FuncB()
    ... do something else ...
    C = FuncC(A, B)

    Here FuncA and FuncB run in parallel, and the code blocks at the first use of the results. Note that we can already do this kind of stuff in LabVIEW due to dataflow.

  17. Thoughts on Futures, as I understand them (and without reexamining Daklu’s implementation):

    1) Isn’t the point of futures to block (not poll), just like synchronous operations, but to delay the block until the information is needed, rather than when it is requested? It’s “lazy blocking”. And very similar to standard LabVIEW data flow (blocking till all inputs available).

    2) One use of Futures I can think of is if I wish to request information from several processes, and perform some action only when I receive all replies. I can send all the requests and pass the array of futures to a spawned “Wait on all Futures” process/actor that sends a single bundled-reply message back to the original process when all the futures are filled. This would be much easier than having to record each reply and checking to see if I have all of them.

    — James

  18. I want them to work with both. Do I have to use references inside my delegates, or is bundling/unbundling inside IPE structures enough? There's so much to learn... :lightbulb:

    It will only work with by-value objects in limited cases. Every wire branch leads to a separate object. For example, in your last attachment, you have an “ImplementorInit” VI that returns five entirely independent “Implementer” objects: two in Child 1 (at parent and child levels), two in Child 2, and an overall object that holds Children 1 and 2. If these were five references to a single by-ref object then you would be able to work on that object from any of your methods. But by-value, you are working with different objects; changing one has no effect on the others.

    In the code I posted, I’m trying to keep all the by-value objects together, with no copies, so any method in one interface can call methods on any or all of the other interfaces. Child2 can access and modify Child1. Child1 can access and modify Child2.

    — James

    BTW: Daklu has an interface framework in the code repository. He uses a by-ref object (using a DVR).
