
Feedback Requested: Daklu's NI Week presentation on AOD


Daklu


What other kinds of challenges would you like to see?
Well, I am struggling with a few actor programming challenges.

You state in your presentation that the message handler should always be waiting. Therefore, it needs to hand off action to another actor or a helper loop ASAP when it receives a message. This sounds a lot like the producer part (UI Event loop, for example) of a standard producer-consumer architecture. The goal is to always be ready for the next message. In command pattern message systems like AF, what do you do if the message says to do something that takes an unknown amount of time? For example, what if the message requires the actor to access a database? That database call can take a significant amount of time depending on what the call is doing and if there are retry-able errors. Should that be handed off to a helper loop? Or should it be handed off to another actor?

If you say helper loop, then do you create a helper loop for every potentially slow action that acting on a message can instantiate? That seems like a lot of loops that could be hard to maintain.

If you say, create a DB handler actor, then what is there to stop that actor from backing up with requests for database calls if the database is running slow? (a very common real-world occurrence)

My next challenge is dealing with state. Actors in AF (and in other actor based implementations) often have state data associated with them. This can be set at initialization or can be changed by a message at some point in the life of the actor. This state data can also affect how an actor responds to a message. In the example above with the database, let’s say an actor is sent a message that says 'do this action' but that action requires information contained in a database. The actor needs to get the data before it can fully process the message. How do we best handle this?

Do we do the DB call in-line and wait for the data, then take the requested action? That would block the actor.

Do we hand the DB call off to the helper loop or DB handler actor and have it populate the state data with a later message that then also triggers the action?

What if the actor receives another message that depends on that state data from the database but the helper loop/DB actor has not returned it yet? Do we save that message for later? Do we ignore it? Seems complicated.

The last challenge is how to get state machine behavior out of an actor. If you have a series of steps that need to happen and are set off by a specific message (like 'run this script'), how do you construct a set of actors and helper loops to accomplish that? You want your actors to not be blocked. You want to be able to interrupt the process at any point with an outside message. You want to be able to react to state changes between messages and change the order of the steps (say, the script has a 'goto' or an abort or pause step in it). So, you do not want to queue up a bunch of messages like your favorite QMH. But you do need to have the system react to changes and take appropriate actions. Does all of this belong in a single actor with helper loops or a set of actors or something else entirely?

Boiling this down to some fundamental questions:

How do you deal with data dependencies between messages (state)?

How do you decide when to in-line, when to make a helper loop and when to create another actor?

How do you implement state machine behavior in an actor based architecture?
Link to comment
Maybe one way to go is to show a simple implementation of a particular concept in each AOP style: AF, AA, QSMLH (queued-state message-loop handler).

 

I can do the AA implementation, but I don't want any part of a QSMLH implementation.  Maybe someone from NI would do it? :shifty: (Don't forget the JKI-SM.)

 

 

You state in your presentation that the message handler should always be waiting. Therefore, it needs to hand off action to another actor or a helper loop ASAP when it receives a message. This sounds a lot like the producer part (UI Event loop, for example) of a standard producer-consumer architecture. The goal is to always be ready for the next message. In command pattern message systems like AF, what do you do if the message says to do something that takes an unknown amount of time? For example, what if the message requires the actor to access a database? That database call can take a significant amount of time depending on what the call is doing and if there are retry-able errors. Should that be handed off to a helper loop? Or should it be handed off to another actor?

 

Yes, that task should be handed off to either a helper loop or a sub actor.  Which one you choose depends on how much functionality that logical thread needs.  If it’s just waiting for a response from the db, will forward it to your MHL, then exit, I’d say make it a helper loop.  If instead it’s an abstraction of the database connection, I’d probably make it a full-blown actor.
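
Since LabVIEW block diagrams cannot be shown as text, here is a minimal Python sketch of the helper-loop hand-off described above. It is only an illustration under stated assumptions: `slow_db_query`, the message names (`RunQuery`, `Ping`, `QueryResult`, `Exit`), and the use of threads and queues in place of LabVIEW loops are all hypothetical stand-ins, not part of any framework mentioned in this thread.

```python
import queue
import threading
import time


def slow_db_query(sql):
    """Hypothetical stand-in for a database call that blocks."""
    time.sleep(0.05)
    return f"rows for: {sql}"


def helper_loop(sql, mhl_inbox):
    """Wait for the database, forward the response to the MHL, then exit."""
    mhl_inbox.put(("QueryResult", slow_db_query(sql)))


def actor(inbox, outbox):
    """Message handling loop that never waits on the database itself."""
    while True:
        name, payload = inbox.get()
        if name == "RunQuery":
            # Delegate the wait so the loop goes straight back to waiting.
            threading.Thread(target=helper_loop, args=(payload, inbox)).start()
        elif name == "Ping":
            outbox.put("pong")
        elif name == "QueryResult":
            outbox.put(payload)
        elif name == "Exit":
            return


inbox, outbox = queue.Queue(), queue.Queue()
inbox.put(("RunQuery", "SELECT 1"))
inbox.put(("Ping", None))  # queued behind the slow request
mhl = threading.Thread(target=actor, args=(inbox, outbox))
mhl.start()
first = outbox.get(timeout=5)   # the Ping is answered first
second = outbox.get(timeout=5)  # the query result arrives later
inbox.put(("Exit", None))
mhl.join()
```

The point of the sketch is only the ordering: the `Ping` is answered while the query is still outstanding, because the waiting happens in the short-lived helper rather than in the message handling loop.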

 

 

If you say helper loop, then do you create a helper loop for every potentially slow action that acting on a message can instantiate? That seems like a lot of loops that could be hard to maintain.

 

I create a helper loop or sub actor for every computation that is potentially slower than it needs to be to meet the project’s requirements.  Is it a lot of loops?  *shrug*  I dunno… what do you consider a lot?  I probably use more loops than most other developers.  Taking a quick look through a smallish RT project I’m working on, the RT code has 17 while loops and the FPGA code has 13 loops.  Is it hard to maintain?  Not if you do a good job assigning responsibilities and designing the API for each loop.  (Incidentally, helper loops can still be dynamically launched, so you don’t need to create 100 static loops on your block diagram just in case you’re waiting for 100 db connections.)

 

 

If you say, create a DB handler actor, then what is there to stop that actor from backing up with requests for database calls if the database is running slow? (a very common real-world occurrence)

 

You are correct—it’s turtles all the way down.  That’s one of the reasons why I make a stink about the difference between a MHL and QMH.  Eventually the actual waiting part will be delegated to a helper loop.  How many layers of actors you need to go through to get to the helper loop depends on your design.

 

 

My next challenge is dealing with state. Actors in AF (and in other actor based implementations) often have state data associated with them. This can be set at initialization or can be changed by a message at some point in the life of the actor. This state data can also affect how an actor responds to a message. In the example above with the database, let’s say an actor is sent a message that says 'do this action' but that action requires information contained in a database. The actor needs to get the data before it can fully process the message. How do we best handle this?

Do we do the DB call in-line and wait for the data, then take the requested action? That would block the actor. Do we hand the DB call off to the helper loop or DB handler actor and have it populate the state data with a later message that then also triggers the action?

 

Most of the time I design systems without having to use request-response messaging, but sometimes it is necessary.  When I need a RR message I usually design them to be non-synchronous.  Actor A sends the request to Actor B, optionally setting an internal flag indicating “I’m waiting for a specific response.”  When a message arrives from Actor B, A checks the flag to see if it's the message he is waiting for and acts accordingly.

 

 

What if the actor receives another message that depends on that state data from the database but the helper loop/DB actor has not returned it yet? Do we save that message for later? Do we ignore it? Seems complicated.

 

I can't tell you what you should do--what behavior do you want that actor to have?  You could discard the message and tell the sender to try again later.  You could store the message in an internal buffer and process it when the first DB call returns.  You could open another database connection (dynamic launching FTW) and have both queries going in parallel.

 

[Edit - If you don't know what you should do with the message then your system design is incomplete.  Step away from the code and go back to your model.  That's where you'll find the answer to your question.]

 

 

The last challenge is how to get state machine behavior out of an actor. If you have a series of steps that need to happen and are set off by a specific message (like 'run this script'), how do you construct a set of actors and helper loops to accomplish that? You want your actors to not be blocked. You want to be able to interrupt the process at any point with an outside message. You want to be able to react to state changes between messages and change the order of the steps (say, the script has a 'goto' or an abort or pause step in it). So, you do not want to queue up a bunch of messages like your favorite QMH. But you do need to have the system react to changes and take appropriate actions. Does all of this belong in a single actor with helper loops or a set of actors or something else entirely?

 

You’re asking about a sequencer.  Like implementing a flow chart with branching logic and whatnot, except you also want interrupts (which incidentally flow charts don’t allow) right?  I’ll describe the basic approach I take to ensure there are no race conditions, but there could be alternative rule sets that are also thread safe. 

 

Let’s say you have designed your process using a flow chart and decided it is correct.  Create a sub actor with a message handler for each process block on your flow chart.  Do not create message handlers for decision blocks.  Do not have message handlers queue up other message handlers.  The sub actor is simply a concurrent thread capable of executing the individual processes required by your flow chart.  It knows about each step in the process, but it has no idea how the steps are connected.

 

Every time a step is finished, the sub actor sends a StepCompleted message to the super actor along with any data that needs to be persisted or is required for branching logic.  The super actor takes that information, figures out what the next step should be, and sends a message and data to the sub actor requesting it to do that step.  Rinse and repeat until your process is complete.

 

The super actor is responsible for knowing what the next step should be; the sub actor is responsible for knowing how to do that step.  Because the super actor’s queue is not clogged up with lengthy flow chart steps, it is free to receive messages from external entities requesting to change the normal sequence of steps.  That’s about as close as you can get to implementing interrupts in a data flow language.
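
As a rough text-based illustration of the sequencer described above, here is a Python sketch. It assumes a made-up two-step flow chart ("Measure" repeats until a limit is reached, then "Report" runs), and the names `sub_actor`, `super_actor`, `run_sequence`, and the `Abort` message are hypothetical; only `StepCompleted` comes from the description.

```python
import queue
import threading


def sub_actor(inbox, super_inbox):
    """Knows how to do each step, but not how the steps are connected."""
    steps = {
        "Measure": lambda data: data + 1,  # one handler per process block
        "Report": lambda data: data,
    }
    while True:
        name, data = inbox.get()
        if name == "Exit":
            return
        super_inbox.put(("StepCompleted", (name, steps[name](data))))


def super_actor(inbox, sub_inbox, limit):
    """Owns the decision logic and stays free to receive interrupts."""
    history = []
    sub_inbox.put(("Measure", 0))
    while True:
        name, payload = inbox.get()
        if name == "Abort":  # external request to change the sequence
            history.append("aborted")
            break
        step, data = payload  # otherwise this is a StepCompleted message
        history.append((step, data))
        if step == "Report":  # last block on the flow chart
            break
        # Decision block: it lives here, not in the sub actor.
        next_step = "Measure" if data < limit else "Report"
        sub_inbox.put((next_step, data))
    sub_inbox.put(("Exit", None))
    return history


def run_sequence(limit, abort_first=False):
    super_inbox, sub_inbox = queue.Queue(), queue.Queue()
    if abort_first:
        super_inbox.put(("Abort", None))
    worker = threading.Thread(target=sub_actor, args=(sub_inbox, super_inbox))
    worker.start()
    history = super_actor(super_inbox, sub_inbox, limit)
    worker.join()
    return history


normal = run_sequence(limit=3)
aborted = run_sequence(limit=3, abort_first=True)
```

In the aborted run the sub actor still finishes the step it was given, but the super actor has already acted on the `Abort` and never requests another step.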

 

 

Boiling this down to some fundamental questions:

How do you deal with data dependencies between messages (state)?

How do you decide when to in-line, when to make a helper loop and when to create another actor?

How do you implement state machine behavior in an actor based architecture?

 

1. I don't understand what you mean by "data dependencies between messages."  Can you give me a use case to help me understand?

2. Any message handling code can be inlined as long as it finishes before the next message arrives.  (Helpful, huh?)  As a rule of thumb, I inline data accessors and decision logic (the brainy stuff) in the MHL.  Lengthy computations, continuous processes, periodic processes, etc. (the brawny stuff) get pushed into helper loops and sub actors.  Deciding between a helper loop or sub actor comes down to how much functionality you want that logical thread to have.  Complex functionality requires an actor; simple functionality gets by with a helper loop.

3. I assume you mean behavioral state machine, and not queued state machine, yes?  I implement BSMs by nesting message handling loops inside a simple state machine.  (If you really meant queued state machine, then my response is "I don't."  They behave badly, why would I want to replicate that?  :P )
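
One possible reading of "nesting message handling loops inside a simple state machine," sketched in Python rather than LabVIEW. The two states (`Idle`, `Running`) and the message names are invented for illustration; each state function is its own message handling loop and returns the name of the next state.

```python
import queue


def idle_state(inbox, log):
    """MHL for the Idle state: only Start causes a transition."""
    while True:
        name = inbox.get()
        if name == "Start":
            return "Running"
        if name == "Exit":
            return None
        log.append(f"Idle ignored {name}")


def running_state(inbox, log):
    """MHL for the Running state: the same messages are handled differently."""
    while True:
        name = inbox.get()
        if name == "Stop":
            return "Idle"
        if name == "Exit":
            return None
        log.append(f"Running rejected {name}")


STATES = {"Idle": idle_state, "Running": running_state}


def run_state_machine(inbox):
    """Outer state machine: each state runs its own inner message loop."""
    log, visited, state = [], [], "Idle"
    while state is not None:
        visited.append(state)
        state = STATES[state](inbox, log)
    return visited, log


inbox = queue.Queue()
for message in ["Stop", "Start", "Start", "Stop", "Exit"]:
    inbox.put(message)
visited, log = run_state_machine(inbox)
```

Because every state has a handler for every message, a message that arrives in the "wrong" state is dealt with explicitly (ignored or rejected here) instead of being left on the queue.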

Link to comment
The cooler example is too involved to quickly understand - the theory would get lost under the wires (unless one already knows AF, in which case simple code can teach the theory just as well as more complex code).

 

I've been poking around in the cooler example again.  IMO, a big reason it is so complicated is because it illustrates so many different concepts and has a lot of abstract classes.  I've started an AA implementation that strips away all the extra stuff and focuses on the basic AOD principles.  I'll post it in a new thread when it's ready.

Link to comment

Yes, that task should be handed off to either a helper loop or a sub actor.  Which one you choose depends on how much functionality that logical thread needs.  If it’s just waiting for a response from the db, will forward it to your MHL, then exit, I’d say make it a helper loop.  If instead it’s an abstraction of the database connection, I’d probably make it a full-blown actor.

Would you dynamically create an actor to handle each DB request and have it destroy itself after it completed and returned the results?

What if you wanted to share a DB connection, to avoid the expense of opening and closing the connection for each transaction?  You could wrap the DB class in a DVR and then have each DB actor use that object.  This would have the effect of serializing your DB calls (if you needed logic to do error retries and restore the connection if it goes bad).  Would that be a bad implementation?  (this could apply to any shared resource like a file or some hardware)

It seems to me that it might be best implemented as an actor or helper that has limited lifespan and is exclusive to the caller.  But, my only concern is cleanly shutting down the system if we try to exit while the DB actor/helper is in the middle of processing.  If we are going to call it an actor, then it is already violating the principle that its MHL is always waiting, right?

 

Most of the time I design systems without having to use request-response messaging, but sometimes it is necessary.  When I need a RR message I usually design them to be non-synchronous.  Actor A sends the request to Actor B, optionally setting an internal flag indicating “I’m waiting for a specific response.”  When a message arrives from Actor B, A checks the flag to see if it's the message he is waiting for and acts accordingly.

I understand this but I am not sure I like it.  I am thinking it might be better design for Actor B to send the data to Actor A along with the next step it should do.  That way Actor A only knows two things: How to ask for data and how to process data when received.  This sounds kinda like what you say below with your sequencer.

 

I can't tell you what you should do--what behavior do you want that actor to have?  You could discard the message and tell the sender to try again later.  You could store the message in an internal buffer and process it when the first DB call returns.  You could open another database connection (dynamic launching FTW) and have both queries going in parallel.

 

[Edit - If you don't know what you should do with the message then your system design is incomplete.  Step away from the code and go back to your model.  That's where you'll find the answer to your question.]

I want to design it such that an Actor is never asked to do something it is not ready to do.  It seems like it might be best to divide the flow logic up between multiple actors.

 

You’re asking about a sequencer.  Like implementing a flow chart with branching logic and whatnot, except you also want interrupts (which incidentally flow charts don’t allow) right?  I’ll describe the basic approach I take to ensure there are no race conditions, but there could be alternative rule sets that are also thread safe. 

 

Let’s say you have designed your process using a flow chart and decided it is correct.  Create a sub actor with a message handler for each process block on your flow chart.  Do not create message handlers for decision blocks.  Do not have message handlers queue up other message handlers.  The sub actor is simply a concurrent thread capable of executing the individual processes required by your flow chart.  It knows about each step in the process, but it has no idea how the steps are connected.

 

Every time a step is finished, the sub actor sends a StepCompleted message to the super actor along with any data that needs to be persisted or is required for branching logic.  The super actor takes that information, figures out what the next step should be, and sends a message and data to the sub actor requesting it to do that step.  Rinse and repeat until your process is complete.

 

The super actor is responsible for knowing what the next step should be; the sub actor is responsible for knowing how to do that step.  Because the super actor’s queue is not clogged up with lengthy flow chart steps, it is free to receive messages from external entities requesting to change the normal sequence of steps.  That’s about as close as you can get to implementing interrupts in a data flow language.

I like this concept.  I am going to see if I can do something similar.

 

1. I don't understand what you mean by "data dependencies between messages."  Can you give me a use case to help me understand?

2. Any message handling code can be inlined as long as it finishes before the next message arrives.  (Helpful, huh?)  As a rule of thumb, I inline data accessors and decision logic (the brainy stuff) in the MHL.  Lengthy computations, continuous processes, periodic processes, etc. (the brawny stuff) gets pushed into helper loops and sub actors.  Deciding between a helper loop or sub actor comes down to how much functionality you want that logical thread to have.  Complex functionality requires an actor, simple functionality gets by with a helper loop.

3. I assume you mean behavioral state machine, and not queued state machine, yes?  I implement BSMs by nesting message handling loops inside a simple state machine.  (If you really meant queued state machine, then my response is "I don't."  They behave badly, why would I want to replicate that?  :P )

1. I think you answered this above already.  I was referring to the case where Actor A needs data from Actor B before it can process a message from Actor C.  Mainly, the point was some messages are acted on differently based on the current state of the actor.  I think your discussion should have some example of this and how to deal with it.

2. All good points.

3. Your description of the sequencer answered this.

 

As to your overall presentation, I still feel you should avoid actual LabVIEW code examples and instead use pseudo code or pseudo-block diagrams (not 'G' block diagrams!).  I would not waste your time on fancy animations.  They rarely do much to communicate information and mainly just entertain the observer. (I'm not saying that your presentation should not be entertaining.)  Just focus on putting up pictures with very few words.  Maybe just add arrows between slides to emphasize portions of a diagram.

If you don't have a lot of text on your slide, then you cannot make the mistake of reading your slides aloud for the audience!

 

For making your diagrams, I recommend you check out yEd.

http://www.yworks.com/en/products_yed_about.html

Link to comment

I'll respond to the rest as I get time, but this part jumped out at me.

 

I want to design it such that an Actor is never asked to do something it is not ready to do.

 

NO!  This is the wrong approach to take! 

 

You must accept that you cannot--in general--prevent an actor from receiving a message it is not prepared to act on.  Even if you only have two actors talking to each other.  This is a fundamental truth of concurrent programming.  Quit trying to break the laws of time; it's not likely to work out in your favor.

 

(I've written about this subject before.  Search for "sender side filtering" or "receiver side filtering" and you might find something.)

Link to comment

I am thinking it might be better design for Actor B to send the data to Actor A along with the next step it should do.  That way Actor A only knows two things: How to ask for data and how to process data when received. 

I would have Actor A attach some kind of token to its request to Actor B, that Actor B would send back attached to the requested data.  This token would contain the next step for A to do.  This way the code for A is all contained in A, and B can be more generic and service requests from multiple actors.

 

In the framework I use, which like Lapdog uses text identifiers on messages, I usually do this by configuring A’s request to relabel the reply from B:

 

post-18176-0-06181700-1376646166.png

 

Here, A will receive back a message containing B’s data, but with the label specified by A (overwriting the label set by B).
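
For readers without the attached image, the relabeling idea can be sketched in Python. This is not the framework's actual API: the request tuple layout, the handler names, and the labels `ApplyStartupLimit` and `RecheckLimit` are assumptions made up for the example.

```python
import queue


def actor_b_handle(request):
    """Generic service: replies with whatever label the requester asked for."""
    reply_to, reply_label, key = request
    settings = {"limit": 42}  # hypothetical data owned by B
    reply_to.put((reply_label, settings[key]))


def actor_a_handle(message, state):
    """A decides what to do next from the label it chose itself."""
    label, data = message
    if label == "ApplyStartupLimit":
        state["startup_limit"] = data
    elif label == "RecheckLimit":
        state["rechecks"] = state.get("rechecks", 0) + 1


a_inbox = queue.Queue()
a_state = {}
# A sends the same kind of request twice, with a different reply label each time.
actor_b_handle((a_inbox, "ApplyStartupLimit", "limit"))
actor_b_handle((a_inbox, "RecheckLimit", "limit"))
while not a_inbox.empty():
    actor_a_handle(a_inbox.get(), a_state)
```

B never needs to know what either label means, which is what lets it serve requests from multiple actors.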

Link to comment

BTW, a Database is not a good example to use for considering “actors”; a database is already well-designed for handling concurrent access, so someone reading may not see much value in introducing actors.  Instead, how about a piece of hardware that can only do one thing at once, but may be needed by multiple concurrent processes?  A part-handling robot, for example.  An actor that handles all interaction with the robot can be written to mediate concurrent requests, perhaps through some kind of “transaction” system.  E.g.:

 

ProcessA -> Robot: “Request robot transaction”

Robot -> ProcessA: “Transaction Started”

ProcessB -> Robot: “Request robot transaction”

Robot -> ProcessB: “Busy; you are in job queue”

ProcessA -> Robot: “Do action 1”

  <robot working>

ProcessC -> Robot: “Request robot transaction”

Robot -> ProcessC: “Busy; you are in job queue”

  <robot working>

Robot -> ProcessA: “Action 1 Finished”

ProcessA -> Robot: “End transaction”

Robot -> ProcessB: “Transaction Started”

etc.
 
 
The Robot Actor would either refuse any “Do action” requests from a Process that doesn’t have an open transaction, or consider such a request as implicitly being equivalent to a combined “Request transaction; Do action; End transaction”.  
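
A minimal Python sketch of the first option (refusing actions from a process without an open transaction). The `RobotActor` class, its `sent` list standing in for real message sending, and the refusal text are hypothetical. As a simplification, the action completes immediately here; in a real system the robot work would run outside the message handler.

```python
from collections import deque


class RobotActor:
    """Mediates access to a robot that can only do one thing at once."""

    def __init__(self):
        self.owner = None       # process holding the open transaction
        self.waiting = deque()  # job queue of processes waiting their turn
        self.sent = []          # (recipient, text) pairs, stand-in for messaging

    def handle(self, sender, message):
        if message == "Request robot transaction":
            if self.owner is None:
                self.owner = sender
                self.sent.append((sender, "Transaction Started"))
            else:
                self.waiting.append(sender)
                self.sent.append((sender, "Busy; you are in job queue"))
        elif message == "End transaction" and sender == self.owner:
            # Hand the robot to the next process in the job queue, if any.
            self.owner = self.waiting.popleft() if self.waiting else None
            if self.owner is not None:
                self.sent.append((self.owner, "Transaction Started"))
        elif message.startswith("Do action"):
            if sender == self.owner:
                # "Do action 1" becomes "Action 1 Finished".
                self.sent.append((sender, f"{message[3:].capitalize()} Finished"))
            else:
                self.sent.append((sender, "Refused; no open transaction"))


robot = RobotActor()
robot.handle("ProcessA", "Request robot transaction")
robot.handle("ProcessB", "Request robot transaction")
robot.handle("ProcessB", "Do action 2")  # B has no open transaction yet
robot.handle("ProcessA", "Do action 1")
robot.handle("ProcessC", "Request robot transaction")
robot.handle("ProcessA", "End transaction")
```

An "End transaction" from a process that does not hold the transaction is silently ignored in this sketch; whether that is the right behavior is a design decision.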
 

 

 

 
Link to comment
Daklu wrote (about my mentioning “no global addresses”):

 

2b.  That's an implementation detail and actually not a property of the actor model.  It's a convention I strictly adhere to and I believe it makes code easier to read, but I can't claim it's a necessary part of AOD.

 

Actually, I think it is a property of the Actor Model (though descriptions of it are not very clear so I may be wrong); it’s part of something referred to as “locality", which I would restate as "instantaneous changes are local, and are transmitted through the system only via messages”.  Being able to launch an actor and then have any other actor be immediately able to address it is a violation of locality.

 

From Wikipedia:

 

Another important characteristic of the Actor model is locality.

 

Locality means that in processing a message an Actor can send messages only to addresses that it receives in the message, addresses that it already had before it received the message and addresses for Actors that it creates while processing the message. (But see Synthesizing Addresses of Actors.)

 

Also locality means that there is no simultaneous change in multiple locations.

Unfortunately, “addresses that it already had” is too vague to definitively interpret. 

Link to comment
I'll respond to the rest as I get time, but this part jumped out at me.

 

 

NO!  This is the wrong approach to take!

 

You must accept that you cannot--in general--prevent an actor from receiving a message it is not prepared to act on.  Even if you only have two actors talking to each other.  This is a fundamental truth of concurrent programming.  Quit trying to break the laws of time; it's not likely to work out in your favor.

 

(I've written about this subject before.  Search for "sender side filtering" or "receiver side filtering" and you might find something.)

 

Perhaps I was not being clear.  What I meant was to design the system so the Actor that had the ability to tell an actor to do something with data was the one that also sent the data needed.  So, in theory, you would not get in a situation where an actor gets a message it could not act on because the message was designed to include the data in the first place.

 

But the more I think about this, the more I am unsure if this is even possible.  I need an actor to have state data to act on.  I need to have multiple messages that command it to do something with that state data.  Somehow I need to have the actor load the state data in the first place.  And I need the ability to re-load that state data in the future.

 

More thought required... :book:

Link to comment
Perhaps I was not being clear.  What I meant was to design the system so the Actor that had the ability to tell an actor to do something with data was the one that also sent the data needed.  So, in theory, you would not get in a situation where an actor gets a message it could not act on because the message was designed to include the data in the first place.

 

I see what you mean.  The receiving actor has a single behavioral state, so all messages are always handled exactly the same way.  Yes, you can do that for certain actors.  I don't know if it is feasible to design all your actors that way.  My gut feeling is eventually an actor somewhere in the app will need to handle a message differently depending on its internal state data.

 

 

Actually, I think (no global addresses) is a property of the Actor Model (though descriptions of it are not very clear so I may be wrong)...

 

Here's my reasoning for why I think it is not.

 

In the Actor Model (AM) actors don't send messages to other actors; they send them to addresses.  An address is an abstraction of the list of all actors who will receive messages sent to that address--an alias of sorts.  Nothing I've seen in the AM indicates the list of actors the address refers to must be static.  Actors can be added to or removed from the list during run time, but it's not necessary to send messages to every actor that currently has the address to let them know the list has changed.  The change is handled automatically, sometimes behind the scenes (such as registering for user events) and sometimes in code (such as a subscription manager actor).

 

Saying named queues aren't allowed in the actor model oversimplifies things.  I could use a named queue and pass around the name instead of the reference itself, and that wouldn't violate the AM.  The important thing is the actor doesn't know about it before it's supposed to know about it.  If your Planet Killer missile system is designed so every actor needs to always know the address of the Self-Destruct subsystem, then I don't see anything in the AM that says you can't use a global address implemented in whatever way you want.

 

 

Would you dynamically create an actor to handle each DB request and have it destroy itself after it completed and returned the results?

What if you wanted to share a DB connection, to avoid the expense of opening and closing the connection for each transaction?  You could wrap the DB class in a DVR and then have each DB actor use that object.  This would have the effect of serializing your DB calls (if you needed logic to do error retries and restore the connection if it goes bad).  Would that be a bad implementation?  (this could apply to any shared resource like a file or some hardware)

 

It's been a long time since I've done any programming with DBs, so my terminology might be wrong and certainly some of the details on how I think they work are wrong.  For this response I'll use the term "DB connection" to mean a connection to the database that can only process one operation at a time.

 

The general solution to reuse anything expensive to create is pooling.  The initial approach I'd take with a DB is to create an actor that abstracts the database.  The DbActor creates a private pool of connections it uses to service the requests it receives.  Internally it keeps a look up table of the connections currently executing an operation and an address where the response should be sent.  When a connection returns a response DbActor looks up the associated address, sends the response to it, and returns the connection to the pool.

 

I would not distribute DbActor in a DVR for the reasons you mentioned.  I also would not give connection objects to the other actors.  One of the responsibilities of DbActor is to manage the connection pool, and if it is giving connections to others it can no longer do that.
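
Here is one way the pooled DbActor could look, sketched in Python under several assumptions: `FakeConnection` is a hypothetical connection that handles one operation at a time, the message names `Query` and `ConnectionDone` are invented, and the backlog for requests that arrive when the pool is empty is an addition that the description above does not specify.

```python
import queue
import threading


class FakeConnection:
    """Hypothetical connection that can process one operation at a time."""

    def __init__(self, name):
        self.name = name

    def execute(self, sql):
        return f"{self.name} ran {sql}"


def db_actor(inbox, pool_size=2):
    pool = [FakeConnection(f"conn{i}") for i in range(pool_size)]
    in_flight = {}  # lookup table: connection name -> (connection, reply address)
    backlog = []    # assumption: requests wait here when the pool is empty

    def start(sql, reply_to):
        conn = pool.pop()
        in_flight[conn.name] = (conn, reply_to)

        def work():
            # Helper thread does the waiting, then reports back to the actor.
            inbox.put(("ConnectionDone", (conn.name, conn.execute(sql))))

        threading.Thread(target=work).start()

    while True:
        name, payload = inbox.get()
        if name == "Query":
            if pool:
                start(*payload)
            else:
                backlog.append(payload)
        elif name == "ConnectionDone":
            conn_name, result = payload
            conn, reply_to = in_flight.pop(conn_name)
            reply_to.put(result)  # send the response to the stored address
            pool.append(conn)     # return the connection to the pool
            if backlog:
                start(*backlog.pop(0))
        elif name == "Exit":
            return


inbox = queue.Queue()
reply_queues = [queue.Queue() for _ in range(3)]
for i, reply_to in enumerate(reply_queues, start=1):
    inbox.put(("Query", (f"q{i}", reply_to)))
db = threading.Thread(target=db_actor, args=(inbox,))
db.start()
replies = [q.get(timeout=5) for q in reply_queues]
inbox.put(("Exit", None))
db.join()
```

The connections never leave `db_actor`, so the actor keeps full control of the pool; callers only ever see a reply on the address they supplied.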

 

 

It seems to me that it might be best implemented as an actor or helper that has limited lifespan and is exclusive to the caller.  But, my only concern is cleanly shutting down the system if we try to exit while the DB actor/helper is in the middle of processing.  If we are going to call it an actor, then it is already violating the principle that its MHL is always waiting, right?

 

Helper loops are always exclusive to a single actor.  Same for sub actors in hierarchical messaging.  That exclusivity makes certain things much easier to reason about, but adds the overhead of having to write extra message routing logic.

 

Regarding shutdown, when I have low level processes that might take some time to shut down, I write my shutdown logic so an actor doesn't shut down until all its sub actors have shut down.  That's easy enough to do with hierarchical messaging.  Aren't you using a variation of direct messaging?  I've never tried doing a controlled shutdown with that, though I suppose you could as long as your actor topology has a clear hierarchy of responsibility.

 

Yes, in my Agile Actor model the MHL is "always" waiting.  If the MHL allows messages to remain on the queue for "too long," then in the AA model it is not an actor, because it requires priority messages or transport manipulation to get the behavior you want.  (The AM makes no guarantees about the arrival order of messages, so priority messages cannot exist.)  I'm not familiar with NI's database palette.  If the function for querying a DB blocks, then that function needs to be in a helper loop, not the actor's MHL.

Link to comment
I see what you mean.  The receiving actor has a single behavioral state, so all messages are always handled exactly the same way.  Yes, you can do that for certain actors.  I don't know if it is feasible to design all your actors that way.  My gut feeling is eventually an actor somewhere in the app will need to handle a message differently depending on its internal state data.

 

It seems to me that in some cases, when a message is executed, the last step of the execution may be to execute another message (send to self) to continue some sort of sequence.  This could be dependent on some state data (check if we are performing a multi-step process) so that you could still execute the step independent from a sequence.  Also, this gives you the ability to interrupt the process with exit messages.

 

It's been a long time since I've done any programming with DBs, so my terminology might be wrong and certainly some of the details on how I think they work are wrong.  For this response I'll use the term "DB connection" to mean a connection to the database that can only process one operation at a time.

 

The general solution to reuse anything expensive to create is pooling.  The initial approach I'd take with a DB is to create an actor that abstracts the database.  The DbActor creates a private pool of connections it uses to service the requests it receives.  Internally it keeps a look up table of the connections currently executing an operation and an address where the response should be sent.  When a connection returns a response DbActor looks up the associated address, sends the response to it, and returns the connection to the pool.

 

I would not distribute DbActor in a DVR for the reasons you mentioned.  I also would not give connection objects to the other actors.  One of the responsibilities of DbActor is to manage the connection pool, and if it is giving connections to others it can no longer do that.

I am really leaning towards the DVR idea.  Here are my reasons:

I can allow multiple actors access to the database without them having to message a central DB actor.  They can still serialize their execution using an inplace structure.  I can perform error handling and retries within the DVR class and all actors can benefit from this (if I have to drop and reconnect the DB handle, when I release the DVR, the next actor gets the new handle because it is in the class data of the DVR wrapped class).

If at some point I do not want to do a DB operation inline, I can simply alter the class to launch a dynamic actor to process the call and then terminate when complete.  Since the state data is in the DVR wrapped class, it will work the same.

   

 

Regarding shutdown, when I have low level processes that might take some time to shut down, I write my shutdown logic so an actor doesn't shut down until all its sub actors have shut down.  That's easy enough to do with hierarchical messaging.  Aren't you using a variation of direct messaging?  I've never tried doing a controlled shutdown with that, though I suppose you could as long as your actor topology has a clear hierarchy of responsibility.

What good is a MHL that is always waiting if it is waiting for a child actor to shut down?  It seems that it is still being blocked in that case.  I just do not see a way to truly free up all actors at all times when there are processes in an application that take unknown amounts of time to execute.

 

I keep thinking that actor programming is somehow different from other ways of designing applications and that it is supposed to make things more adaptable and maintainable, like OOP does.  But I just can't seem to wrap my head around how to do this for applications with a lot of sequenced steps.  The 'actor is always waiting' and 'actor can handle any message at any time' rules just seem impossible to adhere to when there are so many preconditions that need to be met before many operations can be performed and many operations can cause blocking.

Link to comment
Here's my reasoning for why I think it is not.

 

In the Actor Model (AM) actors don't send messages to other actors; they send them to addresses.  An address is an abstraction of the list of all actors who will receive messages sent to that address--an alias of sorts.  Nothing I've seen in the AM indicates the list of actors the address refers to must be static.  Actors can be added to or removed from the list during run time, but it's not necessary to send messages to every actor that currently has the address to let them know the list has changed.  The change is handled automatically, sometimes behind the scenes (such as registering for user events) and sometimes in code (such as a subscription manager actor.)

I disagree, if you mean you can update these lists of actors non-locally.  For example, if Actor A launches Actor B, then only A has the address of B, and A cannot add B to any list of Actors held by any third actor, C, except via sending the address in a message.  Having by-reference updatable lists shared between Actors is a definite violation of the Actor Model; the lists have to be by-value.  

 

Note that you can build structure on top of actors that do subscriptions or channel-like message routing, but these must all be built on top of messages.  No by-ref data sharing.  My “Observer Registry” Actor, for example, in my actor-like framework, holds by-value lists of addresses, and everything is done by messages.  
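
To illustrate the "everything is done by messages" point, here is a small Python sketch of a subscription registry. It is not the code from my framework, just a hypothetical analogy: an "address" is a plain list acting as an inbox, and the message names (`Subscribe`, `Unsubscribe`, `Publish`) are made up for the example.

```python
class ObserverRegistry:
    """Holds its own by-value table of subscriber addresses; changed only by messages."""

    def __init__(self):
        self._subscribers = {}  # topic -> tuple of addresses, private to this actor

    def handle(self, msg):
        kind, topic, arg = msg
        current = self._subscribers.get(topic, ())
        if kind == "Subscribe":
            # Build a new tuple rather than mutating a shared list.
            self._subscribers[topic] = current + (arg,)
        elif kind == "Unsubscribe":
            self._subscribers[topic] = tuple(a for a in current if a is not arg)
        elif kind == "Publish":
            for address in current:
                address.append((topic, arg))  # "send" = append to that inbox
        else:
            raise ValueError(f"unknown message: {kind}")


registry = ObserverRegistry()
inbox_b, inbox_c = [], []
registry.handle(("Subscribe", "temp", inbox_b))
registry.handle(("Subscribe", "temp", inbox_c))
registry.handle(("Publish", "temp", 21))
registry.handle(("Unsubscribe", "temp", inbox_b))
registry.handle(("Publish", "temp", 22))
```

No other actor ever touches the registry's table; the only way to change who receives a topic is to send the registry a message.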

 

I say “my actor-like framework” because I can’t claim I don’t break the Actor Model rules, but I do try and know why I should be wary about breaking them. 

 

Saying named queues aren't allowed in the actor model oversimplifies things.  I could use a named queue and pass around the name instead of the reference itself, and that wouldn't violate the AM.  The important thing is the actor doesn't know about it before it's supposed to know about it. 

That isn’t using the names as a global link, so that is fine.  

 

If your Planet Killer missile system is designed so every actor needs to always know the address of the Self-Destruct subsystem, then I don't see anything in the AM that says you can't use a global address implemented in whatever way you want.

Nothing wrong with not using the Actor Model for everything, but personally, I’d rather my missile not be mistakenly destroyed because of a bug in some forgotten unimportant subsystem that misspelled a queue name.  If everyone needs to know it, then explicitly pass it to everyone.

Link to comment
I am really leaning towards the DVR idea.  Here are my reasons:

I can allow multiple actors access to the database without them having to message a central DB actor.  They can still serialize their execution using an inplace structure.  I can perform error handling and retries within the DVR class and all actors can benefit from this (if I have to drop and reconnect the DB handle, when I release the DVR, the next actor gets the new handle because it is in the class data of the DVR wrapped class).

If at some point I do not want to do a DB operation inline, I can simply alter the class to launch a dynamic actor to process the call and then terminate when complete.  Since the state data is in the DVR wrapped class, it will work the same.

Are you actually working on a DB actor or is this just academic?  Because you are in danger of spending a lot of time on redundantly recreating features that the database software already handles (and probably handles better; for example, a DB will only serialize transactions that actually need to be serialized).  

Link to comment
I disagree, if you mean you can update these lists of actors non-locally.  For example, if Actor A launches Actor B, then only A has the address of B, and A cannot add B to any list of Actors held by any third actor, C, except via sending the address in a message.  Having by-reference updatable lists shared between Actors is a definite violation of the Actor Model; the lists have to be by-value.  

 

Actors don't maintain lists of other actors; they maintain lists of addresses.  Each address is a list of zero or more actors (and zero or more other addresses) who will receive messages sent to that address.  I agree an actor's internal list of addresses is localized; however, the address itself must be implemented as a reference or global for it to have any use in the AM. 

 

Reading back over this discussion, I think your use of the term "global addresses" is too vague.  You may be referring to addresses in an unmanaged (meaning no actor is responsible for it) global data store.  In that case I agree, since as near as I can tell the AM doesn't allow anything to exist that is not contained in an actor, address, or message.  I'm referring to the address' behavior when implemented.

Link to comment
It seems to me that in some cases, when a message is executed, the last step of the execution may be to execute another message (send to self) to continue some sort of sequence.

 

Oh, you mean like QMH/QSMs?  :P

 

Having a loop send messages to itself is a dangerous game.  It's not well-understood exactly when it is safe to do so and when you are introducing race conditions.  As far as I can tell, the only way you can guarantee race conditions don't exist is by not maintaining any loop data in a shift register.  Once a loop becomes stateful race conditions have to be rooted out by inspection.  That can be very difficult.  You have to make sure the implied contract of each message will be fulfilled regardless of what other messages arrive before the sequence is complete.

 

Example: There is a race condition in the QSM project template that ships with 2012.  In the Initialize message handler, it is possible for a ChangeData message to be sent after InitializeData but before InitializePanel.  If that occurs, then the Initialize message isn't honoring its implied contract, because the panel isn't in its initial condition when the sequence completes.  You can't easily tell the race condition exists--you have to trace through each message and figure out what they do.
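
Here is a toy Python model of that kind of race. It is not a transcription of the shipped template (I'm assuming, for illustration only, that each step sends the next one to its own queue); it just shows how an outside message can land between two self-sent steps.

```python
from collections import deque

queue = deque()
state = {"data": None, "panel": None}
handled = []  # order in which messages were actually processed


def handle(msg):
    handled.append(msg)
    if msg == "Initialize":
        queue.append("InitializeData")   # self-send: first step of the sequence
    elif msg == "InitializeData":
        state["data"] = "default"
        queue.append("InitializePanel")  # self-send: next step of the sequence
    elif msg == "ChangeData":
        state["data"] = "changed"
    elif msg == "InitializePanel":
        state["panel"] = "showing " + state["data"]


queue.append("Initialize")
handle(queue.popleft())     # queue is now [InitializeData]
queue.append("ChangeData")  # another loop sends this while the sequence is in progress
while queue:
    handle(queue.popleft())
```

When the sequence finishes, the panel shows the changed data instead of the initial data, so Initialize did not honor its implied contract. Nothing in any single handler looks wrong; you only see it by tracing the whole chain.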

 

Personally, I prefer to avoid that issue altogether by not self-sending messages.  It makes it much easier to verify the app behaves correctly.

 

 

I am really leaning towards the DVR idea.

 

I agree with James.  It sounds like you're over-constraining your application.  Why is it an advantage to only allow one actor to access the DB to the exclusion of all other actors?  You're imposing synchronous behavior where it isn't required.  It makes more sense to me to have a ConnectionPool actor that sends connection objects to individual actors that need DB access.  The actor can use the connection to do its sequential operation.  When it's done with the connection it can return the connection object to the ConnectionPool actor.  No DVR needed.

 

(This does expose you to the possibility of a connection leak if an actor does not return the connection when finished and asks for a new one the next time.  You'd have to verify correct behavior via unit tests or inspection.)
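
A rough Python sketch of the ConnectionPool idea, including one way a unit test could spot the leak just mentioned. All names are hypothetical, connections are plain strings, and returning `None` when the pool is empty is my assumption; a real pool might queue the request instead.

```python
class ConnectionPool:
    """Toy pool actor: lends connection objects out and takes them back."""

    def __init__(self, size):
        self.free = [f"conn{i}" for i in range(size)]
        self.loaned = {}  # connection -> name of the actor that borrowed it

    def on_request(self, borrower):
        if not self.free:
            return None   # assumption: the borrower is told to try again later
        conn = self.free.pop()
        self.loaned[conn] = borrower
        return conn

    def on_return(self, conn):
        del self.loaned[conn]
        self.free.append(conn)

    def suspected_leaks(self):
        """Borrowers holding more than one connection at once."""
        counts = {}
        for borrower in self.loaned.values():
            counts[borrower] = counts.get(borrower, 0) + 1
        return sorted(b for b, n in counts.items() if n > 1)


pool = ConnectionPool(size=3)
c1 = pool.on_request("ActorA")
pool.on_return(c1)              # well-behaved: returns what it borrowed
c2 = pool.on_request("ActorB")
c3 = pool.on_request("ActorB")  # ActorB asked again without returning c2
leaks = pool.suspected_leaks()
```

Holding two connections is not always a bug, so a check like this is only a hint for a test or a code review, not proof of a leak.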

 

 

What good is a MHL that is always waiting if it is waiting for a child actor to shut down?  It seems that it is still being blocked in that case.  I just do not see a way to truly free up all actors at all times when there are processes in an application that take unknown amounts of time to execute.

 

The point of having a waiting MHL is that it always has the most up-to-date information available to make decisions.  Information is never stuck in the queue waiting to be read.  Actually, "always waiting" is a simplification.  What you want to avoid is having more than one message waiting to be read.  Whenever a second message is pending, it is possible the message contains information that alters how the loop would have processed the first message.

 

Always waiting does not mean the actor will be able to instantly do what the message requested.  If the database API doesn't provide a way to abort a query in progress, then there's nothing you can do to change that.  You have to wait for it to finish or timeout.

 

 

But I just can't seem to wrap my head around how to do this for applications with a lot of sequenced steps.

 

It's just a matter of choosing the correct encapsulations and defining the desired behavior.  Actors only expose messages which are guaranteed to fulfill their contract regardless of what other messages arrive in the meantime.  Like I said earlier, if I had a sequence of steps that needed to be executed I'd wrap the sequence in an actor and expose a few high level messages like StartSequence and AbortSequence.  All the sequencing logic would be hidden inside the actor.

Link to comment

Ok, I am not going to quote a bunch of posts but I am going to try to respond to your points above and try to get this back on the topic of how to improve your presentation.

 

But first a slight deviation off topic to clarify the point about making a singleton object vs an actor:

I used database access as an example but I think this can apply to other shared resources.  Here are the things I am trying to consider:

1. There is overhead to opening and closing a connection to a database.  So, caching a connection is preferred.

2. A database connection reference can go stale for many reasons.  Also, the database can go down and be restarted.  To be immune to these situations, error handling code needs to be able to reestablish connection to the database and reattempt the execution of a database call before issuing a critical error to the rest of the system.

3. Some database operations do not require a response and therefore should not block the caller.  Other operations require a response before the caller can continue and are therefore blocking.  The ability to have both options is desirable.

4. Having a central actor handle all database operations can work in some messaging architectures but is problematic in hierarchical systems (like AF).

 

By having an object handle database communication (instead of an actor), you can call methods inline (the caller's thread is used to execute the operation) when you require the response to continue the work of the caller OR you can spawn a dynamic daemon and have it call the database when you simply want to write data and do not require a response (unless there is an unrecoverable error).

 

By making the database object a singleton, you can reuse a connection between calls (saving the open-close overhead).  This makes most sense if you anticipate making a lot of calls but not a lot of simultaneous calls.  Also, by having a single object, when there is an issue that can be resolved by retries and reestablishing the connection, you block other callers while working the issue.  When you unblock them, you are passing them a repaired connection or you have issued a critical stop because the database is down.  Either way, you avoid the issue of multiple callers attempting to talk to a dead database and filling up the error log with redundant information.  And yes, this could be achieved with an actor but then you lose the ability to inline the calls and add the need to have reply messages from the database actor.  Finally, you have to break the hierarchical messaging architecture in AF to do this.
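
To make the singleton idea concrete, here is a minimal Python sketch. It is an analogy only: a lock stands in for the DVR and in-place structure, `FakeConnection` is a made-up stand-in for a real database handle that fails twice before recovering, and the sleep between retries is left out to keep the example short.

```python
import threading


class FakeConnection:
    """Hypothetical DB handle; every instance fails until the shared counter hits zero."""

    failures_left = 2

    def execute(self, sql):
        if FakeConnection.failures_left > 0:
            FakeConnection.failures_left -= 1
            raise ConnectionError("database unreachable")
        return f"ok: {sql}"


class SharedDb:
    """One shared object; callers run the call inline, serialized by the lock."""

    def __init__(self, max_retries=3):
        self._lock = threading.Lock()  # plays the role of the DVR
        self._conn = FakeConnection()
        self._max_retries = max_retries
        self.reconnects = 0

    def execute(self, sql):
        with self._lock:  # other callers block here while we work the issue
            for attempt in range(self._max_retries + 1):
                try:
                    return self._conn.execute(sql)
                except ConnectionError:
                    if attempt == self._max_retries:
                        raise  # out of retries: caller raises the critical error
                    self._conn = FakeConnection()  # drop and re-establish the handle
                    self.reconnects += 1


db = SharedDb()
result = db.execute("SELECT 1")
```

Because the repaired handle is stored in the shared object, the next caller to get the lock uses the new connection without knowing anything went wrong.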

 

So, my point of using this example was to talk about some cases where an actor might not be the best choice.  If you are designing a system and you want to use actors, there are still going to be cases where you want to use other techniques as well.  Your presentation should address this in some ways.  Maybe give a few examples of places where an actor is not the most efficient solution.

 

Ok, back to the main discussion.  Making actors that do not block.  I have given this some more thought and I think I now understand what you are saying but let me state it in my own words and you can confirm.

The message handler of an actor should not be blocked but the overall actor 'system' (the message loop and all helper loops) can be in a busy state.

So, if you have some process that can take an undefined amount of time (let's use the database call again as an example) then you should call that process from the helper loop of an actor and have the message handler respond with a status while the helper loop is busy.  If another request comes in, it should queue that job until the helper loop is available again and (if required by your design) reply with a status (i.e., I'm busy, you are in the queue).  This should leave the actor always able to respond.  For example, it could receive a message asking for status and respond with how many jobs are in the queue.
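
Here is how I picture that in a small Python sketch (not LabVIEW, and every name is made up). The helper loop is not modeled as a real thread; `sent_to_helper` just records what the message handler would hand off, and `JobDone` is the message the helper would send back.

```python
from collections import deque


class JobActor:
    """Message handler that never runs the slow job itself."""

    def __init__(self):
        self.pending = deque()    # jobs waiting for the helper loop
        self.busy = False         # True while the helper loop is working
        self.sent_to_helper = []  # stands in for non-blocking sends to the helper

    def handle(self, msg, payload=None):
        if msg == "DoJob":
            if self.busy:
                self.pending.append(payload)
                return f"busy, queued at position {len(self.pending)}"
            self._start(payload)
            return "started"
        if msg == "JobDone":      # sent by the helper loop when it finishes
            self.busy = False
            if self.pending:
                self._start(self.pending.popleft())
            return None
        if msg == "GetStatus":
            return {"busy": self.busy, "queued": len(self.pending)}
        raise ValueError(f"unknown message: {msg}")

    def _start(self, job):
        self.busy = True
        self.sent_to_helper.append(job)


actor = JobActor()
r1 = actor.handle("DoJob", "query A")
r2 = actor.handle("DoJob", "query B")  # helper is busy, so this one is queued
status = actor.handle("GetStatus")     # answered immediately
actor.handle("JobDone")                # helper reports back; query B starts
status_after = actor.handle("GetStatus")
```

Every handler returns right away, so a status request is answered even while the helper is in the middle of a long call.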

So, to summarize: if you send a message to the actor telling it to do something that takes time, it should hand that off to a helper loop and go back to listening for more messages.  Lengthy processes should never be done by the message handler.  One point: you sometimes say that your helper loops are like actors, but I think you need to make the point that they do not need to adhere to the rule that they are always ready to receive a message.  Otherwise, they would need to be wrapped by a MHL and you start getting into that turtles reference you made earlier.

 

As for sequencing, I think the actor should encapsulate the sequence from the caller but I am less clear on why it is bad to call itself.  For example, if you need to initialize a system with several steps, I would anticipate a design like this:

1. Actor is asked to initialize system.

2. Actor calls helper loop to perform first step of initialization and sets state data to indicate what phase of the initialization it is in.

3. Helper loop responds that first step is complete.  Actor updates state and calls helper loop to perform next phase of initialization.

4. Process repeats until all steps completed.  Actor responds to caller with message that initialization is complete.

At any time during the above process the caller can ask the actor what its status is and it can respond with what phase of the process it is performing.  This could then be used by the caller to update progress in the GUI.

In this example, it seems to me that it would make the most sense to make each phase of the initialization process a separate message that the actor responds to.  That would allow the developer to easily rearrange the order in the future and it would allow the caller to request re-initialization of a single phase at a point in the future.  So, that is why I thought having an actor message itself was a good idea.

 

Expanding your presentation to include some common real world scenarios (like executing a sequence) would be helpful.  I would include a discussion of the pitfalls in this example and how to avoid them.  I still think it would be best to use simple diagrams to illustrate your examples instead of actual G code.

 

I hope this is helpful.  I know this discussion has helped me in better understanding actor programming (or at least shown me what I do not understand about actor programming!  :lol: )

Link to comment
So, to summarize: if you send a message to the actor telling it to do something that takes time, it should hand that off to a helper loop and go back to listening for more messages.  Lengthy processes should never be done by the message handler.

 

Correct.

 

 

As for sequencing, I think the actor should encapsulate the sequence from the caller but I am less clear on why it is bad to call itself.

 

I don't like having *loops* send messages to themselves.  An actor may consist of multiple loops, so an actor can send messages to itself (helper loops) internally.

 

Having a loop send messages to itself isn't inherently bad--it is possible to do it safely.  I choose not to do it because it is very easy to overlook race conditions and very hard to verify they do not exist.

 

 

By making the database object a singleton, you can reuse a connection between calls (saving the open-close overhead).  This makes most sense if you anticipate making a lot of calls but not a lot of simultaneous calls.

 

What DB are you using?  Databases support multiple connections.  I don't understand why you are intent on limiting yourself to only one connection.

Link to comment
Correct.

Good.  I'm glad I got one right!

 

I don't like having *loops* send messages to themselves.  An actor may consist of multiple loops, so an actor can send messages to itself (helper loops) internally.

 

Having a loop send messages to itself isn't inherently bad--it is possible to do it safely.  I choose not to do it because it is very easy to overlook race conditions and very hard to verify they do not exist.

So, if the MHL does not trigger the next step in the sequence by sending itself a message, are you saying that the 'helper loop' should send the MHL a message to do the next step in the sequence?  And if so, wouldn't that mean that the helper loop would be less useful since it would always trigger the next step even if you only wanted to repeat a single step?  It seems to me that since the state data of the actor lives in the MHL loop, that it should be making the logic decisions about what step to do next in the sequence or to abort the sequence altogether if some error or exit condition has occurred.

 

What DB are you using?  Databases support multiple connections.  I don't understand why you are intent on limiting yourself to only one connection.

You are making me sorry I used a database as an example.  Yes, databases support multiple connections.  And opening a connection and closing it for every transaction is expensive.  And I understand your point about pooling connections and opening a new one if two transactions are called at the same time.  You don't actually have to do that as you can execute two transactions on the same connection at the same time (the database will sort it out).  But what you are missing is the real world fact that databases often stop working.  They become too busy to respond because some IT guy's re-index process is running.  They become unreachable because some IT guy is messing with the network routing or a switch has gone down, been rebooted or is overloaded.  They disappear altogether when some IT guy decides to install a patch and reboot them.  If your system is dependent on continuous access to the database, you must do everything in your power to correct or at least survive any of these scenarios as gracefully as possible.

So, let's look at your pooling idea:

Actor A requests data from the DB Actor.  The DB actor spawns a helper to execute the transaction using the available connection.  The database is unreachable so the helper loop closes the connection, sleeps, reestablishes the connection (the ref has now changed value) and retries the transaction.  It repeats this process 10 times over a period of 15 minutes, hoping the DB comes back.  Each attempt is logged to the error log.

At the same time, Actor B requests data from the DB Actor.  The DB actor adds a new connection to the pool, giving Actor B's request the new connection.  A second helper is spawned to execute this request and runs into the same problem, closing the connection and then reopening and retrying.  More errors pile up in the error log.

This is repeated many more times, resulting in a bunch of connections being added to the pool just so they can fail to connect.  The error log becomes so convoluted with messages that it is nearly impossible to untangle the threads of errors, and all for what?  So you can call two database transactions at the same time?  Not worth it.

 

But I digress from the topic.  The point I was trying to illustrate was an example where something other than an actor was the best solution.  If you don't like my example, think of a different one.  I just think your presentation would benefit from a discussion of where it is appropriate and not appropriate to use actors.

Link to comment
So, if the MHL does not trigger the next step in the sequence by sending itself a message, are you saying that the 'helper loop' should send the MHL a message to do the next step in the sequence?

 

No, I'm saying the MHL sends a message to the helper loop telling it what step to execute.

 

MHL:  Do step 1. Here are the data you need.

HL:  I finished step 1.  Here are the results.

MHL:  <checks results and figures out the next step>  Do step 2. Here are the data you need.

HL:  I finished step 2.  Here are the results.

MHL:  <checks results and figures out the next step>  Do step 8. Here are the data you need.

etc.
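
The same exchange as a small runnable Python sketch, using two threads and two queues. This is only an analogy for the LabVIEW loops: the step numbers and the `next_step` table are arbitrary (they mirror the 1, 2, 8 order above), and the "work" in the helper is just recording which step ran.

```python
import queue
import threading


def helper_loop(inbox, outbox):
    """Helper loop: runs one (possibly slow) step per message, then reports back."""
    while True:
        step, data = inbox.get()
        if step is None:                   # shutdown request from the MHL
            break
        outbox.put((step, data + [step]))  # pretend work: record the step


def next_step(finished, results):
    """MHL decision logic: inspect the results and pick what to do next."""
    return {1: 2, 2: 8}.get(finished)      # None means the sequence is complete


to_helper, to_mhl = queue.Queue(), queue.Queue()
worker = threading.Thread(target=helper_loop, args=(to_helper, to_mhl))
worker.start()

to_helper.put((1, []))                     # MHL: do step 1, here are the data
while True:
    finished, results = to_mhl.get()       # an Abort message would also arrive here
    step = next_step(finished, results)
    if step is None:
        break
    to_helper.put((step, results))         # MHL: do the next step
to_helper.put((None, None))
worker.join()
```

The helper never decides what comes next. All sequencing decisions stay in the loop that owns the state, and that loop goes back to waiting after each send.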

 

In the presentation I said any time I drop a MHL I consider it another actor.  That's another simplification.  I only call a MHL an actor if it has the characteristics I consider essential for agile actors--atomic messages (no sequential dependencies) and always waiting.  The helper loop in this case could be a message handling loop, but it's not an agile actor because it directly executes lengthy processes.

 

 

You are making me sorry I used a database as an example.

 

Really? I think it's a remarkably good example.

 

 

You don't actually have to do that as you can execute two transactions on the same connection at the same time (the database will sort it out). 

 

Depends on the database.  Some only process a single transaction at a time on each connection.  That's why I asked which one you are using.  :P

 

 

This is repeated many more times, resulting in a bunch of connections being added to the pool just so they can fail to connect.  The error log becomes so convoluted with messages that it is nearly impossible to untangle the threads of errors, and all for what?  So you can call two database transactions at the same time?  Not worth it.

 

There's nothing preventing the connection manager from monitoring the database and refusing to hand out connection objects if it is down.  Anyway, I'm not trying to tell you how to design your app.  I'm just trying to understand why you are favoring imposing synchronous access on an inherently asynchronous process.  If the tradeoff for having a clean error log is worth it to you, I'm certainly not in a position to tell you it isn't.

 

 

The point I was trying to illustrate was an example where something other than an actor was the best solution.  If you don't like my example, think of a different one.

 

I'll grant you your solution is a viable alternative, and may even be the best solution in your situation.  I'm not anywhere close to being convinced it is "the best solution" in general.

 

 

I just think your presentation would benefit from a discussion of where it is appropriate and not appropriate to use actors.

 

I understand the sentiment, but there are no general rules for when they are and are not appropriate.  It's appropriate if you can build it so it fits into your design and behaves how you need it to behave.  It's like asking when it is appropriate and is not appropriate to use classes.  There are no technical reasons not to use classes, and there are no technical reasons not to use actors. 

Link to comment
No, I'm saying the MHL sends a message to the helper loop telling it what step to execute.

 

MHL:  Do step 1. Here are the data you need.

HL:  I finished step 1.  Here are the results.

MHL:  <checks results and figures out the next step>  Do step 2. Here are the data you need.

HL:  I finished step 2.  Here are the results.

MHL:  <checks results and figures out the next step>  Do step 8. Here are the data you need.

etc.

 

In the presentation I said any time I drop a MHL I consider it another actor.  That's another simplification.  I only call a MHL an actor if it has the characteristics I consider essential for agile actors--atomic messages (no sequential dependencies) and always waiting.  The helper loop in this case could be a message handling loop, but it's not an agile actor because it directly executes lengthy processes.

Ok, that helps me understand the role of the helper loop better.  But what about external Actors wanting to have the sequencing actor perform just step 5?  You would have to make a separate message just for that case, but if you made each step a message, you could internally execute them in any particular order and you could externally execute them individually.

 

A concrete example I am thinking of is reading a settings file.  You need to read this file as part of the process when you init your application and configure your system.  But what if the customer requested the ability to re-read the file while the application is running because they edited it with an outside tool and want the new settings to be applied?  If that part of your initialization process was a separate message, you could execute just that part again.  If your state data had a flag that indicated if you were doing a full initialization, you could use this to determine if the next step in the sequence should be called or skipped as appropriate.  Your actor is still executing atomic steps.  You just have the option to have it cascade to additional steps if in the correct mode.

Why is this a bad idea?  You mentioned race conditions (I understand how the QMH has those) but I think this implementation would avoid them as you are not pre-filling a queue and you have the opportunity to check for exit conditions before starting the next step.

Link to comment
...there are no general rules for when they are and are not appropriate.  It's appropriate if you can build it so it fits into your design and behaves how you need it to behave.  It's like asking when it is appropriate and is not appropriate to use classes.  There are no technical reasons not to use classes, and there are no technical reasons not to use actors. 

 

This is precisely why an EXAMPLE would help.  Presumably IF there were general rules THEN you could just specify those and pointe finale.  :yes:

Link to comment
But what about external Actors wanting to have the sequencing actor perform just step 5?

 

External actors shouldn't know anything about individual steps in the process.  All they know is they send a StartProcess message to your SequenceController actor and get a message back when the process is done.

 

 

A concrete example I am thinking of is reading a settings file.  You need to read this file as part of the process when you init your application and configure your system.  But what if the customer requested the ability to re-read the file while the application is running because they edited it with an outside tool and want the new settings to be applied?  If that part of your initialization process was a separate message, you could execute just that part again.

 

Your question is more about API design than AOD.  I can tell you one way to do it; I can't tell you what you should do.

 

I usually implement configuration settings as ordinary classes, not as actors.  A typical AppCfg object for me might have the following methods:

 

-Create AppCfg (path Path);

-Save ();

-Get configItemA();

-Get configItemB();

-Set configItemB(string Key, string Value);

-etc.

 

During initialization the startup code creates an AppCfg object, which loads the data from disk, and hands the object off to whoever is responsible for that data.  If there is a need to reload the AppCfg data while the app is running, the actor responsible for that object can expose a ReloadCfgData message.  When the message is received it would discard the existing object and create a new one from the file on disk.  There's no reason for the initialization code to expose any steps in the process, because the initialization code is only run during startup.
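
A minimal Python sketch of that arrangement, for readers who want something concrete. It is not my actual LabVIEW class: JSON storage, the generic `get`/`set` accessors, and the `SettingsOwner` class standing in for the responsible actor are all assumptions made for the example.

```python
import json
import os
import tempfile


class AppCfg:
    """Ordinary configuration object; loads its data when it is created."""

    def __init__(self, path):
        self.path = path
        with open(path, encoding="utf-8") as f:
            self._items = json.load(f)

    def get(self, key):
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value

    def save(self):
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(self._items, f)


class SettingsOwner:
    """Stand-in for the actor responsible for the configuration data."""

    def __init__(self, path):
        self.cfg = AppCfg(path)

    def on_reload_cfg_data(self):
        # Discard the existing object and create a new one from the file on disk.
        self.cfg = AppCfg(self.cfg.path)


path = os.path.join(tempfile.mkdtemp(), "app.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump({"configItemA": "1"}, f)

owner = SettingsOwner(path)
before = owner.cfg.get("configItemA")
with open(path, "w", encoding="utf-8") as f:  # the file is edited by an outside tool
    json.dump({"configItemA": "2"}, f)
owner.on_reload_cfg_data()
after = owner.cfg.get("configItemA")
```

The reload handler reuses the same constructor the startup code uses, so no initialization step has to be exposed as its own message.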

 

I try to write my component APIs so each method is independent, meaning there's no overlap in the methods' functionality.  If the Initialize method calls LoadAppCfg, I wouldn't expose a separate LoadAppCfg method.  I find the component is easier to use and the code is easier to understand when I don't have to keep referring back to the documentation to remind myself exactly what low level API methods are called by each high level API method.

 

That said, there's nothing in AOD that prevents you from exposing low level and high level messages in the same actor.

 

 

This is precisely why an EXAMPLE would help.  Presumably IF there were general rules THEN you could just specify those and pointe finale.  :yes:

 

Examples don't help with figuring out when an actor should be used.  An example can show how to implement them, but there are way too many variables for me to tell someone when they should use an actor.  When do I use actors?  Any time I have multiple processes that need to run concurrently.  When should you use actors?  When the business conditions are suitable to do so.

 

(I am working on reimplementing the Evaporative Cooler example, but my motherboard gave up the ghost over the weekend so I have to postpone it for a while.)

Link to comment
Examples don't help with figuring out when an actor should be used.  An example can show how to implement them, but there are way too many variables for me to tell someone when they should use an actor.  When do I use actors?  Any time I have multiple processes that need to run concurrently.  When should you use actors?  When the business conditions are suitable to do so.

 

I disagree on this but for the moment will follow your line.  If I ask "When will the business conditions be suitable for me to implement Actors?" the answer is: Quite likely, never.  But I'm a special use case: single product, single developer (soon to bring on a new programmer so I can just architect), with a deploy history stretching back to LV5 so LOTS of legacy code.  Refactoring even some of that to incorporate Actors would be a fairly substantial effort at this point, even if it made definite sense to do so.

 

On the other hand, seeing concrete examples of when others deliberately chose to implement Actors to solve certain problems or design specs can be instructive to me in understanding better how and when Actors might become time/cost effective for me to implement within the overall enterprise.  Hearing the reasoning behind making the choice to implement Actors instead of other approaches in specific examples is even more instructive.  So examples would help -- at least help me -- a lot.

Link to comment
  • 8 months later...
I've been poking around in the cooler example again.  IMO, a big reason it is so complicated is because it illustrates so many different concepts and has a lot of abstract classes.  I've started an AA implementation that strips away all the extra stuff and focuses on the basic AOD principles.  I'll post it in a new thread when it's ready.

 

Out of curiosity were you ever able to finish and post this?  I couldn't seem to find the thread if you did.  

Link to comment
