[CR] Random permutation


36 replies to this topic

#1 Gary Rubin

Gary Rubin

    The 500 club

  • Members
  • PipPipPipPipPip
  • 612 posts
  • Location:Northern Virginia, USA
  • Version:LabVIEW 8.6
  • Since:1997

Posted 25 October 2006 - 06:39 PM

This data set consists of thousands to hundreds of thousands of input-output pairs in which both the input and the output are vectors of floating point numbers.


I see...
You must have a lot of memory on that computer... :)

#2 syrus

syrus

    Active

  • Members
  • Pip
  • 24 posts

Posted 25 October 2006 - 06:47 PM

I see...
You must have a lot of memory on that computer... :)

Yep. I've got 4GB on my workstation and have access to a server with 24GB of RAM. Unfortunately, LabVIEW is limited to just under 2GB per instance so I play some games with ramdisks and file I/O when dealing with large data sets. I'm really looking forward to the 64-bit version of LabVIEW. :D

#3 Mellroth

Mellroth

    The 500 club

  • Members
  • PipPipPipPipPip
  • 535 posts
  • Version:LabVIEW 2011
  • Since:1995

Posted 25 October 2006 - 06:53 PM

...I include the option to process the training set in a random order each time it is used. To do this random processing, I implemented the random permutation in LabVIEW.

Have you considered using random numbers with a specified seed, e.g. by using Uniform White Noise.vi? This way you could run exactly the same sequence of "random" tests with a new model.

/J
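
For readers following along outside LabVIEW, the seeded idea can be sketched in Python. This is only an illustration: Python's random.Random stands in for Uniform White Noise.vi, and the name seeded_order is made up for this example.

```python
import random

def seeded_order(n, seed):
    """Return a permutation of 0..n-1 that is reproducible for a given seed."""
    rng = random.Random(seed)   # private generator, so the global state is untouched
    order = list(range(n))
    rng.shuffle(order)          # same seed -> same sequence of swaps
    return order

# Two runs with the same seed process the data in the same "random" order.
assert seeded_order(10, seed=42) == seeded_order(10, seed=42)
```

Reusing the seed lets a new model be compared against an old one on an identical processing order.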

#4 syrus

syrus

    Active

  • Members
  • Pip
  • 24 posts

Posted 25 October 2006 - 06:36 PM

Now I'm curious. In my 10 years or so of technical computing, I'm having trouble thinking of many times where I've come across a need for randomizing an array. A LabVIEW implementation of Boggle that I did on a long flight, and a card shuffling exercise in college come to mind, and obviously those don't need to be really fast.
So, just out of curiosity, what type of real applications require such fast and efficient randomization?

I have implemented a number of artificial neural network models in LabVIEW. While performing stochastic optimization, i.e. "training the neural networks", I will often process the "training set" many times. This data set consists of thousands to hundreds of thousands of input-output pairs in which both the input and the output are vectors of floating point numbers. I include the option to process the training set in a random order each time it is used. To do this random processing, I implemented the random permutation in LabVIEW.
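
A rough Python sketch of that usage pattern, for readers without LabVIEW. The function name iterate_epochs and its parameters are hypothetical, and the actual training step is left out; the point is only that the index array is shuffled while the data stays where it is.

```python
import random

def iterate_epochs(inputs, targets, epochs, shuffle=True, seed=None):
    """Yield (epoch, input, target), visiting every pair exactly once per epoch."""
    rng = random.Random(seed)
    order = list(range(len(inputs)))   # indices into the training set
    for epoch in range(epochs):
        if shuffle:
            rng.shuffle(order)         # new random order on every pass
        for k in order:
            yield epoch, inputs[k], targets[k]

# Tiny made-up training set: each target is twice its input.
xs = [[0.0], [1.0], [2.0]]
ys = [[0.0], [2.0], [4.0]]
for epoch, x, y in iterate_epochs(xs, ys, epochs=2, seed=1):
    pass  # a real application would update the network here
```

Only the small integer index array is permuted, so the large floating point vectors are never copied or moved.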

#5 Gary Rubin

Gary Rubin

    The 500 club

  • Members
  • PipPipPipPipPip
  • 612 posts
  • Location:Northern Virginia, USA
  • Version:LabVIEW 8.6
  • Since:1997

Posted 27 October 2006 - 01:38 PM

I ran this on both LV 7.1 and 8.2 with similar behavior. I wonder if it's CPU-dependent somehow -- mine's an AMD Athlon XP...

Maybe EVERYBODY gets to be right! :thumbup:

-Kevin P.

Good point. The numbers I put in the spreadsheet were derived from my IBM Celeron laptop. My P4 desktop seems to execute the implicit faster.

#6 Kevin P

Kevin P

    Very Active

  • Validating
  • PipPipPip
  • 60 posts

Posted 27 October 2006 - 01:33 PM

Depending on the original and final data types, I saw that the explicit coercion was between 5 and 35% faster than the automatic coercion (i.e. coercion dot). See attached data excel doc.


Hmmm, curiouser and curiouser...

I've only toyed around briefly so I don't have systematic charts for all the variations. But I kept seeing smaller (faster) times for Frame 1, the implicit coercion. For example, by simply taking the code as posted, enabling auto-indexing on the array at the For Loop boundaries, and making the input array large enough to matter, I got the screenshot below.

I ran this on both LV 7.1 and 8.2 with similar behavior. I wonder if it's CPU-dependent somehow -- mine's an AMD Athlon XP...

Maybe EVERYBODY gets to be right! :thumbup:

-Kevin P.

Posted Image



#7 Aristos Queue

Aristos Queue

    LV R&D: I write C++/# so you don't have to.

  • Premium Member
  • 2,620 posts
  • Location:Austin, TX
  • Version:LabVIEW 2011
  • Since:2000

Posted 26 October 2006 - 12:13 AM

:unsure: Can we please get a definitive answer from NI on this one?

It should help if you have multiple coercion dots on the same wire.
:unsure: This answer is not definitive... I know that the above is one situation where explicit coercion helps. There may be others.

#8 Gary Rubin

Gary Rubin

    The 500 club

  • Members
  • PipPipPipPipPip
  • 612 posts
  • Location:Northern Virginia, USA
  • Version:LabVIEW 8.6
  • Since:1997

Posted 26 October 2006 - 01:35 AM

:unsure: This answer is not definitive... I know that the above is one situation where explicit coercion helps. There may be others.

Hmm.... I decided to check this out a bit more, using variations of the attached vi.
First of all, for this particular operation, I found that the explicit coercion was faster. In my test, I always coerced a scalar. Maybe tomorrow, I'll try coercing an array.

I noticed that the speed difference between the explicit and automatic has a lot to do with the type of coercion being done. It appears that what you're coercing to has more of an impact than what you're coercing from.
Depending on the original and final data types, I saw that the explicit coercion was between 5 and 35% faster than the automatic coercion (i.e. coercion dot). See attached data excel doc.

Download File:post-4344-1161823302.zip

#9 syrus

syrus

    Active

  • Members
  • Pip
  • 24 posts

Posted 21 October 2006 - 01:14 AM

Posted Image

File Name: Random permutation
File Submitter: syrus
File Submitted: 20 Oct 2006
File Updated: 23 Oct 2006
File Category: General

This SubVI takes an assumed-to-be-positive I32 integer n as input and generates a uniformly random permutation of the integers from 0 to n-1 as output. It is equivalent in function to the 'randperm' command in MATLAB. Note that this SubVI does not perform any input validation or error checking.

Click here to download this file
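
Since the VI itself is graphical, here is a text sketch of what such a function might look like in Python. It assumes an in-place swap loop in the style of the Fisher-Yates shuffle, which is a standard way to get a uniform permutation; the thread does not confirm that the VI is built exactly this way, and the name randperm is borrowed from MATLAB for illustration only.

```python
import random

def randperm(n, rng=random):
    """Return a uniformly random permutation of the integers 0..n-1."""
    perm = list(range(n))              # the only array allocated
    for i in range(n - 1, 0, -1):      # walk from the last slot down to slot 1
        j = rng.randrange(i + 1)       # uniform integer in [0, i]
        perm[i], perm[j] = perm[j], perm[i]   # swap in place
    return perm

print(randperm(8))   # some ordering of 0..7; differs from run to run
```

Like the VI as described, the sketch does no input checking; a non-positive n simply yields an empty list here.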

#10 Aristos Queue

Aristos Queue

    LV R&D: I write C++/# so you don't have to.

  • Premium Member
  • 2,620 posts
  • Location:Austin, TX
  • Version:LabVIEW 2011
  • Since:2000

Posted 21 October 2006 - 08:02 PM

This SubVI takes an assumed-to-be-positive I32 integer n as input
....
this SubVI does not perform any input validation or error checking.

Instead of "assumed to be positive I32", why not use an unsigned 32-bit? Then you don't have to worry about someone passing a negative.

#11 Michael Aivaliotis

Michael Aivaliotis

    MindFreak

  • JKI
  • 2,662 posts
  • Version:LabVIEW 2012
  • Since:1994

Posted 23 October 2006 - 12:21 AM

Instead of "assumed to be positive I32", why not use an unsigned 32-bit? Then you don't have to worry about someone passing a negative.

Sounds like a good reason for a 1.0.1 release. :)
Thank You
Michael Aivaliotis

VI Shots

#12 Mellroth

Mellroth

    The 500 club

  • Members
  • PipPipPipPipPip
  • 535 posts
  • Version:LabVIEW 2011
  • Since:1995

Posted 23 October 2006 - 06:30 AM

...It is equivalent in function to the 'randperm' command in MATLAB...

I think that the randperm function in MATLAB generates numbers between 1 and N.
Maybe add an option to select if the sequence is zero indexed or not?

/J

#13 syrus

syrus

    Active

  • Members
  • Pip
  • 24 posts

Posted 23 October 2006 - 07:32 PM

Instead of "assumed to be positive I32", why not use an unsigned 32-bit? Then you don't have to worry about someone passing a negative.

In my applications, this function is used to generate an array of indices that, in turn, are used to randomize the order in which data is processed from another array. I believe that LabVIEW coerces array indices to I32, so I want to use I32 for the output array. A question then is whether to use U32 for the input which could theoretically cause a problem if the user decides to input a size larger than 2147483647. The answer to that question is no because a value of 2147483647 generates a "Memory Full" error in LabVIEW, so this issue is irrelevant until the 64-bit version of LabVIEW is released. :unsure:
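
The "Memory Full" remark can be checked with quick arithmetic, sketched here in Python. It assumes 4 bytes per I32 element and ignores any array overhead.

```python
I32_MAX = 2**31 - 1        # 2147483647, the largest I32 value
BYTES_PER_I32 = 4          # assumed storage per element

bytes_needed = I32_MAX * BYTES_PER_I32
gib_needed = bytes_needed / 2**30
print(f"{bytes_needed} bytes = {gib_needed:.2f} GiB")   # 8589934588 bytes = 8.00 GiB
```

An output array of that size needs roughly 8 GiB, well beyond the just-under-2GB per-instance limit mentioned earlier in the thread, so the input type cannot be the limiting factor on a 32-bit LabVIEW.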

I think that the randperm function in MATLAB generates numbers between 1 and N.
Maybe add an option to select if the sequence is zero indexed or not?

/J

Yep. MATLAB does generate numbers between 1 and N because MATLAB indexes arrays starting with 1 instead of 0. I can add a recommended boolean input to switch the indexing to 1...N.

Another non-trivial improvement could be to make the VI polymorphic allowing any compatible integer type to be used for both input and output (I32, I64, U32, U64, I8, U16, etc.), but this VI is simple enough that the end user can modify the types and the range to match their application.

In my opinion, this VI is most useful to demonstrate an efficient way to randomize an array in-place in LabVIEW, a pattern that comes up once in a while.

I'm going to wait a while to allow further discussion before I incorporate changes. :unsure:
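
A possible shape for the proposed indexing switch, sketched in Python. The names randperm_based and one_based are hypothetical, and Python's built-in shuffle is used in place of the VI's own loop.

```python
import random

def randperm_based(n, one_based=False, rng=random):
    """Random permutation of 0..n-1, or of 1..n when one_based is True."""
    base = 1 if one_based else 0
    perm = list(range(base, base + n))   # 0..n-1 or 1..n
    rng.shuffle(perm)                    # in-place shuffle
    return perm

zero_style = randperm_based(5)                    # LabVIEW-style indices 0..4
matlab_style = randperm_based(5, one_based=True)  # MATLAB-style values 1..5
```

Defaulting the flag to the zero-based behavior would keep existing callers unchanged.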

#14 Gary Rubin

Gary Rubin

    The 500 club

  • Members
  • PipPipPipPipPip
  • 612 posts
  • Location:Northern Virginia, USA
  • Version:LabVIEW 8.6
  • Since:1997

Posted 23 October 2006 - 07:35 PM

Forgive me if I'm being dense here, but how is that different from doing this?

Posted Image



#15 Mellroth

Mellroth

    The 500 club

  • Members
  • PipPipPipPipPip
  • 535 posts
  • Version:LabVIEW 2011
  • Since:1995

Posted 23 October 2006 - 07:45 PM

Yep. MATLAB does generate numbers between 1 and N because MATLAB indexes arrays starting with 1 instead of 0. I can add a recommended boolean input to switch the indexing to 1...N.

Sounds good, and it was a very nice implementation.
If you are going to do a change, it is enough to use one "index array" node plus one "replace array subset" node.
Then you could add a "convert to I32" before feeding the random number to the array functions, this way you will only have one conversion from DBL to I32. Just to make the code even cleaner (IMHO).

/J

#16 syrus

syrus

    Active

  • Members
  • Pip
  • 24 posts

Posted 23 October 2006 - 08:34 PM

Forgive me if I'm being dense here, but how is that different from doing this?

Posted Image

Your solution requires at least three times as much allocated memory. It might also be slower (which one might or might not care about). I've learned to optimize for memory allocation in LabVIEW to get maximum performance out of my applications.
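
For comparison, a sort-based permutation can be sketched in Python as below. The posted image is not reproduced in this text, so this assumes the diagram generates an array of random keys and sorts the indices by them (a later reply calls it the "Sort 1D Array approach"); the name randperm_by_sort is made up.

```python
import random

def randperm_by_sort(n, rng=random):
    """Random permutation of 0..n-1 obtained by sorting indices on random keys."""
    keys = [rng.random() for _ in range(n)]          # extra array of n floats
    return sorted(range(n), key=keys.__getitem__)    # second array: sorted indices

print(randperm_by_sort(8))   # some ordering of 0..7
```

This version holds a key array in addition to the result and does an O(n log n) sort, whereas an in-place swap loop needs one array and O(n) work. The exact memory and speed ratios depend on the implementation.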

#17 Gary Rubin

Gary Rubin

    The 500 club

  • Members
  • PipPipPipPipPip
  • 612 posts
  • Location:Northern Virginia, USA
  • Version:LabVIEW 8.6
  • Since:1997

Posted 23 October 2006 - 08:39 PM

Your solution requires at least three times as much allocated memory. It might also be slower (which one might or might not care about). I've learned to optimize for memory allocation in LabVIEW to get maximum performance out of my applications.


Hmmm... Any chance you could post your VI saved as 7.1?
Thanks,
Gary

#18 Darren

Darren

    Extremely Active

  • NI
  • 398 posts
  • Location:Austin, TX
  • Version:LabVIEW 2012
  • Since:1999

Posted 23 October 2006 - 08:41 PM

Forgive me if I'm being dense here, but how is that different from doing this?

Posted Image


The difference that I can see is that the submitted code runs 3-4 times faster than the Sort 1D Array approach.

-D

#19 syrus

syrus

    Active

  • Members
  • Pip
  • 24 posts

Posted 23 October 2006 - 08:46 PM

Hmmm... Any chance you could post your VI saved as 7.1?
Thanks,
Gary

I'll try. I do still have 8.0.1 and 7.1 running on my primary workstation. I'll have to get back to this later--I've got a few errands and meetings coming up this afternoon. --Syrus

#20 Gary Rubin

Gary Rubin

    The 500 club

  • Members
  • PipPipPipPipPip
  • 612 posts
  • Location:Northern Virginia, USA
  • Version:LabVIEW 8.6
  • Since:1997

Posted 23 October 2006 - 08:48 PM

I'll try. I do still have 8.0.1 and 7.1 running on my primary workstation. I'll have to get back to this later--I've got a few errands and meetings coming up this afternoon. --Syrus

Thanks,
Gary