save pdf file from website


Posted

I need to save multiple files from a site. Every file has to be downloaded and saved in a folder.

I tried ActiveX and some .vi files found on the net, but they can only save text or HTML.

For me it is fine to download the raw PDF file and open it with Reader afterwards.

I already tried with WinInet, but that was not a good idea.

Thank you

lelluzo

Posted

QUOTE(MikaelH @ Jul 25 2007, 06:17 PM)

Just use the DataSocket Read, and save the file.

I really like this one :thumbup:

The cool thing is that you can download binary files such as pictures, movies, pdfs, etc. using this method.
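
For readers working outside LabVIEW, the same "read the URL, save the bytes" idea can be sketched in Python. This is only an illustration, not the DataSocket method itself: the function name `save_urls` and the rule for picking a file name from the URL are assumptions made for the example.

```python
import os
import urllib.request
from urllib.parse import urlparse, unquote


def save_urls(urls, folder):
    """Download each URL and save the raw bytes into `folder`.

    Returns the list of file paths written. The file name is taken from the
    last segment of the URL path (an assumption; real sites may need more care).
    """
    os.makedirs(folder, exist_ok=True)
    saved = []
    for url in urls:
        name = os.path.basename(unquote(urlparse(url).path)) or "download.bin"
        target = os.path.join(folder, name)
        # urlopen returns the body as bytes; no text decoding happens here.
        with urllib.request.urlopen(url) as response:
            data = response.read()
        # "wb" writes the bytes unchanged, which is what a PDF needs.
        with open(target, "wb") as handle:
            handle.write(data)
        saved.append(target)
    return saved
```

Calling `save_urls(list_of_pdf_urls, "downloads")` would then leave one file per URL in the `downloads` folder, ready to be opened with Reader later.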

*Note! This works great if you're saving a text file (txt, html, xml, etc.), but if you're saving a binary file (jpg, pdf, etc.) you'll have to write the string with the "Write to Binary File" primitive rather than the "Write to Text File" primitive, or you'll get data corruption. Just because you are able to download it through the datasocket using the [text] format doesn't mean it's a text file...
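
One plausible cause of that corruption is end-of-line conversion in text mode (an assumption; the thread does not say what exactly goes wrong). The Python sketch below uses made-up bytes and Python's own file modes, not the LabVIEW primitives, to show how a text-mode write can alter a binary payload while a binary write leaves it intact.

```python
import os
import tempfile

# Made-up bytes standing in for a small binary file; note the bare \n bytes.
payload = b"%PDF-1.4\n\x00\xff\n%%EOF"

folder = tempfile.mkdtemp()
binary_path = os.path.join(folder, "binary.pdf")
text_path = os.path.join(folder, "text.pdf")

# Binary write: every byte goes to disk unchanged.
with open(binary_path, "wb") as handle:
    handle.write(payload)

# Text write with Windows-style EOL conversion: each \n is written as \r\n.
with open(text_path, "w", encoding="latin-1", newline="\r\n") as handle:
    handle.write(payload.decode("latin-1"))

with open(binary_path, "rb") as handle:
    binary_result = handle.read()
with open(text_path, "rb") as handle:
    text_result = handle.read()

print(binary_result == payload)          # the binary copy matches the payload
print(text_result == payload)            # the text copy does not
print(len(text_result) - len(payload))   # bytes added by the EOL conversion
```

Even a couple of extra bytes are enough to break the offsets a PDF reader relies on, which is why the binary write matters.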

Question: does anyone know of a complete list of URL format specifiers ([text], [html], etc.) somewhere I can stare at? I'm also curious what is handling these specifiers... is it your default browser, or some core TCP/IP functions of the OS?

Posted

QUOTE(orko @ Jul 26 2007, 11:52 AM)

Does anyone know of a complete list of URL format specifiers ([text], [html], etc.) somewhere I can stare at? I also am curious what is handling these specifiers... is it your default browser, or some core TCP/IP functions of the OS?

If I understand your question, it's your browser that is handling this; they are referred to as MIME types.

What does the internet have to do with Marcel Marceau? I have no idea :P

(OK I was a mime; and can still do the walk and box and rope shtick.)

Posted

QUOTE(LV Punk @ Jul 27 2007, 03:25 AM)

(OK I was a mime; and can still do the walk and box and rope shtick.)

This I gotta see! Are you coming to NI-Week? :D

Posted

QUOTE(LV Punk @ Jul 26 2007, 10:25 AM)

Okay, thanks. I understand mime types, and how they are used in HTTP headers, but I'm a little confused on how they are used in the datasocket implementation.

Am I wrong in assuming that the Datasocket Read VI only supports a limited subset of these MIME types? Which ones does it support, if this is true? I know [text] and [html] work, but are there others?

One thing I noticed is that even though "html" is accepted, it isn't a real MIME type, since the real one is "text/html". In fact "text" isn't a valid MIME type either; it's "text/plain". I'm assuming that the Datasocket VIs are coded to only allow certain "keywords" of supported types rather than MIME types as valid in the URL. I'm not even sure it really handles "[text]" and "[html]" differently... does it?

Posted

QUOTE(orko @ Jul 26 2007, 10:18 PM)

One thing I noticed is that even though "html" is accepted, it isn't a real MIME type, since the real one is "text/html". I'm assuming that the Datasocket VIs are coded to only allow certain "keywords" rather than MIME types as valid in the URL.

The way I understand it, the [text] part has nothing to do with MIME types and simply tells the primitive to retrieve whatever is at that URL as binary data represented as a string.

Posted

QUOTE(yen @ Jul 26 2007, 12:23 PM)

binary data represented as a string

...as opposed to what? Are there other "[tag]" operations that can be done with the Datasocket primitives to obtain data in another form?

I'm confused because the "type" input of the primitive is what I thought defined the data type.

Posted

The type input defines the LabVIEW data type (e.g. you might be sent a 1D array of DBL), but the data you download from the web is not a LabVIEW string. You just want it as a LabVIEW string because that's how your file primitives let you save binary data, so you have to tell it "don't do any type checking and conversion. Just get me the data".
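
The difference between "interpret the bytes as a typed value" and "just hand me the bytes" can be illustrated with a small Python sketch. It says nothing about how DataSocket works internally; the sixteen bytes and the big-endian layout are made up for the example.

```python
import struct

# Sixteen bytes as they might arrive over the wire (made up for illustration).
raw = struct.pack(">2d", 1.5, -2.0)

# Typed view: interpret the bytes as two big-endian 64-bit floats.
as_doubles = struct.unpack(">2d", raw)

# Untyped view: keep the bytes exactly as received, ready to write to disk.
as_bytes = bytes(raw)

print(as_doubles)
print(len(as_bytes))
```

For a downloaded PDF only the untyped view makes sense, since there is no numeric type to convert to.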

Posted

Hmm... that's the way I understood the type input. The "[text]" or "[html]" endings on the URLs are still a mystery to me, since they don't seem to behave any differently from one another.

This test returned true (they were equal) no matter what type of file (jpg,pdf,html,txt) I was trying to download from the internet:

http://forums.lavag.org/index.php?act=attach&type=post&id=6466

Since I can't seem to find any documentation on the differences between these two, I'll just assume they react exactly the same. I was just wondering if there were other capabilities (other "[tags]") inside the Datasocket Read primitive that behaved differently.

Thanks for your time, and sorry for hijacking this thread a bit ;)

Posted

QUOTE(orko @ Jul 26 2007, 03:18 PM)

I'm not even sure it really handles "[text]" and "[html]" differently... does it?

If you try to retrieve a PDF file (or other binary file) without the [text] suffix, the Datasocket Read function throws an undefined error (-2146797887).

It's a wild guess on my part, but maybe the text/html tags have something to do with identifying extended character sets for non-English versions of LabVIEW.

It would be nice if the Datasocket read returned the content-type as a string; then you could use a case structure to determine how to process (present) the data. If the content-type was "image/jpeg", you could convert it to a pixmap like crelf did here, or you could write the content (string) to a file and then pass it to my IE Dialog box located here, or use System Exec to open the file.
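
The "case on the content-type" idea could look roughly like the Python sketch below. The function name `choose_action` and the action labels are hypothetical and only stand in for the processing steps mentioned above; nothing here is part of the DataSocket API.

```python
def choose_action(content_type):
    """Map a Content-Type header value to a hypothetical processing step."""
    # Drop parameters such as "; charset=utf-8" and normalise the case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    if media_type.startswith("image/"):
        return "convert to pixmap"
    if media_type.startswith("text/"):
        return "show as text"
    if media_type == "application/pdf":
        return "save to file and open externally"
    # Anything unrecognised is simply written to disk.
    return "save to file"


print(choose_action("image/jpeg"))
print(choose_action("application/pdf"))
```

In Python's `urllib`, for example, the header value could come from `response.headers.get("Content-Type")` on the object returned by `urllib.request.urlopen`; whether DataSocket exposes anything similar is not something this thread settles.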

Posted

QUOTE(LV Punk @ Jul 27 2007, 04:13 PM)

It would be nice if the Datasocket read returned the content-type as a string;

I don't think that can be done. MIME types are relevant if you're handling binary data as an attachment or embedded content. In this case, you're simply asking the server for a file, so there should not be any MIME data.

Posted

QUOTE(orko @ Jul 27 2007, 06:56 PM)

That's one of the reasons I asked if the default browser was handling the requests for files (vs. NI using native TCP and parsing out the headers themselves). If they are just passing the request to IE or Netscape, then they might not have access to all of these server response headers in the Datasocket VIs themselves.

I would highly doubt that any external browser is used, but I don't know much about this. It just would not seem to make much sense.
