Jump to content

Read national text from MS Word document


_Y_

Recommended Posts

I need to "extract" text from MS Word document (no formatting, just plain text). Unfortunately, methods that I found allow to get only conventional LabVIEW string where all national characters and symbols are lost. Each such a character is replaced with code of question mark.

 

Is there any way to read national text from MS Word? I would be happy to get it in any format; for example as U16 array of Unicode symbols, or U8 array with two values per symbol, or any other. I would also be happy with any Word format: doc or docx.

 

Thank you

Edited by _Y_
Link to comment

Can you interact with Word (using ActiveX) and save as a text file? Then you can read the text file as pure bytes and interpret as UTF-8. I have done something similar whereby I allow a GUI to be translated "on-the-fly" into different languages, stored as UTF-8 text files.

Edited by Neil Pate
Link to comment

Oh.I thought you had already obtained the text since you stated it looks like a series of question marks (so just needed to convert it)

LabVIEW is shipped with some automation examples. The one below (from the examples) interacts with Excel but the principle is the same. I couldn't find any examples of Ms Word without the Report Toolkit because most interaction with MS products is generally the other way - writing reports. 

I don't have M$ products installed to knock up a quick example, unfortunately.

Untitled.png

Link to comment

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

By using this site, you agree to our Terms of Use.