
Read national text from MS Word document


_Y_


I need to "extract" text from an MS Word document (no formatting, just plain text). Unfortunately, the methods I have found return only a conventional LabVIEW string, in which all national characters and symbols are lost: each such character is replaced with the code of a question mark.

Is there any way to read national text from MS Word? I would be happy to get it in any format, for example as a U16 array of Unicode symbols, or a U8 array with two values per symbol, or any other. I would also be happy with either Word format: doc or docx.


Thank you


Can you interact with Word (using ActiveX) and save as a text file? Then you can read the text file as pure bytes and interpret as UTF-8. I have done something similar whereby I allow a GUI to be translated "on-the-fly" into different languages, stored as UTF-8 text files.
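To illustrate the "read the bytes and interpret as UTF-8" step outside LabVIEW, here is a minimal Python sketch. It assumes Word has already saved the document as a UTF-8 text file; the function names are made up for the illustration, and the sample bytes simply stand in for the file contents. It also shows how to get the U16 array asked for in the original question.

```python
import struct
from typing import List


def decode_utf8_bytes(raw: bytes) -> str:
    """Interpret raw file bytes as UTF-8 text.

    "utf-8-sig" also strips a leading byte-order mark if the file has one.
    """
    return raw.decode("utf-8-sig")


def to_u16_array(text: str) -> List[int]:
    """Return the text as a list of UTF-16 code units (one U16 per unit).

    Characters outside the Basic Multilingual Plane become two units
    (a surrogate pair), as in any UTF-16 string.
    """
    data = text.encode("utf-16-le")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


# Stand-in for the bytes read from the saved text file (hypothetical content).
raw = "Привет, Wörld".encode("utf-8")

text = decode_utf8_bytes(raw)
codes = to_u16_array(text)
print(text)
print(codes)
```

In LabVIEW terms, `raw` corresponds to the U8 array read from the file, and `codes` to the U16 array of Unicode values.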


Oh. I thought you had already obtained the text, since you stated it looks like a series of question marks (so it just needed converting).

LabVIEW ships with some automation examples. The one below (from the examples) interacts with Excel, but the principle is the same. I couldn't find any examples for MS Word without the Report Toolkit, because most interaction with MS products generally goes the other way: writing reports.

I don't have M$ products installed to knock up a quick example, unfortunately.

Untitled.png


