Converted DOCX (MS Word 2007) files in Tag Editor or Trados 2009
Hi colleagues! I'm having trouble opening MS Word files in either TagEditor or Trados Studio 2009. Neither of them accepts the old MS Word 2003 *.doc format. When I upgrade the files to DOCX (MS Word 2007), the segments open with what I would call micro-segmentation (usually showing tags letter by letter).
These files come from previous conversions made with many different converters, such as Solid, PDFTools or Gaaiho. Some of them were even converted straight into DOCX format.
Does anyone have a clue about what might be going on and how to solve it? Thanks in advance for your help!
Re: Converted DOCX (MS Word 2007) files in Tag Editor or Trados 2009
Does the document contain complex formatting? Is it breaking segments because of line breaks? Removing superfluous formatting from .docx files can be tricky, even when there is no visible inline formatting.
I would clear all the styles in Word, save the file and try again. Later you can reformat the file.
Another unconventional solution is to open the file with OpenOffice and save it as a Word file. This method is known to clean up tag soup.
You can also find solutions here (they are meant for OmegaT, but they should work for Trados too):
OmegaT
Re: Converted DOCX (MS Word 2007) files in Tag Editor or Trados 2009
Thanks, my friend, for your creative solution! I am not OmegaT-savvy, but I have some colleagues with experience using that tool. I will ask them to give me a hand and walk me through it.
Meanwhile, after some minutes of testing, I have discovered that most of this problem is sorted out by selecting all the text (Ctrl+E), opening the Font dialog and then clicking the Character Spacing tab in that pop-up window. After choosing the standard values in all the drop-down menus (scale = 100%, normal spacing, normal position) and clicking Accept, most of this intra-word tagging seems to disappear.
Unfortunately, although these documents are only 2 or 3 pages long, they have a form-like structure with complex tables, bullets, superscripts, bold titles, etc. Hence the need to be especially careful not to touch any formatting on the first "Font" tab (i.e. font type, size, and formatting features such as strikethrough, underlining, superscript or subscript).
By doing so, I have managed to reduce the final fine-tuning to just a couple of paragraphs in the header.
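For anyone who would rather script this than click through the dialog, here is a minimal Python sketch of the same reset. It is only an assumption about how the dialog settings are stored: in a .docx, the scale, spacing and position values normally appear as `w:w`, `w:spacing` and `w:position` elements inside the run properties of `word/document.xml`. The function names `reset_character_spacing` and `clean_docx` are made up for this example, and I have not checked how Trados segments the result, so work on a copy and re-save it in Word afterwards.

```python
import re
import zipfile

# Run-level settings controlled by Word's "Character Spacing" tab.
# Assumption: run-level <w:spacing> carries only w:val, while paragraph
# spacing uses w:before / w:after / w:line, so it is not matched here.
SPACING_TAGS = re.compile(r'<w:(?:spacing|w|position)\s+w:val="[^"]*"\s*/>')


def reset_character_spacing(xml_text):
    """Return the XML with run-level scale, spacing and position removed."""
    return SPACING_TAGS.sub("", xml_text)


def clean_docx(src, dst):
    """Copy a .docx, resetting character spacing in the main body only.

    Fonts, sizes, bold, underline, super/subscript are left untouched.
    """
    with zipfile.ZipFile(src) as zin, \
            zipfile.ZipFile(dst, "w", zipfile.ZIP_DEFLATED) as zout:
        for item in zin.infolist():
            data = zin.read(item.filename)
            if item.filename == "word/document.xml":
                text = data.decode("utf-8")
                data = reset_character_spacing(text).encode("utf-8")
            zout.writestr(item, data)
```

Usage would be something like `clean_docx("form.docx", "form_clean.docx")`, with both file names being placeholders.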
Re: Converted DOCX (MS Word 2007) files in Tag Editor or Trados 2009
Quote: Originally Posted by gentle
Meanwhile, after some minutes of testing, I have discovered that most of this problem is sorted out by selecting all the text (Ctrl+E), opening the Font dialog and then clicking the Character Spacing tab in that pop-up window. After choosing the standard values in all the drop-down menus (scale = 100%, normal spacing, normal position) and clicking Accept, most of this intra-word tagging seems to disappear.
Hi gentle,
I tried this with one short document and it worked all right. But then I tried it on a larger document (a PDF exported from InDesign and converted to Word), and the result was all the words run together with no spacing at all. The weirdest thing was the different word counts I got from different tools: 17,000 words with Trados and 3,000 words with MS Word. Do you know why this can happen?
Thanks!
Re: Converted DOCX (MS Word 2007) files in Tag Editor or Trados 2009
Mmmmm... that seems to be a different issue! ;-) In any case, try saving that MS Word DOC file as RTF and putting it through Trados again. Sometimes this brings the Trados word count closer to the MS Word one (although not always exactly the same).
Re: Converted DOCX (MS Word 2007) files in Tag Editor or Trados 2009
Here goes the saga of my post... I have finally found what was causing so much trouble and still keeping a few paragraphs split into micro-segments: headers and footers, text boxes and all those features that my reformatting did not reach when I selected all the content to fix the character spacing. I even had to check all the different footers, as they were in different sections although they looked exactly the same. But I confirm that once you sort this out, you can open the file in Trados 2009 as a normal one. ;-)
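In case it saves someone the hunt through every section, here is a rough Python sketch that lists which parts of a .docx still carry character spacing overrides. It rests on assumptions: that headers and footers use Word's usual part names (`word/header1.xml`, `word/footer2.xml`, and so on, one part per distinct header or footer), and that the overrides are stored as `w:spacing`, `w:w` and `w:position` elements with a `w:val` attribute. Text boxes have no part of their own; they sit inside whichever part anchors them. The function name `parts_with_spacing_overrides` is invented for this example.

```python
import re
import zipfile

# Assumed part names: the main body plus Word's usual header/footer files.
TEXT_PARTS = re.compile(r"^word/(?:document|header\d*|footer\d*)\.xml$")
# Run-level scale, spacing and position settings (assumed markup).
SPACING_TAGS = re.compile(r'<w:(?:spacing|w|position)\s+w:val="[^"]*"\s*/>')


def parts_with_spacing_overrides(docx):
    """Map each body/header/footer part to its number of spacing overrides.

    Parts without any override are left out of the result.
    """
    counts = {}
    with zipfile.ZipFile(docx) as archive:
        for name in archive.namelist():
            if not TEXT_PARTS.match(name):
                continue
            xml_text = archive.read(name).decode("utf-8")
            hits = len(SPACING_TAGS.findall(xml_text))
            if hits:
                counts[name] = hits
    return counts
```

Any header or footer part that shows up in the result is one that the select-all pass in the main body did not reach.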