Thread: Very big files and the Tag Editor

    Hi there

    Last week we were given a very big file to translate: 18K lines, 1.2 million words. It was in Excel 2007 (xlsx) format. When we tried to open it with the Tag Editor, it seemed to hang: it kept loading for ages, and we ran out of patience before it finished. So we decided to split it into manageable chunks. "What a drag!" I hear you say? Not at all.

    First we saved it as a plain CSV file. Then we used the Unix command "split" to ... well ... to split it into files with a fixed number of lines. There are other ways to split a file (by size, for instance); just type "man split" in a terminal.

    pabloa$ split -d -l 1234 big_file.csv small_file
    pabloa$ wc -l small_file*
       1234 small_file00
       1234 small_file01
       1234 small_file02
       1234 small_file03
       1234 small_file04
       1234 small_file05
       1234 small_file06
       1234 small_file07
       1234 small_file08
       1234 small_file09
       1234 small_file10
       1234 small_file11
       1234 small_file12
       1234 small_file13
       1139 small_file14
      18415 total
    The -l 1234 option sets the number of lines per file. The -d option gives numeric suffixes instead of letters; without it, the files would be called small_fileaa, small_fileab, etc. The last argument is the prefix we want the files to have. The last file gets the remaining lines, as can be seen here (1139 in this case).
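
Before sending the chunks off, it may be worth checking that nothing got lost on the way. Here is a minimal sketch of that check; "split_and_check" is a made-up name, not part of the original workflow, and it assumes GNU split (for -d) and that no other files in the folder start with the same prefix.

```shell
# Hypothetical helper: split a file into numbered chunks, then verify that
# the chunks glued back together are byte-for-byte identical to the original.
# $1 = file to split, $2 = lines per chunk, $3 = prefix for the chunk names
split_and_check() {
    split -d -l "$2" "$1" "$3" || return 1
    # The shell sorts the glob, so the chunks are concatenated in order (00, 01, ...)
    cat "$3"* | cmp -s - "$1"
}
```

It would be called like this: split_and_check big_file.csv 1234 small_file && echo "chunks OK". Note that the check only proves the bytes survived. If any cell in the CSV contains a line break, a fixed line count can still cut a record in half, so that is worth a look before splitting.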

    Then, to convert the chunks back to xlsx, I created an Excel macro from the output of "for x in small_file*; do csv2xlsx $x; done", where "csv2xlsx" is the following script:

    # xlOpenXMLWorkbook is the xlsx format; xlCSV would just write CSV again
    echo "    Workbooks.Open Filename:=\"$source\""
    echo "    ActiveWorkbook.SaveAs Filename:=\"$target\" _"
    echo "        , FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False"
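    The variables "$source" and "$target" hold the Windows-style paths of the CSV file to open and of the xlsx file to save. They are built with "pwd_windows", our old friend, described in a previous post: http://www.english-spanish-translato...ows-style.html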
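
For anyone who wants the whole macro printed in one go, here is a rough sketch of how that could look. It is not the original csv2xlsx script: "make_macro" and "ConvertChunks" are made-up names, and the Windows folder is passed in by hand as the first argument instead of coming from pwd_windows. It uses printf rather than echo because some shells let echo interpret backslashes, which Windows paths are full of.

```shell
# Hypothetical sketch: print a complete VBA macro that opens each CSV chunk
# and saves it as xlsx next to it.
# $1 = Windows-style folder holding the chunks (an assumption, typed by hand)
# remaining arguments = chunk file names
make_macro() {
    win_dir="$1"
    shift
    printf 'Sub ConvertChunks()\n'
    for f in "$@"; do
        # In a printf format, \\ prints a single backslash
        printf '    Workbooks.Open Filename:="%s\\%s"\n' "$win_dir" "$f"
        printf '    ActiveWorkbook.SaveAs Filename:="%s\\%s.xlsx" _\n' "$win_dir" "$f"
        printf '        , FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False\n'
        # Close each workbook so they do not pile up in Excel
        printf '    ActiveWorkbook.Close SaveChanges:=False\n'
    done
    printf 'End Sub\n'
}
```

Something like make_macro 'C:\work' small_file* > macro.txt (the folder is only an example) would give text that can be pasted into the Excel macro editor.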

    Last edited by pabloa; 09-06-2011 at 08:58 AM.
