Last week we were given a very big file to translate: 18K lines, 1.2 million words. It was in Excel 2007 (xlsx) format. When we tried to open it with the Tag Editor, it just sat there for ages and showed no sign of finishing. At least, we didn't have the patience to wait. So we decided to split it into manageable chunks. "What a drag!" I hear you say? Not at all.
First we saved it as a simple csv. Then we used the Unix command "split" to ... well ... to split it into files with a fixed number of lines. It can split in other ways too; just type "man split" in a terminal.
In the command below, -l sets the number of lines per file, and -d is used to get numbers instead of letters as suffixes. Without -d it creates files called small_fileaa, small_fileab, etc. The last word is the prefix we want the files to have. The last file gets the remainder of the lines, as the line counts from "wc -l" show.
pabloa$ split -d -l 1234 big_file.csv small_file
pabloa$ wc -l small_file*
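To see the remainder behaviour on a small scale, here is a minimal sketch that runs in a scratch directory. The names demo.csv and part_ are made up for illustration, and it assumes a split that supports -d (GNU coreutils does; some older BSD versions do not).

```shell
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Build a 10-line stand-in for the real CSV.
seq 1 10 > demo.csv

# Split into pieces of at most 4 lines, with numeric suffixes (-d).
split -d -l 4 demo.csv part_

# Count the lines in each piece; the last piece holds the remainder.
wc -l part_*
```

With 10 lines and 4 lines per piece, part_00 and part_01 get 4 lines each and part_02 gets the remaining 2.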
And then, to convert them back to xlsx, I created a macro in Excel with the output of "for x in small_file*; do csv2xlsx $x; done" where "csv2xlsx" is the following script:
Here "pwd_windows" is our old friend, described in a previous post: http://www.english-spanish-translato...ows-style.html
echo " Workbooks.Open Filename:=\"$source\""
echo " ActiveWorkbook.SaveAs Filename:=\"$target\" _"
echo " , FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False"
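Only the echo lines of the script are shown here, so the following is a rough sketch of how a complete generator might look, not the original script. Several things in it are assumptions: pwd_windows is replaced by a hypothetical stub that returns a fixed folder (C:\work), the names make_macro and ConvertChunks are invented, the Close statement is an addition so that workbooks do not pile up, and the file format is xlOpenXMLWorkbook on the assumption that the target should be a real xlsx file.

```shell
# Hypothetical stand-in for the pwd_windows helper:
# it simply returns a fixed Windows-style folder.
pwd_windows() {
    printf '%s' 'C:\work'
}

# Print the VBA statements that open one chunk and save it as xlsx.
csv2xlsx() {
    name="$1"
    source="$(pwd_windows)\\${name}"
    target="$(pwd_windows)\\${name}.xlsx"
    # printf with %s keeps the backslashes in the paths untouched.
    printf '    Workbooks.Open Filename:="%s"\n' "$source"
    printf '    ActiveWorkbook.SaveAs Filename:="%s" _\n' "$target"
    printf '        , FileFormat:=xlOpenXMLWorkbook, CreateBackup:=False\n'
    printf '    ActiveWorkbook.Close SaveChanges:=False\n'
}

# Wrap the statements for every chunk in a single macro.
make_macro() {
    echo "Sub ConvertChunks()"
    for x in "$@"; do
        csv2xlsx "$x"
    done
    echo "End Sub"
}

make_macro small_file00 small_file01
```

The output can be pasted into the Excel macro editor as a single Sub. In real use the arguments would come from the shell glob, as in make_macro small_file*.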