Going Rogue: Translating Professionally with a Free CAT Tool (Part 2)

Apologies for the late follow-up with Part 2 of Going Rogue, but hey, better late than never, right? Special thanks to all those who expressed their support for and interest in Part 1; your comments and feedback were more than welcome, and they pushed me to complete this second part.

I should point out that this part won’t be the last one, despite my original plan. It seems the process of handling a real-world translation project using our free tools needed more space. Thus, the final part will be Part 3, which will follow in the coming days (hint: I’ve already drafted most of it…).

As you — hopefully — remember, in Part 1 we established the 5 objectives that kicked off this whole endeavor. In short, we wanted a free CAT Tool that would run on all major Operating Systems and allow a professional translator to deliver his work without major issues. We shortlisted a few free CAT Tools, made our choice and added to the mix a couple of free utilities that were needed to make the process work as expected.

In this part, we’ll kick off the actual process of handling a translation project from a client. Specifically, we’ll see what steps such a process includes, and delve into the first step, which deals with source file preparation. This client has streamlined the translation workflow around SDL Trados Studio. Thus, we’ll be receiving a project in SDL Trados Studio’s Project Package format (.sdlppx).

The Process

With our client established, let’s proceed and see how we should handle this case using our free CAT Tool software, Heartsome Translation Studio. We’ll be following the industry’s standard methodology which consists of:

  1. Source file preparation
  2. Analysis and pre-processing
  3. Translation
  4. Proofreading and spell-checking
  5. Verification checks
  6. Delivery to the client

The client sends in (or hands off) the project in the form of an SDL Trados Studio Project package (.sdlppx). The language pair is English (US) into German (DE). The instructions ask for the delivery (or handback) of an SDL Trados Studio Return package (.sdlrpx) upon completion. Note that our free CAT Tool can neither handle Project packages directly nor create Return packages. So, and this is very important, we must inform the client upfront that we won’t be able to deliver a Return package, but can deliver bilingual SDLXLIFFs and TM/TBX exports, which they can use to update their project. In nearly all cases, the client won’t have a problem with this, but it’s crucial to inform them from the beginning.

So, let’s start the process!

Source File Preparation

Since our chosen free CAT Tool doesn’t support Trados Studio Project packages directly (although it does support the Trados Studio bilingual SDLXLIFF files), we’ll use a few tricks to extract the files we need, namely:

  • SDLXLIFFs (bilingual files)
  • SDLTM (Translation Memory)
  • SDLTB (Terminology Database)

Here are the steps we need to do:

  • First we’ll create a folder structure that will help us deal with the phases that will follow. In the folder you use for your translation work (if you don’t have one, pick a suitable location on your system and create one, e.g., with a name like Projects), create a set of folders/subfolders for the project, including the From_Client and Work folders used in the steps below.
  • In the From_Client folder place the .sdlppx file that was provided by the client.

  • Create a copy of the .sdlppx file and paste it into the Work folder. Change the file extension from .sdlppx to .zip so your system can recognize it as an archive. Afterwards, extract the .zip file (using your preferred archive extractor tool) into the same folder.

  • You should have a new folder with the name of the .zip file. Enter it, and you should see a structure similar to the following:

|   Sample Project EN-DE SDL_Trados-2017618-21h1m9s.sdlproj
|
+---de-DE
|       SamplePhotoPrinter.doc.sdlxliff
|       SamplePresentation.pptx.sdlxliff
|       SecondSample.docx.sdlxliff
|       TryPerfectMatch.doc.sdlxliff
|
+---en-US
|       SamplePhotoPrinter.doc.sdlxliff
|       SamplePresentation.pptx.sdlxliff
|       SecondSample.docx.sdlxliff
|       TryPerfectMatch.doc.sdlxliff
|
+---Reports
|       Analyze Files en-US_de-DE.xml
|
+---Termbases
|       Printer.sdltb
|
\---Tm
        (the project's .sdltm file)

We have 5 folders (de-DE, en-US, Reports, Termbases, Tm) and one project file (.sdlproj). Each folder contains the files we’ll be using, either directly (SDLXLIFFs) or by converting them (SDLTM and SDLTB). Note that we won’t need the .sdlproj file or the contents of the Reports folder (the Reports folder contains the word count analysis of the project in XML format, which is meant to be viewed from within Trados Studio).
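If you'd rather script the copy-and-extract steps above, here is a minimal Python sketch. It relies only on what the manual steps already imply, namely that a .sdlppx file can be opened as a ZIP archive. The function name and the sample paths are my own placeholders, not part of any tool.

```python
import shutil
import zipfile
from pathlib import Path


def unpack_package(package: Path, work_dir: Path) -> Path:
    """Copy a .sdlppx package into work_dir and extract it there.

    Returns the folder that holds the extracted contents.
    """
    work_dir.mkdir(parents=True, exist_ok=True)
    # Work on a copy so the file received from the client stays untouched.
    copy = work_dir / package.name
    shutil.copy2(package, copy)
    # zipfile looks at the content, not the extension, so no rename is needed.
    target = work_dir / package.stem
    with zipfile.ZipFile(copy) as archive:
        archive.extractall(target)
    return target


if __name__ == "__main__":
    # Hypothetical paths; adjust them to your own folder layout.
    package = Path("Projects/From_Client/Sample Project.sdlppx")
    if package.exists():
        extracted = unpack_package(package, Path("Projects/Work"))
        for entry in sorted(extracted.iterdir()):
            print(entry.name)
```

The extracted folder is named after the package, which mirrors what most archive tools do when you extract a .zip file in place.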

Since our target language is German (DE), we’ll be using the SDLXLIFFs in the de-DE folder. The existence of this folder indicates that the client has pre-processed the files, so they contain pre-translated segments (i.e., 100% Matches and/or Perfect Matches) and possibly locked segments. If this folder were missing, it would mean there was no pre-processing, and we would use the SDLXLIFFs in the en-US folder instead.
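The "target folder if it exists, otherwise source folder" rule is easy to express in code. The sketch below is only an illustration; the function name is hypothetical, and the default folder names are simply the ones from this sample project.

```python
from pathlib import Path


def bilingual_files(project_dir: Path, target: str = "de-DE",
                    source: str = "en-US") -> list[Path]:
    """Return the SDLXLIFFs to translate.

    Prefers the target-language folder (pre-processed files) and falls
    back to the source-language folder when the target one is missing.
    """
    target_dir = project_dir / target
    folder = target_dir if target_dir.is_dir() else project_dir / source
    return sorted(folder.glob("*.sdlxliff"))
```

Calling it on the extracted package folder would return the files from de-DE in our case.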

With the bilingual files (SDLXLIFFs) available, we’ll move on and convert the Translation Memory (SDLTM) and Terminology Database (SDLTB) to the corresponding formats that are readable by Heartsome Translation Studio. For this process we’ll use the Trados Studio Resource Converter for converting the SDLTM and SDLTB files, along with Heartsome Translation Studio’s internal TBX Maker tool for completing the SDLTB file conversion.

Here are the required steps for the SDLTM conversion:

  • Run the Trados Studio Resource Converter and select the first button, “Convert SDLTM”.

  • Point the “Open” dialog to the folder in which we extracted the .sdlppx file and, specifically, to the Tm folder.

  • Select the .sdltm file and click on the “Open” button. Once the process completes, a message will appear stating the number of translation units that were converted. Additionally, a .tmx file will be created in the Tm folder.
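As an optional sanity check, you can count the translation units in the resulting .tmx file yourself and compare the number with the one reported by the converter. This is a small sketch that assumes the output is a standard TMX file, in which each translation unit is a tu element; the function name is my own.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def count_translation_units(tmx_path: Path) -> int:
    """Count the <tu> elements in a TMX file."""
    count = 0
    # iterparse streams the file, so large memories don't need to fit in RAM.
    for _event, element in ET.iterparse(tmx_path, events=("end",)):
        # Strip a namespace prefix, in case the file declares one.
        if element.tag.rsplit("}", 1)[-1] == "tu":
            count += 1
            element.clear()  # release the unit we just counted
    return count
```

If the two numbers differ, it's worth re-running the conversion before importing the memory into a project.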

To convert the SDLTB, follow these steps:

  • Run the Trados Studio Resource Converter and select the second button, “Convert SDLTB”.

  • Point the “Open” dialog to the folder in which we extracted the .sdlppx file and, specifically, to the Termbases folder.

  • Select the .sdltb file and click on the “Open” button. In the dialog that appears select the default option, “Comma-separated CSV”, and press “OK”. Once the process completes, a message will appear stating that the termbase was successfully converted. Additionally, a .csv file will be created in the Termbases folder.

  • Launch Heartsome Translation Studio, and from the menu select Tools -> TBX Maker. In the TBX Maker window, select the menu option File -> Open CSV file…

  • In the dialog box that appears, click on “Browse” and select the .csv file we created above (in the Termbases folder). Next, change the “Main Language” to “English (United States) en-US”. Without touching anything else, click “OK”.

  • The window will now be populated with the content of the .csv file. Don’t freak out if you see a lot of columns with strange titles and additional languages. This is normal, since a Termbase can contain extra languages, depending on the number of target languages supported by the project. In our case, we’re only interested in the columns that have to do with the source (English) and target (German) languages.

  • To simplify the process, we want to end up with 4 columns, 2 per language, holding the term and its definition (or description) for each language. So, we will scan the columns and identify the ones with the English and German terms, and then find the columns with their descriptions. We need to keep a note of their column numbers for the next steps.

  • From the menu, select Tasks -> Delete Columns…

  • In the “Delete” dialog, select every column number except the 4 we noted above (these are the columns we need to keep). Click on “Delete specified rows”. You should end up with only the columns we need: Term + Description (per language).

  • From the menu, select Tasks -> Column Property…. This is where we’ll assign the type of each column, so the tool can properly finish the conversion. You should see 4 columns with dropdown boxes, and rows which correspond to the columns we kept in the previous step. In the 1st column (“Type of Column”), assign the value “Term” in all dropdown boxes. In the 2nd column (“Type”), select the value “term” (if the row belongs to a term column), or select the value “descrip” (if the row belongs to a description column). Ignore the 3rd column (“Attribute”). In the 4th column (“Term Language”) select the correct language per term and description. In our case, “en-US English (United States)” and “de-DE German (Germany)”. Click “OK”.

  • From the menu, select File -> Convert to TBX File…. In the dialog box that appears, make sure the .tbx file will be saved in the Termbases folder. Click “Convert”. Once the process completes, a message will appear stating that the CSV file was successfully converted to a TBX file. Additionally, a .tbx file will be created in the Termbases folder.
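Scanning a wide table by eye for the right columns can be tedious. As an optional aid for the column-hunting step above, the following sketch lists the CSV's column titles with their positions. It makes a few assumptions: that the first row of the exported .csv holds the column titles, that the file is UTF-8 encoded, and that counting from 1 matches the numbering you see in TBX Maker. Check these against your own file before relying on the output.

```python
import csv
from pathlib import Path


def numbered_columns(csv_path: Path,
                     encoding: str = "utf-8-sig") -> list[tuple[int, str]]:
    """Return (column number, title) pairs from the first row of a CSV file.

    Numbering starts at 1; the encoding default is an assumption.
    """
    with csv_path.open(newline="", encoding=encoding) as handle:
        header = next(csv.reader(handle), [])
    return list(enumerate(header, start=1))
```

Printing the pairs gives you a quick list from which to note the 4 column numbers to keep.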

If you’re still with me, we now have files ready to be added to a Heartsome Translation Studio project, allowing us to proceed to the next phase.

+++ End of Part 2 +++

In Part 3, we’ll deal with the remaining steps of the process and, most importantly, decide whether this whole approach is viable for a professional freelance translator.

About Petro Dudi
Petro Dudi is an American expat who has lived in Athens, Greece, for the last two decades. He has worked in the Translation & Localization Industry for more than 17 years, having translated or project managed numerous projects for tech giants such as Microsoft, IBM/Lotus, Adobe, Symantec, GE Energy, Caterpillar, Toshiba, LaCie, Canon, Sony, Nokia, Bosch, Siemens, etc. Petro is also the author of "Translation 101: Starting Out As A Translator", and the creator of the "Translation101 Toolkit" software.