Computer Translation of Japanese Text, Part 1 – Translation from the Internet

Recently, I wrote a blog post about Researching Wasen Remotely, but it was mostly a follow-up about the general difficulty of sorting through research information that’s primarily in Japanese and gathered from wide-ranging sources. I’m thinking it might be helpful to go over some of the resources and tools I use in research. This could be pretty involved, so I may need to do this in a few parts.

The most obvious sources of information are going to be books, drawings, photos, web pages, etc. Drawings and photos aren’t language dependent, but books, websites, and any text in the drawings and photos are going to be written in Japanese. If you don’t read Japanese, that’s a big problem, but there are tools that can help.

While I was born in Japan, am half Japanese, and know a small amount of spoken Japanese, my own knowledge of the written language is limited. Here’s how I overcome this limitation.

Electronic Media vs. Printed Media

Translation is much simpler if the material you’re translating is in electronic text form. Text from websites works the best, as it can easily be copied and pasted into online translation tools. PDF files that can be downloaded from websites or received on CDs usually have selectable text, so they will work as well. Some electronic book formats may also work, but copyright protection schemes may thwart any attempts to copy the text for translation.

Image files, like JPEGs and scanned PDFs, are only pictures. Even if they are pictures of text, this text is not selectable and cannot be copied. Instead, the image will need to be run through some kind of character recognition software in order to be useful.

Printed materials require more work than electronic media as pages have to be read using scanner hardware and turned into an electronic image format that can then be run through character recognition software as described above. So, given an option, a pdf version of a book is generally more useful than a printed book.

Unfortunately, text created using character recognition software often contains errors, as Japanese text can be quite complex and software may not correctly recognize all of the Japanese characters. This text often needs to be cleaned up before translating.
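As a rough illustration of what that cleanup can look like, here is a minimal Python sketch. It assumes two kinds of noise that are only guesses at what a given character recognition program might produce: mixed-width characters (half-width katakana, full-width Latin letters) and stray spaces dropped between Japanese characters. The function name `clean_ocr_text` is made up for this example, and misrecognized characters still have to be fixed by eye.

```python
import re
import unicodedata

# Hiragana, katakana, and the main block of kanji (CJK Unified Ideographs).
_JP = r"\u3040-\u30ff\u4e00-\u9fff"


def clean_ocr_text(text: str) -> str:
    """Tidy up two assumed kinds of OCR noise before translating."""
    # NFKC folds half-width katakana and full-width Latin letters/digits
    # into their ordinary forms.
    text = unicodedata.normalize("NFKC", text)
    # Remove spaces or tabs that sit between two Japanese characters.
    # Line breaks are left alone so paragraphs stay separated.
    return re.sub(rf"(?<=[{_JP}])[ \t]+(?=[{_JP}])", "", text)


if __name__ == "__main__":
    print(clean_ocr_text("仁 淀 川"))
```

Note that NFKC also turns full-width punctuation such as "！" into its plain ASCII form, which is usually harmless for translation but worth knowing about.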

I’ll get into the details of scanning and using character recognition another time. For now, keep in mind that electronic text is your best bet for translating into usable information.

Online Translation Tools

There are a number of translation tools that can help you get the best translation of Japanese text. Again, the text needs to be selectable, as you need to be able to copy and paste the text into the translation tools.

If you’re on Facebook, when you encounter a foreign language post, there should be an option to “See Translation”. This can be very helpful in giving you an idea of what the post is about, but frankly, I find these translations to be almost worthless when it comes to Japanese text. A better option is to copy the Japanese text and paste it into either Google or Bing translators. They will provide much clearer translations. Still, the feature makes Facebook a good place to hang out to follow what’s going on in the world of wasen.

Google Translate and Bing Translator

These two competing translators are almost identical in interface, so it’s easy to use them both, taking the best results. You can access them here:

Google: http://translate.google.com

Bing: http://www.bing.com/translator

Any Japanese text that’s not simply an image of text (more on what to do about that later) you should be able to select and copy. You can simply paste this text into your chosen translator, make your language selections (note that the site’s auto-detect feature will sometimes mistake the text for Chinese), and hit the “Translate” button if it doesn’t translate automatically.
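If you want a quick sanity check of your own when auto-detect guesses Chinese, one simple heuristic is to look for kana: hiragana and katakana are used in Japanese but not in Chinese. The Python sketch below is only an illustration of that idea, not how Google or Bing detect languages, and the function name `looks_japanese` is made up. Text written entirely in kanji will come back as False even if it is Japanese, so treat the result as a hint.

```python
def looks_japanese(text: str) -> bool:
    """Return True if the text contains at least one hiragana or katakana letter."""
    for ch in text:
        code = ord(ch)
        # Hiragana letters: U+3041 to U+3096. Katakana letters: U+30A1 to U+30FA.
        if 0x3041 <= code <= 0x3096 or 0x30A1 <= code <= 0x30FA:
            return True
    return False


if __name__ == "__main__":
    sample = "仁淀川川下り遊覧船の建造が始まりました。"
    print(looks_japanese(sample))
```

When this returns True but the translator has picked Chinese, setting the source language to Japanese by hand is the safer choice.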

As I mentioned, the interfaces of the two are pretty much the same. However, the results are a bit different, and there are a couple of feature differences.

Both translators give you speaker buttons which you can press to hear the pronunciation of the Japanese text, but Google Translate also displays the romanized text. This can really help you to write in Japanese, which I’ll discuss later on.

Google Translate has an additional feature: that little “あ” icon next to the speaker icon. This button allows you to enter Japanese text by either typing the romanized syllables or drawing the characters using your computer’s mouse or trackpad. I personally haven’t used this feature, as I use a Mac, which can provide these same features very easily, though I’m sure you can set up a PC that way too. I’ve never had a need to do anything more than play a little with this Google Translate feature, so I can’t say much more about it.

Which is More Useful?

I’ve tended to rely heavily on Google Translate because of the romanized text feature, but lately, I’ve found the resulting translations to come out better with Bing Translator. For your own work, one is bound to do a better job than the other, though it may depend on the text you’ve chosen to translate. All I can recommend at this time is to use them both, compare the results, and decide which you prefer.

As an example, I took this text from Facebook:

仁淀川川下り遊覧船の建造が始まりました。来春完成に向けてがんばりましょう。

When I pass it through Google Translate, I get this:

The construction of the boat ride boat down the Nidodawa River started.
Let ‘s do our best for completion next Spring.

With Bing Translator, I get this:

The construction of the sightseeing boat on the Niyodo River was started.
Let’s do our best to finish next spring.

Of these translations, Bing Translator’s comes out with more natural-sounding sentences than Google Translate’s. Google even managed to misspell Niyodogawa (Nidodawa?) and left an extra space before the apostrophe in “Let’s”.

Now, I mentioned some of the translation services’ features. When you click on the speaker icon with either translator, you can hear the Japanese text read aloud. Unfortunately, when I tried this with Bing, it would only speak the first sentence. Google spoke all of the entered text.

Then, there’s the romanized text I mentioned that Google Translate provides. While it’s handy, oddly enough, the spoken text doesn’t match the written romanized text, though both Bing and Google speak the text the same way. So, clearly, this romanized text is limited in usefulness.

And what about the Facebook translation? It might work nicely for some languages, but for Japanese, most of what it gives you is gibberish. As I stated earlier, the Facebook translation can give you a rough idea of what a post is about, but you’ll need to take the text to one of these other translation tools if you really want to know what was written.

Imperfect Translation

Now, this sample text translated quite well, but that’s not going to happen all the time. You’ll need to get accustomed to translating blocks of text and then being left wondering what the heck they mean. You can decipher some of it based on context, and there are some tricks that can also help get a better translation.

If a block of text doesn’t make sense, try individual sentences. If a sentence doesn’t make much sense, try breaking it into phrases or words. Here is where it helps to be familiar with Japanese, even if you can’t effectively read anything. If you know a little Japanese, then look for phrases that can be separated by conjunctions, the object marker を, the topic marker は, etc. Translate them separately and piece together the meaning.
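The splitting described above can be roughed out in a few lines of Python. This is a naive sketch under stated assumptions: sentences are taken to end at 。, ！, or ？, and phrases are cut after every を or は, even though those characters also turn up inside ordinary words. The function names and the short textbook-style sentence in the usage lines are made up for illustration.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split a block of Japanese text after each sentence-ending mark."""
    # The lookbehind keeps the punctuation attached to its sentence.
    # Splitting on an empty match like this needs Python 3.7 or later.
    parts = re.split(r"(?<=[。！？!?])", text)
    return [p.strip() for p in parts if p.strip()]


def split_after_particles(sentence: str, particles: str = "をは") -> list[str]:
    """Cut a sentence after each occurrence of the given particle characters.

    This is a character match, not real grammar: it will also cut inside
    any word that happens to contain one of these characters.
    """
    parts = re.split(f"(?<=[{particles}])", sentence)
    return [p for p in parts if p]


if __name__ == "__main__":
    block = "仁淀川川下り遊覧船の建造が始まりました。来春完成に向けてがんばりましょう。"
    for sentence in split_sentences(block):
        print(sentence)
    # A simple made-up sentence, roughly "I read a book."
    print(split_after_particles("私は本を読みます。"))
```

Each piece can then be pasted into the translator on its own. Pieces that come out oddly are often a sign the cut landed inside a word, so it is worth rejoining those and trying a different break.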

Sometimes, you can even take the translated text, in English, and reverse translate it back into Japanese to see if you get something close to your original text. You can even take that text and translate it back into English. This doesn’t help very often, but sometimes you get lucky and can verify the meaning, or verify that the translator isn’t handling the particular words well.
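Judging whether the round trip came back "close" can be done by eye, but a rough number sometimes helps when comparing several attempts. The Python sketch below uses the standard library's `difflib.SequenceMatcher` to score how much two strings have in common, from 0.0 to 1.0. The translation steps themselves are still done by hand in the translator; the code only compares the two Japanese strings you paste in. The function name and the back-translated string in the usage lines are made up for illustration, and a character-level score says nothing about meaning, so a low score is a prompt to look closer rather than proof of a bad translation.

```python
from difflib import SequenceMatcher


def round_trip_similarity(original: str, back_translated: str) -> float:
    """Return a 0.0 to 1.0 score of how similar the two strings are."""
    # autojunk=False turns off a speed heuristic that can skew
    # results on longer passages.
    matcher = SequenceMatcher(None, original, back_translated, autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    original = "仁淀川川下り遊覧船の建造が始まりました。"
    # Hypothetical result of translating to English and back again.
    back_translated = "仁淀川の遊覧船の建造が始まりました。"
    score = round_trip_similarity(original, back_translated)
    print(f"similarity: {score:.2f}")
```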

The whole process takes practice, and the more translation you do, the more you get accustomed to the results and the more you’ll understand.

In future posts, I’ll get into other aspects of translation, translating from printed material, written material, etc. Ω

 

 
