ADELPHI, Md., May 13, 2015 - While leading a medical training team a few years ago in Kabul, Afghanistan, a U.S. Navy commander became frustrated as he faced the challenge of interpreting complex medical information.
Commander Kurt Henry was seeing cases of intestinal tuberculosis that he knew were treatable, but the regional hospital's critical care unit did not have medical manuals to provide treatment instruction for newly assigned doctors.
When he scanned the Internet for documentation about treatment options, he found only information written in English. His team spoke Dari, the native language of the Afghan people, recalled Steve LaRocca, computer scientist and team chief at the U.S. Army Research Laboratory, or ARL.
Now, almost seven years later, the situation is better for medical trainers because of statistical machine translation methods that cut down on the Army's reliance on human translators in projects that require massive amounts of translation.
By early 2012, the ARL had provided 500 printed English-Dari special trainers' editions of the critical care reference manual to doctors in hospitals and clinics throughout Afghanistan to meet the need for medical teams like Henry's.
More and different manuals have since been translated, printed and shipped, and another priority translation is currently nearing completion.
ARL computer scientists and the newly assigned Afghan doctors carefully translated and collected more than 6,000 Dari medical phrases over the course of the initial project.
Secondary products, including an Android "Army Phrase Book" app, have been developed to make broader use of the expertise captured in the translated phrases.
Without computational support, translators would speak into a recorder for an hour to extract small bits of data, LaRocca said.
"The challenge was working with a limited pool of potential translators who were familiar with Dari, a less commonly taught language, and who also understood medical jargon," LaRocca said.
Speech recognition technology was LaRocca's specialty when he retired from West Point as a language professor and founding director for the Center for Technology Enhanced Language Learning in 2004.
When he wore the uniform, LaRocca advised military leaders on getting the most from limited translation resources, with the understanding that "there is no way our language-qualified people could give all the capacity we need in theater."
At ARL, his team explores ways to harness the knowledge of linguists by capturing hundreds of hours of translations and storing them in databases where the translated sentences can be shared and reused.
The laboratory applies statistical machine translation methods to specialized Army problems where there is not a commercially available solution, said Melissa Holland, chief for ARL's multilingual computing research program.
"Computers could never replace the human translator, but we look for ways to relieve some of the burden, especially in less commonly used languages, like Dari, Pashto and Serbian," Holland said.
The multilingual computing group addresses challenges with medical, legal and Army training translations, she said. The information used in translating the medical phrases is kept in a database for use across the defense community.
Computer translation breakthroughs in the last decade, along with the Dari datasets, greatly reduced the projects' dependence on the small number of bilingual human translators who are also subject matter experts. Computers remember and reuse expert knowledge.
"We've had people translating every day in Korea since about 1951, but we didn't save the datasets over those decades," LaRocca said. "The knowledge generated by all those people over all those years is gone."
He said, "If we had the presence of mind to curate that data or prepare it for the eventual use of technology, we would be so much better off in that language and many others."
LaRocca embraced the idea of capturing and saving datasets from projects in the Dari and Pashto languages.
He is not the only one. Lt. Col. Forest Kim led a team of medical advisors under the surgeon general in Afghanistan from November 2013 to May 2014. His team had seven language translators, but he said there was not enough time, and there were not enough assets, to translate large volumes of text.
His team circulated discs and DVDs to train medical trainers in the region.
"We were making a lot of changes, but I knew we were going to leave," Kim said. "We had to get to the point of serving the force in a supporting role."
Kim made it a priority to capture and upload all of the medical advisory documents to one central database. But at the time, he did not have a way to translate this information into other languages.
ARL computer translation experts hope to expand the military's ability to translate volumes of critical data, LaRocca said.
The Army Program Office associated with translation technology anticipates that the Army will need three new languages a year and will expand its domains to include legal, criminal justice, military training and medical, he said. He added that the Army has developed a way to curate data as fast as it is translated.
They have also developed more than one way of capturing and reusing language data, LaRocca said. "Although the manual may be worn in 10 years, the datasets captured from the translations will live on and be valuable for decades to come."
When Kim was in Afghanistan, the physicians gave him a manual, translated from Russian at least 40 years earlier, as an example of what they use for emergency war surgery.
"When U.S. forces are gone from the region, the U.S. documents will remain. As I see it, what ARL has done translates to tremendous training value to the physicians, as well as goodwill to the nation," he said.