Kalev Hannes Leetaru (George Washington University) wrote a post for The Signal exploring the landscape of statistical machine translation (SMT) and outlining the advances made in the last few years toward a more “post-lingual society.” Leetaru provides an overview of mass translation projects, with a particular focus on GDELT (Global Database of Events, Language, and Tone) Translingual, which “live-translates all global news media that GDELT monitors in 65 languages in real-time, representing 98.4% of the non-English content it finds worldwide each day.” Leetaru reflects:
Machine translation has truly come of age to a point where it can robustly translate foreign news coverage into English, feed that material into automated data mining algorithms and yield substantially enhanced coverage of the non-Western world. As such tools gradually make their way into the library environment, they stand poised to profoundly reshape the role of language in the access and consumption of our world’s information. Among the many ways that big data is changing our society, its empowerment of machine translation is bridging traditional distances of geography and language, bringing us ever-closer to the notion of a truly global society with universal access to information.
This post was produced through cooperation among Julie Adamo, Jolanda-Pieta (Joey) van Arnhem, Alison Babeu, Rebecca Dowson, Jan Lampaert, and Jeffrey Sabol (Editors-at-large for the week), Caro Pinto (Editor for the week), Sarah Potvin (Site Editor), and Zach Coble and Roxanne Shirazi (dh+lib Review Editors).