5 questions with… Peggy van der Kreeft

Peggy van der Kreeft discussing DW’s human language technology projects at Global Media Forum 2017


The news.bridge consortium consists of four partners: Deutsche Welle (DW), the Latvian News Agency (LETA), the Laboratoire d’Informatique de l’Université du Mans (LIUM), and Priberam. We are all part of the media and tech industry in some way, we’re all fascinated by what cutting-edge human language technology (HLT) can do, and we’re all dedicated to this DNI project. Other than that, we’re actually pretty different organizations and people with a wide range of interests and a distinctive set of skills. So we figured it would be a good idea to sit down, ask the four team leaders a couple of questions, and give you some more insight into who is doing what exactly, and why it’s all worthwhile.

For the first part of this series of posts, we’ve talked to Peggy van der Kreeft, an experienced linguist and innovation manager, who is running news.bridge at DW.

Peggy, when and how did you first get in touch with human language technology (HLT)?

That was in the early 1980s, believe it or not, during my postgraduate studies in “Translating at European Level” at the University of Louvain. I tried out early machine translation (MT) systems like SYSTRAN. Later on, in the 90s, I used Babel Fish while working at an American translation and documentation center. The first research project at Deutsche Welle that focused on MT was CoSyne, which ran from 2010 to 2013. Among other things, we experimented with translating DW’s Today in History. The quality wasn’t good enough for direct publication, but there was always post-editing, and we already succeeded in making the translation process a little more efficient. My first experience with automated speech recognition (ASR) and speech-to-text (STT) was in the scope of project AXES (2011-2015). This one was about finding novel ways to explore and interact with audiovisual libraries. The platform itself wasn’t sophisticated enough for use in a real production environment, but it certainly showed the power of HLT, which has been a focus topic of our department (DW Research & Cooperation Projects) ever since.

What is the most fascinating aspect about news.bridge?

There are many fascinating aspects, so it’s really hard to choose one. Perhaps the most striking thing is that the platform is so powerful, even though it’s based on a fairly simple concept. news.bridge covers virtually any language, and through the use of external tools, it remains state of the art.

What is the project’s biggest challenge?

Well, the overall challenge is to make news.bridge stable and scalable, and to provide a smooth workflow for the entire process of creating subtitles (and voice-overs). A challenge we’re currently focusing on is the seamless ingestion of existing scripts, which are rarely standardized. However, using original scripts (instead of ASR/STT) always leads to the best results, so we need to work on this.

Who’s in your team and what are they currently working on?

Here at DW, we’re currently four people: Ruben Bouwmeester, Hina Imran, Alexander Plaum, and myself. Ruben, who joined the project very recently, works on the project’s business development and marketing plan. He also makes sure we always have professionally designed dissemination material. Hina is our developer. She works on customized user interfaces for HLT applications and output, coordinating technical issues with LETA and within DW. She also manages and maintains a local test installation of news.bridge. Alex is in charge of communication and dissemination. He runs our website, this blog, and our Twitter account. He also works on brand design and sometimes acquires new partners. As for me, I am the HLT lead at our department and also the main coordinator of news.bridge. That means I take care of operations, oversee and report progress, organize user testing and plan implementation.

Where do you see news.bridge in five years?

news.bridge has already attracted quite a bit of attention. Many broadcasters, news producers and language technology providers are interested in implementing it as soon as possible. Multilingual content has become very important, and news.bridge will significantly speed up production workflows at a modest investment cost. We hope we’ll be able to offer a reliable service, both as local installations and as software as a service (SaaS), sometime in 2019. But we are not waiting for that. Our first major test case is here at Deutsche Welle, with its many newsrooms and its international orientation. We’ve started beta testing the platform for automated translation and subtitling of videos in different languages, and we’re taking it from there. It would be great if news.bridge became a standard HLT platform by 2023; it certainly has the potential. In order for that to happen, we need to find the right exploitation partners and strategies, of course. news.bridge is not a startup, but a media innovation project. When it’s finished, the platform will lead its own life.

Meet us in Munich, Alicante, and Strasbourg

We have a couple of conferences and meetings coming up – and it would be great to see you there and discuss news.bridge. Don’t hesitate to get in touch via email or Twitter if you’re attending one of the following events:

Subtech 1 – Symposium on Subtitling Technology

Munich, Germany | May 24th & 25th
Official Website

21st Annual Conference of the European Association for Machine Translation

Alicante, Spain | May 28th – 30th
Official Website

news.bridge arte showcase

Strasbourg, France | June 15th
Note: This one is a non-public meeting. Please get in touch with arte for more information.

We’re looking forward to seeing you!

Insights from our first user testing sessions



Getting early input from the people you are designing for is absolutely essential – which is why we invited about a dozen colleagues to give the latest beta version of news.bridge a test run at the DW headquarters in Bonn last month. We had two really inspiring sessions with journalists, project managers and other media people working for DW and associated companies — and we have gained a number of useful insights. While some of them are too project-specific to share (news.bridge is not in public beta yet), there are also more general learnings that should make for an interesting journo tech blog post. Here we go:

Infrastructure and preparation

When inviting people to simultaneously stream and play around with news videos, make sure you have enough bandwidth. This may sound trivial, but it’s important, especially in Germany (which doesn’t even make the Top 20 when it comes to internet connection speed).

To document what your beta testers have to say as quickly and conveniently as possible, we recommend preparing digital questionnaires (e.g. Google Forms) and sending out a link well before the end of the session. That way, you get solid feedback from everyone. It’s also a good idea to add a screenshot/comment feature (e.g. html2canvas) to the platform that is being tested. In addition, open discussions and interview-type interactions provide very useful feedback.

Testing automatic speech recognition (ASR) tools

Thanks to artificial neural networks, ASR services have become incredibly sophisticated in the last couple of years and deliver very decent results. Basically all of our test users said the technology will significantly speed up the tiresome transcription process when producing multilingual news videos.

However, ASR still has trouble when:

  • people speak with a heavy dialect and/or in incomplete sentences (like some European football coaches who shall not be named)
  • people speak simultaneously (which frequently happens at press conferences, for example)
  • complicated proper names occur (Aung San Suu Kyi, Hery Rajaonarimampianina)
  • homophones occur (merry, marry, Mary)
  • there is a lot of background noise (which is often interpreted as language and transcribed to gibberish)

As a result, journalists will almost certainly have to do thorough post-editing for a while and also correct (or add) punctuation, which is crucial for the subsequent translation.
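
To make the punctuation point a bit more tangible, here is a minimal sketch of how a post-editing helper could flag transcript segments that lack sentence-ending punctuation before they are passed on to translation. It is purely illustrative: the `Segment` structure and the function names are our own assumptions for this post, not part of news.bridge or of any particular ASR service, and a real checker would need language-specific rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One timed chunk of an ASR transcript (hypothetical structure)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


# Characters we accept as closing a sentence (an assumption; adjust per language).
TERMINAL_MARKS = (".", "!", "?", "…")

# Closing quotes and brackets that may follow the terminal mark.
TRAILING_CLOSERS = "\"'”’)]"


def needs_punctuation_review(segment: Segment) -> bool:
    """Return True if the segment is empty or does not end with a terminal mark."""
    text = segment.text.strip().rstrip(TRAILING_CLOSERS)
    if not text:
        return True
    return not text.endswith(TERMINAL_MARKS)


def flag_for_post_editing(segments: List[Segment]) -> List[int]:
    """Return the indices of all segments an editor should look at."""
    return [i for i, seg in enumerate(segments) if needs_punctuation_review(seg)]


if __name__ == "__main__":
    demo = [
        Segment(0.0, 2.0, "Good evening and welcome."),
        Segment(2.0, 5.0, "the chancellor said today"),
    ]
    print(flag_for_post_editing(demo))
```

A check like this only points editors to likely trouble spots. It cannot tell whether a full stop is in the right place, so it supports post-editing rather than replacing it.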

Testing machine translation (MT) tools

What has been said about ASR also applies to MT: The tech has made huge leaps, but results are not perfect yet, especially when you are a professional editor and thus have high standards. Something really important to remember:

The better and more structured your transcript (or uploaded original script),
the better the translation you end up with.

As for the limits of machine translation during our test run, we found that “exotic” languages like Pashto (which is really important for international broadcasters like DW) are not supported very well. Few services cover them, and the translation results are subpar. This is not a big surprise, of course, as the corpus used to train the algorithms is so much smaller than that of a major Western language like French or German. This also means that it is up to projects like news.bridge to improve MT services by feeding the algorithms high-quality content, e.g. articles from DW’s Pashto news site.

While MT tools are in general very useful when producing web videos — you need a lot of subtitling in the era of mobile social videos on muted phones — there are some workflows that are hard to improve or speed up. For example: How do you tap into digital information carriers that are an individually branded, hard-coded part of a video created in software like Adobe Premiere? Well, for now we can’t, but we’re working on solutions. In the meantime, running news.bridge in a fixed tab and copy-pasting your translated script bits is an acceptable workaround.

Testing speech synthesis

Sometimes, computer voices are indispensable. For example, when you’re really curious about this blog post, but can’t read it because you’re on a bike or in a (traditional) car.

In news production, however, artificial readers/presenters are merely a gimmick. At least for the time being. That’s because once your scripts are finished, reading/recording them isn’t that time-consuming and will provide much nicer results. Besides, synthetic voices aren’t yet available in all languages (once again, Pashto is a case in point).

Nevertheless, news.bridge beta testers told us that the voices work fairly well, and even sound pretty natural in some cases. They can be trained, by the way, which is an interesting exercise we will try out at some point.

HLT services and news production in a nutshell

If we had to sum up the assessment of our beta testers in just a few sentences, they would read something like this:

HLT services and tools are useful (or very useful) in news production these days: They get you decent results and save you a lot of time.

news.bridge is a promising, easy-to-use mash-up platform, especially when it comes to transcribing and translating and creating subtitles (another relevant use case is gisting).

news.bridge is not about complete automation. It’s about supporting journalists and editors. It’s about making things easier.

Hello, world!

We’re live: Our website is up, our Twitter is up, and the platform itself (please register via email) is ready for upgrades, rewiring and in-depth testing.

DNI project news.bridge officially started in January 2018, and we’ll be working on it until the end of June 2019. Stay tuned for updates on new features, new partners, upcoming events, and dispatches from the world of human language technology (HLT).