Captioning service providers work to output transcripts in different formats on today’s devices: smartphones, tablets, laptops and now virtual reality (VR) goggles.
Captions On Multiple Devices
Realtime captions for students or for TV accordingly need to be readable on multiple devices, as do captions recorded for video, court sessions or annual reports.
A recent post, The Future of Live Captioning, explores these challenges, while noting that Japan is trialing a live captioning system to meet new accessibility laws for 2016.
Remaining Issues For Captioners To Address
Here’s a quick summary of the pain-points that still exist when captioning:
- Automation: The growing sophistication of machine-generated captions means one captioner can suffice (not two, as before). However, the human touch is still needed for accuracy.
- Timing: Speed is critical for viewers of captions. Too fast, and viewers cannot keep up; too slow, and frustration builds. Block reading is the most user-friendly approach.
- Redirection: Capturing captions or written notes for future use in another digital format. Workflow planning is critical, notably as new devices enter the user mix.
- Crowd-sourcing: One option for delivering captions in student contexts, including sports grounds. Note: Twitter can serve as a back-channel for live captions at events.
- Personnel: Globally, realtime captioners are scarce, with stenography courses a main source of training. At some levels, machine translation will alleviate this staff shortage.
- Translating: In today’s polyglot world, captions need to easily convert into multiple languages. Again, machines are key to this process, as are responses to skills gaps.
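The “block reading” approach mentioned under Timing can be sketched as a simple word-packing rule: group words into caption blocks no wider than the display allows, so viewers read a stable block rather than a scrolling stream. The function name, block width and sample sentence below are illustrative assumptions, not any provider’s actual parameters.

```python
# Minimal sketch of block reading: greedily pack words into caption
# blocks up to a fixed character width, breaking only on word
# boundaries. The 32-character width is an illustrative assumption.

def caption_blocks(transcript, max_chars=32):
    """Split a transcript into caption blocks of at most max_chars."""
    blocks, current = [], ""
    for word in transcript.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) <= max_chars:
            current = candidate          # word fits in the current block
        else:
            blocks.append(current)       # emit the full block
            current = word               # start a new block with this word
    if current:
        blocks.append(current)
    return blocks

for block in caption_blocks(
    "Conference speakers are asked to speak slowly and clearly "
    "when delivering papers."
):
    print(block)
```

A real captioning workflow would also time each block against the audio; this sketch only shows the packing step that keeps blocks readable at a glance.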
According to Media Access Australia, the live captioning trial in Japan may reduce costs to about a tenth of past service delivery. To facilitate the process at Kyoto University, conference speakers are asked to speak slowly and clearly when delivering papers, and to supply their papers to the captioners for abstracting before the event.
Elsewhere in the captioning area, Google Docs has a speech-to-text tool called Voice Typing that lets users ‘speak’ content into a Google document being compiled, alone or as a group. If Google (with millions of users) is fine-tuning its voice-recognition tools, this research surely has implications for captioning service providers and for their future outputs and workflows.