Content Copyright © 2012 Bloor. All Rights Reserved.
Mountain Lion is the latest version of OS X, the operating system for Apple Mac, and was delivered on 25 July 2012.
The list of new features includes the following minor accessibility functions:
- Improvements to the system preferences accessibility pane.
- Support of additional braille displays.
- Some extended functions in VoiceOver, the built-in text-to-speech function of OS X.
The accessibility list does not include Dictation, the new voice recognition feature, which is reasonable because its current functionality is not sufficient for it to be considered an assistive technology. If you need a full-function assistive technology dictation system, you need to install Dragon Dictate for Mac from Nuance.
In the rest of this article I want to describe the voice recognition built into Mountain Lion: its functions, features and limitations, and my suggested improvements.
Mountain Lion includes two voice recognition functions:
- Speakable Items, also referred to as speech commands, has been available since the first release of OS X and enables a user to speak a command such as ‘switch to Safari’ instead of using the keyboard or pointer device.
- Dictation is new in Mountain Lion and enables a user to dictate text, which is then inserted at the current cursor position.
When looking at these features I have considered how effectively they do what they are designed to do and then how useful that set of functions might be to different categories of users.
Looking on the Internet, it does appear that some people are happy users of Speakable Items, but I am afraid my experience has been that it does not understand my British accent consistently. I have had three kinds of result: it understands a command and carries it out, it does not understand and does nothing, or it misunderstands and does something I did not intend. I was not even able to get it to recognise specific commands consistently. The technology has not changed much since its introduction and does not include any learning capability, so my persevering is unlikely to improve matters; it is possible that I might be able to change my intonation or accent, but I am not convinced it would be worth the effort.
The second question is: if Speakable Items works consistently for a user, is it useful? If the user finds using the keyboard difficult, then it helps them issue commands; unfortunately, having issued a command like ‘switch to Safari’, the user has to enter any further input using the keyboard or maybe the pointer device. If, on the other hand, the user can use the keyboard, like me, then learning a few keyboard shortcuts is more effective than using Speakable Items. For example, I continually use ‘cmd+tab’ to switch between open applications. The only example I have found of a user type who may really benefit from Speakable Items is someone who uses pointer- and keyboard-intensive applications such as Photoshop. The ability to say ‘brush’ or ‘eraser’ means that the hands can concentrate on the work in hand and not the commands.
Dictation is new and uses the same server-based technology as Siri on the iPhone. You access it by pressing the ‘Fn’ key twice; you then dictate a phrase or sentence and press ‘Fn’ once and, with a good Internet connection, the text you dictated is inserted, almost instantaneously, where the cursor was positioned. The dictation can include commands such as ‘new line’ and special characters such as ‘pound sign’, so complete texts can be dictated.
The quality of the voice recognition is high and will improve with use: the software learns about the acoustics of the particular device used and recognises different voice patterns. In a sense it will recognise that I have a British accent and modify its algorithm to match. Even so, the recognition is not 100% accurate. There are a number of YouTube examples of dictation, and they show errors including a word that was stumbled over in dictation being left out of the returned text completely, and ‘Mountain Lion’ coming back as ‘Mountain line’. The difficulty for the user is that Dictation does not include any edit or correction function: if there is an error, the user can either correct it using the keyboard or use the undo command (cmd+z) and dictate the whole phrase again.
So which users might find Dictation attractive in comparison to keyboard entry?
Tweeters may well find it a useful option. If we assume that Dictation gets more than 90% of tweets right first time, then the speed and accuracy may be greater than for the same user typing on the keyboard, and this advantage will not be outweighed by the one time in ten that editing or re-dictation is required. The shortness of a tweet adds to the attraction.
One-finger peckers, or people used to dictating rather than typing, may also find Dictation sufficiently faster than keyboard-only entry to be worthwhile.
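The trade-off for tweets can be made concrete with a little arithmetic. The sketch below is purely illustrative: the three timings are assumed figures chosen for the example, not measurements, and the function name is hypothetical. Only the 90% first-time success rate comes from the assumption in the text.

```python
def expected_dictation_seconds(dictate_s, fix_s, first_time_rate):
    """Average time per tweet when a failed dictation costs one extra fix."""
    failure_rate = 1.0 - first_time_rate
    return dictate_s + failure_rate * fix_s


# Assumed, illustrative figures (not measurements).
DICTATE_S = 10.0        # seconds to dictate one tweet
FIX_S = 20.0            # extra seconds to edit or re-dictate after an error
TYPE_S = 30.0           # seconds for the same user to type the tweet
FIRST_TIME_RATE = 0.9   # the 90% right-first-time assumption from the text

average = expected_dictation_seconds(DICTATE_S, FIX_S, FIRST_TIME_RATE)
print(f"dictation: {average:.1f}s per tweet, typing: {TYPE_S:.1f}s per tweet")
```

Under these assumed figures, the occasional correction adds only a couple of seconds to the average, so dictation stays ahead; a slower correction step or a lower success rate would narrow the gap.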
What is certainly true is that Dictation, as it stands, does not help a user who cannot use a keyboard at all. In fairness Apple does not make any claim that it does.
This is the first release of Dictation on the Mac and, even with its limited function, it should be attractive to some users in some circumstances. As a base for future enhancement, it also provides some intriguing possibilities. Here are my suggestions for extensions to the functions that could make it attractive to a much wider audience, including some people with disabilities:
- Build the function of Speakable Items into Dictation, so the user could dictate ‘command switch to Safari’ and it would be executed. Firstly, this would provide a single voice recognition interface for the user. Secondly, the level of recognition should be very high and consistent (I have tried dictating similar commands and the recognition is excellent out of the box), thus providing a usable function for most users. Finally, it should be possible to add many more Speakable Items, for example ‘command tab’ or ‘command jump to message area’, so a user could easily navigate around applications without using the keyboard.
- Add a specific ‘command undo last input’. This would make it easy for the user to re-dictate the last input, and it would also indicate to the Dictation engine that it had not properly recognised the input, thus enabling the engine to learn faster.
Apple will obviously not comment on any future developments; we will just have to wait for the follow-on to Mountain Lion, probably this time next year, to see if and how Dictation is expanded.
In the meantime, try it out. It is fun, and you may find you prefer it to keyboard entry for some functions.
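To illustrate how the two suggestions might fit together, here is a minimal sketch of a single voice interface that treats any phrase starting with the word ‘command’ as an instruction and everything else as dictated text. It is a toy model under stated assumptions, not a description of how Apple's Dictation works: the class, its attribute names and the idea of keeping a list of rejected phrases as learning feedback are all hypothetical.

```python
COMMAND_PREFIX = "command "


class DictationSession:
    """Toy model: routes recognised phrases to commands or to the text buffer."""

    def __init__(self):
        self.inputs = []         # dictated phrases, in order
        self.executed = []       # commands handed on for execution
        self.misrecognised = []  # phrases the user rejected (feedback for learning)

    def handle(self, phrase):
        text = phrase.strip()
        if text.lower().startswith(COMMAND_PREFIX):
            command = text[len(COMMAND_PREFIX):].strip().lower()
            if command == "undo last input":
                # Remove the last dictated phrase and record it as a likely error.
                if self.inputs:
                    self.misrecognised.append(self.inputs.pop())
            else:
                # Hypothetical: a real system would look the command up and run it.
                self.executed.append(command)
        else:
            self.inputs.append(text)

    def document(self):
        return " ".join(self.inputs)


session = DictationSession()
for phrase in ["Mountain line is fun", "command undo last input",
               "Mountain Lion is fun", "command switch to safari"]:
    session.handle(phrase)

print(session.document())
print(session.executed)
print(session.misrecognised)
```

One design question this sketch leaves open is ambiguity: a sentence that legitimately begins with the word ‘command’ would be treated as an instruction, so a real implementation would need some way to tell the two apart, such as a pause or a confirmation step.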