I tried a bunch of programs over the years. Acquisition costs, learning curves (yours and your machine’s), unintuitive commands, etc. And once you’ve got one mastered, you get a cold, and it can’t transcribe a word you’ve dictated.
The best I found was dictating on my first Apple iPhone. It was remarkably accurate at getting medical terms down, for example. Later iterations were not quite as good (I recently found that trying to dictate a short story in stop-and-go traffic at the Massey Tunnel didn’t work well), but they were still pretty good. You’re going to proofread, edit, and revise anyway, right?
Dictation capabilities have been built into phones and computers for years. For example, here’s Apple’s docs for iPhone, iPad and Mac:
https://support.apple.com/guide/iphone/dictate-text-iph2c0651d2/ios
https://support.apple.com/guide/ipad/dictate-text-ipad997d9642/ipados
https://support.apple.com/guide/mac-help/use-dictation-mh40584/mac
Even when I still wrote with pen and paper, dictation didn’t interest me much as a way to get handwritten text into the computer. Now that I write directly on the computer for everything, dictation holds even less interest.
I suppose it comes down to how one creates sentences: by talking/verbalizing or by, well, writing, as with a pen or a keyboard.
One thing that probably takes some getting used to is that with dictation, the same tool (your voice) is used both for commands and for content (text), whereas with a keyboard the commands come via the menu, mouse, arrow keys, etc., and anytime you’re using only the alphanumeric keys (with the optional Shift key), you’re entering content: a nice delineation.
I compared a few applications for my day job, only in computer-based situations (no mobile or tablet), and only tools that were either free or already paid for. Here are three:
- Microsoft Word with a 365 subscription, connected to the internet, has a "Dictate" feature, and in my experience it is the best in terms of accuracy and ease of giving commands (e.g. "next line")
- a free account on Otter.ai is not bad, but not as accurate in my experience as Word, and you then have to copy and paste back into your doc. But it does have a mobile app.
- opening a Google Doc in Chrome lets you use the "Voice Typing" command under "Tools". It used to be my first choice, but Word has replaced it.
Seconded. I didn't initially realise this question was for mobile devices, but when I'm away from my desk, Microsoft Word on my Android phone works great...as long as you have an internet connection!
I will appreciate all comments on this, because first an injury to my right hand, and then a major new arthritis flare-up in my left hand (which has probably been overworking to compensate for the right), has stopped me in my tracks about three chapters away from finishing a novel I have been working on for a year. I started dictating on my Mac (so thanks for those links), but there are so many errors (in part because there are all sorts of made-up Welsh-based words, part of a near-future world) that the text is difficult to correct without pain. Short texts work OK. So, one of my questions is: would turning on the ability to record the audio help with this? And is the superiority of Dragon based on its ability to learn and improve its speech-to-text accuracy? I do know authors who use dictation for first drafts, and depending on future problems with arthritis, I may have to seriously do this to keep writing. I don't mind editing, but there is editing and there is finding myself totally rewriting sentences that don't make sense, so I would love more detail on how Dragon improves over Apple dictation! But overall, the more comments from people who have used both, the better.
Right--there are so many reasons to learn more about this! I'm so pleased to see the knowledge shared...
Is it possible to switch mid-dictation and just spell those Welsh words for the software? I would guess that would be useful for all sorts of things like proper names it doesn’t recognize, or words with diacritics — fortunately in English we don’t have to modify letters very often with diacritics, so just spelling out with the basic 26 letters is usually sufficient for almost all English words (or 24 letters if you remember the old joke about L&M getting kicked out for smoking).
I think all dictation software “learns” in the sense that it now uses “AI” to help reduce errors and fix obvious mistakes. For example, I think on-the-fly subtitles are getting better now (YouTube), although they’re still pretty awful for live transcriptions (baseball games), falling behind the speaker, etc. Most TV shows and movies (and many podcasts) have subtitles/transcripts that have been reviewed, not just automated, although the first-pass transcription is probably always done with software; then a producer or contractor or someone reviews it.
You can spell things out with Dragon. If it's a word you'll use a lot, you can also teach it new words. You type what you want to say, then say it a couple of times to train it, and you're good to go.
That really sounds useful!
For over a year, my writing partner and I have been using Otter.ai to transcribe conversations for a Substack we're about to launch. It began when I was baffled to realize that writing and publishing a memoir had proved psychologically transformative, and I turned to a writer friend, who is also a neuropsychiatrist, for help figuring out how the process of writing could have had this impact. We came up with the idea of searching for answers by talking through the many issues involved and recording our talks.
We started with the Free plan, which offers 300 minutes a month and 30 minutes per conversation. Eventually, we signed up for Pro ($9/month, less if you prepay for a year), which gives you 1200 minutes a month and 90-minute conversations. I was never able to get the audio through my computer microphone, so I call my partner and just put my cell phone on speaker. The transcriptions are quite accurate. After each one, I read through and correct from what I remember, or replay the audio, which scrolls along below the text. Real conversations don't translate into well-shaped dialogue, but we have the record of our exchanges to work from. It's been terrific.
Our situation is probably unusual, but Otter would be just as good for interviewing or capturing Zoom conversations.
An interesting way to work!
Yes, that’s a great idea, rather than, say, taking notes or something. It sounds similar to the automatic transcription that Apple’s Podcasts app offers now. (I find it much faster to read the transcription than listen to the podcast.) A couple of cautions there, I suppose, that probably don’t apply to you: the podcast transcription doesn’t appear to distinguish between speakers when there’s more than one, and if someone wanted to quote something said on a podcast, they would probably want to listen to the audio again to make sure the transcription was 100% accurate. These cautions would probably apply, say, to a Zoom transcription as well.
At least in the case of two interlocutors, Otter.ai does identify each speaker by name, which you can introduce in the beginning. Often in our excitement about a new idea we speak over one another, making it impossible for Otter to identify who's speaking--let alone what we're saying--so it gets it wrong, and I have to go back to the audio to figure it out. The transcription is necessarily imperfect. Very quickly we realized that we could not use it directly in any case, because we repeat ourselves, speak discursively, don't build to a point, etc. It takes a lot of work to create good dialogue, but it's editing work, sweet work that we both enjoy. Of course, with an interview you don't have that latitude.
I appreciate this clarification--yes, it makes sense that this would create such labour. I suppose you could use your names to intro yourselves... but the cost would be losing the rhythm and nature of the conversation. And it would still need the editorial work.
"Sweet work" is good work; sometimes there are no short cuts!
Two of my favourite tools at the moment for dictation and transcription are https://audiopen.ai/ and https://www.assemblyai.com/playground
AudioPen transcribes and also attempts to tidy up or edit or repurpose what you've said, all according to your requirements. So you get a pretty good transcript, as well as a host of other useful tools. The free version may be more than enough, and the paid version was a no-brainer for me.
AssemblyAI takes audio files and transcribes them incredibly well. You can ask it to label each separate speaker when it's more than just you talking, filter content, summarise, detect topics, and more. All for free. There's a live dictation mode, but that's never worked well for me.
Assembly AI also has a chatbot after the transcription for you to ask questions and discuss the text you've transcribed. I tend not to use it because I've got a different workflow, but it seems to work fine for most basic requests.
I've used both with smaller devices and they work just as well for me on those as they do via desktop use.
This is a link to a free course that discusses the various dictation tools out there, from simple (just use your phone and the Notes feature) to more sophisticated software packages. The real issue with dictation versus writing or typing is training yourself to do it and use it. Looking at the screen while you dictate can cause you to fixate on spelling or other things the software does, disturbing your creative flow. Just like other tech, it takes training.
https://fictioncourses.thrivecart.com/dictation--scrivener---power-combo/
So much useful information in response to this question about speech-to-text choices and possibilities for writers. I'm grateful to all who have shared their experiences! Good community here...
I have now added one-month add-on subscriptions for those who have commented here and added to the knowledge of this issue, both paid and free subscribers. (Frank, I couldn't locate your subscription/email... so please email me to let me know what it is, so I can comp you the month! alison@alisonacheson.com is my address.)
THANK YOU, ALL for this!