
slomojoe

macrumors regular
Original poster
Sep 10, 2018
178
132
Canada
I input voice to text on my phone using Drafts, Office, and Evernote for short snippets of dictation, and I wonder how this works.

Do they all process speech through the same speech-to-text engine, or does each use its own?

Is there a particular app that does the best job (Drafts seems the best, but not by much), or should they all be about the same?

My guess is that they all default to Apple's built-in engine. Is this right?
 

ericwn

macrumors G5
Apr 24, 2016
12,113
10,899
I believe anything you trigger via the dictation button on the keyboard uses Apple’s own voice to text algorithms.
 

slomojoe

macrumors regular
Original poster
Sep 10, 2018
178
132
Canada
ericwn said: "I believe anything you trigger via the dictation button on the keyboard uses Apple's own voice to text algorithms."
Right, thanks, that was my thinking. I know that Microsoft and Google have very good dictation software, and it seems that if I use the Google search app I get better recognition. So it isn't their speech engine I'm activating, but still Apple's universal speech-to-text?

It's hard to do a one-to-one test to see if it really varies or if it's just my imagination.
 

cynics

macrumors G4
Jan 8, 2012
11,959
2,156
Google Speech is the API Google uses in their iOS apps, and iOS devs can use it in their own apps. Google has multiple implementations of speech to text (STT) for on-device (offline) processing, but usually that's used in combination with their objectively better online (Google server) processing. It gets a bit sketchy, though, when it comes to pricing: if you use the Google Speech API, they charge you less if you allow Google to log your users' speech data... and since app development is a business designed to make money, and paying less overhead = more profit... well...

[Screenshot: Google's Speech API pricing terms, showing the discount for allowing data logging]

Much like Apple, they claim it's to improve the speech to text. This is Google's wording, so my bias doesn't rub off. Keep in mind, though, that it's the developer who opts in or out with their app, not the user...
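
In case anyone wants to see what "using the API" actually looks like, here's a rough sketch of calling Google's Cloud Speech-to-Text v1 REST endpoint from Swift. The API key and audio file are placeholders; the endpoint and JSON shape are from Google's public docs:

Code:
import Foundation

// Rough sketch: send a short 16 kHz LINEAR16 (PCM WAV) clip to Google's
// Cloud Speech-to-Text v1 REST endpoint. apiKey is a placeholder.
func recognizeWithGoogle(audioFileURL: URL, apiKey: String) throws {
    let audioData = try Data(contentsOf: audioFileURL)

    var request = URLRequest(url: URL(string:
        "https://speech.googleapis.com/v1/speech:recognize?key=\(apiKey)")!)
    request.httpMethod = "POST"
    request.setValue("application/json", forHTTPHeaderField: "Content-Type")

    let body: [String: Any] = [
        "config": [
            "encoding": "LINEAR16",
            "sampleRateHertz": 16000,
            "languageCode": "en-US"
        ],
        "audio": ["content": audioData.base64EncodedString()]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)

    URLSession.shared.dataTask(with: request) { data, _, _ in
        // The response carries results[].alternatives[].transcript
        if let data = data, let text = String(data: data, encoding: .utf8) {
            print(text)
        }
    }.resume()
}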

Apple has their Speech framework available to iOS devs. It's "free" if you ignore the yearly expense of being a dev and the Apple tax. To be fair, Apple uses everyone's voice to "improve their product". However, Apple requires devs to be very explicit that the STT is going to be used for product improvement.
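
For comparison, the Apple side is just the Speech framework. A bare-bones sketch that transcribes an audio file (error handling omitted); the Info.plist string is that "be very explicit" disclosure:

Code:
import Speech

// Bare-bones sketch using Apple's Speech framework. Requires an
// NSSpeechRecognitionUsageDescription entry in Info.plist -- the string
// Apple makes you show the user explaining where their audio goes.
func transcribe(fileURL: URL) {
    SFSpeechRecognizer.requestAuthorization { status in
        guard status == .authorized else { return }
        let recognizer = SFSpeechRecognizer(locale: Locale(identifier: "en-US"))
        let request = SFSpeechURLRecognitionRequest(url: fileURL)
        request.requiresOnDeviceRecognition = false  // false = Apple's servers may process it
        _ = recognizer?.recognitionTask(with: request) { result, error in
            if let result = result, result.isFinal {
                print(result.bestTranscription.formattedString)
            }
        }
    }
}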

Google, on the other hand, skirts around this requirement by giving the user instructions on how to turn the mic on in Settings, with no word that their voice is being logged or anything... here's a comparison.

[Screenshots: Apple's speech recognition permission prompt next to Google's microphone permission prompt]

Also, if you use Apple's Speech framework, you can only ask the user that question one time; subsequent requests are blocked. Apple figures if you bug someone enough they'll eventually say yes... probably right... (more on that in the sketch below).

I should mention that Google is asking for generic microphone permission, while Apple is asking specifically for permission for the app to use their Speech Recognition framework. Google just wants the mic on, and it's just assumed it's to use their Speech API, which is a fair assessment IMO.
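
That one-time prompt is why apps check the stored status before asking. A rough sketch of the usual pattern (once someone has said no, the only fallback is deep-linking to Settings, which is exactly the "go turn it on yourself" flow Google uses):

Code:
import Speech
import UIKit

// Sketch of the standard permission dance. The system alert only ever
// appears while the status is .notDetermined; after the user answers once,
// the answer is stored and re-asking is a no-op.
func ensureSpeechPermission() {
    switch SFSpeechRecognizer.authorizationStatus() {
    case .notDetermined:
        SFSpeechRecognizer.requestAuthorization { _ in }  // fires the one-time alert
    case .denied, .restricted:
        // Can't re-prompt, so deep-link the user to the app's Settings page.
        if let url = URL(string: UIApplication.openSettingsURLString) {
            UIApplication.shared.open(url)
        }
    case .authorized:
        break
    @unknown default:
        break
    }
}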

Apple and Google aren't the only game in town, though. There are a half dozen companies that develop STT SDKs for iOS for high-quality on-device (offline) use.

I prefer Apple; it seems to work better for me, which might just be because I've adjusted to it after using it for so long. I can say they all suck in their own way, and so does Alexa. Privacy concerns aside, I think what it mostly comes down to is how well the STT handles your language and your accent. With my standard East Coast American accent, both of them do fairly well, with Apple picking my voice out from other people and background noise much better. Oh, and it can identify me versus other people, which is nice for HomeKit.

Man...that was long winded...apologies...
 

doobydoooby

macrumors regular
Oct 17, 2011
240
339
Genève, Switzerland
Have a look at Otter; I've found it to be excellent for transcription. It uses its own servers and seems to make fewer errors than any other I've tried. There's a free plan for up to 600 minutes per month.

 

slomojoe

macrumors regular
Original poster
Sep 10, 2018
178
132
Canada
cynics said: "Google Speech is the API Google uses in their iOS apps. [...] Man... that was long winded... apologies..."
Super interesting and helpful, thanks. I take it, then, that a developer can choose the engine they want, which answers my question (even if it doesn't clear things up completely): not all speech to text is processed through Apple; developers can use whatever engine they choose. So the answer is to experiment and see what works for me. Thanks!
 

slomojoe

macrumors regular
Original poster
Sep 10, 2018
178
132
Canada
doobydoooby said: "Have a look at Otter; I've found it to be excellent for transcription. [...]"

also helpful, have bookmarked, thanks!
 