Offline Speech Recognition In Android (JellyBean)

AndroidSpeech RecognitionOfflineSpeech to-TextGoogle Now

Android Problem Overview


It looks as though Google has made offline speech recognition available from Google Now for third-party apps. It is being used by the app named Utter.

Has anyone seen any implementations of how to do simple voice commands with this offline speech rec? Do you just use the regular SpeechRecognizer API and it works automatically?

Android Solutions


Solution 1 - Android

Google did quietly enable offline recognition in that Search update, but there is (as yet) no API or additional parameters available within the SpeechRecognizer class. {See Edit at the bottom of this post} The functionality is available with no additional coding, however the user’s device will need to be configured correctly for it to begin working and this is where the problem lies and I would imagine why a lot of developers assume they are ‘missing something’.

Also, Google have restricted certain Jelly Bean devices from using the offline recognition due to hardware constraints. Which devices this applies to is not documented, in fact, nothing is documented, so configuring the capabilities for the user has proved to be a matter of trial and error (for them). It works for some straight away – For those that it doesn't, this is the ‘guide’ I supply them with.

  1. Make sure the default Android Voice Recogniser is set to Google not Samsung/Vlingo
  2. Uninstall any offline recognition files you already have installed from the Google Voice Search Settings
  3. Go to your Android Application Settings and see if you can uninstall the updates for the Google Search and Google Voice Search applications.
  4. If you can't do the above, go to the Play Store see if you have the option there.
  5. Reboot (if you achieved 2, 3 or 4)
  6. Update Google Search and Google Voice Search from the Play Store (if you achieved 3 or 4 or if an update is available anyway).
  7. Reboot (if you achieved 6)
  8. Install English UK offline language files
  9. Reboot
  10. Use utter! with a connection
  11. Switch to aeroplane mode and give it a try
  12. Once it is working, the offline recognition of other languages, such as English US should start working too.

EDIT: Temporarily changing the device locale to English UK also seems to kickstart this to work for some.

Some users reported they still had to reboot a number of times before it would begin working, but they all get there eventually, often inexplicably to what was the trigger, the key to which are inside the Google Search APK, so not in the public domain or part of AOSP.

From what I can establish, Google tests the availability of a connection prior to deciding whether to use offline or online recognition. If a connection is available initially but is lost prior to the response, Google will supply a connection error, it won’t fall-back to offline. As a side note, if a request for the network synthesised voice has been made, there is no error supplied it if fails – You get silence.

The Google Search update enabled no additional features in Google Now and in fact if you try to use it with no internet connection, it will error. I mention this as I wondered if the ability would be withdrawn as quietly as it appeared and therefore shouldn't be relied upon in production.

If you intend to start using the SpeechRecognizer class, be warned, there is a pretty major bug associated with it, which require your own implementation to handle.

Not being able to specifically request offline = true, makes controlling this feature impossible without manipulating the data connection. Rubbish. You’ll get hundreds of user emails asking you why you haven’t enabled something so simple!

EDIT: Since API level 23 a new parameter has been added EXTRA_PREFER_OFFLINE which the Google recognition service does appear to adhere to.

Hope the above helps.

Solution 2 - Android

I would like to improve the guide that the answer https://stackoverflow.com/a/17674655/2987828 sends to its users, with images. It is the sentence "For those that it doesn't, this is the ‘guide’ I supply them with." that I want to improve.

The user should click on the four buttons highlighted in blue in these images:

Go to your Android Application Settings, select Languages and input, edit Settings of Google Voice typing, select Download Offline speech recognition, select your languages in the ALL tab.

Then the user can select any desired languages. When the download is done, he should disconnect from network, and then click on the "microphone" button of the keyboard.

It worked for me (android 4.1.2), then language recognition worked out of the box, without rebooting. I can now dictates instructions to the shell of Terminal Emulator ! And it is twice faster offline than online, on a padfone 2 from ASUS.

These images are licensed under cc by-sa 3.0 with attribution required to stackoverflow.com/a/21329845/2987828 ; you may hence add these images anywhere along with this attribution.

(This the standard policy of all images and texts at stackoverflow.com)

Solution 3 - Android

A simple and flexible offline recognition on Android is implemented by CMUSphinx, an open source speech recognition toolkit. It works purely offline, fast and configurable It can listen continuously for keyword, for example.

You can find latest code and tutorial here.

Update in 2019: Time goes fast, CMUSphinx is not that accurate anymore. I recommend to try Kaldi toolkit instead. The demo is here.

Solution 4 - Android

In short, I don't have the implementation, but the explanation.

Google did not make offline speech recognition available to third party apps. Offline recognition is only accessable via the keyboard. Ben Randall (the developer of utter!) explains his workaround in an article at Android Police:

> I had implemented my own keyboard and was switching between Google > Voice Typing and the users default keyboard with an invisible edit > text field and transparent Activity to get the input. Dirty hack! > > This was the only way to do it, as offline Voice Typing could only be > triggered by an IME or a system application (that was my root hack) . > The other type of recognition API … didn't trigger it and just failed > with a server error. … A lot of work wasted for me on the workaround! > But at least I was ready for the implementation...

From Utter! Claims To Be The First Non-IME App To Utilize Offline Voice Recognition In Jelly Bean

Solution 5 - Android

I successfully implemented my Speech-Service with offline capabilities by using onPartialResults when offline and onResults when online.

Solution 6 - Android

I was dealing with this and I noticed that you need to install the offline package for your Language. My language setting was "Español (Estados Unidos)" but there is not offline package for that language, so when I turned off all network connectivity I was getting an alert from RecognizerIntent saying that can't reach Google, then I change the language to "English (US)" (because I already have the offline package) and launched the RecognizerIntent it just worked out.

Keys: Language setting == Offline Voice Recognizer Package

Solution 7 - Android

It is apparently possible to manually install offline voice recognition by downloading the files directly and installing them in the right locations manually. I guess this is just a way to bypass Google hardware requirements. However, personally I didn't have to reboot or anything, simply changing to UK and back again did it.

Solution 8 - Android

Working example is given below,

MyService.class

public class MyService extends Service implements SpeechDelegate, Speech.stopDueToDelay {

  public static SpeechDelegate delegate;

  @Override
  public int onStartCommand(Intent intent, int flags, int startId) {
    //TODO do something useful
    try {
      if (VERSION.SDK_INT >= VERSION_CODES.KITKAT) {
        ((AudioManager) Objects.requireNonNull(
          getSystemService(Context.AUDIO_SERVICE))).setStreamMute(AudioManager.STREAM_SYSTEM, true);
      }
    } catch (Exception e) {
      e.printStackTrace();
    }

    Speech.init(this);
    delegate = this;
    Speech.getInstance().setListener(this);

    if (Speech.getInstance().isListening()) {
      Speech.getInstance().stopListening();
    } else {
      System.setProperty("rx.unsafe-disable", "True");
      RxPermissions.getInstance(this).request(permission.RECORD_AUDIO).subscribe(granted -> {
        if (granted) { // Always true pre-M
          try {
            Speech.getInstance().stopTextToSpeech();
            Speech.getInstance().startListening(null, this);
          } catch (SpeechRecognitionNotAvailable exc) {
            //showSpeechNotSupportedDialog();

          } catch (GoogleVoiceTypingDisabledException exc) {
            //showEnableGoogleVoiceTyping();
          }
        } else {
          Toast.makeText(this, R.string.permission_required, Toast.LENGTH_LONG).show();
        }
      });
    }
    return Service.START_STICKY;
  }

  @Override
  public IBinder onBind(Intent intent) {
    //TODO for communication return IBinder implementation
    return null;
  }

  @Override
  public void onStartOfSpeech() {
  }

  @Override
  public void onSpeechRmsChanged(float value) {

  }

  @Override
  public void onSpeechPartialResults(List<String> results) {
    for (String partial : results) {
      Log.d("Result", partial+"");
    }
  }

  @Override
  public void onSpeechResult(String result) {
    Log.d("Result", result+"");
    if (!TextUtils.isEmpty(result)) {
      Toast.makeText(this, result, Toast.LENGTH_SHORT).show();
    }
  }

  @Override
  public void onSpecifiedCommandPronounced(String event) {
    try {
      if (VERSION.SDK_INT >= VERSION_CODES.KITKAT) {
        ((AudioManager) Objects.requireNonNull(
          getSystemService(Context.AUDIO_SERVICE))).setStreamMute(AudioManager.STREAM_SYSTEM, true);
      }
    } catch (Exception e) {
      e.printStackTrace();
    }
    if (Speech.getInstance().isListening()) {
      Speech.getInstance().stopListening();
    } else {
      RxPermissions.getInstance(this).request(permission.RECORD_AUDIO).subscribe(granted -> {
        if (granted) { // Always true pre-M
          try {
            Speech.getInstance().stopTextToSpeech();
            Speech.getInstance().startListening(null, this);
          } catch (SpeechRecognitionNotAvailable exc) {
            //showSpeechNotSupportedDialog();

          } catch (GoogleVoiceTypingDisabledException exc) {
            //showEnableGoogleVoiceTyping();
          }
        } else {
          Toast.makeText(this, R.string.permission_required, Toast.LENGTH_LONG).show();
        }
      });
    }
  }


  @Override
  public void onTaskRemoved(Intent rootIntent) {
    //Restarting the service if it is removed.
    PendingIntent service =
      PendingIntent.getService(getApplicationContext(), new Random().nextInt(),
        new Intent(getApplicationContext(), MyService.class), PendingIntent.FLAG_ONE_SHOT);

    AlarmManager alarmManager = (AlarmManager) getSystemService(Context.ALARM_SERVICE);
    assert alarmManager != null;
    alarmManager.set(AlarmManager.ELAPSED_REALTIME_WAKEUP, 1000, service);
    super.onTaskRemoved(rootIntent);
  }
}

For more details,

> https://github.com/sachinvarma/Speech-Recognizer

Hope this will help someone in future.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionrmooneyView Question on Stackoverflow
Solution 1 - AndroidbrandallView Answer on Stackoverflow
Solution 2 - Androiduser2987828View Answer on Stackoverflow
Solution 3 - AndroidNikolay ShmyrevView Answer on Stackoverflow
Solution 4 - AndroidLeon JoosseView Answer on Stackoverflow
Solution 5 - AndroidP. StresowView Answer on Stackoverflow
Solution 6 - AndroidAkinoView Answer on Stackoverflow
Solution 7 - AndroidRiju ChatterjeeView Answer on Stackoverflow
Solution 8 - AndroidSachin VarmaView Answer on Stackoverflow