How to train properly / Common errors

Some things to know

Usage

You will most likely need to train Simon to your voice; otherwise it will not recognise your speech.
To do that, use the provided training sentences in the modules.

A good recognition rate (shown in the Vocabulary tab of the scenario) is about 30-50, but more is even better (100-120).
It also depends on the word and how you pronounce it; shorter words are harder to train.
For very short words that are not spoken clearly, a recognition rate of 100 or more is sometimes necessary.

Also, the more scenarios you use, the more you need to train. Think of it as tree branches: the more branches there are, the more direction Simon needs from the starting point to find the right one.

How much do I need to train?

Well, this depends on many factors:
How good is the base language model?
-> More hours are better!
Do you speak a dialect that is only spoken in three villages?
-> This makes it harder to mix your speech with the base model and can lead to many false positives.
Do you use long sentences in the commands, or short words with a lot of "non-spoken" letters?
-> Longer is better.
Do you use shortcuts like "recon" instead of "recommend"?
-> Shortcuts are not a problem for Simon if the language model has that shortcut word built in. However, as it is shorter, it is harder to recognise.

For a good recognition rate I would recommend the following:
- Check that your mic is loud enough, but not too loud either.
- Train the base model at Voxforge with your voice if possible. Anything from 120 to 500 minutes is good. Read a book or something so it gets a "feel" for your general voice.
- Simon commands should be trained until the recognition rate is above 50-70.

If you get a lot of false-positives, train MORE!!!

Other things of importance

Be aware that training and computing the model can take many hours (6-12 hours and more!!!), depending on the CPU and, to some extent, RAM.
Here on my quad 3.ghz CPU, with all modules installed and properly trained for me, it takes about 10 hours to recalculate the model.

This obviously depends heavily on the number of training samples you speak into Simon.

As of now (10/2015) there is a bug in Simon (4.0 and 4.1, using US-ASCII) which malforms the sample filenames and the prompt texts if you use German umlauts and ß in the scenarios.

As this is very often the case for a German-speaking community, I wrote a fixing script for it. Sadly, Simon development has stopped, and the successor that is being built could lack many of Simon's features.

For more information see: How to train properly.
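
To get a rough idea of which samples might be affected, you can list the files whose names contain non-ASCII characters such as umlauts or ß. This is only a sketch and not the fixing script: the function name list_non_ascii_names is made up for illustration, the default path is an assumption based on the backup paths on this page, and names that were already mangled into plain ASCII will not show up.

```shell
#!/bin/sh
# Sketch: list files whose names contain bytes outside printable ASCII.
# The default directory is an assumption; pass your own as the first argument.
list_non_ascii_names() {
    dir="${1:-$HOME/.kde4/share/apps/simon/model/training.data}"
    # In the C locale every byte counts as one character, so the pattern
    # [! -~] matches any byte that is not between space and tilde.
    LC_ALL=C find "$dir" -name '*[! -~]*'
}

# Usage: list_non_ascii_names
#        list_non_ascii_names /path/to/training.data
```

An empty result only means that no file name currently contains such characters; it does not prove that the prompt texts are intact.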

Also quite annoying and misleading is the fact that the progress bar in the lower right corner of Simon ("Compiling model") does not always tell the truth.
When compiling has finished, it may show 100% but still say "Compiling model". To make sure the model is finished, restart Simon.

During model compilation Simon may activate and deactivate the "listening" part numerous times. I don't know what causes this, but the model still compiles successfully.

It is very advisable to back up your training data regularly! It takes quite some time to build it up, and a loss would be a shame!
You can use my backup script in the "others" folder, but you need to edit it.

You need to back up this folder and this file:

   .kde4/share/apps/simon/model/training.data

   .kde4/share/apps/simon/model/prompts

Also make sure you have run the repair script beforehand; otherwise a recompile with the new (but corrupted) data might not be possible.
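
If you just want a quick manual backup, something like the following could work. It is a minimal sketch and not the backup script from the "others" folder: the function name backup_simon_training and the target directory ~/simon-backups are made up for illustration, and it assumes the two paths above are relative to your home directory.

```shell
#!/bin/sh
# Sketch: pack the training.data folder and the prompts file into a dated tarball.
# Both default paths are assumptions; pass your own as arguments if they differ.
backup_simon_training() {
    model_dir="${1:-$HOME/.kde4/share/apps/simon/model}"
    backup_dir="${2:-$HOME/simon-backups}"
    archive="$backup_dir/simon-training-$(date +%Y%m%d-%H%M%S).tar.gz"

    # Stop early if either item is missing, so no incomplete backup is written.
    [ -d "$model_dir/training.data" ] || { echo "missing: $model_dir/training.data" >&2; return 1; }
    [ -f "$model_dir/prompts" ] || { echo "missing: $model_dir/prompts" >&2; return 1; }

    mkdir -p "$backup_dir" || return 1
    # -C keeps the paths inside the archive relative to the model directory.
    tar -czf "$archive" -C "$model_dir" training.data prompts || return 1
    echo "$archive"
}

# Usage: backup_simon_training
#        backup_simon_training /path/to/model /path/to/backups
```

The function prints the path of the archive it created, so you can copy it somewhere safe afterwards.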

Another thing you should be aware of is that it is not beneficial to train only words without the grammar structure.
You can do it, but the result will not be as good as if you spoke the whole sentence. It can also lead to strange SPHINX errors (which are ignored to a certain extent, but if there are too many, Sphinx just crashes!).

While a new model is compiling, no speech recognition is possible. Even if the "Activated" button is pressed, recognition does not work until the compilation has finished.

Simon does all the compilation work and caching in /tmp. This can lead to many strange errors in Simon itself if there is no space left!
I would advise you to have at least 1-2 GB of free space in /tmp for one decent model compilation (the compilations of previous models get cleaned up by the fixing script).
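
Before starting a compilation you can check the free space in /tmp, for example like this. It is only a sketch: the function names tmp_free_kib and check_tmp_space are made up for illustration, and the 2 GiB default threshold is simply the upper end of the 1-2 GB advice above.

```shell
#!/bin/sh
# Sketch: warn when a directory (default /tmp) has less free space than a threshold.
tmp_free_kib() {
    # df -Pk prints one POSIX-format line per filesystem in 1024-byte blocks;
    # the fourth column of that line is the available space.
    df -Pk "${1:-/tmp}" | awk 'NR == 2 { print $4 }'
}

check_tmp_space() {
    dir="${1:-/tmp}"
    need_kib="${2:-2097152}"   # assumed threshold: 2 GiB expressed in KiB
    free_kib=$(tmp_free_kib "$dir")
    if [ "$free_kib" -lt "$need_kib" ]; then
        echo "Only $((free_kib / 1024)) MiB free in $dir; model compilation may fail." >&2
        return 1
    fi
    echo "OK: $((free_kib / 1024)) MiB free in $dir"
}

# Usage: check_tmp_space            (checks /tmp against 2 GiB)
#        check_tmp_space /tmp 1048576
```

The return code is non-zero when space is short, so the check can also be used in front of other commands with &&.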

If you start a model compilation and disconnect from the simond server too soon, the compilation will fail.
To be safe, do not disconnect while the compilation is running.

The relevance in the Training column in Simon is the combined relevance of all words in that scenario, not the individual recognition rate.

What is the Google API key?

It is a unique key that allows you, as a user, to use special Google services like translation, geolocation and many more WITHOUT the need to use a web browser. This enables you to use these services in your programs and scripts.

It also allows Google to monitor your usage of these services, so they can block you or charge you for using more than the free allowance.

Why do I need one?

For voice control/speech recognition itself you don't need one, but without it you will not be able to use these modules. (This list will most likely grow over time as new words enter the language model.)

Internet search with your voice (Amazon, Google etc)
Translation (translate words in other languages)
Music (the ability to select a specific track by just saying its name)

How to get one?

Follow this Google HowTo -> http://www.chromium.org/developers/how-tos/api-keys

Note that the keys you have now acquired are not for distribution purposes and must not be shared with other users.

Restrictions

Google restricts the use of their Speech API to 50 uses per day per API-Project.

How to use the API key with linjark?

Copy the API key (and only the API key) and paste it into a bash terminal with this command:
echo PASTE_KEY_HERE > ~/.linjark/google_api
That's it! The scripts that need access to the Google API will look in that file and get the key.
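
If you want to be a bit more careful, the key can also be stored with private permissions and read back without stray whitespace. This is a sketch only: the function names save_google_api_key and read_google_api_key are made up for illustration, and how the linjark scripts actually read the file is not shown here, so the reading part is an assumption.

```shell
#!/bin/sh
# Sketch: store the key in the file named above and read it back.
# KEY_FILE can be overridden; the default is the path from this page.
KEY_FILE="${KEY_FILE:-$HOME/.linjark/google_api}"

save_google_api_key() {
    mkdir -p "$(dirname "$KEY_FILE")" || return 1
    # printf writes the key as given, followed by a newline.
    printf '%s\n' "$1" > "$KEY_FILE" || return 1
    # The key must not be shared, so make the file readable by you only.
    chmod 600 "$KEY_FILE"
}

read_google_api_key() {
    [ -r "$KEY_FILE" ] || { echo "no key file at $KEY_FILE" >&2; return 1; }
    # Drop spaces and newlines that may have been pasted along with the key.
    tr -d '[:space:]' < "$KEY_FILE"
}

# Usage: save_google_api_key PASTE_KEY_HERE
#        key=$(read_google_api_key)
```

Creating the directory first avoids an error if ~/.linjark does not exist yet.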