cvoicecontrol

cvoicecontrol project overview

  • cvoicecontrol original home page http://www.kiecza.net/daniel/linux/index.html

  • cvoicecontrol maps speech to unix commands
  • you say 'star trek' and cvoicecontrol executes the defined unix command 'vlc /home/me/startrek/*.avi'
  • as of 5/2010, no alternative package exists to map custom speech to custom unix commands
  • gnome-voice-control hard-codes the speech commands as well as the unix commands.
    • e.g., with gnome-voice-control, changing 'star trek' to 'make it so' requires a source change
  • cvoicecontrol defines the map (speech, unix-command) in configuration, and is specifically designed to update the map quickly and easily (cool!)

cvoicecontrol is currently not in service...

  • Why cvoicecontrol is not used widely as of 5/2010:
    • cvoicecontrol was written in 2000 by Daniel Kiecza
    • bugs/problems have blocked potential users of cvoicecontrol for years
    • looks like no one has used it since at least 2007

Overview of cvoicecontrol architecture

  • cvoicecontrol compiles three binaries:
    • microphone_config - one-time microphone setup, outputs config file
    • model_editor - manager of (speech, unix command) maps, each map stored in distinct *.cvc file
    • cvoicecontrol - actual voice control daemon, loads a single *.cvc file; usage: ./cvoicecontrol my-voice-commands.cvc
  • voice recognition works as follows:
    • model_editor stores four or more samples of your voice for the same voice command
      • e.g. to have "make it so" launch your startrek dvd, you will record "make it so" four times in model_editor
    • cvoicecontrol matches utterances with the voice samples you recorded in model_editor, and then executes the associated unix command
    • the whole voice recognition process relies on the microphone config file generated by microphone_config

Problems preventing release of cvoicecontrol-0.9alpha

  • bug in microphone_config binary causes hard crash:
    • after finishing configuration, microphone_config writes a config file out. this fails.
    • the code at microphone_config.c:1155, which constructs the output path "$HOME/.cvoiceconfig/config", causes a hard crash
    • temporary fix: comment this code out and write the config file out to /tmp
    • because microphone_config is run once-per-microphone, moving the config file manually from /tmp to "$HOME/.cvoiceconfig/" is ok-for-now
  • depends on ncurses4. Ubuntu 10.04 ships with ncurses5
    • libncurses4 and libncurses4-dev install ok with libncurses5 present
    • possible issue with symlink, that may interfere with ncurses5: /usr/lib/libncurses.so -> /lib/libncurses.so.4
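
To see what the unversioned library name currently points at, a small check along these lines may help. This is a sketch only: describe_link is a hypothetical helper name, and the path is the one quoted above, which may differ on other systems.

```shell
# Report what a linker name resolves to (symlink target, plain file, or missing).
describe_link() {
    link="$1"
    if [ -L "$link" ]; then
        printf '%s -> %s\n' "$link" "$(readlink "$link")"
    elif [ -e "$link" ]; then
        printf '%s exists but is not a symlink\n' "$link"
    else
        printf '%s not found\n' "$link"
    fi
}

# Path taken from the note above; adjust if your distribution differs.
describe_link /usr/lib/libncurses.so
```

If the link points at libncurses.so.4, programs linked with -lncurses will build against ncurses4, which is what cvoicecontrol wants but may not be what other builds on the same machine expect.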

  • microphone_config and model_editor binaries, both of which use libncurses, work in the rxvt terminal. xterm and Eterm do not work at all.
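
The manual move that goes with the temporary microphone_config fix described above could look like the sketch below. install_mic_config is a hypothetical helper, and /tmp/config in the usage comment is a placeholder: the real file name under /tmp is whatever you chose when patching microphone_config.c.

```shell
# Move a config file written to /tmp into the directory cvoicecontrol reads.
# $1: the file written by the patched microphone_config (name is your choice)
# $2: optional destination directory, defaults to $HOME/.cvoiceconfig
install_mic_config() {
    src="$1"
    dest_dir="${2:-$HOME/.cvoiceconfig}"
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    # Create the directory if needed, then store the file under the name "config".
    mkdir -p "$dest_dir" && mv "$src" "$dest_dir/config"
}

# Usage (hypothetical source name):
#   install_mic_config /tmp/config
```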

Happy Path to building and getting cvoicecontrol working

  • !!! NOTE: this does not include the aforementioned changes to microphone_config.c, therefore microphone_config will crash when it attempts to write config file !!!
  • building:
    • get src http://www.kiecza.net/daniel/linux/index.html

    • install libncurses4 and libncurses4-dev (available from the Debian package repository)
    • ./configure
    • edit cvoicecontrol-0.9alpha/cvoicecontrol/Makefile:
      • note, this is *not* the root Makefile located at cvoicecontrol-0.9alpha/Makefile
      • search for LIBNCURSES
      • old < LIBNCURSES =
      • new > LIBNCURSES = -lncurses

    • make
    • all three binaries should now be located in cvoicecontrol-0.9alpha/cvoicecontrol/
  • voice commands setup:
    • see documentation for all stages of this process at the original project website http://www.kiecza.net/daniel/linux/index.html

    • install rxvt
    • run rxvt, run microphone_config
    • proceed through microphone configuration
      • Select Audio Device, /dev/dsp1 is generally your usb headset
      • Write Configuration, this step will fail and hard crash unless you make the changes to microphone_config.c mentioned above
    • exit microphone_config, mv the config file to $HOME/.cvoiceconfig/config
    • run model_editor, see original documentation mentioned above
    • in model_editor, Load/Save Speaker Model, be sure to include the .cvc extension in the file name
  • running the cvoicecontrol daemon:
    • cvoicecontrol your-voice-map.cvc
    • at this point, mine "just worked". I spoke "play music" and my vlc unix command was executed. Note I had to speak very loudly because my generated microphone_config file tells cvoicecontrol to wait until I speak loudly :-P.
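
The Makefile edit in the build steps above can also be scripted. A minimal sketch, assuming the generated Makefile contains a line of the form "LIBNCURSES =" with nothing after the equals sign, as in the old/new lines above; patch_libncurses is a hypothetical helper name, not part of cvoicecontrol.

```shell
# Fill in an empty LIBNCURSES assignment in the given Makefile.
# Lines that already have a value are left untouched, so re-running is harmless.
patch_libncurses() {
    makefile="$1"
    tmp="$makefile.tmp.$$"
    # Write to a temporary file first; in-place editing flags differ between sed versions.
    sed 's/^LIBNCURSES[[:space:]]*=[[:space:]]*$/LIBNCURSES = -lncurses/' "$makefile" > "$tmp" &&
        mv "$tmp" "$makefile"
}

# Usage, after ./configure and before make:
#   patch_libncurses cvoicecontrol-0.9alpha/cvoicecontrol/Makefile
```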

Runtime Gotchas

  • if your unix command does not terminate immediately, e.g. if you launch vlc, you must append an ampersand (&) to the unix command when specifying it in model_editor, so that it runs in the background. Failure to do this will cause cvoicecontrol to wait for the child process to terminate before continuing to process voice commands.

    • e.g. my voice command map contains ("play music", "/home/me/bin/music-start&"), and music-start is a script which launches vlc
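
The effect of the trailing ampersand can be seen outside cvoicecontrol. The sketch below runs a command string through sh and reports how many seconds the caller had to wait. It assumes cvoicecontrol hands the command string to a shell in a similar way, which matches the behaviour described above but has not been checked against the source. time_command is a hypothetical helper.

```shell
# Run a command string through sh and print how many whole seconds the caller waited.
time_command() {
    start=$(date +%s)
    # Output is discarded so a background child does not keep the caller's pipe open.
    sh -c "$1" >/dev/null 2>&1
    end=$(date +%s)
    echo $((end - start))
}

time_command 'sleep 2'      # foreground: the caller waits for sleep to finish
time_command 'sleep 2 &'    # background: the caller returns almost immediately
```

The second form is what the "music-start&" entry relies on: the shell starts the command and returns at once, so voice processing can continue.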

  • the speech matching is pretty liberal, and may (especially during initial configuration) accidentally execute the incorrect unix command. Only use transient read-only unix commands with cvoicecontrol (no rm :-) ).
  • model_editor requires absolute paths for unix commands, e.g. write /home/me/bin/flashing-lights instead of ~/bin/flashing-lights
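
To get the absolute form of a path before typing it into model_editor, something like the following sketch can be used. to_absolute is a hypothetical helper; it handles only a leading "~/" and relative paths, and does not resolve symlinks or "..".

```shell
# Print an absolute version of a path: expand a leading "~/" to $HOME,
# keep absolute paths as they are, and prefix anything else with the current directory.
to_absolute() {
    case "$1" in
        "~/"*) printf '%s/%s\n' "$HOME" "${1#??}" ;;   # strip the two characters "~/"
        /*)    printf '%s\n' "$1" ;;
        *)     printf '%s/%s\n' "$PWD" "$1" ;;
    esac
}

# The argument is quoted so the shell passes the literal "~/" to the function.
to_absolute '~/bin/flashing-lights'
```

With HOME set to /home/me, the call above prints /home/me/bin/flashing-lights, which is the form model_editor expects.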

cvoicecontrol (last edited 2010-05-12 18:45:19 by bas15-toronto12-1168007490)