ConsolidateSpellingLibs
Launchpad Entry: https://launchpad.net/distros/ubuntu/+spec/consolidate-spell-checkers
Packages affected: hunspell, myspell, aspell, ispell and depending packages
Summary
Reduce the number of spelling libraries used in main; modify / extend applications currently not using the One True Spellchecker to do so.
Rationale
Supporting only one implementation and one set of dictionaries eases long-term maintenance. In addition, words that the user teaches the spell checker will be consistently available throughout the system.
Use cases
- Jane spell-checks a document in OpenOffice and adds a few new words to her personal dictionary. Half an hour later she discusses the document with a colleague over ICQ. Pidgin's spell checking automatically knows about the newly added words.
- Hans pays by the byte for his internet connection. When he installs Intrepid, the installer downloads a lot less for language support than Hardy did. He is happy that it installs faster, too.
- Bob is a user with anger management issues. When he sees that a word he told Pidgin to remember isn't known by OpenOffice the next day, his Ubuntu installation might suffer physical/virtual damage, posing a risk to his health.
Scope
This affects all supported packages in Ubuntu which have spell-checking capability.
Design
Status quo
Four implementations are in use in current Intrepid:
- ispell: still used by some applications, but it is being replaced by aspell. Not used in Ubuntu main any more, and has just 4 reverse dependencies in universe (spell, sqwebmail, liblingua-ispell-perl, gnumed-client).
- aspell: ispell replacement. Biggest users in main are GNOME and KDE. Other reverse dependencies in main: ekg, pan, php5-pspell, python-gnome2-extras. About 20 reverse dependencies in universe.
- myspell: ispell replacement. Not used by anything in Ubuntu any more.
- hunspell: fork of and replacement for myspell; its dictionaries are compatible with myspell. Currently used by OpenOffice, Thunderbird, and Firefox.
- Finnish currently uses the Voikko system, which computes the multitude of possible word permutations from a common stem. Altaic (Turkish and related languages) and Finno-Ugric (Estonian, Finnish, Hungarian and relatives) languages, with their complex prefix and suffix systems, generally work badly with the above systems, so we will continue to support Voikko for Finnish. Since the supported languages of Voikko and hunspell do not overlap, this does not cause large compatibility problems.
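To illustrate why generating forms from a stem matters for such languages, here is a toy sketch in Python. It is not Voikko's actual algorithm, and the suffix lists are a hypothetical, heavily simplified subset that ignores vowel harmony and consonant gradation. It only shows how quickly the number of forms grows when suffixes combine.

```python
from itertools import product

def expand(stem, suffix_slots):
    """Generate every form of `stem` by picking one suffix from each slot, in order.

    Each slot is a list of alternatives; the empty string means "no suffix".
    """
    return ["".join((stem,) + combo) for combo in product(*suffix_slots)]

# Hypothetical, simplified Finnish-like suffix slots (illustration only).
CASE = ["", "ssa", "sta", "lle"]   # none, "in", "from", "to"
POSSESSIVE = ["", "ni", "si"]      # none, "my", "your"

forms = expand("talo", [CASE, POSSESSIVE])
print(len(forms), forms)
```

With only two small slots, one stem already yields 4 x 3 = 12 forms. A plain word list has to store each of them, whereas a stem-based system stores the stem and the rules.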
ispell and aspell provide a C interface; myspell and hunspell provide a C++ interface. The enchant library abstracts away the underlying spell checker (aspell or myspell) and can be used to convert applications to a common interface.
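The idea of such an abstraction layer can be sketched in Python as follows. This is a minimal illustration of the pattern only, not enchant's real API: `SpellBackend`, `WordListBackend`, and `Broker` are hypothetical names, and the in-memory word sets stand in for real spell-checking engines.

```python
from typing import Dict, List, Optional, Protocol, Set

class SpellBackend(Protocol):
    """Hypothetical common interface that every engine wrapper implements."""
    name: str
    def languages(self) -> Set[str]: ...
    def check(self, lang: str, word: str) -> bool: ...

class WordListBackend:
    """Toy backend backed by in-memory word sets; stands in for a real engine."""
    def __init__(self, name: str, words: Dict[str, Set[str]]) -> None:
        self.name = name
        self._words = words

    def languages(self) -> Set[str]:
        return set(self._words)

    def check(self, lang: str, word: str) -> bool:
        return word.lower() in self._words[lang]

class Broker:
    """Routes each request to the first backend that supports the language."""
    def __init__(self, backends: List[SpellBackend]) -> None:
        self._backends = backends  # ordered by preference

    def backend_for(self, lang: str) -> Optional[SpellBackend]:
        for backend in self._backends:
            if lang in backend.languages():
                return backend
        return None

    def check(self, lang: str, word: str) -> bool:
        backend = self.backend_for(lang)
        if backend is None:
            raise LookupError(f"no spell-checking backend for {lang!r}")
        return backend.check(lang, word)

broker = Broker([
    WordListBackend("hunspell-like", {"en_GB": {"colour"}}),
    WordListBackend("voikko-like", {"fi_FI": {"talo"}}),
])
print(broker.check("en_GB", "Colour"), broker.backend_for("fi_FI").name)
```

An application written against the broker does not need to know which engine serves a given language, which is what makes it possible to swap the default engine without touching each application.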
Dictionary support is split out into its own spec, ConsolidateDictionaries.
Goal in Intrepid
hunspell is the most modern implementation and is considered the best choice in the free software world.
- Change GNOME, KDE, and other packages in main to use hunspell.
- Drop ispell/aspell dictionaries from language-support-* and from main.
- Demote ispell to universe. Remove myspell from the archive.
- Drop myspell dictionaries from language-support-* where a hunspell dictionary is available. Keep the myspell one for other languages.
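The dictionary rule in the last item can be expressed as a small selection function. The sketch below is in Python; the function name and the language codes in the sample data are hypothetical and do not reflect the actual state of the archive.

```python
def plan_dictionaries(languages, hunspell_available, myspell_available):
    """Pick the dictionary type to ship for each language.

    Prefer hunspell where a hunspell dictionary exists, fall back to myspell
    otherwise, and record None when neither is available.
    """
    plan = {}
    for lang in languages:
        if lang in hunspell_available:
            plan[lang] = "hunspell"
        elif lang in myspell_available:
            plan[lang] = "myspell"
        else:
            plan[lang] = None
    return plan

# Hypothetical availability data, for illustration only.
plan = plan_dictionaries(
    languages=["de", "xx", "yy"],
    hunspell_available={"de"},
    myspell_available={"de", "xx"},
)
print(plan)
```

Here "de" has both kinds of dictionary and gets the hunspell one, "xx" keeps its myspell dictionary, and "yy" has no dictionary to ship.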
Implementation
GNOME
- Fedora now has patches to support hunspell through enchant.
- The http://fedoraproject.org/wiki/Releases/FeatureDictionary spec has pointers to Bugzilla.
- Szilveszter Farkas's PPA has test packages with those patches applied.
KDE
KDE 4 upstream already uses enchant/hunspell. The Fedora specification above has patches for KDE 3, but since Intrepid will only ship KDE 4.1, we do not need to do anything in particular for KDE for this specification.
Outstanding issues
The demotion of aspell has not been discussed. It would require porting php and ekg to hunspell, which hasn't happened upstream yet. Thus we will keep the aspell library itself in main for the time being.
Comments
Can you please also look into installing a dictionary for English (Canada) by default? When en_CA.UTF8 is selected, no Canadian dictionary is selected for OpenOffice. For Firefox, I believe the Great Britain English lib is installed. This is okay but not great. It would be nice to have at least that for OpenOffice, instead of having to manually install a dictionary for our language. But what would be best, of course, is the full-blown Canadian English lib. Where is it anyway? I'm fully prepared to work on getting this packaged, or creating it myself if need be, just tell me what to do! -- brett.jr.alton@gmail.com (I forget how to sign my name).
Why not use enchant (http://www.abisource.com/enchant/) everywhere? It supports all these *spell libraries and Voikko, plus it also supports Zemberek (a Turkish spell-checking library). Currently, Debian Lenny's gtkspell uses it and it rocks!
ConsolidateSpellingLibs (last edited 2008-08-06 16:16:08 by localhost)