ConsolidateSpellingLibs

Differences between revisions 1 and 11 (spanning 10 versions)
Revision 1 as of 2005-10-31 16:57:46
Size: 1401
Editor: 187_220_103_66-WIFI_HOTSPOTS
Comment: new page
Revision 11 as of 2006-12-21 11:35:19
Size: 4288
Editor: Home04482
Comment: point to other similar efforts
 * '''Created''': 2005-10-30 by MatthiasKlose
 * '''Contributors''':

Summary

Reduce the number of spelling libraries used in main, and modify or extend applications that do not yet use the system spell checker so that they use it as well. Currently up to four implementations are in use:

  • ispell: still used by some applications, but being replaced by aspell (no application in main uses it any more).
  • aspell: ispell replacement; its biggest users in main are Gnome and KDE.
  • myspell: ispell replacement, currently used by Thunderbird and Mozilla.
  • hunspell: fork of and replacement for myspell; its dictionaries are compatible with myspell. Used by OpenOffice.org.

ispell and aspell provide a C interface; myspell and hunspell provide a C++ interface. The enchant library abstracts over the underlying spell checker (aspell or myspell) and can be used to convert applications to a common interface.
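
The idea of a common interface over several spell checkers can be illustrated with a small sketch. This is not enchant's API: the names SpellBackend, WordListBackend, Broker and misspelled are hypothetical, and the toy backend only looks words up in an in-memory set where a real one would call aspell or hunspell.

```python
from typing import Iterable, Protocol


class SpellBackend(Protocol):
    """Hypothetical minimal interface an application would code against."""

    def check(self, word: str) -> bool: ...


class WordListBackend:
    """Toy stand-in for a real engine such as aspell or hunspell."""

    def __init__(self, name: str, words: Iterable[str]) -> None:
        self.name = name
        self._words = {w.lower() for w in words}

    def check(self, word: str) -> bool:
        # Case-insensitive lookup in the in-memory word set.
        return word.lower() in self._words


class Broker:
    """Hands out one backend per language, hiding which engine provides it."""

    def __init__(self) -> None:
        self._backends: dict[str, SpellBackend] = {}

    def register(self, lang: str, backend: SpellBackend) -> None:
        self._backends[lang] = backend

    def request(self, lang: str) -> SpellBackend:
        try:
            return self._backends[lang]
        except KeyError:
            raise LookupError(f"no spelling backend for {lang!r}") from None


def misspelled(broker: Broker, lang: str, text: str) -> list[str]:
    """Return the whitespace-separated words the backend does not accept."""
    checker = broker.request(lang)
    return [w for w in text.split() if not checker.check(w)]
```

With this shape, an application only ever talks to the broker, so swapping the engine behind a language does not require changes in the application.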

Dictionary support is split out into its own spec, ConsolidateDictionaries.

Rationale

Support only one implementation in the long term. More importantly, only support one source of dictionaries for one spelling library.

Use cases

Jane wonders why some words she writes in Gaim are marked as wrong, although they aren't marked as wrong in OpenOffice.

Asdf Qwert worries about the fact that no program knows his name. He is very depressed, because there is no easy way to teach it to all programs.

Karl from Germany, who knows that Thunderbird uses ispell, wonders why Thunderbird cannot spellcheck his emails with a German dictionary although he has installed ingerman and iogerman. He also wonders why OpenOffice finds his dictionaries and Thunderbird doesn't. (Thunderbird uses an older version of ispell, and its dictionaries are installed via XPIs.)

 tfheen: please provide use cases which show how things are supposed to work, not how they are currently broken 

Scope

Design

Implementation

The system spellchecker used in both Gnome and KDE is aspell. Two applications are not yet using aspell:

Firefox

  • First step: use hunspell instead of myspell, with the goal of demoting myspell from main. myspell and hunspell are supposed to be API compatible.
  • Second step: convert Firefox to use enchant.

OpenOffice.org

OOo makes use of hunspell-based dictionaries and hyphenation patterns. Hyphenation patterns are not (?) supported by aspell, so it may prove difficult to completely replace hunspell with aspell. As a first step, write a plugin that also reads and writes the aspell system and user dictionaries, alongside the dictionaries OOo already uses.

(janimo: Why replace hunspell with aspell in OO? They made the switch the other way around starting with 2.0.2)
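
A rough sketch of the user-dictionary half of such a plugin is shown below. It assumes that an aspell personal word list is a text file with a header line of the form "personal_ws-1.1 <lang> <count>" followed by one word per line; that format should be verified against the aspell documentation before relying on it. The function names are hypothetical, and the real plugin would live inside OOo rather than in Python.

```python
from pathlib import Path

# Assumed header tag of an aspell personal word list (verify before use).
HEADER_PREFIX = "personal_ws-1.1"


def read_personal_words(path: Path) -> list[str]:
    """Return the words from a personal word list, skipping the header line."""
    if not path.exists():
        return []
    lines = path.read_text(encoding="utf-8").splitlines()
    if lines and lines[0].startswith(HEADER_PREFIX):
        lines = lines[1:]
    return [line.strip() for line in lines if line.strip()]


def add_personal_word(path: Path, lang: str, word: str) -> list[str]:
    """Add a word if it is new, then rewrite the file with an updated count."""
    words = read_personal_words(path)
    if word not in words:
        words.append(word)
    header = f"{HEADER_PREFIX} {lang} {len(words)}"
    path.write_text("\n".join([header, *words]) + "\n", encoding="utf-8")
    return words
```

If every application wrote new words through one such shared file, a name added in one program would be known to the others, which is what the "Asdf Qwert" use case asks for.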

Outstanding issues

BoF agenda and discussion

Additional requests from the 2005 spec:

* It would be universally cool if we could somehow tie in Rosetta, language packs and this unified spellchecking library. This way translators and users would be using the same de-facto list, and translators could on the fly correct the list if there happened to be a problem. This would definitely make my life as a translator a tad easier.

* It should be easy to switch between languages for spell checking. People have a local language set but often need to use other languages to engage on-line (like I'm doing here). (I'm thinking of programs like Xchat/gaim/firefox that only check the local system language.) Spell checking should be enabled throughout the system for all languages selected in "language support". There should be a standard way to switch between languages in applications, but perhaps it is even possible to detect what language is being used in a particular field. - finalbeta

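* Fedora people want this as well:
 https://www.redhat.com/archives/fedora-devel-list/2006-September/msg00225.html
 http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=207571
 and Mozilla is considering moving to hunspell from myspell:
 https://bugzilla.mozilla.org/show_bug.cgi?id=319778
 -janimo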
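
The request above to detect which language is being used in a particular field could start from a very simple heuristic: count how many words of the text each installed word list recognises and pick the best match. The sketch below is only an illustration of that idea. The function name guess_language and the in-memory word lists are hypothetical, and a real implementation would query the system dictionaries instead.

```python
from typing import Optional


def guess_language(text: str, wordlists: dict[str, set[str]]) -> Optional[str]:
    """Pick the language whose word list recognises the most words in text.

    Returns None when no word list recognises any word. Ties go to the
    language code that sorts first.
    """
    # Strip common punctuation and lowercase each whitespace-separated token.
    words = [w.strip(".,;:!?()\"'").lower() for w in text.split()]
    words = [w for w in words if w]
    best_lang, best_hits = None, 0
    for lang in sorted(wordlists):
        hits = sum(1 for w in words if w in wordlists[lang])
        if hits > best_hits:
            best_lang, best_hits = lang, hits
    return best_lang
```

Short fields give such a heuristic very little to work with, so an explicit per-application language switch would still be needed as a fallback.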


CategorySpec

ConsolidateSpellingLibs (last edited 2008-08-06 16:16:08 by localhost)