This document outlines the Linux desktop accessibility stack, known as at-spi: a little on how it works, and its low-level requirements for input and for determining the on-screen position of widgets and windows.
The at-spi stack workings in a nutshell
The free desktop accessibility stack has been designed to be as desktop environment agnostic as possible. The stack is made up of several pieces, some of which only apply to GNOME. A good explanation of how the pieces fit together can be found in a blog post about accessibility in WebKit GTK, although only some of that post is applicable here. Qt contains its own code to communicate with the registry via D-Bus, so in the generic examples given in that post, you can replace atk/Gtk with Qt. Like Gtk, Qt has its own internal system for tracking accessibility information for on-screen widgets, called QAccessible. Given the cross-platform nature of Qt, its accessibility framework has multiple backends depending on the OS: on Linux it is at-spi, on Windows it is IAccessible2, and so on.
Input event consumption
One key task performed by the at-spi registry daemon is to deal with input events and allow assistive technologies to act on them if required. At this point the registry daemon only deals with keyboard and mouse events. Upstream doesn't currently appear to have any plans to add support for touch events. Keyboard events and mouse events are each handled differently, as explained below.
Mouse events are received through the X input framework. Assistive technologies use them to track the position of the mouse, so they can take action based on the mouse location. For example, the Orca screen reader allows the widget under the mouse cursor to be described as the mouse is moved. Assistive technologies can also request that the mouse cursor be moved to a particular on-screen location. For example, if you use Orca's mouse click command on a toolbar button, the mouse cursor will be moved to the on-screen position of that particular toolbar button.
At the moment, keyboard events are managed quite differently, due to the lack of a decent API from X to subscribe to them. In the early days, the XEVIE X extension was going to be used to deal with keyboard events, but this never eventuated due to various bugs, and XEVIE was eventually dropped. As a result, keyboard events are retrieved via the toolkit of the currently focused application. Taking Gtk as an example, Gtk has methods to snoop for keyboard activity. When a Gtk app has focus, Gtk detects the key presses made by the user and, before doing anything with them, sends them to the registry daemon for processing. If an assistive technology such as Orca is registered with the registry daemon, the registry daemon forwards these key events to the assistive technology so that it can act on them if required. If the key presses are something the assistive technology wants to act on, it does so, and effectively swallows the key presses associated with its action. Otherwise, the assistive technology signals that it doesn't intend to act on the key presses, and the registry in turn signals Gtk to pass them through to the application. Because of the deficiencies in the X input API described above, Qt does the same thing as Gtk to make sure assistive technologies can process keyboard and mouse events.
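The key event flow described above can be sketched in a few lines of Python. This is a simplified model only, not the at-spi or Gtk API: the names `KeyEvent`, `Registry`, `Toolkit`, and `screen_reader_listener` are hypothetical, and the assumption that the screen reader claims every key press made with an "Insert" modifier is chosen purely for illustration.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class KeyEvent:
    """Hypothetical key event; real at-spi events carry more fields."""
    key: str
    modifiers: frozenset = frozenset()


# A listener returns True to consume (swallow) the event, False to pass it on.
KeyListener = Callable[[KeyEvent], bool]


class Registry:
    """Stand-in for the registry daemon: forwards key events to listeners."""

    def __init__(self) -> None:
        self._listeners: List[KeyListener] = []

    def register_listener(self, listener: KeyListener) -> None:
        self._listeners.append(listener)

    def notify_key_event(self, event: KeyEvent) -> bool:
        # any() stops at the first listener that consumes the event.
        return any(listener(event) for listener in self._listeners)


class Toolkit:
    """Stand-in for Gtk or Qt: snoops key presses before the app sees them."""

    def __init__(self, registry: Registry) -> None:
        self.registry = registry
        self.delivered: List[KeyEvent] = []  # events the application received

    def on_key_press(self, event: KeyEvent) -> None:
        if self.registry.notify_key_event(event):
            return  # swallowed by an assistive technology
        self.delivered.append(event)


def screen_reader_listener(event: KeyEvent) -> bool:
    # Assumption for illustration: the screen reader acts on Insert+<key>.
    return "Insert" in event.modifiers
```

With one listener registered, a key press carrying the "Insert" modifier never reaches `Toolkit.delivered`, while a plain key press does. The important property is that the toolkit waits for the registry's answer before delivering the event to the application.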
The long-term plan is to have the registry listen for all keyboard and mouse events, to allow assistive technologies to act on them if desired and, in the case of keyboard events, to swallow and act on key presses such that the end-user application never sees them. There have been discussions between accessibility stack maintainers and Wayland developers to resolve this issue, but a consensus has not yet been reached. As explained above, touch event support is not yet implemented in the registry daemon; however, this is a desirable feature, given our target platforms.
On-screen position of windows and widgets
One of the key tasks of the accessibility registry is to track currently running applications, their widgets, and the current state of those widgets. The registry also tracks the current on-screen position of windows/frames and widgets. The accessibility APIs allow these coordinates to be retrieved relative to either the current window or the whole screen. As mentioned above, Orca can move the mouse cursor to the current on-screen coordinates of any widget. Orca also uses these coordinates to visually highlight on-screen contents when its flat review feature is used. Flat review allows an Orca user to get an idea of the on-screen layout of a window by moving from top to bottom, or vice versa. If Orca cannot get coordinates for one or more widgets, these widgets are effectively invisible to flat review.
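To make the role of these coordinates concrete, here is a minimal Python sketch, assuming a much simpler model than Orca's real flat review. The `Rect` and `Widget` types and the helper functions are hypothetical names introduced for illustration; widget extents are assumed to be window-relative, and `None` stands for a widget whose toolkit reports no coordinates.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Rect:
    x: int
    y: int
    width: int
    height: int


@dataclass
class Widget:
    name: str
    extents: Optional[Rect]  # window-relative; None if unavailable


def to_screen(window_origin: Tuple[int, int], rect: Rect) -> Rect:
    """Convert window-relative extents to whole-screen extents."""
    wx, wy = window_origin
    return Rect(wx + rect.x, wy + rect.y, rect.width, rect.height)


def center(rect: Rect) -> Tuple[int, int]:
    """Point to move the mouse cursor to when targeting a widget."""
    return (rect.x + rect.width // 2, rect.y + rect.height // 2)


def flat_review_order(widgets: List[Widget]) -> List[Widget]:
    """Order widgets top to bottom, then left to right.

    Widgets without coordinates are dropped, which is why they end up
    invisible to this kind of review.
    """
    visible = [w for w in widgets if w.extents is not None]
    return sorted(visible, key=lambda w: (w.extents.y, w.extents.x))
```

For a window whose top-left corner is at (100, 50) on screen, a button at window-relative (200, 300) with size 80x30 maps to screen extents starting at (300, 350), and the cursor target is its center at (340, 365).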
The GNOME shell magnifier also uses the accessibility stack to provide additional functionality in the form of text tracking. When a text entry field gets focus, the magnifier uses the known coordinates of the text entry widget to reposition itself, allowing the user to read the contents of the field as it is entered. The magnifier also uses the known coordinates of the text cursor to follow the text entry, to allow continued review of entered text.
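As a rough illustration of how text cursor coordinates could drive a magnifier, the Python sketch below computes which part of the screen to show so that the caret stays centered. It assumes a full-screen magnifier and a simple "centered" tracking policy; the function name and parameters are hypothetical and this is not the GNOME shell implementation.

```python
from typing import Tuple


def viewport_origin(caret_x: float, caret_y: float,
                    screen_w: float, screen_h: float,
                    zoom: float) -> Tuple[float, float]:
    """Return the top-left corner of the screen region to magnify.

    The region is screen_w/zoom by screen_h/zoom in size, centered on the
    caret where possible and clamped so it never leaves the screen.
    """
    view_w = screen_w / zoom
    view_h = screen_h / zoom
    left = caret_x - view_w / 2
    top = caret_y - view_h / 2
    # Clamp so the magnified region stays within the screen bounds.
    left = min(max(left, 0), screen_w - view_w)
    top = min(max(top, 0), screen_h - view_h)
    return (left, top)
```

On a 1920x1080 screen at 2x zoom, a caret at the screen center (960, 540) gives a region starting at (480, 270), while a caret near the top-left corner is clamped to (0, 0). Without caret coordinates from the accessibility stack, no such calculation is possible.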
Gtk and Qt both have code to report the current position of their windows and widgets. This data is updated whenever widgets get added, removed, or resized. Both use X in various ways to determine window position, and X may also be used to determine widget position relative to the top left corner of the window.
As mentioned above, a currently unimplemented but desirable feature for the accessibility stack is to process and act on multi-touch events. This would work the same way as for keyboard events, in that the registry daemon and assistive technologies would receive, act on, and swallow events. Swallowing of touch events is required because, when an assistive technology such as Orca is enabled, the existing gestures can no longer be used. This is because the user needs to be able to explore on-screen content with their finger, and only once they have found what they are looking for will they want to act on the item in question. Interacting with on-screen content when Orca is used does require a different set of gestures to be developed, but such gesture processing would be implemented in Orca itself, and not in the registry.
Mir and at-spi requirements
Given the above situation, the following is required for at-spi and the accessibility stack as a whole to function under Mir.
- The at-spi registry daemon needs to be able to subscribe to input events for keyboard, mouse, and touch.
- The registry daemon needs to be able to request that the mouse cursor be moved to any point on screen.
- The registry daemon needs to be able to prevent keyboard and touch events from reaching end-user applications.
- The UI toolkits, i.e. Gtk and Qt, need to be able to request window position on screen for use with the accessibility stack.
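The four requirements above can be read as a small interface that a display server would need to offer. The Python sketch below is purely illustrative: `AccessibilityDisplayServer` and `FakeServer` are hypothetical names, not part of Mir or at-spi, and the in-memory fake exists only so the sketch can be run.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Tuple

# A handler returns True to ask that the event be withheld from applications.
InputHandler = Callable[[dict], bool]


class AccessibilityDisplayServer(ABC):
    """Hypothetical interface; not an actual Mir API."""

    @abstractmethod
    def subscribe(self, device: str, handler: InputHandler) -> None:
        """Requirement 1: receive 'keyboard', 'mouse', or 'touch' events."""

    @abstractmethod
    def move_pointer(self, x: int, y: int) -> None:
        """Requirement 2: move the mouse cursor to any point on screen."""

    @abstractmethod
    def dispatch(self, device: str, event: dict) -> bool:
        """Requirement 3: return True if the event may reach applications."""

    @abstractmethod
    def window_position(self, window_id: str) -> Tuple[int, int]:
        """Requirement 4: report a window's position on screen."""


class FakeServer(AccessibilityDisplayServer):
    """In-memory stand-in used only to exercise the interface."""

    def __init__(self, windows: Dict[str, Tuple[int, int]]) -> None:
        self._handlers: Dict[str, List[InputHandler]] = {}
        self._windows = dict(windows)
        self.pointer = (0, 0)

    def subscribe(self, device: str, handler: InputHandler) -> None:
        self._handlers.setdefault(device, []).append(handler)

    def move_pointer(self, x: int, y: int) -> None:
        self.pointer = (x, y)

    def dispatch(self, device: str, event: dict) -> bool:
        results = [h(event) for h in self._handlers.get(device, [])]
        # Only keyboard and touch events may be withheld from applications.
        return not (device in ("keyboard", "touch") and any(results))

    def window_position(self, window_id: str) -> Tuple[int, int]:
        return self._windows[window_id]
```

In this model, mouse handlers are notified but can never block delivery, which matches the list above: only keyboard and touch events need to be preventable.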