Spec

Mir

Summary

We are developing a next-generation display server known as Mir, a system-level component intended to replace the X window server system and to unlock next-generation user experiences for devices ranging from Linux desktops to mobile devices powered by Ubuntu. This document outlines the motivation for the project, describes the high-level design, summarizes the scope, and provides the roadmap of the Mir display server.

The purpose of Mir is to enable the development of the next generation Unity.

Motivation - Why Mir?

In recent years, the sophisticated user experience offered by mobile devices like the iPhone or Android-powered devices has changed the expectations of users regarding a “fast’n’fluid” (f’n’f) way of interacting with their devices. Historically, graphical user interfaces on the Linux platform have been powered by the X windowing system. X has a long and successful history and it has served the purposes of both system-level and application-level UI well for nearly three decades. However, users nowadays expect a more consistent and more integrated user experience than what is possible to offer on top of the X window system. Even more recent developments, like the introduction of compositors to the X stack, do not fully solve the situation, and both shell and application development have to deploy workarounds to overcome issues with the X rendering model. With respect to shell development (Unity), three major shortcomings of the X stack prevent us from delivering the user experience (f’n’f) we have in mind:

  • X shares a lot of system state across process boundaries. This is obviously not a problem in itself but a system-level UI that is meant to provide a beautiful and consistent user experience is likely to require tight control over the overall system state.
  • X's input model is complex and allows applications to snoop on input events they do not own. On the one hand, this raises serious security concerns, especially regarding mobile platforms. On the other hand, adjusting and extending X's input model is difficult, and supporting features like input event batching and compression, motion event prediction together with associated power-saving strategies, or flexible synchronization schemes for aligning input event delivery and rendering operations is (too) complex.
  • The compositor hierarchy ends on the session level, and no tight integration into the system from boot time onward is available. For that reason, there is a visible glitch when transitioning the system from a VT-level to the graphical shell level.

In addition to the points mentioned before, X's graphics driver model lacks focus and its adoption throughout the industry has been problematic. Again focusing on the mobile use-cases, more consistent driver models like the Android graphics driver model enjoy much better support and adoption by SoC and GPU vendors. For this reason, we decided to go for a well-defined driver model and we stated the following requirements:

  • Tailored towards an EGL/GL(ES) world.
  • Minimal assumptions regarding the underlying driver model.
  • Ability to leverage existing drivers implementing the Android driver model.
  • Ability to leverage existing hardware compositors.

In summary, we want to provide a graphics stack that works across different platforms and driver models by limiting our assumptions to a bare minimum. The graphics stack and its display server component should be easy to integrate with the shell and act as a model that allows a shell to inject/define custom behavior easily. Here, our focus on security plays an important role: we want to avoid the need to expose a privileged protocol that would need to be guarded by additional security means like AppArmor. To this end, we prefer an in-process approach that allows a shell implementation to interact with the display server model in a much more flexible way.

Finally, we want to emphasize our focus on quality and enforce a test-driven development approach for the display server component. We require every component of the system to be under test to ensure its correct functionality and to provide us with a test harness that allows us to evolve the system efficiently and safely.

Why Not Wayland / Weston?

An obvious clarification first: Wayland is a protocol definition that defines how a client application should talk to a compositor component. It touches areas like surface creation/destruction, graphics buffer allocation/management, input event handling and a rough prototype for the integration of shell components. However, our evaluation of the protocol definition revealed that the Wayland protocol does not meet our requirements. First, we are aiming for more extensible input event handling that takes future developments like 3D input devices (e.g. Leap Motion) into account. Please note though that Wayland's input event handling does not suffer from the security issues introduced by X's input event handling semantics (thanks to Daniel Stone and Kristian Høgsberg for pointing this out). With respect to mobile use-cases, we think that the handling of input methods should be reflected in the display server protocol, too. As another example, we consider the shell integration parts of the protocol as privileged and we'd rather avoid having any sort of shell behavior defined in the client-facing protocol.

However, we still think that Wayland's attempt at standardizing the communication between clients and the display server component is very sensible and useful, but due to our different requirements we decided to go for the following architecture with respect to protocol integration:

  • A protocol-agnostic inner core that is extremely well-defined, well-tested and portable.
  • An outer-shell together with a frontend-firewall that allow us to port our display server to arbitrary graphics stacks and bind it to multiple protocols.
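
As a rough illustration of this split, the following Python sketch keeps a core that only knows about surfaces and binds two frontends to it. All names here (`Core`, `TextFrontend`, `DictFrontend`, and both message formats) are hypothetical and invented for this example; they are not Mir's actual interfaces. Each frontend also acts as a simple firewall by rejecting any request it does not explicitly support.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Surface:
    surface_id: int
    width: int
    height: int


class Core:
    """Hypothetical protocol-agnostic core: manages surfaces, knows no wire format."""

    def __init__(self) -> None:
        self._surfaces: Dict[int, Surface] = {}
        self._next_id = 1

    def create_surface(self, width: int, height: int) -> Surface:
        surface = Surface(self._next_id, width, height)
        self._surfaces[surface.surface_id] = surface
        self._next_id += 1
        return surface

    def destroy_surface(self, surface_id: int) -> bool:
        # Returns False when the surface does not exist.
        return self._surfaces.pop(surface_id, None) is not None

    def surface_count(self) -> int:
        return len(self._surfaces)


class TextFrontend:
    """Hypothetical frontend for a made-up line protocol ("create W H", "destroy ID")."""

    def __init__(self, core: Core) -> None:
        self._core = core

    def handle(self, message: str) -> str:
        parts = message.split()
        if len(parts) == 3 and parts[0] == "create":
            surface = self._core.create_surface(int(parts[1]), int(parts[2]))
            return f"ok {surface.surface_id}"
        if len(parts) == 2 and parts[0] == "destroy":
            return "ok" if self._core.destroy_surface(int(parts[1])) else "error unknown"
        # Firewall behavior: anything else never reaches the core.
        return "error unsupported"


class DictFrontend:
    """Second hypothetical frontend: same core, different (dict-based) protocol."""

    def __init__(self, core: Core) -> None:
        self._core = core

    def handle(self, request: dict) -> dict:
        if request.get("op") == "create":
            surface = self._core.create_surface(request["width"], request["height"])
            return {"status": "ok", "id": surface.surface_id}
        return {"status": "unsupported"}
```

Because both frontends only translate messages into calls on `Core`, adding a further protocol in this sketch means adding one more translator class without touching the core.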

In summary, we have not chosen Wayland/Weston as our basis for delivering a next-generation user experience as it does not fulfill our requirements completely. Moreover, with our protocol- and platform-agnostic approach, we can make sure that we reach our goal of a consistent and beautiful user experience across platforms and device form factors. However, Wayland support could be added either by providing a Wayland-specific frontend implementation for our display server or by providing a client-side implementation of libwayland that ultimately talks to Mir.

Objectives

In general, we have the following attributes in mind when developing the system:

Well-Defined Functionality

We develop the system based on requirements and use-cases. We want to avoid the situation of unnecessary feature-bloat, with the system evolving on its own time-line without actual need for it.

Efficiency

The system should fulfill all of the requirements as efficiently as possible, with a focus on CPU cycles, GPU cycles, memory and power consumption. We want to establish a set of benchmarks that make sure that the system lives up to this attribute.

Test-Driven

The system should be under test as much as possible. We consider all three levels of testing detail (unit, integration and acceptance tests) to ensure high quality and to deliver a product that just works (tm). Moreover, development of any feature should only start once a well-defined acceptance test is available. Any feature that we cannot test for cannot be implemented to a high quality.

Versatile & Flexible

The system should easily be adaptable and portable to different platforms and use-cases (within the range of the well-defined functionality mentioned before). Running the system on a mobile device, exposing only a limited functionality like a system-level compositor should not be a special-case but a requirement easily fulfilled by the system.

Security

We want to avoid exposing any sort of privileged protocol to client applications. In particular, we want to prevent (malicious) client applications from snooping on the input event stream or capturing the screen content without at least a prior authorization/authentication step. To this end, we restrict the set of non-privileged operations.

Toolkit Integration & Legacy X Application Support

Mir's client library should be easy to integrate with existing toolkits. Application authors relying on Qt/QML, GTK3, XUL etc. should not be required to perform additional porting as we will work on providing Mir integration for the most prominent toolkit choices. In reality though, certain legacy applications will not be able to transition away from X completely, and we will provide an in-session rootless X server that is integrated with Mir. It acts as an on-demand compatibility layer between legacy X applications and the session-level Unity/Mir instance.

Scope

This section gives a high-level overview of the functionality that the final version of the system should provide. Please refer to the section “Roadmap” for time estimates and targeted release version for the individual features.

Mir Project

The majority of the Mir software is in the Mir project on Launchpad. This project produces two libraries:

  • libmir-server - A library containing the server side components of Mir. This is used to implement a compositor.
  • libmir-client - A library to allow applications to communicate with Mir servers. This is used by toolkits.

In addition to the Mir project there are some associated projects that build on Mir technology:

  • QMir - Qt bindings for Mir

  • unity-system-compositor - A Mir server that composites between sessions, greeters and boot screens
  • Unity - A user shell implemented using Mir

  • Unity Greeter - A greeter implemented using Mir

A full Mir based display stack looks like this: Compositor_Cascade.png

Mir Internals

mir-scope.png

Compositor

The compositor is responsible for presenting the final scene consisting of all application and shell surfaces (windows) on screen. It contains a renderer that takes care of applying effects (e.g., drop shadows) to the individual surfaces. The compositor is synchronized to vblank to avoid tearing and wasting cycles.

Input Management

The system should support reading measurements (coordinates, keys, acceleration values …) from arbitrary input devices, pre-processing the event stream, presenting it to a chain of server-side filters (e.g., to support shell-level gesture recognition or keyboard interaction) and finally delivering it to client applications. We want the server-side input stack to be flexible in that it should support reading from arbitrary input devices, with a focus on the evdev kernel subsystem.
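
The filter chain described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the Android-derived stack Mir actually uses: `InputEvent`, `InputDispatcher`, and `alt_tab_filter` are hypothetical names, and a filter is assumed to signal "consumed" by returning `True`.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class InputEvent:
    device: str      # illustrative device name, e.g. "keyboard0"
    kind: str        # "key" or "motion"
    payload: Tuple   # key names or coordinates


# A server-side filter returns True when it consumes the event.
InputFilter = Callable[[InputEvent], bool]


class InputDispatcher:
    """Hypothetical dispatcher: runs filters in order, then delivers to the client."""

    def __init__(self, deliver_to_client: Callable[[InputEvent], None]) -> None:
        self._filters: List[InputFilter] = []
        self._deliver_to_client = deliver_to_client

    def add_filter(self, input_filter: InputFilter) -> None:
        self._filters.append(input_filter)

    def dispatch(self, event: InputEvent) -> bool:
        """Return True if the event reached the client, False if a filter consumed it."""
        for input_filter in self._filters:
            if input_filter(event):
                return False
        self._deliver_to_client(event)
        return True


def alt_tab_filter(event: InputEvent) -> bool:
    # Shell-level keyboard interaction: the shell keeps Alt+Tab for itself.
    return event.kind == "key" and event.payload == ("Alt", "Tab")


delivered: List[InputEvent] = []
dispatcher = InputDispatcher(delivered.append)
dispatcher.add_filter(alt_tab_filter)
dispatcher.dispatch(InputEvent("keyboard0", "key", ("Alt", "Tab")))  # consumed by the shell
dispatcher.dispatch(InputEvent("keyboard0", "key", ("a",)))          # delivered to the client
```

In this sketch a client never sees an event that a shell filter has claimed, which mirrors the ordering given in the text: pre-processing, server-side filters, then client delivery.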

Finally, we want to make sure that the input stack is as efficient as possible with respect to power consumption. Most importantly, we want to be able to throttle down event propagation to client applications to match vblank and account for the loss in sampling accuracy by means of predicting future motion events.
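
One way to picture throttling with prediction is the sketch below. It assumes a simple linear extrapolation from the last two samples, which is a stand-in chosen for this example; the text does not specify the prediction method. `MotionSample` and `MotionBatcher` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class MotionSample:
    t_ms: float
    x: float
    y: float


class MotionBatcher:
    """Hypothetical batcher: collects raw samples, emits at most one event per vblank."""

    def __init__(self) -> None:
        self._pending: List[MotionSample] = []

    def add(self, sample: MotionSample) -> None:
        self._pending.append(sample)

    def flush(self, frame_time_ms: float) -> Optional[MotionSample]:
        """Called once per vblank; returns one sample predicted to the frame time."""
        if not self._pending:
            return None
        result = self._pending[-1]
        if len(self._pending) >= 2:
            previous = self._pending[-2]
            dt = result.t_ms - previous.t_ms
            if dt > 0:
                # Assumed model: constant velocity between the last two samples.
                vx = (result.x - previous.x) / dt
                vy = (result.y - previous.y) / dt
                ahead = frame_time_ms - result.t_ms
                result = MotionSample(frame_time_ms,
                                      result.x + vx * ahead,
                                      result.y + vy * ahead)
        self._pending.clear()
        return result
```

With this scheme a client is woken once per frame instead of once per hardware sample. A batch that contains a single sample is passed through unchanged, since this simplified version keeps no history across frames.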

We have looked at multiple candidate input stacks and have chosen the one included in Android for its efficiency, clear design and flexibility. We adapted the stack to compile outside of the Android source tree, only relying on the STL and boost.

Output Management

The system should support monitoring connected physical display devices, without assuming a certain type of connector. Moreover, the system should provide means for shell components to react to changes in the configuration of the physical display devices, to:

  • Support common multi-monitor use-cases.
  • Support seamless transitions between different form factors (thinking about the convergence device here).
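
A shell reacting to display configuration changes could look roughly like the following Python sketch. `Output` and `OutputManager` are hypothetical names used only for illustration, and the listener callback shape is an assumption rather than a Mir interface.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Output:
    name: str     # illustrative identifier; no connector type is assumed
    width: int
    height: int


ConfigurationListener = Callable[[List[Output]], None]


class OutputManager:
    """Hypothetical manager: tracks connected outputs and notifies shell listeners."""

    def __init__(self) -> None:
        self._outputs: Dict[str, Output] = {}
        self._listeners: List[ConfigurationListener] = []

    def on_change(self, listener: ConfigurationListener) -> None:
        self._listeners.append(listener)

    def connect(self, output: Output) -> None:
        self._outputs[output.name] = output
        self._notify()

    def disconnect(self, name: str) -> None:
        # Only notify when something actually changed.
        if self._outputs.pop(name, None) is not None:
            self._notify()

    def _notify(self) -> None:
        snapshot = sorted(self._outputs.values(), key=lambda output: output.name)
        for listener in self._listeners:
            listener(snapshot)
```

A shell component would register a listener once and re-lay out its surfaces whenever it is called, for example when a phone is docked to an external monitor.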

Another important area of functionality is support for multiple GPUs with different characteristics running in the same system. High-end laptops with discrete graphics powering games or 3D-intensive applications and featuring an on-chip graphics solution for low power consumption scenarios are a prominent example here. We want to be able to seamlessly transition between both GPUs and move applications and their respective EGL contexts from one GPU to the other.

Application Management

Applications should be first-class citizens in our display server. An application is named and consists of an arbitrary number of surfaces. The shell components can access the set of currently running/registered applications and operate on top of the collection to provide e.g. Alt-Tab functionality.

Shell

The shell, or system-level UI, will be a first-class citizen of the display server, at least in terms of well-defined interfaces that are used to communicate back and forth between the shell and the other components of the display server. We are currently pursuing an in-process shell approach, but we might revisit this decision in the future.

Inter-app Data Exchange

Exchanging data between running applications is very limited in the X world. We have basic support for copy’n’paste and drag’n’drop operations, but the experience that is currently offered is very limited and barely functional. For this reason, we want the display server to provide an advanced way for applications to exchange arbitrary data, together with a seamless user experience when initiating and carrying out the actual data exchange.

Mir Today

On Android Drivers

Currently, the Phablet image uses SurfaceFlinger to render. Very soon, however, this will be replaced with Mir, and eventually the tablet will use the same infrastructure as the desktop image:

13_05_Ubuntu_Touch.png

Mir on the Free Graphics Driver Stack

Right now, Mir is able to run on top of the free graphics driver stack, leveraging GBM, DRM and KMS to integrate with existing graphics hardware.

13_05_Free_Driver_stack.png

Mir on HW Supported By Closed Source Drivers

Right now, Mir does not run on desktop hardware that requires closed source drivers. However, we are in contact with GPU vendors and are working closely together with them to support Mir and to distill a reusable and unified EGL-centric driver model that further eases display server development in general and keeps cross-platform use-cases in mind.

Roadmap

The full roadmap can be seen in the client-1303-mir-converged blueprint. Key milestones are:

May 2013

Finish the first step towards integrating Unity Next with Mir and provide enough facility to start iterating the actual shell development, providing developers with a solid platform and designers with means for rapid prototyping.

October 2013

Unity Next & Mir window management are completely integrated with the rest of the system to support an Ubuntu Phone product. For the desktop/laptop form-factor, we want to fully replace X in user sessions and provide a legacy mode that allows legacy X clients to run against an on-demand rootless X server. A cascade of display servers/shells is implemented, with the session-level instances talking to a global system compositor instance, providing a flicker-free, tightly integrated and beautiful UX.

April 2014

Complete convergence across the form factors is achieved, with Mir serving as the carrier across form factors, powering a seamless transition between different use-cases and devices.