DesktopCouchWishList

This is where I'm keeping my notes on changes I would like to see in desktopcouch, based on my experience working on dmedia. Note that this is written out of love for desktopcouch, despite being a list of pain points I would like to see changed.

May Stuart and Jason's bromance continue.

Intro

Both dmedia and Novacut use an architecture like this:

novacut-dmedia.png

Key features being:

  • Backend components that must talk to both desktopcouch and vanilla CouchDB
  • HTML UI components that must run in both embedded WebKit and regular browsers, must talk to both desktopcouch and vanilla CouchDB

To do this, dmedia has rolled its own way to abstract away the "desktop" part of desktopcouch. This is basically possible, but it is a bit fragile and hacky. In the O cycle, I'd really love for the following to happen:

  1. Remove the reconnection hack (i.e., have the desktopcouch port remain stable throughout the session)... this is the one thing dmedia can't abstract away
  2. Move the dmedia abstractions to a standalone project/package so this pattern can be reused by other projects, and so the desktopcouch team has a stable, simple target as far as not breaking these abstractions
  3. Standardize where static webUI files are stored and how they are accessed through CouchDB

Reconnection hack must go

To have the reconnection hack work you must use a desktopcouch.records.database.Database, which makes it basically impossible to abstract away the desktop.

My #1 request for Oneiric is for this to change. Instead, the port must remain the same throughout the desktop session. It's fine if the port is randomly chosen at the start of the session, but it must remain the same once desktopcouch starts.

When it comes down to it, the API needs to be plain HTTP. Requiring a specialized wrapper library like desktopcouch closes off too many cool use cases (like having the same app run both on the web and on desktopcouch more or less transparently).
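
To make "plain HTTP" concrete, here is a minimal sketch that builds a CouchDB request using only the Python standard library. The `couch_request` helper and its parameters are hypothetical (they are not part of desktopcouch or dmedia), and OAuth signing for desktopcouch is deliberately left out. The point is only that the same code path can serve both environments when the base URL is the one thing that differs.

```python
import json
from urllib.request import Request


def couch_request(base_url, method, path, body=None):
    """Build (but do not send) a plain HTTP request for CouchDB.

    Hypothetical helper: base_url is whatever the environment reports,
    e.g. 'http://localhost:5984/' for a system-wide CouchDB.
    """
    url = base_url.rstrip('/') + '/' + path.lstrip('/')
    data = None
    headers = {'Accept': 'application/json'}
    if body is not None:
        # CouchDB speaks JSON, so encode the body and say so.
        data = json.dumps(body).encode('utf-8')
        headers['Content-Type'] = 'application/json'
    return Request(url, data=data, headers=headers, method=method)


# Only the base URL changes between desktopcouch and system-wide CouchDB.
req = couch_request('http://localhost:5984/', 'PUT', '/dmedia')
# urllib.request.urlopen(req) would send it to a running CouchDB.
```

If the port stays fixed for the whole session, the base URL can be looked up once and reused for every request.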

For what it's worth, desktopcouch.Database isn't close enough to the CouchDB REST API to be usable by dmedia (DC makes too many assumptions, certain functionality isn't available). In fact, even python-couchdb has been a constant headache for dmedia... again, certain functionality isn't exposed, and there is way too much magic/ambiguity behind the scenes.

This sort of wrapper is probably a perfect fit for some applications. But dmedia and Novacut are pretty demanding, and something close to the metal like microfiber would make my life much easier.

abstractcouch

The hypothetical abstractcouch package is my proposal for moving the way dmedia abstracts away the "desktop" part of desktopcouch into a standalone Python package. This way the pattern can be easily used by other apps, and it gives the desktopcouch team a simple, stable target as far as not breaking the abstraction. For some background, see lp:722035.

The idea is to have a single API call like abstractcouch.get_env() that returns information about the CouchDB environment, be it desktopcouch or system-wide CouchDB. We want this API call to be as quick as possible and to import as few modules as possible.

For example, if you were running against desktopcouch, it would return something like this:

    {
      "oauth": {
        "consumer_key": "oRTyTHKiKu",
        "consumer_secret": "bdXSzITryM",
        "token": "lyrygLlsbk",
        "token_secret": "QbqvZaiBGV"
      },
      "port": 45484,
      "url": "http://localhost:45484/"
    }

Or if you were running against the system-wide CouchDB, it would return something like this:

    {
      "port": 5984,
      "url": "http://localhost:5984/"
    }

dmedia.core.get_env() implements the above behavior (but the point is apps shouldn't have to abstract away the desktop part on their own).
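
As a rough sketch of the shape of that return value (not the actual dmedia.core.get_env() code), the hypothetical `build_env` below assembles the two dictionaries shown above. A real implementation would discover the port and OAuth tokens from desktopcouch itself; here they are simply passed in as arguments, which is an assumption made to keep the example self-contained.

```python
def build_env(port, oauth=None):
    """Return an env dict in the shape shown above.

    Hypothetical helper: `oauth` is a dict of the four OAuth values when
    running against desktopcouch, or None for system-wide CouchDB.
    """
    env = {
        'port': port,
        'url': 'http://localhost:{}/'.format(port),
    }
    if oauth is not None:
        # Copy so later changes by the caller don't leak into env.
        env['oauth'] = dict(oauth)
    return env


system_env = build_env(5984)
desktop_env = build_env(45484, oauth={
    'consumer_key': 'oRTyTHKiKu',
    'consumer_secret': 'bdXSzITryM',
    'token': 'lyrygLlsbk',
    'token_secret': 'QbqvZaiBGV',
})
```

An app can then check for the presence of the "oauth" key to decide whether requests need signing, without importing anything from desktopcouch.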

When talking to CouchDB from Python, dmedia.abstractcouch.get_server() is used to create an appropriately configured couchdb.Server based on the env it is passed.

When talking to CouchDB from JavaScript (UI running in embedded WebKit), the dmedia CouchView is used to transparently sign OAuth requests. This would be a great piece to standardize and upstream into desktopcouch.

Make web apps first class citizens

dmedia and Novacut make heavy use of an architecture like this:

webtastic.png

The idea is that a large part of the user experience is implemented as an HTML5 UI, which can run as a native app in embedded WebKit, or be delivered over the web to standard browsers. When running as a native app, we're integrating with all the Ayatana niceties, so the experience isn't exactly the same, but close. Although certainly not the only way to take advantage of desktopcouch, running in embedded WebKit has some big advantages:

  • Can build a very snappy UI by making XMLHttpRequest directly to CouchDB, dynamically updating DOM
  • A plastic tool that gives designers sweeping freedom, yet allows quick implementation
  • Can build reusable user experiences consumable both in native apps and over the web

With help from Stuart Langridge, I have the dmedia CouchView working as a nice proof of concept. Importantly, CouchView will transparently sign the OAuth requests when connecting to desktopcouch, and it works equally well when connecting to the system-wide CouchDB.

I would like to further refine CouchView and upstream it into desktopcouch. Although this isn't critical for dmedia/Novacut (we already have it), I think this would really enhance the appeal of building new apps specifically for desktopcouch.

Standard location for static webUI files

Small pain point that needs to be addressed as part of making web apps first class citizens.

Futon (the CouchDB web admin UI) is served from static files. In the httpd_global_handlers section of the config, you'll see something like this:

    _utils    {couch_httpd_misc_handlers, handle_utils_dir_req, "/usr/share/couchdb/www"}

Many desktopcouch/hosted CouchDB apps will need to deliver static HTML, CSS, and JavaScript files accessible from CouchDB, so it would be nice to have a standard location where these files are installed, say:

    /usr/share/couchdb/apps/PACKAGE/*

And a standard handler configured by default in desktopcouch (but perhaps commented out by default in system-wide CouchDB), say:

    _apps    {couch_httpd_misc_handlers, handle_utils_dir_req, "/usr/share/couchdb/apps"}

So that the couch.js file that ships with dmedia would be available at, say:

    http://localhost:39846/_apps/dmedia/couch.js

This also makes it easy for dmedia apps to utilize JavaScript etc shipped with dmedia.
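
Under the proposed layout, mapping an installed file to its URL becomes mechanical. The sketch below assumes the `_apps` handler suggested above is configured; `app_url` and its parameters are hypothetical names used only for illustration.

```python
from urllib.parse import quote


def app_url(base_url, package, filename):
    """Return the URL of a static file served by the proposed _apps handler.

    Hypothetical helper: assumes files are installed under
    /usr/share/couchdb/apps/PACKAGE/ and exposed at /_apps/PACKAGE/.
    """
    return '{}/_apps/{}/{}'.format(
        base_url.rstrip('/'),
        quote(package),
        quote(filename),  # '/' is left unescaped, so subdirectories work
    )


url = app_url('http://localhost:39846/', 'dmedia', 'couch.js')
```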

Currently when dmedia-service starts, it will save the webUI assets as attachments in the /dmedia/app doc. Although this was a quick way to get things working, it's a dirty, dirty hack. It doesn't make sense to replicate the UI around... different devices will have different versions installed, or totally different user interfaces, etc. These static files should be shipped in the Debian packages (or equivalent for the platform in question).

Per DB sync should be opt-in, not opt-out

Developing with desktopcouch, I've found it annoying that all the databases are synced to UbuntuOne by default... I frequently create test databases for testing dmedia (not just unit tests, but also using dmedia for an extended period, while not wanting to hose up my production dmedia DB). UbuntuOne sync is, of course, 100% awesome, but having it opt-in rather than opt-out would make life easier for developers. It also avoids potential unexpected privacy oopses... when I was first learning desktopcouch I was surprised everything was synced by default. In my case, nothing in the database was sensitive, but for some, that won't be the case.

So:

  • At the very least, desktopcouch should not sync databases whose names start with test_

  • Preferably, change from opt-out to opt-in, e.g., have included_names rather than excluded_names

The 2nd point needs some explanation. It doesn't make sense to sync all databases to all devices as not all devices will have the apps that use those databases. This is particularly an issue with phones and tablets, where syncing unnecessary databases will needlessly burn through precious storage, bandwidth, and CPU time.

A device can easily supply the databases it wants synced, as that's local information based on the apps installed, etc.; however, a device has no way of knowing which databases it should opt out of.
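
The difference between the two policies can be sketched in a few lines. The `databases_to_sync` function below is hypothetical and is not how desktopcouch currently selects databases; it only illustrates the proposal, with `included_names` and `excluded_names` borrowed from the wording above.

```python
def databases_to_sync(all_names, included_names=None, excluded_names=()):
    """Pick which databases to sync.

    Hypothetical policy sketch:
    - names starting with 'test_' are never synced;
    - if included_names is given, sync is opt-in (only listed names);
    - otherwise fall back to opt-out using excluded_names.
    """
    excluded = set(excluded_names)
    selected = []
    for name in all_names:
        if name.startswith('test_'):
            continue
        if included_names is not None:
            if name in included_names:
                selected.append(name)
        elif name not in excluded:
            selected.append(name)
    return selected


names = ['dmedia', 'test_dmedia', 'contacts', 'notes']
opt_out = databases_to_sync(names, excluded_names=['notes'])
opt_in = databases_to_sync(names, included_names=['dmedia'])
```

With opt-in, a phone that only has dmedia installed lists just that one database and never needs to know what else exists.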

Enforcing record_type is too restrictive

Although the record_type convention addresses an important need (standardizing document schema for use across apps), it's rather awkward and verbose. In dmedia/Novacut, the type (or record_type) needs to be referenced very often, and in many contexts (view functions, UI code, etc), so I needed something more concise.

For example, the desktopcouch spec dictates something like this:

    {
        "_id": "ZZZATIZG6IA3DJOEANQCFT3FHR4IU2FC",
        "record_type": "http://www.freedesktop.org/wiki/Specifications/dmedia/file"
    }

But dmedia uses something much less verbose:

    {
        "_id": "ZZZATIZG6IA3DJOEANQCFT3FHR4IU2FC",
        "type": "dmedia/file"
    }

This is a very typical dmedia view function:

    function(doc) {
        if (doc.type == 'dmedia/file') {
            emit(doc.mtime, null);
        }
    }

Which becomes awkward and far less readable if using the desktopcouch convention:

    function(doc) {
        if (doc.record_type == 'http://www.freedesktop.org/wiki/Specifications/dmedia/file') {
            emit(doc.mtime, null);
        }
    }
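
For what it's worth, the two conventions can be converted mechanically, at least for the example above. The sketch below is hypothetical and not part of dmedia or desktopcouch. It assumes the long form is always the short type appended to the prefix seen in the example, which may not hold for every record_type.

```python
# Prefix taken from the example record_type above; assumed, not specified.
PREFIX = 'http://www.freedesktop.org/wiki/Specifications/'


def to_record_type(short_type):
    """Expand a short type like 'dmedia/file' to the long record_type URL."""
    return PREFIX + short_type


def to_short_type(record_type):
    """Collapse a long record_type URL back to the short form."""
    if not record_type.startswith(PREFIX):
        raise ValueError('unexpected record_type: {!r}'.format(record_type))
    return record_type[len(PREFIX):]


long_form = to_record_type('dmedia/file')
short_form = to_short_type(long_form)
```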

Performance Improvements

This is slower than it should be:

    from desktopcouch.records.server import CouchDatabase
    dc = CouchDatabase('dmedia', create=True)

The above takes around 0.2 seconds on my 3.0GHz Phenom II X4. From the bit of profiling I've done, this time could be halved simply by avoiding importing the Python uuid module, which is painfully slow to import:

    45923 function calls (44515 primitive calls) in 0.215 CPU seconds

    Ordered by: cumulative time
    List reduced from 846 to 10 due to restriction <10>

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.215    0.215 ./dc-benchmark.py:8(run)
        1    0.001    0.001    0.129    0.129 /usr/lib/pymodules/python2.7/desktopcouch/records/__init__.py:22(<module>)
        1    0.001    0.001    0.125    0.125 /usr/lib/python2.7/uuid.py:45(<module>)
        2    0.000    0.000    0.123    0.061 /usr/lib/python2.7/ctypes/util.py:235(find_library)
        2    0.000    0.000    0.123    0.061 /usr/lib/python2.7/ctypes/util.py:207(_findSoname_ldconfig)
        2    0.000    0.000    0.107    0.053 /usr/lib/python2.7/re.py:139(search)
        2    0.100    0.050    0.100    0.050 {built-in method search}
        1    0.000    0.000    0.064    0.064 /usr/lib/pymodules/python2.7/desktopcouch/records/server.py:1(<module>)
        1    0.001    0.001    0.064    0.064 /usr/lib/pymodules/python2.7/desktopcouch/application/server.py:23(<module>)
       36    0.001    0.000    0.034    0.001 /usr/lib/python2.7/re.py:229(_compile)
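
One possible way to sidestep the uuid import, if all that is needed is a random document ID, is to build the ID from os.urandom() and base32. This is only a sketch: `random_id` is a hypothetical helper, the 15-byte default is an arbitrary choice, and I have not measured how its import cost compares on the setup profiled above.

```python
import os
from base64 import b32encode


def random_id(numbytes=15):
    """Return a random base32 ID without importing the uuid module.

    Hypothetical helper: 15 random bytes (120 bits) encode to exactly
    24 base32 characters, so no '=' padding appears.
    """
    return b32encode(os.urandom(numbytes)).decode('ascii')


doc_id = random_id()
```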

DesktopCouchWishList (last edited 2011-05-09 16:06:31 by 137)