Validation

Dev Week -- Setting up a small validation dashboard -- ZygmuntKrynicki -- Tue, Jul 13th, 2010

(04:03:43 PM) zyga: first of all thanks for joining, I don't know how many people are with me today
(04:03:47 PM) zyga: I prepared some rough notes and a bzr branch for those of you who will find this topic interesting
(04:04:14 PM) zyga: also I'm not sure how classbot and questions work so if anyone could ask me a QUESTION in the #ubuntu-classroom-chat channel I would appreciate that
(04:04:41 PM) zyga: if not I'll just start talking...
(04:04:49 PM) zyga: == About ==
(04:04:57 PM) zyga: Validation dashboard is a tool for both application developers and system integrators.
(04:05:23 PM) zyga: At system scope (which is probably not that interesting to most people here) it
(04:05:24 PM) zyga: aids development of a distribution or a distribution spin-off. More specifically it aids in seeing how low-level changes affect the whole system.
(04:05:39 PM) zyga: At application developer scope it aids in visualizing performance across time
(04:05:40 PM) zyga: (different source revisions), operating system versions and target hardware.
(04:06:10 PM) zyga: it all sounds nice but it's worth pointing out that the dashboard is still under development and very little exists today
(04:06:43 PM) zyga: still it's on schedule to be usable and useful for maverick
(04:07:18 PM) zyga: I have a branch with some code that is worth using today, I will talk about it later during this session
(04:07:38 PM) zyga: == How it works ==
(04:07:44 PM) zyga: Validation dashboard is based on tapping into _existing_ tests and benchmarks
(04:07:44 PM) zyga: that provide data interesting to the target audience (you, developers). Most
(04:07:45 PM) zyga: projects have some sort of tests and already use them for CI (continuous
(04:07:45 PM) zyga: integration), some have test suites but no CI system, as these are difficult to
(04:07:45 PM) zyga: set up and require effort to maintain.
(04:08:16 PM) zyga: Dashboard takes CI a step further. First by allowing you to extend a CI system
(04:08:16 PM) zyga: into a user-driven test system. When users can submit test results you get much
(04:08:17 PM) zyga: more data from a wide variety of hardware. Second you can test your unchanging
(04:08:17 PM) zyga: software (a stable release branch) against a changing environment. This will
(04:08:17 PM) zyga: allow you to catch regressions caused by third party software updates that you
(04:08:17 PM) zyga: depend on or that affect your runtime behaviour by being active in the system.
(04:08:42 PM) zyga: Finally the biggest part of launch control is the user interface. While I'm
(04:08:42 PM) zyga: just giving promises here, the biggest effort will go into making the data easy
(04:08:42 PM) zyga: to understand and easy to work with. Depending on your project you will have
(04:08:42 PM) zyga: different requirements.
(04:09:31 PM) zyga: The dashboard will allow you to show several kinds of pre-defined visualizations, depending on the kind of data your tests generate
(04:10:10 PM) ClassBot: porthose asked: is it working
(04:10:23 PM) zyga: it works, thanks
(04:11:02 PM) zyga: and will also allow you to make custom queries (a variation of the pre-defined ones, really) that will show some specific aspect of the data such as comparing one software version to another or comparing results from different hardware, etc
(04:12:00 PM) zyga: so that's the good part, next I'll talk about how the dashboard operates internally and what is required to set one up (once it's ready to be used)
(04:12:35 PM) zyga: so the bad part is you need to put in some effort to use the dashboard, an initial investment of sorts
(04:12:54 PM) zyga: you have to do some work to translate your test results into a format the dashboard will understand
(04:13:29 PM) zyga: The dashboard understands a custom data format that encapsulates software profile, hardware profile and test results (both tests and benchmarks). You get half of the information for free but you have to invest in writing a translator or other glue code from whatever your test suite generates into a dashboard-friendly format.
(04:13:59 PM) zyga: A Python library already exists to support this. Anyone can get it from my bzr branch by executing this command: bzr get lp:~zkrynicki/launch-control/udw launch-control.udw
(04:14:56 PM) zyga: you can get the branch now, I'll use it for one example later on
(04:16:10 PM) zyga: so a little back story now: the dashboard is a project created for the arm world. arm hardware is really cool because it is so diverse, scaling from tiny low-power microcontrollers all the way up to multicore systems with lots of performance
(04:17:06 PM) zyga: one thing that is not so good about such diversity is validating your software stack on a new hardware configuration; you really need some tools to make it efficient and worth your effort in the first place
(04:17:41 PM) zyga: so a couple of people in the linaro group are working on a set of tools that will make it easier, dashboard (aka launch-control) is one of them
(04:17:52 PM) zyga: so enough with the back story
(04:18:31 PM) ClassBot: ktenney asked: url to look at while waiting?
(04:20:12 PM) zyga: ktenney, I made a presentation about the initial assumptions of what the dashboard is about, I'm not sure if that is what you asked for. The presentation is here: http://ubuntuone.com/p/6fE/
(04:21:06 PM) zyga: okay so at the really low level the dashboard is about putting lots and lots of samples (measurements of something) into context
(04:21:23 PM) zyga: you can think of samples as simple test results
(04:21:59 PM) zyga: samples come in two forms: one for plain tests and another for benchmark-like tests
(04:22:33 PM) zyga: most of the work you have to do to start using this is to translate your test results into this sample thing, fortunately it's quite easy
(04:23:14 PM) zyga: if you check out the branch I posted earlier (bzr get lp:~zkrynicki/launch-control/udw launch-control.udw)
(04:23:46 PM) zyga: if you look around you'll see the examples directory
(04:24:05 PM) zyga: inside I wrote a simple script that takes the default output of python's unittest module
(04:24:18 PM) zyga: and converts that into dashboard samples
(04:24:50 PM) zyga: and packages all the samples into something I call a bundle that you will be able to upload to a dashboard instance later on
(04:24:55 PM) zyga: so let's have a look at that code now
(04:25:20 PM) zyga: note: this is developed on maverick so if you have lucid and hit a bug, just let me know and I'll try to help you out
(04:25:52 PM) zyga: I'm sorry for saying this but I'm on holiday and I'm away from my workstation where I have a much better infrastructure
(04:26:32 PM) zyga: the client side code will run on a wide variety of linux distributions and will require little more than python2.5
(04:27:00 PM) zyga: this branch might fail but it's just a snapshot of work in progress developed on maverick
(04:27:02 PM) zyga: okay
(04:27:19 PM) zyga: so first thing is to get some test output
(04:27:59 PM) zyga: if you just run the test case (test.py) it will hopefully pass and print a summary
(04:28:50 PM) zyga: if you run it with -v (./test.py -v) it will produce a much more verbose format
(04:29:16 PM) zyga: that's the format we'll be using, redirect it to a file and store it somewhere
(04:29:50 PM) zyga: (by default unittest prints on sys.stderr so to capture that using a bash-like shell you must redirect the second file descriptor: ./test.py -v 2>test-result.txt )
(04:30:16 PM) ClassBot: penguin42 asked: What's the flow? Is the idea the dashboard runs somewhere central (like launchpad) or that each developer might have his own copy for his own test runs?
(04:30:23 PM) zyga: penguin42, great question thanks
(04:30:37 PM) zyga: penguin42, so the flow is kind of special
(04:31:06 PM) zyga: penguin42, we have decided NOT to run the centralized dashboard instance ourselves as it would defeat the linaro-specific requirements
(04:31:19 PM) zyga: penguin42, so to cut to the chase, you host your own dashboard,
(04:31:26 PM) zyga: it's going to be trivial to set one up
(04:31:28 PM) zyga: on a workstation
(04:31:32 PM) zyga: or a virtual machine
(04:31:35 PM) zyga: or some server you have
(04:32:20 PM) zyga: we'll make the deployment story as easy and good as possible as we expect (we == linaro) to get this inside corporations that develop software for the arm world and we want them to have a good experience
(04:33:13 PM) zyga: that said it's still possible that in the future launchpad or other supporting service will grow a dashboard derived system, there is a lot of interest for having some sort of tool like this for regular ubuntu QA tasks
(04:33:21 PM) zyga: but the answer is: currently you run your own
(04:33:38 PM) ClassBot: penguin42 asked: Is it possible to aggregate them - i.e. if there are a bunch of guys each doing this, or if a bunch of organisations are each doing it?
(04:34:04 PM) zyga: sorry for losing context, could you specify what to aggregate
(04:35:47 PM) zyga: currently I see this being used (during the M+1 cycle) by linaro and some early adopters that will want to evaluate it for inclusion into their tool set, so I expect project-centric deployments
(04:36:08 PM) ClassBot: penguin42 asked: n developers each working on an overlapping set of packages, each running their set of tests; is there a way multiple dashboards can aggregate test results to form an overview of all of their tests?
(04:36:46 PM) zyga: penguin42, yes multiple developers can use a single instance to host unrelated projects and share some data (possibly)
(04:37:39 PM) zyga: penguin42, so to extend on your example, you can have a couple of developers working on some packages in some distribution (one for simplicity but this is not required)
(04:38:15 PM) zyga: penguin42, and while each developer sets up something that will upload test results (daily tests are our primary target)
(04:38:34 PM) zyga: penguin42, you can go to the dashboard and see a project wide overview of how your system is doing
(04:38:34 PM) zyga: penguin42, if there are any performance regressions
(04:38:47 PM) zyga: penguin42, new test failures
(04:39:00 PM) zyga: penguin42, overall test failures grouped by test collection (my term for "bunch of tests")
(04:39:24 PM) zyga: penguin42, our current targets are big existing test projects such as LTP or phoronix
(04:39:32 PM) zyga: they have lots of tests that look at the whole distribution
(04:40:23 PM) zyga: so many people can upload results of running those tests on their software/hardware combination
(04:40:23 PM) zyga: and you can look at that on one single page
(04:41:16 PM) zyga: on the opposite spectrum you can have multiple projects (such as "my app 1" and "my app 2")
(04:41:16 PM) zyga: that for some reason share a dashboard instance
(04:41:16 PM) zyga: and have totally unrelated data inside the system
(04:42:19 PM) ClassBot: tech2077 asked: Will this be available stable before maverick? I heard it would be stable at the time, but what is the stable release time frame?
(04:42:31 PM) zyga: sorry I'm on GSM internet here and I have some lags
(04:42:40 PM) zyga: we have to speed up a little
(04:42:54 PM) zyga: tech2077, it will be available by the time maverick ships in a PPA
(04:43:09 PM) zyga: tech2077, our target is inclusion in N
(04:43:22 PM) zyga: I'll get back to the session now
(04:43:33 PM) zyga: so we ran the test suite I have written for the client side code
(04:43:48 PM) zyga: and placed the results in a test-result.txt file
(04:44:02 PM) zyga: the results themselves are a simple line-oriented (more or less) format
(04:44:18 PM) zyga: the interesting bits are lines that end with " ... ok" and " ... FAIL"
(04:44:35 PM) zyga: parsing that should be easy
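
A minimal sketch of that parsing step, using only the standard library; the real examples/parse-pyunit script in the branch is more complete, and the exact regex below is an assumption about unittest's verbose output:

    # Sketch only: pull (test, passed) pairs out of a unittest -v capture.
    # unittest -v prints lines like "test_foo (mymodule.MyTest) ... ok"
    import re

    RESULT_LINE = re.compile(r'^(?P<test>.+?) \.\.\. (?P<status>ok|FAIL)$')

    def parse_results(path):
        for line in open(path):
            match = RESULT_LINE.match(line.rstrip())
            if match:
                yield match.group('test'), match.group('status') == 'ok'

    for test, passed in parse_results('test-result.txt'):
        print test, 'ok' if passed else 'FAIL'
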
(04:45:15 PM) zyga: If you run ./examples/parse-pyunit -i test-result.txt -o test-result.json
(04:45:25 PM) zyga: you'll get a test-result.json file, go ahead and inspect it
(04:46:07 PM) zyga: there is some support structure but the majority of the data is inside the "samples" collection
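
To poke at that file yourself, something like this works; note that "samples" is the only key name confirmed here, so explore the rest of the layout by reading the file:

    # Load the generated bundle and list the samples inside it.
    import json  # python2.6+; use simplejson on python2.5

    bundle = json.load(open('test-result.json'))
    for sample in bundle['samples']:
        print sample
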
(04:46:25 PM) zyga: so this is the easiest way of translating test results
(04:46:39 PM) zyga: tests have no individual identity
(04:46:45 PM) zyga: and all you get is a simple pass/fail status
(04:47:00 PM) zyga: everything else is just optional data, like the message we harvested in this case
(04:47:29 PM) zyga: this is very weak as we cannot, for example, see a history of a particular test case
(04:47:32 PM) zyga: but it was very easy to set up
(04:48:08 PM) zyga: the next thing we'll do is turn on a feature I commented out
(04:48:40 PM) zyga: in examples/parse-pyunit find the line that says bundle.inspect_system() and remove the comment # sign
(04:49:11 PM) zyga: if you run the parser again you'll get lots of extra information
(04:49:40 PM) zyga: this is the actual data you'd submit to a dashboard instance
(04:49:40 PM) zyga: your test results (samples)
(04:49:51 PM) zyga: software profile (mostly all the software the user had installed)
(04:50:12 PM) zyga: hardware profile (basic hardware information, cpu, memory and some miscellaneous bits like usb)
(04:50:53 PM) zyga: the profiles will make it possible to construct specialized queries and to filter data inside the system
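
As a rough illustration of the kind of data those profiles hold (this stdlib-only sketch is not the branch's actual inspect_system() implementation, and it assumes a dpkg-based system):

    # Illustration only: the shape of a hardware/software profile.
    import platform
    import subprocess

    hardware = {
        'machine': platform.machine(),      # e.g. x86_64 or armv7l
        'processor': platform.processor(),
    }
    # list every installed package (dpkg-based systems only)
    dpkg = subprocess.Popen(
        ['dpkg-query', '-W', '-f', '${Package} ${Version}\n'],
        stdout=subprocess.PIPE)
    software = {'packages': dpkg.communicate()[0].splitlines()}

    print hardware
    print len(software['packages']), 'packages'
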
(04:51:00 PM) zyga: okay so I have 10 minutes
(04:51:17 PM) zyga: I'd like to talk a tiny bit about samples again to let you know what is supported during this cycle
(04:51:22 PM) zyga: and spend the rest on questions
(04:51:35 PM) zyga: so there are qualitative samples, i.e. pass/fail tests
(04:51:47 PM) zyga: they have test_result (mainly pass and several types of fail)
(04:51:54 PM) zyga: and test_id - the identity
(04:52:47 PM) zyga: if you start tracking test identity you need to make sure your tests have a unique identity that will not change as you develop your software
(04:52:57 PM) zyga: the primary use case for this is specialized test cases and benchmarks
(04:53:11 PM) zyga: a test that checks if the system boots is pretty important
(04:53:42 PM) zyga: a benchmark that measures rendering performance needs identity to compare one run to all the previous runs you already stored in the system
(04:53:58 PM) zyga: identity is anything you like but it's advised to keep it to a reverse domain name scheme
(04:54:22 PM) zyga: the only thing the system enforces is a limited (domain-name-like) set of available characters
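
One plausible reading of that "domain name like" restriction (the branch's actual validation rule may be stricter or looser):

    # Guess at the enforced character set for test identities.
    import re

    TEST_ID = re.compile(r'^[a-zA-Z0-9_-]+(\.[a-zA-Z0-9_-]+)*$')
    assert TEST_ID.match('org.example.tests.boot')
    assert not TEST_ID.match('has spaces.and.other junk')
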
(04:54:51 PM) zyga: if you look at the pydoc documentation for launch_control.sample.QualitativeSample you can learn about additional properties
(04:55:08 PM) zyga: the next important thing is QuantitativeSample - this is for all kinds of benchmarks
(04:55:30 PM) zyga: and differs by having a measurement property that you can use to store a number
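
Putting the two sample classes together might look like this; the constructor arguments are a guess, so double-check them against the pydoc mentioned a moment ago:

    # Hypothetical usage; verify the real signatures with
    # pydoc launch_control.sample (from the branch on PYTHONPATH).
    from launch_control.sample import QualitativeSample, QuantitativeSample

    # plain pass/fail result with a reverse-domain-name identity
    boot = QualitativeSample('pass', 'org.example.tests.boot')

    # benchmark result: same identity idea plus a numeric measurement
    render = QuantitativeSample('pass', 'org.example.benchmarks.render')
    render.measurement = 42.5  # say, frames per second
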
(04:56:18 PM) zyga: if you have benchmarks or want to experiment with adapting your test results so that they can be pushed to the dashboard please contact me, I'd love to hear your comments
(04:56:55 PM) ClassBot: tech2077 asked: is python-coverage available for lucid? It seems to depend on it
(04:57:30 PM) zyga: tech2077, python-coverage is not strictly required, it's just for test coverage of the library
(04:58:03 PM) ClassBot: dupondje asked: when running it I get ImportError: No module named launch_control.json_utils, what's the right package I need to install?
(04:58:39 PM) zyga: dupondje, none, just make sure to run this from the top-level package directory, or set PYTHONPATH accordingly
(04:59:00 PM) zyga: in general if you want to make sure you have all the dependencies see debian/control
(04:59:23 PM) ClassBot: penguin42 asked: Has it got any relation to autotest (autotest.kernel.org)
(04:59:30 PM) zyga: penguin42, no
(05:00:02 PM) zyga: penguin42, actual test frameworks are not really related to the dashboard
(05:00:28 PM) zyga: penguin42, dashboard is just for visualizing the data and for having a common upload "document" (here a .json file)
(05:00:39 PM) zyga: time's up
(05:00:59 PM) dupondje: you have some visual example ?
(05:01:37 PM) zyga: dupondje, just mockups, I'm working on the visual parts really
(05:01:48 PM) zyga: dupondje, (actually I will next week, I'm still on holiday)
(05:02:05 PM) dupondje: hehe ok :)
(05:02:23 PM) ***zyga keeps getting bitten by mosquitoes just to get internet here
(05:02:45 PM) zyga: dupondje, what I can tell you today is that I'll probably use open flash charts for the on-screen rendering
(05:02:52 PM) penguin42: a whole new meaning to bugs with your net connection
(05:03:09 PM) zyga: hehe
(05:03:43 PM) zyga: the hard part with visualizing is to make it really easy to make custom graphs that you want to show (as in asking for the right data)
(05:04:22 PM) zyga: some of this is really skewed to linaro and arm world but much I hope will apply to general PCs and upstreams that develop software
(05:04:54 PM) zyga: if upstreams start producing more tests and start to look at feedback from various runtime environments (during their daily development process) then the dashboard project will succeed
(05:05:10 PM) zyga: that's all from me, if you want please contact me at zygmunt.krynicki@linaro.org
