Dev Week -- Writing good test-cases -- jam -- Wed, Jan 27


   1 [16:00] <dholbach> First up is the fantastic John Arbash Meinel - he'll talk about "Writing good test-cases"!
   2 [16:00] <dholbach> jam: the floor is yours!
   3 [16:01] <jam> greetings all!
   4 [16:01] <jam> I'm happy to see so many names on the channel.... hopefully you all aren't just lurking
   5 [16:01] <jam> waiting for the next speaker :)
   6 [16:01] <jam> I'm going to be pretty loose with the discussion
   7 [16:01] <jam> so feel free to ask questions
   8 [16:02] <jam> cjohnston has graciously offered to help me keep track
   9 [16:02] <cjohnston> o/
  10 [16:02] <jam> So what is a 'good' test case?
  13 [16:03] <jam> Generally speaking, a test case is meant to verify that something is happening correctly.
  14 [16:04] <jam> Or looking at it differently, when something *isn't* happening correctly, the test case should fail.
  15 [16:04] <jam> However, there are still a lot of subtle aspects.
  16 [16:04] <jam> I would say that the desirable features of a test case break down into
  17 [16:04] <jam> 1) Sensitivity (the likelihood that the test case will break when something goes wrong)
  18 [16:05] <jam> 2) Specificity (the likelihood that the test case will break even though what you are testing has not gone wrong)
  19 [16:05] <jam> 3) Performance
  20 [16:05] <jam> It is easy to focus on say 1, but to have a *good* test, all 3 are important
  21 [16:06] <jam> It is great to have a huge test suite that covers every possible aspect of your code, and all permutations
  22 [16:06] <jam> but if it takes 3 weeks to run, it is often not very useful for development
  23 [16:06] <jam> Likewise, a test with low specificity will be breaking all the time
  24 [16:06] <jam> and won't really help you isolate the problem
  25 [16:07] <jam> You can argue that 4) coverage is a property, but I would argue that it is a desirable property of the whole test suite, and not a great property of a single test
  26 [16:10] <jam> Personally, I think there are a range of 'good' tests, generally dependent on what aspect you are focusing on
  27 [16:10] <jam> I personally think that having lots of focused tests is better than having a single 'test' that is testing lots of aspects
  28 [16:11] <jam> but integration tests are still useful and needed
  30 [16:12] <jam> So how about we put together an example
  31 [16:13] <jam> I'll use python, since that is my preferred language
  32 [16:13] <jam> and it has a decent unittest suite
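A minimal sketch of the kind of focused unittest-style test being discussed (the `add` function and test names are illustrative assumptions, not from the session):

```python
import unittest

def add(a, b):
    """Toy function under test (illustrative assumption)."""
    return a + b

class TestAdd(unittest.TestCase):
    # Focused tests: sensitive to bugs in add(), and specific
    # because nothing other than add() can make them fail.
    def test_adds_two_positive_numbers(self):
        self.assertEqual(add(2, 3), 5)

    def test_adds_negative_numbers(self):
        self.assertEqual(add(-2, -3), -5)

if __name__ == "__main__":
    unittest.main()
```

Each test checks one behavior, so a failure points directly at the broken aspect rather than forcing you to untangle a large combined scenario.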
  33 [16:13] <cjohnston> QUESTION: Even after weeks of testing by the most experienced testers, very robust apps break down all of a sudden, what might be the reasons behind it?
  34 [16:14] <jam> ANSWER: I think that complexity is generally a major factor causing 'bugs' in software.
  35 [16:15] <jam> It is often considered inherent in any sufficiently developed program.
  36 [16:15] <cjohnston> < Omar871> QUESTION: according to what I learned in college, a 'good' test is that one that makes that system break/crash, to show where the problem is, how true could that be?
  37 [16:15] <jam> Generally, this means there will be some sort of permutation of objects which has not been tested
  38 [16:15] <jam> directly
  39 [16:16] <jam> A goal of development is to manage complexity (generally by defining apis, separation of concerns, etc)
  40 [16:16] <jam> good software can then have good tests that test a subset, without *having* to manage the permutation problem
  41 [16:16] <jam> (but generally, abstractions can 'leak', and ... boom)
  42 [16:17] <jam> ANSWER <omar>: I think I understand you to mean the "inverse"
  43 [16:17] <jam> which is that a good test is one that provokes a problem in the system
  44 [16:18] <jam> I think that for a regression style test, it is certainly important that the test would trigger the bug that you are trying to fix
  45 [16:18] <jam> However, once a bug is fixed, you would certainly expect it to not provoke the problem anymore
  46 [16:19] <jam> So it is certainly important that a test exposes a weakness
  47 [16:19] <jam> I guess put another way...
  48 [16:19] <jam> If I had a bug, and wrote a test case, and it *didn't* trigger the bug, that test doesn't have a lot of worth (for fixing that bug)
  49 [16:20] <jam> (Which often accidentally happens if you fix the bug before writing the test)
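A small sketch of the regression-test workflow being described: write the test first, confirm it actually triggers the bug, then fix. The `parse_port` function and its whitespace bug are hypothetical examples, not from the session:

```python
import unittest

def parse_port(value):
    """Hypothetical function with a bug history: it used to raise
    ValueError on input with leading/trailing whitespace."""
    return int(value.strip())  # the fix; .strip() was missing before

class TestParsePortRegression(unittest.TestCase):
    def test_whitespace_is_tolerated(self):
        # Written *before* the fix, this test failed with ValueError,
        # proving it really does trigger the bug it guards against.
        self.assertEqual(parse_port(" 8080 \n"), 8080)

if __name__ == "__main__":
    unittest.main()
```

Running the test against the unfixed code and watching it fail is the step that gives the test its worth.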
  50 [16:22] <jam> n3rd: We were just discussing in our group about "coding creates bugs... are we reducing the number of bugs faster than we are creating them?"
  51 [16:24] <jam> n3rd also brings up a decent point
  52 [16:24] <jam> users often find bugs that developers don't think of
  53 [16:24] <jam> often because they use software in ways that weren't anticipated
  54 [16:25] <jam> often this goes back to the permutation issue
  55 [16:25] <jam> it isn't possible to test every possible permutation
  56 [16:25] <jam> (well, rarely possible)
  57 [16:26] <cjohnston> < n3rd> jam, so the users are passive tester?
  58 [16:26] <jam> A file with just 20 bytes has 256^20 = ~10^48 permutations
  59 [16:27] <jam> Well, I would say that users are often pretty *active* testers
  60 [16:27] <jam> however, they don't make good automated test suites
  61 [16:27] <jam> I suppose that would be my:
  62 [16:27] <jam> 4) Reproducible
  63 [16:28] <jam> (the chance that running the same thing now will give you the same thing it gave you before)
  64 [16:28] <jam> It is somewhat related to specificity
  65 [16:28] <jam> As an unreproducible test has low specificity
  66 [16:28] <jam> (it breaks for reasons that you aren't trying to test)
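One common way to make a test reproducible is to inject the sources of nondeterminism, such as a seeded random number generator. A sketch (the `sample_three` function is an illustrative assumption):

```python
import random
import unittest

def sample_three(items, rng):
    """Pick three items; takes an explicit RNG so tests can seed it."""
    return rng.sample(items, 3)

class TestSampleThree(unittest.TestCase):
    def test_is_reproducible(self):
        # Seeding the injected Random makes the test deterministic:
        # it breaks only when sample_three changes, never by chance.
        first = sample_three(range(10), random.Random(42))
        second = sample_three(range(10), random.Random(42))
        self.assertEqual(first, second)

if __name__ == "__main__":
    unittest.main()
```

The same pattern applies to clocks, network responses, and filesystem state: pass them in so the test controls them.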
  67 [16:29] <jam> I guess I should also mention... if a user used every intermediate version of my program, they'd run into a lot more bugs
  68 [16:29] <jam> As a developer I fix a huge amount of things before it gets released
  69 [16:29] <jam> it is just that often the set of *remaining* bugs
  70 [16:30] <jam> are ones that I had not anticipated very well
  71 [16:34] <jam> Anyway, I think it is useful to remember what the strengths of a given style of testing are.
  72 [16:34] <jam> You can have automated tests (unit and integration tests), manual (interactive) testing, foisting off the software on your users
  73 [16:34] <jam> etc
  74 [16:34] <jam> I do think that testing at each level is useful
  75 [16:35] <jam> and trying to test things at the wrong level introduces more pain than benefit
  76 [16:35] <jam> Having an absolutely bulletproof piece of software that doesn't do what your users want isn't particularly beneficial
  77 [16:36] <jam> So having user testing in your feedback loop is certainly a good thing
  78 [16:36] <jam> However, giving them a buggy PoS is also not going to make them very happy
  79 [16:36] <jam> I'm certainly a fan of multi-tier testing, including automated testing
  80 [16:36] <jam> having a small fast test suite that is likely to expose bugs is nice on 'must pass before making it to trunk'
  81 [16:37] <jam> having a slower but more thorough "must pass before releasing to users"
  82 [16:37] <jam> and for some systems adding a "must be tested by a human interaction" can be inserted in there as well
  83 [16:38] <jam> If the first one takes much more than 5 minutes, it often causes grief when trying to get development done
  84 [16:38] <jam> but the second can take a day, and still not slow down your release cycle
  85 [16:44] <cjohnston> < Omar871> QUESTION: Could the efficiency and effectiveness of the testing process depend on the licensing type of the software we are making?
  86 [16:45] <jam> ANSWER: I don't think it would change dramatically
  87 [16:46] <jam> If the license is open, it does allow users to do introspective testing which would just not be possible otherwise
  88 [16:46] <jam> however, few users can really develop your software anyway, so it certainly shouldn't be relied upon as a source of improving correctness
  89 [16:46] <jam> even if your users are very intelligent they almost always
  90 [16:46] <jam> 1) don't know your code
  91 [16:46] <jam> 2) don't have time to be doing your work for you :)
  92 [16:47] <jam> I think Linus gets a bit of a boost, because there are lots of developers on the project, not just users
  93 [16:48] <jam> Certainly a "lot of eyeballs" requires eyeballs that can see the code
  94 [16:49] <jam> and with enough, you can have a person to specialize for any given subset, which also helps in defect localization (hence 'shallow')
  95 [16:50] <cjohnston>  < hggdh> QUESTION: are there any considerations for *usability* testing (this is one thing that users would certainly be able to perform)?
  96 [16:51] <jam> ANSWER: I think that usability testing is certainly important
  97 [16:51] <jam> (10:35:23 AM) jam: Having an absolutely bulletproof piece of software that doesn't do what your users want isn't particularly beneficial
  98 [16:51] <jam> There is an argument whether this specifically falls under the standard category of 'testing'
  99 [16:52] <jam> (market research is certainly important to developing software, but it isn't "testing" :)
 100 [16:56] <cjohnston> < strycore89> QUESTION : how is testing of graphical applications done ? (For example PyGTK apps)
 101 [16:56] <jam> IME, you can test PyGTK (and PyQt) without actually showing dialogs
 102 [16:57] <jam> both of them support updating widgets by programmatically setting values
 103 [16:57] <jam> and then triggering events
 104 [16:57] <jam> In that case, they can be tested in the same fashion as any other unit testing
 105 [16:57] <jam> however, it doesn't test the visual representation
 106 [16:57] <jam> which is a valid level to test
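The approach described — setting widget values programmatically and triggering events by hand — can be sketched without any GUI toolkit at all by keeping the handler logic separate from the widgets. Everything here (`FakeEntry`, `on_name_changed`) is an illustrative assumption, not PyGTK API:

```python
import unittest

class FakeEntry:
    """Stand-in for a toolkit text entry; only the API the logic needs."""
    def __init__(self):
        self._text = ""
    def set_text(self, text):
        self._text = text
    def get_text(self):
        return self._text

def on_name_changed(name_entry, greeting_entry):
    """Hypothetical handler: mirrors the name into a greeting field."""
    greeting_entry.set_text("Hello, %s!" % name_entry.get_text())

class TestGreetingHandler(unittest.TestCase):
    def test_updates_greeting_without_showing_a_window(self):
        name, greeting = FakeEntry(), FakeEntry()
        name.set_text("Ada")          # programmatically set the value
        on_name_changed(name, greeting)  # trigger the event by hand
        self.assertEqual(greeting.get_text(), "Hello, Ada!")

if __name__ == "__main__":
    unittest.main()
```

The handler is exercised exactly as any other unit test would be; only the visual rendering remains untested, which matches the caveat above.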
 107 [16:58] <jam> There are also gui testing suites that can be used
 108 [16:59] <jam> I forget the exact name of one (sikuli?)
 109 [16:59] <jam> Which uses screenshots
 110 [16:59] <jam> and some work for marking the regions you care about
 111 [17:01] <jam> http://groups.csail.mit.edu/uid/sikuli/
 112 [17:02] <dholbach> alrightie!
 113 [17:02] <dholbach> thanks jam for giving this great talk!
 114 [17:02]  * dholbach hugs jam

MeetingLogs/devweek1001/WriteTests (last edited 2010-01-29 10:05:54 by i59F765F3)