Ubuntu Open Week - Writing Secure Code - KeesCook - Tue, Nov 3, 2009

  • utc

[21:00] <kees> Hello!
[21:01] <kees> so, if I understand correctly, discussion and questions are in #ubuntu-classroom-chat
[21:01] <kees> I'll be watching in there for stuff marked with QUESTION:  so feel free to ask away.  :)
[21:01] <kees> this session is a relatively quick overview on ways to try to keep software more secure when you're writing it.
[21:01] <kees> I kind of think of it as a "best-practices" review.
[21:02] <kees> given that there is a lot of material in this area, I try to tailor my topics to languages people are familiar with.
[21:02] <kees> as a kind of "show of hands", out of HTML, JavaScript, C, C++, Perl, Python, Ruby, SQL, what are people familiar with?  (just shout out on the -chat channel)
[21:03] <kees> 21:02 < openweek1_> QUESTION: will this presentation be language specific?
[21:03] <kees> some of it will be, but I'm trying to tailor the examples to stuff people are familiar with
[21:03] <kees> there are some "general" best-practices that apply to any language, which I'll cover first
[21:04] <kees> 21:03 < openweek5__> NONE - where do I start?
[21:04] <kees> I find that programming is easiest to learn when you find something specific you want to change.  so in that case, learn the language of the project you want to change/improve.  :)
[21:05] <kees> okay, cool, looks like a pretty wide variety.  :)
[21:06] <kees> I'm adapting this overview from some slides I used to give a talk at Oregon State University.
[21:06] <kees> you can find that here:
[21:06] <kees> the main thing about secure coding is to take an "offensive" attitude when testing your software.
[21:06] <kees> if you think to yourself "the user would never type _that_", then you probably want to rethink it.  :)
[21:07] <kees> I have two opposing quotes: "given enough eyeballs all bugs are shallow" - Eric Raymond, and "most people ... don't explicitly look for security bugs" - John Viega
[21:07] <kees> I think both are true -- if enough people start thinking about how their code could be abused by some bad-guy, we'll be better able to stop them.
[21:07] <kees> so, when I say "security", what do I mean?
[21:08] <kees> I mean a bug with how the program functions that allows another person to change the behavior against the desire of the main user
[21:08] <kees> if someone can read all my cookies out of firefox, that's bad.
[21:08] <kees> if someone can become root on my server, that's bad, etc.
[21:08] <kees> so, I tend to limit this overview to stuff like gaining access, reading or writing someone else's data, causing outages, etc.
[21:09] <kees> there are plenty of other security topics (permissions, role separation, MAC, etc etc).  but I'm trying to show how to avoid security bugs.
[21:10] <kees> I'll start with programming for the web.
[21:10] <kees> when handling input in CGIs, etc, it needs to be carefully handled.
[21:10] <kees> the first example of mis-handling input is "Cross Site Scripting" ("XSS").
[21:10] <kees> if someone puts <b>hi</b> in some form data, and the application returns exactly that, then the bad-guy can send arbitrary HTML
[21:10] <kees> output needs to be filtered for HTML entities.
[21:10] <kees> luckily, a lot of frameworks exist for doing the right thing: Catalyst (Perl), Smarty (PHP), Django (Python), Rails (Ruby).
[21:11] <kees> e.g.  putting in  <blink>blah</blink>  as the password, and after submit, it's blinking
[21:12] <kees> if you use  you'll see it gets escaped
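As a rough illustration of the entity filtering described above, here is a minimal Python sketch using the standard library's html.escape. The render_greeting helper and its markup are made up for this example; a real application would normally lean on its framework's templating instead.

```python
import html


def render_greeting(name: str) -> str:
    """Return an HTML fragment with user-supplied text escaped (hypothetical helper)."""
    # html.escape turns <, > and & (and quotes, by default) into entities,
    # so the browser displays the text instead of interpreting it as markup.
    return "<p>Hello, " + html.escape(name) + "</p>"


print(render_greeting("<blink>blah</blink>"))
```

With the escaping in place the blink tag arrives at the browser as plain text, so nothing blinks.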
[21:12] <kees> another issue is Cross Site Request Forgery (CSRF).
[21:12] <kees> the issue here is that HTTP was designed so that "GET" (urls) would be for reading data, and "POST" (forms) would be used for changing data.
[21:12] <kees> if back-end data changes as a result of a "GET", you may have a CSRF.
[21:12] <kees> I have a demo of this here:
[21:12] <kees> lets users add "favorite" movies to their lists.
[21:12] <kees> but it operates via a URL
[21:12] <kees> so, if I put that URL on my website, and you're logged into imdb, I can make changes to your imdb account.
[21:13] <kees> as a result, it should use POST forms, not GET or direct URLs
[21:13] <kees> (or use form "nonces", though I won't go into that for the moment)
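The form "nonce" idea mentioned in passing above can take several shapes. One common one, sketched here in Python as an assumption rather than anything shown in the talk, ties a hidden form field to the user's session with an HMAC. SERVER_KEY, make_form_token and is_valid_token are hypothetical names.

```python
import hashlib
import hmac
import secrets

# Hypothetical server-side secret; a real application would load it from configuration.
SERVER_KEY = secrets.token_bytes(32)


def make_form_token(session_id: str) -> str:
    """Derive a per-session token to embed in a hidden form field."""
    return hmac.new(SERVER_KEY, session_id.encode("utf-8"), hashlib.sha256).hexdigest()


def is_valid_token(session_id: str, submitted: str) -> bool:
    """Accept a state-changing POST only if the submitted token matches the session."""
    expected = make_form_token(session_id)
    # compare_digest avoids leaking, through timing, how much of the token matched.
    return hmac.compare_digest(expected.encode("utf-8"), submitted.encode("utf-8"))
```

A page on another site can still make the browser send a request, but it cannot read the token out of the real form, so the forged request fails the check.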
[21:14] <kees> another form of input validation is SQL.
[21:14] <kees> if SQL queries aren't escaped, you can end up in odd situations
[21:14] <kees> for example:    SELECT secret FROM users WHERE password = '$password'
[21:14] <kees> with that SQL, what happens if the supplied password is    ' OR 1=1 --
[21:14] <kees> SELECT secret FROM users WHERE password = '' OR 1=1 --'
[21:15] <kees> it'll be true and will allow logging in.
[21:15] <kees> my rule of thumb is to _always_ use the SQL bindings that exist for your language, and to never attempt to manually escape strings.
[21:16] <kees> every language will have a good way to pass variables into SQL
[21:16] <kees> for perl
[21:16] <kees> my $query = $self->{'dbh'}->prepare(
[21:16] <kees>     "SELECT secret FROM users
[21:16] <kees>      WHERE password = ?");
[21:16] <kees> $query->execute($password);
[21:16] <kees> this lets the SQL library you're using do the escaping.  it's easier to maintain, and it's much safer in the long-run.
[21:16] <kees> some examples of SQL injection issues are here too:
[21:17] <kees> try that with the password as   ' OR 1=1 --    and you'll see the "secret"  :)
[21:17] <kees> 21:17 < playya_> QUESTION: what about prepared statements?
[21:18] <kees> that's basically another way to say "bound variables", IIUC.  in the above perl example, $query is a prepared query, and gets executed with the $password variable
[21:18] <kees> static SQL and program variables should never mix -- prepared statements/bound variables should always be used.
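The same contrast can be tried out in Python with the standard sqlite3 module. This is only a sketch: the in-memory table, its single row and the two lookup functions are invented for illustration and mirror the example query above.

```python
import sqlite3

# In-memory database with a made-up schema mirroring the example query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (password TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", ("hunter2", "the secret"))


def lookup_unsafe(password):
    # BAD: the input is pasted into the SQL text, so a quote in it rewrites the query.
    query = "SELECT secret FROM users WHERE password = '%s'" % password
    return conn.execute(query).fetchall()


def lookup_bound(password):
    # GOOD: the ? placeholder keeps the input as data; the driver handles quoting.
    return conn.execute(
        "SELECT secret FROM users WHERE password = ?", (password,)
    ).fetchall()


attack = "' OR 1=1 --"
print(lookup_unsafe(attack))  # the row comes back without knowing the password
print(lookup_bound(attack))   # no rows match
```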
[21:19] <kees> another thing about web coding is to think about where files live
[21:19] <kees> yet another way around the sql-bad.cgi example is to just download the SQLite database it's using.
[21:20] <kees> so, either keeping files out of the documentroot, or protecting them:
[21:20] <kees> moving from web to more language agnostic stuff ...
[21:20] <kees> when you need to use "system()", go find a better method.
[21:20] <kees> if you're constructing a system()-like call with a string, you'll run into problems.  you always want to implement this with an array.
[21:20] <kees> python's for example.
[21:20] <kees> this stops the program from being run in a shell (where arguments may be processed or split up)
[21:20] <kees> for example,
[21:20] <kees> no good: system("ls -la $ARGV[0]");
[21:21] <kees> better: system("ls","-la",$ARGV[0]);
[21:21] <kees> best: system("ls","-la","--",$ARGV[0]);
[21:21] <kees> in array context, the arguments are passed directly.  in string context, the first argument may be processed in other ways by the shell.
[21:22] <kees> if the first argument is  ;cat /etc/passwd     then the first runs   ls -la ;cat /etc/passwd   and the shell splits it into two commands
[21:23] <kees> in the latter, "ls" gets an argument of ";cat /etc/passwd" which isn't a valid file name, and it correctly screams about it
[21:23] <kees> and "--" is used to indicate to ls that option arguments have finished to stop $ARGV[0] from leading with a "-" and having ls blow up again.
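For comparison, a Python version of the "best" form might look like the sketch below. It uses subprocess.run with an argument list; list_path is a hypothetical helper name, and the example assumes an ls binary is on the PATH.

```python
import subprocess


def list_path(path: str) -> subprocess.CompletedProcess:
    """Run `ls -la -- path` without involving a shell (hypothetical helper)."""
    # The list is handed to the program as-is; no shell parses it, so ";" or
    # spaces in `path` stay part of a single file name argument.
    # "--" tells ls that no more options follow, even if path starts with "-".
    return subprocess.run(
        ["ls", "-la", "--", path],
        capture_output=True,
        text=True,
    )


result = list_path(";cat /etc/passwd")
print(result.returncode)  # non-zero: ls found no file with that odd name
print(result.stderr.strip())
```

Passing a single string together with shell=True would bring back the same problem as the "no good" line above.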
[21:24] <kees> handling temporary files is another area to be careful with.
[21:24] <kees> static files or files based on process id, etc, shouldn't be used since they are easily guessed.
[21:24] <kees> all languages have some kind of reasonably safe temp-file-creation method.
[21:24] <kees> File::Temp in perl, tempfile in python, "mktemp" in shell, etc.
[21:24] <kees> i.e. bad:  TEMPFILE="/tmp/kees.$$"
[21:24] <kees> good: TEMPFILE=$(mktemp -t kees-XXXXXX)
[21:24] <kees> examples of this as well as a pid-racer are in
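A Python equivalent of the mktemp line, as a minimal sketch: tempfile.mkstemp is the standard library call, while write_scratch_file and the "kees-" prefix are just illustrative choices.

```python
import os
import tempfile


def write_scratch_file(data: str) -> str:
    """Write data to a freshly created temp file and return its path (hypothetical helper)."""
    # mkstemp picks an unpredictable name and creates the file exclusively,
    # readable and writable only by the current user, so nobody can guess or
    # pre-create it the way they could with /tmp/kees.$$.
    fd, path = tempfile.mkstemp(prefix="kees-")
    with os.fdopen(fd, "w") as handle:
        handle.write(data)
    return path


path = write_scratch_file("scratch data\n")
print(path)
os.remove(path)  # the caller is responsible for cleaning up
```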
[21:25] <kees> programmers should think about the sensitivity of what they have in memory.  normally, it's not a big deal, but what about passwords?
[21:25] <kees> keep data that is normally encrypted out of memory.
[21:25] <kees> so things like passwords should be erased from memory (rather than just freed) once they're done being used
[21:25] <kees> example of this is
[21:25] <kees> once the password is done being used:
[21:25] <kees>     fclose(stdin);               // drop system buffers
[21:25] <kees>     memset(password,0,PASS_LEN); // clear out password storage memory
[21:25] <kees> then you don't have to worry about leaving it in core-dump files, etc
[21:26] <kees> if it's not cleared, it could hang around in memory.
[21:26] <kees> 21:26 < erUSUL> QUESTION: for passwords use mlocked mem ?
[21:27] <kees> this is an even better approach, yes.  this keeps the memory from ever being written to disk, in the case of swapping.
[21:27] <kees> note, however, that if you hibernate to an unencrypted partition, mlock() won't save you.  :(
[21:27] <kees> for details on mlock, see "man mlock".  (also note that there is only so much room for mlock memory)
[21:28] <kees> for encrypted communications, code using SSL should actually check certificates.
[21:28] <kees> clients should use a Certificate Authority list (apt-get install ca-certificates, and use /etc/ssl/certs)
[21:28] <kees> servers should get a certificate from a certificate authority.
[21:28] <kees> the various SSL bindings will let you define a "check cert" option, which is, unfortunately, not on by default.  :(
[21:28] <kees> this is very language-specific, though, so it requires some level of research.  but SSL without cert checking isn't very protective.
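What "check cert" looks like depends heavily on the binding. As one example, and assuming a current Python 3 (which postdates this session), the standard ssl module can be asked for a client context that verifies both the certificate chain and the host name. make_client_context and open_verified are hypothetical helper names.

```python
import socket
import ssl


def make_client_context() -> ssl.SSLContext:
    """Build a client-side context that checks certificates against the system CA list."""
    # create_default_context() loads the system CA bundle and, for client use,
    # enables certificate verification and host name checking.
    return ssl.create_default_context()


def open_verified(host: str, port: int = 443) -> ssl.SSLSocket:
    """Connect to host:port, failing if the certificate does not verify."""
    context = make_client_context()
    sock = socket.create_connection((host, port), timeout=10)
    try:
        # server_hostname is the name the certificate is checked against.
        return context.wrap_socket(sock, server_hostname=host)
    except Exception:
        sock.close()
        raise
```

Setting verify_mode to ssl.CERT_NONE on such a context would switch the check off again, which is the situation warned about above.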
[21:29] <kees> one item I mentioned early on as a security issue is blocking access to a service, usually through a denial of service.
[21:29] <kees> one accidental way to make a server program vulnerable to this is to use "assert()" or "abort()" in the code.
[21:29] <kees> normally, using asserts is a great habit to catch errors in client software.
[21:29] <kees> unfortunately, if an assert can be reached while you're processing network traffic, it'll take out the entire service.
[21:29] <kees> those kinds of programs should abort only if absolutely unable to continue (and should gracefully handle unexpected situations)
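The difference between asserting and handling gracefully can be sketched in a few lines of Python. The "LEN <n>" header format and both function names are invented for this example; the point is only that one malformed message is rejected while the loop keeps serving the rest.

```python
def parse_length_header(line: bytes) -> int:
    """Parse a made-up 'LEN <n>' header, raising ValueError on bad input."""
    parts = line.split()
    if len(parts) != 2 or parts[0] != b"LEN" or not parts[1].isdigit():
        raise ValueError("malformed header")
    return int(parts[1])


def handle_messages(lines):
    """Process each message; a bad one is rejected without stopping the loop."""
    results = []
    for line in lines:
        try:
            results.append(parse_length_header(line))
        except ValueError:
            # An assert here would take down the whole service. Instead,
            # drop just this message (or connection) and carry on.
            results.append(None)
    return results


print(handle_messages([b"LEN 5", b"garbage", b"LEN 12"]))
```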
[21:30] <kees> switching over to C/C++ specific issues for a bit...
[21:30] <kees> one of C's weaknesses is its handling of arrays (and therefore strings).  since it doesn't have built-in boundary checking, it's up to the programmer to do it right.
[21:30] <kees> as a result, lengths of buffers should always be used when performing buffer operations.
[21:30] <kees> functions like strcpy, sprintf, gets, strcat should not be used, because they don't know how big a buffer might be
[21:30] <kees> using strncpy, snprintf, fgets, etc is much safer.
[21:30] <kees> though be careful you're measuring the right buffer.  :)
[21:30] <kees> char buf[80];
[21:30] <kees> strncpy(buf,argv[1],strlen(argv[1]))    is no good
[21:30] <kees> you need to use buf's len, not the source string.
[21:30] <kees> it's not "how much do I want to copy" but rather "how much space can I use?"
[21:31] <kees> since Ubuntu 8.10 (Intrepid), the compiler will attempt to fix up unsafe function usage with safe ones, if it can.
[21:31] <kees> 21:30 < playya_> QUESTION: isn't it better to lose a service for some time because of assert instead of having an owned machine?
[21:32] <kees> certainly, but the assert catches an issue -- the code should just say "oh dear" and drop the connection, etc.  i.e. graceful error handling instead of just shutting down.
[21:32] <kees> if there is genuinely no way to recover, then it makes sense to shut down.  but those situations are rare.
[21:32] <kees> another tiny glitch is with format strings.  printf(buffer);  should be done with  printf("%s", buffer);  otherwise, whatever is in buffer would be processed for format strings
[21:32] <kees> instead of "hello %x"  you'd get  "hello 258347dad"
[21:33] <kees> I actually have a user on my system named %x%x%n%n just so I can catch format string issues in Gnome more easily.  :)
[21:33] <kees> format strings get passed from high-level functions, though, so it's not always obvious.
[21:34] <kees> gtk's dialogs, for example, will call sprintf on passed buffers, so those dialogs should use  "%s", msg);  instead of  msg);
[21:34] <kees> since Ubuntu 8.10 (Intrepid), the compiler will yell about possible format string issues, so those warnings are good to pay attention to.
[21:35] <kees> (for more details on the security features of the C compiler, see )
[21:35] <kees> 21:34 < kennethvenken> QUESTION: are there tools that check for these kind of pitfalls?
[21:35] <kees> the compiler is in the best position to do it, so that's where I've spent the most time looking
[21:36] <kees> the last bit to go over for C in this overview is calculating memory usage.
[21:36] <kees> (this actually applies to C++ too)
[21:36] <kees> if you're about to allocate memory for something, where did the size come from?
[21:36] <kees> malloc(x * y)  could wrap around an "int" value and result in less than x * y being allocated.
[21:36] <kees> this one is less obvious, but the example is here:
[21:36] <kees> malloc(5 * 15) will be safe, but what about malloc (1294967000 * 10)
[21:37] <kees> checking for this is simple, but many people aren't in a habit of testing for it:
[21:37] <kees>     x=atoi(argv[1]);
[21:37] <kees>     y=atoi(argv[2]);
[21:37] <kees>     if (x >= (INT_MAX / y - 1) ) {  ....
[21:37] <kees> that "if" will be true when  x * y  would overflow and wrap back around.
[21:38] <kees> C++ has a similar issue with the "new" operator, when allocating an array of objects
[21:38] <kees> if an object is "x" bytes big, and you need an array "y" long, this is effectively doing a  malloc( x * y ) all over again.
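The arithmetic behind the wrap-around can be worked through in Python, whose integers do not overflow. This sketch assumes a 32-bit int that wraps in two's complement (strictly, signed overflow is undefined in C, though wrapping is what typically happens). Both function names are hypothetical, and the check shown is the exact x > INT_MAX / y form rather than the slightly more conservative one quoted above.

```python
INT_MAX = 2**31 - 1  # assumption: 32-bit signed int


def wrapped_int_product(x: int, y: int) -> int:
    """Simulate x * y truncated to a 32-bit two's-complement int."""
    product = (x * y) & 0xFFFFFFFF
    return product - 2**32 if product > INT_MAX else product


def product_would_overflow(x: int, y: int) -> bool:
    """True if x * y does not fit in a 32-bit signed int (x >= 0, y > 0)."""
    if x < 0 or y <= 0:
        raise ValueError("sizes must be positive")
    return x > INT_MAX // y


print(wrapped_int_product(5, 15))           # 75, as expected
print(wrapped_int_product(1294967000, 10))  # 64768112, far less than 12949670000
print(product_would_overflow(1294967000, 10))
```

So a request for roughly 12 GB would quietly turn into an allocation of about 64 MB, and later writes based on the original sizes would run off the end of it.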
[21:38] <kees> so, the biggest thing to help defend against these various glitches is testing.
[21:39] <kees> try putting HTML into form data, URLs, etc
[21:39] <kees> see what kinds of files are written in /tmp
[21:39] <kees> try putting giant numbers through allocations
[21:39] <kees> put format strings as inputs
[21:39] <kees> try to think about how information is entering a program, and how that data is formulated.
[21:39] <kees> 21:39 < erUSUL> QUESTION: Do not most people use malloc wrappers to avoid this kind of things like xmalloc ?
[21:39] <kees> it'd be nice if it was more of a standard practice, yes.
[21:39] <kees> the xmalloc's I've seen just check return codes, though, they don't validate the math.
[21:40] <kees> i.e.  xmalloc(x * y) still has the same problem.
[21:40] <kees> something like xalloc_array(x, y) which then did the INT_MAX tests would be better.  (this is what image libraries have moved to doing, since they're constantly allocating 2d buffers, etc)
[21:40] <kees> there are a lot of unit-test frameworks (python-unit, Test::More, CxxTest, check)
[21:41] <kees> give them a try.  :)
[21:41] <kees> writing tests is great for finding bugs in general (and security bugs too)
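To make the "try hostile input" advice concrete, here is a small sketch using Python's built-in unittest module. show_username stands in for whatever code is under test, and run_tests is a hypothetical convenience wrapper; the inputs are the kinds mentioned above (markup and format strings).

```python
import html
import unittest


def show_username(name: str) -> str:
    """Stand-in for the code under test: render a user name into HTML."""
    return "<span>" + html.escape(name) + "</span>"


class HostileInputTests(unittest.TestCase):
    def test_markup_is_escaped(self):
        rendered = show_username("<script>alert(1)</script>")
        self.assertNotIn("<script>", rendered)

    def test_format_string_is_kept_literal(self):
        # A name like this should pass through untouched, not be interpreted.
        self.assertIn("%x%x%n%n", show_username("%x%x%n%n"))


def run_tests() -> unittest.TestResult:
    """Run the suite programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(HostileInputTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_tests() (or pointing `python -m unittest` at the file) runs both checks.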
[21:41] <kees> as for projects in general, it's great if a few days during a development cycle can be dedicated to looking for security issues.
[21:42] <kees> that's about it from me; it was a quick overview.  :)
[21:43] <kees> I've left some time for questions, if there are any?
[21:43] <kees> anything about secure coding, security in ubuntu, security in general?
[21:43] <kees> I can also answer questions about video formats and ubuntu development processes.  ;)
[21:44] <kees> 21:44 < amik> any particular dangers in Java (other than sql stuff)?
[21:45] <kees> I don't know of any language-specific issues with Java, but all the standard stuff (tempfiles, ssl, etc) applies.
[21:45] <kees> 21:45 < jtatum> QUESTION: can you recommend any reading material for various types of security testing?
[21:46] <kees> specific to testing, it's a little iffy.  security testing tends to either fall into "software testing" or "penetration testing", which can be very different things.
[21:46] <kees> as for books, I recommend:
[21:46] <kees> Secure Programming Cookbook for C and C++ by Viega, Messier
[21:46] <kees> The Art of Software Security Assessment by Dowd, McDonald, Schuh
[21:46] <kees> Fuzzing for Software Security Testing and Quality Assurance by Takanen, DeMott, Miller
[21:46] <kees> and for online, Secure Programming for Linux and Unix HOWTO by David Wheeler
[21:46] <kees> 21:45 < sebsebseb> QUESTION: Apparently  OpenBSD is the most secure OS is this true?
[21:47] <kees> this obviously depends on one's definition of "most secure OS".  At DefCon in CTF, considered by some to be the most dangerous network in the world, the organizers traditionally use OpenBSD.
[21:48] <kees> that said, OpenBSD sure misses a lot of features I like to use on my desktop.  :)
[21:48] <kees> 21:46 < kennethvenken> QUESTION: what kind of code analysis techniques are applied to source code written for ubuntu?
[21:49] <kees> there is no specific code analysis that happens for all ubuntu software, but lots of people use a lot of different systems to do code analysis.
[21:49] <kees> for example, coverity runs their tests frequently, "sparse" is used on the linux kernel.
[21:50] <kees> when I do audits, it's a combination of manual investigation (looking at how data passes through the code) and looking at the build logs to see what the compiler is screaming about with the various warnings for format strings, unchecked return codes, etc.
[21:50] <kees> 21:49 < openweek1_> QUESTION: some python specific materials about security programming?
[21:50] <kees> I don't have anything handy, though I think David Wheeler's website has good general stuff in it
[21:51] <kees> 21:50 < amik> QUESTION: any thoughts on how to get developers to get into the 'defensive security' state of mind in daily work?
[21:51] <kees> this is, I think, a matter of making people more paranoid.  ;)  thinking about how software can be misused is key.
[21:51] <kees> that said, it is sometimes very hard to do this when coding to a deadline, etc.
[21:51] <kees> 21:51 < sebsebseb> kees: What's CTF  and Defcon?   well  Defcon is I guess some kind of security conference or something
[21:52] <kees> sorry, I should have explained that more.  DefCon is an annual computer security conference (it follows the more corporate "Blackhat" conference).  CTF is Capture the Flag, a challenge that runs the entire conference where teams try to break into each other's systems to earn points.
[21:52] <kees> great fun.  :)
[21:53] <kees> 21:53 < kennethvenken> QUESTION: are you Dutch? ;) I'm from Belgium
[21:54] <kees> I'm half-dutch (hence my americanized last name "Cook").  I'm named after my grandfather.  :)
[21:55] <kees> thank you everyone for listening!
[21:55] <kees> I'll clear out of the way now.  if other questions pop to mind, feel free to catch me on freenode or via email  thanks!

MeetingLogs/openweekKarmic/SecureCode (last edited 2009-11-04 19:04:52 by pool-71-182-105-84)