dw2

16 September 2008

The practicalities of producing open source software

Filed under: books, Open Source — David Wood @ 10:53 am

I’ve been reading books and articles about open source software since at least June 1998 – the first time the famous phrase “The cathedral and the bazaar” (*) made an appearance in Symbian’s principal internal discussion database (which was, at the time, still called “Psion Software General”). However, I remain keen to keep on testing out my thinking and understanding of open source issues.

After all, there are many different angles to this subject. There’s a potentially huge upside to taking full advantage of the best principles of open source methods – but there are also many risks in applying some of these ideas in a misguided manner.

For that reason, I continue to pick up books on open source software and think hard about what they say. Recently, I accepted the advice of several of my Symbian colleagues, and started reading “Producing open source software” by Karl Fogel.

I’m very pleased that I listened to that advice. In my view, this book is in a class of its own.

It has the great merit of being an intensely practical book. It’s clear from numerous examples in the book that the author has extensive real-world experience at the heart of development teams of open source projects that are both significant and successful – including CVS and Subversion.

Some of the chapters contain ideas that have been covered before – like the history of free and open source, advice on how to choose a licence, and aspects of the technical infrastructure of open source projects. However, other chapters delve into material that (to my knowledge) has been covered much less often:

  • Social and political infrastructure of successful open source projects;
  • “Money”: Working in open source projects in which companies are sponsoring parts of the work;
  • “Money can’t buy you love” – excellent advice on how to avoid corporate sponsorship dampening the enthusiasm of volunteer contributors;
  • Communications – including how to communicate with “difficult people”;
  • Packaging, releasing, and daily development – including dealing with different kinds of codeline, and the special considerations that apply to integration and to making releases;
  • Managing volunteers – including the typical roles that probably need to be filled in projects.

Several times while reading a chapter, I found myself thinking: “Yes, this is highly practical material – and interesting too. But I can see there’s still another 20+ pages in this chapter. What else is there to say about this subject?” But then as I turned over more pages, I thought, “Oh yes, this is something that really belongs here too – it’s what happens in real projects!”

The writing style was pleasant and clear throughout – with an engaging mix of actual examples (both good and bad) and a discussion of the broader lessons to be drawn from these examples.

My recommendation is that project teams should regard this book as a kind of “bible” – it’s something that should be regularly dipped into, and the many salient points shared and debated in group discussion. The lessons will make sense on several levels – some apply during early phases of projects, and others as the project becomes more complex.

(Many of the same principles apply in non-open source projects too, by the way! I felt lots of resonance with my own observations over the years about successful software projects inside the “community source” world of Symbian OS development.)

Perhaps the most sobering part of the book is contained in its introduction:

Most free projects fail…

… it’s impossible to put a precise number on the failure rate. But anecdotal evidence from over a decade in open source, some casting around on SourceForge.net, and a little Googling all point to the same conclusion: the rate is extremely high, probably on the order of 90–95%. The number climbs higher if you include surviving but dysfunctional projects: those which are producing running code, but which are not pleasant places to be, or are not making progress as quickly or as dependably as they could.

Happily, the introduction continues:

This book is about avoiding failure. It examines not only how to do things right, but how to do them wrong, so you can recognize and correct problems early…

If that whets your appetite, note that you can read the entire book online, by following the links above. Alternatively, check out the reviews on Amazon.com.

(*) Footnote: The June 1998 “Psion Software General” discussion contained a link that is, sadly, now dead. I’d like to belatedly thank Symbian lead Software Engineer Joe Branton for coming to my room, on several occasions, and encouraging me to pay more attention to the ideas in The Cathedral and the Bazaar.

31 August 2008

Intellectual property and open source

Filed under: books, GPL, Intellectual property, Open Source — David Wood @ 7:17 pm

I’ve just finished reading a third book, within two months, on the topic of open source licensing. The three books are:

  1. Heather Meeker’s “The Open Source Alternative: Understanding Risks and Leveraging Opportunities” – which I reviewed here;
  2. Lawrence Rosen’s “Open Source Licensing: software freedom and intellectual property law” – which I reviewed here;
  3. Van Lindberg’s “Intellectual property and open source: a practical guide to protecting code”.

My headline summary is that all three books are well worth reading. They overlap to an extent, but they come at their shared subject from very different viewpoints, so each book has lots of good material that you won’t find in the others.

Van Lindberg targets his book at software engineers. He uses many analogies between legal concepts and deeply technical software engineering concepts. For example (to give a flavour of many of the clever pieces of writing in the book):

“One way to think about private goods is to analogize them to locks or mutexes in a multithreaded program. A number of different threads may want to use a protected resource, but control of the lock around the resource is rivalrous…”

Somewhat unexpectedly, the first half of the book hardly mentions open source. There’s good reason for this. The first seven chapters of the book cover the basic principles of intellectual property (IP), including patents, copyrights, trademarks, trade secrets, licences, and contracts. I found the very first chapter to be particularly engrossing, as it set out the philosophical foundations for IP. Van Lindberg highlighted the utilitarian justification for IP, in terms of legal measures to counter what would otherwise be two sorts of market failures:

  • “The cost of creating knowledge is high, but the cost of consuming it is low…. Therefore there is a societal incentive to not create as much knowledge as we would ideally like to have” (hence the utilitarian rationale for copyright)
  • “Secrets are more valuable to you personally, but shared knowledge is more valuable to society…. The resource is valuable to you because you have a key, but it is worthless to everyone else” (hence the utilitarian rationale for patents).

As I said, the very first chapter was particularly engrossing, but I thought the other early chapters dragged a bit. Although all the material was interesting, there were rather too many details for my liking.

Chapter eight (“The economic and legal foundations of open source software”) went back to philosophical principles, in an attempt to pinpoint what makes open source different from proprietary software. The difference, according to Van Lindberg, is that:

  • Proprietary software is driven by corporate business goals (which inevitably involve profit-maximisation, and therefore – he claimed – a tension between what’s best for the customers and what’s best for the shareholders)
  • Open source software is driven by cooperative goals, in which the goals of the customers have primacy. (Note the difference between the similar-looking words corporate and cooperative.)

This chapter also runs a pretty compelling extended comparison between proprietary software and open source software, on the one hand, and banks and credit unions, on the other hand. Again, the first member of each pair is driven by shareholder goals, whereas the second member of each pair is driven by customer goals (the legal owners are the same people as the customers).

The primary task of open source licences, according to this analysis, is to support cooperation. In more detail, Van Lindberg says that open source licences are intended to solve the “Programmer’s Dilemma” version of the well-known “Prisoner’s Dilemma” problem from game theory:

“Open source licences serve two functions in a game-theoretic context. First, they allow programmers to signal their cooperative intentions to each other. By placing their code under a licence that allows cooperation, programmers indicate to their peers that they are willing to participate in a cooperative solution. Second… licences are based in copyright law, which allows the original developer to dictate (to some extent) the users and uses of his code. The legal penalties associated with copyright violations change the decision matrix for other programmers, leading to a stable cooperative (and optimal) solution.”
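To make the “decision matrix” point concrete, here is a minimal sketch of a two-programmer game. The payoff numbers and the size of the penalty are purely illustrative assumptions of mine, not figures from the book, and the function names are hypothetical. It models the penalty crudely, as a fixed cost charged to whoever defects, and then looks for the outcomes from which neither programmer would want to switch on their own.

```python
from itertools import product

COOPERATE, DEFECT = "cooperate", "defect"
MOVES = (COOPERATE, DEFECT)

# Illustrative payoffs only: (first programmer's payoff, second programmer's payoff).
# These follow the usual Prisoner's Dilemma shape; the numbers are assumptions.
BASE_PAYOFFS = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (0, 5),
    (DEFECT, COOPERATE): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}


def with_penalty(payoffs, penalty):
    """Return a new matrix where each defecting player loses `penalty`."""
    adjusted = {}
    for (a, b), (pay_a, pay_b) in payoffs.items():
        adjusted[(a, b)] = (
            pay_a - penalty if a == DEFECT else pay_a,
            pay_b - penalty if b == DEFECT else pay_b,
        )
    return adjusted


def nash_equilibria(payoffs):
    """List the pure-strategy outcomes where neither player gains by switching alone."""
    equilibria = []
    for a, b in product(MOVES, repeat=2):
        pay_a, pay_b = payoffs[(a, b)]
        a_content = all(payoffs[(alt, b)][0] <= pay_a for alt in MOVES)
        b_content = all(payoffs[(a, alt)][1] <= pay_b for alt in MOVES)
        if a_content and b_content:
            equilibria.append((a, b))
    return equilibria


if __name__ == "__main__":
    print("no penalty:  ", nash_equilibria(BASE_PAYOFFS))
    print("with penalty:", nash_equilibria(with_penalty(BASE_PAYOFFS, 3)))
```

With these assumed numbers, the only stable outcome without a penalty is mutual defection, and with a penalty of 3 the only stable outcome is mutual cooperation. That is the shape of the argument in the quote; whether real copyright penalties are large enough, or reliably enough enforced, to have this effect is a separate question.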

This (like everything else in the book) is thought-provoking. But I’m not fully convinced. I think this puts too much importance onto the licence aspect of open source. Yes, picking a good licence is important – but it’s insufficient to guarantee the kind of cooperative behaviour that will make an open source project a real success. And as I’ve argued elsewhere, picking the right licence is no guarantee against the software fragmenting. But despite this quibble, I still think the ideas in this chapter deserve wide readership.

The second half of the book changes gear. With the first eight chapters having carefully outlined the underlying legal framework, the remaining six chapters walk through the kind of real-life IP concerns that will face someone (whether an individual developer, or a company) who wants to become involved in an open source project:

  • Issues with standard employment contracts that probably specify that everything you work on – even in your spare time – belongs to your company, and which you therefore are not free to assign to an open source project
  • General guidelines on choosing between some of the more popular open source licences
  • Legal complications over how to accept patches and other contributions, from outsiders, into your project
  • Particular issues with the GPL
  • Reverse engineering
  • Creating a non-profit organisation or foundation (recommended if your project becomes larger).

There’s lots of good advice here. Every chapter of this part of the book has important material – but I was slightly disappointed with some parts. For example, given the careful attention to patents in the first half of the book (where two chapters were devoted to this topic), I was expecting more analysis of how some of the major open source licences differ in their approach to patent licences and patent retaliation clauses. On reflection, that’s something that the other two books (i.e. those by Meeker and Rosen) handle better.

The chapter on the issues with the GPL confirmed and extended the opinion about that licence which I’d picked up from my previous reading: the interpretation of the GPL is subject to great uncertainty because of its ambiguities. The chapter includes a lengthy “Questions and answers” section, in which the answer to nearly every question is “Maybe” or “It depends”. (Apart from the last question, which is “Can I depend on the answers in this Q&A to keep me out of trouble?”; the answer to this is “No, this is our best understanding of copyright law as it stands right now, but it could change tomorrow – and nobody really knows…”)

Giving more evidence for this view of the ambiguities surrounding the GPL, Van Lindberg mentions an essay by Matt Asay, “A Funny Thing Happened on the Way to the Market”. Here’s an extract from that essay:

“I asked two prominent representatives of the Free Software Foundation – Eben Moglen, general counsel, and Richard Stallman, founder – to clarify thorny issues of linkage to GPL code, and came up with two divergent opinions on derivative works in specific contexts…”

“…it is telling how widely their responses diverge – there appear to be no definitive answers to the question of what constitutes a derivative work under the GPL, not even from the holders of the licenses in question.”

This looks decisive, but it could be argued that this quote from Matt Asay is itself misleading, since Matt’s article goes on to state that:

“Fortunately, as I will detail below, this issue has largely gone away, as it has become accepted practice to dynamically link to GPL code [without that code becoming part of the GPL program]. Linus Torvalds helped to build momentum for such a reading of the GPL. While some argue that kernel modules, including device drivers, must be GPL, Torvalds has stated: This [GPL] copyright does *not* cover user programs that use kernel services by normal system calls – this is merely considered normal use of the kernel, and does *not* fall under the heading of ‘derived work.’”

However, Van Lindberg seems to be right that the official FAQ about the GPL, maintained by the Free Software Foundation, advocates a stricter interpretation:

“Q: Can I release a non-free program that’s designed to load a GPL-covered plug-in?

“A: It depends on how the program invokes its plug-ins. For instance, if the program uses only simple fork and exec to invoke and communicate with plug-ins, then the plug-ins are separate programs, so the license of the plug-in makes no requirements about the main program.

“If the program dynamically links plug-ins, and they make function calls to each other and share data structures, we believe they form a single program, which must be treated as an extension of both the main program and the plug-ins. In order to use the GPL-covered plug-ins, the main program must be released under the GPL or a GPL-compatible free software license, and that the terms of the GPL must be followed when the main program is distributed for use with these plug-ins.

“If the program dynamically links plug-ins, but the communication between them is limited to invoking the ‘main’ function of the plug-in with some options and waiting for it to return, that is a borderline case.

“Using shared memory to communicate with complex data structures is pretty much equivalent to dynamic linking.”
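For readers who want to see the technical distinction the FAQ is drawing, here is a rough sketch in Python. It is only an analogy: Python’s import machinery stands in for dynamic linking, the plug-in and all the function names are hypothetical, and nothing here says anything about how a court would treat either case. The first function starts the plug-in as a separate program and reads its output; the second loads the plug-in into the same process and calls its function directly.

```python
import importlib.util
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical plug-in source, written to a temporary file for this demonstration.
PLUGIN_SOURCE = '''
def greet(name):
    return f"hello, {name}"

if __name__ == "__main__":
    import sys
    print(greet(sys.argv[1]))
'''


def run_as_separate_program(plugin_path, name):
    """Arm's-length use: launch the plug-in as its own process and capture its output."""
    result = subprocess.run(
        [sys.executable, str(plugin_path), name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


def load_into_same_process(plugin_path, name):
    """In-process use: import the plug-in's code and call its function directly."""
    spec = importlib.util.spec_from_file_location("demo_plugin", plugin_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.greet(name)


def demo():
    """Run the same plug-in both ways and return both results."""
    with tempfile.TemporaryDirectory() as tmp:
        plugin_path = Path(tmp) / "demo_plugin.py"
        plugin_path.write_text(PLUGIN_SOURCE)
        separate = run_as_separate_program(plugin_path, "world")
        in_process = load_into_same_process(plugin_path, "world")
    return separate, in_process


if __name__ == "__main__":
    print(demo())
```

Both routes produce the same greeting, which is rather the point: from the user’s side the two approaches can look identical, yet the FAQ treats them very differently.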

Do these ambiguities over the GPL really matter? It’s hard to be sure, but I’m personally glad that the Symbian Foundation plans to adopt a licence – the EPL – which avoids these issues.

I’m also glad to have taken the time to read this book – it’s helped my understanding grow, in many ways.

Footnote: My thanks go to Moore Nebraska for drawing my attention to the Van Lindberg book.
