Wookie: A case study in sustainability

At OSS Watch we periodically review all the resources on our main website to make sure they’re accurate and up to date. Last week it was time to revise our case study on Apache Wookie, which is a project I’ve been involved with for some time.

Wookie graffiti

OSS Watch became involved with Wookie while I was working in an EU project based at the University of Bolton. The project as a whole had done lots of interesting stuff, but as with many large projects the whole was somewhat less than the sum of its parts; the central joined-up platform wasn’t really going to take off after the project finished. However, in the process we had built quite a promising system for adding functionality to the core portal shell using the W3C Widgets specification.

Towards the end of the project I went to an OSS Watch event, and spoke with Ross Gardler about what we were doing. Ross explained the Apache Incubator model to me, and from there on I was hooked.

Fast forward to 2013, and Apache Wookie is out of the incubator and a top-level Apache project, and is now on its seventh official release (the last one was in April). It’s not a huge project – the team is still small, though it’s far more diverse than when we started out.

The tempo of development has also slowed in recent years. However, in part that’s due to the maturing of the software to a point where code churn for its own sake has a negative impact on the projects that depend on it. Most recent updates have been fixing bugs affecting deployment in various unusual configurations, driven largely by reports from users. So this isn’t necessarily a bad thing!

Something that has also had a very positive impact on the project is having a very active downstream project – Apache Rave. This has driven a lot of improvements to Wookie’s integration and deployment.

Two major EU projects have been working with Wookie and Rave over the past two years, and are coming towards their end – one this year, and the other in 2014.

Unlike previous projects they have focussed on working with existing software projects rather than going it alone, and have contributed code, user studies and content.  This has been a great experience, and hopefully future projects can learn from this approach.

Wookie stands as an example of how OSS Watch can help take work from within the HE sector and turn it into a sustainable open source software project; and as a beneficiary of this approach I’m keen to offer the same help I received to others.

Do you think your University-based project has the potential to go further? If so, get in touch!

Read the updated case study on Apache Wookie at OSS Watch.

(Photo by Silus Grok, used under CC-BY-SA license)

More open source options for education

As part of OSS Watch’s regular review of our website’s content, I’ve taken a look through the publicly editable version of our Open Source Options for Education list and added some new contributions to our website.

The response from the educational community has been overwhelming in helping us find both alternatives to common proprietary software and real-world examples of these alternatives being used.  I’d like to extend my thanks to everyone who’s contributed.

I’m particularly pleased this time to include a new category for Management Information System (MIS) software.  These tools often represent a significant investment for an institution and, because they perform a key administrative role, the need for compatibility with them can be a strong influence over procurement of related software such as VLEs.

You can find the updated version of the Open Source Options for Education document on the OSS Watch website, and continue to contribute to the public version on Google Docs.

4 Tips for Keeping on Top of Project Dependencies

Almost any software project involves working with dependencies – from single-purpose libraries to complete frameworks. When you’re working on a project it’s tempting to bring in libraries, focus on meeting the user need, and figure out the niceties later. However, a little thought early on can go a long way.

Photo of a stack of cards

This is because every dependency can bring its own licensing obligations that affect how you are able to distribute your own software. In some cases, in order to release the software under a particular license you may end up having to rewrite substantial amounts of software to remove reliance on a library or framework that is distributed under an incompatible license.

So there is a tradeoff between being agile and productive in the short term and the risk of needing to do a costly refactoring triggered by a compatibility check before – or even worse, after – a release.

For larger projects, and organisations with multiple projects, this starts to stray into the territory of open source policies and compliance processes, but for this post let’s just focus on the basics for small projects.

1. Make it routine

A good strategy is to build good dependency management practices into your general software development practices – similar to the concept of building in quality or building in security.

In other words, given that the cost of fixing things later can be significant, it’s worth investing in the practices and tools that can ensure potential issues are spotted and fixed earlier.

At its simplest, this can just mean developing a greater awareness as an individual developer of where your code comes from,  knowing that what you reuse can limit your choices for how you license and distribute your own code.

So in practical terms, this means being careful about copying and pasting code from the web, and making sure you know the licenses of any dependencies, preferably before working with them, but certainly before building any reliance on them into your code.

It may also make sense to handle any required attribution notices for inclusion in a NOTICE and README as you go along, rather than just rely on a release audit to always pick them up.
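As a rough illustration of making this routine, here is a minimal Python sketch of a check you could run as you go along. It assumes you keep a simple record of each dependency and the license you found for it; the dependency names and the set of acceptable licenses are placeholders, since which licenses are actually compatible depends on how you intend to license your own code.

```python
# Hypothetical policy: licenses this project has decided it can accept.
ALLOWED_LICENSES = {"Apache-2.0", "MIT", "BSD-3-Clause"}


def flag_dependencies(declared, allowed=ALLOWED_LICENSES):
    """Return {name: license} for dependencies that need a closer look.

    `declared` maps a dependency name to the license identifier you
    recorded for it, or None if you have not checked it yet.
    """
    return {
        name: license_id
        for name, license_id in declared.items()
        if license_id is None or license_id not in allowed
    }


if __name__ == "__main__":
    # Illustrative entries only; these are not real packages.
    deps = {
        "example-http-lib": "Apache-2.0",
        "example-charts": "GPL-3.0-only",
        "pasted-snippet": None,
    }
    for name, license_id in sorted(flag_dependencies(deps).items()):
        print(f"review needed: {name} ({license_id or 'license unknown'})")
```

A check like this only flags things for a human to look at; it doesn’t decide compatibility for you.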

2. Let tools take some of the strain

There are also tools that can help make things easier. For example, if you use Maven for Java projects, there is a License Validator plugin that can help flag up problems as part of your compile and build process.

Alternatively, Ninka is an Open Source tool for scanning files for licenses and copyrights. While it can’t follow import declarations or dynamically linked libraries, it can be useful to periodically check builds. A similar project is Apache RAT (Release Audit Tool) which was originally created for use within the Apache Software Foundation for reviewing releases made in the Apache Incubator.

For larger projects and organisations there are also complete open source policy compliance solutions like Protek from Black Duck, or Discovery from OpenLogic.

It’s also worth pointing out that, while tools can be a part of the solution – and can be invaluable for large projects – ultimately it’s still your responsibility to make sure you meet the obligations of the software you are reusing.

3. Remember to check more than just the licences!

If a dependency has a compatible licence, that’s great. But what about if the project that distributes it doesn’t bother checking their own dependencies?

This is where it’s good to have an idea about the governance and processes of projects you depend on.

There aren’t just licensing risks associated with dependencies – if you rely heavily on a library that has only one or two developers then you also run the risk that it may become a “zombie” project with implications for the rest of your code, for example, if security patches are no longer being applied.

A zombie

Beware of zombie projects!

The commercial tools mentioned above are typically backed by a knowledge base that can also flag up other issues with dependencies, such as governance or sustainability problems.  However, for most smaller projects, just looking up the project on Ohloh is often good enough to check that a library is still “live”.

If you need to know more about the sustainability of a particular project, OSS Watch can carry out an Openness Review to check its viability using a range of factors – get in touch with us if you want to know more.
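For a quick check of your own, the “is it still live?” question can be sketched as a couple of simple rules. The Python below is only an illustration: the thresholds are arbitrary assumptions, and the figures are ones you would read by hand from a project’s public activity page rather than anything fetched automatically.

```python
from datetime import date


def sustainability_warnings(last_commit, recent_contributors, today,
                            max_idle_days=365, min_contributors=3):
    """Return a list of reasons a dependency may be turning into a zombie.

    The default thresholds are assumptions for illustration, not
    established rules; adjust them to your own appetite for risk.
    """
    warnings = []
    idle_days = (today - last_commit).days
    if idle_days > max_idle_days:
        warnings.append(f"no commits for {idle_days} days")
    if recent_contributors < min_contributors:
        warnings.append(f"only {recent_contributors} recent contributor(s)")
    return warnings


if __name__ == "__main__":
    # Made-up figures for a hypothetical library.
    for reason in sustainability_warnings(
            last_commit=date(2011, 1, 1),
            recent_contributors=1,
            today=date(2013, 5, 1)):
        print("warning:", reason)
```

An empty list doesn’t prove a project is healthy; it just means neither of these two crude signals fired.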

4. Keep track of past decisions and share knowledge with colleagues

Some organisations make use of component registries to keep track of which components they approve for use in their software projects. This can save developers time spent researching the same libraries, but it makes most sense when you have a lot of projects that need the same kinds of components, and so benefit from reusing the same set of libraries.

Another reason for using a registry is where you need to perform more detailed evaluations, for example for security, and so checking a dependency is more involved than just figuring out which license it uses, and that the project isn’t dead.

Some examples of commercial registries are Sonatype Component Lifecycle Management and Black Duck Code Center. Again, for a smaller project or an organisation with a relatively small set of projects this can be overkill, and just having a shared document somewhere where you can keep note of which libraries you’ve used can be effective.

For example, you could share a spreadsheet with colleagues containing some basic information on each library like what version you’re using, what license it’s under and the date and results of any investigations you’ve done into sustainability, security or risk assessment.
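If you would rather keep that shared record as a CSV file alongside your code, a minimal Python sketch might look like the following. The field names and the example entry are assumptions chosen to match the columns suggested above; a real registry would use whatever columns your team finds useful.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class DependencyRecord:
    name: str
    version: str
    license: str
    reviewed_on: str  # date of the last investigation, e.g. "2013-05-01"
    notes: str        # outcome: sustainability, security or risk findings


def write_registry(records, stream):
    """Write the records as CSV with a header row."""
    columns = [f.name for f in fields(DependencyRecord)]
    writer = csv.DictWriter(stream, fieldnames=columns)
    writer.writeheader()
    for record in records:
        writer.writerow(asdict(record))


def read_registry(stream):
    """Read records back from CSV written by write_registry."""
    return [DependencyRecord(**row) for row in csv.DictReader(stream)]


if __name__ == "__main__":
    # Hypothetical entry; an in-memory buffer stands in for a shared file.
    buffer = io.StringIO()
    write_registry(
        [DependencyRecord("example-lib", "1.2.0", "Apache-2.0",
                          "2013-05-01", "active, no known issues")],
        buffer,
    )
    print(buffer.getvalue())
```

Keeping the file in version control gives you a history of when each decision was made and by whom.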

Is it worth it?

Reusing code is good practice and should save you time and expense – so it’s annoying if the administration associated with it starts affecting your productivity.

You can make a judgement call about what level of risk you feel is acceptable; for example, on an internal-only research project the risk of having to undergo a major refactoring should the project be successful may be one worth taking.

However, for a production system, or a component that is itself intended for reuse, you may just have to accept that you have to be a bit more diligent in how you reuse code.

Photo by DieselDemon used under CC-BY-2.0.

Guardian recommends open source skills as an employability bonus

The Guardian Careers site published an article yesterday discussing which skills you should have on your CV to ensure your application is “at the top of the pile” when applying for IT jobs.

Among the usual traits such as being able to program (they suggest Java, but with a willingness to learn new languages), one of the recommendations is “Open up to open source”.

In a succinct paragraph the article manages to introduce the idea of open source, as well as explaining both its benefits to the public (in terms of having access to zero-cost versions of software) and why IT companies and departments would be looking for it.

Engaging with an open source community provides you with the opportunity to gain practical experience in working on projects with a distributed team from diverse backgrounds.  Any skills relevant to the IT industry would be desirable to an open source project – not just programming but also skills like project management and technical writing.

The public nature of open source projects also means that your work will be open for potential employers to examine.  Code you’ve written for a previous job may be locked up in a company’s version control system, but by contributing open source code you give a potential employer the opportunity to see evidence of your competence in the field.

Of course, beyond the benefits of the general IT skills you can acquire, specific experience in open source engagement can be of value to IT companies who are increasingly taking advantage of open source software.  To get the full value from open source implemented in an organisation, that organisation should be prepared to engage with the community process, allowing them to get bugs fixed, contribute to the project, and possibly influence the project’s direction in their favour.  To make this possible, they’ll need people with experience of community engagement.

Trademarks and FOSS

On April 19th the United States Patent and Trademark Office finally rejected an application for the trademark ‘Open Source Hardware’. The grounds for the rejection were that the term was ‘merely descriptive’. Trademarks are intended to identify a specific source of goods or services, protecting that source from confusion in the minds of consumers with other sources. Naturally then, if you try to obtain a trademark which is just a description of a type of product or service, it is proper that you should be refused; it would not be distinctive and it would distort the market by allowing one source to control the generic term. If I market a car for a hamster, I should not be able to get a trademark for the name ‘hamster car’, as that would improperly restrain competitors from bringing their own hamster cars to market. So should we be pleased that the application was rejected? After all there is no trademark ‘open source software’ (although the Open Source Initiative do hold one for their own name and logo which acts as a kind of accreditation mark for their approved licences and projects that use them). In this case it’s a little confusing, because the applicants do not seem to have been actually looking to use the mark to describe what is usually understood by the phrase ‘open source hardware’ at all. In fact they were looking to protect their offering comprising:

Computer services, namely, providing an interactive web site featuring technology that allows users to consolidate and manage social networks, accounts, and connections to existing and emerging application programming interfaces (APIs)

Reading the decision it seems that the services relate to providing and managing services for children on a variety of devices, and that the trademark is supposed to imply the ‘general freedom’ of open source software but applied to one’s hardware devices in a surprising new way:

In support of registration, applicant maintains in Section 1 of its brief that the mark is not merely descriptive because OPEN SOURCE was used initially with the Open Source Software Movement; that applicant’s use of “open source” would associate that term with the provision of software and that “this causes a jarring effect that is overcome by the user’s imagination to the play on words.”… Additionally, applicant argues that joining HARDWARE next to OPEN SOURCE causes consumers to think of “physical artifacts of technology designed and offered in the same manner as free and open source software,” citing to the wikipedia.com definition of “open source hardware.”

So, I would argue, this is really not an application to use the term ‘open source hardware’ on what is normally understood to be open source hardware, so it’s not merely descriptive. This is more like the Irish company that holds the trademark ‘open source’ for use on dairy products. Indeed, the decision does have a strong dissenting opinion which argues that the trademark ought to be allowed as non-descriptive but then properly obstructed by complaints from the actual ‘open source hardware’ community before its final grant.

What this shows, I think, is a couple of things. Firstly, that bodies like the USPTO have trouble understanding phrases like ‘open source’ where they relate to technology. Secondly, that terms that the community relies on to describe their interests and enthusiasms are not necessarily immune from proprietary seizure. While the decision here seems to contain an error that worked to deny the trademark, it’s possible to imagine a similar error that would allow a troublesome trademark to be granted.

In connection with trademarks and FOSS I was interested to see the establishment of modeltrademarkguidelines.org, a wiki-based site which

 …proposes language one might use for trademark guidelines for FLOSS software projects.

It already contains a very useful page listing pre-existent FOSS project trademark policies. I would encourage readers to read the draft version of the guidelines and comment.

Open Source and Open Standards key to future of public sector IT

Last week Open Source, Open Standards 2013 took place in London, an event focussed on the public sector. Naturally, these being two topics we’re very keen on here at OSS Watch, I went along too.

Overall the key message to take away from the event was just how central to public sector IT strategy these two themes have become, and also how policy is being rapidly turned into practice, everywhere from the NHS to local government.

Tariq Rashid, the Open Source policy lead for the UK Government, spoke of the need for IT to be focussed on user needs, and to deliver sustained value, by moving from “special” software procured for the public sector, to services delivered using commodified IT.

Even where services are unique to the public sector, Rashid and other speakers at the event made the case that most elements of such services can be delivered by building on commodified IT. For example, the open source CMS Drupal is used for delivering increasing numbers of public sector IT services, and the Government Digital Service builds its services from open source components.

The two strategies of Open Source and Open Standards are necessary as they create the ‘competitive tension’ needed to drive down cost and improve sustainability.

Mark Bohannon of Red Hat gave an overview of the global landscape of Open Source in government, in the US and UK, and identified the UK policies as being particularly forward looking. Mark positioned Cloud and Big Data as two key areas where Open Source and Open Standards were critical, calling out OpenStack and Hadoop as particular cases, and also provided some great case studies on open source from the military and from space exploration.

Mark made the point that Open Source and Open Standards underpin a more fundamental change in IT, away from big IT projects towards IT that is agile, modular and responsive to user needs.

Ian Levy of CESG dispelled some myths around security and Open Source (“If anyone in UK government says CESG has banned open source send their name to me and I’ll have them killed”) and made the case for a common sense approach to security, whether the software or service is open source or closed source.

Mark Taylor from Sirius has long been an advocate for open source in the public sector, and it was good to be at a point where the message has been heeded! He began with a nice Schopenhauer quote:

All truth passes through three stages. First, it is ridiculed. Second, it is violently opposed. Third, it is accepted as being self-evident.

In the talk he provided lots of practical advice for public sector organisations on putting Open Source into practice, which included calling on those writing tenders to focus on user needs instead of naming technology solutions. Mark also gave a workshop later in the day where he continued this theme, expanding on how public sector organisations and companies had made transitions to open source. It’s not very easy to summarise here in a post, but I found the information very practical and useful; for example, when transitioning IT, start with the systems furthest away from users, such as backend services and infrastructure, to avoid sparking the usual neophobia when you change technologies for users.

Inderjit Singh gave an overview of the NHS standards-based approach to IT, with some nice background on which approaches had been tried and where the current strategy is going. The current approach has been to use a programme of change projects involving SMEs that have engaged 40 new suppliers, and which is accelerating the take up of the standards.

Singh asserted that standards are fundamental for enabling an open architecture, and that open source and open standards go hand in hand in delivering value for users.

After some workshop sessions, we had Alasdair Mangham from the London borough of Camden giving us a look into how they’ve been building services using open source software in collaboration with SMEs. This involved a major shift in contracting – rather than write a huge set of requirements in a tender document, they disaggregated the project and bought in specialist capabilities (in usability, service design, SOA etc) as needed in smaller chunks of time using an agile process.

Graham Mellin gave an overview of the Met Office’s new space weather system built using open standards and using open source software; for their own specialist systems they decided to go down the route of making it Open Source rather than the private partner sharing route as a result of an exploitation planning process.

I met with a lot of people at the event, from suppliers, local government, NHS and national government departments, and it was good to get a sense of how the public sector is moving – whatever the pace in individual areas – towards this vision of more affordable, sustainable and user focussed IT, and better utilising the capabilities of UK SMEs and startups.

As we pointed out recently in our post in the Guardian, Higher Education in particular is in a strong position in this area as a result of past investments in Open Source and Open Standards, and we now need to think about how we take that forwards.

As Mark Taylor pointed out in his talk, the public sector accounts for over half of IT spend in the UK – and we can choose to either unite and use that market power to shape the future, or be divided up and conquered.

Is Tomorrow’s World an Open Source one?

Last week BBC’s Horizon put out a special episode looking at the next generation of technological advances. Two of the stories they reported caught my eye as they suggest that the future of innovation lies in an open way of working.

Photo of Liz Bonnin, Horizon presenter

Liz Bonnin presented the show from one of The Science Museum's storage hangars. Photo credit: BBC

The first story looked at the work of Professor Bob Langer at MIT.  Professor Langer has received the Draper Prize and National Medal of Science for his work in biomedical engineering.  Langer’s approach to research is to bring experts from a range of fields together to create an interdisciplinary team.

Previously, medical devices were designed by doctors based on existing materials.  Langer sought to design new materials to operate inside the body and be safely absorbed once their job was done.  To make this possible he assembled a team including engineers, chemists, neurosurgeons, pharmacologists and experts from a number of other disciplines.

The approach of applying one expert’s knowledge to the problem posed in another’s primary field has many parallels with open innovation, and led to advances never thought possible by those working in single fields.

The second story reported on the Protei project which we heard about recently at Open Source Junction.  Protei was founded by Cesar Harada, and seeks to produce sailing drones which can be used to clean up oil spills.

Harada released his initial designs online and set about forming a community of scientists and engineers to collaborate on the project. Supported by a Kickstarter campaign, over $33,000 was raised, allowing him to hire a workshop and invite his community to work together on the open hardware project.

The programme then focused on the contrast between the model of inventors patenting an invention, which Harada characterised as “good for the manufacturer but not very good for the people”, and the “new culture of openness” shaping what we invent.

One comment that piqued my interest came from Gia Milinovich, who spoke of a “tension between the open source movement and business”, and a “battle between these two worlds”.  While this paints an exciting picture for a science documentary, I think the language used here was slightly disingenuous.

While we hear stories of one company attacking another company that backs an open source project, these differ little from companies litigating against each other over issues with no relation to open source. It’s fortunately very rare that we see a “battle” between a business and an open source community, and the examples of this are greatly outstripped by the examples where the two work together in harmony, indeed furthering one another’s goals.

Designer Wayne Hemingway then described how he “loved the idea” of an environment with no patents and no copyright, which while certainly a valid goal doesn’t do well to represent the way open source works.  The most common open source licences all at least require that the original author be credited for their work, which in a copyright-free world wouldn’t be enforceable.

These criticisms aside, it’s great to see open source and open hardware getting airtime from a mainstream broadcaster like this.

Koha: a case study in open source project ownership | opensource.com

While compiling OSS Watch’s list of Open Source Options for Education, I discovered Koha, an open source Integrated Library System (ILS). I discovered, with some confusion, that there seemed to be several ILS systems called Koha. Investigation into the reason for this uncovered a story which provides valuable lessons for open source project ownership, including branding, trademarks, and conflict resolution.

Read the full article at opensource.com.

Shallow versus Deep Differentiation: Do we need more copyleft in the cloud?

In a previous post I discussed two different models for open source services; the “secret source” model, which is based on providing a differentiated offering on top of an open source stack, and a copyleft model using licenses that address the “ASP loophole” such as AGPL.

Another way of looking at these two models is in terms of the level and characteristics of differentiation that they afford.

Shallow versus Deep

Sea and clouds

If a service offering – and this applies whether it’s a SaaS solution or infrastructure virtualization or anything in between – uses a copyleft license such as AGPL, then this tends to encourage shallow differentiation. By this I mean that the service offered to users by different providers is differentiated in a way that does not involve significant changes to the core codebase. For example, service providers may differentiate on price points, service packages, location, and reputation, while offering the same solution.

There can also be differentiation at the technology level including custom configurations, styling and so on, or added features; however under an AGPL-style license these are also typically distributed under AGPL, so if service providers do want to extend and enhance the codebase, this is contributed back to the community. If a provider really did want to provide deep differentiation, it would effectively have to create a fork.

If a service offering instead builds on top of an open source stack using a permissive license such as the Apache license, then it becomes possible for providers to offer deep differentiation in the services they provide; they are at liberty to make significant changes to the software without contributing this back to the community developing the core codebase. This is because, under the terms of most open source licenses, providing an online service using software is not considered “distribution” of the software.

What does this all mean?

For service providers this presents something of a quandary. On the one hand, a common software base represents a significant cost saving as development effort is pooled, reducing waste. On the other, there is a clear business case for greater differentiation to compete as the market becomes more crowded.

How this is resolved is something of a philosophical question.

It may be that, acting out of self-interest, service providers will over time balance out the issues of differentiation and pooled development regardless of any kind of licensing constraint; the cost savings and reduced risk offered by pooling development effort for the core codebase will be clear and significant, and providers will apply deep differentiation only where there is very clear benefit in doing so, while contributing back to the core codebase for everything else.

Alternatively, service providers may rush to differentiate deeply from the outset, leaving the core codebase starved of resources while each provider focusses on their own enhancements. In this scenario, copyleft licensing would be needed to sustain a critical mass of contributions to the core.

Which is it to be?

Given that OpenStack and Apache CloudStack, two of the main cloud infrastructure-as-a-service projects, are both licensed under the Apache license, we can observe over the coming year or two which seems to be the likely scenario. Under the first model, we should see the developer community and contributions for these projects continue to grow, irrespective of how deeply providers differentiate services based on the software.

Under the second scenario, we should see something rather different, in that the viability of the project should suffer even as the number of providers building services on them grows.

As of now, both projects seem to be growing in terms of contributors; here’s OpenStack (source: Ohloh):

OpenStack contributors per month; rising from 50 in 2011 to 250 in 2013

… and here is CloudStack (source: Ohloh):

CloudStack contributors per month; around 25 over 2011-2012, rising steeply to 60 in 2013

(Both projects have slightly lower numbers of commits, though that can simply reflect greater maturity of codebases rather than reduced viability, which is why I’ve focussed on the number of contributors)

If the concerns over “deep differentiation” turn out to be justified, then community engagement in these two projects should suffer as effort is diverted into differentiated offerings built on them, rather than channelled into contributions back to the core projects.

Is deep differentiation really an issue for cloud?

Deep and shallow differentiation is a concept borrowed from marketing, and is sometimes used to refer to how easy it is for a competitor to copy a service offering. One example of this is the Domino’s Pizza “hot in 30 minutes or it’s free” service promise – it would be difficult for a competitor to copy this offering without actually changing the nature of its operation to match – it can’t just copy the tagline without risking giving away free pizza and going out of business.

In cloud services, it’s arguable how much differentiation will be in terms of software functionalities and capabilities, and how much on the operational and marketing aspects of the services: things like pricing, reliability, support, speed, ease of use, ease of payment and so on.

If the key to success in cloud is in amongst the latter, then it really doesn’t matter that most providers use basically the same software, and providers will want to take advantage of having a common, high quality software platform with pooled development costs.

A further problem with deep differentiation in the software stack is that this could impact portability and interoperability – having extra features is great, but cloud customers also value the ability to choose providers and to switch when they need to. Providers focussing on a few popular open source cloud offerings are another kind of standardisation, complementing interoperability standards such as OVF, and one that gives customers confidence that they aren’t being locked in; as well as being able to move to another provider, they also get the option to in-source the solution if they so wish.

Are there better reasons for copyleft?

It remains to be seen whether there really is a problem with the open cloud, and whether copyleft is an answer. Personally I’m not convinced there is.

However, that doesn’t mean copyleft on services isn’t important; on the contrary I think that licenses such as AGPL offer organisations a useful option when looking to make their services open.

Recent examples such as EdX highlight that AGPL is a viable alternative for licensing software that runs services, and that perhaps with greater awareness among service providers we may see more usage of it in future. For example, for the public sector it may offer an appropriate approach for making government shared services open source.

(Sea and cloud photo by Michiel Jelijs licensed under CC-BY 2.0)