Author Archive for Sander van der Waal

Opportunities for scientific research in open source projects

There are many interesting open source projects that can be beneficial to academic research. As OSS Watch’s recent article on e-Research by Gabriel Hanganu shows, there are social and organisational problems in adopting open source for e-Research, but there are many open source software projects ready to be joined. Some projects are very well suited to scientific research, and I feel that this is especially true in the realm of big data databases.

Google showed the way, really, with the MapReduce paper in 2004. They published their programming model for processing large amounts of data in parallel, although publishing it did not stop them from applying for a patent as well, which was recently granted. Hadoop, which originates from a project at Yahoo!, also implements the MapReduce pattern, but is completely open source, being a project of the Apache Software Foundation. And now Apache Cassandra has joined the mix. Cassandra originates from Facebook, but became open source in July 2008. It was recently promoted from the Apache Incubator and is now an official top-level Apache project.
Work has been initiated to facilitate integration between Cassandra and Hadoop, which, put simply, means that the Hadoop database HBase is replaced with Cassandra. This has been discussed on the list and a feature has recently been implemented. So there’s Yahoo! working on Hadoop and Facebook working on Cassandra, and recently Twitter has also announced that it is working towards using Cassandra for its backend. Also worth mentioning is Voldemort, the open source implementation of Amazon’s Dynamo database. This project is used and actively developed by LinkedIn, and is therefore another example of a project where, by engaging with it, you can benefit from the work a large company is investing.
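
For readers who have not met the MapReduce pattern mentioned above, here is a minimal single-process sketch of the idea in Python. It is only an illustration: the function names (map_reduce, count_words, total) are made up for this example and have nothing to do with the Hadoop or Google APIs, and a real framework would distribute the map and reduce phases over many machines.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run a toy, in-memory version of the MapReduce pattern."""
    # Map phase: every record is turned into zero or more (key, value) pairs,
    # which are grouped by key (the "shuffle" step in a real framework).
    grouped = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            grouped[key].append(value)
    # Reduce phase: all values that share a key are combined into one result.
    return {key: reducer(key, values) for key, values in grouped.items()}


def count_words(line):
    """Hypothetical mapper: emit (word, 1) for every word in a line of text."""
    for word in line.lower().split():
        yield word, 1


def total(key, values):
    """Hypothetical reducer: add up the counts collected for one word."""
    return sum(values)
```

Calling map_reduce(lines, count_words, total) on a list of text lines returns a dictionary of word counts. The point of the model is that the mapper and reducer know nothing about parallelism, which is what lets a framework spread the work across a cluster.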

To me, this all shows that there will be large investments in NoSQL databases from major companies in the coming years, and it will all be in open source software. This means that there is a lot of opportunity for anybody who has to deal with big data to profit from this investment. All you have to do is try out the software and engage with these projects. Researchers also have to cope with more and more data, so I think they have good reason to follow these developments closely and step in to benefit.

Building W3C widgets on the Wookie training day

Last week OSS Watch organised its first training day in Oxford. We got together with about 15 people to gain hands-on experience with Apache Wookie (Incubating). Wookie provides an implementation of the W3C widget specifications, so a lot of emphasis was put on building these kinds of widgets. We succeeded quite well in getting to know the spec and how to build widgets, and ended the day with a nice collection of newly built widgets and even a submitted patch to the Wookie source code.

Scott Wilson, the Wookie guru from Bolton University, where it all started, kicked the day off with a presentation (pdf) of what widgets and Wookie are all about. Widgets are basically mini applications that are designed to work in a small view area. Many platforms have created their own format for them, but the W3C is working on a set of specifications with a consortium of partners from both traditional computing and mobile platforms. This should lead to a true cross-platform standard and, hopefully, to widespread adoption. A minimal W3C widget consists of nothing more than a config file and an HTML file, zipped up as an archive with file extension .wgt. The config file contains basic configuration such as the name, description and preferred dimensions of the widget. The widget can furthermore include as many HTML, CSS, image and JavaScript files as one would like.
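
To make the "config file plus HTML file in a zip" idea concrete, here is a rough sketch in Python that assembles such a package in memory. The element names and namespace follow my reading of the W3C widget packaging spec; the widget id, name, description and dimensions are invented for illustration, and build_widget is just a helper name for this example, not part of Wookie.

```python
import io
import zipfile

# Illustrative config.xml; the id, name, description and size are made up.
CONFIG_XML = """<?xml version="1.0" encoding="UTF-8"?>
<widget xmlns="http://www.w3.org/ns/widgets"
        id="http://example.org/widgets/hello"
        version="0.1" width="320" height="240">
  <name>Hello World</name>
  <description>A minimal example widget.</description>
  <content src="index.html"/>
</widget>
"""

# The start file that the config's <content> element points to.
INDEX_HTML = """<!DOCTYPE html>
<html>
  <head><title>Hello World</title></head>
  <body><p>Hello, widget world!</p></body>
</html>
"""


def build_widget(config_xml=CONFIG_XML, index_html=INDEX_HTML):
    """Return the bytes of a zip archive holding config.xml and index.html."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Both files sit at the root of the archive.
        archive.writestr("config.xml", config_xml)
        archive.writestr("index.html", index_html)
    return buffer.getvalue()
```

Writing the returned bytes to a file such as hello.wgt should give something a widget server can be asked to deploy, although I have only sketched the packaging step here and not the deployment.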

Apache Wookie (Incubating) is an application that provides a W3C-compliant widget server. You can use Wookie to deploy widgets and you can serve W3C widgets from the Wookie server in third-party applications. Plugins have already been written for Moodle, LAMS, Sakai and Google Wave. Wookie also has a REST API that can be used to get or create widgets.

After Scott’s intro it was time to get dirty. Ross handed out CDs containing the latest source code of Wookie (which can be downloaded by anybody from Subversion) and prerequisites like a JDK and Apache Ant. His presentation (pdf) was about ‘how to build your first widget’. I was surprised to see that there were 10 people with MacBooks in the room, amongst 4 Windows machines and one Linux netbook. Apple surely knows how to impress the developer these days! After some initial troubles with environment settings etc. most people got up and running fairly quickly and were ready to build their first widget. Wookie provides handy Ant tasks for building and deploying widgets, which means that generating a hello-world skeleton widget is as easy as typing ant seed-widget and answering some questions about the name, description and dimensions of your widget. After you have started up the Wookie server using ant run you can deploy the widget using ant deploy-widget. That was it, and it really was that easy. I must say, having moved away from Ant and used Maven2 for the last few years, it’s nice to be reminded of the powerful features Ant has to offer, especially since Wookie uses Ant in combination with Apache Ivy, the dependency management alternative to Maven2. (To be precise, you can also use Ivy with Maven2 repositories.) Ross also demonstrated how you can make use of OpenStreetMap JavaScript APIs to embed cool navigational features in your widget quite easily. You can check out his presentation (pdf) or directly check out the source code of the tutorial including the example JavaScript.

In his second presentation (pdf) Scott focussed on some design principles behind the widget specification and gave a walk-through of how you can design a more advanced widget by making use of features of the W3C widget object API and integrating with the Google Wave Gadgets API. No Wave server is needed to get this working, as Wookie can handle everything for you. Scott demonstrated a Task widget with collaboration features that can be used by different users concurrently using State and Participants.

After the break it was high time for everybody to create their own widget and some interesting ideas had come up. One of us decided it would be much cooler to hack directly in the server code instead of building widgets and he submitted a patch to Wookie to allow hot deployment of a widget to ease the development/deployment cycle. That’s very cool, thanks Matthew!

The rest of us built some widgets for a wide variety of purposes. One of the nice things about the widgets was that we could easily merge them all together on one Wookie instance and show all widgets there. These were some of the widgets that resulted from this 1.5 hour hack-fest:

  • Video player embedded in a widget with fallback to other formats depending on the user agent
  • Display a list of links using output from one of the Yahoo pipes
  • Display the last.fm playlist of a user and show what that user is currently listening to
  • Show a canvas drawing where multiple people can collaborate by working on the same drawing using HTML5
  • Cool kids’ game where the user can name his pet dinosaur
  • Currency converter that (eventually) would use an external currency conversion provider

It was fun to see how easily you can create functional widgets. If you make use of external JavaScript APIs or data feeds it is also quite simple to create a useful (or not so useful…) widget. This was a nice conclusion of the day and seeing all the widgets we had created we thought we had deserved our beer and headed off to the pub. Thanks to Scott and Ross for making this a successful Wookie training day!

The power of community put into practice

At OSS Watch, we actively promote the idea that there is more to open source software than just a licence. Open source projects should not just use an OSI-approved licence but also practise the open development method, and if they want to become sustainable they should be building a community around their project. Once in a while, we come across a nice example of how the power of the community can be beneficial, and recently one of these examples occurred.

It started with an application that has been built by Nick Burch at the Apache Software Foundation to facilitate the search of geographically ‘nearby people’. He made this little Django application available via a Subversion repository with an Apache licence.

Linking people and projects is also one of the aims of the project registry framework Simal that OSS Watch is involved in. On Simal’s public demo site there is a collection of projects and people working on these projects. Besides doing development work on the Simal application OSS Watch is starting to use the registry more often in our daily work. Unfortunately, we recently failed to find out about a project that was run at our institution, Oxford University, even though it was present in our public registry.

When I realised Simal was lacking functionality that would have been useful for OSS Watch, i.e. finding nearby projects based on location, I created issue 263 for Simal, dumping my thoughts about possible solutions, among which was the ASF application on nearby people.
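
To give an idea of what "nearby" could mean in practice, here is a small Python sketch that filters places by great-circle distance using the haversine formula. This is my own illustration and not how Simal or the ASF application does it; the function names and the (name, latitude, longitude) tuple layout are assumptions made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, good enough for a rough search


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby(origin, places, radius_km):
    """Return (name, distance) pairs within radius_km of origin, nearest first.

    origin is a (lat, lon) pair; places is an iterable of (name, lat, lon).
    """
    hits = []
    for name, lat, lon in places:
        distance = haversine_km(origin[0], origin[1], lat, lon)
        if distance <= radius_km:
            hits.append((distance, name))
    return [(name, distance) for distance, name in sorted(hits)]
```

A linear scan like this is fine for a registry with a few hundred institutions; a much larger data set would probably want a spatial index instead.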

A key problem in adding this functionality was obtaining the geo-location data of the institutions that are involved in the projects. This prompted Ross to reach out to his wider community to see whether anyone had tackled this issue.

The first and very useful suggestion on this matter was from Paul Stainthorp, who pointed to a list of UK universities and their geo-locations, which is maintained at Wolverhampton University.

The second one was from Sam Easterby-Smith who pointed to a list on Wikipedia. That was a good one, as Wikipedia is quite complete and geo-tagged, so we would have the data from that source if only we had a convenient way of extracting it.

The solution to that problem is to use DBpedia, and it was suggested both by James, who added a comment to the issue in the tracker, and by Wilbert Kraan on Twitter. DBpedia is a community effort to extract structured information from Wikipedia and it provides a public SPARQL endpoint for querying Wikipedia data. We can conveniently query that endpoint for a list of the geo-locations of all UK universities and add that data to our Simal repository.
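
As a rough idea of what such a query might look like, here is a Python sketch that builds a request URL for the DBpedia endpoint and parses the standard SPARQL JSON result format. Treat the query itself as an assumption: the class and property names (dbo:University, dbo:country, geo:lat, geo:long) reflect how I understand DBpedia to model this data and may need adjusting, and the helper names are invented for this example.

```python
import json
from urllib.parse import urlencode

DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"

# Assumed modelling: universities typed as dbo:University, linked to the UK
# via dbo:country, with WGS84 latitude and longitude attached.
QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX geo: <http://www.w3.org/2003/01/geo/wgs84_pos#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?name ?lat ?long WHERE {
  ?university a dbo:University ;
              dbo:country dbr:United_Kingdom ;
              rdfs:label ?name ;
              geo:lat ?lat ;
              geo:long ?long .
  FILTER (lang(?name) = "en")
}
"""


def build_request_url(query, endpoint=DBPEDIA_ENDPOINT):
    """Return a GET URL that asks the endpoint for JSON results."""
    params = {"query": query, "format": "application/sparql-results+json"}
    return endpoint + "?" + urlencode(params)


def parse_locations(payload):
    """Turn a SPARQL JSON result document into (name, lat, long) tuples."""
    rows = json.loads(payload)["results"]["bindings"]
    return [
        (row["name"]["value"],
         float(row["lat"]["value"]),
         float(row["long"]["value"]))
        for row in rows
    ]
```

Fetching build_request_url(QUERY) with any HTTP client and passing the response body to parse_locations should yield a list that can be loaded into a registry. I have left the network call out so that the parsing can be tried against a saved response.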

So within one working day we had a solution to the main problem, getting the geo-location data. On top of that, Ross’s discussion with someone already doing this revealed that they are creating the data manually, so they can potentially benefit from our search and automate it, if they want to. Someone on Twitter also noticed our search and indicated that he would be interested in the solution, so potentially more people and/or projects can benefit. And since everything happened completely in the open, even more people have the opportunity to find our solution and use it in their own problem space.

To me, this is a perfect illustration of the power of community. There is just so much that we all collectively know, and having your project run out in the open, freely accessible to everyone, enables you to tap into the collective knowledge of many experts. If this is not a reason to use the open development method, I don’t know what is.

Learn to build W3C compliant widgets with Apache Wookie (Incubating)

Are you interested in widget development and do you want to learn how to build widgets that use the new open W3C widget standard? OSS Watch is organising an Apache Wookie (Incubating) training day in Oxford, UK on 11 February 2010 for developers who would like to get hands-on experience with building widgets using Wookie.

Wookie provides an implementation of the W3C widget standard and allows you to write, deploy and manage W3C compliant widgets easily. The project is also working on implementing extra modules, such as a Google Wave Gadgets API. To increase cross-platform interoperability, several plugins have been written to integrate with other systems, such as Moodle and WordPress.

This event is free of charge and invitation only. However, we have three open places left for interested developers. General development skills are required, but you don’t need to have specific experience in building widgets. If you are interested, you are welcome to contact us at info@oss-watch.ac.uk and let us know why you should be there. If you would like to know more, details about this all-day event can be found on the event page.

One of the reasons we hold this event is in preparation for the dev8D developer days. During that event we will be doing more widget development in a Wookie hackathon and we want to gather some more skilled widget developers for that hackathon. Therefore we expect people who come to the Wookie training day to also attend the dev8D days. However, don’t hesitate to contact us for the Wookie training day if you can’t make it to dev8D.

Mailing lists vs. forums

On Monday the 9th of December we organised two simultaneous workshops on open development. One track was about open innovation whereas the other focused on the theme of building an engaged community around open source software projects. I gave a presentation in the latter track about my first experiences with an open source project and explained the community tools that are essential for open development: a good homepage, a version control system, an issue tracker and mailing lists.

One question at the end of my session was about the mailing lists. I had explained that it is very important to have a publicly accessible mailing list that anybody can subscribe to and that you should ensure that all communication about the project is on the mailing list. The question was about why you should use mailing lists for this and not forums.