Building Communities in Computational Science

An OSS Watch / CCPForge Community Day
Friday 11 January 2008
Rutherford Appleton Lab, Oxfordshire

Do you want more people to use your software? Do you want it to interoperate with other pieces of software? Do you want it to be more robust and reliable? Do you want to involve more collaborators in the development of your software?

Come along to the OSS Watch / CCPForge Community Day and see how.

With speakers from OSS Watch, OMII and CCPForge as well as ChemShell and CCP1GUI, the day aims first to tell you how to do it and then to help you achieve it in workshop sessions.

Program of the day

10:00-10:30 Arrive and coffee/tea/biscuits. People can set up laptops
10:30-10:40 David Worth (RAL) with an introduction
10:40-11:00 Ross Gardler (OSS Watch) on building community
11:00-11:20 Neil Chue Hong (OMII-UK) on building communities of software users
11:20-11:40 Stuart Yeates (OSS Watch) on tools for community
11:40-12:00 David Worth (RAL) on CCPForge
12:00-12:20 Johannes Kaestner (Daresbury) on ChemShell/DL-FIND
12:20-12:40 Paul Sherwood (Daresbury) on CCP1GUI
12:40-13:00 David Worth (RAL) on software engineering/development

13:00-14:00 Lunch

14:00-15:30 Workshop session 1
15:30 Coffee/tea/biscuits arrive
15:30-16:30 Workshop session 2

The workshops will involve:
* Ross from OSS Watch working with people to develop a plan for building community around their project
* Johannes from the ChemShell project working with people who want to connect their software to the ChemShell software
* Paul from the CCP1GUI project working with people who want to build an interface with CCP1GUI or associated technology
* Neil from OMII-UK on how community can help the software you’ve built be sustained/supported/extended
* David from CCPForge working with people who want to get their code into a version control system and understand how to use it going forward
* Stuart from OSS Watch working with people who want to understand or deploy community building tools
* Stuart from OSS Watch or David from Daresbury working with people who want to understand particular software engineering techniques
* Stuart or Ross from OSS Watch working with people who want licensing advice for software or content
* Neil from OMII-UK on ways you could benefit from working with OMII

The OSS Watch / CCPForge Community Day is being held on Friday 11 January 2008 at the Rutherford Appleton Laboratory in Oxfordshire.

If you are interested in coming along, please register with Dr David Worth by Tuesday 8 January 2008.

Changing Licences

There’s a story in Wired about licences on Flickr photos. The problem is that Flickr requires users to tag each photo with one of a range of licences (including “all rights reserved”). Users can change the licence at will, either on individual photos or on thousands at once.

If a third party takes a Creative Commons licensed image, reuses it under the terms of the licence and the user subsequently changes the licence on the image on the Flickr site, difficulties arise.

The Creative Commons licences are perpetual, containing words like:
Subject to the terms and conditions of this License, Licensor hereby grants You a worldwide, royalty-free, non-exclusive, perpetual (for the duration of the applicable copyright) license to exercise the rights in the Work as stated below:

So the third party can continue using the image under the Creative Commons licence indefinitely, provided they have a local copy. The user (the copyright owner) has now withdrawn their offer of the image under that licence, so the third party may find it hard to prove they are entitled to use the image, unless they’ve done their homework and kept some form of log of the licence. If they haven’t kept a copy, getting a new copy of the image under the old licence is likely to be impossible.

I have no idea what happens in the case where an image is pulled dynamically from Flickr and built into a composite in a way which breaches the new licence. Presumably such dynamic systems need to check the licence every time, as is entirely possible using the Flickr API.
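A per-use licence check might look something like the following. This is only a sketch, assuming the `flickr.photos.getInfo` method of the Flickr REST API, which reports a numeric licence id for a photo. The endpoint URL, the helper names and the idea of comparing against a `recorded_licence_id` are illustrative assumptions, and you would need your own API key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed REST endpoint; check the current Flickr API documentation.
FLICKR_REST = "https://api.flickr.com/services/rest/"


def build_getinfo_url(api_key, photo_id):
    """Build a flickr.photos.getInfo request URL asking for a JSON reply."""
    params = {
        "method": "flickr.photos.getInfo",
        "api_key": api_key,
        "photo_id": photo_id,
        "format": "json",
        "nojsoncallback": "1",
    }
    return FLICKR_REST + "?" + urlencode(params)


def licence_id_from_response(body):
    """Extract the licence id from a getInfo JSON reply, or None on failure."""
    data = json.loads(body)
    if data.get("stat") != "ok":
        return None
    return str(data["photo"]["license"])


def licence_unchanged(body, recorded_licence_id):
    """True only if the photo still carries the licence id recorded at first use."""
    return licence_id_from_response(body) == str(recorded_licence_id)


def fetch_photo_info(api_key, photo_id):
    """Fetch the raw JSON reply (needs network access and a real API key)."""
    with urlopen(build_getinfo_url(api_key, photo_id)) as response:
        return response.read().decode("utf-8")
```

A `False` result here is a prompt to stop pulling the image dynamically and look at the new terms. It says nothing about copies already obtained under the earlier licence.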

The take home message? Keep track of what software and content you’re reusing, and archive a local copy of everything you use.
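One lightweight way to keep such a record is an append-only log written at the moment you download something. The sketch below is an assumption about what a useful entry might contain (source URL, licence name, retrieval time and a checksum of the archived copy); the function names and the JSON-lines file format are hypothetical choices, not an established convention.

```python
import hashlib
import json
from datetime import datetime, timezone


def licence_record(source_url, licence, content):
    """Describe one reused item: where it came from, under what licence, and when."""
    return {
        "source_url": source_url,
        "licence": licence,
        "retrieved": datetime.now(timezone.utc).isoformat(),
        # The digest ties this record to the exact bytes archived locally.
        "sha256": hashlib.sha256(content).hexdigest(),
    }


def append_to_log(log_path, record):
    """Append the record as one JSON line so earlier entries are never rewritten."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
```

Calling `append_to_log("licences.jsonl", licence_record(url, "CC BY 2.0", image_bytes))` alongside saving the file itself would give you both the local copy and a note of the terms it was offered under.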

Open access bill in the USA looks likely to pass

Open access looks about to pass a significant milestone with a bill in the US Congress which requires open access to National Institutes of Health (NIH) funded outputs. While NIH funded research is only a small fraction of the peer review funded research globally, it’s one of the largest coordinated research programs, with huge inertia both externally and internally. It seems likely that all significant medical and genetics peer-reviewed publication forums will be open access in the near future. NIH also funds work in a whole range of disciplines which impact on human health, so they’ll receive an open access boost too.

Such a big win is built on the work of a whole lot of individuals and groups world-wide, including Stevan Harnad (long term open access über-evangelist) and the JISC funded Sherpa, OpenDOAR and ePrints projects. Congratulations guys.


FOAF + OpenID is a semantic web attempt to solve the problem of blog spam. The idea is that those people who have FOAF files and OpenID identities can identify each other’s networks of friends, colleagues and acquaintances using their FOAF files and authenticate the individuals using OpenID.

I’ve found a flaw in this: I could have (and probably should have) a link in my FOAF file to a semantic wiki representation of myself, which is (in the way of wikis) world-writable. Spammers could easily edit the wiki to insert a link from me to them, which would let them become part of the group and spam us.

There are a number of fixes for this:

  • Check the metadata in each FOAF file to ensure that it claims to be written by the subject of the file (which wouldn’t be the case for the wiki). This would require many FOAF/RDF generation tools to be updated.
  • Add trust attributes to external links in FOAF files. This would also require many FOAF/RDF generation tools to be updated.
  • Compile a list of known world-writable RDF sources and use it to black-list them. This would always be playing a game of catch-up and some sites might slip through.
  • Require trusted users not to link to world-writable RDF sources (or sources of RDF that harvest from the wider web). This requires that the semantic web workers work in a walled garden and not link outside it into the wider web.

None of these are easy.
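To make the black-list option concrete, here is a minimal sketch of filtering the links found in a FOAF file before following them. The host name in the list and the function names are hypothetical, and the sketch deliberately ignores how the list would be compiled and kept up to date, which is the hard part.

```python
from urllib.parse import urlparse

# Hypothetical black-list of hosts known to serve world-writable RDF.
WORLD_WRITABLE_HOSTS = {"wiki.example.org"}


def is_trusted_link(url, blacklist=WORLD_WRITABLE_HOSTS):
    """Reject links whose host, or any parent domain of it, is black-listed."""
    host = (urlparse(url).hostname or "").lower()
    parts = host.split(".")
    # Every suffix of the host name, e.g. a.b.c -> {a.b.c, b.c, c}.
    candidates = {".".join(parts[i:]) for i in range(len(parts))}
    return bool(host) and not (candidates & set(blacklist))


def trusted_links(urls, blacklist=WORLD_WRITABLE_HOSTS):
    """Keep only the FOAF links that are safe to follow when building the network."""
    return [url for url in urls if is_trusted_link(url, blacklist)]
```

Anything not on the list is followed, so a newly created wiki gets through until someone notices it, which is exactly the catch-up problem described above.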

Somehow this whole thing reminds me of the OpenPGP web-of-trust, without the cryptographic underpinnings.

Access from Prisons

This morning at MoodleMoot I was talking to Niall Sclater (Director of the Open University’s VLE Programme) about barriers to migrating the last of the Open University’s paper courses to electronic courses via Moodle, and he pointed me to a great pilot underway in some of the roughest prisons in London.

The POLARIS project trial is rolling out access to educational websites into a number of London prisons, including Wormwood Scrubs and Belmarsh. Apparently Belmarsh, with its population of very high security inmates, is less of a problem than some of the others, which have much higher rates of turnover.

The rolling out of access into such places puts a whole new emphasis on the security of the applications used in educational institutions. It’s worth noting that the OU (for whom prisoners represent a small but significant number of students) has just spent a great deal of time and effort rewriting the roles and security in Moodle.

New OSI licences from Microsoft

The Open Source Initiative have announced the approval of a pair of licences from Microsoft. The Microsoft Public License (Ms-PL) and the Microsoft Reciprocal License (Ms-RL) are:

…refreshingly short and clean, compared to, say, the GPLv3 and the Sun CDDL. They share a patent peace clause, a no-trademark-license clause, and they differ only in the essential clause of reciprocation. (slashdot)

This is another step on the road to open source for Microsoft, a road already mapped out with projects like WiX, an open source licensed packager for Microsoft Windows systems. Hopefully these new licences will mean that more projects native to Microsoft platforms (such as those at CodePlex) will use an open source licence.

Personally I’m a little worried about the Ms-RL’s use of the word “file,” a technical term used without definition, which a sufficiently well-paid lawyer could probably cause problems over: “What if it’s in an email not a file?” “What about when it’s embedded in hardware?” etc. But then I’m not a lawyer, so I may have the wrong end of the stick.

Microsoft releasing more source code

Microsoft is set to release source code to the .NET class libraries under the Microsoft Reference License. This is not an open source licence, and is not even close to one:

“Reference use” means use of the software within your company as a reference, in read only form, for the sole purposes of debugging your products, maintaining your products, or enhancing the interoperability of your products with the software, and specifically excludes the right to distribute the software outside of your company.

(A) Copyright Grant- Subject to the terms of this license, the Licensor grants you a non-transferable, non-exclusive, worldwide, royalty-free copyright license to reproduce the software for reference use.

So why is Microsoft doing this?

Having worked at a Microsoft partner in the distant past, I’m guessing that the real reason for this is completely unconnected to the open source community and more related to easing the lines of communication with Microsoft partners, who’ve had (very restricted and cumbersome) access to code such as this for a long time. By making the code publicly available to everyone under a licence giving the same rights that partners have had for a long time, Microsoft greatly eases the communication of that code to partners, particularly those who’re looking jealously at the free flow of information and code between Sun and their partners under increasingly liberal licences.

Miguel de Icaza has an excellent post on what this might mean for the open source Mono project (very little). eWeek has an even more sceptical article.

ICANN to start Internationalised Domain Name testing

The ICANN plan to roll out domain names in non-Western scripts is about to release trial top-level domain names using the word ‘test’ translated into Arabic, Persian, Chinese (simplified), Chinese (traditional), Russian, Hindi, Greek, Korean, Yiddish, Japanese and Tamil. The top-level trial domains will be retired once production domains in these scripts are rolled out:

It is planned that the .test labels will be kept in the DNS root zone and resolving with example
positioned at the second level (i.e., translations of example.test) until registrations in a
corresponding script are available in a production environment. Although it is anticipated that
the evaluation facility will be of short-term utility the lifespan of the evaluation may be
extended if it is demonstrated that target groups will derive continuing benefit from it.

I’m expecting that many, many applications which touch DNS are going to have to be patched to fix bugs that will be shown up by this change, so expect a wave of patches and updates 2-6 months from now. Well-behaved applications which handle DNS by calling system libraries are likely to be OK, since this change has been in the wind for ~3 years and the maintainers of specialist libraries should be prepared for it.
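As an illustration of what handing the work to a library looks like, the sketch below uses Python’s built-in `idna` codec to convert an internationalised hostname to its ASCII form before passing it to the system resolver. The function names are my own, and the codec implements the older IDNA 2003 rules, so treat this as an assumption-laden example rather than a recommendation for any particular application.

```python
import socket


def to_ascii_hostname(hostname):
    """Convert an internationalised hostname to its ASCII ("xn--") form."""
    # Python's built-in "idna" codec implements the older IDNA 2003 rules.
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(ascii_hostname):
    """Convert an ASCII-encoded hostname back to Unicode for display."""
    return ascii_hostname.encode("ascii").decode("idna")


def resolve(hostname, port=80):
    """Resolve via the system resolver, passing only the ASCII form to it."""
    return socket.getaddrinfo(to_ascii_hostname(hostname), port)
```

For example, `to_ascii_hostname("bücher.example")` gives `"xn--bcher-kva.example"`, while a plain ASCII name such as `example.test` passes through unchanged. Applications that assume hostnames are ASCII everywhere else, in logs, validation or display code, are the ones likely to show bugs.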

I’ve previously written about ICANN’s failure to move on this and other topics, but it looks like I may have been too quick to criticise them.

Users Report and Feedback on the JISC Blogs Pilot

At the recent JISC Services Communications Group Upskilling Day, I gave a presentation on our use of this, the JISC Blogs Pilot.

My slides have been up on the Comms Group wiki for a while, but that’s password protected, so I’ve published them on the OSS Watch website too, as PDF and HTML.

In summary, the most important key to having a successful blog is knowing who you’re writing for, why you’re writing for them and what you’re writing about. The pilot service had a number of shortcomings which Matt Dukes assured us in his presentation are about to be rectified when the service goes live.

Email management: Saving searches in Thunderbird

At the JISC Comms upskilling day yesterday, Ross described how to create a saved search that looks for occurrences of your name across all mail folders on a mail server. This means that all messages from mailing lists can be filed into per-list mail boxes using filters, and those mentioning your name can then be re-aggregated into a virtual folder. That way you don’t miss emails that follow up something you posted weeks ago, or emails to obscure mailing lists in which someone mentions you by name.

Go to File -> New -> Saved Search … , fill in the dialog box and choose which mail boxes you want to be searched.

The saved search dialogue box

Despite the warning, I’ve not found “Search Online” slow (but then I only have a few thousand emails on this server).

Saved searches in Thunderbird do not copy or move the emails as filters do, so when you read, delete or flag the emails in the virtual folder that the saved search creates, you are actually acting on the emails in their original folder.
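The difference between a filter and a saved search can be shown with a toy model. This is not how Thunderbird implements virtual folders; it is a sketch in which folders are plain lists of message text, the function names are made up, and the saved search returns references to messages rather than copies.

```python
def saved_search(folders, name):
    """Return (folder, index) references to messages mentioning `name`.

    `folders` maps a folder name to a list of message texts. Nothing is
    copied: each hit points back at the message in its original folder.
    """
    needle = name.lower()
    return [
        (folder, index)
        for folder, messages in folders.items()
        for index, text in enumerate(messages)
        if needle in text.lower()
    ]


def delete_hit(folders, hit):
    """Deleting through the virtual folder removes the original message."""
    folder, index = hit
    del folders[folder][index]
```

Because each hit is only a pointer, deleting one through the search result removes the message from its per-list folder, and re-running the search no longer finds it.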

Your dialog may look slightly different to the one pictured here, which is from Thunderbird on Ubuntu.