Untangling the university web presence with OpenScholar

This is a guest post from the OpenScholar team at Gizra. A lot of public sector organisations have moved recently to an open source CMS solution, citing the benefits not just in cost but also in flexibility, and it’s great to see examples of universities following suit. If your university has a similar experience, tell us about it in the comments!

As in many fields, the introduction of the web into higher education took place gradually and unevenly. This led many academic staff, projects and even whole departments to build their own web presence independently of each other, using their personal or department budgets to hire external help and grad students to create their websites.

Naturally, this led fairly quickly to the Ivory Tower looking more like the Tower of Babel in terms of web presence: universities found they had scores of sites running on various incompatible environments that were increasingly difficult to maintain, update or patch for security – a situation that is still bogging down many academic IT departments.

Many institutions are attempting to fix this by standardizing on a single CMS system, often an Open Source one. When Harvard University faced the problem, it decided to take it one step further and create a CMS focused on academic use.

As a basis, it picked Drupal, one of the most widely used Open Source CMS solutions, which already had a strong academic presence and powers civic and commercial websites such as WhiteHouse.gov, The Economist, Twitter’s developer website and many others. Harvard used Drupal as the base for its own distribution named OpenScholar, which essentially bundles specific backend modules (e.g. bibliography handling) along with a user interface tailored for users in academia.

As the project progressed, we at Gizra were called in for a short consulting gig based on our experience releasing the Organic Groups module for Drupal. This then morphed into a three-year engagement, at its peak employing four full-time developers on our end and an equal number on Harvard’s.

The result is a system that aims to solve the woes of both content creators and IT admins. Academic staff are provided with an intuitive UI for smooth website creation. Templates already incorporate the common (and some less common) elements used in such sites. For example, a professor can sign in and have a basic template created. She can then choose to have a calendar on the right sidebar, a blog in the middle, a bibliography page linked in the footer and so on – all with an easy-to-use drag & drop interface.

For the IT side, this helps reduce the amount of user support required, but more critically the system also provides a single, unified codebase upon which all the institution’s websites are built. Upgrading to a new version or applying a security patch is done in one place, as opposed to keeping dozens of different environments up to date.

OpenScholar now runs all of Harvard University’s websites – 5120 at time of writing – and is starting to be used at Princeton, Berkeley, Virginia Tech and others. Drupal’s excellent multilingual support is helping it spread worldwide, and we’ve recently helped the Hebrew University in Jerusalem add support for right-to-left text, enabling easy creation and management of websites in Hebrew, Farsi, Arabic and other languages.

Leading Drupal cloud hosting providers Acquia and Pantheon now offer a turnkey solution for easily setting up highly optimized, elastic OpenScholar environments without the need for local installation and maintenance at all. For organizations wishing to keep their servers on-site, we’re collaborating with Zend Technologies on a packaged solution that will allow installing a complete secure and optimized OpenScholar environment locally from scratch.

Following the success at Harvard, OpenScholar continues to develop its core as well as adding more UI elements in response to demands from professors and departments. A RESTful API is now being developed which will allow easier integration with existing systems as well as a smoother and more sophisticated front end.

For more information on OpenScholar, visit the OpenScholar website.

Open Source Software Licensing Trends

This is a guest post from Jim Farmer, Chairman of Instructional Media + Magic Inc. Jim has also written a series of feature articles on open source for Informa’s London-based Intellectual Property Magazine.

Higher education has traditionally been a knowledge “sharing” environment. Early software was exchanged without license and, in practice, without restrictions. As the monetization of intellectual property, including software, becomes pervasive, more restrictive software licenses have been introduced and enforced. These licenses impose legal duties on the user of “open source software” that could be unexpected and have undesirable consequences.

The first license restrictions were a series of “copyleft” licenses that imposed a duty on a user who makes modifications to open source software to share these modifications with others. In addition, the terms and conditions of the license on the modified software are required for all subsequent users as well. Richard Stallman is credited with launching the free software movement. He used software licensing to enforce this desired behaviour. In practice the open source community was already sharing software, so the “copyleft” licenses were not a substantial burden. Disputes were avoided by an email or telephone request, almost always honoured.

Some open source software from higher education became commercial software products with proprietary licenses. Examples include North Carolina State University’s statistical package that led to SAS, and the University of Chicago’s package that led to SPSS. Their contribution was documentation and standardized stable versions of the software. Subsequently this strategy was used by Red Hat to introduce Red Hat Linux.

Extending Stallman’s practice of imposing duty, the recent and rarely used Affero license has imposed additional and potentially burdensome restrictions on distribution of modifications made to software used as a service over a network.

Higher education is becoming more sensitive to these license restrictions. There are three recent licensing choices that illustrate the trade-off decisions that were made.

edX Seeks More Software Users

Harvard University and MIT had adopted the Affero software license for their edX learning technology platform. In September, Ned Batchelder, edX Software Architect, wrote “…one license does not fit all purposes, which is why we’ve decided to relicense one part, our XBlock API, under Apache 2.0.”

As part of its license compliance software and services, Black Duck compiles use of the various licenses. Using this data the edX shift from restrictive to permissive licensing is illustrated in Figure 1. The data suggests edX’s action was consistent with trends in open source licensing.

Graph showing license usage in open source software; Affero is less than 1% and ranked 16th most popular; Apache 2.0 is ranked 3rd most popular, after GPL 2.0 and MIT.

Figure 1 – Use of Open Source Software Licenses

Batchelder describes the motivation for the change:

The XBlock API will only succeed to the extent that it is widely adopted, and we are committed to encouraging broad adoption by anyone interested in using it. For that reason, we’re changing the license on the XBlock API from AGPL to Apache 2.0.

 

The Apache license is permissive: it lets adopters and extenders do what they want with their changes. They can release them under a copyleft license like AGPL, or a permissive license like Apache, or even keep them closed-source.

Using Black Duck data for 2009 and 2015, the licensing trends in Figure 2 show the sharp increases in use of the MIT and Apache permissive licenses.

Figure 2. Trends in license use from 2009-2015, showing increases for MIT and ASL, decrease in GPL and LGPL

Figure 2 – Change in Use 2009 to 2015

According to Black Duck’s data on the use of software licenses, Apache 2.0 – used by 19% – has moved from its 7th ranking to 3rd most used software license. The GNU General Public License is still the most frequently used at 25%. However the GPL license has lost 21.4% of user share since 2009 and Apache has gained 12.4%. The least restrictive MIT license grew from 3.3% to 19.0% during the same period to become the second most frequently used open source software license.
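
The arithmetic behind these figures can be checked with a short sketch. It assumes the quoted gains and losses are percentage points of share, which the text does not state explicitly, so the 2009 values it derives for GPL and Apache are implied by that assumption rather than taken from Black Duck.

```python
# Shares quoted in the text for 2015 (percent of projects).
share_2015 = {"GPL": 25.0, "Apache 2.0": 19.0, "MIT": 19.0}

# Changes since 2009 quoted in the text.
# ASSUMPTION: these are percentage points, not relative changes.
change_since_2009 = {"GPL": -21.4, "Apache 2.0": 12.4}

# The 2009 share for MIT is quoted directly in the text.
mit_share_2009 = 3.3


def implied_2009_share(license_name):
    """Back out the 2009 share from the 2015 share and the quoted change."""
    return round(share_2015[license_name] - change_since_2009[license_name], 1)


implied_2009 = {name: implied_2009_share(name) for name in change_since_2009}
mit_gain = round(share_2015["MIT"] - mit_share_2009, 1)

for name, value in implied_2009.items():
    print(f"{name}: implied 2009 share {value}%, 2015 share {share_2015[name]}%")
print(f"MIT: gain of {mit_gain} percentage points")
```

Under that reading, GPL would have held a little under half of all projects in 2009, and the MIT license shows the largest gain of the three.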

The MIT license has few restrictions: you cannot sue MIT on the grounds that the software didn’t do what you thought it should (“fitness of purpose”), and it mandates attribution via reproduction of the copyright statement.

There is also a difference based on the purpose of the license. Figure 3 shows the differences in use by software developers of open source software, on downloads of the software selected for use, and what companies are using. For enterprise use the Apache license is most used.

Figure 3. License usage by purpose.

Figure 3 – License Use by Purpose

Donnie Berkholz at RedMonk quantified the shift toward permissive licensing using data from July 2012. He summarized his results using the ratio of permissive to copyleft licenses. The results are shown in Figure 4. Permissive licenses for both Java and JavaScript, two of the most frequently used languages, became more frequently used than copyleft licenses in 2008. Cumulatively, by 2010 the majority of open source software licenses were permissive licenses.
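
As a rough illustration of the ratio described above, the following sketch classifies a set of license counts and divides permissive by copyleft. The counts are invented for the example and are not RedMonk or Black Duck data; the function name is hypothetical.

```python
# Common license identifiers grouped by type.
PERMISSIVE = {"MIT", "Apache-2.0", "BSD-3-Clause"}
COPYLEFT = {"GPL-2.0", "GPL-3.0", "LGPL-2.1", "AGPL-3.0"}


def permissive_to_copyleft_ratio(counts):
    """Return permissive projects divided by copyleft projects.

    Licenses in neither group (or projects with no license) are ignored.
    """
    permissive = sum(n for lic, n in counts.items() if lic in PERMISSIVE)
    copyleft = sum(n for lic, n in counts.items() if lic in COPYLEFT)
    if copyleft == 0:
        raise ValueError("no copyleft projects in sample")
    return permissive / copyleft


# Hypothetical sample: number of projects per license.
sample = {"MIT": 30, "Apache-2.0": 20, "GPL-2.0": 40, "LGPL-2.1": 10, "Unlicensed": 5}
ratio = permissive_to_copyleft_ratio(sample)
print(ratio)
```

A ratio above 1 means permissive licenses outnumber copyleft licenses in the sample; tracking the ratio per year gives the kind of trend line shown in Figure 4.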

Figure 4: Upwards trend for permissive licensing (source: redmonk)

Figure 4 – Shift of Open Source Software to Permissive Licensing.

In December 2014 ZDNet’s Steven J. Vaughan-Nichols summarized:

“The three primary permissive license choices (Apache/BSD/MIT) … collectively are employed by 42 percent. They represent, in fact, three of the five most popular licenses in use today.” These permissive licenses have been gaining ground at GPL’s expense. The two biggest gainers, the Apache and MIT licenses, were up 27 percent, while the GPLv2, Linux’s license, has declined by 24 percent.

He also reported that in July 2013 Aaron Williamson, senior staff counsel at the Software Freedom Law Center, documented that 85.1 percent of GitHub programs had no license. He commented:

Yes, without any license, your code defaults to falling under copyright law. In that case, legally speaking no one can reproduce, distribute, or create derivative works from your work. You may or may not want that. In any case, that’s only the theory. In practice you’d find defending your rights to be difficult.

The primary edX learning system continues to use the Affero license. Apereo Foundation’s Sakai learning system is licensed under Apache; Moodle uses the GPL license.

edX’s move to a less restrictive license will likely increase use. To gain additional users, perhaps the Apache license should be used for the edX learning system as well.

Kuali Foundation Seeks to Protect Cloud User Market

Administrative software being developed by the participants in the Kuali Foundation was licensed under the Educational Community License (ECL), an OSI (Open Source Initiative) approved special purpose license for higher education software based on the Apache license. In August the Kuali Foundation Chair Brad Wheeler announced “… the Kuali Foundation is creating a Professional Open Source commercial entity.” He also said “Kuali software now and in the future will remain open source and available for download and local implementations.” The same day the Kuali Foundation posted Brad Wheeler’s blog Kuali 2.0 FAQs. He wrote “The current plan is for the Kuali codebase to be forked and re-licensed under Affero General public License (AGPL). AGPL allows customers to download and use the code at will, but requires partners trying to monetize the software to contribute code changes back to Kuali. This is intended to discourage partners/Kuali Commercial Affiliates (KCAs) from receiving revenue from hosting Kuali software, but does not prohibit them.”

The Foundation asked its participants to transfer their software development to Kuali Inc. and use its proposed cloud-based systems. The Kuali Foundation continues to make available the current version of its software under ECL. The cloud versions also include software proprietary to Kuali Inc.

On September 8, 2014, Chuck Severance wrote:

… the successful use of AGPL3 to found and fund “open source” companies that can protect their intellectual property and force vendor lock-in *is* the “change” that has happened in [Kuali’s] past decade that underlies both of these announcements and the makes a pivot away from open source and to professional open source an investment with the potential for high returns to its shareholders.

Severance suggested how to achieve “high returns:”

First take VC [venture capitalists] money and develop some new piece of software. Divide the software into two parts – (a) the part that looks nice but is missing major functionality and (b) the super-awesome add-ons to that software that really rock. You license (a) using the AGPL3 and license (b) as all rights reserved and never release that source code.

 

You then stand up a cloud instance of the software that combines (a) and (b) and not allow any self-hosted versions of the software which might entail handing your (b) source code to your customers.

On October 2 at Educause, reporting for e-Literate on the Kuali session, Phil Hill identified “(b):”

The back-and-forth involved trying to get a clear answer, and the answer is that the multi-tenant framework to be developed / owned by KualiCo will not be open source – it will be proprietary code. I asked Joel Dehlin for additional context after the session, and he explained that all Kuali functionality will be open source, but the infrastructure to allow cloud hosting is not open source.

Referring to multi-tenancy, Inside Higher Ed’s Carl Straumsheim described the purpose of “(b)” confirming Chuck Severance’s scenario:

“I’ll be very blunt here,” [Kuali’s Barry] Walsh said. “It’s a commercial protection — that’s all it is.”

In a 10 September blog post, Locked into Free Software? Unpicking Kuali’s AGPL Strategy, OSS Watch’s Scott Wilson considered the implications of AGPL. He pointed out: “The GPL license requires any modifications of code it covers to also be GPL if distributed [emphasis added].” The use of a cloud-based service is not considered distribution of code, so a user could offer a cloud service without making modifications available to the community. Wilson wrote:

The AGPL license, on the other hand, treats deployment of websites and services as “distribution”, and compels [his emphasis] anyone using the software to run a service to also distribute the modified source code.

Wilson also reported that Bradley Kuhn, one of the original authors of AGPL, said in a talk at Open World Forum in 2012 that “… at that time, some of the most popular uses of AGPL were effectively ‘shakedown practices’ (in his words).” This unfortunate characterization may rarely be true.

The AGPL license does meet the Open Source Initiative’s criteria for an open source license. But the pressures of monetization cause its terms to be used in ways inconsistent with the connotation of “open source.”

Oracle Builds a Community?

On September 29th at Oracle World, Oracle announced its Oracle Student Cloud and its investment in the Oracle Customer Strategic Design Program. Embry-Riddle Aeronautical University, University of Texas System and the University of Wisconsin-Madison will participate “to provide guidance and domain expertise that will help shape the design and development of Oracle Student Cloud.” A press release described the initiative:

  • Each university will work with Oracle through significant milestones and releases, providing guidance and expertise to develop an industry-leading product. The growth of non-traditional programs is an important trend for these customers, and the first release of Oracle Student Cloud is expected to include flexible core structures and an extensible architecture to manage a variety of traditional and non-traditional educational offerings.
  • Oracle Student Cloud will feature a compelling mobile user interface that enables customers to extend, brand, and differentiate the student experience for each institution.
  • The first phase of Oracle Student Cloud is designed to support the core capabilities of enrolment, payment, and assessment. Oracle Student Cloud will embed CRM-based functionality throughout the solution to promote engagement and collaboration, along with a business intelligence foundation to provide customers with actionable insight into their student operations.

The Design Program could be interpreted as combining the contributions of a community as found in open source development, and a proprietary model that would use the standard Oracle license. If successful this innovation could benefit both Oracle and colleges and universities.

In an October 7 blog Cole Clark, Global Vice President Education and Research industry, reflected on Oracle World. He included Stanford University as a participant. He also said a fifth partner in Europe would be named the following week at the Utrecht NL Higher Education User Group meeting.

He wrote:

We believe this [Oracle Customer Strategic Design Program] gives us a broad spectrum of the higher ed panoply from which to draw a great deal of insight and council [counsel] as we build the next generation student system in the cloud with mobile and social attributes at the core of the development initiative.

He also described the role of open source software:

Don’t get me wrong; there are definitely areas where Kuali (and other open source initiatives) fill gaps that the private sector will likely never pursue – Coeus [research administration] and the open library environment are excellent examples.  Parts of Unizen may be another.  But in the broader areas … where ample (and growing) competition exists to drive innovation up and costs down, there is no justification for investing shrinking resources in higher education on software development and support.

The description of the contribution expected of the participants (guidance and domain expertise), together with their diverse needs and competencies, suggests functional requirements and designs of student services that improve the Oracle software. The reference to the growth of non-traditional programs demonstrated sensitivity to unfilled needs of current student systems. If these are incorporated into the Oracle product, it would benefit their college and university customers, and might be available earlier than other alternatives.

Incorporating customer feedback on products is becoming a standard industry practice for consumer goods. If broadly implemented Clark’s innovation could change the relationship between higher education and software suppliers.

There is one concern. Oracle declined to answer the question of whether the participants would be required to sign non-disclosure agreements. If they are, many of the benefits of the broad open communications found in open source development projects may be lost.

Observations

  1. The data on the shift from restrictive to permissive licensing suggests, but does not confirm, broader participation and use of software using permissive licenses. edX may want to consider relicensing the learning platform itself using an Apache license to attract more users of its software.
  2. Kuali Inc.’s experience introducing the Affero license demonstrates how restrictions can be perceived based, in part, on the intent of the copyright holder. The many yet-undefined terms that could give a copyright holder a “cause of action” against a user present risks, and the advice of a licensing specialist or an intellectual property attorney may be needed to fully understand them.
  3. Oracle Higher Education may benefit colleges and universities by introducing broad collaboration similar to open source communities. That should be encouraged. But implementation may be fragile in the sense that participants, users, and prospects are likely sceptical of success. Complete transparency and open communication about the work of the Strategic Design Program may make the true purpose better known and results more widely used.

The emergence of “intellectual property”—software licenses in these cases—has created monetary incentives for copyright holders. Assessment of licensing restrictions and risks should now be incorporated into all information technology decisions.

This guest post is (c) Jim Farmer, and is licensed under the Creative Commons Attribution 4.0 International license. The graphic in Figure 4 is by Donnie Berkholz of RedMonk, and licensed under the Creative Commons Attribution ShareAlike 3.0 license.

Open or Fauxpen? Use the OSS Watch Openness Rating tool to find out

Open  sign with "Sorry We're Closed" sign beneath it

Is your software open or fauxpen?

To help you find out, OSS Watch, in partnership with Pia Waugh, developed the Openness Rating.

Using a series of questions covering legal issues, governance, standards, knowledge sharing and market access, the tool helps you to identify potential problem areas for users, contributors and partners.

Unlike earlier models designed to evaluate open source projects, this model can be applied to both open and closed source software products.

We’ve used the Openness Rating internally at OSS Watch for several years as a key part of our consultancy work, but this is the first time we’ve made the app itself open for anyone to use. It requires a fair bit of knowledge to get the most out of it, but even at a basic level it’s useful for highlighting questions that a project needs to be able to answer.

Get started with the Openness Rating tool.

Photo by Alan Levine used under CC-BY-SA.

fOSSa 2014: crypto currencies, crowdsourcing research, and hardware hacking in Rennes

fOSSa 2014 will be in Rennes, France, on 19, 20 and 21 November 2014.

This year the event has three themes:

  •  Crypto Currencies in context & let’s look under the hood  
  • Open knowledge creation : Crowdsourcing scientific research 
  • The new hardware bazaar

For more information and to register visit the fOSSa website.

Sadly I won’t be able to make it this year, which is a shame as it’s a great event with lots of interesting topics.

New open source organisation launches in China

This week saw the launch of KAIYUANSHE (开源社), an association comprising both companies and universities with the aim of providing developers in China with education, tools and services to foster a healthy and robust open source ecosystem.

From the outset, KAIYUANSHE is working through two core programs. The first, Open Source Star, helps software developers apply an open source license to their projects, and specifically recognizes those that use one of the several available OSI-approved licenses.

The second program is called Open Source Ambassadors. Through this program, the alliance aims to recognize individuals and organizations who are actively engaged in community efforts, for their work to champion best practices and collaboration.

At OSS Watch here at the University of Oxford we’ve also been collaborating with the new initiative, providing access to our content and tools so that they can be localised and translated. You can find Chinese versions of some of our briefing notes on the KAIYUANSHE website already, and I’m sure more will soon follow.

Initial members of the association include Ubuntu Kylin, Microsoft Open Technologies, GitCafe, CSDN and Mozilla. For more information visit the KAIYUANSHE website.

You can also check out the press coverage of the launch (in Chinese) at Evolife, ZOL and ChinaByte.

OggCamp14

Last weekend I organised the first OggCamp to be held in Oxford. OggCamp is an annual free culture unconference, where 300 people with a variety of interests related to open source, open hardware, creative commons and more meet up to share projects, ideas and experience.

OggCamp name plate made with a Handibot CNC router

As an unconference, the vast majority of the schedule is decided on the day. This means that we never really know what’s going to happen, but we always have a great range of interesting talks, and this year was no different. Talks this year included a demo of a hydrogen-powered Raspberry Pi, the beginnings of a project to create an open source wireless presentation dongle, software-defined radio, and several live podcast recordings.

Alongside our 3 presentation tracks, we had a fantastic exhibition hosting stands from the event’s sponsors as well as a number of local hackspaces. Projects being shown off included a vintage teletype connected to Twitter, an open source CNC router, a home heating automation system, and a persistence-of-vision display using a bike wheel.

The result of all of this was a fantastic weekend full of fun and inspiration. Next year’s event isn’t in the works yet, but I’m already excited for next time.

The Wikipedia hoover

Six frogs from a paper available on PubMed

“Pick a frog, any frog” – an image automatically imported from PubMed to Wikipedia

Back in August Wikimania came to London and I heard some interesting discussion there of Wikipedia’s approach to open access materials and the tools they are developing to support that approach. This github repo contains some interesting open source projects designed mainly to automate the process of identifying cited external resources that can be copied into Wikipedia’s repositories of supporting material wikisource (for texts) and upload.wikimedia.org (for pictures, video and sound).

open-access-media-importer for example is a tool which searches the online repository of academic biology papers PubMed for media files licensed under the Creative Commons attribution licence and copies them into the wikimedia repository. Where the files are in media formats that are encumbered by patents, the script also attempts to convert them to the patent free ogg format framework.
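
The selection logic described above can be sketched roughly as follows. This is not the code of open-access-media-importer: the `MediaFile` record, the licence label and the list of formats needing conversion are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass

# ASSUMPTION: licence labels and formats are illustrative, not the tool's own lists.
ACCEPTED_LICENCES = {"CC-BY"}
NEEDS_CONVERSION = {".mp4", ".mov", ".mp3"}


@dataclass
class MediaFile:
    """Hypothetical record for a supplementary media file attached to a paper."""
    filename: str
    licence: str


def plan_import(files):
    """Keep only acceptably licensed files and decide whether each needs conversion."""
    plan = []
    for media in files:
        if media.licence not in ACCEPTED_LICENCES:
            continue  # skip anything not under an accepted licence
        extension = os.path.splitext(media.filename)[1].lower()
        action = "convert to Ogg" if extension in NEEDS_CONVERSION else "copy as-is"
        plan.append((media.filename, action))
    return plan


files = [
    MediaFile("frogs.png", "CC-BY"),
    MediaFile("frog-call.mp3", "CC-BY"),
    MediaFile("dissection.mp4", "All rights reserved"),
]
plan = plan_import(files)
print(plan)
```

Separating the licence check from the format decision keeps the part with legal consequences small and easy to review.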

In the same GitHub repo, the OA-Signalling project presents a developing framework for flagging open access academic papers using standardised metadata, perhaps integrated in future with the systems being developed by DOAJ and CrossRef. This Wikipedia project page explains further:

Some automated tools which work with open access articles are already created. They impose nothing upon anyone who does not wish to use them. For those who wish to use them, they would automate some parts of the citation process and make an odd Wikipedia-specific citation which, contrary to academic tradition, notes whether a work is free to read rather than subscription only. The tools also rip everything usable out of open access works, including the text of the article, pictures or media used, and some metadata, then places this content in multiple Wikimedia projects including Wikimedia Commons, Wikisource, and Wikidata, as well as generating the citation on Wikipedia.

 

During the sessions in which open access and these tools were discussed, many participants expressed strong dislike for academic publishers and their current closed practices. Clearly for many the idea that Wikipedia could become the de facto platform for academic publication was a charming idea, and more open access was seen as the best route to achieving this.

Many years ago I worked in a digital archive, and one of the problems we faced was that academics who were depositing their databases and papers wanted to be able to revise them and effectively remove the earlier, unrevised versions. Naturally this made our jobs more challenging, and to a certain extent seemed to be opposed to the preservation role of the archive. My experiences there make me wonder how the same academics would react to their papers being hoovered up by Wikipedia, potentially to become unalterable ‘source’ copies attached to articles in the world’s most used reference work. On the one hand it is a great practical application of the freedoms that this particular kind of open access provides. On the other hand, it perhaps risks scaring authors into more conservative forms of open access publication in the future. Personally I hope that academics will engage with the tools and communities that Wikipedia provides, and handle any potential friction through communication and personal engagement. And in the end, as these tools are open source, they could always build their own hoover.

Summer round-up

We’ve decided to change the way we publish our newsletter, so instead of having a separate site over at http://newsletter.oss-watch.ac.uk, from now on we’ll be posting a monthly round-up of our activities on this blog.  If you’re only interested in these round-ups, you can subscribe to the feed for the Newsletter category.  We’ll still be publishing event reports, analysis and opinion pieces on this blog as before.

This month is a bumper edition covering what we’ve been up to over the summer.  With Kuali announcing its move to a company-based governance model, Scott has looked at whether this means the end of “community-source”, and whether its choice of an AGPL license poses a risk of vendor lock-in.

We’ve also continued our work with the VALS project, helping over 60 FOSS organisations submit over 250 project ideas.  The participating universities have now signed up for the programme, and students are submitting their project proposals.

Finally for this month, Mark attended the first AGM of the Research Software Engineers UK group, who are seeking to champion and support software developers working with researchers.

Research Software Engineers AGM report

Last Monday I attended the first (hopefully of many!) AGM of The UK Community of Research Software Engineers.  The group has been formed to champion the cause of software engineers producing software for research, be they developers who are embedded in research groups, or academics who have found themselves developing and maintaining software.  Throughout the day, there were a number of issues debated by the group.

While the career path for academics hinges on them publishing papers, developers contributing to research through their work often find that they don’t get the opportunity to publish.  One of the problems that RSE seeks to address is finding an alternative way for universities to recognise the contribution of software engineers to research.

Should universities seek to support development of research software centrally, or is it better done in departments?  At UCL, they’ve formed a central group of developers, partly from core funding and partly from project funding, who can provide development effort to research projects.  While this provides a useful core of development expertise, a central service can’t provide the same level of domain-specific knowledge that some research groups will require, and some institutions simply don’t have the skills base in central IT to provide the development support that researchers would find valuable.

Another approach for central support for research software engineers is to provide training and tools to support good software engineering practice.  Version control, continuous integration and other common tools can be instilled in researchers’ workflows through collaboration with experienced developers, or through training initiatives such as Software Carpentry.  Provisioning systems like GitLab and Jenkins centrally provides easy access to infrastructure which supports these practices.

These issues and more were discussed in groups over the day, and will continue to be discussed by the RSE community.  If you’re a research software engineer, or just want to help champion their cause, you can visit the website and join the discussion group.

Gateway for Higher Education provides insight into research funding – and is open source

Today Jisc announced the beta G4HE website. The site pulls data from the BIS-funded RCUK Gateway to Research API and provides an interface to allow searching and visualising data on research in the UK.

For example, you can see at a glance the councils funding research at the University of Oxford, as well as the key collaboration partners in joint research work.

There are several interesting “open” angles to this project.

First, it’s ‘open’ in the sense that the site is opening up access to information about research spending.

Second, the site is using crowdsourcing to clean up the available data to make it more meaningful – for example by asking visitors to help identify duplicates and naming mistakes from the original data.

And finally, it’s great to see that the code that runs the site has been released under the MIT license and is available from GitHub.
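
To give a feel for the kind of clean-up that crowdsourcing is being asked to help with, here is a minimal sketch of flagging candidate duplicate organisation names for a human to confirm. It is not G4HE’s code; the normalisation rules and sample names are assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# ASSUMPTION: a deliberately small list of words to ignore when comparing names.
IGNORED_WORDS = {"the", "of"}


def normalise(name):
    """Reduce a name to a rough comparison key: lower case, no punctuation, sorted words."""
    name = name.lower().replace("&", " and ")
    name = re.sub(r"[^a-z0-9 ]", " ", name)
    words = [word for word in name.split() if word not in IGNORED_WORDS]
    return " ".join(sorted(words))


def candidate_duplicates(names):
    """Group names sharing a comparison key; return only groups with more than one name."""
    groups = defaultdict(list)
    for name in names:
        groups[normalise(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


names = [
    "University of Oxford",
    "Oxford University",
    "The University of Oxford.",
    "University of Cambridge",
]
duplicates = candidate_duplicates(names)
print(duplicates)
```

A key this crude will produce false matches on real data, which is exactly why asking visitors to confirm each suggestion is a sensible complement to automation.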