Monday, March 23, 2020

Economics of open source versus open science

Common postman (Heliconius melpomene) on a Lantana

Almost two years ago I started participating on the then-new (now-archived) npm forum. I had been using npm for a few years at that point, and I had some free time to spend providing technical support, for fun. I fixed a number of bugs in the CLI, and users thanked me for those. My impact was limited, but the work was fulfilling. That is, until the developers at npm I had been working with got laid off.

Later in 2019 came the second hit: shortly after a popular JavaScript library raised a ruckus on Twitter by displaying ads to fund its maintainers, npm started banning terminal ads. The ensuing chaos was a wake-up call for me. Lots of people started talking about the economics of open-source development, suggesting that open source is a fake ideology propagated by tech companies in Silicon Valley to generate value at no cost (to the companies, that is).

We were putting hours and hours of work into some ideology, and the corporations that profited from our open-source libraries gave us nothing in return. Everyone keeps laughing about the enormous dependency trees of Node.js projects, but that also means every project depends on a lot of other open-source projects, mostly run by unpaid maintainers. Similarly, the bugs that I fixed for the npm CLI had a very small impact in the grand scheme of things, but npm is used by almost every company that uses JavaScript, most likely including Google, Amazon, Apple and Facebook. And a small percentage multiplied by almost all the tech capital in the world is still quite a lot.

This contradicts what I have been taught about Open Science: ideally, all aspects of all science should be open to everyone, to allow small players to take part. The more small players can take part, the better the science is, both morally and in quality and quantity.

While a small but vocal group in the tech world is trying to bring about a revolution that rethinks open source to help the individual, the science community has only just gotten into the idea of expanding open source, again to help the individual. Is open science just a few years behind open source?

One important thing to note is that both revolutions are trying to bring about the same thing: fair representation. In fair open source, this is about maintainers of public infrastructure (in the form of libraries) getting part of the profit generated by companies using it. In open science, this is about letting everyone take part in science, from people without affiliation to people whose institution cannot or does not want to pay for access, and lowering the barrier by making source code and data available.

The main difference is probably that scientists usually get paid, at which point it is easier to choose whether to make your work open or not: not making it open would be a waste. Additionally, there is the notion that any science is good for science (and the world) as a whole: even if commercial pharmaceutical companies get to use open research (and open-source software) from researchers that they did not fund, advances in pharmaceutics are good for everyone. Plus, open science helps the smaller players, which would be beneficial for competition and therefore for prices (if market forces finally follow through).

In the middle of this is me. I maintain an open-source project (Citation.js) aimed at people who care about bibliographical data — e.g. scientists and librarians. It has 142 stars on GitHub. I am proud of it. Neither side really applies to me: I cannot think of any commercial application that needs my library, nor do I receive funding for working on a (very small) part of the scientific community. So, which revolution should I follow? Fair open source or open science?

At the moment, I am fine with keeping it as it is. Though tiresome, it is also fulfilling, and right now I can still use the Exposure™. For the longer term, I guess I will naively carry on until I burn out or someone convinces me otherwise.


Note: a proposed solution for fair open source is the Parity Public License: it allows people to use the licensed software in private without limitations, and otherwise requires the project using it to be open source as well. Additionally, it is possible to buy licenses for closed-source work. To me, it seems a bit limited. Licenses like this can quickly become complex to use. Do I want people to be able to use Citation.js on their personal website without making the website open source? I do not think that would be possible with this license, unless I personally give permission to everyone who wants that.

There are probably better blog posts to be read about the trade-offs of such licenses. If you find any, I will add them here.
