Social Life of Information

We Are Who We Link

Around the World in 2008

2008 was a very busy year; for one thing, it was my first full calendar year in Syracuse. It was also remarkable for the amount of travel that was involved with attending and presenting at conferences. My carbon footprint for the year is horrifying, in fact: 10,307 kg of CO2 was produced just to move me around the world for the 18 trips I recorded in Dopplr. According to my frequent flyer status with United (and I only joined in June) I’ve flown almost 43,000 miles this year; I would have had more like 60,000 on the record if I had joined in January!

My year-in-review for 2008 is therefore a quick recap of my year’s travels:

  • January: I stayed at home the whole month of January, which made for a good start to the semester.
  • February: I made two trips to California in February. The first trip was to Irvine for an NSF workshop, and the second was to Los Angeles for the iConference.
  • March: Stayed home all month in March, too, which gave me just enough time to catch up from February’s trips before April rolled around.
  • April: Took a week in Florence, Italy, doing conference photography for CHI for the third year. Florence was absolutely lovely in April - I had gelato every day and we strolled the cobblestones of the central part of the city for miles.
  • May: Headed back to Michigan for a weekend in Southfield to photograph my friends’ wedding; I was really happy with most of the shots and so was the bride.
  • June: Spent 5 days in Manchester, UK for the eSocial Science conference, which was a thoroughly enjoyable conference and trip.
  • July: Made an insanely short visit to Oxford, UK for a workshop on profiling communities, put on by the OSS Watch group. It was a great experience, and provided a great opportunity to meet some FLOSS researchers and developers.
  • August: Another really short visit to Michigan to see a few friends and stock up on a few Ann Arbor favorites, right before school started.
  • September: A two-stop European tour with my advisor, starting in Milan, Italy for the Open Source Software (IFIP 2.13) conference, and continuing on to Oxford for the eScience conference. I made several presentations at the conferences, all of which were well received.
  • October: Managed to stay home again for another whole month, which helps with getting the coursework completed on time.
  • November: My third trip to California in 2008 was for the Computer-Supported Cooperative Work conference (CSCW) in San Diego. It was a really great conference for me, and in the future I intend to attend it regularly, as it seems like the right research community for my interests.
  • December: A last-minute trip to Paris for the International Conference on Information Systems (ICIS) was a great finish to the semester, and was the fifth transatlantic journey for the year. Paris was, to my surprise, pretty much as I remembered it - although the weather was considerably rainier, colder and foggier, which is to be expected given the time of year. Finally, we made a week-long trek through Michigan to visit family and friends for the holidays, and experienced some really terrible driving weather (but great hospitality!) along the way.

So what’s on the itinerary for next year? I don’t have any air travel booked just yet, but I have a few trips in mind for 2009. In February, the iConference will be held in North Carolina, and in April I’ll go to Boston to be the conference photographer for CHI, just one last time. There’s a reasonably good chance that I’ll go to Sweden in June for the OSS conference, and to Phoenix in December for ICIS. Compared to 2008, it doesn’t seem like much travel at all, but that will be good for finishing my coursework and progressing to candidacy. Chances are good that another workshop or conference will pop up along the way; there seems to be no shortage of opportunities to explore the world through academia.

Tag: conference, life, travel
Dec 31st, 2008 - No Comments

Free Culture and Copyleft: A social movements perspective

Radical militant librarians are out to destroy the music industry. Or at least, that’s what you might come to believe if you listen to the RIAA’s arguments. The recording industry and other corporate lobbies have convinced policymakers for decades that intellectual property protections are necessary to motivate the production of creative works. Without copyright to protect authors, they claim, the production of creative and cultural artifacts will cease. While legal decisions extending copyright terms have continued to support this purely conjectural economic argument, there is a mounting body of empirical evidence demonstrating that the argument is fundamentally flawed. The Free Culture movement has recently emerged in response to increasingly restrictive intellectual property protections. Associated with the Creative Commons (CC) and its founder, the well-known Stanford law professor Lawrence Lessig (who has, incidentally, authored a freely-available book by the title Free Culture), the Free Culture movement extends the ideals of the free software movement to other creative and cultural works.

The free software movement, from which the Free Culture movement draws its inspiration, was the brainchild of a famously radical militant hacker, Richard Stallman, known in the hacker community as “RMS.” Hackers like RMS are computer aficionados who enjoy complex problem-solving and tinkering; when software suddenly became a proprietary commodity in the 1980s, RMS saw this as a hegemonic threat to hackers’ right to access the means of self-improvement (for a hacker, that’s source code) and founded the Free Software Foundation in reaction. Because he realized that elimination of the current intellectual property legal system was an unrealistic short-term goal, RMS crafted a solution in typical hacker fashion, working within the constraints of the legal system by creating the General Public License, or GPL.

The GPL itself is an instantiation of copyleft, a general concept dating back as far as 1976, when a computer hobbyist publication produced a code entry that included the message “@COPYLEFT ALL WRONGS RESERVED”. An alternate origination story for copyleft credits the Principia Discordia (fourth edition, circa 1970) for using the term “kopyleft” with the note, “All rites reversed - reprint what you like.” Copyleft has since solidified into an intellectual property licensing practice, and has emerged as a tactic for using copyright law to ensure that works are free to use. Copyleft licensing makes it legal to freely share, modify, and distribute the work, with the single but significant restriction that the same so-called unrestrictive license is preserved in derivative works. This final condition is what made copyleft such a radical tactic for a knowledge production industry, such as software development. The GPL is sometimes known as a “viral” license, a label that casts a negative light on the condition that the freedoms it guarantees are transmitted to any derivative works. In this fashion, the GPL in particular is a paradoxical tactic: it guarantees freedom to an intellectual work and any works that are based on it, but because the license must be included in derivative works, the actual uses of GPL-licensed work are restricted to those for which GPL licensing is considered acceptable, which limits usage. In short, its explicit unrestrictiveness is implicitly restrictive.

The Creative Commons has developed and actively promotes several other alternative licenses that offer creators a range of choice for determining how their work may be used. CC licenses, sometimes called artistic licenses, apply a set of four usage conditions, including attribution, noncommercial, no derivatives, and share alike, all of which are explained in plain language. Combining these conditions for several variations, the CC has created a set of 6 licenses to cover a wide set of creative licensing needs, and swift adoption has been spurred by Web 2.0 sites that depend on user contributions, such as Flickr, a popular photo sharing site. Flickr’s implementation of CC licensing options for user contributions of photography allows artists to specify conditions for use; combined with a CC image search, Flickr now provides a valuable commons-based resource for photographic art. Copyleft was once an unusual and radical tactic, but with dissemination of the tactic in the popular media, it is becoming an everyday and even mainstream practice.
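For the curious, here is a rough sketch of why four conditions produce six licenses rather than sixteen. It assumes two rules that the paragraph above doesn't spell out: every license includes attribution, and "no derivatives" is never combined with "share alike" (share alike only says something about derivative works, so the pairing would be meaningless). BY, NC, ND, and SA are the short codes Creative Commons uses for the four conditions; the function name is made up for illustration.

```python
from itertools import product


def cc_license_codes():
    """Enumerate license codes under the two assumed combination rules."""
    codes = []
    # Commercial use is either allowed or not (NC), and derivative works are
    # either unrestricted, share-alike (SA), or prohibited (ND).
    for noncommercial, derivative_rule in product([False, True], [None, "SA", "ND"]):
        parts = ["BY"]  # assumption: attribution is part of every license
        if noncommercial:
            parts.append("NC")
        if derivative_rule is not None:
            parts.append(derivative_rule)
        codes.append("-".join(parts))
    return codes


for code in cc_license_codes():
    print(code)
```

Two choices about commercial use times three choices about derivatives gives the six licenses.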

Will copyleft ever put copyright out of business? It’s unlikely, and not even necessarily desirable; copyright is in fact a part of American heritage. Thomas Jefferson’s vision for copyright, embedded in the first article of the Constitution, was intended to promote innovation through short-term protections for individual creators with respect to publication. The 1909 Copyright Act changed matters by altering the wording, so the initial restrictions on “publishing” were transformed to instead prohibit “copying,” which is a far more limiting interpretation of the protection of intellectual works. Today, copyright terms in the US are the duration of the original author’s lifetime, plus 70 years. Extension after extension of the duration of copyright terms has resulted in a situation where very few works will be entering the public domain any time soon. The Free Culture movement hopes to change that, and one very successful tactic has been promoting copyleft as an alternative. In a best case scenario, copyright durations might someday be reduced to a 14-year tenure, empirically shown to be an appropriate length based on the purported goals. Reversing copyright scope creep in the US legal system will require a long-term effort from the Free Culture movement.

The increasing complexity of the intellectual property landscape will eventually come to a breaking point; legal cases involving copyright are notoriously expensive and difficult to settle, and the new profusion of alternative licensing options serves to further complicate the situation, despite the relative clarity of the CC licenses themselves. And these licenses have been adopted in an economically significant fashion. An amicus brief submitted for a recent federal appeals court decision pointed out that millions of works have been released under copyleft licenses, affecting such organizations as MIT, IBM, Wikipedia, numerous free software projects and many businesses; upholding a lower court decision that threatened copyleft enforceability would therefore be enormously disruptive. The Federal Circuit ruled that violating copyleft is equivalent to copyright infringement, rendering copyleft licenses enforceable. This decision is already seen as a significant victory for proponents of copyleft. Despite its relative newness, the Free Culture movement is already creating change.

Tag: research
Nov 21st, 2008 - 1 Comment

Open Access Day

Alternate title: “Why I Give it Away for Free”

As a part of Open Access Day’s synchroblogging event, I’m going to address several assigned points; it just so happens that I was planning to blog on this very topic anyway, so the first Open Access Day is apropos timing. Here are today’s topics of discussion:

  • Why does Open Access matter to you?
  • How did you first become aware of it?
  • Why should scientific and medical research be an open-access resource for the world?
  • What do you do to support Open Access, and what can others do?

Open access matters to me because I’m an idealist at heart. There, I said it. If you need more proof, consider my AmeriCorps service and five-year stint working in nonprofits… What more reason do I need? But my idealism in this regard runs a little deeper than simple bleeding-heart liberalism; as an academic, I’m not in this knowledge production business to hoard ideas and information and knowledge that could potentially make the world a better place. The whole point of scientific research is to address real-world problems. If I wanted to hide my light under a basket, I would have stayed in industry, where my hourly billing rate was pretty astronomical (a good web analytics professional doesn’t come cheap). I want to do science to make the world better, not just to improve the scholarship of privileged institutions. I’m motivated to make an intellectual contribution for its own sake, not to make a buck, which is a very good thing, because I fully expect that I would have made more money as a professional than I will as an academic.

Open Access matters to me because I’m a scholar. Sometimes I can’t get easy access to the journal articles I want, although my librarian friends are always at the ready. But imagine if I had no librarian friends, and no interlibrary loan privileges. I’d be finished as a scholar before long. Yet another reason that my rational self-interest moves me to support open access is simple greed: the easier it is for others to find my work, the more citations I will acquire. And we all know that citations are the ultimate academic currency… But seriously, more citations means broader impact, and that means more potential for my contributions to be meaningful and useful in the so-called “real world.” Not to mention tenure.

Open Access also matters to me because I’m a taxpayer. The current system of publication makes us pay for our scholarship twice. In effect, research (at public institutions) is conducted on the taxpayers’ dime, and then the taxpayers turn around and pay for access to the results. Institutional repositories are one effort to minimize the absurdity of this situation by promoting sharing of scholarship within an institution, but no matter how you slice it, universities are literally paying twice for scholarly production. There has to be a better model.

I first became aware of open access as a Master’s student at the University of Michigan School of Information. A quote from a lecture that I wrote down during my first week of classes basically sums it up for me: “Information yearns to be free.” And yet, there are distinctly economic barriers to sharing information, despite the overwhelming evidence supporting the potential economic benefits of open access. Economic barriers have to do with the money for production, but the economic benefits are more indirect, so this set of dueling incentives has yet to be overcome.

In addition, since I’m taking a social movement theory course at Syracuse University’s Maxwell School of Citizenship and Public Affairs right now, I’m writing a little op-ed piece on Open Access Day as a course assignment; perhaps I can turn around and send it to the Daily Orange as well. From a social movements perspective, Open Access Day is an interesting example of collective action. The online organizing, mobilizing, and even the strategic use of synchroblogging to demonstrate the level of support for the movement’s goals are all indicators of a movement that has good potential for success, if its supporters can be convinced to take action to support open access every day, not just on Open Access Day.

Scientific (especially medical) resources should be open-access because it is unethical to restrict information access based solely on economics. These are the information resources that have the potential to improve the human bottom-line in ways that matter a lot more than business or industry bottom-lines. Privileged access to scientific and medical information only benefits the privileged, and not everyone who needs it can get it. This serves to continually reconstitute the structure of academic performance as the status quo in yet another case of the Matthew Effect: “For unto every one that hath shall be given, and he shall have abundance: but from him that hath not shall be taken away even that which he hath” (Matthew 25:29, King James Version). The Matthew Effect is anti-meritocratic and particularistic, and it systemically disadvantages scholars who are not so fortunate as to have access to the wealth of information resources available to the typical American Research I Institution. Are we really so insecure as an intellectual culture that we have to ensure our intellectual hegemony by limiting others’ access to “our” knowledge?

I support open access by taking an active role in developing the OAI-compliant FLOSSplanet repository for Free/Libre Open Source Software research. It started as a working papers repository and is still a work in progress itself… I also tell people about it, hence this blog post. And I’m willing to stick my neck out a little further; at a recent conference, I had several discussions with the publications chair about the overly restrictive copyright agreement required for publication of the proceedings. The ironic part? It’s for an open source software conference.

I also support open science by sharing my research directly. I make my work available from my web site in my own self-archive, and by depositing in open access repositories. I also share my research analyses on myExperiment.org. This is really giving it away for free, because these are not trivial to create. My analysis workflows require a lot of work to develop, and there’s a genuine risk that I could be “scooped” on my own analysis. But I also believe that not all the magic is in the research artifact. All the brilliant analysis tools in the world are as much good as a box of rocks if you don’t have the metadata, background, and skills to interpret the results. Despite my best efforts to document the work as completely as possible, no one else will understand the data and the workflow in the same intimate way that I do. The process of production, as well as its artifacts, is the real competitive advantage in knowledge production.

Others can help support open access too, and the easiest way is the most effective - just tell people about it. Faculty and graduate students are the best people to educate about open access because they have the most incentive to change the current system of publication. Academics can help support open access through their choices of publication venues, and individuals in leadership roles in conference planning and journal publications can really make a difference quite directly, so pester them mercilessly.

Tag: complexity, research, school
Oct 14th, 2008 - 1 Comment

Replicating Research

I’ve been working on a project for almost a year now that involves replicating published research. Replicability is one of the goals of good scientific research: a strong research design should be replicable, and this is part of the reason for methods sections in scientific papers; another reason is to make it possible for others to evaluate the work appropriately.

But what I’m doing isn’t exact replication, it’s replication using eResearch methods. This means that the analysis is reproduced from the papers in collaboration with other researchers, using tools that specifically support shared analysis and data. In general, I’ve found it challenging to replicate the specifications of someone else’s work; you have to make sure you understand where all the variables come from, for example, which can be a challenge if the conceptual framework isn’t spelled out clearly enough.

Nonetheless, I’ve pretty much got a research replication routine down. I start off by evaluating the data and methods sections of the paper that I want to replicate, and use it to generate a requirements specification that lists desired outputs and necessary data to achieve them. Next, I generate an abstract workflow, which simply represents the analysis processes that need to happen to achieve the analysis results. This is a good time to bring in collaborators, and my colleague James has been an invaluable asset in this respect. So far, we have typically split our workflow development up between us, so that James has handled the data and I’ve handled the analysis.
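To give a flavor of what an abstract workflow looks like, here is a minimal sketch in plain Python. It is only a stand-in for the idea: the step names and the data/analysis split are hypothetical, not the steps from any paper we replicated. Each step lists the steps it depends on, and a topological sort yields an order in which the work can actually be run.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical abstract workflow: each step maps to the steps it depends on.
abstract_workflow = {
    "retrieve_data": set(),
    "clean_data": {"retrieve_data"},
    "compute_variables": {"clean_data"},
    "run_analysis": {"compute_variables"},
    "format_outputs": {"run_analysis"},
}

# Hypothetical division of labor between the data side and the analysis side.
owners = {
    "retrieve_data": "data",
    "clean_data": "data",
    "compute_variables": "analysis",
    "run_analysis": "analysis",
    "format_outputs": "analysis",
}

# static_order() yields every step only after all of its dependencies.
execution_order = list(TopologicalSorter(abstract_workflow).static_order())

for step in execution_order:
    print(f"{step} ({owners[step]})")
```

Writing the dependencies down this way also makes it obvious where the handoff between collaborators happens, which is exactly the point at which the variable definitions need to be agreed upon.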

After defining an abstract workflow and deciding how to divide up the tasks, James and I have separately developed our portions of the workflow, using the Taverna Workbench scientific workflow analysis tool. After much effort and debugging and all manner of troubleshooting, a final workflow is produced. This is the one we use to replicate the analysis, so we run it on comparable data to the original analysis to see whether we can achieve a similar (or better yet, identical) result.
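The final comparison step can be sketched in a few lines of Python. This is not how Taverna does it; it is just an illustration of checking whether a replicated result is identical, similar, or different. The function name, the statistic names, the numbers, and the 1% tolerance are all made up for the example.

```python
import math


def compare_results(published, replicated, rel_tol=0.01):
    """Label each published statistic as identical, similar, different, or missing."""
    report = {}
    for name, expected in published.items():
        if name not in replicated:
            report[name] = "missing"
        elif replicated[name] == expected:
            report[name] = "identical"
        elif math.isclose(replicated[name], expected, rel_tol=rel_tol):
            # Within the assumed relative tolerance of the published value
            report[name] = "similar"
        else:
            report[name] = "different"
    return report


# Made-up values for illustration only
published = {"mean_team_size": 4.20, "correlation": 0.31}
replicated = {"mean_team_size": 4.20, "correlation": 0.312}

print(compare_results(published, replicated))
```

What counts as "similar" is a judgment call that depends on the statistic and the data, so the tolerance here should be read as a placeholder and not as a recommendation.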

Finally, after all that work, we package up the analysis work to share it. I’m particular about providing metadata for every process in the workflow, describing every analysis process in plain language so that just about anyone could understand it. We also have to make sure that if the workflow uses any web services that we created, making them public won’t cause any server problems for us - essentially ensuring that sharing our analysis doesn’t take down our own infrastructure.

Once I’ve attended to all these little administrative details, it’s time to share. I’ve been posting the workflows publicly on myExperiment.org, which is a social networking site created to support collaboration through tools specifically designed to support workflow sharing. There are a number of reasons that a lot of people don’t share their work this way, but I think it’s the right thing to do.

Of course, we’ve learned valuable lessons from the effort of replicating research. Namely, it’s just plain hard, and working out the details of the data to make the analyses happen is often the most challenging part of the process. We’ve also come up with some design strategies for creating workflows, but all that requires a lot more explication than I suspect is desired by my blog audience. So for anyone who actually read this all the way through and is interested in hearing about applying design principles to analysis workflows or why I think it’s good to give away my hard work for free, leave a comment and I’ll make a post about these related topics.

Tag: research, social media, work
Sep 29th, 2008 - No Comments

Final Year of Coursework!

The school year started last week, and in addition to getting back into the semester groove, I’ve been fixing up the tile in my shower to complement the new glass block window. Not the best timing for that task, but it had to be done. This is my final year of graduate coursework, assuming I promptly pass my end of coursework milestone in another year, so if all goes well, by this time next year I should be on my way to doctoral candidacy.

This semester’s schedule includes a methods course on research design, and a seminar course. After much deliberation, I’ve settled on a seminar on social movement theory in the Anthropology department of the Maxwell School. Choosing a seminar this semester required balancing the cost/benefits of workload and my interest in the content against the direct applicability to my stated research interests. I’m hoping to write a paper on open movements for my course paper, as I’ve been wanting to make an investigation into the topic for a few months.

My practica this semester are incredibly well suited to my interests, however, which is really a tribute to the dedication and involvement of our school’s faculty in doctoral education. No, I’m not being sarcastic at all; the faculty that I’ll be working with are incredibly busy but are also quite willing to spend time mentoring me in teaching and research. For my teaching practicum, I’m working with Jeff Stanton to overhaul the course design for IST 777, a basic stats course for interdisciplinary researchers. It should be an interesting effort, as we’re planning to really reorient the way quantitative analysis is taught in the school at the doctoral level.

My research practicum this fall focuses on collaboration dynamics in an ecosystem research community, and will involve collecting and organizing data for some quantitative analysis. I’m excited about the project because it will help me address some of my skill gaps in gathering and managing research data from the Internet. At the same time, I’m slowly building my technical skills by working on repository configuration for FLOSS Planet, a (new and improved!) repository for research on free/libre and open source software. Learning how to manage databases isn’t particularly fun, but it’s really useful for the kind of research data to which I’m most attracted.

Getting the semester going is a bit of an adjustment in itself, but I’ll also be away all next week for conferences, completely missing the third week of classes. However, I get to go to Milan, Italy for IFIP 2.13 (Open Source Software) and to Oxford, UK for the Oxford e-Research Conference, so I have no real complaint. I’m making three presentations, plus doing a demo, so it will be a pretty busy week-plus of conferencing. Happily, I’ll be at home for the rest of the semester, save for a week in early November for CSCW 2008. Two trips for conferences in a semester is a bit disruptive, but not too bad with appropriate planning. Speaking of which, I should be planning what to pack, since my flight from Syracuse departs on Friday morning.

Tag: research, school
Sep 3rd, 2008 - No Comments