When TWC’s No. 6 (History issue) came out a few weeks ago, we had a not-so-minor snafu: all the hotlinks in the press release were broken. The reason? We had (cleverly, we thought) drafted the press release using DOIs instead of URLs, and we had problems with the issue’s DOI deposit.

DOIs, or digital object identifiers, are a way to pretend that Web items are permanent. Web sites change so frequently that links continually get broken. DOIs aim to help solve this problem: an online item, in this case a journal article, is assigned a unique identifier, and that identifier is linked to a URL in a database. When you follow the identifier, the resolver looks up the URL in the database and sends you there. When the URL changes, the DOI record is simply updated. One goal of DOIs is to promote the persistence of online content. When we wrote the press release, we used the DOIs because the links would persist, and theoretically, anyone running across it years later would still be able to hit live links.

The most widely used DOI service is CrossRef, whose “mandate is to connect users to primary research content, by enabling publishers to work collectively. CrossRef is also the official DOI link registration agency for scholarly and professional publications.” Their Free DOI Lookup lets you type in information about an item and get back its DOI. You may also type a DOI in a box on their home page, and it will take you right to the relevant URL, which is often not the article itself but a summary page that lists all the options available for viewing, downloading, or purchasing content. (You may also add “http://dx.doi.org/” before the DOI to turn it into a URL.)

The DOI system is very flexible, and a DOI can take any number of forms. A sample DOI from TWC No. 6 is “10.3983/twc.2011.0272”. Here’s what it means: first is our unique DOI prefix (10.3983), which was assigned to us when we signed up and paid our fees. Next are the journal abbreviation (twc), the year (2011), and a minimum-four-digit individual article number (0272), which matches the article’s OJS submission record. We made up the whole format to please ourselves and to permit future growth. For example, we could replace “twc” with another abbreviation to reflect a different kind of item we wanted to index, like a monograph.

According to the DOI folks, the DOI system is “A system for persistent, semantically interoperable, identification of intellectual property entities on any digital network.” Various sorts of metadata may be uploaded into each item’s database entry: not just the URL, but also things like author names, whether the item is online or print, journal titles, journal abbreviated titles, whether the item is full text or abstract only, page number for first page, year of publication, in-house identifier, volume and issue, and individual references cited inside the article. The information the administrator chooses to collect is entered into an XML file, and that file is uploaded to the DOI system. TWC doesn’t deposit much: we file the first author, the URL, and the journal’s status as online (as opposed to print).

Although DOIs are most often identified with journals in the sciences with an online presence, TWC joined the club because we approve of the theory of the persistence of links. As part of the deal with the DOI folks, we have to hotlink to DOIs in the articles’ works cited sections, so an issue or so back, we went through the works cited sections of all the published articles and added in the DOIs so we would be compliant. Now we do this for each article as a production step; we have a CrossRef account that permits batch querying.

Regarding TWC’s failed deposit for No. 6: it turned out that the DOI folks had changed the way the deposit was formatted (in their terms, they updated to a new schema), only we didn’t know about it. We thought we were incorrectly formatting the XML file. We spent a lot of time examining it, trying to figure out where we went wrong. We got it done, but days late.

Still, we’re committed to the broader issue of persistence of online content: in addition to depositing DOIs, TWC grants permission to libraries to make archive copies of each issue, and of course TWC’s Creative Commons Attribution-Noncommercial 3.0 Unported License lets anybody who wants to (nonprofit only!) duplicate the entire text and credit back to the original source. We hope people will use the DOI, naturally, not the URL.

TWC is using an established tool built for scholars to help persistence of data. But it makes me wonder: what do fans do to ensure that an online fan artwork can be found? I can think of a few strategies: continuously curated links roundups, multiple copies of items at several blogs and archives, screenshots of artwork, zipped and archived files (vids? art? stories?) maintained in locked communities. (Comment with more!) It just reminds me of how much we rely on fans’ time-consuming, detailed organizational work to find things.
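
For anyone curious about the mechanics, here is a minimal Python sketch of the DOI anatomy described earlier in this post: it splits a TWC-style DOI into its prefix, journal abbreviation, year, and article number, and prepends the “http://dx.doi.org/” resolver address to turn the DOI into a URL. The names `TwcDoi`, `parse_twc_doi`, and `doi_to_url` are made up for illustration and are not part of any CrossRef or OJS tool. The sketch assumes the suffix always follows TWC’s own journal.year.article pattern, which other publishers need not share.

```python
from typing import NamedTuple

# Resolver address quoted in the post; prepending it turns a DOI into a URL.
RESOLVER = "http://dx.doi.org/"


class TwcDoi(NamedTuple):
    """Hypothetical container for the parts of a TWC-style DOI."""
    prefix: str   # prefix assigned at registration, e.g. "10.3983"
    journal: str  # journal abbreviation, e.g. "twc"
    year: int     # publication year
    article: str  # article number, kept as text to preserve leading zeros


def parse_twc_doi(doi: str) -> TwcDoi:
    """Split a DOI shaped like prefix/journal.year.article into its parts.

    Assumption: only the TWC-style suffix is handled here.
    """
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    parts = suffix.split(".")
    if len(parts) != 3:
        raise ValueError(f"suffix is not journal.year.article: {suffix!r}")
    journal, year, article = parts
    # The post describes the article number as having at least four digits.
    if not (year.isdigit() and article.isdigit() and len(article) >= 4):
        raise ValueError(f"unexpected year or article number in {suffix!r}")
    return TwcDoi(prefix, journal, int(year), article)


def doi_to_url(doi: str) -> str:
    """Turn a DOI into a resolver URL by prepending the resolver address."""
    return RESOLVER + doi.strip()


# Usage with the sample DOI from TWC No. 6.
sample = "10.3983/twc.2011.0272"
print(parse_twc_doi(sample))
print(doi_to_url(sample))  # http://dx.doi.org/10.3983/twc.2011.0272
```

Keeping the article number as text rather than converting it to an integer preserves the leading zero in “0272”, so the parts can be joined back into the original DOI.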
[META] Persistence and DOIs