When TWC’s No. 6 (History issue) came out a few weeks ago, we had a not-so-minor snafu: all the hotlinks in the press release were broken. The reason? We had (cleverly, we thought) drafted the press release using DOIs instead of URLs, and we had problems with the issue’s DOI deposit.

DOIs, or digital object identifiers, are a way to pretend that Web items are permanent. Web sites change so frequently that links continually get broken. DOIs aim to help solve this problem: an online item, in this case a journal article, is assigned a unique identifier, and that identifier is then linked to a URL in a database. When you hit the identifier, the system looks up the URL in the database and sends you there. When the URL changes, the DOI record is simply updated. One goal of DOIs is to promote the persistence of online content. When we wrote the press release, we used DOIs because the links would persist, and theoretically, anyone running across the release years later would still be able to hit live links.

The most widely used DOI service is CrossRef, whose “mandate is to connect users to primary research content, by enabling publishers to work collectively. CrossRef is also the official DOI link registration agency for scholarly and professional publications.” Their Free DOI Lookup lets you type in information about an item and returns its DOI. You may also type a DOI in a box on their home page, and it will take you right to the relevant URL, often not the article itself but a summary page that lists all the options available for viewing, downloading, or purchasing content. (You may also add “http://dx.doi.org/” before the DOI to turn it into a URL.)

The DOI format is very flexible and can take any number of forms. A sample DOI from TWC No. 6 is “10.3983/twc.2011.0272”. Here’s what it means: first is our unique DOI prefix (10.3983), which was assigned to us when we signed up and paid our fees. Next are the journal abbreviation (twc), the year (2011), and a minimum-four-digit individual article number (0272), which is the same as the number of the article’s OJS submission record. We made up the whole format to please ourselves and to permit future growth. For example, we could replace “twc” with something else to reflect something else we wanted to index, like a monograph.

According to the DOI folks, the DOI system is “A system for persistent, semantically interoperable, identification of intellectual property entities on any digital network.” Various sorts of metadata may be uploaded into each item’s database entry: not just the URL, but also things like author names, whether the item is online or print, journal titles, journal abbreviated titles, whether the item is full text or abstract only, page number for first page, year of publication, in-house identifier, volume and issue, and individual references cited inside the article. The information that the administrator chooses to collect is entered into an XML file, and that file is uploaded into the DOI system. TWC doesn’t deposit much: we file the first author, the URL, and the journal’s status as online (as opposed to print).

Although DOIs are most often identified with journals in the sciences with an online presence, TWC joined the club because we approve of the theory of the persistence of links. As part of the deal with DOI, we have to hotlink to DOIs in the articles’ works cited sections, so an issue or so back, we went through the works cited sections of all the published articles and added in the DOIs so we would be compliant. Now we do this for each article as a production step; we have a CrossRef account that permits batch querying.

Regarding TWC’s failed deposit for No. 6: it turned out that the DOI folks had changed the way the deposit was formatted (in their terms, they updated to a new schema), only we didn’t know about it. We thought we were incorrectly formatting the XML file, and we spent a lot of time examining it, trying to figure out where we went wrong. We got it done, but days late.

Still, we’re committed to the broader issue of persistence of online content: in addition to depositing DOIs, TWC grants permission to libraries to make archive copies of each issue, and of course TWC’s Creative Commons Attribution-Noncommercial 3.0 Unported License lets anybody who wants to (nonprofit only!) duplicate the entire text with credit back to the original source. We hope people will use the DOI, naturally, not the URL.

TWC is using an established tool built for scholars to help data persist. But it makes me wonder: what do fans do to ensure that an online fan artwork can be found? I can think of a few strategies: continuously curated links roundups, multiple copies of items at several blogs and archives, screenshots of artwork, and zipped files (vids? art? stories?) maintained in locked communities. (Comment with more!) It just reminds me of how much we rely on fans’ time-consuming, detailed organizational work to find things.
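
For readers who want to see the DOI format described above in concrete terms, here is a minimal Python sketch that splits a TWC-style DOI into its parts and turns it into a resolver URL. The names `TwcDoi`, `parse_twc_doi`, and `doi_to_url` are made up for this illustration; they are not part of CrossRef's or TWC's tooling, and the validation rules only reflect the format as described in this post.

```python
from typing import NamedTuple

# Resolver address mentioned in the post; prepending it turns a DOI into a URL.
RESOLVER = "http://dx.doi.org/"


class TwcDoi(NamedTuple):
    """Hypothetical container for the parts of a TWC-style DOI."""
    prefix: str   # registrant prefix, e.g. "10.3983"
    journal: str  # journal abbreviation, e.g. "twc"
    year: int     # publication year, e.g. 2011
    article: str  # article number, kept as text to preserve leading zeros


def parse_twc_doi(doi: str) -> TwcDoi:
    """Split a DOI of the form prefix/journal.year.article into its parts."""
    prefix, sep, suffix = doi.partition("/")
    if not sep or not prefix.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    parts = suffix.split(".")
    if len(parts) != 3:
        raise ValueError(f"suffix is not journal.year.article: {suffix!r}")
    journal, year, article = parts
    # The post describes the article number as having at least four digits.
    if not (year.isdigit() and article.isdigit() and len(article) >= 4):
        raise ValueError(f"unexpected year or article number in {suffix!r}")
    return TwcDoi(prefix, journal, int(year), article)


def doi_to_url(doi: str) -> str:
    """Build the resolver URL for a DOI by prepending the resolver address."""
    return RESOLVER + doi


if __name__ == "__main__":
    sample = "10.3983/twc.2011.0272"
    print(parse_twc_doi(sample))
    print(doi_to_url(sample))
```

Only the prefix is fixed by the registration; everything after the slash is the publisher's own convention, which is why the parsing rules here apply to TWC's chosen format and not to DOIs in general.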

[META] Persistence and DOIs

4 thoughts on “[META] Persistence and DOIs”

  • 31/03/2011 at 14:00

    Right now on LiveJournal, a “librarians of fandom” appreciation event is going on, and it and your very timely post make me personally grateful for archivists and data wranglers of any sort in fandom.

    It’s such a labor of love. There are many factors pulling in all directions when it comes to the permanence of fanworks online: the ephemeral nature of fandom, the desire of fan authors and creators to somehow preserve control of their own work even when it is posted and let loose on the infinitely malleable public billboard and canvas of cyberspace, and the desire of readers and viewers to keep their favorites in their own library, whether that means a Delicious bookmark or a file saved to their hard drive.

    Organizing and archiving stuff is my least favorite fannish task; one for which I am by nature not very well suited. I’m such a magpie — enjoy it and move on! So I am probably, because of that, even more grateful for the people who set up archives, manage content, and save it for future fans to enjoy.

    Thanks for the post, and so sorry for your snafu! I didn’t know much about DOIs other than that they existed, so this is good info for me both as a teacher of APA style and as a fan.

    • 02/04/2011 at 09:40

      I actually love organizing data; I love it more than the data itself, if that makes sense. I actually once created a links roundup for all fan fic in this tiny little fandom that I became obsessed with (because it was scattered everywhere, not concentrated on LiveJournal), but… I lost interest and no longer curate the links. I feel bad about it. But this happens all the time. Fan stuff is so time-based and ephemeral: I get notes of apology if someone sends feedback on a story more than a week old!

      I’m hopeful that the AO3 can help keep the data wrangled and find-able, although I also know that AO3, while large, is really its own tiny little corner. I do appreciate that they are attempting to manage the data while also keeping to fannish norms of control and privacy.

      I like that fans organize themselves. I rely on recs and places like crack_van to find fic. The first thing I do when I find a fandom I want to read fic in is look for links newsletters and links roundups. LiveJournal drives me crazy because it’s not targeted enough; I don’t want to have to friend people to read fic, and unless I know someone personally, I don’t want to read their blog. Too overwhelming! I’m targeting the fan artworks! So some kind of vetting is absolutely required.

  • 02/04/2011 at 00:56

    There are easily four or five (and maybe more) things to think about here. In my mind, yeah, DOIs are a good tool against the issue of link rot that is *very* well-documented in the LIS literature across all fields. But at the same time, there’s still nothing built into the concept to ensure that in fifteen years, the whole thing won’t be obsolete or meaningless or just plain not around any more. And from an information literacy point of view, a standard URL – like, say, http://journal.transformativeworks.org/index.php/twc/article/view/221 – has the benefit of being “human-parsable” – from looking at it, I can tell immediately that it leads to an article in a journal that’s published by a non-commercial, not-for-profit organization. And that information already allows me to evaluate the quality and value of the content at the URL.

    …with http://dx.doi.org/10.3983/twc.2011.0221, that’s clearly not the case.

    And the deeper, more underlying question, of course, touches on something the LIS field has been dealing with in various shapes and forms since at least the first years of the 20th century. The European documentalist movement had as its ideal the preservation of *all* information, regardless of content; a librarian is by definition a curator, selecting what works are worth being found, and how to find them. […an archivist’s or records manager’s focus is on preservation – but not necessarily access, or at least not mass public access.] And I think with fan works, there are inescapable tensions – some fans no longer want their work found, so am I, as a librarian, to respect their wishes or to take the possibly noble but certainly kind of asshole-ish position that once you wrote something and had your name attached to it, you can’t unwrite it and have your name removed?

    Bottom line in my mind is, the features of both fan work and fans’ archival practices (and in both of these, the strong resistance to any kind of standardization) will mean that no matter what, fan works will remain ephemeral, and fan archival practices will remain based on individual fans or fan communities, but will not persist to the degree where they can be standardized or mandated.

    • 02/04/2011 at 09:24

      Thanks for your comment, Mikhail! I agree that fan works will remain ephemeral. It’s also true that fans will be in charge of organizing their own data structure, which generally works by utilizing existing tools, repurposed to this end. I know the only way for me to manage LiveJournal-based fandom is links roundups and fandom-specific daily links newsletters; the signal-to-noise ratio is otherwise too low.

      I was thinking about fan archiving because of the History issue: libraries have hard-copy fan fiction zines, and I’m worried that librarians will make public indexes, with real-life names attached. It seems so rational to scan them to PDF and throw them up online, or type up an index of title, author, zine title, zine publication info. But this would violate an expectation of fan privacy. I advocate that writers’ names be anonymized or redacted in these cases, although people working with the original zines would obviously see this info.

      Interestingly, the Archive of Our Own (AO3), under OTW’s umbrella of fan services, permits exactly what you suggest: it’s possible to upload fiction, then orphan it by removing the author name. Lots of fans gafiate; I would rather they leave their fic up but remove their name than take it all down, which is what I see people do. (I run a fan fic archive and have had wholesale takedown requests from some authors, which I honor, no questions asked, but it still somehow makes me sad.)

      Re. DOIs: Sure, who knows if they will be around in 15 years, but the metadata being archived can simply be reformatted and repurposed; I don’t think all the data entry and deposits are in vain. Who knows what the landscape and architecture of data will look like then? I do know that Chicago Manual of Style’s 16th edition (2010) added a way to style a reference that has a DOI. I personally don’t anticipate that *authors* will use DOIs; it will still be people in journal production who have to look them up and insert them, a process that can be automated by the larger typesetting outfits.

      I too prefer human-parsable URLs. Ironically, TWC’s style is to never hide hotlinks under words (we only hotlink hotlinks), and to never use shortened URLs (like bit.ly), so people can view the URL and assess it before they hit it. We also anticipate that people will print the articles out, so we want them to be able to see the URLs.
