Researchers of online news and journalism (such as myself) face a serious problem when it comes to our empirical domain: because websites can be updated continuously and the front pages of news websites rarely stay unchanged for long, our object of study is transient. When a news website is updated, the version from before the update is practically gone and cannot be studied.
It goes without saying that the methodological consequences are serious. How do you study an object that no longer exists (or is close to impossible to recreate)? How do you deal with your object of study vanishing into thin air? Existing archiving websites such as the Internet Archive's Wayback Machine and netarkivet.dk do a great job of saving copies of websites for future use and are valuable resources for researchers, but for various reasons I find their usefulness limited. Today, however, an article on Journalisten.dk made me aware of a website that may help deal with this issue, even if it doesn’t solve the problem.
The website is www.PastPages.org, and it captures the front pages of 67 news websites every hour (the captures are available as image files in the .png format). The websites PastPages currently captures are primarily from the US, but there are also front pages from Argentina, Australia, Brazil, Canada, China, Egypt, France, Germany, Great Britain, India, Israel, Japan, Mexico, Norway, Qatar, Russia, South Africa, South Korea, Spain, Sweden, and Turkey. We are still waiting for Danish news websites to enter the sample, however…
I think this website can be an extremely helpful resource for both myself and everybody else who works with online journalism. Access to the front-page captures is easy and free, and anybody can use them without any bureaucratic ado. The logic and schedule of the data collection are transparent and easy to understand. The most obvious limitation of the website is that it captures only front pages, not articles. This choice puts some limits on what you can do with the PastPages material, but for future studies of how frames and news agendas change and of the forms of online news, it has the potential to become a key resource for researchers and students.
PastPages is still quite a new site, and its future value depends entirely on its continued existence and on its capturing enough material to reach a critical mass suitable for studies. You can support PastPages financially here: http://www.kickstarter.com/projects/651552740/keep-pastpages-alive (the fundraising campaign closes on July 6, 2012). I consider the $20 I donated a good investment in both my own research and, more importantly, the preservation of today’s news for tomorrow.