Website: http://www.scripting.com/
Original page: http://www.scripting.com/stories/2009/10/10/onceAgainFuturesafeArchive.html
Currently, Kaltura charges quite a bit for Akamai-level storage of data. I propose leveraging S3/CloudFront and archive.org to provide levels of access based on public interest in the archived data. Some sort of co-op should be formed to provide access to data while respecting non-commercial licensing (really any desired licensing) for the duration of copyright, to gracefully preserve the value of works.
The easiest way I see of achieving this is by creating some sort of redirection engine (I'd rather not say URL shortener) that drives multiple content delivery networks. Redirect to archive.org when the data is accessed sporadically, redirect to ad-supported(?) CloudFront for more popular data, redirect to Akamai for data that's too hot to handle. Any revenue generated goes toward supporting the network, with net income distributed to the copyright owners or their descendants. The system would of course garbage collect itself to keep the cost of S3/Akamai distribution as low as possible.
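The redirection engine described in this comment could be sketched roughly as follows. This is only an illustration of the tiering idea, not anything the commenter specified: the `Tier` class, the example hostnames, and the hits-per-day thresholds are all hypothetical, and a real system would also need the usage tracking and garbage collection the comment mentions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    base_url: str          # hypothetical host, not a real endpoint
    min_hits_per_day: int  # hypothetical popularity threshold


# Ordered from hottest to coldest; all thresholds are made-up assumptions.
TIERS = [
    Tier("akamai", "https://akamai.example.net", 10_000),
    Tier("cloudfront", "https://cloudfront.example.net", 100),
    Tier("archive.org", "https://archive.example.org", 0),
]


def pick_tier(hits_per_day: int) -> Tier:
    """Return the first (hottest) tier whose threshold the item meets."""
    for tier in TIERS:
        if hits_per_day >= tier.min_hits_per_day:
            return tier
    # Fall back to the coldest tier for nonsensical (negative) counts.
    return TIERS[-1]


def redirect_url(path: str, hits_per_day: int) -> str:
    """Build the URL the redirector would send the reader to."""
    tier = pick_tier(hits_per_day)
    return f"{tier.base_url}/{path.lstrip('/')}"
```

The stable address stays with the redirector, so an item can move between tiers as interest rises and falls without any published link changing.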
assure us they'd be around 20 or 50 years from now. Google or
Microsoft -- I would never trust them. They throw their weight around
too much. It would have to be an organization with a strong public
service component. Not that Yahoo has that, but what choice do they
have but to find a new way to be.
What's the story on archive.org's "foolish handling of robots.txt?"
What's the story on deathvault.com?
What are they thinking.
Oy.
Thought-provoking. Arguably the gang at Long Now are also thinking about these things...
Best,
Martin Haeberli
I suppose I've got a BitTorrent type of idea in mind. Trackers run, which you can point personal domains to, e.g. vault.scripting.com. The tracker has a list of clients, each with a bandwidth rating, which hold the content of vault.scripting.com; the tracker in turn requests the content from one of those clients' IPs and it is delivered to the user. The great thing here is that your content can be stored by many hundreds of different sources, big or small, all over the world.
It's a rough thought, but it negates the need for any specific individual or company. I think there is a major flaw in my idea, because you could potentially compromise content if you're a client, but I'm sure there are CRC checks or some levels of encryption you could employ.
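One way to address the tampering worry raised in this comment is to publish a digest of each file alongside its address and accept a copy from a peer only if it matches. The sketch below is an assumption-laden illustration: it swaps the CRC the commenter mentions for SHA-256, because a CRC catches accidental corruption but not deliberate modification, and the `fetch_verified` helper and its "list of fetch callables" peer interface are hypothetical.

```python
import hashlib
from typing import Callable, Iterable, Optional


def digest(data: bytes) -> str:
    """Hex SHA-256 digest of the content."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(
    expected_digest: str,
    peers: Iterable[Callable[[], bytes]],
) -> Optional[bytes]:
    """Try each peer in turn and return the first copy whose digest matches.

    `peers` is a hypothetical interface: each entry is a zero-argument
    callable that returns the bytes that peer claims to hold.
    Returns None if no peer supplies a matching copy.
    """
    for fetch in peers:
        data = fetch()
        if digest(data) == expected_digest:
            return data
    return None
```

The expected digest has to come from somewhere the peers cannot alter, for example the tracker or the owner's own domain; otherwise a malicious client could replace both the content and its digest.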
They're essentially a hobbyist group, so hardly the well-established institution you're looking for (and I'm looking for too, frankly), but I'm sure you'd find their work interesting.
It's more of a financial thing than a technical thing, isn't it.
Dave, your post has touched on a huge issue. One that's only just emerging. The answer must be distributed, open source, and secure. Until such a solution exists, printing on archival paper is still the safest bet.
I'm astonished nobody proposed national libraries. Most national libraries get their contents by legal deposit: two copies of every book or other publication (map, “something”) published must be given to the library. There's no reason not to extend the concept of a (hopefully voluntary) legal deposit to web publications.
In fact the National Library of Germany is bound by law to do exactly that: to collect and archive online publications. It's obviously not perfect (nothing could be), and they are still developing procedures and won't collect for some time, but what I see looks promising: static content, strong metadata, persistent identifiers and, in the future, automatic harvesting using the open protocols of the Open Archives Initiative. If you just want to get archived and don't particularly care for continuous service (like Dave seems to wish), this should be quite good enough.
1 TB for the next 50 years or something like 100 TB for 300 years. I think it's the timeframe that's the real problem. Technology and financing set aside, there are very few private projects that have such stamina.
Perhaps it's my European thinking, but I cannot imagine any private institution that could assure an archive over such a long period.
Martin
I like the idea of academics being a part of it, but what about an organization such as the Smithsonian, or something along those lines? Heck, dare I say an org like Google? While it has been mentioned that legal and financial issues are huge considerations, when it's all said and done, the underlying architecture, security, DR/COOP and maintenance should be the focus. And, while I kinda like the thought of my "knowledge" and life experience being immortal, it will only remain that way as long as someone is interested in having access to my stuff :)
So yes, I would be happy to volunteer. While I'm not a tech-uber-geek (which I fondly call "TUGS"), I've got a knack for planning needs based on user functionality, liaising with the TUGS, and I'm a phenomenal organizer / scribe, albeit a modest one ;)
1. How about storing them on a p2p network of computers dedicated to this purpose? Legal issues and abuse need to be controlled by someone. A Wikipedia-like non-profit structure may last longer.
2. A few time capsules with the information around the world, and maybe on the moon as well :-) [if everything else fails]
@fellowcreative has a project called deathbook, which is aimed at asking questions in the UK about how people can ask for their online data and ID to be handled after death, i.e. deleted, archived, split up and passed on, etc.
ATM an email address may just get reused if you pass away, perhaps passing on some data (think the Twitter hack). What happens to your Facebook account? Should it stay around so people can still post to your wall, or be removed ASAP if it is causing distress?
I'm more of the opinion that the online "me" would be archived in some offline manner, since at some point I will be very much offline as well! The problem is of course how to best store it all? I agree the process very much matches what libraries, etc. have to deal with, and I think some key ideas might be:
1. Pick the best. This is very subjective, but what is Ansel Adams known for? Thousands of good images? Nope, just a few outstandingly great ones! What are the stand-out milestones of your life's work? With all our ways of getting metrics, this shouldn't be too hard to figure out.
2. Diversify. Not just putting content in various online locations, but also in various media. Figure out what you want to spend, and how you expect your loved ones to be able to get to it. Your guess is as good as mine, but hopefully one or more will work.
3. Share. There is such a thing as too much centralization. So often throughout history we find gems of historical documents squirreled away in someone's attic, even though the originals are gone. Get your results from step #2 into as many hands as are interested in having it.
I'll be watching with great interest to see what comes of an online solution!