Copyright ©1997-2008 Glenn Fleishman except as noted otherwise. All rights reserved. For permission to reprint, contact Glenn Fleishman at glenn at glennf.com. Photo © 2008 Laurence Chen; used with permission.
Turning technology from mumbo-jumbo into rich tasty gumbo
Go to Google and enter your Weblogs.com domain and a keyword in this form:
site:tom.weblogs.com -flibbertygibbit
You need to pick a word that is not on any page, obviously, or you'll only get a subset of results. If every page shares a header or footer, you can instead search for a word from it, without the minus sign. I just tried this on tom.weblogs.com and got these results. You can now either tediously use the Cached link for every page and copy it, or use one of these Web reaping or crawling tools to download everything linked off the Google pages to one link depth. You might limit the downloads to pages in the pattern of dates, like (in grep syntax):
\/\d{4,4}\/\d{2,2}\/\d{2,2}
This retrieves just the date archives.
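As a rough illustration of that filter, here is a minimal Python sketch. It assumes you have already collected a list of links from the Google results pages. The sample URLs and the `date_archive_urls` helper are made up for the example and are not part of any particular crawling tool. Note that `\d` is Perl-style shorthand, so with GNU grep itself you would likely need the `-P` option or `[0-9]` in its place.

```python
import re

# The date pattern from above, written as a Python raw string.
# {4} and {2} mean the same thing as {4,4} and {2,2}.
DATE_PATH = re.compile(r"/\d{4}/\d{2}/\d{2}")


def date_archive_urls(urls):
    """Return only the URLs that contain a /YYYY/MM/DD segment."""
    return [u for u in urls if DATE_PATH.search(u)]


# Hypothetical links, standing in for whatever a crawl of the
# Google results pages would actually turn up.
sample = [
    "http://tom.weblogs.com/2004/06/15",
    "http://tom.weblogs.com/about",
    "http://tom.weblogs.com/2003/11/02.html",
    "http://tom.weblogs.com/stories/2004/index",
]

for url in date_archive_urls(sample):
    print(url)
```

Only the first and third sample links survive the filter; the others lack a full year/month/day path. Most crawling tools accept a similar include pattern, though the exact option name varies by tool.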
For instance, even my abandoned glennf.weblogs.com site, which can happily take a Deep Six, is archived in Google.
Extra, extra! Super-Google-genius Tara Calishain (I hope that makes her blush) posts much more expert advice about retrieving old pages from Weblogs.com sites through Google and another search engine, among other great ideas.
Posted by Glennf at June 15, 2004 3:34 PM
TrackBack URL for this entry:
https://db.isbn.nu/mt3/mt-tb.pl/2183
Listed below are links to weblogs that reference How to Restore Your Weblogs.com Site via Google:
» Weblogs.com backups from JD on MX
Weblogs.com backups: Glenn Fleishman notes that removed material is likely still in Google cache, and he describes how to efficiently recover these materials... The Wayback Machine keeps materials longer but doesn't sample as frequently.... [Read More]
Tracked on June 16, 2004 1:01 PM
» Mmm Gumbo. :) from Frenzied Daddy
Thanks to Glenn for this (via Code the Web Socket). [Read More]
Tracked on June 16, 2004 1:46 PM
» Recovering Your Weblogs.com Site Using Google -- A Couple Extra Suggestions from ResearchBuzz
Actually these will work when you need to recover old sites which have been indexed in Google, if your host has died or some horror befell your data, etc. Building... [Read More]
Tracked on June 16, 2004 5:07 PM
» Weblogs.com debacle from Backup Brain
I've read much of the hoohah over the recent weblogs.com debacle, and I was going to keep quiet, figuring that [Read More]
Tracked on June 16, 2004 6:26 PM
» Searching for ideas from rexblog
Searching for ideas: Here are some suggestions about recovering old weblogs.com sites from ResearchBuzz building on suggestions from GlenLog. [Read More]
Tracked on June 17, 2004 4:54 AM
» weblogs.com backup via Google from house of warwick
Glenn points to a way to get your weblogs.com site back from Google. [Read More]
Tracked on June 17, 2004 6:18 AM
» What we can learn from the abrupt closure of Weblogs.com from Heal Your Church Web Site
We've talked more than once about the hazards of hosting for free, and more recently, we've talked about how it is unreasonable to expect an online endowment. So I while on one hand I'm not surprised to hear 3000 Weblogs.com users instantly lost their ... [Read More]
Tracked on June 18, 2004 7:25 AM