Between all my various computers I have about 850GB of data. That’s source code, slides, home movies from the 80s, emails in a variety of obsolete formats, recorded phone conversations from Skype, photos, scans from dozens of hand-scrawled notebooks, and a whole bunch of other stuff.
Like most people I can search some of this stuff on my desktop, and I back it up to a USB drive at home.
But this isn’t satisfying me.
What I want is a giant elastic bit bucket in the cloud, with a powerful search engine on top of it. And no service provides this today.
Dropbox and its ilk put the data in the cloud so you don’t have to worry about physical platters and so that you can get at the data wherever you are.
But the one feature that all of these services are missing is, for me, the most critical. And that’s search.
See, I want this service to dig deep into my data to make it all findable. To parse all my old mbox files (or, for Windows users, PSTs) and to transcribe and index all my phone calls and videos. I want it to perform face recognition on all my photos and to OCR all my scanned documents and the pictures I took of scribbled napkin-notes, and make those searchable too.
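To make the mbox part of that concrete, here is a minimal sketch of what "parse and index" could look like, using only Python's standard `mailbox` module. The function names (`index_mbox`, `search`) and the simple word-level inverted index are illustrative assumptions, not a description of any existing service; transcription, OCR and face recognition are well beyond it.

```python
import mailbox
import re
from collections import defaultdict

TOKEN = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return TOKEN.findall(text.lower())


def body_text(message):
    """Concatenate the text/plain parts of a message, ignoring attachments."""
    parts = []
    for part in message.walk():
        if part.get_content_type() != "text/plain":
            continue
        payload = part.get_payload(decode=True)
        if not payload:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset label in an old mail: fall back to UTF-8.
            parts.append(payload.decode("utf-8", errors="replace"))
    return "\n".join(parts)


def index_mbox(path):
    """Build a word -> set-of-message-keys index for one mbox file."""
    index = defaultdict(set)
    subjects = {}
    box = mailbox.mbox(path)
    try:
        for key, message in box.items():
            subject = str(message.get("Subject", ""))
            subjects[key] = subject
            for token in tokenize(subject + " " + body_text(message)):
                index[token].add(key)
    finally:
        box.close()
    return index, subjects


def search(index, query):
    """Return keys of messages that contain every word in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results
```

Calling `index_mbox("old-mail.mbox")` (a hypothetical file name) and then `search(index, "napkin sketch")` would return the keys of messages mentioning both words. A real system would need stemming, ranking and an on-disk index, but the shape is the same.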
And I want it to provide fabulous web-based data-type-specific UI for viewing all these things, so email conversations look like threads and photos of people show up in a nice web gallery. And I’ll need a timeline view of my life that I can zoom into and out of, seeing the few scanned film-camera photos and homework emails of the 90s, the frenetic startup emails of the early ’00s, and the steadily swelling hordes of me-bits stretching into the future, all there, all a jumble, but all findable when I need them.
This search has to be blazingly fast, find-as-you-type fast, so I really feel like my whole life is at my fingertips all the time.
And everything should be taggable and of course I want all my stuff to be public-key encrypted, though I’m not sure how that will work with the indexing engine which will probably need to run in the cloud too.
Of course my data is still growing, so this bucket will need tubes running into it so it stays up to date. A Gmail tube, and a Flickr tube, and a Google Docs tube, and a running-sync-of-my-laptop tube.
What I want, basically, is Google for my life. Well, not quite. I want Google, with Evernote’s OCR and Google Voice’s ASR, and MarkMail’s fast mail archive searching UI, and Dropbox’s simple bit-bucket interface, and iPhoto’s face recognition.
And, frankly, I don’t care about social features. I just want my stuff all in one place, securely, and easily findable.
I want Memex, for me.
Posted on 29 October 2009
-
It would be nice if you could point beagle/tracker (or something like them) at an old drive like this attached by usb (or even written dvd/whatever) and save all this to some big index file on your current disk.
Then be able to search all this just from the index file when the media itself is not plugged in.
S++
-
You can use Broken Disk Manager, which lets you create a listing of all files on removable media and works in Wine.
The closest native tool on Gnome is Gnome Catalog but there is very little development and it is primitive.
It would be great if Tracker could be implemented or forked to keep indexes of “offline/removable” data.
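As a rough sketch of the offline-catalog idea in these two comments: walk the drive once while it is plugged in, store each file's path, size and modification time in a small SQLite file on the local disk, and search that file later. This is not how Tracker, Beagle or GNOME Catalog work internally; the function names and the table layout are assumptions made for illustration, and only file names are indexed, not file contents.

```python
import os
import sqlite3


def build_catalog(db_path, volume_label, root):
    """Record every file under `root` in a local SQLite catalog.

    `volume_label` is a name you choose for the drive (for example
    "old-usb-disk"), so several drives can share one catalog file.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files "
        "(volume TEXT, path TEXT, size INTEGER, mtime REAL)"
    )
    # Re-scanning a drive replaces its previous listing.
    conn.execute("DELETE FROM files WHERE volume = ?", (volume_label,))
    rows = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # unreadable or vanished file: skip it
            rows.append(
                (volume_label, os.path.relpath(full, root), st.st_size, st.st_mtime)
            )
    conn.executemany("INSERT INTO files VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    conn.close()
    return len(rows)


def search_catalog(db_path, term):
    """Find catalogued files whose path contains `term`.

    Works with the drive unplugged, since only the catalog is read.
    Note that `%` and `_` in `term` act as SQL LIKE wildcards.
    """
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT volume, path, size FROM files "
        "WHERE path LIKE ? ORDER BY volume, path",
        ("%" + term + "%",),
    ).fetchall()
    conn.close()
    return rows
```

Indexing file contents as well (the Beagle/Tracker part of the wish) would mean storing extracted text alongside each row, which is a much bigger job than this listing.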
-
I wonder how good/bad life would be if we had fewer things to remember! Would our intelligence/processing capacity increase with fewer things to remember? But maybe we can derive better ideas only if we have everything in mind. What would one’s mind process if it had no data to process?
-
You should check out s3backer (code.google.com/p/s3backer) for having an unlimited bit bucket. It mounts an S3 bucket to your VFS under *NIX. As for searching… let your choice of file system handle the metadata.
-
…and then imagine if Dashboard automatically showed *your* content relevant to what you’re doing now, where you are…
-
I think one big thing holding back storage is that most people fail to realize they are getting the short end of the stick by paying for plans with unlimited transfer. They would be much better off with a cost model closer to the grid owners’: pay a little more up front and much less to retain the data. Anyone got any suggestions on where I can park my 10GB of raw YUV, which I will _probably_ never want again but might, and get it geo-redundant? I know it is a small use case, but it is one the right service ought to be able to handle for $2.00 and maintain indefinitely off the interest.
-
I have this *exact* same issue (minus the home movies from the 80s). One of my ideas was opening a Gmail account and writing a script that emailed my data to it and then checked periodically to send any documents or whatever that had changed. Of course the limitation is the size; last I checked I only have like 7GB on Gmail.
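The "check periodically for anything that changed" half of such a script could be sketched roughly like this: hash every file, compare against a manifest saved on the previous run, and report what is new or modified. The manifest format and the function names are assumptions for illustration, and the actual sending step (email or anything else) is deliberately left out.

```python
import hashlib
import json
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(root, manifest_path):
    """List files under `root` that are new or modified since the last run.

    The manifest (a JSON map of relative path -> digest) is rewritten on
    every call. Keep it outside `root`, or it will be reported as changed.
    """
    try:
        with open(manifest_path) as f:
            previous = json.load(f)
    except FileNotFoundError:
        previous = {}  # first run: everything counts as changed

    current = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            current[os.path.relpath(full, root)] = file_digest(full)

    changed = sorted(
        path for path, digest in current.items() if previous.get(path) != digest
    )
    with open(manifest_path, "w") as f:
        json.dump(current, f)
    return changed
```

Run from cron or a similar scheduler, each call returns only the files that would need to be sent again. Deleted files are not reported here; they simply drop out of the manifest.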
-
I’d like this too. I’m sure Google is working on it.
But I’m also a bit skeptical (still) of hosted services, even ones hosted by Google. I had to upgrade my mail server and using Gmail was a strong candidate, but I decided to deploy Zimbra instead. When I lose data, I want it to be my own damn fault!
I have been gathering all my archive data onto a slowly growing Drobo. It would make sense for me to make a cloud backup of this. I don’t have global search, though, and nobody makes a personal search appliance — they are all very expensive devices or software from Google, Autonomy, FAST (now Microsoft) and so forth.
I’m going to say something terribly sacrilegious now — I’m starting to get scared of Google. Recent actions are very much out of the Microsoft playbook — we’re so big we can take a market, deploy a new product for free, and eliminate the competition. (Google Maps Navigator is the most recent, and significant, example.) So far they have not been “evil” towards the end user with this power, but it has the problem that it destroys innovation. When the only GPS nav is Google GPS nav, you get the features that Google wants to add, no more. While Google has an environment of “let’s keep adding neat stuff so we have better products that are more fun and effective to use even though we have a dominant market share”, Microsoft does not, and someday the Google attitude could shift…
… but until then, I’d really like Google Maps Navigator on my iPhone please.
-
Nat:
What you are talking about here is a Personal Cloud. It happens to be the very space that I am working in now at Mozy. Check out Frank Gillet’s (Forrester Research) recent paper called “The Personal Cloud” for a vendor-neutral reference.
–Ted
-
Not sure if Mozy has search, but why not consider talking to the iFolder folks in Bangalore? I’m sure this may be a good route to go with, since you actually have an affiliation with these guys in a way.
Plus, it seems like iFolder could use some good ideas and refreshing IMHO.

15 comments