Ceph vs HDFS
For the data nerds out there:
I've been wondering for a while whether it would be better to dump HDFS and replace it with Ceph. HDFS just feels so broken/fragile/abandoned. It's like at some point in 2008 everyone jumped on the HDFS bandwagon, and by 2010 everyone had jumped off, leaving just a few lonely holdouts to gaze at the Jira carnage.
http://ceph.com/ceph-storage/file-system/
Now with data locality for Hadoop Jobs.
http://www.mail-archive.com/ceph-users@lists.ceph.com/msg02306.html
The one thing MapR has over Cloudera is a nice POSIX interface. I suspect Ceph is even better AND 100% open source. I'm going to give it 1 year before this becomes mainstream. By then enough people will have blazed the trail to make this a sure thing.
-JD
3 Comments:
Ceph to replace HDFS: do you still believe in that?
Well.. Seems like my prediction was pretty far off. :)
These days, if I'm doing Big Data type work, I'm in EMR, using Spark, with S3 as the source and destination.
Mostly out of convenience, since I don't run my own cluster anymore... I have no idea where the high-end perf went these days.
I'm actually surprised this account still exists.