Alluxio (/əˈlʌksio/) is a data access layer in the big data and machine learning ecosystem. It began as the research project "Tachyon", created at the University of California, Berkeley's AMPLab as its creator's Ph.D. thesis work in 2013. Alluxio was open sourced in 2014.
The following table shows the main feature differences between Alluxio and JuiceFS.
Features | Alluxio | JuiceFS |
---|---|---|
Storage format | Object | Block |
Language | Java | Go |
Cache granularity | 64MiB | 4MiB |
Multi-tier cache | ✓ | ✓ |
Hadoop-compatible | ✓ | ✓ |
S3-compatible | ✓ | ✓ |
Kubernetes CSI driver | ✓ | ✓ |
Hadoop data locality | ✓ | ✓ |
Fully POSIX-compatible | ✕ | ✓ |
Atomic metadata operation | ✕ | ✓ |
Consistency | ✕ | ✓ |
Data compression | ✕ | ✓ |
Data encryption | ✕ | ✓ |
Zero-effort operation | ✕ | ✓ |
A file in JuiceFS is stored at three levels: chunk, slice and block. A file is split into multiple blocks, which are optionally compressed and encrypted before being stored in object storage.
Alluxio stores each file as a single object in the UFS (under file storage). The file isn't split into blocks as it is in JuiceFS.
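The JuiceFS layout described above can be illustrated with a little arithmetic. The following is a minimal sketch, not JuiceFS code: it assumes a fixed 64MiB chunk size (not stated in this document), uses the default 4MiB block size, and ignores slices entirely. The names `locate` and `block_count` are hypothetical.

```python
MiB = 1024 * 1024
CHUNK_SIZE = 64 * MiB  # assumption for illustration; not stated in this document
BLOCK_SIZE = 4 * MiB   # default JuiceFS block size


def locate(offset):
    """Map a byte offset in a file to (chunk index, block index in chunk, offset in block)."""
    chunk_index, offset_in_chunk = divmod(offset, CHUNK_SIZE)
    block_index, offset_in_block = divmod(offset_in_chunk, BLOCK_SIZE)
    return chunk_index, block_index, offset_in_block


def block_count(file_size):
    """Number of blocks needed for a file of the given size (ceiling division)."""
    return -(-file_size // BLOCK_SIZE)


if __name__ == "__main__":
    # Byte 70MiB falls in the second chunk, second block, 2MiB into that block.
    print(locate(70 * MiB))
    # A 10MiB file needs three blocks: 4MiB + 4MiB + 2MiB.
    print(block_count(10 * MiB))
```

In this simplified model each block would become one object in object storage, whereas Alluxio would store the whole file as one object.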
The default block size of JuiceFS is 4MiB, which is finer-grained than Alluxio's 64MiB. A smaller block size suits random-read workloads (e.g. Parquet and ORC) better, because the cache holds less data that was never requested and is therefore managed more efficiently.
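To make the granularity argument concrete, here is a rough sketch of how much data ends up in the cache for one small random read. It assumes, as a simplification, that the cache always fetches and stores whole blocks; the function names are hypothetical and this is not how either system is implemented.

```python
MiB = 1024 * 1024


def blocks_touched(offset, length, block_size):
    """Count the blocks overlapped by a read of `length` bytes starting at `offset`."""
    if length <= 0:
        return 0
    first = offset // block_size
    last = (offset + length - 1) // block_size
    return last - first + 1


def bytes_cached(offset, length, block_size):
    """Bytes held in the cache after the read, assuming whole blocks are cached."""
    return blocks_touched(offset, length, block_size) * block_size


if __name__ == "__main__":
    offset, length = 100 * MiB, 1 * MiB  # e.g. reading one column chunk
    for block_size in (4 * MiB, 64 * MiB):
        cached = bytes_cached(offset, length, block_size)
        print(f"block={block_size // MiB}MiB cached={cached // MiB}MiB")
```

Under these assumptions, a 1MiB read occupies 4MiB of cache with 4MiB blocks and 64MiB with 64MiB blocks, so the same cache capacity can serve many more distinct random reads at the smaller granularity.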
JuiceFS is HDFS-compatible. It works not only with Hadoop 2.x and Hadoop 3.x, but also with a variety of components in the Hadoop ecosystem.
JuiceFS provides a Kubernetes CSI driver for people who want to use JuiceFS in Kubernetes. Alluxio provides a Kubernetes CSI driver too, but that project does not appear to be actively maintained or officially supported by Alluxio.
JuiceFS is fully POSIX-compatible. A pjdfstest run by JD.com shows that Alluxio did not pass the POSIX compatibility test; for example, Alluxio doesn't support symbolic links, truncate, fallocate, append, xattr, mkfifo, mknod or utimes. Beyond what pjdfstest covers, JuiceFS also provides close-to-open consistency, atomic metadata operations, mmap, fallocate with punch hole, xattr, BSD locks (flock) and POSIX record locks (fcntl).
A metadata operation in Alluxio has two steps: first it modifies the state of the Alluxio master, then it sends a request to the UFS. The operation is therefore not atomic, and its state is unpredictable while it is executing or if a failure occurs. Alluxio relies on the UFS to implement metadata operations; for example, renaming a file becomes a copy followed by a delete.
Thanks to Redis transactions, most metadata operations in JuiceFS are atomic, e.g. renaming a file, deleting a file and renaming a directory. You don't have to worry about consistency or performance.
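The difference between a two-step operation and a transactional one can be shown with a toy model. The sketch below is purely illustrative: it uses in-memory dictionaries rather than Alluxio, a real UFS or Redis, and every name in it (`two_step_rename`, `MetaStore`, `UfsError`) is hypothetical.

```python
class UfsError(Exception):
    """Raised by the toy under storage to simulate a failed request."""


def two_step_rename(master, ufs, src, dst, ufs_fails=False):
    """Rename in two independent steps, as in the two-step model described above."""
    # Step 1: update the master's view of the namespace.
    master[dst] = master.pop(src)
    # Step 2: forward the change to the under storage; this step may fail.
    if ufs_fails:
        raise UfsError("under storage request failed")
    ufs[dst] = ufs.pop(src)


class MetaStore:
    """Toy metadata store whose rename is all-or-nothing."""

    def __init__(self, entries):
        self.entries = dict(entries)

    def rename(self, src, dst, fail_before_commit=False):
        # Stage every change on a private copy of the namespace.
        staged = dict(self.entries)
        staged[dst] = staged.pop(src)
        if fail_before_commit:
            raise RuntimeError("transaction aborted")
        # Commit: swap in the staged copy in a single assignment.
        self.entries = staged


if __name__ == "__main__":
    master, ufs = {"/a": "inode-1"}, {"/a": "inode-1"}
    try:
        two_step_rename(master, ufs, "/a", "/b", ufs_fails=True)
    except UfsError:
        pass
    print("two-step:", master, ufs)  # the two views now disagree

    store = MetaStore({"/a": "inode-1"})
    try:
        store.rename("/a", "/b", fail_before_commit=True)
    except RuntimeError:
        pass
    print("transactional:", store.entries)  # unchanged after the failure
```

When the second step fails, the two-step version leaves the master and the under storage with different views of the namespace, while the transactional version either applies the whole rename or none of it.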
Alluxio loads metadata from the UFS as needed and has no information about the UFS at startup. By default, Alluxio expects all modifications to the UFS to go through Alluxio. If changes are made to the UFS directly, you need to sync metadata between Alluxio and the UFS, either manually or periodically. As described under "Atomic metadata operation", the two-step metadata operation may also result in inconsistency.
JuiceFS provides strong consistency for both metadata and data. The metadata service of JuiceFS is the single source of truth, not a mirror of the UFS. The metadata service doesn't rely on object storage to obtain metadata. Object storage is simply treated as unlimited block storage. There is no inconsistency between JuiceFS and object storage.
JuiceFS supports compressing all your data with LZ4 or Zstandard. Alluxio doesn't have this feature.
JuiceFS supports data encryption in transit and at rest. The Alluxio community edition doesn't have this feature, but the enterprise edition does.
Alluxio's architecture can be divided into 3 components: master, worker and client. A typical cluster consists of a single leading master, standby masters, a job master, standby job masters, workers, and job workers. You need to operate these masters and workers yourself.
JuiceFS uses Redis as the metadata service. You can easily use a Redis service managed by a public cloud provider, in which case no operational work is needed on your side. See "Redis Best Practices" for more information.