How to Setup Object Storage

By reading JuiceFS Technical Architecture and How JuiceFS Store Files, you will understand that JuiceFS is designed to store data and metadata independently. Generally, the data is stored in cloud storage based on object storage, and the corresponding metadata is stored in an independent database.

Storage setting options

When creating a JuiceFS file system, setting up data storage generally involves the following options:

  • --storage Specify the storage service to be used by the file system, e.g. --storage s3
  • --bucket Specify the bucket endpoint of the object storage in a specific format, e.g. --bucket https://myjuicefs.s3.us-east-2.amazonaws.com
  • --access-key and --secret-key are the authentication keys used to access the object storage service. You need to create them on the corresponding cloud platform.

For example, the following command uses Amazon S3 object storage to create a file system:

$ juicefs format --storage s3 \
	--bucket https://myjuicefs.s3.us-east-2.amazonaws.com \
	--access-key abcdefghijklmn \
	--secret-key nmlkjihgfedAcBdEfg \
	redis://192.168.1.6/1 \
	my-juice

Similarly, you can adjust the parameters and use almost all public/private cloud object storage services to create a file system.

Access Key and Secret Key

Generally, the object storage service uses an access key and a secret key to verify user identity. When creating a file system, you can set them explicitly with the two options --access-key and --secret-key, or through the two environment variables ACCESS_KEY and SECRET_KEY.
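As a minimal sketch of the environment-variable approach, the helper below exports ACCESS_KEY and SECRET_KEY inside a subshell and then runs juicefs format without the two key options. The function name format_with_env_keys is hypothetical and not part of JuiceFS; the values in the usage comment are the placeholders from the example above.

```shell
# Hypothetical helper: pass the keys through the environment instead of
# --access-key / --secret-key. The function body is a subshell, so the
# exported variables do not leak into the calling shell.
#   $1 = access key, $2 = secret key, remaining args = juicefs format arguments
format_with_env_keys() (
    export ACCESS_KEY="$1"
    export SECRET_KEY="$2"
    shift 2
    # Everything else is handed to `juicefs format` unchanged.
    juicefs format "$@"
)

# Usage (placeholder values):
#   format_with_env_keys abcdefghijklmn nmlkjihgfedAcBdEfg \
#       --storage s3 \
#       --bucket https://myjuicefs.s3.us-east-2.amazonaws.com \
#       redis://192.168.1.6/1 my-juice
```

Keeping the keys out of the command line also keeps them out of the shell history of the caller, provided the helper itself is invoked with values read from a file or prompt.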

Public cloud providers usually allow users to create an IAM (Identity and Access Management) role (e.g. AWS IAM role) or something similar (e.g. Alibaba Cloud RAM role), and then assign the role to a VM instance. If your VM instance already has permission to access the object storage, you can omit the --access-key and --secret-key options.

Supported Object Storage

The following table lists the object storage services supported by JuiceFS, together with the value to pass to the --storage option. Each service is described in its own section below.

If the object storage service you want is not in the list, please submit a request issue.

Name                                        Value
Amazon S3                                   s3
Google Cloud Storage                        gs
Azure Blob Storage                          wasb
Backblaze B2 Cloud Storage                  b2
IBM Cloud Object Storage                    ibmcos
Scaleway Object Storage                     scw
DigitalOcean Spaces Object Storage          space
Wasabi Cloud Object Storage                 wasabi
Storj DCS                                   s3
Alibaba Cloud Object Storage Service        oss
Tencent Cloud Object Storage                cos
Huawei Cloud Object Storage Service         obs
Baidu Object Storage                        bos
Kingsoft Cloud Standard Storage Service     ks3
Meituan Storage Service                     mss
NetEase Object Storage                      nos
QingStor Object Storage                     qingstor
Qiniu Cloud Object Storage                  qiniu
Sina Cloud Storage                          scs
CTYun Object-Oriented Storage               oos
ECloud (China Mobile Cloud) Object Storage  eos
SpeedyCloud Object Storage                  speedy
UCloud US3                                  ufile
Ceph RADOS                                  ceph
Ceph Object Gateway (RGW)                   s3
Swift                                       swift
MinIO                                       minio
HDFS                                        hdfs
Redis                                       redis
Local disk                                  file

S3

S3 supports two styles of endpoint URI, virtual hosted-style and path-style. The difference between them is:

  • Virtual hosted-style: https://<bucket>.s3.<region>.amazonaws.com
  • Path-style: https://s3.<region>.amazonaws.com/<bucket>

Replace <region> with the specific region code, e.g. the region code of US East (N. Virginia) is us-east-1. You can find all available regions here.

Note: AWS China users need to add .cn to the host, i.e. amazonaws.com.cn. Check this document to find your region code.

JuiceFS supports both types of endpoint since v0.12 (before v0.12, only virtual hosted-style was supported). So when you format a volume, the --bucket option can be either a virtual hosted-style URI or a path-style URI. For example:

# virtual hosted-style
$ ./juicefs format \
    --storage s3 \
    --bucket https://<bucket>.s3.<region>.amazonaws.com \
    ... \
    localhost test
# path-style
$ ./juicefs format \
    --storage s3 \
    --bucket https://s3.<region>.amazonaws.com/<bucket> \
    ... \
    localhost test

You can also use the S3 storage type to connect to S3-compatible storage. For example:

# virtual hosted-style
$ ./juicefs format \
    --storage s3 \
    --bucket https://<bucket>.<endpoint> \
    ... \
    localhost test
# path-style
$ ./juicefs format \
    --storage s3 \
    --bucket https://<endpoint>/<bucket> \
    ... \
    localhost test
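To make the difference between the two Amazon S3 URI styles concrete, here is a small sketch that prints the --bucket value for either style. The function name s3_bucket_uri and its argument order are assumptions made for illustration; the optional fourth argument covers the amazonaws.com.cn host mentioned in the note for AWS China.

```shell
# Hypothetical helper: print the --bucket URI for an Amazon S3 bucket.
#   $1 = style ("virtual" or "path")
#   $2 = bucket name
#   $3 = region code, e.g. us-east-1
#   $4 = optional host suffix (default amazonaws.com)
s3_bucket_uri() {
    case $1 in
        virtual)
            # https://<bucket>.s3.<region>.amazonaws.com
            printf 'https://%s.s3.%s.%s\n' "$2" "$3" "${4:-amazonaws.com}" ;;
        path)
            # https://s3.<region>.amazonaws.com/<bucket>
            printf 'https://s3.%s.%s/%s\n' "$3" "${4:-amazonaws.com}" "$2" ;;
        *)
            echo "unknown style: $1 (expected virtual or path)" >&2
            return 1 ;;
    esac
}

# Usage:
#   s3_bucket_uri virtual myjuicefs us-east-2
#   s3_bucket_uri path myjuicefs cn-north-1 amazonaws.com.cn
```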

Google Cloud Storage

Because Google Cloud doesn't have an access key and secret key, the --access-key and --secret-key options can be omitted. Please follow the Google Cloud document to learn how authentication and authorization work. Typically, when you are running within Google Cloud, you already have permission to access the storage.

Because the bucket name is globally unique, you only need to provide its name when you specify the --bucket option. For example:

$ ./juicefs format \
    --storage gs \
    --bucket gs://<bucket> \
    ... \
    localhost test

Azure Blob Storage

Besides providing authorization information through the --access-key and --secret-key options, you can also create a connection string and set the AZURE_STORAGE_CONNECTION_STRING environment variable. For example:

# Use connection string
$ export AZURE_STORAGE_CONNECTION_STRING="DefaultEndpointsProtocol=https;AccountName=XXX;AccountKey=XXX;EndpointSuffix=core.windows.net"
$ ./juicefs format \
    --storage wasb \
    --bucket https://<container> \
    ... \
    localhost test

Note: For Azure China users, the value of EndpointSuffix is core.chinacloudapi.cn.
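The connection string shown above is a list of key=value pairs joined by semicolons. As a sketch, the helper below assembles one from an account name, an account key, and an optional endpoint suffix. The function name azure_connection_string is hypothetical; the field names are the ones that appear in the example above.

```shell
# Hypothetical helper: assemble an Azure Storage connection string.
#   $1 = account name
#   $2 = account key
#   $3 = optional endpoint suffix (default core.windows.net;
#        Azure China uses core.chinacloudapi.cn)
azure_connection_string() {
    printf 'DefaultEndpointsProtocol=https;AccountName=%s;AccountKey=%s;EndpointSuffix=%s\n' \
        "$1" "$2" "${3:-core.windows.net}"
}

# Usage (XXX are placeholders, as in the example above):
#   export AZURE_STORAGE_CONNECTION_STRING="$(azure_connection_string XXX XXX core.chinacloudapi.cn)"
```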

Backblaze B2 Cloud Storage

You first need to create an application key. The "Application Key ID" and "Application Key" are the equivalent of the access key and secret key respectively.

The --bucket option only needs the bucket name. For example:

$ ./juicefs format \
    --storage b2 \
    --bucket https://<bucket> \
    --access-key <application-key-ID> \
    --secret-key <application-key> \
    ... \
    localhost test

IBM Cloud Object Storage

You first need to create an API key and retrieve the instance ID. The "API key" and "instance ID" are the equivalent of the access key and secret key respectively.

IBM Cloud Object Storage provides multiple endpoints for each region. Depending on your network (e.g. public or private network), you should use the appropriate endpoint. For example:

$ ./juicefs format \
    --storage ibmcos \
    --bucket https://<bucket>.<endpoint> \
    --access-key <API-key> \
    --secret-key <instance-ID> \
    ... \
    localhost test

Scaleway Object Storage

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.s3.<region>.scw.cloud. Replace <region> with the specific region code, e.g. the region code of "Amsterdam, The Netherlands" is nl-ams. You can find all available regions here. For example:

$ ./juicefs format \
    --storage scw \
    --bucket https://<bucket>.s3.<region>.scw.cloud \
    ... \
    localhost test

DigitalOcean Spaces Object Storage

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<space-name>.<region>.digitaloceanspaces.com. Replace <region> with the specific region code, e.g. nyc3. You can find all available regions here. For example:

$ ./juicefs format \
    --storage space \
    --bucket https://<space-name>.<region>.digitaloceanspaces.com \
    ... \
    localhost test

Wasabi Cloud Object Storage

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.s3.<region>.wasabisys.com. Replace <region> with the specific region code, e.g. the region code of US East 1 (N. Virginia) is us-east-1. You can find all available regions here. For example:

$ ./juicefs format \
    --storage wasabi \
    --bucket https://<bucket>.s3.<region>.wasabisys.com \
    ... \
    localhost test

Note: Users of the Tokyo (ap-northeast-1) region should see this document to learn how to get the appropriate endpoint URI.

Storj DCS

When using Storj DCS to create a JuiceFS file system, please refer to this document to learn how to create the access key and secret key.

Storj DCS is S3-compatible storage, so just use s3 for the --storage option. The format of the --bucket option is https://gateway.<region>.storjshare.io/<bucket>. Replace <region> with the storage region you actually use. There are currently three available regions: us1, ap1 and eu1. For example:

$ juicefs format \
	--storage s3 \
	--bucket https://gateway.<region>.storjshare.io/<bucket> \
	--access-key <your-access-key> \
	--secret-key <your-secret-key> \
	redis://localhost/1 my-jfs
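Since only three regions are listed, a mistyped region code is an easy mistake to catch before formatting. The sketch below builds the Storj gateway URI and rejects any region other than us1, ap1 and eu1. The function name storj_bucket_uri is hypothetical, and the region list is the one given above, which may change over time.

```shell
# Hypothetical helper: print the --bucket URI for a Storj DCS bucket.
#   $1 = region (us1, ap1 or eu1, per the list above)
#   $2 = bucket name
storj_bucket_uri() {
    case $1 in
        us1|ap1|eu1)
            printf 'https://gateway.%s.storjshare.io/%s\n' "$1" "$2" ;;
        *)
            echo "unsupported Storj region: $1" >&2
            return 1 ;;
    esac
}

# Usage:
#   juicefs format --storage s3 --bucket "$(storj_bucket_uri eu1 <bucket>)" ...
```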

Alibaba Cloud Object Storage Service

Please follow this document to learn how to get the access key and secret key. If you have already created a RAM role and assigned it to the VM instance, you can omit the --access-key and --secret-key options. Alibaba Cloud also supports using Security Token Service (STS) to authorize temporary access to OSS. To use STS, omit the --access-key and --secret-key options and set the ALICLOUD_ACCESS_KEY_ID, ALICLOUD_ACCESS_KEY_SECRET and SECURITY_TOKEN environment variables instead, for example:

# Use Security Token Service (STS)
$ export ALICLOUD_ACCESS_KEY_ID=XXX
$ export ALICLOUD_ACCESS_KEY_SECRET=XXX
$ export SECURITY_TOKEN=XXX
$ ./juicefs format \
    --storage oss \
    --bucket https://<bucket>.<endpoint> \
    ... \
    localhost test

OSS provides multiple endpoints for each region. Depending on your network (e.g. public or internal network), you should use the appropriate endpoint. When you are running within Alibaba Cloud, you can omit <endpoint> in the --bucket option, and JuiceFS will choose the appropriate endpoint automatically. For example:

# Running within Alibaba Cloud
$ ./juicefs format \
    --storage oss \
    --bucket https://<bucket> \
    ... \
    localhost test

Tencent Cloud Object Storage

The naming rule for buckets in Tencent Cloud is <bucket>-<APPID>, so you must append the APPID to the bucket name. Please follow this document to learn how to get the APPID.

The full format of the --bucket option is https://<bucket>-<APPID>.cos.<region>.myqcloud.com. Replace <region> with the specific region code, e.g. the region code of Shanghai is ap-shanghai. You can find all available regions here. For example:

$ ./juicefs format \
    --storage cos \
    --bucket https://<bucket>-<APPID>.cos.<region>.myqcloud.com \
    ... \
    localhost test

When you are running within Tencent Cloud, you can omit the .cos.<region>.myqcloud.com part in the --bucket option, and JuiceFS will choose the appropriate endpoint automatically. For example:

# Running within Tencent Cloud
$ ./juicefs format \
    --storage cos \
    --bucket https://<bucket>-<APPID> \
    ... \
    localhost test
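The two Tencent Cloud forms differ only in whether the region suffix is present. As a sketch, the helper below appends the APPID to the bucket name and emits the full or the short form depending on whether a region is given. The function name cos_bucket_uri is hypothetical, and the APPID in the usage comment is a made-up placeholder.

```shell
# Hypothetical helper: print the --bucket URI for Tencent Cloud COS.
#   $1 = bucket name (without APPID)
#   $2 = APPID
#   $3 = optional region code, e.g. ap-shanghai
cos_bucket_uri() {
    if [ -n "${3:-}" ]; then
        # Full form: https://<bucket>-<APPID>.cos.<region>.myqcloud.com
        printf 'https://%s-%s.cos.%s.myqcloud.com\n' "$1" "$2" "$3"
    else
        # Short form for use within Tencent Cloud, where JuiceFS
        # chooses the endpoint automatically.
        printf 'https://%s-%s\n' "$1" "$2"
    fi
}

# Usage (1250000000 is a placeholder APPID):
#   cos_bucket_uri myjfs 1250000000 ap-shanghai
#   cos_bucket_uri myjfs 1250000000
```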

Huawei Cloud Object Storage Service

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.obs.<region>.myhuaweicloud.com. Replace <region> with the specific region code, e.g. the region code of Beijing 1 is cn-north-1. You can find all available regions here. For example:

$ ./juicefs format \
    --storage obs \
    --bucket https://<bucket>.obs.<region>.myhuaweicloud.com \
    ... \
    localhost test

When you are running within Huawei Cloud, you can omit the .obs.<region>.myhuaweicloud.com part in the --bucket option, and JuiceFS will choose the appropriate endpoint automatically. For example:

# Running within Huawei Cloud
$ ./juicefs format \
    --storage obs \
    --bucket https://<bucket> \
    ... \
    localhost test

Baidu Object Storage

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.<region>.bcebos.com. Replace <region> with the specific region code, e.g. the region code of Beijing is bj. You can find all available regions here. For example:

$ ./juicefs format \
    --storage bos \
    --bucket https://<bucket>.<region>.bcebos.com \
    ... \
    localhost test

When you are running within Baidu Cloud, you can omit the .<region>.bcebos.com part in the --bucket option, and JuiceFS will choose the appropriate endpoint automatically. For example:

# Running within Baidu Cloud
$ ./juicefs format \
    --storage bos \
    --bucket https://<bucket> \
    ... \
    localhost test

Kingsoft Cloud Standard Storage Service

Please follow this document to learn how to get the access key and secret key.

KS3 provides multiple endpoints for each region. Depending on your network (e.g. public or internal network), you should use the appropriate endpoint. For example:

$ ./juicefs format \
    --storage ks3 \
    --bucket https://<bucket>.<endpoint> \
    ... \
    localhost test

Meituan Storage Service

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.<endpoint>. Replace <endpoint> with the specific value, e.g. mtmss.com. You can find all available endpoints here. For example:

$ ./juicefs format \
    --storage mss \
    --bucket https://<bucket>.<endpoint> \
    ... \
    localhost test

NetEase Object Storage

Please follow this document to learn how to get the access key and secret key.

NOS provides multiple endpoints for each region. Depending on your network (e.g. public or internal network), you should use the appropriate endpoint. For example:

$ ./juicefs format \
    --storage nos \
    --bucket https://<bucket>.<endpoint> \
    ... \
    localhost test

QingStor Object Storage

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.<region>.qingstor.com. Replace <region> with the specific region code, e.g. the region code of Beijing 3-A is pek3a. You can find all available regions here. For example:

$ ./juicefs format \
    --storage qingstor \
    --bucket https://<bucket>.<region>.qingstor.com \
    ... \
    localhost test

Qiniu Cloud Object Storage

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.s3-<region>.qiniucs.com. Replace <region> with the specific region code, e.g. the region code of China East is cn-east-1. You can find all available regions here. For example:

$ ./juicefs format \
    --storage qiniu \
    --bucket https://<bucket>.s3-<region>.qiniucs.com \
    ... \
    localhost test

Sina Cloud Storage

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.stor.sinaapp.com. For example:

$ ./juicefs format \
    --storage scs \
    --bucket https://<bucket>.stor.sinaapp.com \
    ... \
    localhost test

CTYun Object-Oriented Storage

Please follow this document to learn how to get the access key and secret key.

The --bucket option format is https://<bucket>.oss-<region>.ctyunapi.cn. Replace <region> with the specific region code, e.g. the region code of Chengdu is sccd. You can find all available regions here. For example:

$ ./juicefs format \
    --storage oos \
    --bucket https://<bucket>.oss-<region>.ctyunapi.cn \
    ... \
    localhost test

ECloud (China Mobile Cloud) Object Storage

Please follow this document to learn how to get the access key and secret key.

ECloud Object Storage provides multiple endpoints for each region. Depending on your network (e.g. public or internal network), you should use the appropriate endpoint. For example:

$ ./juicefs format \
    --storage eos \
    --bucket https://<bucket>.<endpoint> \
    ... \
    localhost test

SpeedyCloud Object Storage

Writing ...

UCloud US3

Please follow this document to learn how to get the access key and secret key.

US3 (formerly UFile) provides multiple endpoints for each region. Depending on your network (e.g. public or internal network), you should use the appropriate endpoint. For example:

$ ./juicefs format \
    --storage ufile \
    --bucket https://<bucket>.<endpoint> \
    ... \
    localhost test

Ceph RADOS

The Ceph Storage Cluster has a messaging layer protocol that enables clients to interact with a Ceph Monitor and a Ceph OSD Daemon. The librados API enables you to interact with these two types of daemons.

JuiceFS supports the use of native Ceph APIs based on librados. You need to install the librados library and build the juicefs binary separately.

First, install librados:

# Debian based system
$ sudo apt-get install librados-dev

# RPM based system
$ sudo yum install librados-devel

Then compile JuiceFS for Ceph (ensure you have Go 1.14+ and GCC 5.4+):

$ make juicefs.ceph

The --bucket option format is ceph://<pool-name>. A pool is a logical partition for storing objects. You may need to create a pool first. The value of the --access-key option is the Ceph cluster name; the default cluster name is ceph. The value of the --secret-key option is the Ceph client user name; the default user name is client.admin.

To connect to the Ceph Monitor, librados reads the Ceph configuration file by searching the default locations, and the first one found is used. The locations are:

  • CEPH_CONF environment variable
  • /etc/ceph/ceph.conf
  • ~/.ceph/config
  • ceph.conf in the current working directory
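When a connection fails, it can help to check which configuration file would be picked up. The sketch below walks the four locations in the order listed above and prints the first one that exists. The function name find_ceph_conf is hypothetical, and this only mirrors the search order as described here; it is an assumption that librados behaves identically in edge cases such as CEPH_CONF pointing to a missing file.

```shell
# Hypothetical helper: print the first Ceph configuration file found,
# following the search order described above.
find_ceph_conf() {
    for candidate in \
        "${CEPH_CONF:-}" \
        /etc/ceph/ceph.conf \
        "${HOME:-}/.ceph/config" \
        ./ceph.conf
    do
        # Skip the CEPH_CONF slot when the variable is unset or empty.
        if [ -n "$candidate" ] && [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no Ceph configuration file found" >&2
    return 1
}

# Usage:
#   find_ceph_conf
```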

The example command is:

$ ./juicefs.ceph format \
    --storage ceph \
    --bucket ceph://<pool-name> \
    --access-key <cluster-name> \
    --secret-key <user-name> \
    ... \
    localhost test

Ceph Object Gateway (RGW)

Ceph Object Gateway is an object storage interface built on top of librados to provide applications with a RESTful gateway to Ceph Storage Clusters. Ceph Object Gateway supports an S3-compatible interface, so you can set --storage to s3 directly.

The --bucket option format is http://<bucket>.<endpoint> (virtual hosted-style). For example:

$ ./juicefs format \
    --storage s3 \
    --bucket http://<bucket>.<endpoint> \
    ... \
    localhost test

Swift

OpenStack Swift is a distributed object storage system designed to scale from a single machine to thousands of servers. Swift is optimized for multi-tenancy and high concurrency. Swift is ideal for backups, web and mobile content, and any other unstructured data that can grow without bound.

The --bucket option format is http://<container>.<endpoint>. A container defines a namespace for objects. Currently, JuiceFS only supports Swift V1 authentication. The value of the --access-key option is the username, and the value of the --secret-key option is the password. For example:

$ ./juicefs format \
    --storage swift \
    --bucket http://<container>.<endpoint> \
    --access-key <username> \
    --secret-key <password> \
    ... \
    localhost test

MinIO

MinIO is an open source, high performance object storage. It is API compatible with Amazon S3. You need to set the --storage option to minio. Currently, JuiceFS only supports path-style URIs when using MinIO storage. For example (<endpoint> may look like 1.2.3.4:9000):

$ ./juicefs format \
    --storage minio \
    --bucket http://<endpoint>/<bucket> \
    ... \
    localhost test

HDFS

HDFS is the file system for Hadoop, which can be used as the object store for JuiceFS. When HDFS is used, --access-key can be used to specify the username, and hdfs is usually the default superuser. For example:

$ ./juicefs format \
    --storage hdfs \
    --bucket namenode1:8020 \
    --access-key hdfs \
    localhost test

When --access-key is not specified during formatting, JuiceFS will use the current user of juicefs mount or the Hadoop SDK to access HDFS. If the current user doesn't have enough permission to read and write the blocks in HDFS, it will hang and eventually fail with an IO error.

JuiceFS will try to load configurations for HDFS client based on $HADOOP_CONF_DIR or $HADOOP_HOME. If an empty value is provided to --bucket, the default HDFS found in Hadoop configurations will be used.

For HA cluster, the addresses of NameNodes can be specified together like this: --bucket=namenode1:port,namenode2:port.
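For an HA cluster the --bucket value is simply the NameNode addresses joined by commas. A minimal sketch, assuming a hypothetical helper name hdfs_bucket and placeholder host names:

```shell
# Hypothetical helper: join NameNode addresses into one --bucket value.
# The function body is a subshell, so changing IFS does not affect the caller.
hdfs_bucket() (
    IFS=,
    # "$*" joins all arguments with the first character of IFS.
    printf '%s\n' "$*"
)

# Usage:
#   ./juicefs format --storage hdfs \
#       --bucket "$(hdfs_bucket namenode1:8020 namenode2:8020)" \
#       --access-key hdfs \
#       localhost test
```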

Redis

Writing ...

Local disk

When creating JuiceFS storage, if no storage type is specified, the local disk will be used to store data by default. The default storage path for root user is /var/jfs, and ~/.juicefs/local is for ordinary users.
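The rule for the default path can be written down as a small sketch. The function name jfs_local_default_path is hypothetical; it only restates the rule from the paragraph above (root uses /var/jfs, other users use ~/.juicefs/local) and is not taken from the JuiceFS source code.

```shell
# Hypothetical helper: print the default local storage path described above.
#   $1 = optional numeric user ID (defaults to the current user's ID)
jfs_local_default_path() {
    uid=${1:-$(id -u)}
    if [ "$uid" -eq 0 ]; then
        # root user
        echo /var/jfs
    else
        # ordinary user: ~/.juicefs/local
        echo "${HOME:-}/.juicefs/local"
    fi
}

# Usage:
#   ls -l "$(jfs_local_default_path)"
```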

For example, the following command uses a local Redis database and the local disk to create a JuiceFS storage named test:

$ ./juicefs format redis://localhost:6379/1 test

Local storage is only used to understand and experience the basic functions of JuiceFS. The created JuiceFS storage cannot be mounted by other clients in the network and can only be used on a stand-alone machine.

If you need to evaluate JuiceFS, it is recommended to use object storage services.

Note: JuiceFS storage created using local storage cannot be mounted by other hosts on the network. This is because the data sharing function of JuiceFS relies on the object storage and metadata service that can be accessed by all clients. If the storage service and metadata service used when creating JuiceFS storage cannot be accessed by other clients in the network, other clients cannot mount and use the JuiceFS storage.
