By Herald Yu
I want to create a file system with JuiceFS, using DigitalOcean managed Redis for metadata and Spaces for data. Does anyone have an idea how to do this?
Accepted Answer
When your business system needs to store and share a large amount of unstructured data in a distributed computing environment, the open source JuiceFS file system is worth considering.
JuiceFS is a cloud-native distributed file system designed specifically for large-scale data storage scenarios. It is released under the AGPLv3. Any enterprise or individual can freely use JuiceFS under the terms of that license.
In the JuiceFS architecture, all data lives in the cloud: file data is stored in object storage, and the corresponding metadata is stored separately in a database. For object storage, it supports almost all object storage services. For databases, it supports Redis, TiKV, PostgreSQL, MySQL, MariaDB, and others, and more databases will be supported in the future.
Local storage has well-known limitations, such as capacity exhaustion, single points of failure, and difficulty in sharing. These problems become particularly obvious when storing large-scale data. JuiceFS is designed to avoid them. First, it stores all data in object storage, so capacity is no longer bounded by a local disk and is practically unlimited. Second, because all data and metadata are stored in the cloud, the file system is well suited to being mounted and shared by multiple hosts at the same time. Finally, the failure of any single host that mounts JuiceFS does not affect the stored data or the other hosts.
JuiceFS is designed for the cloud. When using the out-of-the-box storage and database services of the cloud platform, the installation can be completed in a few minutes. This article is oriented towards DigitalOcean and introduces how to quickly and easily install and use JuiceFS on the cloud computing platform.
JuiceFS is driven by a combination of storage and database, so you need to prepare three things: a cloud server, object storage, and a database.
A cloud server on DigitalOcean is called a Droplet. You don’t need to purchase a new Droplet just to use JuiceFS. If you already have a Droplet that needs JuiceFS storage, just install the JuiceFS client on it.
JuiceFS has no special requirements for hardware, and Droplets of any specification can be used stably. However, it is recommended to choose a better-performing SSD and reserve at least 1GB of capacity for JuiceFS as a local cache.
JuiceFS supports Linux, BSD, macOS, and Windows. In this article, we use Ubuntu Server 20.04.
JuiceFS uses object storage to store all data. Using Spaces on DigitalOcean is the easiest solution. Spaces is an S3-compatible object storage service that works out of the box. It is recommended to create the Space in the same region as the Droplet so that you get the best access speed and avoid additional traffic charges.
Of course, you can also use object storage services on other platforms, or build your own with Ceph or MinIO on a Droplet. In short, you are free to choose the object storage you want to use, as long as the JuiceFS client can access the object storage API.
Here, I created a Space named `juicefs` in the Singapore region (`sgp1`); its access address is `https://juicefs.sgp1.digitaloceanspaces.com`.
In addition, you need to create `Spaces access keys` in the API menu; JuiceFS uses them to access the Spaces API.
Unlike a local file system, JuiceFS stores all the metadata corresponding to the data in an independent database. This separation helps performance hold up as the amount of stored data grows.
Currently, JuiceFS supports common databases such as Redis, TiKV, MySQL/MariaDB, PostgreSQL, and SQLite, and it is also continuing to develop support for other databases. If the database you need is not yet supported, please submit an Issue for feedback.
In terms of performance, scale, and reliability, each database has its own advantages and disadvantages, and you should choose according to actual scenarios.
Please don’t worry about the choice of database. The JuiceFS client supports metadata migration. You can easily export metadata from one database and migrate it to other databases.
In this article, we use DigitalOcean’s Redis 6 database managed service, select the region Singapore, and select the same VPC private network as the existing Droplet. It takes about 5 minutes to create a Redis cluster. We follow the setup wizard to initialize the database cluster.
By default, the Redis cluster allows all inbound connections. For security reasons, in the security settings section of the setup wizard, use `Add trusted sources` to select the Droplets that should have access, so that only the selected hosts can reach the Redis cluster.
For the eviction policy, it is recommended to select `noeviction`, so that when memory is exhausted Redis only reports errors and does not evict any data.
Note: To ensure the safety and integrity of metadata, do not select `allkeys-lru` or `allkeys-random` as the eviction policy.
The access address of the Redis cluster can be found under `Connection Details` in the console. If all of your computing resources are in DigitalOcean, it is recommended to connect over the VPC private network, which maximizes security.
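The host, port, user name, and password shown under Connection Details are the pieces of the metadata URL that the JuiceFS commands later in this answer take. As a minimal sketch, the helper below joins them into a `rediss://` URL (the `rediss` scheme means Redis over TLS). `build_meta_url` is a hypothetical name, not part of JuiceFS, and the host name in the example is a placeholder. The sketch does not percent-encode anything, so it assumes the password contains no characters such as `@` or `/`.

```shell
# Hypothetical helper (not part of JuiceFS): assemble a metadata URL
# for a TLS-enabled Redis endpoint.
# Arguments: user password host port db-number
build_meta_url() {
    printf 'rediss://%s:%s@%s:%s/%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example with placeholder values; replace them with your Connection Details.
build_meta_url default your-password private-db-redis.example.com 25061 1
```

The printed URL can then be used wherever a command in this answer expects the database address.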
I am currently using Ubuntu Server 20.04. Execute the following commands in sequence to install the latest version of the client.
Look up the latest release tag and store it in a temporary shell variable:
$ JFS_LATEST_TAG=$(curl -s https://api.github.com/repos/juicedata/juicefs/releases/latest | grep 'tag_name' | cut -d '"' -f 4 | tr -d 'v')
Download the latest version of the client software package adapted to the current system:
$ wget "https://github.com/juicedata/juicefs/releases/download/v${JFS_LATEST_TAG}/juicefs-${JFS_LATEST_TAG}-linux-amd64.tar.gz"
Unzip the installation package:
$ mkdir juice && tar -zxvf "juicefs-${JFS_LATEST_TAG}-linux-amd64.tar.gz" -C juice
Install the client to `/usr/local/bin`:
$ sudo install juice/juicefs /usr/local/bin
Run `juicefs`. If it prints the help information shown below, the client is installed successfully.
$ juicefs
NAME:
juicefs - A POSIX file system built on Redis and object storage.
USAGE:
juicefs [global options] command [command options] [arguments...]
VERSION:
0.16.2 (2021-08-25T04:01:15Z 29d6fee)
COMMANDS:
format format a volume
mount mount a volume
umount unmount a volume
gateway S3-compatible gateway
sync sync between two storage
rmr remove directories recursively
info show internal information for paths or inodes
bench run benchmark to read/write/stat big/small files
gc collect any leaked objects
fsck Check consistency of file system
profile analyze access log
stats show runtime stats
status show status of JuiceFS
warmup build cache for target directories/files
dump dump metadata into a JSON file
load load metadata from a previously dumped JSON file
help, h Shows a list of commands or help for one command
GLOBAL OPTIONS:
--verbose, --debug, -v enable debug log (default: false)
--quiet, -q only warning and errors (default: false)
--trace enable trace log (default: false)
--no-agent Disable pprof (:6060) and gops (:6070) agent (default: false)
--help, -h show help (default: false)
--version, -V print only the version (default: false)
COPYRIGHT:
AGPLv3
In addition, you can also visit the JuiceFS GitHub Releases page to select other versions for manual installation.
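For a manual installation of a specific version, the download URL follows the same pattern as the `wget` command above. The sketch below builds that URL from a version number and a machine type. Both function names are hypothetical, and the `arm64` mapping assumes that ARM builds are published under the same file naming pattern as the `amd64` build used in this answer.

```shell
# Hypothetical helpers, not part of the JuiceFS client.

# Map a machine type (as printed by `uname -m`) to the architecture
# suffix used in release file names.
jfs_arch() {
    case "$1" in
        x86_64|amd64)  echo amd64 ;;
        aarch64|arm64) echo arm64 ;;   # assumption: same naming pattern as amd64
        *) echo "unsupported architecture: $1" >&2; return 1 ;;
    esac
}

# Build the download URL for a version given without the leading "v".
jfs_download_url() {
    version=$1
    arch=$(jfs_arch "$2") || return 1
    echo "https://github.com/juicedata/juicefs/releases/download/v${version}/juicefs-${version}-linux-${arch}.tar.gz"
}

# Example: the version shown in the help output above, on a 64-bit x86 host.
jfs_download_url 0.16.2 x86_64
```

On the target host, `jfs_download_url 0.16.2 "$(uname -m)"` picks the suffix from the running system, and the result can be passed to `wget`.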
To create a file system, use the `format` subcommand. Its syntax is:
juicefs format [command options] META-URL NAME
The following command creates a file system named `mystor`:
juicefs format \
--storage space \
--bucket https://juicefs.sgp1.digitaloceanspaces.com \
--access-key <your-access-key-id> \
--secret-key <your-access-key-secret> \
rediss://default:your-password@private-db-redis-sgp1-03138-do-user-2500071-0.b.db.ondigitalocean.com:25061/1 \
mystor
Parameter Description:
`--storage`: Specifies the data storage engine, here `space`. See the full list of [supported storage](https://github.com/juicedata/juicefs/blob/main/docs/en/how_to_setup_object_storage.md).
`--bucket`: Specifies the bucket access address.
`--access-key` and `--secret-key`: Specify the access key pair used to access the object storage API.
META-URL: The Redis address uses the `rediss://` protocol header, which connects over TLS. The `/1` at the end of the link selects Redis database number 1.
If you see output similar to the following, it means that the file system is created successfully.
2021/08/23 16:36:28.450686 juicefs[2869028] <INFO>: Meta address: rediss://default@private-db-redis-sgp1-03138-do-user-2500071-0.b.db.ondigitalocean.com:25061/1
2021/08/23 16:36:28.481251 juicefs[2869028] <WARNING>: AOF is not enabled, you may lose data if Redis is not shutdown properly.
2021/08/23 16:36:28.481763 juicefs[2869028] <INFO>: Ping redis: 331.706µs
2021/08/23 16:36:28.482266 juicefs[2869028] <INFO>: Data uses space://juicefs/mystor/
2021/08/23 16:36:28.534677 juicefs[2869028] <INFO>: Volume is formatted as {Name:mystor UUID:6b0452fc-0502-404c-b163-c9ab577ec766 Storage:space Bucket:https://juicefs.sgp1.digitaloceanspaces.com AccessKey:7G7WQBY2QUCBQC5H2DGK SecretKey:removed BlockSize:4096 Compression:none Shards:0 Partitions:0 Capacity:0 Inodes:0 EncryptKey:}
To mount a file system, use the `mount` subcommand, and use the `-d` parameter to mount it as a daemon. The following command mounts the newly created file system to the `mnt` directory under the current directory:
$ sudo juicefs mount -d \
rediss://default:your-password@private-db-redis-sgp1-03138-do-user-2500071-0.b.db.ondigitalocean.com:25061/1 mnt
`sudo` is used for the mount operation so that juicefs has permission to create a cache directory under `/var/`. Please note that when mounting the file system, you only need to specify the database address and the mount point, not the name of the file system.
If you see an output similar to the following, it means that the file system is mounted successfully.
2021/08/23 16:39:14.202151 juicefs[2869081] <INFO>: Meta address: rediss://default@private-db-redis-sgp1-03138-do-user-2500071-0.b.db.ondigitalocean.com:25061/1
2021/08/23 16:39:14.234925 juicefs[2869081] <WARNING>: AOF is not enabled, you may lose data if Redis is not shutdown properly.
2021/08/23 16:39:14.235536 juicefs[2869081] <INFO>: Ping redis: 446.247µs
2021/08/23 16:39:14.236231 juicefs[2869081] <INFO>: Data use space://juicefs/mystor/
2021/08/23 16:39:14.236540 juicefs[2869081] <INFO>: Disk cache (/var/jfsCache/6b0452fc-0502-404c-b163-c9ab577ec766/): capacity (1024 MB), free ratio (10%), max pending pages (15)
2021/08/23 16:39:14.738416 juicefs[2869081] <INFO>: OK, mystor is ready at mnt
Use the `df` command to see the mounting status of the file system:
$ df -Th
Filesystem      Type          Size  Used Avail Use% Mounted on
JuiceFS:mystor  fuse.juicefs  1.0P   64K  1.0P   1% /home/herald/mnt
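In a script it can be handy to test for the mount instead of reading `df` output by eye. The sketch below looks for an entry of type `fuse.juicefs` (the type shown in the `df` output above) in the kernel's mount table. `is_juicefs_mount` is a hypothetical helper, and its optional second argument exists only so the function can be tried against a sample table without a real mount.

```shell
# Hypothetical helper: succeed if DIR is listed as a fuse.juicefs mount.
# Arguments: DIR [MOUNT-TABLE], where MOUNT-TABLE defaults to /proc/mounts.
is_juicefs_mount() {
    dir=$1
    table=${2:-/proc/mounts}
    # Field 2 is the mount point, field 3 is the file system type.
    awk -v d="$dir" '$2 == d && $3 == "fuse.juicefs" { found = 1 } END { exit !found }' "$table"
}

if is_juicefs_mount /home/herald/mnt; then
    echo "mounted"
else
    echo "not mounted"
fi
```

The mount point must be given as an absolute path, because that is how it appears in the mount table.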
As you can see from the output of the mount command, JuiceFS sets 1024 MB as the local cache by default. A larger cache can give JuiceFS better performance. You can set the cache size (in MiB) with the `--cache-size` option when mounting a file system. For example, to set a local cache of roughly 20GB:
$ sudo juicefs mount -d --cache-size 20000 \
rediss://default:your-password@private-db-redis-sgp1-03138-do-user-2500071-0.b.db.ondigitalocean.com:25061/1 mnt
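Because `--cache-size` is given in MiB, a value of 20000 is about 19.5 GiB, while exactly 20 GiB is 20480 MiB. As a small sketch of the arithmetic, the hypothetical helper below converts a size in GiB to the MiB value the option expects.

```shell
# Hypothetical helper: convert a cache size in GiB to MiB.
gib_to_mib() {
    echo $(( $1 * 1024 ))
}

gib_to_mib 20   # prints 20480
```

The result can be substituted directly, for example `--cache-size "$(gib_to_mib 20)"`.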
After the file system is mounted, you can store data in the `~/mnt` directory just like using a local hard disk.
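A quick way to confirm that the mount point behaves like a normal directory is to write a file and read it back. The sketch below does that for any directory. `check_rw` is a hypothetical helper, and the example runs it against the system temporary directory as a stand-in for the mount point.

```shell
# Hypothetical helper: write a small file into DIR, read it back,
# remove it, and succeed only if the content matches.
check_rw() {
    dir=$1
    probe="$dir/.rw-probe-$$"
    expected="juicefs write test"
    echo "$expected" > "$probe" || return 1
    actual=$(cat "$probe") || return 1
    rm -f "$probe"
    [ "$actual" = "$expected" ]
}

# Stand-in directory for demonstration; use your mount point instead.
check_rw "${TMPDIR:-/tmp}" && echo "read/write OK"
```

With the file system mounted, `check_rw ~/mnt` performs the same check against JuiceFS.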
Use the `status` subcommand to view the basic information and connection status of a file system. You only need to specify the database URL.
$ juicefs status rediss://default:your-password@private-db-redis-sgp1-03138-do-user-2500071-0.b.db.ondigitalocean.com:25061/1
2021/08/23 16:48:48.567046 juicefs[2869156] <INFO>: Meta address: rediss://default@private-db-redis-sgp1-03138-do-user-2500071-0.b.db.ondigitalocean.com:25061/1
2021/08/23 16:48:48.597513 juicefs[2869156] <WARNING>: AOF is not enabled, you may lose data if Redis is not shutdown properly.
2021/08/23 16:48:48.598193 juicefs[2869156] <INFO>: Ping redis: 491.003µs
{
"Setting": {
"Name": "mystor",
"UUID": "6b0452fc-0502-404c-b163-c9ab577ec766",
"Storage": "space",
"Bucket": "https://juicefs.sgp1.digitaloceanspaces.com",
"AccessKey": "7G7WQBY2QUCBQC5H2DGK",
"SecretKey": "removed",
"BlockSize": 4096,
"Compression": "none",
"Shards": 0,
"Partitions": 0,
"Capacity": 0,
"Inodes": 0
},
"Sessions": [
{
"Sid": 1,
"Heartbeat": "2021-08-23T16:46:14+08:00",
"Version": "0.16.2 (2021-08-25T04:01:15Z 29d6fee)",
"Hostname": "ubuntu-s-1vcpu-1gb-sgp1-01",
"MountPoint": "/home/herald/mnt",
"ProcessID": 2869091
},
{
"Sid": 2,
"Heartbeat": "2021-08-23T16:47:59+08:00",
"Version": "0.16.2 (2021-08-25T04:01:15Z 29d6fee)",
"Hostname": "ubuntu-s-1vcpu-1gb-sgp1-01",
"MountPoint": "/home/herald/mnt",
"ProcessID": 2869146
}
]
}
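Each entry under "Sessions" corresponds to one client session, so counting the "Sid" fields gives the number of sessions the metadata engine currently knows about. The sketch below does this with `grep` on the output format shown above. `count_sessions` is a hypothetical helper, and it assumes each "Sid" field sits on its own line, as in that output.

```shell
# Hypothetical helper: count "Sid" entries in `juicefs status` output
# read from standard input.
count_sessions() {
    grep -c '"Sid"'
}

# Sample input shaped like the output above.
count_sessions <<'EOF'
{
  "Sessions": [
    { "Sid": 1, "Hostname": "ubuntu-s-1vcpu-1gb-sgp1-01" },
    { "Sid": 2, "Hostname": "ubuntu-s-1vcpu-1gb-sgp1-01" }
  ]
}
EOF
```

In practice, pipe the output of the `juicefs status` command into `count_sessions`.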
Use the `umount` subcommand to unmount a file system, for example:
$ sudo juicefs umount ~/mnt
Note: Forcibly unmounting a file system that is in use may cause data damage or loss, so proceed with care.
If you don’t want to manually remount JuiceFS every time you restart the system, you can set up automatic mounting.
First, copy the `juicefs` client to the `/sbin/` directory under the name `mount.juicefs`:
$ sudo cp /usr/local/bin/juicefs /sbin/mount.juicefs
Edit the `/etc/fstab` configuration file and add a new record:
rediss://default:your-password@private-db-redis-sgp1-03138-do-user-2500071-0.b.db.ondigitalocean.com:25061/1 /home/herald/mnt juicefs _netdev,cache-size=20480 0 0
In the mount options, `cache-size=20480` allocates 20GiB of local disk space as the local cache of JuiceFS. Please decide the cache size according to your actual hardware. You can adjust the FUSE mount options in the above configuration according to your needs.
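The record has the usual six `/etc/fstab` fields: the metadata URL as the device, the mount point, `juicefs` as the type, the options, and two zeros. As a sketch, the hypothetical helper below prints such a record from its parts so it can be reviewed before it is added to the file. The URL in the example is a placeholder.

```shell
# Hypothetical helper: print an /etc/fstab record for a JuiceFS mount.
# Arguments: metadata-URL mount-point cache-size-in-MiB
jfs_fstab_entry() {
    printf '%s %s juicefs _netdev,cache-size=%s 0 0\n' "$1" "$2" "$3"
}

# Example with a placeholder metadata URL.
jfs_fstab_entry 'rediss://default:your-password@redis.example.com:25061/1' /home/herald/mnt 20480
```

After adding the record, `sudo mount -a` attempts to mount everything listed in `/etc/fstab`, which is a way to test the entry without rebooting.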
The JuiceFS file system can be mounted by multiple cloud servers at the same time, and there is no requirement on the geographic location of those servers. This makes it easy to share data in real time between servers on the same platform, across cloud platforms, and between public and private clouds.
Not only that, but the shared mount of JuiceFS can also provide a strong data consistency guarantee. When multiple servers mount the same file system, the writes confirmed on the file system will be visible in real-time on all hosts.
To use the shared mount, it is important to ensure that the database and object storage services that make up the file system can be accessed by every host that mounts it. In the demonstration environment of this article, the Spaces object storage is open to the entire Internet, and it can be read and written through the API as long as the correct access key is used. But for the Redis database cluster managed by DigitalOcean, you need to configure the access policy carefully to ensure that hosts outside the platform have access permissions.
When you mount the same file system on multiple hosts, first create the file system on any one host, then install the JuiceFS client on every host and use the same database address to mount it with the `mount` command. Pay special attention to the fact that the file system only needs to be created once; do not repeat the creation step on other hosts.
This article introduces the basics of installing and using JuiceFS on DigitalOcean, using Spaces object storage, and the platform-managed Redis database cluster to create and mount a file system.
If you are interested, you can also try to create file systems using object storage and cloud databases on different platforms. In addition, if you are worried about the reliability of Redis, you can also try databases such as MySQL, TiKV, and PostgreSQL. Different databases will give you completely different experiences.