I want to create a file system with JuiceFS, backed by DigitalOcean managed Redis and Spaces. Does anyone have an idea how to do this?
When your business system needs to store and share a large amount of unstructured data in a distributed computing environment, the open source JuiceFS storage is worth considering.
What is JuiceFS
JuiceFS is a cloud-native distributed file system designed specifically for large-scale data storage scenarios. It is released under the AGPLv3, and any enterprise or individual can use it freely under the terms of that license.
JuiceFS stores all data in the cloud: file data is stored mainly in object storage, and the corresponding metadata is stored separately in a database. For object storage, it supports almost all object storage services. For databases, it supports Redis, TiKV, PostgreSQL, MySQL, MariaDB, etc., and more databases will be supported in the future.
Local storage has well-known limitations, such as capacity exhaustion, single points of failure, and difficulty in sharing, and these problems become particularly obvious when storing large-scale data. JuiceFS is designed to avoid them. First, it uses object storage for all data, which removes the capacity ceiling and makes the storage space effectively unlimited. Second, all data and metadata are stored in the cloud, which makes the file system well suited to being mounted and shared by multiple hosts at the same time. In addition, the failure of any single host that mounts JuiceFS does not affect the stored data or the other hosts.
JuiceFS is designed for the cloud. With the out-of-the-box storage and database services of a cloud platform, the installation can be completed in a few minutes. This article focuses on DigitalOcean and shows how to quickly install and use JuiceFS on that platform.
Requirements
JuiceFS is driven by a combination of storage and database, so you need to prepare:
1. Cloud Server
A cloud server on DigitalOcean is called a Droplet. You don't need to purchase a new Droplet just to use JuiceFS. If you already have a Droplet in use, simply install the JuiceFS client on whichever server needs JuiceFS storage.
Hardware
JuiceFS has no special hardware requirements and runs stably on Droplets of any specification. However, it is recommended to choose a well-performing SSD and reserve at least 1 GB of capacity for JuiceFS as a local cache.
Operating System
JuiceFS supports Linux, BSD, macOS, and Windows. In this article, we use Ubuntu Server 20.04.
2. Object Storage
JuiceFS uses object storage to store all data. Using Spaces on DigitalOcean is the easiest solution. Spaces is an S3-compatible object storage service that works out of the box. It is recommended to create the Space in the same region as the Droplet, so that you get the best access speed and avoid additional traffic charges.
Of course, you can also use object storage services on other platforms, or run Ceph or MinIO yourself on a Droplet. In short, you are free to choose the object storage you want to use, as long as the JuiceFS client can access the object storage API.
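Here, I created a Space named `juicefs` in the Singapore region (`sgp1`). Its access address follows the Spaces pattern `https://<space-name>.<region>.digitaloceanspaces.com`, which here gives `https://juicefs.sgp1.digitaloceanspaces.com`.
In addition, you need to create Spaces access keys in the API menu; JuiceFS uses them to access the Spaces API.
3. Database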
Unlike a local file system, JuiceFS stores all the metadata corresponding to the data in an independent database, so that its performance advantage becomes more pronounced as the volume of stored data grows.
Currently, JuiceFS supports common databases such as Redis, TiKV, MySQL/MariaDB, PostgreSQL, and SQLite, and it is also continuing to develop support for other databases. If the database you need is not yet supported, please submit an Issue for feedback.
In terms of performance, scale, and reliability, each database has its own advantages and disadvantages, and you should choose according to actual scenarios.
Please don’t worry about the choice of database. The JuiceFS client supports metadata migration. You can easily export metadata from one database and migrate it to other databases.
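As a rough sketch of such a migration: the JuiceFS client has `dump` and `load` subcommands, where `dump` exports the metadata to a file and `load` imports that file into another database, which should be empty. The two URLs passed in and the `meta.json` file name are placeholders, not values from this article, and the `run` helper is only an illustrative convenience that prints the commands instead of executing them when `DRY_RUN=1`.

```shell
# Print the command instead of running it when DRY_RUN=1 (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Export metadata from one database and import it into another.
# $1: source metadata URL
# $2: destination metadata URL (assumed to be an empty database)
# $3: intermediate dump file (hypothetical name, defaults to meta.json)
migrate_metadata() {
  src_url="$1"
  dst_url="$2"
  dump_file="${3:-meta.json}"
  run juicefs dump "$src_url" "$dump_file" &&
    run juicefs load "$dst_url" "$dump_file"
}
```

Calling `DRY_RUN=1 migrate_metadata "<old-url>" "<new-url>"` first is a cheap way to review the two commands before running them for real.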
In this article, we use DigitalOcean's managed Redis 6 database service, select the Singapore region, and select the same VPC private network as the existing Droplet. It takes about 5 minutes to create a Redis cluster. We then follow the setup wizard to initialize the database cluster.
By default, the Redis cluster allows all inbound connections. For security reasons, in the security settings section of the setup wizard, add the Droplets that need access to the Redis cluster under `Add trusted sources`, so that only the selected hosts can reach the cluster.
For the eviction policy, it is recommended to select `noeviction`, so that when memory is exhausted Redis only reports errors and does not evict any data.
The access address of the Redis cluster can be found under `Connection Details` in the console. If all computing resources are in DigitalOcean, it is recommended to connect over the VPC private network, which maximizes security.
Installation and Use
1. Install JuiceFS client
I am currently using Ubuntu Server 20.04. The latest version of the client is installed in four steps, executed in sequence: check the current system and set temporary environment variables; download the latest client package for the current system; unpack the package; and install the client to `/usr/local/bin`.
Afterwards, run `juicefs`. If the command help information is returned, the client is installed successfully.
In addition, you can also visit the JuiceFS GitHub Releases page to select other versions for manual installation.
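The installation steps above can be sketched roughly as follows. This is an illustration under stated assumptions rather than the exact commands of the original answer: it assumes the release assets on GitHub are named `juicefs-<version>-linux-<arch>.tar.gz`, that `curl`, `wget`, and `tar` are available, and that the host is a 64-bit x86 Linux machine. The function names are made up for this example.

```shell
# Step 1: look up the latest release tag on GitHub and strip the leading "v".
latest_version() {
  curl -s https://api.github.com/repos/juicedata/juicefs/releases/latest |
    grep '"tag_name"' | cut -d '"' -f 4 | sed 's/^v//'
}

# Compose the download URL for a given version and architecture.
# Assumption: assets are named juicefs-<version>-linux-<arch>.tar.gz.
release_url() {
  version="$1"
  arch="${2:-amd64}"
  echo "https://github.com/juicedata/juicefs/releases/download/v${version}/juicefs-${version}-linux-${arch}.tar.gz"
}

# Steps 2-4: download, unpack into ./juice, and install to /usr/local/bin.
install_juicefs() {
  version="$(latest_version)"
  if [ -z "$version" ]; then
    echo "could not determine the latest JuiceFS version" >&2
    return 1
  fi
  url="$(release_url "$version")"
  wget "$url" &&
    mkdir -p juice &&
    tar -zxf "${url##*/}" -C juice &&
    sudo install juice/juicefs /usr/local/bin
}
```

Run `install_juicefs`, then `juicefs`, and check that the help text is printed.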
2. Create a file system
To create a file system, use the `format` subcommand. Its general form is `juicefs format [command options] META-URL NAME`, where `META-URL` is the database address and `NAME` is the name of the file system.
The file system created in this article is named `mystor`. Parameter description:
- `--storage`: Specify the data storage engine, here it is `space`. See all [supported storage](https://github.com/juicedata/juicefs/blob/main/docs/en/how_to_setup_object_storage.md).
- `--bucket`: Specify the bucket access address.
- `--access-key` and `--secret-key`: Specify the key pair for accessing the object storage API.
- Database address: DigitalOcean managed Redis is accessed over TLS, so the address uses the `rediss://` protocol header. The `/1` added at the end of the link selects Redis database No. 1.
If the command finishes without reporting an error, the file system has been created successfully.
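Putting the parameters above together, a creation command might look like the sketch below. The Space name `juicefs`, the region `sgp1`, and the file system name `mystor` come from this article. The `ACCESS_KEY` and `SECRET_KEY` environment variable names, the helper functions, and the Redis address passed in by the caller are assumptions made for illustration. With `DRY_RUN=1` the command is only printed.

```shell
# Print the command instead of running it when DRY_RUN=1 (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Spaces endpoints follow https://<space-name>.<region>.digitaloceanspaces.com.
space_endpoint() {
  echo "https://$1.$2.digitaloceanspaces.com"
}

# Create the file system.
# $1: metadata URL, e.g. rediss://default:<password>@<redis-host>:<port>/1
# $2: file system name
# ACCESS_KEY and SECRET_KEY (hypothetical variable names) are read from the
# environment so that the credentials do not have to be written into a script.
create_filesystem() {
  meta_url="$1"
  name="$2"
  run juicefs format \
    --storage space \
    --bucket "$(space_endpoint juicefs sgp1)" \
    --access-key "$ACCESS_KEY" \
    --secret-key "$SECRET_KEY" \
    "$meta_url" \
    "$name"
}
```

For example, after exporting the two keys, `create_filesystem "rediss://default:<password>@<redis-host>:<port>/1" mystor` creates the file system used in the rest of this article.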
3. Mount a file system
To mount a file system, use the `mount` subcommand, and use the `-d` parameter to mount it as a daemon. In this article, the newly created file system is mounted to the `mnt` directory under the current directory.
The mount operation is performed with `sudo` so that juicefs has permission to create a cache directory under `/var/`. Please note that when mounting the file system, you only need to specify the database address and the mount point, not the name of the file system.
If the command finishes without reporting an error, the file system has been mounted successfully.
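A minimal sketch of the mount step follows, with an optional local cache size. The `--cache-size` option takes a value in MiB, so the sketch converts from GiB first. The helper names are invented for this example, and the database address is a placeholder supplied by the caller. With `DRY_RUN=1` the command is only printed.

```shell
# Print the command instead of running it when DRY_RUN=1 (illustrative helper).
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Convert a cache size in GiB to the MiB value expected by --cache-size.
gib_to_mib() {
  echo $(( $1 * 1024 ))
}

# Mount as a daemon (-d).
# $1: metadata URL, $2: mount point,
# $3: optional cache size in GiB (omit it to keep the JuiceFS default).
mount_filesystem() {
  meta_url="$1"
  mount_point="$2"
  cache_gib="${3:-}"
  if [ -n "$cache_gib" ]; then
    run sudo juicefs mount -d --cache-size "$(gib_to_mib "$cache_gib")" "$meta_url" "$mount_point"
  else
    run sudo juicefs mount -d "$meta_url" "$mount_point"
  fi
}
```

For example, `mount_filesystem "<database-address>" mnt 20` mounts the file system on `mnt` with a 20 GiB cache, which corresponds to `--cache-size 20480`.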
Use the `df` command to check the mount status of the file system.
According to the output of the mount command, JuiceFS sets 1024 MB as the local cache by default. A larger cache can improve performance. You can set the cache size (in MiB) with the `--cache-size` option when mounting a file system; for example, `--cache-size 20480` sets a 20 GiB local cache.
After the file system is mounted, you can store data in the `~/mnt` directory just like using a local hard disk.
4. File system status
Use the `status` subcommand to view the basic information and connection status of a file system. You only need to specify the database URL, for example `juicefs status META-URL`.
5. Unmount a file system
Use the `umount` subcommand with the mount point to unmount a file system, for example `juicefs umount ~/mnt`.
6. Auto-mount at boot
If you don’t want to manually remount JuiceFS every time you restart the system, you can set up automatic mounting.
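Automatic mounting takes two steps: making the client available as a mount helper, and registering the file system in `/etc/fstab`. Both can be sketched as below. The function names are invented for this example, the source path `/usr/local/bin/juicefs` assumes the client was installed there, and the `_netdev` option (a standard hint that the mount needs the network) is an assumption added here rather than something stated in this article.

```shell
# Step 1: make the client available as the mount helper for type "juicefs".
# Assumes the client was installed to /usr/local/bin.
install_mount_helper() {
  sudo cp /usr/local/bin/juicefs /sbin/mount.juicefs
}

# Step 2: compose an /etc/fstab record.
# Fields: <metadata URL> <mount point> juicefs <options> 0 0
# $1: metadata URL, $2: absolute mount point,
# $3: cache size in MiB (defaults to 20480, i.e. 20 GiB).
fstab_entry() {
  meta_url="$1"
  mount_point="$2"
  cache_mib="${3:-20480}"
  printf '%s %s juicefs _netdev,cache-size=%s 0 0\n' \
    "$meta_url" "$mount_point" "$cache_mib"
}
```

One way to apply it is `fstab_entry "<database-address>" /home/<user>/mnt | sudo tee -a /etc/fstab`, followed by `sudo mount -a` to check the record without rebooting.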
First, copy the `juicefs` client to the `/sbin/` directory under the name `mount.juicefs`.
Then edit the `/etc/fstab` configuration file and add a new record consisting of the database address, the mount point, the file system type `juicefs`, and the mount options.
In the mount options, `cache-size=20480` allocates 20 GiB of local disk space as the local cache of JuiceFS. Decide the cache size according to the actual hardware. You can also adjust the FUSE mount options in the record according to your needs.
7. Multi-host shared
The JuiceFS file system can be mounted by multiple cloud servers at the same time, with no requirement on the geographic location of the servers. This makes it easy to share data in real time between servers on the same platform, across cloud platforms, and between public and private clouds.
Not only that, but the shared mount of JuiceFS can also provide a strong data consistency guarantee. When multiple servers mount the same file system, the writes confirmed on the file system will be visible in real-time on all hosts.
To use the shared mount, the database and object storage services that make up the file system must be reachable from every host that mounts it. In the demonstration environment of this article, the Spaces object storage is open to the entire Internet and can be read and written through the API as long as the correct access key is used. The Redis database cluster managed by DigitalOcean, however, needs a suitable access policy so that hosts outside the platform are allowed to connect.
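Before mounting on an additional host, it can help to confirm that the host can actually reach the database. The sketch below extracts the host and port from a database address of the form used in this article and probes the port. It assumes an `nc` (netcat) that supports the `-z` and `-w` options, and the function names are made up for this example.

```shell
# Extract "host:port" from a metadata URL such as
# rediss://user:password@host:port/1
meta_host_port() {
  rest="${1#*://}"     # drop the scheme
  rest="${rest##*@}"   # drop the credentials, if any
  echo "${rest%%/*}"   # drop the trailing /<database number>
}

# Return success if the database port is reachable from this host.
# Assumes an nc (netcat) that supports -z (probe only) and -w (timeout).
can_reach_database() {
  host_port="$(meta_host_port "$1")"
  nc -z -w 5 "${host_port%:*}" "${host_port##*:}"
}
```

For example, `can_reach_database "<database-address>" && echo reachable` on a host outside DigitalOcean shows whether the trusted sources setting already allows it.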
To mount the same file system on multiple hosts, first create the file system on any one host, then install the JuiceFS client on every host and mount it with the `mount` command using the same database address. Note that the file system only needs to be created once; do not repeat the creation on other hosts.
Summary
This article introduces the basics of installing and using JuiceFS on DigitalOcean, using Spaces object storage and the platform-managed Redis database cluster to create and mount a file system.
If you are interested, you can also try creating file systems with object storage and cloud databases on other platforms. In addition, if you are worried about the reliability of Redis, you can try databases such as MySQL, TiKV, and PostgreSQL. Each database comes with different trade-offs.