Why can't my MinIO servers find each other, and why do they have trouble with disks?

Posted on February 15, 2021

On my local machine everything works fine. Here is my docker-compose file:

services:
  minio1:
    image: minio/minio:RELEASE.2020-12-18T03-27-42Z
    volumes:
      - data1-1:/data1
      - data1-2:/data2
#      - ./docker/nginx/fullchain.pem:/root/.minio/certs/public.crt
#      - ./docker/nginx/privkey.pem:/root/.minio/certs/private.key
    expose:
      - "9000"
    environment:
      MINIO_ACCESS_KEY: KAJKUQ_TLIABIWEMKKANT0EN
      MINIO_SECRET_KEY: pJG-t0rZV5Xo11pd8XAu7IxHQyy0wCMVacW6SiONnjUxr01w
    command: server http://minio{1...4}/data{1...2}
    healthcheck:
      test: [ "CMD", "curl", "-f", "--insecure", "https://localhost:9000/minio/health/live" ]
      interval: 30s
      timeout: 20s
      retries: 3

  minio2:
    image: minio/minio:RELEASE.2020-12-18T03-27-42Z
    volumes:
      - data2-1:/data1
      - data2-2:/data2
#      - ./docker/nginx/fullchain.pem:/root/.minio/certs/public.crt
#      - ./docker/nginx/privkey.pem:/root/.minio/certs/private.key
    expose:
      - "9000"
    environment:
      MINIO_ACCESS_KEY: KAJKUQ_TLIABIWEMKKANT0EN
      MINIO_SECRET_KEY: pJG-t0rZV5Xo11pd8XAu7IxHQyy0wCMVacW6SiONnjUxr01w
    command: server http://minio{1...4}/data{1...2}
    healthcheck:
      test: [ "CMD", "curl", "-f", "--insecure", "https://localhost:9000/minio/health/live" ]
      interval: 30s
      timeout: 20s
      retries: 3

  minio3:
    image: minio/minio:RELEASE.2020-12-18T03-27-42Z
    volumes:
      - data3-1:/data1
      - data3-2:/data2
#      - ./docker/nginx/fullchain.pem:/root/.minio/certs/public.crt
#      - ./docker/nginx/privkey.pem:/root/.minio/certs/private.key
    expose:
      - "9000"
    environment:
      MINIO_ACCESS_KEY: KAJKUQ_TLIABIWEMKKANT0EN
      MINIO_SECRET_KEY: pJG-t0rZV5Xo11pd8XAu7IxHQyy0wCMVacW6SiONnjUxr01w
    command: server http://minio{1...4}/data{1...2}
    healthcheck:
      test: [ "CMD", "curl", "-f", "--insecure", "https://localhost:9000/minio/health/live" ]
      interval: 30s
      timeout: 20s
      retries: 3

  minio4:
    image: minio/minio:RELEASE.2020-12-18T03-27-42Z
    volumes:
      - data4-1:/data1
      - data4-2:/data2
#      - ./docker/nginx/fullchain.pem:/root/.minio/certs/public.crt
#      - ./docker/nginx/privkey.pem:/root/.minio/certs/private.key
    expose:
      - "9000"
    environment:
      MINIO_ACCESS_KEY: KAJKUQ_TLIABIWEMKKANT0EN
      MINIO_SECRET_KEY: pJG-t0rZV5Xo11pd8XAu7IxHQyy0wCMVacW6SiONnjUxr01w
    command: server http://minio{1...4}/data{1...2}
    healthcheck:
      test: [ "CMD", "curl", "-f", "--insecure", "https://localhost:9000/minio/health/live" ]
      interval: 30s
      timeout: 20s
      retries: 3

  nginx:
    image: nginx:1.19.2-alpine
    volumes:
      - ./docker/nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./docker/nginx/fullchain.pem:/etc/nginx/fullchain.pem
      - ./docker/nginx/privkey.pem:/etc/nginx/privkey.pem
    ports:
      - "9000:9000"
    depends_on:
      - minio1
      - minio2
      - minio3
      - minio4

volumes:
  data1-1:
  data1-2:
  data2-1:
  data2-2:
  data3-1:
  data3-2:
  data4-1:
  data4-2:
  db-data:

nginx.conf

user  nginx;
worker_processes  auto;

error_log  /var/log/nginx/error.log warn;
pid        /var/run/nginx.pid;


events {
    worker_connections  1024;
}

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log  /var/log/nginx/access.log  main;

    sendfile        on;
    #tcp_nopush     on;

    keepalive_timeout  65;

    #gzip  on;

    include /etc/nginx/conf.d/*.conf;

    upstream minio {
        server minio1:9000;
        server minio2:9000;
        server minio3:9000;
        server minio4:9000;
    }

    server {
        listen       9009 ssl;
        listen  [::]:9000 ssl;
        #listen       9000;
        #listen  [::]:9000;
        server_name  localhost;

        ssl_certificate     fullchain.pem;
        ssl_certificate_key privkey.pem;
        add_header Strict-Transport-Security "max-age=31536000; includeSubDomains";

        # To allow special characters in headers
        ignore_invalid_headers off;
        # Allow any size file to be uploaded.
        # Set to a value such as 1000m; to restrict file size to a specific value
        client_max_body_size 0;
        # To disable buffering
        proxy_buffering off;

        location / {
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            proxy_connect_timeout 300;
            # Default is HTTP/1, keepalive is only enabled in HTTP/1.1
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            chunked_transfer_encoding off;

            proxy_pass https://minio;
        }
    }
}

Error

minio2_1          | API: SYSTEM()
minio2_1          | Time: 08:55:47 UTC 02/15/2021
minio2_1          | Error: Marking http://minio4:9000/minio/storage/data2/v22 temporary offline; caused by Post "http://minio4:9000/minio/storage/data2/v22/readall?disk-id=&file-path=format.json&volume=.minio.sys": dial tcp 172.19.0.5:9000: connect: connection refused (*fmt.wrapError)
minio2_1          |        6: cmd/rest/client.go:122:rest.(*Client).Call()
minio2_1          |        5: cmd/storage-rest-client.go:132:cmd.(*storageRESTClient).call()
minio2_1          |        4: cmd/storage-rest-client.go:420:cmd.(*storageRESTClient).ReadAll()
minio2_1          |        3: cmd/format-erasure.go:415:cmd.loadFormatErasure()
minio2_1          |        2: cmd/format-erasure.go:324:cmd.loadFormatErasureAll.func1()
minio2_1          |        1: pkg/sync/errgroup/errgroup.go:55:errgroup.(*Group).Go.func1()
minio2_1          | 
minio2_1          | API: SYSTEM()
minio2_1          | Time: 08:55:47 UTC 02/15/2021
minio2_1          | Error: Marking http://minio4:9000/minio/storage/data1/v22 temporary offline; caused by Post "http://minio4:9000/minio/storage/data1/v22/readall?disk-id=&file-path=format.json&volume=.minio.sys": dial tcp 172.19.0.5:9000: connect: connection refused (*fmt.wrapError)
minio2_1          |        6: cmd/rest/client.go:122:rest.(*Client).Call()
minio2_1          |        5: cmd/storage-rest-client.go:132:cmd.(*storageRESTClient).call()
minio2_1          |        4: cmd/storage-rest-client.go:420:cmd.(*storageRESTClient).ReadAll()
minio2_1          |        3: cmd/format-erasure.go:415:cmd.loadFormatErasure()
minio2_1          |        2: cmd/format-erasure.go:324:cmd.loadFormatErasureAll.func1()
minio2_1          |        1: pkg/sync/errgroup/errgroup.go:55:errgroup.(*Group).Go.func1()
minio2_1          | Client http://minio4:9000/minio/storage/data1/v22 online
minio2_1          | Client http://minio4:9000/minio/storage/data2/v22 online

Then

API: SYSTEM()
minio4_1          | Time: 08:55:49 UTC 02/15/2021
minio4_1          | DeploymentID: 5d5c6db4-f7c9-417d-b548-a6d33565bdf9
minio4_1          | Error: Storage resources are insufficient for the read operation .minio.sys/buckets/.bloomcycle.bin (cmd.InsufficientReadQuorum)
minio4_1          |        1: cmd/data-crawler.go:95:cmd.runDataCrawler()
minio4_1          | Client http://minio3:9000/minio/storage/data1/v22 online
minio4_1          | Client http://minio3:9000/minio/lock/v4 online
minio4_1          | Client http://minio3:9000/minio/storage/data2/v22 online
minio2_1          | Client http://minio3:9000/minio/storage/data1/v22 online
minio2_1          | Client http://minio3:9000/minio/lock/v4 online
minio2_1          | Client http://minio3:9000/minio/storage/data2/v22 online
minio2_1          | All MinIO sub-systems initialized successfully
minio2_1          | 
minio2_1          | API: SYSTEM()
minio2_1          | Time: 08:55:52 UTC 02/15/2021
minio2_1          | DeploymentID: 5d5c6db4-f7c9-417d-b548-a6d33565bdf9
minio2_1          | Error: disk not found:  (*errors.errorString)
minio2_1          |        2: cmd/erasure.go:132:cmd.getDisksInfo.func1()
minio2_1          |        1: pkg/sync/errgroup/errgroup.go:55:errgroup.(*Group).Go.func1()
minio2_1          | 
minio2_1          | API: SYSTEM()
minio2_1          | Time: 08:55:52 UTC 02/15/2021
minio2_1          | DeploymentID: 5d5c6db4-f7c9-417d-b548-a6d33565bdf9
minio2_1          | Error: disk not found:  (*errors.errorString)
minio2_1          |        2: cmd/erasure.go:132:cmd.getDisksInfo.func1()
minio2_1          |        1: pkg/sync/errgroup/errgroup.go:55:errgroup.(*Group).Go.func1()
minio2_1          | Waiting for all MinIO IAM sub-system to be initialized.. lock acquired
minio2_1          | Use `mc admin info` to look for latest server/disk info
minio2_1          |  Status:         6 Online, 2 Offline. 
minio2_1          | Endpoint:  http://172.19.0.6:9000  http://127.0.0.1:9000
minio1_1          | Object API (Amazon S3 compatible):
minio1_1          |    Go:         https://docs.min.io/docs/golang-client-quickstart-guide
minio1_1          |    Java:       https://docs.min.io/docs/java-client-quickstart-guide
minio1_1          |    Python:     https://docs.min.io/docs/python-client-quickstart-guide
minio1_1          |    JavaScript: https://docs.min.io/docs/javascript-client-quickstart-guide
minio1_1          |    .NET:       https://docs.min.io/docs/dotnet-client-quickstart-guide
minio1_1          | Waiting for all MinIO IAM sub-system to be initialized.. lock acquired
minio1_1          | fatal error: runtime: out of memory
minio1_1          | 
minio1_1          | runtime stack:
minio1_1          | runtime.throw(0x1f2209c, 0x16)
minio1_1          | 	runtime/panic.go:1116 +0x72
minio1_1          | runtime.sysMap(0xc008000000, 0x4000000, 0x3458a58)
minio1_1          | 	runtime/mem_linux.go:169 +0xc6
minio1_1          | runtime.(*mheap).sysAlloc(0x343c160, 0x800000, 0x42c117, 0x343c168)
minio1_1          | 	runtime/malloc.go:727 +0x1e5
minio1_1          | runtime.(*mheap).grow(0x343c160, 0x201, 0x0)
minio1_1          | 	runtime/mheap.go:1344 +0x85
minio1_1          | runtime.(*mheap).allocSpan(0x343c160, 0x201, 0x100, 0x3458a68, 0x2cf83d8)
minio1_1          | 	runtime/mheap.go:1160 +0x6b6
minio1_1          | runtime.(*mheap).alloc.func1()
minio1_1          | 	runtime/mheap.go:907 +0x65
minio1_1          | runtime.(*mheap).alloc(0x343c160, 0x201, 0xc0002b0101, 0xc0002b3380)
minio1_1          | 	runtime/mheap.go:901 +0x85
minio1_1          | runtime.largeAlloc(0x40000c, 0xc002090101, 0x1752c08)
minio1_1          | 	runtime/malloc.go:1177 +0x92
minio1_1          | runtime.mallocgc.func1()
minio1_1          | 	runtime/malloc.go:1071 +0x46
minio1_1          | runtime.systemstack(0x1)
minio1_1          | 	runtime/asm_amd64.s:370 +0x66
minio1_1          | runtime.mstart()
minio1_1          | 	runtime/proc.go:1116


Hi there,

From the provided logs, it looks like the MinIO instances are having trouble communicating with each other during startup, and minio1 eventually crashes with an out-of-memory error. Here are some potential causes and solutions for these issues:

  1. Network Communication Issue:

    • Docker’s networking lets containers reach each other by their service names on the same Compose network. There may be an issue with Docker’s networking on the machine where this setup is deployed. You can check this by getting into one of the MinIO containers (e.g., minio1) and pinging another container by its name (e.g., minio2).
    • Ensure there’s no firewall or security-group policy blocking inter-container communication on port 9000.
  2. Container Start Order:

    • Docker doesn’t guarantee the order in which containers start. Even with depends_on in your docker-compose file, Docker only waits for minio1 through minio4 to be started before nginx; it neither orders minio1 to minio4 among themselves nor waits for them to be ready to accept connections.
    • Some MinIO instances are likely trying to reach others before those are fully started and accepting connections, which produces the “connection refused” errors you’re seeing. MinIO retries its peers on its own, which is why the same endpoints are later reported back online.
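If your Compose version supports the long depends_on syntax with conditions, the healthchecks you already define can gate dependent services on readiness rather than mere start-up. A sketch (the condition form is not available in every v3 file-format version, so verify it against your Compose release):

```yaml
  nginx:
    image: nginx:1.19.2-alpine
    depends_on:
      # Wait for each MinIO node to report healthy, not just started.
      minio1:
        condition: service_healthy
      minio2:
        condition: service_healthy
      minio3:
        condition: service_healthy
      minio4:
        condition: service_healthy
```

Note that the MinIO nodes themselves still race each other at startup; this only keeps nginx from routing traffic before the backends are healthy.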
  3. Volumes & Storage Issues:

    • Ensure that the Docker volumes are correctly set up and each MinIO instance has access to its respective volume.
    • If the MinIO instances can’t read/write from/to their volumes, it can result in errors like “disk not found.”
  4. MinIO Configuration:

    • Ensure that the command for starting each MinIO container is correct. Distributed MinIO requires all instances to know about each other; your command server http://minio{1...4}/data{1...2} looks correct, but double-check for any discrepancies.
    • MinIO’s distributed mode requires a quorum of disks to function. Your logs show Status: 6 Online, 2 Offline, which lines up with the InsufficientReadQuorum error raised while minio4 was still coming up; if too many disks stay offline, the cluster won’t work correctly.
  5. SSL/TLS Configuration:

    • You’ve commented out the SSL/TLS certificate mounts for MinIO, yet the Nginx proxy_pass and the container healthchecks still use https. A curl -f against https://localhost:9000 will fail while MinIO serves plain HTTP, so those healthchecks can never pass. If you plan to use SSL/TLS end-to-end, mount the certificates into each MinIO container; if not, change proxy_pass to http://minio; and point the healthchecks at http://localhost:9000/minio/health/live.
  6. Memory:

    • minio1 eventually dies with fatal error: runtime: out of memory. Make sure the host (and any container memory limits) provide enough RAM for four MinIO instances; distributed MinIO can use significant memory during initialization and healing.
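To illustrate the plain-HTTP option from the SSL/TLS point: if no certificates are mounted in the MinIO containers, the location block in the posted nginx.conf would talk HTTP to the upstream. A sketch, keeping the rest of the config unchanged:

```nginx
        location / {
            proxy_set_header Host $http_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;
            proxy_connect_timeout 300;
            proxy_http_version 1.1;
            proxy_set_header Connection "";
            chunked_transfer_encoding off;
            # MinIO is serving plain HTTP here, so do not proxy over TLS:
            proxy_pass http://minio;
        }
```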

Solutions:

  1. Wait for Initialization:

    • Introduce a delay or wait mechanism so dependent services only proceed once all MinIO instances are fully initialized.
    • Consider using a script or a healthcheck-based mechanism to ensure all nodes are ready before treating the cluster as up.
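Such a wait mechanism can be sketched as a small POSIX-shell loop that polls the same health endpoint the compose file already probes (the hostnames minio1..minio4 are the Compose service names; MAX_TRIES and RETRY_DELAY are illustrative knobs, not MinIO settings):

```shell
# wait_for CMD...: retry a probe command until it succeeds, or give up
# after MAX_TRIES attempts (default 30), sleeping RETRY_DELAY seconds
# (default 2) between attempts.
wait_for() {
  max="${MAX_TRIES:-30}"
  i=0
  until "$@"; do
    i=$((i + 1))
    if [ "$i" -ge "$max" ]; then
      return 1
    fi
    sleep "${RETRY_DELAY:-2}"
  done
}

# Example: block until every MinIO node answers its liveness probe.
# for host in minio1 minio2 minio3 minio4; do
#   wait_for curl -sf "http://${host}:9000/minio/health/live" || exit 1
# done
```

You could run this as an entrypoint wrapper or an init step before anything depends on the cluster being up.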
  2. Network Check:

    • Use docker-compose exec minio1 sh to get a shell in one MinIO container, then try to reach the others by name, for example ping -c 1 minio2 or curl -f http://minio2:9000/minio/health/live. This helps identify whether there’s a network communication issue.
  3. Volume Check:

    • Check the Docker volumes’ status (docker volume ls, docker volume inspect data1-1) and ensure there’s no issue with disk access for the MinIO containers; failures here surface as “disk not found” errors.
  4. Logs & Debugging:

    • Check the logs of all MinIO instances (for example docker-compose logs minio1 minio2 minio3 minio4) for any other error messages or indications of what might be going wrong.
  5. MinIO Version:

    • Consider updating to a newer MinIO release; the image you’re pinning, RELEASE.2020-12-18T03-27-42Z, dates from December 2020, and issues like these are sometimes fixed in newer releases.

Best,

Bobby
