By: gegere

CoreOS Failed getting response from http://127.0.0.1:4001/: too many redirects

November 3, 2014

Hello.

I have completed these guides provided by DigitalOcean and feel very confident.
https://www.digitalocean.com/community/tutorial_series/getting-started-with-coreos-2

I had 3 x 512 MB Droplets running well, then decided to spin up 3 x 1 GB Droplets using the same "discovery" token.

I was able to run:

core@coreos-1 ~ $ fleetctl list-machines
MACHINE     IP              METADATA
3c5fc75c... 10.132.191.194  public_ip=104.236.56.224,region=nyc
3f45fcd6... 10.132.192.113  public_ip=104.236.63.141,region=nyc
6aab38ca... 10.132.189.117  public_ip=104.236.47.209,region=nyc
b851ef0f... 10.132.189.118  public_ip=104.236.47.210,region=nyc
cd30023e... 10.132.189.116  public_ip=104.236.47.208,region=nyc
d378e962... 10.132.191.193  public_ip=104.236.56.222,region=nyc

This returned the correct information for all six machines. I then took down the 3 x 512 MB systems, thinking that everything should cycle over correctly.

When I SSH into the remaining 3 x 1 GB Droplets and run "fleetctl list-machines", there is a problem:

$ fleetctl list-machines
2014/11/03 23:13:48 INFO client.go:278: Failed getting response from http://127.0.0.1:4001/: too many redirects
2014/11/03 23:13:48 ERROR client.go:200: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms

Did I not give enough time before shutting down the initial 3 systems? How does the fleetctl process transfer roles if other nodes go offline?
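
For context on the question above: etcd, which fleet uses as its store, is built on Raft and needs a majority of its voting members to stay reachable. The sketch below only works through that majority arithmetic. It assumes that all six Droplets had joined as voting etcd members and that the three 512 MB Droplets were destroyed without first being removed from the cluster; neither is confirmed in the post. The function names `quorum` and `has_quorum` are made up for illustration and are not part of etcd or fleet.

```shell
#!/bin/sh
# quorum N: smallest majority of N voting members.
quorum() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum TOTAL ALIVE: succeeds when ALIVE members can still form a majority.
has_quorum() {
    [ "$2" -ge "$(quorum "$1")" ]
}

total=6   # assumption: all six Droplets were voting etcd members
alive=3   # the three 1 GB Droplets left after the shutdown

if has_quorum "$total" "$alive"; then
    echo "quorum held: $alive of $total (need $(quorum "$total"))"
else
    echo "quorum lost: $alive of $total (need $(quorum "$total"))"
fi
```

If those assumptions hold, three of six members is one short of a majority, so the survivors may be unable to elect a leader no matter how long the old nodes were given before shutdown. Deregistering each old member from the cluster before destroying its Droplet would shrink the member count step by step and keep a majority available throughout.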

2 Answers

Rebooting the Droplets that could not connect through fleetctl resolved the problem.

I have the same issue.
The cluster is up and running (it is a cluster of workers processing jobs from an external queue).
Then, after an unpredictable amount of time (hours? days?), the instances show the same problem:

fleetctl list-units
2014/11/19 09:23:28 INFO client.go:278: Failed getting response from http://127.0.0.1:4001/: too many redirects
2014/11/19 09:23:28 ERROR client.go:200: Unable to get result for {Get /_coreos.com/fleet/machines}, retrying in 100ms
2014/11/19 09:23:28 INFO client.go:278: Failed getting response from http://127.0.0.1:4001/: too many redirects
...

Trying to run etcd directly:

[etcd] Nov 19 09:17:19.511 WARNING | Using the directory lkcore1go1.etcd as the etcd curation directory because a directory was not specified.
[etcd] Nov 19 09:17:19.512 INFO | The path lkcore1go1.etcd/log is in btrfs
[etcd] Nov 19 09:17:19.512 INFO | Set NOCOW to path lkcore1go1.etcd/log succeeded
[etcd] Nov 19 09:17:19.512 INFO | lkcore1go1 is starting a new cluster
[etcd] Nov 19 09:17:19.515 INFO | etcd server [name lkcore1go1, listen on :4001, advertised url http://127.0.0.1:4001]
[etcd] Nov 19 09:17:19.516 CRITICAL | Failed to create listener: listen tcp :4001: bind: address already in use

I have not touched the cluster (no instances stopped, none created, etc.).

Ideas?
