2 minute read

Problem

Want to read from replicas and only write to the master, to reduce master loads.

Solution

Create read-only clients using NewFailOverClient(opt) with opt.SlaveOnly set to true.

IMPORTANT NOTE: If this field is false, go-redis will only choose the MASTER. Too bad!

https://github.com/go-redis/redis/blob/cae67723092cac2cb441bc87044ab9edacb2484d/sentinel.go#L227

if failover.opt.SlaveOnly {
    addr, err = failover.RandomSlaveAddr(ctx)
} else {
    addr, err = failover.MasterAddr(ctx)
    if err == nil {
        failover.trySwitchMaster(ctx, addr)
    }
}

If set to true, go-redis will find a random slave through Sentinel each time, which perfectly suits our purpose.

Intricacies

Question

SlaveOnly: If there’re many replicas, it will randomly choose one. Does the “choose” action happen only when the Client sets up or it happens for every command (like GET)?

Answer

The “choose” action only happens on Client creation. After that, all commands are directed to the same replica.

Test1

c := NewFailoverClient()
err := c.FlushAll(c.Context()).Err()
if err != nil {
    panic(err)
}
err = c.Set(c.Context(), "k1", "v1", 0).Err()
if err != nil {
    panic(err)
}

roc := NewFailoverReadOnlyClient()
for i := 0; i < 3; i++ {
    res, err := roc.Get(roc.Context(), "k1").Result()
    if err != nil {
        panic(err)
    }
    fmt.Println(res)
}

Listen for GET commands on each replica using

redis-cli -p 6380 -a password monitor | grep "get"

Result

All directed to one replica.

Test2

c := NewFailoverClient()
err := c.FlushAll(c.Context()).Err()
if err != nil {
    panic(err)
}
err = c.Set(c.Context(), "k1", "v1", 0).Err()
if err != nil {
    panic(err)
}
for i := 0; i < 3; i++ {
    roc := NewFailoverReadOnlyClient()
    res, err := roc.Get(roc.Context(), "k1").Result()
    if err != nil {
        panic(err)
    }
    fmt.Println(res)
}

Result

Requests are directed to different replicas.

Test3

roc := NewFailoverReadOnlyClient()

for i := 0; i < 3; i++ {
	res, err := roc.Get(roc.Context(), "k1").Result()
	if err != nil {
		panic(err)
	}
	fmt.Println(res)
}
fmt.Println("going to sleep")
// shutdown the replica that the client connected to when GETting k1
time.Sleep(3 * time.Second)
fmt.Println("wake up")
currentTime := time.Now()
fmt.Println("Wake up time: ", currentTime.Format("2006.01.02 15:04:05"))

for i := 0; i < 3; i++ {
	res, err := roc.Get(roc.Context(), "k2").Result()
	if err != nil {
		panic(err)
	}
	fmt.Println(res)
}

Result

Although the client’s previously connected replica was down, the read-only client is able to redirect to a working replica and read the value of k2.

This even reacts faster than the sentinel: Although the sentinel takes 5 seconds to realize that the client we were reading from is down, the client, having been put to sleep for 3 seconds, wakes up before the sentinel has realized and is able to still read from a different replica.

Will try to figure out how this logic is implemented on their end and how the client is making this decision

Tags: ,

Categories:

Updated: