
Raul Paes Silva

DevPill 10 - Fault tolerance: adding retries to your Go code

Adding retries is one of the simplest ways to make your system more resilient. You can add them to database operations, calls to external APIs, and any other operation that depends on a response from a third party.

Here's an example of a retry implementation for transient database errors:

1. First, let's code the retry function

The function receives the maximum number of attempts (the first call counts as one), the delay (the initial interval between attempts, which doubles after every failed attempt) and the function that performs the operation you want to retry.

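// Retry calls fn up to `attempts` times in total. It returns as soon as fn
// succeeds or fails with a non-transient error; otherwise it sleeps for
// `delay`, doubles the delay, and tries again.
func Retry(attempts int, delay time.Duration, fn func() error) error {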
    err := fn()
    if err == nil {
        return nil
    }

    if !IsTransientError(err) {
        // log.Println("--- retry not needed")
        return err
    }

    if attempts--; attempts > 0 {
        time.Sleep(delay)
        // log.Println("--- trying again...")
        return Retry(attempts, delay*2, fn) // exponential backoff
    }

    return err
}

2. IsTransientError function

In this example I created a function that checks whether the operation is worth trying again, based on the error it returned.
Let's say the repository returns "sql.ErrNoRows", which is what you get when a single-row query finds no matching row. Retrying won't change that result, so this error should not be attempted again. That's why we only look for transient errors: failures that may last a few milliseconds or seconds, for example because of network issues, and then go away on their own.

func IsTransientError(err error) bool {
    if err == nil {
        return false
    }

    // example : deadlock postgres
    if strings.Contains(err.Error(), "deadlock detected") {
        return true
    }

    // closed connection
    if strings.Contains(err.Error(), "connection reset by peer") {
        return true
    }

    // network/postgres timeout
    if strings.Contains(err.Error(), "timeout") {
        return true
    }

    // Max connections exceeded
    if strings.Contains(err.Error(), "too many connections") {
        return true
    }

    return false
}

3. Using the function

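Here's an example of retrying a database operation called from a login method in the service layer. Note that "sql.ErrNoRows" is mapped to ErrUserNotFound, which is not transient, so Retry returns it right away without trying again: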

func (s *UserService) Login(ctx context.Context, email, password string) (*UserOutput, error) {

    var user *domain.User
    err := retry.Retry(3, 200*time.Millisecond, func() error {
        var repoError error
        user, repoError = s.repo.GetUserByEmail(ctx, email)
        if repoError != nil {
            // errors.Is also matches when the repository wraps the error
            if errors.Is(repoError, sql.ErrNoRows) {
                return ErrUserNotFound
            }
            return repoError
        }
        return nil
    })

    if err != nil {
        return nil, err
    }
    //... ... ...
