Latest revision as of 06:04, 28 July 2020

HAproxy Loadbalancer

Introduction

HAproxy is a mature and popular option for userspace loadbalancing under Linux (and similar operating systems). It was originally designed for HTTP load balancing, which remains its main purpose.

In Open-Xchange installations, we require Apache for loadbalancing (reverse proxying) user session HTTP traffic towards our middleware, because it correctly handles HTTP sticky sessions. See the corresponding "Configure Services" sections in our Quickinstall Guides for reference, e.g. the CentOS version.

In Open-Xchange installations, HAproxy therefore typically acts in other roles:

  • As front-level loadbalancer in front of the Apache instances (in enterprise installations, however, usually other software is employed for that purpose), or
  • As loadbalancer for connecting the OX middleware to the database (Galera) instances, or
  • As loadbalancer for connecting OX Abuse Shield to Dovecot and App Suite

We will cover in this article the latter two use cases.

Software Installation

The installation is the same for the different use cases and is thus described once here.

HAproxy should be shipped with the distribution.

Historical Wheezy note: haproxy is provided in wheezy-backports, see http://haproxy.debian.net/. More recent Debian releases don't need this extra repository and ship haproxy in their native repos.

# yum install haproxy       (on RHEL/CentOS)
# apt-get install haproxy   (on Debian/Ubuntu)

HAproxy for OX Abuse Shield

In the following we present a configuration which was fed back to us from a large customer installation. (Thanks a lot!)

The following aspects are crucial:

  • Detect capacity issues on the wforce backends to be resilient to DOS attacks
  • In case of capacity issues, just do an HTTP deny. It makes no sense to use cross-datacenter routing (if multiple datacenters are available), as the latency combined with the query volume makes this less than ideal. A small number of unanswered calls is fine and doesn't affect the auth process or the user experience.
  • Timeouts are chosen such that HAproxy's timeout (35 msecs(!)) is smaller than the corresponding wforce service consumer timeout (e.g. dovecot: auth_policy_server_timeout_msecs would get a corresponding value of 100 msecs). This way timeouts occur a little more often, but gracefully, and have little to no impact on the user experience.
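On the consumer side, the Dovecot timeout mentioned above lives in the auth policy configuration. A sketch: auth_policy_server_timeout_msecs is taken from the text above, while the URL (pointing at the local HAproxy frontend on port 8084) and the auth_policy_server_url setting name are assumptions based on Dovecot's auth policy support; consult the Dovecot documentation for the full set of auth_policy_* settings.

 auth_policy_server_url = http://127.0.0.1:8084/
 auth_policy_server_timeout_msecs = 100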

Configuration

frontend wforce :8084
        mode http
        # detect capacity issues in wforce backend
        acl local_wforce_not_enough_capacity nbsrv(wforce_backend_primary) lt 2
        # deny request if primary backend doesn't have capacity
        http-request deny if local_wforce_not_enough_capacity
        default_backend wforce_backend_primary

backend wforce_backend_primary
        balance leastconn
        mode http
        timeout connect 35
        timeout server 30s
        timeout check 0
        option httpchk GET /?command=ping HTTP/1.0\r\nAuthorization:\ Basic\ --redacted--
        option http-keep-alive
        server wforce-01.example.net 192.168.0.1:8084 weight 10 check inter 5000 fall 5 rise 3
        server wforce-02.example.net 192.168.0.2:8084 weight 10 check inter 5000 fall 5 rise 3
        server wforce-03.example.net 192.168.0.3:8084 weight 10 check inter 5000 fall 5 rise 3
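The --redacted-- token in the httpchk line is a standard HTTP Basic credential, i.e. the base64 encoding of user:password. Assuming hypothetical credentials wforce:secret (placeholders, not from this article), it can be generated like this:

```shell
# base64-encode "user:password" for the Authorization: Basic header
# ("wforce" / "secret" are placeholder credentials)
printf 'wforce:secret' | base64
# -> d2ZvcmNlOnNlY3JldA==
```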

HAproxy for Galera Loadbalancing

HAproxy is one of the options for Galera loadbalancing. Please consider reading through the available options before deciding on one solution.

System Design

We present a solution where each OX node runs its own HAproxy instance. This way no additional loadbalancer (virtual) machines are needed.

We create two HAproxy "listeners": one round-robin for the read requests, one active/passive for the write requests.

Configuration

The following is a HAproxy configuration file /etc/haproxy/haproxy.cfg, assuming the Galera nodes have the IPs 10.0.0.1..3.

 global
     log 127.0.0.1     local0
     log 127.0.0.1     local1 notice
     user              haproxy
     group             haproxy
     # this is not recommended by the haproxy authors, but seems to improve performance for me
     #nbproc 4
     maxconn           256000
     spread-checks     5
     daemon
     stats socket      /var/lib/haproxy/stats 
 
 defaults
     log               global
     retries           3
     maxconn           256000
     timeout connect   60000
     timeout client    20m
     timeout server    20m
     option            dontlognull
     option            redispatch
     # the http options are not needed here
     # but may be reasonable if you use haproxy also for some OX HTTP proxying
     mode              http
     no option         httpclose
 
 listen mysql-read
     bind 127.0.0.1:3306
     mode tcp
     balance roundrobin
     option httpchk GET /
     server db1 10.0.0.1:3306 check port 9200 inter 6000 rise 3 fall 3
     server db2 10.0.0.2:3306 check port 9200 inter 6000 rise 3 fall 3
     server db3 10.0.0.3:3306 check port 9200 inter 6000 rise 3 fall 3
 
 listen mysql-write
     bind 127.0.0.1:3307
     mode tcp
     balance roundrobin
     option httpchk GET /master
      # be more prudent with the fall parameter for the write node
     server db1 10.0.0.1:3306 check port 9200 inter 6000 rise 3 fall 1
     server db2 10.0.0.2:3306 check port 9200 inter 6000 rise 3 fall 1
     server db3 10.0.0.3:3306 check port 9200 inter 6000 rise 3 fall 1
 
 #
 # can configure a stats interface here, but if you do so,
 # change the username / password
 #
 #listen stats
 #    bind 0.0.0.0:8080
 #    mode http
 #    stats enable
 #    stats uri /
 #    stats realm Strictly\ Private
 #    stats auth user:pass

Note 1: the timeout options may seem exaggeratedly high, but they are required to ensure that it is not the loadbalancer that shuts down MySQL connections while the systems still use them. Cf. configdb.properties:

# Maximum time in milliseconds a connection will be used. After this time
# the connection get closed.
maxLifeTime=600000

The default is 10 minutes; allowing some extra time for running queries to finish, plus some overhead, 20 minutes looks like a reasonable value for the connection timeout here.
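The arithmetic behind this, for reference (a trivial sketch):

```shell
# maxLifeTime is given in milliseconds; the default in minutes:
echo $((600000 / 1000 / 60))   # -> 10
# the suggested 20m haproxy timeout expressed in milliseconds:
echo $((20 * 60 * 1000))       # -> 1200000
```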

Note 2: If you are configuring a dedicated loadbalancer node which should load balance for other clients on the network (rather than a distributed / colocated HAproxy instance which only serves localhost), change the bind parameters accordingly.
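For example, on a dedicated loadbalancer node the read listener might bind a routable address instead of loopback (a sketch; substitute an address reachable by your clients, and restrict it via firewalling as appropriate):

 listen mysql-read
     bind 0.0.0.0:3306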

Health check service

As you can see we use the httpchk option, so we assume a health check service to be available. Please have a look at the Clustercheck page for how to configure such a service. Note that we assume you use our customized, improved clustercheck script, so don't use the standard one.

Contrary to other setup instructions, which recommend configuring one node as regular node and the other two as backup nodes, we recommend a health check which declares only the node with wsrep_local_index=0 as available. This way we ensure that even in corner cases, multiple distributed HAproxy instances cannot end up declaring different nodes as designated write nodes, which would be problematic.
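The intended semantics can be sketched as a toy shell function (illustration only, not part of the clustercheck script): the write check succeeds only on the node whose wsrep_local_index is 0, so every HAproxy instance agrees on the same write node.

```shell
# toy model of the write-node health check described above
# $1 = the node's wsrep_local_index
master_check() {
    if [ "$1" -eq 0 ]; then
        echo "HTTP 200"   # designated write node
    else
        echo "HTTP 503"   # reported down on the write listener
    fi
}
master_check 0   # -> HTTP 200
master_check 1   # -> HTTP 503
```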

Monitoring

Besides using the Galera check service configured before, you can also speak to the stats socket of HAproxy using socat.

# echo "show stat" | socat unix-connect:/var/lib/haproxy/stats stdio

The output is CSV with lines too long to reproduce here; try it on your own system.
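To make the output readable you can select a few columns; a sketch, assuming the HAProxy 1.5 CSV layout (fields 1, 2 and 18 are pxname, svname and status), demonstrated on a fabricated sample line:

```shell
# with a running haproxy you would pipe the real output:
#   echo "show stat" | socat unix-connect:/var/lib/haproxy/stats stdio | cut -d, -f1,2,18
# fabricated sample line for illustration:
line='mysql-read,db1,0,0,0,1,,10,784,9571,,0,,0,0,0,0,UP'
echo "$line" | cut -d, -f1,2,18
# -> mysql-read,db1,UP
```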

There are more commands available via this socket to enable / disable servers; see the haproxy documentation for details. (At the time of writing, that documentation could be found at http://cbonte.github.io/haproxy-dconv/configuration-1.5.html#9.2, but that URL seems unstable.)