Sending Archives - ☩ Walking in Light with Christ - Faith, Computing, Diary ☩ Walking in Light with Christ

Dynamically modify haproxy load balancer variables, view statistics, errors and much more via the stats UNIX socket with socat on the command line

Friday, December 15th, 2023

Haproxy can be configured to use the listen stats interface, which provides a tiny web interface with statistics on the state (UP / DOWN) of all configured haproxy frontends / backends, current connections to the proxy, errors and other interesting bandwidth information.

That is very useful, but not every haproxy has it configured, and if you did not configure the HAProxy load balancer machines on your own, the previous person who built the LB infrastructure might not have created the haproxy stats listener.

If that is the case and you still need various statistics on how haproxy performs and the status of active connections towards the Frontend / Backend interfaces, this is still possible via a configured stats socket (usually defined in the global section of haproxy.cfg).

Through the socket it is possible to do many things with haproxy, such as disable / enable frontends / backends / servers.

Let's say your Haproxy has a global section that looks like this:

global
    stats socket /var/run/haproxy/haproxy.sock mode 0600 level admin # Creates the UNIX socket used to fetch stats
    log /dev/log local0
    log /dev/log local1 notice
    chroot /var/lib/haproxy
    user haproxy
    group haproxy
    daemon
    maxconn 99999
    nbproc 1
    nbthread 2
    cpu-map 1 0
    cpu-map 2 1

…
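Since every example below repeats the same echo ... | socat pipeline, a small wrapper function can save typing. This is only a sketch: the function name hap and the HAPROXY_SOCK variable are made up for illustration, and the default path is taken from the global section above.

```shell
# Hypothetical helper: send CLI commands to the HAProxy stats socket.
# HAPROXY_SOCK is an assumed variable name, not something haproxy itself reads.
hap() {
    sock="${HAPROXY_SOCK:-/var/run/haproxy/haproxy.sock}"
    if [ ! -S "$sock" ]; then
        echo "hap: $sock is not a UNIX socket" >&2
        return 1
    fi
    # Several commands can be passed on one line, separated by ';'
    echo "$*" | socat stdio "$sock"
}

# Usage (needs read/write access to the socket, typically root):
#   hap show info
#   hap "show info;show stat"
```

The socket check is there so that a wrong path gives a readable error instead of a raw socat failure.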

1. Listing all available options that can be sent via the haproxy.sock UNIX socket interface

root@pcfreak:/home/hipo/info# echo "show help" | socat stdio /var/run/haproxy/haproxy.sock
Unknown command. Please enter one of the following commands only :
help : this message
prompt : toggle interactive mode with prompt
quit : disconnect
show tls-keys [id|*]: show tls keys references or dump tls ticket keys when id specified
set ssl tls-key [id|keyfile] <tlskey>: set the next TLS key for the <id> or <keyfile> listener to <tlskey>
add ssl crt-list <filename> <certfile> [options] : add a line <certfile> to a crt-list <filename>
del ssl crt-list <filename> <certfile[:line]> : delete a line <certfile> in a crt-list <filename>
show ssl crt-list [-n] [] : show the list of crt-lists or the content of a crt-list <filename>
new ssl cert <certfile> : create a new certificate file to be used in a crt-list or a directory
set ssl cert <certfile> <payload> : replace a certificate file
commit ssl cert <certfile> : commit a certificate file
abort ssl cert <certfile> : abort a transaction for a certificate file
del ssl cert <certfile> : delete an unused certificate file
show ssl cert [] : display the SSL certificates used in memory, or the details of a <certfile>
set maxconn global : change the per-process maxconn setting
set rate-limit : change a rate limiting value
set severity-output [none|number|string] : set presence of severity level in feedback information
set timeout : change a timeout setting
show env [var] : dump environment variables known to the process
show cli sockets : dump list of cli sockets
show cli level : display the level of the current CLI session
show fd [num] : dump list of file descriptors in use
show activity : show per-thread activity stats (for support/developers)
operator : lower the level of the current CLI session to operator
user : lower the level of the current CLI session to user
clear counters : clear max statistics counters (add 'all' for all counters)
show info : report information about the running process [desc|json|typed]*
show stat : report counters for each proxy and server [desc|json|typed]*
show schema json : report schema used for stats
show sess [id] : report the list of current sessions or dump this session
shutdown session : kill a specific session
shutdown sessions server : kill sessions on a server
disable agent : disable agent checks (use 'set server' instead)
disable health : disable health checks (use 'set server' instead)
disable server : disable a server for maintenance (use 'set server' instead)
enable agent : enable agent checks (use 'set server' instead)
enable health : enable health checks (use 'set server' instead)
enable server : enable a disabled server (use 'set server' instead)
set maxconn server : change a server's maxconn setting
set server : change a server's state, weight or address
get weight : report a server's current weight
set weight : change a server's weight (deprecated)
show startup-logs : report logs emitted during HAProxy startup
clear table : remove an entry from a table
set table [id] : update or create a table entry's data
show table [id]: report table usage stats or dump this table's contents
add acl : add acl entry
clear acl <id> : clear the content of this acl
del acl : delete acl entry
get acl : report the patterns matching a sample for an ACL
show acl [id] : report available acls or dump an acl's contents
add map : add map entry
clear map <id> : clear the content of this map
del map : delete map entry
get map : report the keys and values matching a sample for a map
set map : modify map entry
show map [id] : report available maps or dump a map's contents
show events [] : show event sink state
show threads : show some threads debugging information
show peers [peers section]: dump some information about all the peers or this peers section
disable frontend : temporarily disable specific frontend
enable frontend : re-enable specific frontend
set maxconn frontend : change a frontend's maxconn setting
show servers conn [id]: dump server connections status (for backend <id>)
show servers state [id]: dump volatile server information (for backend <id>)
show backend : list backends in the current running config
shutdown frontend : stop a specific frontend
set dynamic-cookie-key backend : change a backend secret key for dynamic cookies
enable dynamic-cookie backend : enable dynamic cookies on a specific backend
disable dynamic-cookie backend : disable dynamic cookies on a specific backend
show errors : report last request and response errors for each proxy
show resolvers [id]: dumps counters from all resolvers section and
associated name servers
show pools : report information about the memory pools usage
show profiling : show CPU profiling options
set profiling : enable/disable CPU profiling
show cache : show cache status
trace <module> [cmd [args…]] : manage live tracing
show trace [] : show live tracing state

2. View haproxy running threads

root@pcfreak:/home/hipo/info# echo "show threads" | socat stdio /var/run/haproxy/haproxy.sock
Thread 1 : id=0x7f87b6e2c1c0 act=0 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0
stuck=0 prof=0 harmless=1 wantrdv=0
cpu_ns: poll=3061065069437 now=3061065077880 diff=8443
curr_task=0
* Thread 2 : id=0x7f87b6e20700 act=1 glob=0 wq=1 rq=0 tl=0 tlsz=0 rqsz=0
stuck=0 prof=0 harmless=0 wantrdv=0
cpu_ns: poll=2969050092523 now=2969050197848 diff=105325
curr_task=0x7f87b006f740 (task) calls=1 last=0
fct=0x560978846340(task_run_applet) ctx=0x7f87b0190720(<CLI>)
strm=0x56097a763560 src=unix fe=GLOBAL be=GLOBAL dst=<CLI>
rqf=c48200 rqa=0 rpf=80008000 rpa=0 sif=EST,200008 sib=EST,204018
af=(nil),0 csf=0x56097a776ef0,8200
ab=0x7f87b0190720,9 csb=(nil),0
cof=0x56097a77fb00,1300:PASS(0x7f87b019a680)/RAW((nil))/unix_stream(22)
cob=(nil),0:NONE((nil))/NONE((nil))/NONE(0)

3. Show haproxy server connections

root@pcfreak:/home/hipo/info# echo "show servers conn" | socat stdio /var/run/haproxy/haproxy.sock
# bkname/svname bkid/svid addr port - purge_delay used_cur used_max need_est unsafe_nb safe_nb idle_lim idle_cur idle_per_thr[2]
http-websrv/ha1server-1 3/1 192.168.0.209 80 - 5000 0 12 12 0 0 -1 0 0 0
http-websrv/ha1server-2 3/2 192.168.0.200 80 - 5000 1 142 142 0 0 -1 0 0 0
http-websrv/ha1server-3 3/3 192.168.1.30 80 - 5000 0 0 0 0 0 -1 0 0 0
http-websrv/ha1server-4 3/4 192.168.1.14 80 - 5000 0 0 0 0 0 -1 0 0 0
http-websrv/ha1server-5 3/5 192.168.0.1 80 - 5000 0 13 13 0 0 -1 0 0 0
https-websrv/ha1server-1 5/1 192.168.0.209 443 - 5000 0 59 59 0 0 -1 0 0 0
https-websrv/ha1server-2 5/2 192.168.0.200 443 - 5000 11 461 461 0 0 -1 0 0 0
https-websrv/ha1server-3 5/3 192.168.1.30 443 - 5000 0 0 0 0 0 -1 0 0 0
https-websrv/ha1server-4 5/4 192.168.1.14 443 - 5000 0 0 0 0 0 -1 0 0 0
https-websrv/ha1server-5 5/5 192.168.0.1 443 - 5000 1 152 152 0 0 -1 0 0 0
MASTER/cur-1 6/1 - 0 - 0 0 0 0 0 0 0 0

4. Show Load balancer servers state

root@pcfreak:/home/hipo/info# echo "show servers state" | socat stdio /var/run/haproxy/haproxy.sock
1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord
3 http-websrv 1 ha1server-1 192.168.0.209 2 0 254 254 3929 6 3 4 6 0 0 0 - 80 -
3 http-websrv 2 ha1server-2 192.168.0.200 2 0 255 255 3928 6 3 4 6 0 0 0 - 80 -
3 http-websrv 3 ha1server-3 192.168.1.30 2 0 252 252 3927 6 3 4 6 0 0 0 - 80 -
3 http-websrv 4 ha1server-4 192.168.1.14 2 0 253 253 3929 6 3 4 6 0 0 0 - 80 -
3 http-websrv 5 ha1server-5 192.168.0.1 2 0 251 251 1708087 6 3 4 6 0 0 0 - 80 -
5 https-websrv 1 ha1server-1 192.168.0.209 2 0 254 254 3929 6 3 4 6 0 0 0 - 443 -
5 https-websrv 2 ha1server-2 192.168.0.200 2 0 255 255 3928 6 3 4 6 0 0 0 - 443 -
5 https-websrv 3 ha1server-3 192.168.1.30 2 0 252 252 3927 6 3 4 6 0 0 0 - 443 -
5 https-websrv 4 ha1server-4 192.168.1.14 2 0 253 253 3929 6 3 4 6 0 0 0 - 443 -
5 https-websrv 5 ha1server-5 192.168.0.1 2 0 251 251 1708087 6 3 4 6 0 0 0 - 443 -
6 MASTER 1 cur-1 - 2 0 0 0 1708087 1 0 0 0 0 0 0 - 0 -
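The numeric srv_op_state column is not very readable on its own. A minimal sketch that turns it into a name could look like this; the function name decode_srv_state is made up, and the mapping (0 stopped, 1 starting, 2 running, 3 stopping) follows the operational states described in the HAProxy management guide, so treat it as an assumption to verify against your version.

```shell
# Decode "show servers state" output read from stdin.
# Columns used: $2 = be_name, $4 = srv_name, $6 = srv_op_state.
decode_srv_state() {
    awk '
        NF < 6 || $1 == "#" { next }   # skip the version line ("1") and the header
        {
            state = "UNKNOWN"
            if ($6 == 0) state = "STOPPED"
            else if ($6 == 1) state = "STARTING"
            else if ($6 == 2) state = "RUNNING"
            else if ($6 == 3) state = "STOPPING"
            print $2 "/" $4, state
        }'
}

# Usage:
#   echo "show servers state" | socat stdio /var/run/haproxy/haproxy.sock | decode_srv_state
```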

5. Get general haproxy info on variables that can be used for Load Balancer fine tuning

root@pcfreak:/home/hipo/info# echo "show info" | socat stdio /var/run/haproxy/haproxy.sock
Name: HAProxy
Version: 2.2.9-2+deb11u5
Release_date: 2023/04/10
Nbthread: 2
Nbproc: 1
Process_num: 1
Pid: 3103635
Uptime: 19d 18h11m49s
Uptime_sec: 1707109
Memmax_MB: 0
PoolAlloc_MB: 1
PoolUsed_MB: 0
PoolFailed: 0
Ulimit-n: 200059
Maxsock: 200059
Maxconn: 99999
Hard_maxconn: 99999
CurrConns: 8
CumConns: 19677218
CumReq: 2740072
MaxSslConns: 0
CurrSslConns: 0
CumSslConns: 0
Maxpipes: 0
PipesUsed: 0
PipesFree: 0
ConnRate: 1
ConnRateLimit: 0
MaxConnRate: 2161
SessRate: 1
SessRateLimit: 0
MaxSessRate: 2161
SslRate: 0
SslRateLimit: 0
MaxSslRate: 0
SslFrontendKeyRate: 0
SslFrontendMaxKeyRate: 0
SslFrontendSessionReuse_pct: 0
SslBackendKeyRate: 0
SslBackendMaxKeyRate: 0
SslCacheLookups: 0
SslCacheMisses: 0
CompressBpsIn: 0
CompressBpsOut: 0
CompressBpsRateLim: 0
ZlibMemUsage: 0
MaxZlibMemUsage: 0
Tasks: 32
Run_queue: 1
Idle_pct: 100
node: pcfreak
Stopping: 0
Jobs: 13
Unstoppable Jobs: 0
Listeners: 4
ActivePeers: 0
ConnectedPeers: 0
DroppedLogs: 0
BusyPolling: 0
FailedResolutions: 0
TotalBytesOut: 744390344175
BytesOutRate: 30080
DebugCommandsIssued: 0
Build info: 2.2.9-2+deb11u5
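When tuning, usually only a few of these fields matter at a time. As a sketch (the function name conn_usage is hypothetical), the current connection count can be put in relation to Maxconn by reading the "Name: value" pairs with awk:

```shell
# Print current connections as a share of Maxconn from "show info" output on stdin.
conn_usage() {
    awk -F': ' '
        $1 == "CurrConns" { cur = $2 }
        $1 == "Maxconn"   { max = $2 }
        END {
            if (max > 0) printf "%d/%d connections (%.2f%%)\n", cur, max, 100 * cur / max
            else print "Maxconn not found"
        }'
}

# Usage:
#   echo "show info" | socat stdio /var/run/haproxy/haproxy.sock | conn_usage
```

The same pattern works for any other pair of fields, for example SessRate against MaxSessRate.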

root@pcfreak:/home/hipo/info# echo "show errors" | socat stdio /var/run/haproxy/haproxy.sock
Total events captured on [14/Dec/2023:17:29:17.930] : 0

6. View all opened sessions with the session age (time since it was opened) and session exp (expiry)

root@pcfreak:/home/hipo/info# echo "show sess" | socat stdio /var/run/haproxy/haproxy.sock
0x56097a763560: proto=tcpv4 src=113.120.74.123:54651 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=37s calls=3 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,rx=1m58s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=,wx=1m58s,ax=] s0=[8,200000h,fd=24,ex=] s1=[8,40018h,fd=25,ex=] exp=1m51s
0x56097a812830: proto=tcpv4 src=190.216.236.134:35526 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=17s calls=3 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m42s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m42s,wx=,ax=] s0=[8,200008h,fd=40,ex=] s1=[8,200018h,fd=41,ex=] exp=12s
0x56097a784ad0: proto=tcpv4 src=103.225.203.131:33835 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=17s calls=2 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m44s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m44s,wx=,ax=] s0=[8,200008h,fd=20,ex=] s1=[8,200018h,fd=21,ex=] exp=13s
0x7f87b0082cc0: proto=tcpv4 src=190.216.236.134:35528 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=14s calls=3 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m46s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m46s,wx=,ax=] s0=[8,200008h,fd=34,ex=] s1=[8,200018h,fd=35,ex=] exp=15s
0x7f87b0089e10: proto=tcpv4 src=40.130.105.242:50669 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=11s calls=2 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m49s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m49s,wx=,ax=] s0=[8,200008h,fd=15,ex=] s1=[8,200018h,fd=16,ex=] exp=18s
0x7f87b010b450: proto=tcpv4 src=64.62.202.82:37562 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=7s calls=2 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m52s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m52s,wx=,ax=] s0=[8,200008h,fd=26,ex=] s1=[8,200018h,fd=27,ex=] exp=22s
0x56097a7b8bc0: proto=tcpv4 src=85.208.96.211:54226 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=0s calls=2 rate=2 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m59s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m59s,wx=,ax=] s0=[8,200008h,fd=22,ex=] s1=[8,200018h,fd=23,ex=] exp=29s
0x7f87b008ec00: proto=tcpv4 src=3.135.192.206:60258 fe=http-in be=http-websrv srv=ha1server-2 ts=00 age=0s calls=2 rate=2 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,rx=1m59s,wx=1m59s,ax=] rp[f=80008000h,i=0,an=00h,rx=1m59s,wx=1m59s,ax=] s0=[8,200008h,fd=28,ex=] s1=[8,200018h,fd=29,ex=] exp=29s
0x56097a7b2490: proto=tcpv4 src=45.147.249.119:62283 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=0s calls=3 rate=3 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m59s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m59s,wx=,ax=] s0=[8,200008h,fd=17,ex=] s1=[8,200018h,fd=18,ex=] exp=29s
0x7f87b0114f90: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 age=0s calls=1 rate=1 cpu=0 lat=0 rq[f=c48200h,i=0,an=00h,rx=,wx=,ax=] rp[f=80008002h,i=0,an=00h,rx=,wx=,ax=] s0=[8,200008h,fd=30,ex=] s1=[8,204018h,fd=-1,ex=] exp=

root@pcfreak:/home/hipo/info#
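On a busy proxy the session list gets long, so it helps to filter it by age. The following is a rough sketch, assuming the age= field always uses the d/h/m/s suffixes seen above; the function name old_sessions and the 3600 second default are made up for illustration.

```shell
# List sessions from "show sess" output (stdin) whose age is at least $1 seconds.
old_sessions() {
    awk -v min="${1:-3600}" '
        # Convert an age such as "37s", "1m13s" or "2d3h" into seconds.
        function to_sec(s,    total, n, u, mult) {
            total = 0
            while (match(s, /^[0-9]+[dhms]/)) {
                n = substr(s, 1, RLENGTH - 1) + 0
                u = substr(s, RLENGTH, 1)
                if (u == "d") mult = 86400
                else if (u == "h") mult = 3600
                else if (u == "m") mult = 60
                else mult = 1
                total += n * mult
                s = substr(s, RLENGTH + 1)
            }
            return total
        }
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^age=/) {
                    age = substr($i, 5)
                    if (to_sec(age) >= min + 0) {
                        id = $1
                        sub(/:$/, "", id)   # drop the trailing colon of the session pointer
                        print id, age
                    }
                    break
                }
            }
        }'
}

# Usage: sessions older than one day
#   echo "show sess" | socat stdio /var/run/haproxy/haproxy.sock | old_sessions 86400
```

The printed identifiers are the session pointers from the first column, which is what the shutdown session command expects.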

7. Disabling an haproxy frontend via UNIX socket

If a frontend gets broken, and this is detected in Zabbix or whatever other monitoring tool you use, you can use the haproxy stats interface to disable that frontend. The name must be a frontend name as defined in haproxy.cfg (here https-in; https-websrv is a backend).

root@pcfreak:/home/hipo/info# echo "disable frontend https-in" | socat stdio /var/run/haproxy/haproxy.sock
…

8. Show general haproxy statistics (can tell you much about the health of customer connections) and the state of connections to the backends

Let's check the details for frontends / backends; that is done with the show stat command.

root@pcfreak:/home/hipo/info# echo "show stat" | socat stdio /var/run/haproxy/haproxy.sock
# pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max,check_status,check_code,check_duration,hrsp_1xx,hrsp_2xx,hrsp_3xx,hrsp_4xx,hrsp_5xx,hrsp_other,hanafail,req_rate,req_rate_max,req_tot,cli_abrt,srv_abrt,comp_in,comp_out,comp_byp,comp_rsp,lastsess,last_chk,last_agt,qtime,ctime,rtime,ttime,agent_status,agent_code,agent_duration,check_desc,agent_desc,check_rise,check_fall,check_health,agent_rise,agent_fall,agent_health,addr,cookie,mode,algo,conn_rate,conn_rate_max,conn_tot,intercepted,dcon,dses,wrew,connect,reuse,cache_lookups,cache_hits,srv_icur,src_ilim,qtime_max,ctime_max,rtime_max,ttime_max,eint,idle_conn_cur,safe_conn_cur,used_conn_cur,need_conn_est,
http-in,FRONTEND,,,0,142,99999,371655,166897324,1462777381,0,0,62,,,,,OPEN,,,,,,,,,1,2,0,,,,0,0,0,1080,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,0,1080,371655,,0,0,0,,,,,,,,,,,0,,,,,
http-websrv,ha1server-1,0,0,0,12,,9635,3893561,64880833,,0,,0,3,15,0,UP,254,0,1,41,9,4686,34728,,1,3,1,,4924,,2,0,,56,L4OK,,0,,,,,,,,,,,900,168,,,,,1292679,,,0,0,0,2843,,,,Layer4 check passed,,2,3,4,,,,192.168.0.209:80,,tcp,,,,,,,,0,9635,0,,,0,,0,15024,0,672888,0,0,0,0,12,
http-websrv,ha1server-2,0,0,0,142,,321867,149300590,1350577153,,0,,1,4,30,0,UP,255,1,0,37,10,4685,89418,,1,3,2,,111864,,2,0,,1080,L4OK,,0,,,,,,,,,,,37161,4822,,,,,6,,,0,12,0,2120,,,,Layer4 check passed,,2,3,4,,,,192.168.0.200:80,,tcp,,,,,,,,0,321867,0,,,0,,0,30223,0,1783442,0,0,0,0,142,

List continues here
…

9. Using netcat instead of socat to talk to the UNIX socket

If you don't have the socat command on the server but you have netcat installed, you can also send commands to the running haproxy daemon via nc's -U option, which makes it connect to a UNIX socket.

-U Use UNIX-domain sockets. Cannot be used together with -F or -x.

root@pcfreak:/home/hipo/info# echo "set server"|nc -U /var/run/haproxy/haproxy.sock
Require 'backend/server'.

10. Get only statistics about running LB Backends and Frontends

To get only the haproxy statistics lines for the running Load Balancer BACKENDs and FRONTENDs, filter the show stat output with awk:

root@pcfreak:/home/hipo/info# echo "show stat" | sudo socat unix-connect:/var/run/haproxy/haproxy.sock stdio | awk '/BACKEND/'
http-websrv,BACKEND,0,0,2,142,10000,371880,167022255,1462985601,0,0,,1,7,46,0,UP,255,1,4,,0,1709835,0,,1,3,0,,118878,,1,0,,1080,,,,,,,,,,,,,,38782,5001,0,0,0,0,5,,,0,8,0,2034,,,,,,,,,,,,,,tcp,source,,,,,,,0,371864,0,,,,,0,30223,0,1783442,0,,,,,
https-websrv,BACKEND,0,0,5,461,10000,2374328,3083873321,740021649129,0,0,,28,42,626,0,UP,255,1,4,,0,1709835,0,,1,5,0,,474550,,1,1,,1081,,,,,,,,,,,,,,451783,72307,0,0,0,0,0,,,0,0,0,6651,,,,,,,,,,,,,,tcp,source,,,,,,,0,2374837,0,,,,,0,32794,0,46414141,0,,,,,

As you can see, there are two configured BACKENDs and both are in UP state; the other possibility is DOWN, if haproxy can't reach the backend.

root@pcfreak:/home/hipo/info# echo "show stat" | sudo socat unix-connect:/var/run/haproxy/haproxy.sock stdio | awk '/FRONTEND/'
http-in,FRONTEND,,,2,142,99999,371887,167024040,1462990718,0,0,62,,,,,OPEN,,,,,,,,,1,2,0,,,,0,1,0,1080,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,1,1080,371887,,0,0,0,,,,,,,,,,,0,,,,,
https-in,FRONTEND,,,4,461,99999,2374337,3083881912,740021909870,0,0,112,,,,,OPEN,,,,,,,,,1,4,0,,,,0,1,0,1081,,,,,,,,,,,0,0,0,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,tcp,,1,1081,2374337,,0,0,0,,,,,,,,,,,0,,,,,
root@pcfreak:/home/hipo/info#
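The full CSV lines are hard to read, and counting commas to find a column is error prone. A possible approach, sketched here with a made-up function name stat_cols, is to look the columns up by the names given in the "# pxname,svname,..." header line of show stat:

```shell
# Print pxname/svname plus the columns named in $1 (comma-separated)
# from "show stat" CSV on stdin, locating them via the header line.
stat_cols() {
    awk -F',' -v want="${1:-status}" '
        /^# / {
            sub(/^# /, "", $1)                  # header starts with "# pxname"
            for (i = 1; i <= NF; i++) col[$i] = i
            n = split(want, names, ",")
            next
        }
        NF > 1 && n > 0 {
            line = $1 "/" $2
            for (k = 1; k <= n; k++)
                line = line " " names[k] "=" ((names[k] in col) ? $(col[names[k]]) : "?")
            print line
        }'
}

# Usage: current sessions and status of every proxy and server
#   echo "show stat" | socat stdio /var/run/haproxy/haproxy.sock | stat_cols scur,status
```

Unknown column names are printed with a "?" value instead of silently picking the wrong field.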

As you can see from the show help list, you can change the supported maxconn, change the proxy rate-limit, change timeouts configured in haproxy.cfg sections in real time, or even modify ACLs dynamically for Backends and Frontends.

Modifications made this way apply only to the running process, so they should also be written to the haproxy.cfg file of the Haproxy instance.
To check them, reload the haproxy instance with the newly written haproxy.cfg and query it again through the Unix socket.

11. Shutting down specific opened sessions

Shutting down a specific session that has been open for too long is particularly useful, especially if you have some kind of VPN encryption device in front of the Haproxy server and a buggy Application Backend server that fails to close sessions properly and in time. After reviewing the sessions with "show sess", you can cut off a specific session that has been hanging for days:

root@pcfreak:/home/hipo/info# echo "shutdown session 0x56097a7707d0" | socat stdio /var/run/haproxy/haproxy.sock

12. Sending shutdown to a backend server on a certain configured LB service

To take a configured server in a certain backend out of service (maintenance mode) after listing it:

root@pcfreak:/home/hipo/info# echo "disable server bk_mybackend/srv_myserver" | socat /var/run/haproxy/haproxy.sock stdio

12. Sending multiple commands to haproxy socket

# echo "show info;show stat" | socat /var/run/haproxy/haproxy.sock stdio
…

13. Report table usage information or dump table data content

It is possible to view the exact entries stored inside the stick tables. To get a list of the configured tables on the haproxy:

root@pcfreak:/home/hipo/info# echo "show table" | socat /var/run/haproxy/haproxy.sock stdio
# table: https-websrv, type: ip, size:204800, used:498
# table: http-websrv, type: ip, size:204800, used:74

To get the exact records of the IPs stored inside https-websrv:

root@pcfreak:/home/hipo/info# echo "show table https-websrv" | socat /var/run/haproxy/haproxy.sock stdio|head -10
# table: https-websrv, type: ip, size:204800, used:502
0x56097a7444e0: key=2.147.73.42 use=0 exp=1090876 server_id=2 server_name=ha1server-2
0x56097a792ac0: key=3.14.130.119 use=0 exp=1038004 server_id=2 server_name=ha1server-2
0x7f87b006a4e0: key=3.15.203.28 use=0 exp=1536721 server_id=2 server_name=ha1server-2
0x56097a7467f0: key=3.16.54.132 use=0 exp=387191 server_id=2 server_name=ha1server-2
0x7f87b0075f90: key=3.17.180.28 use=0 exp=353211 server_id=2 server_name=ha1server-2
0x56097a821b10: key=3.23.114.130 use=0 exp=1521100 server_id=2 server_name=ha1server-2
0x56097a7475b0: key=3.129.250.144 use=0 exp=121043 server_id=2 server_name=ha1server-2
0x7f87b004d240: key=3.134.112.27 use=0 exp=1182169 server_id=2 server_name=ha1server-2
0x56097a754c90: key=3.135.192.206 use=0 exp=1383882 server_id=2 server_name=ha1server-2
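With hundreds of entries it is often more useful to see how the stuck clients are spread over the servers than to read every record. A small sketch (the function name table_per_server is hypothetical) that counts entries per server_name:

```shell
# Count stick-table entries per server_name from "show table <name>" output on stdin.
table_per_server() {
    awk '
        /^#/ { next }                       # skip the "# table: ..." summary line
        {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^server_name=/) count[substr($i, 13)]++
        }
        END { for (s in count) print s, count[s] }' | sort
}

# Usage:
#   echo "show table https-websrv" | socat stdio /var/run/haproxy/haproxy.sock | table_per_server
```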

14. Show information about Haproxy startup

Sometimes, when log rotation is set up on the server and haproxy's logs are rotated to a central logging server, it might be hard to get information about Haproxy startup messages (warnings, errors etc.).
As digging through old haproxy logs might be tedious, you can simply get them via the stats interface.

root@pcfreak:/home/hipo/info# echo "show startup-logs" | socat unix-connect:/var/run/haproxy/haproxy.sock stdio

[WARNING] 327/231534 (3103633) : parsing [/etc/haproxy/haproxy.cfg:62] : 'fullconn' ignored because frontend 'http-in' has no backend capability. Maybe you want 'maxconn' instead ?
[WARNING] 327/231534 (3103633) : parsing [/etc/haproxy/haproxy.cfg:69] : 'maxconn' ignored because backend 'http-websrv' has no frontend capability. Maybe you want 'fullconn' instead ?
[WARNING] 327/231534 (3103633) : parsing [/etc/haproxy/haproxy.cfg:114] : 'maxconn' ignored because backend 'https-websrv' has no frontend capability. Maybe you want 'fullconn' instead ?
[WARNING] 327/231534 (3103633) : config : missing timeouts for frontend 'http-in'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
[WARNING] 327/231534 (3103633) : config : 'option forwardfor' ignored for frontend 'http-in' as it requires HTTP mode.
[WARNING] 327/231534 (3103633) : config : 'option forwardfor' ignored for backend 'http-websrv' as it requires HTTP mode.
[WARNING] 327/231534 (3103633) : config : missing timeouts for frontend 'https-in'.
| While not properly invalid, you will certainly encounter various problems
| with such a configuration. To fix this, please ensure that all following
| timeouts are set to a non-zero value: 'client', 'connect', 'server'.
[WARNING] 327/231534 (3103633) : config : 'option forwardfor' ignored for frontend 'https-in' as it requires HTTP mode.
[WARNING] 327/231534 (3103633) : config : 'option forwardfor' ignored for backend 'https-websrv' as it requires HTTP mode.

15. Disable / Enable health check for haproxy configured backend

Disabling health checks is useful, especially on non-production server environments, during the integration phase of an application with the Haproxy load balancer.

The general syntax is like this:

> disable health backend/server1

root@pcfreak:/home/hipo/info# echo "show servers state" | socat unix-connect:/var/run/haproxy/haproxy.sock stdio
1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state srv_uweight srv_iweight srv_time_since_last_change srv_check_status srv_check_result srv_check_health srv_check_state srv_agent_state bk_f_forced_id srv_f_forced_id srv_fqdn srv_port srvrecord
3 http-websrv 1 ha1server-1 192.168.0.209 2 0 254 254 13709 6 3 4 6 0 0 0 - 80 -
3 http-websrv 2 ha1server-2 192.168.0.200 2 0 255 255 13708 6 3 4 6 0 0 0 - 80 -
3 http-websrv 3 ha1server-3 192.168.1.30 2 0 252 252 13707 6 3 4 6 0 0 0 - 80 -
3 http-websrv 4 ha1server-4 192.168.1.14 2 0 253 253 13709 6 3 4 6 0 0 0 - 80 -
3 http-websrv 5 ha1server-5 192.168.0.1 2 0 251 251 1717867 6 3 4 6 0 0 0 - 80 -
5 https-websrv 1 ha1server-1 192.168.0.209 2 0 254 254 13709 6 3 4 6 0 0 0 - 443 -
5 https-websrv 2 ha1server-2 192.168.0.200 2 0 255 255 13708 6 3 4 6 0 0 0 - 443 -
5 https-websrv 3 ha1server-3 192.168.1.30 2 0 252 252 13707 6 3 4 6 0 0 0 - 443 -
5 https-websrv 4 ha1server-4 192.168.1.14 2 0 253 253 13709 6 3 4 6 0 0 0 - 443 -
5 https-websrv 5 ha1server-5 192.168.0.1 2 0 251 251 1717867 6 3 4 6 0 0 0 - 443 -
6 MASTER 1 cur-1 - 2 0 0 0 1717867 1 0 0 0 0 0 0 - 0 -

Let's disable health checks for the ha1server-1 server in the http-websrv backend.

root@pcfreak:/home/hipo/info# echo "disable health http-websrv/ha1server-1" | socat unix-connect:/var/run/haproxy/haproxy.sock stdio

To re-enable the health checks:

root@pcfreak:/home/hipo/info# echo "enable health http-websrv/ha1server-1" | socat unix-connect:/var/run/haproxy/haproxy.sock stdio

16. Change weight for server

If you have round-robin load balancing configured and a predefined weight that determines what share of the traffic is sent to which application server, you can change that weight dynamically via the UNIX socket interface. Note that a backend using a static LB algorithm accepts only the weights 0% and 100%, as the error in the last example shows.

# Change weight by percentage of its original value

# socat unix-connect:/var/run/haproxy/haproxy.sock stdio

> set server be_app/webserv1 weight 50%

# Change weight in proportion to other servers
> set server be_app/webserv1 weight 100

root@pcfreak:/home/hipo/info# socat unix-connect:/var/run/haproxy/haproxy.sock stdio
set server http-websrv/ha1server-1 weight 50%
Backend is using a static LB algorithm and only accepts weights '0%' and '100%'.

17. Draining traffic from server / backend App in case of Maintenance

You can gradually drain traffic away from a particular server if that backend Application server should be put in maintenance mode for an update or whatever. The drain option is very interesting and, combined with scripting, opens a lot of possibilities for the Load balancer system administrator to add extra automation.

To drain, use the set server command with the state argument set to drain:

# Drain traffic
> set server backend_app/server1 state drain

# Allow server to accept traffic again
> set server backend_app/server1 state ready

root@pcfreak:/home/hipo/info# socat unix-connect:/var/run/haproxy/haproxy.sock stdio
set server http-websrv/ha1server-1 state drain

root@pcfreak:/home/hipo/info# socat unix-connect:/var/run/haproxy/haproxy.sock stdio
set server http-websrv/ha1server-1 state ready
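For scripted maintenance it is safer to validate the arguments before anything reaches the socket. The sketch below only builds the command string and leaves sending it to you; the function name srv_state_cmd is made up, and it assumes the three states accepted by set server ... state are ready, drain and maint.

```shell
# Build (but do not send) a "set server ... state" command after checking the arguments.
srv_state_cmd() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "usage: srv_state_cmd <backend> <server> ready|drain|maint" >&2
        return 1
    fi
    case "$3" in
        ready|drain|maint) ;;
        *)
            echo "usage: srv_state_cmd <backend> <server> ready|drain|maint" >&2
            return 1
            ;;
    esac
    printf 'set server %s/%s state %s\n' "$1" "$2" "$3"
}

# Usage:
#   srv_state_cmd http-websrv ha1server-1 drain | socat stdio /var/run/haproxy/haproxy.sock
```

Keeping the building and the sending separate makes it easy to print the command first as a dry run.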

18. Run Interactive Mode connection to haproxy UNIX stats socket

For a haproxy that has multiple configured proxy rules (backends / frontends), it is nice to use the interactive mode.
Instead of processing a single line of semicolon-separated commands, HAProxy takes one command at a time and waits for the user.
In interactive mode, HAProxy sends a “>” character and waits for an input command. After a command is submitted, HAProxy sends back the result and waits for a new command.
The interactive mode is especially useful during the phase of integrating a new haproxy with an application, where multiple things have to be tuned on the fly without reloading the haproxy again and again.

On RPM based distros socat is compiled with the readline interactive capability. Thus, to use the haproxy stats interactive mode on RHEL / CentOS / Fedora and other RPM based distros, simply use:

# socat /var/run/haproxy.sock readline
> show info
Name: HAProxy
Version: 2.2.9-2+deb11u5
Release_date: 2023/04/10
Nbthread: 2
Nbproc: 1
Process_num: 1
Pid: 3103635
Uptime: 19d 20h48m50s
Uptime_sec: 1716530
Memmax_MB: 0
PoolAlloc_MB: 1
PoolUsed_MB: 0
PoolFailed: 0
Ulimit-n: 200059
Maxsock: 200059
Maxconn: 99999
Hard_maxconn: 99999
CurrConns: 9
CumConns: 19789176
CumReq: 2757976
MaxSslConns: 0
CurrSslConns: 0
CumSslConns: 0
Maxpipes: 0
PipesUsed: 0
PipesFree: 0
ConnRate: 0
ConnRateLimit: 0
MaxConnRate: 2161
SessRate: 0
SessRateLimit: 0
MaxSessRate: 2161
SslRate: 0
SslRateLimit: 0
MaxSslRate: 0
SslFrontendKeyRate: 0
SslFrontendMaxKeyRate: 0
SslFrontendSessionReuse_pct: 0
SslBackendKeyRate: 0
SslBackendMaxKeyRate: 0
SslCacheLookups: 0
SslCacheMisses: 0
CompressBpsIn: 0
CompressBpsOut: 0
CompressBpsRateLim: 0
ZlibMemUsage: 0
MaxZlibMemUsage: 0
Tasks: 35
Run_queue: 1
Idle_pct: 100
node: pcfreak
Stopping: 0
Jobs: 14
Unstoppable Jobs: 0
Listeners: 4
ActivePeers: 0
ConnectedPeers: 0
DroppedLogs: 0
BusyPolling: 0
FailedResolutions: 0
TotalBytesOut: 744964070459
BytesOutRate: 0
DebugCommandsIssued: 0
Build info: 2.2.9-2+deb11u5

On Deb based distributions such as Debian, Ubuntu and Mint Linux, unfortunately the readline interactive mode is disabled in socat due to a licensing conflict around the GPL-licensed readline library.

root@pcfreak:/home/hipo/info# socat -V|awk 'NR < 5 || tolower($0) ~ /readline/'
socat by Gerhard Rieger and contributors – see www.dest-unreach.org
socat version 1.7.4.1 on Feb 3 2021 12:58:17
running on Linux version #1 SMP Debian 5.10.179-3 (2023-07-27), release 5.10.0-23-amd64, machine x86_64
features:
#undef WITH_READLINE

There is, however, a workaround to emulate the interactive mode on Debian, like this:

root@pcfreak:/home/hipo/info# while [ 1 ]; do socat - /var/run/haproxy/haproxy.sock ; done

show table
# table: https-websrv, type: ip, size:204800, used:511
# table: http-websrv, type: ip, size:204800, used:67

show sess
0x56097a784ad0: proto=tcpv4 src=45.61.161.66:51416 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=1m13s calls=3 rate=0 cpu=0 lat=0 rq[f=848000h,i=0,an=00h,rx=47s,wx=,ax=] rp[f=80048000h,i=0,an=00h,rx=47s,wx=,ax=] s0=[8,200008h,fd=17,ex=] s1=[8,200018h,fd=23,ex=] exp=47s
0x56097a7707d0: proto=tcpv4 src=47.128.41.242:39372 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=16s calls=2 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m45s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m45s,wx=,ax=] s0=[8,200008h,fd=35,ex=] s1=[8,200018h,fd=36,ex=] exp=14s
0x56097a781300: proto=tcpv4 src=54.36.148.40:17439 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=13s calls=2 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m47s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m47s,wx=,ax=] s0=[8,200008h,fd=26,ex=] s1=[8,200018h,fd=28,ex=] exp=17s
0x56097a7fca80: proto=tcpv4 src=18.217.94.243:4940 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=7s calls=2 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m53s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m53s,wx=,ax=] s0=[8,200008h,fd=21,ex=] s1=[8,200018h,fd=22,ex=] exp=23s
0x7f87b00778c0: proto=tcpv4 src=85.208.96.206:51708 fe=https-in be=https-websrv srv=ha1server-2 ts=00 age=4s calls=3 rate=0 cpu=0 lat=0 rq[f=848202h,i=0,an=00h,rx=1m56s,wx=,ax=] rp[f=80048202h,i=0,an=00h,rx=1m56s,wx=,ax=] s0=[8,200008h,fd=20,ex=] s1=[8,200018h,fd=24,ex=] exp=26s
0x56097a80c1e0: proto=unix_stream src=unix:1 fe=GLOBAL be=<NONE> srv=<none> ts=00 age=3s calls=1 rate=0 cpu=0 lat=0 rq[f=c48202h,i=0,an=00h,rx=10s,wx=,ax=] rp[f=80008002h,i=0,an=00h,rx=,wx=,ax=] s0=[8,200008h,fd=15,ex=] s1=[8,204018h,fd=-1,ex=] exp=7s

To end the endless loop, press CTRL + z and kill the stopped job %1 by running:

# kill %1

Sum up of what we learned

What we learned in this article is how to use socat and netcat to connect to and dynamically manage haproxy via its stats interface, without reloading the proxy itself. We learned how to view various statistics and information on the proxy, its existing tables, caches and session information (such as age and expiry). You've also seen how to disable / enable configured backends, as well as how to list the available backends and frontends and their state.
You've seen how the drain option can be used to slowly drain connections towards a configured backend, in case you need to do maintenance on a backend node.
It was also shown how to shut down specific long-lived sessions that have been hanging and creating trouble for the app backends.

Finally, you've seen how to open an interactive connection to the haproxy socket and send commands in a row with socat (on distros where it is compiled with readline support), as well as how to emulate the interactive mode on the rest of the distros, whose socat is missing readline support.


Create Linux High Availability Load Balancer Cluster with Keepalived and Haproxy on Linux

Tuesday, March 15th, 2022

Configuring Linux HA (High Availability) for an application with Haproxy is already done across many websites on the Internet, and serious corporations with crucial infrastructure have long ago adopted keepalived to provide High Availability clustering on the application level.
Usually companies choose to build HA clusters with Haproxy using the Pacemaker and Corosync cluster tools.
However, keepalived is a commonly used alternative if you don't have the opportunity to bring up a High Availability cluster with Pacemaker / Corosync / pcs (Pacemaker Configuration System). A typical reason is that the machines you need to configure the cluster on are not physical but VMWare Virtual Machines, which cannot have a separate Admin LAN and Heartbeat LAN configured as we usually do on a Pacemaker cluster, because the 5 Ethernet LAN card interfaces of the VMWare Hypervisor hosts are configured as a BOND (e.g. all the incoming traffic to the VMWare vSphere HV is received on one virtual bond interface).

I assume you have 2 separate vSphere Hypervisor physical machines in separate racks and on separate switches hosting the two VMs.
For the article, I'll call the two brand new Virtual Machines, brought up with some installation automation software such as Terraform or Ansible, vm-server1 and vm-server2; both have some recent version of Linux installed.

In that scenario you want High Availability for the VMs on the application level, to assure at least one of the two is available at any time if the other gets broken due to a malfunction of the HV, a network connectivity issue, or because the VM OS has crashed.
One relatively easy solution is to use keepalived and configure a single High Availability Virtual IP (VIP) address, i.e. 10.10.10.1, which floats between the two VMs so that at any time at least one of the two VMs is reachable on the network.

haproxy_keepalived-vip-ip-diagram-linux

Having a VIP is quite a common solution in the corporate world, as it makes it pretty easy to add an F5 Load Balancer in front of the keepalived cluster setup and have 3 levels of security isolation, which usually consist of:

1. Physical (access to the hardware or Virtualization hosts)
2. System Access (the mechanism to access the system: login credentials (users / passwords), proxies, entry servers leading to the DMZ-ed network)
3. Application Level (access to different programs behind L2 and data based on the specific identity of the individual user,
special Secondary UserID, Factor authentication, biometrics etc.)

1. Install keepalived and haproxy on machines

Depending on the type of Linux OS:

On both machines

[root@server1:~]# yum install -y keepalived haproxy
…

If you have to install keepalived / haproxy on Debian / Ubuntu and other Deb based Linux distros

[root@server1:~]# apt install keepalived haproxy --yes
…

2. Configure haproxy (haproxy.cfg) on both server1 and server2

Create some /etc/haproxy/haproxy.cfg configuration

[root@server1:~]# vim /etc/haproxy/haproxy.cfg

#———————————————————————
# Global settings
#———————————————————————
global
log 127.0.0.1 local6 debug
chroot /var/lib/haproxy
pidfile /run/haproxy.pid
stats socket /var/lib/haproxy/haproxy.sock mode 0600 level admin
maxconn 4000
user haproxy
group haproxy
daemon
#debug
#quiet

#———————————————————————
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#———————————————————————
defaults
mode tcp
log global
# option dontlognull
# option httpclose
# option httplog
# option forwardfor
option redispatch
option log-health-checks
timeout connect 10000 # default 10 second time out if a backend is not found
timeout client 300000
timeout server 300000
maxconn 60000
retries 3

#———————————————————————
# round robin balancing between the various backends
#———————————————————————

listen FRONTEND_APPNAME1
bind 10.10.10.1:15000
mode tcp
option tcplog
# #log global
log-format [%t]\ %ci:%cp\ %bi:%bp\ %b/%s:%sp\ %Tw/%Tc/%Tt\ %B\ %ts\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq
balance roundrobin
timeout client 350000
timeout server 350000
timeout connect 35000
server app-server1 10.10.10.55:30000 weight 1 check port 68888
server app-server2 10.10.10.55:30000 weight 2 check port 68888

listen FRONTEND_APPNAME2
bind 10.10.10.1:15000
mode tcp
option tcplog
#log global
log-format [%t]\ %ci:%cp\ %bi:%bp\ %b/%s:%sp\ %Tw/%Tc/%Tt\ %B\ %ts\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq
balance roundrobin
timeout client 350000
timeout server 350000
timeout connect 35000
server app-server1 10.10.10.55:30000 weight 5
server app-server2 10.10.10.55:30000 weight 5

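You can get a copy of above haproxy.cfg configuration here.
Treat the addresses and ports in it as placeholders for your own environment: give each listen section its own bind port, point app-server1 and app-server2 to the real backend IPs, and pick a real health check port (68888 is above the maximum TCP port number, 65535).
Once configured roll it on.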

[root@server1:~]# systemctl start haproxy
[root@server1:~]# ps -ef|grep -i hapro
root 285047 1 0 Mar07 ? 00:00:00 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
haproxy 285050 285047 0 Mar07 ? 00:00:26 /usr/sbin/haproxy -Ws -f /etc/haproxy/haproxy.cfg -p /run/haproxy.pid
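
Before copying the same file to server2, it may be worth giving it a quick sanity check. haproxy itself can validate a config with haproxy -c -f /etc/haproxy/haproxy.cfg. On top of that, below is a minimal sketch (assuming bash and awk are available; list_bad_ports is a hypothetical helper name, not part of haproxy) that prints every bind, server or check port lying outside the valid 1-65535 range:

```shell
# list_bad_ports FILE
# Hypothetical helper: report ports outside 1-65535 found in "bind ip:port",
# "server name ip:port" and "... port N" (e.g. "check port N") settings.
list_bad_ports() {
    local cfg="$1"
    awk '
        {
            sub(/[ \t]*#.*$/, "")          # strip comments
            if (NF == 0) next              # nothing left on this line
            for (i = 1; i <= NF; i++) {
                port = ""
                if ($i == "port" && i < NF) {
                    port = $(i + 1)        # keyword form: port N
                } else if (($1 == "bind" || $1 == "server") && $i ~ /^[0-9.]+:[0-9]+$/) {
                    n = split($i, parts, ":")
                    port = parts[n]        # address form: ip:port
                }
                if (port ~ /^[0-9]+$/ && (port + 0 < 1 || port + 0 > 65535))
                    printf "line %d: invalid port %s\n", NR, port
            }
        }
    ' "$cfg"
}
```

Run it as list_bad_ports /etc/haproxy/haproxy.cfg; no output means every port it recognised is within range. It only looks at IPv4 ip:port tokens, so it is a complement to haproxy -c, not a replacement.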

Bring up haproxy on the server2 machine as well, by placing the same configuration there and starting the proxy.

[root@server2:~]# vim /etc/haproxy/haproxy.cfg
…

…

3. Configure keepalived on both servers

We'll be configuring 2 nodes with keepalived, though if necessary this can be easily extended and you can add more nodes.
First we move aside the original or existing keepalived.conf (just in case we need it later on, or if something else was already configured manually by someone, which could be the case on servers inherited from another sysadmin).

[root@server1:~]# mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.orig
[root@server2:~]# mv /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.orig

a. Configure keepalived to serve as a MASTER Node

[root@server1:~]# vim /etc/keepalived/keepalived.conf

# Master Node
global_defs {
router_id server1-fqdn # The hostname of this host.

enable_script_security
# Synchro of the state of the connections between the LBs on the eth0 interface
lvs_sync_daemon eth0

notification_email {
linuxadmin@notify-domain.com # Email address for notifications
}
notification_email_from keepalived@server1-fqdn # The from address for the notifications
smtp_server 127.0.0.1 # SMTP server address
smtp_connect_timeout 15
}

vrrp_script haproxy {
script "killall -0 haproxy"
interval 2
weight 2
user root
}

vrrp_instance LB_VIP_QA {
virtual_router_id 50
advert_int 1
priority 51

state MASTER
interface eth0
smtp_alert # Enable Notifications Via Email

authentication {
auth_type PASS
auth_pass testp141

}
### Commented because running on VM on VMWare
## unicast_src_ip 10.44.192.134 # Private IP address of master
## unicast_peer {
## 10.44.192.135 # Private IP address of the backup haproxy
## }

# }
# master node with higher priority preferred node for Virtual IP if both keepalived up
### priority 51
### state MASTER
### interface eth0
virtual_ipaddress {
10.10.10.1 dev eth0 # The virtual IP address that will be shared between MASTER and BACKUP
}
track_script {
haproxy
}
}

To download a copy of the Master keepalived.conf configuration click here

Below are a few interesting configuration variables worth a few words. Most of them are obvious by their names, but for more clarity I'll give a list here with a short description of each:

vrrp_instance – defines an individual instance of the VRRP protocol running on an interface.
state – defines the initial state that the instance should start in (i.e. MASTER / BACKUP).
interface – defines the interface that VRRP runs on.
virtual_router_id – identifies the VRRP instance; it has to be the same on all keepalived nodes sharing the VIP (50 on both nodes here) and unique among the VRRP instances on the same network, otherwise the master / slave election won't function properly.
priority – the advertised priority; the higher the priority, the more important the respective configured keepalived node is.
advert_int – specifies the frequency that advertisements are sent at (1 second, in this case).
authentication – specifies the information necessary for servers participating in VRRP to authenticate with each other. In this case, a simple password is defined.
An important thing to note is that only the first eight (8) characters will be used, as described in man keepalived.conf (the keepalived.conf variables documentation).
!!! Nota Bene !!! – The password set on each node should match for the nodes to be able to authenticate!
virtual_ipaddress – defines the IP addresses (there can be multiple) that VRRP is responsible for.
notification_email – the notification email to which alerts will be sent in case keepalived on 1 node is stopped (e.g. the MASTER node switches from host 1 to 2).
notification_email_from – the sender email address from which the email will originate.
! NB ! In order for notification_email to work you need to have a configured MTA or mail relay (set to local MTA) to another SMTP, e.g. have something like Postfix or Qmail configured.
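
To see how priority and the vrrp_script weight from the config above play together, here is a small worked example. It is a sketch under the assumption (taken from the keepalived.conf man page) that a positive weight is added to the priority while the tracked script succeeds and a negative weight is applied while it fails; effective_priority is a hypothetical helper, not a keepalived command:

```shell
# effective_priority BASE WEIGHT SCRIPT_OK
# BASE      - the "priority" value of the vrrp_instance
# WEIGHT    - the "weight" value of the tracked vrrp_script
# SCRIPT_OK - 1 if the script exits 0 (haproxy is running), 0 otherwise
effective_priority() {
    local base="$1" weight="$2" script_ok="$3"
    if [ "$weight" -gt 0 ] && [ "$script_ok" -eq 1 ]; then
        echo $(( base + weight ))   # positive weight: bonus while the check passes
    elif [ "$weight" -lt 0 ] && [ "$script_ok" -eq 0 ]; then
        echo $(( base + weight ))   # negative weight: penalty while the check fails
    else
        echo "$base"                # no adjustment
    fi
}

effective_priority 51 2 1   # server1, haproxy running
effective_priority 50 2 1   # server2, haproxy running
effective_priority 51 2 0   # server1, haproxy stopped
```

With the values used in this article server1 advertises 51 + 2 = 53 and server2 50 + 2 = 52 while haproxy runs on both, which is the prio 53 visible in the VRRP advertisements captured with tcpdump in step 6. If haproxy dies on server1 its priority falls back to 51, lower than 52, so server2 takes over the VIP even though keepalived on server1 is still running.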

b. Configure keepalived to serve as a SLAVE Node

[root@server2:~]# vim /etc/keepalived/keepalived.conf

#Slave keepalived
global_defs {
router_id server2-fqdn # The hostname of this host!

enable_script_security
# Synchro of the state of the connections between the LBs on the eth0 interface
lvs_sync_daemon eth0

notification_email {
linuxadmin@notify-host.com # Email address for notifications
}
notification_email_from keepalived@server2-fqdn # The from address for the notifications
smtp_server 127.0.0.1 # SMTP server address
smtp_connect_timeout 15
}

vrrp_script haproxy {
script "killall -0 haproxy"
interval 2
weight 2
user root
}

vrrp_instance LB_VIP_QA {
virtual_router_id 50
advert_int 1
priority 50

state BACKUP
interface eth0
smtp_alert # Enable Notifications Via Email

authentication {
auth_type PASS
auth_pass testp141
}
### Commented because running on VM on VMWare
## unicast_src_ip 10.10.192.135 # Private IP address of this backup node
## unicast_peer {
## 10.10.192.134 # Private IP address of the master haproxy
## }

### priority 50
### state BACKUP
### interface eth0
virtual_ipaddress {
10.10.10.1 dev eth0 # The virtual IP address that will be shared between MASTER and BACKUP.
}
track_script {
haproxy
}
}

Download the keepalived.conf slave config here

c. Set required sysctl parameters for haproxy to work as expected

[root@server1:~]# vim /etc/sysctl.conf
#Haproxy config
# haproxy
net.core.somaxconn=65535
net.ipv4.ip_local_port_range = 1024 65000
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_syn_backlog = 10240
net.ipv4.tcp_max_tw_buckets = 400000
net.ipv4.tcp_max_orphans = 60000
net.ipv4.tcp_synack_retries = 3
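
The values written to /etc/sysctl.conf become active only after they are loaded, for example with sysctl -p or on the next reboot. Below is a minimal sketch, assuming bash and a Linux /proc/sys tree, that lists the keys whose live value still differs from the file (sysctl_drift and squeeze are hypothetical helper names):

```shell
# squeeze: collapse runs of whitespace to one space and trim both ends
squeeze() { tr -s '[:space:]' ' ' | sed -e 's/^ //' -e 's/ $//'; }

# sysctl_drift FILE [ROOT]
# Read-only check: print each key of FILE whose live value under ROOT
# (default /proc/sys) differs from the configured one.
sysctl_drift() {
    local file="$1" root="${2:-/proc/sys}" key want have path
    while IFS='=' read -r key want; do
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        case "$key" in ''|'#'*) continue ;; esac      # skip blanks and comments
        want=$(printf '%s' "$want" | squeeze)
        # net.ipv4.ip_nonlocal_bind -> ROOT/net/ipv4/ip_nonlocal_bind
        path="$root/$(printf '%s' "$key" | tr '.' '/')"
        if [ -r "$path" ]; then
            have=$(squeeze < "$path")
        else
            have="<missing>"
        fi
        [ "$have" = "$want" ] || echo "$key: configured=$want live=$have"
    done < "$file"
}
```

A typical run would be sysctl -p followed by sysctl_drift /etc/sysctl.conf on each node; an empty output means all configured keys are live. The optional second argument exists only so the function can be tried against a copy of the tree.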

4. Test Keepalived keepalived.conf configuration syntax is OK

[root@server1:~]# keepalived --config-test
(/etc/keepalived/keepalived.conf: Line 7) Unknown keyword 'lvs_sync_daemon_interface'
(/etc/keepalived/keepalived.conf: Line 21) Unable to set default user for vrrp script haproxy – removing
(/etc/keepalived/keepalived.conf: Line 31) (LB_VIP_QA) Specifying lvs_sync_daemon_interface against a vrrp is deprecated.
(/etc/keepalived/keepalived.conf: Line 31) Please use global lvs_sync_daemon
(/etc/keepalived/keepalived.conf: Line 35) Truncating auth_pass to 8 characters
(/etc/keepalived/keepalived.conf: Line 50) (LB_VIP_QA) track script haproxy not found, ignoring…

I experienced these errors because the first time I configured keepalived, I did not specify the user with which the vrrp script haproxy should run.
In prior versions of keepalived, leaving the field empty automatically assumed the vrrp script runs as root.
As of RHEL's keepalived-2.1.5-6.el8.x86_64, which I've been using, this is no longer so, and thus in the configuration above, as you can see, I've
set the user in the respective section to root.
The error Unknown keyword 'lvs_sync_daemon_interface' is also easily fixable by just replacing lvs_sync_daemon_interface with lvs_sync_daemon and reloading
keepalived.

Once keepalived is started you can see the process running in the process list on both machines.

[root@server1:~]# ps -ef |grep -i keepalived
root 1190884 1 0 18:50 ? 00:00:00 /usr/sbin/keepalived -D
root 1190885 1190884 0 18:50 ? 00:00:00 /usr/sbin/keepalived -D

Next step is to check the keepalived statuses as well as /var/log/keepalived.log

If everything is configured as expected on both keepalived nodes, you should see that one is master and one is slave, either in the status or in the log.

[root@server1:~]# systemctl restart keepalived

[root@server1:~]# systemctl status keepalived|grep -i state
Mar 14 18:59:02 server1-fqdn Keepalived_vrrp[1192003]: (LB_VIP_QA) Entering MASTER STATE

[root@server1:~]# systemctl status keepalived

● keepalived.service – LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Mon 2022-03-14 18:15:51 CET; 32min ago
Process: 1187587 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 1187589 (code=exited, status=0/SUCCESS)

Mar 14 18:15:04 server1lb-fqdn Keepalived_vrrp[1187590]: Sending gratuitous ARP on eth0 for 10.44.192.142
Mar 14 18:15:50 server1lb-fqdn systemd[1]: Stopping LVS and VRRP High Availability Monitor…
Mar 14 18:15:50 server1lb-fqdn Keepalived[1187589]: Stopping
Mar 14 18:15:50 server1lb-fqdn Keepalived_vrrp[1187590]: (LB_VIP_QA) sent 0 priority
Mar 14 18:15:50 server1lb-fqdn Keepalived_vrrp[1187590]: (LB_VIP_QA) removing VIPs.
Mar 14 18:15:51 server1lb-fqdn Keepalived_vrrp[1187590]: Stopped – used 0.002007 user time, 0.016303 system time
Mar 14 18:15:51 server1lb-fqdn Keepalived[1187589]: CPU usage (self/children) user: 0.000000/0.038715 system: 0.001061/0.166434
Mar 14 18:15:51 server1lb-fqdn Keepalived[1187589]: Stopped Keepalived v2.1.5 (07/13,2020)
Mar 14 18:15:51 server1lb-fqdn systemd[1]: keepalived.service: Succeeded.
Mar 14 18:15:51 server1lb-fqdn systemd[1]: Stopped LVS and VRRP High Availability Monitor

[root@server2:~]# systemctl status keepalived|grep -i state
Mar 14 18:59:02 server2-fqdn Keepalived_vrrp[297368]: (LB_VIP_QA) Entering BACKUP STATE

[root@server1:~]# grep -i state /var/log/keepalived.log
Mar 14 18:59:02 server1lb-fqdn Keepalived_vrrp[297368]: (LB_VIP_QA) Entering MASTER STATE

a. Fix Keepalived SECURITY VIOLATION – scripts are being executed but script_security not enabled.

When configuring keepalived for the first time we faced the following strange error in the keepalived status and in keepalived.log:

Feb 23 14:28:41 server1 Keepalived_vrrp[945478]: SECURITY VIOLATION – scripts are being executed but script_security not enabled.

To fix the keepalived SECURITY VIOLATION error:

In /etc/keepalived/keepalived.conf on both keepalived node hosts,
inside

global_defs {}

add the line

enable_script_security

right before the chunk

# Synchro of the state of the connections between the LBs on the eth0 interface
lvs_sync_daemon eth0

5. Prepare rsyslog configuration and include additional keepalived options to force keepalived to log into /var/log/keepalived.log

To force keepalived log into /var/log/keepalived.log on RHEL 8 / CentOS and other Redhat Package Manager (RPM) Linux distributions

[root@server1:~]# vim /etc/rsyslog.d/48_keepalived.conf

#2022/02/02: Keepalived logs to local7, save the messages
local7.* /var/log/keepalived.log
if ($programname == 'Keepalived') then -/var/log/keepalived.log
if ($programname == 'Keepalived_vrrp') then -/var/log/keepalived.log
& stop

[root@server:~]# touch /var/log/keepalived.log

Reload rsyslog to load new config

[root@server:~]# systemctl restart rsyslog
[root@server:~]# systemctl status rsyslog

● rsyslog.service – System Logging Service
Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/rsyslog.service.d
└─rsyslog-service.conf
Active: active (running) since Mon 2022-03-07 13:34:38 CET; 1 weeks 0 days ago
Docs: man:rsyslogd(8)
https://www.rsyslog.com/doc/
Main PID: 269574 (rsyslogd)
Tasks: 6 (limit: 100914)
Memory: 5.1M
CGroup: /system.slice/rsyslog.service
└─269574 /usr/sbin/rsyslogd -n

Mar 15 08:15:16 server1lb-fqdn rsyslogd[269574]: — MARK —
Mar 15 08:35:16 server1lb-fqdn rsyslogd[269574]: — MARK —
Mar 15 08:55:16 server1lb-fqdn rsyslogd[269574]: — MARK —

If keepalived is loaded but you still have no log written inside /var/log/keepalived.log, make keepalived log to the local7 syslog facility (-S 7) on both nodes:

[root@server1:~]# vim /etc/sysconfig/keepalived
KEEPALIVED_OPTIONS="-D -S 7"

[root@server2:~]# vim /etc/sysconfig/keepalived
KEEPALIVED_OPTIONS="-D -S 7"

[root@server1:~]# systemctl restart keepalived.service
[root@server1:~]# systemctl status keepalived

● keepalived.service – LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Thu 2022-02-24 12:12:20 CET; 2 weeks 4 days ago
Main PID: 1030501 (keepalived)
Tasks: 2 (limit: 100914)
Memory: 1.8M
CGroup: /system.slice/keepalived.service
├─1030501 /usr/sbin/keepalived -D
└─1030502 /usr/sbin/keepalived -D

Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

[root@server2:~]# systemctl restart keepalived.service
[root@server2:~]# systemctl status keepalived
…

6. Monitoring VRRP traffic of the two keepaliveds with tcpdump

Once both keepalived instances are up and running, a good thing is to check that the VRRP protocol traffic flows properly on both machines.
Keepalived VRRP does not use a TCP or UDP port; it communicates directly over IP as protocol number 112, thus you can simply sniff the traffic by its protocol number.

[root@server1:~]# tcpdump proto 112

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:08:07.356187 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:08.356297 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:09.356408 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:10.356511 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:11.356655 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20

[root@server2:~]# tcpdump proto 112

tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:08:07.356187 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:08.356297 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:09.356408 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:10.356511 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20
11:08:11.356655 IP server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20

As you can see the VRRP traffic on the network is originating only from server1lb-fqdn, this is so because host server1lb-fqdn is the keepalived configured master node.
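
The same observation can be turned into a quick split-brain check: if more than one host advertises the same vrid, both nodes consider themselves master, which typically means the advertisements do not reach the other node. A minimal sketch, assuming bash, awk and the one-line tcpdump output format shown above (vrrp_senders is a hypothetical helper name):

```shell
# vrrp_senders VRID
# Read "tcpdump proto 112" output on stdin and print every distinct host
# that advertises the given virtual router id, one per line.
vrrp_senders() {
    local vrid="$1"
    awk -v vrid="$vrid" '
        $2 == "IP" && $4 == ">" {
            for (i = 5; i < NF; i++) {
                # the id is printed as "vrid 50," so compare with the comma
                if ($i == "vrid" && $(i + 1) == (vrid ",")) {
                    if (!seen[$3]++) print $3
                }
            }
        }
    '
}
```

It could be fed with a short capture, for example timeout 5 tcpdump -l -n proto 112 2>/dev/null | vrrp_senders 50. A single line of output (the master) is the healthy case; two or more lines for the same vrid deserve a closer look at firewalls and multicast between the nodes.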

With authtype simple the password configured to authenticate between the two nodes travels over the network in clear text (see the auth field in the verbose output below), so it is easy to sniff and spoof. Thus if you're bringing up a keepalived service cluster make sure your security is tight: at best the machines should be in a special local LAN DMZ, do not configure the DMZ on the internet !!! 🙂 Or if you eventually decide to configure keepalived between remote hosts, make sure you use an encrypted VPN or SSH tunnels to tunnel the VRRP traffic.

[root@server1:~]# tcpdump proto 112 -vv
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 262144 bytes
11:36:25.530772 IP (tos 0xc0, ttl 255, id 59838, offset 0, flags [none], proto VRRP (112), length 40)
server1lb-fqdn > vrrp.mcast.net: vrrp server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20, addrs: VIPIP_QA auth "testp431"
11:36:26.530874 IP (tos 0xc0, ttl 255, id 59839, offset 0, flags [none], proto VRRP (112), length 40)
server1lb-fqdn > vrrp.mcast.net: vrrp server1lb-fqdn > vrrp.mcast.net: VRRPv2, Advertisement, vrid 50, prio 53, authtype simple, intvl 1s, length 20, addrs: VIPIP_QA auth "testp431"

Let's also check what floating IP is configured on the machines:

[root@server1:~]# ip -brief address show
lo UNKNOWN 127.0.0.1/8
eth0 UP 10.10.10.5/26 10.10.10.1/32

The 10.10.10.5 IP is the main IP set on LAN interface eth0, 10.10.10.1 is the floating IP, which as you can see is currently assigned by keepalived to the first node.

[root@server2:~]# ip -brief address show |grep -i 10.10.10.1

An empty output is returned as floating IP is currently configured on server1
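
Note that a plain grep for 10.10.10.1 would also match addresses such as 10.10.10.15, and the dots in the pattern match any character. For scripting the check, a stricter sketch could parse the ip -brief output and compare whole addresses (holds_vip is a hypothetical helper name; bash and awk assumed):

```shell
# holds_vip VIP
# Read "ip -brief address show" output on stdin; exit 0 only if one of the
# listed addresses is exactly VIP (whatever the prefix length), else exit 1.
holds_vip() {
    local vip="$1"
    awk -v vip="$vip" '
        {
            # columns: interface, state, then zero or more address/prefix
            for (i = 3; i <= NF; i++) {
                split($i, addr, "/")
                if (addr[1] == vip) found = 1
            }
        }
        END { exit(found ? 0 : 1) }
    '
}
```

Used on either node as ip -brief address show | holds_vip 10.10.10.1 && echo "VIP is here", it prints the message only on the node that currently owns the floating IP.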

To double check the IP is assigned on the correct machine, let's ping it and check which machine the MAC address behind the IP currently belongs to.

[root@server2:~]# ping 10.10.10.1
PING 10.10.10.1 (10.10.10.1) 56(84) bytes of data.
64 bytes from 10.10.10.1: icmp_seq=1 ttl=64 time=0.526 ms
^C
— 10.10.10.1 ping statistics —
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.526/0.526/0.526/0.000 ms
[root@server2:~]# arp -an |grep -i 10.10.10.1
? (10.10.10.1) at 00:48:54:91:83:7d [ether] on eth0
[root@server2:~]# ip a s|grep -i 00:48:54:91:83:7d
[root@server2:~]#

As you can see from the above output, the MAC is not found among the configured interfaces on server2.

[root@server1-fqdn:~]# /sbin/ip a s|grep -i 00:48:54:91:83:7d -B1 -A1
eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
link/ether 00:48:54:91:83:7d brd ff:ff:ff:ff:ff:ff
inet 10.10.10.1/26 brd 10.10.1.191 scope global noprefixroute eth0

As expected, the MAC is on keepalived node server1.

7. Testing keepalived on server1 and server2 machines VIP floating IP really works

To test the overall configuration just created, you should stop keepalived on the Master node and in the meantime keep an eye on the Slave node (server2), to see whether it can figure out the Master node is gone and switch its
state from BACKUP to MASTER. Once the secondary (Slave) keepalived becomes master, the floating IP 10.10.10.1 will be brought up on server2.

Let's assume that something went wrong with the server1 VM host, for example the machine crashed due to service overload, DDoS, a kernel bug or whatever other reason.
To simulate that we simply have to stop keepalived; the VRRP advertisements (IP protocol 112) will then no longer be sent and keepalived on node server2, once
it stops hearing from server1, should change itself to state MASTER.

[root@server1:~]# systemctl stop keepalived
[root@server1:~]# systemctl status keepalived
● keepalived.service – LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: inactive (dead) since Tue 2022-03-15 12:11:33 CET; 3s ago
Process: 1192001 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 1192002 (code=exited, status=0/SUCCESS)

Mar 14 18:59:07 server1lb-fqdn Keepalived_vrrp[1192003]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:32 server1lb-fqdn systemd[1]: Stopping LVS and VRRP High Availability Monitor…
Mar 15 12:11:32 server1lb-fqdn Keepalived[1192002]: Stopping
Mar 15 12:11:32 server1lb-fqdn Keepalived_vrrp[1192003]: (LB_VIP_QA) sent 0 priority
Mar 15 12:11:32 server1lb-fqdn Keepalived_vrrp[1192003]: (LB_VIP_QA) removing VIPs.
Mar 15 12:11:33 server1lb-fqdn Keepalived_vrrp[1192003]: Stopped – used 2.145252 user time, 15.513454 system time
Mar 15 12:11:33 server1lb-fqdn Keepalived[1192002]: CPU usage (self/children) user: 0.000000/44.555362 system: 0.001151/170.118126
Mar 15 12:11:33 server1lb-fqdn Keepalived[1192002]: Stopped Keepalived v2.1.5 (07/13,2020)
Mar 15 12:11:33 server1lb-fqdn systemd[1]: keepalived.service: Succeeded.
Mar 15 12:11:33 server1lb-fqdn systemd[1]: Stopped LVS and VRRP High Availability Monitor.

When keepalived goes off, you will also get a notification email, sent by the working keepalived node to the recipient email configured in keepalived.conf, with a simple message like:

=> VRRP Instance is no longer owning VRRP VIPs <=

Once keepalived is back up you will get another notification like:

=> VRRP Instance is now owning VRRP VIPs <=

[root@server2:~]# systemctl status keepalived
● keepalived.service – LVS and VRRP High Availability Monitor
Loaded: loaded (/usr/lib/systemd/system/keepalived.service; enabled; vendor preset: disabled)
Active: active (running) since Mon 2022-03-14 18:13:52 CET; 17h ago
Process: 297366 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
Main PID: 297367 (keepalived)
Tasks: 2 (limit: 100914)
Memory: 2.1M
CGroup: /system.slice/keepalived.service
├─297367 /usr/sbin/keepalived -D -S 7
└─297368 /usr/sbin/keepalived -D -S 7

Mar 15 12:11:33 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:33 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:33 server2lb-fqdn Keepalived_vrrp[297368]: Remote SMTP server [127.0.0.1]:25 connected.
Mar 15 12:11:33 server2lb-fqdn Keepalived_vrrp[297368]: SMTP alert successfully sent.
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: (LB_VIP_QA) Sending/queueing gratuitous ARPs on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1
Mar 15 12:11:38 server2lb-fqdn Keepalived_vrrp[297368]: Sending gratuitous ARP on eth0 for 10.10.10.1

[root@server2:~]# ip addr show|grep -i 10.10.10.1
inet 10.10.10.1/32 scope global eth0

As you see the VIP is now set on server2, just as expected. If the IP did not move, double check the keepalived.conf on both nodes for errors or misconfigurations.

To restore the initial order of things, so that server1 is the MASTER and server2 the SLAVE host, we just have to start keepalived on the server1 machine again.

[root@server1:~]# systemctl start keepalived

The automatic change of server1 back to MASTER node and the respective move of the VIP is done because of the higher priority we previously configured for server1 in keepalived.conf.

What we learned?

So what did we learn in this article?
We have seen how to easily install and configure a High Availability Load Balancer with Keepalived, with a single floating VIP address, 1 MASTER and 1 SLAVE host, and a Haproxy example config with a few frontends / App backends. We have seen how the config can be tested for potential errors, how we can monitor whether the VRRPv2 network traffic flows between the nodes and how to debug it further if necessary.
Some of the keepalived configuration options were roughly explained, but keepalived can do much more. Anyone seriously willing to deal with keepalived on a daily basis, or just to fine tune some already existing setups, should closely read its manual page "man keepalived.conf" as well as the official Redhat Linux documentation page on setting up a Linux cluster with Keepalived. (Be prepared for a small nightmare, as the documentation seems to be a bit chaotic and, I would even say, partly missing or leaving open questions on what the developers meant, which is not strange considering the havoc that is pretty much everywhere these days.)

Finally, once the keepalived hosts were prepared, it was shown how to test that the keepalived application cluster works and the floating IP does move between nodes in case one of the 2 keepalived nodes is inaccessible.

The same logic can be repeated multiple times and if necessary you can set multiple VIPs to expand the HA reachable IPs solution.

high-availability-with-two-vips-example-diagram

The presented idea uses haproxy as a proxy server to forward requests towards the Application backends (service machines). However, if you need to add another set of servers in the flow to process HTML / XHTML / PHP / Perl / Python programming code with some common webserver setup (Nginx / Apache / Tomcat / JBOSS) and enable an SSL certificate with, let's say, Letsencrypt, this can be done relatively easily. If you want to implement letsencrypt and a webserver, check this redundant SSL Load Balancing with haproxy & keepalived article.

That's all folks, hope you enjoyed.
If you need to configure keepalived Cluster or a consultancy write your query here 🙂


How to work around STARTTLS Qmail Thunderbird / Outlook mail sending (error) issues

Wednesday, October 26th, 2011

work-around-starttls-qmail-thunderbird-outlook-mail-sending-error-message

After configuring a new Qmail+POP3+IMAP with vpopmail install based on Thibs QmailRocks I faced some issues with configuring mail accounts in Mozilla Thunderbird. The problem is also present in Microsoft Outlook Express, as some colleagues working on Windows reported they can't configure their email accounts in Outlook either.

The issue was like this: the mail server is running fine, and I can send without issues directly from the server shell with the mail command. However, in Thunderbird I could only fetch the messages via POP3 or IMAP; whenever I tried to send one I got the error:

Sending of Message Failed The message could not be sent using SMTP server for an unknown reason. Please verify that SMTP server settings are correct and try again, or contact your network administrator

Here is a screenshot presenting the issue, taken from my Thunderbird:

Message sending Qmail STARTTLS failed unknown reason

The reason for this error is an automatic setting that is configured in Thunderbird at New Account Creation time:
Thunderbird queries the mail server and asks for the types of encryption available for both POP3 and the SMTP MX primary host.
Seeing that the server supports the STARTTLS data transfer encryption for both POP3 / IMAP, Thunderbird's auto configuration sets STARTTLS to be used with both SMTP and POP3.

The incorrect setting which is automatically filled in can be checked by following these Thunderbird menus:

Edit -> Account Settings -> Outgoing Server (SMTP)

If the configured mail account MX server is, let's say, mail.exampledomain.com, one needs to Edit the settings for this auto-configured SMTP server and will see settings like the ones shown in the screenshot below:

SMTP Server Outgoing Server incorrect settings STARTTLS reason / problem

You can see from the above screenshot that the auto-configured Connection Security setting is improperly set to STARTTLS. Usually STARTTLS should work on SMTP port 25, however it seems the problem lies in the MAIL FROM and RCPT TO being sent at the wrong time (I'm not sure if it's before or after the encryption is negotiated).

As a consequence of STARTTLS being detected as the correct encryption type for SMTP, the newly configured mail clients were unable to properly connect and send emails via the SMTP server listening on port 25.

I gave it a try, and changing Connection Security: STARTTLS to Connection Security: SSL/TLS immediately resolved the SMTP sending issues. So, as I found out, the SMTP server works just fine with my Qmail on port 465 with Connection Security: SSL/TLS. Hence, to work around the SMTP sending issues, I decided to completely stop qmail-smtpd from reporting STARTTLS as a supported encryption.
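Before disabling anything, it can help to confirm what the server really advertises in its EHLO reply. Here is a minimal sketch; advertises_starttls is a hypothetical helper name, and it assumes you feed it the raw EHLO reply captured with a tool of your choice:

```shell
#!/bin/sh
# advertises_starttls (hypothetical helper, not part of qmail)
# Reads an SMTP EHLO reply on stdin and succeeds only if the server
# lists STARTTLS among its capabilities.
advertises_starttls() {
    # Capability lines look like "250-STARTTLS" or "250 STARTTLS";
    # tr strips the carriage returns that SMTP line endings carry.
    tr -d '\r' | grep -Eiq '^250[- ]STARTTLS$'
}

# Example (assumes a netcat that closes after the server replies;
# some variants need an extra timeout option):
#   printf 'EHLO probe\r\nQUIT\r\n' | nc mail.exampledomain.com 25 \
#       | advertises_starttls && echo "STARTTLS is offered"
```

Running the check before and after the DENY_TLS change should show whether qmail-smtpd has really stopped offering STARTTLS.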

On Thibs QmailRocks and some other Qmail installations based more or less on the qmail.jms1.net daemontools service scripts, this can be done by simply changing the line:

DENY_TLS=0

to:

DENY_TLS=1

If the mail server was configured following Thibs' updated QmailRocks tutorial, the qmail start-up scripts in which this change has to be made are:

1. /service/qmail-smtpd/run
2. /service/qmail-smtpdssl/run

A quick way to do the DENY_TLS=0 to DENY_TLS=1 change via sed is like this (the chmod keeps the rewritten run scripts executable):

qmail# sed -e 's#DENY_TLS=0#DENY_TLS=1#g' /service/qmail-smtpd/run > /tmp/qmail-smtpd-run
qmail# sed -e 's#DENY_TLS=0#DENY_TLS=1#g' /service/qmail-smtpdssl/run > /tmp/qmail-smtpdssl-run
qmail# mv /tmp/qmail-smtpd-run /service/qmail-smtpd/run
qmail# mv /tmp/qmail-smtpdssl-run /service/qmail-smtpdssl/run
qmail# chmod 755 /service/qmail-smtpd/run /service/qmail-smtpdssl/run
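The same DENY_TLS substitution can be wrapped in a small helper that keeps a backup and rewrites the file in place, so owner and executable bit stay as they were. This is only a sketch: set_run_var is a hypothetical name, and it assumes the variable name and values contain no # or regex special characters.

```shell
#!/bin/sh
# set_run_var FILE NAME OLD NEW   (hypothetical helper)
# Replaces every NAME=OLD with NAME=NEW in FILE, the same substitution
# as the sed one-liner, but leaves a FILE.bak copy behind and writes
# through the existing file so its mode and owner are preserved.
set_run_var() {
    file=$1 name=$2 old=$3 new=$4
    cp -p "$file" "$file.bak" || return 1
    sed -e "s#$name=$old#$name=$new#g" "$file.bak" > "$file"
}

# Example:
#   set_run_var /service/qmail-smtpd/run    DENY_TLS 0 1
#   set_run_var /service/qmail-smtpdssl/run DENY_TLS 0 1
```

If something goes wrong, copying run.bak back over run restores the previous script.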

After the modifications, as usual, a qmail restart is required, e.g.:

qmail# qmailctl restart ...

Making these changes eradicated the sending issues. It's best practice to delete the account which had sending issues before and recreate it from scratch.
Hope this helps somebody out there who encounters the same issue. Cheers 😉

Tags: account creation, account settings, administratorHere, anunknown, auto configuration, com, configure, configured, connection security, creation time, email accounts, encryption, Express, host, How to, inco, issue, mail account, mail accounts, mail command, mail protocol, mail server, Microsoft, microsoft outlook express, Mozilla, mozilla thunderbird, mx server, network administrator, Outgoing, outgoing server, outlook, outlook mail, place, pop, Qmail, QmailRocks, reason, screenshot, Sending, server settings, server smtp, Shell, smtp, smtp port 25, starttls, thunderbirds, time, TLS, tmp, type, working
Posted in FreeBSD, Linux, Qmail, System Administration, Various, Web and CMS | 5 Comments »

Fixing Qmail 451 qq temporary problem (#4.3.0) / @4000000050587780174c60dc status: qmail-todo stop processing asap / status: exiting

Wednesday, September 19th, 2012

I’m in the process of installing a brand new Qmail mail (SMTP) server following the updated QmailRocks: Thibs QmailRocks install guide for Debian 6.0 Squeeze.
The install has gone smoothly so far, though I’ve already been at it for about 5 hours or so. I’m done with the basic install and am following Thibs' instructions to implement the validrcptto feature in Qmail.

Anyone who works with Qmail should already know that without validrcptto you get tons of SPAM problems and useless Qmail load, because Qmail attempts to deliver to mailboxes that do not exist on the local mail server ….

Fixing this whole mess is done with validrcptto. I have installed validrcptto numerous times myself, and almost every time I ended up in some kind of mess before fixing it once and for all. This time, of course (quite traditionally), the “story” repeated itself to piss me off for a while 🙂

After following the steps literally as described in Thibs' great Qmail install tutorial, I ended up with a Qmail mail server unable to deliver e-mails properly.

To debug why mails are not properly delivered by the mail server I used telnet:

root@qmail-host:/var/qmail/control# telnet localhost 25
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 This is Mail Pc-Freak.NET ESMTP
HELO localhost
250 This is Mail Pc-Freak.NET
MAIL FROM:<hipo@www.pc-freak.net>
250 ok
RCPT TO:<hipo@www.pc-freak.net>
250 ok
DATA
354 go ahead
asdfdsfafsd
.
451 qq temporary problem (#4.3.0)

Some time back, while configuring another fresh Qmail install, I ended up with exactly the same delivery error; I took the time to document how I fixed this weird qq temporary issue here.

As I thought that in “normal” software one error corresponds to one cause, I read my previous post and closely checked everything that had been wrong the last time I encountered the error; guess what, this time it wasn’t due to a non-running (missing) clamav-daemon. Even though this was not the issue, it partially pointed me to the cause (a problem with qmail-scanner.pl / spamd / pyzor / razor / dcc or some other part of this overall complexity ..).

The first logical thing was to check the logs. In /var/log/qmail/qmail-smtpd/current everything was looking good; my log looked like so:

root@qmail-host:/# tail -n 10 /var/log/qmail/qmail-smtpd/current
@40000000505877b91ab3aba4 tcpserver: end 23727 status 0
@40000000505877b91ab3af8c tcpserver: status: 0/30
@40000000505877f6273acefc tcpserver: status: 1/30
@40000000505877f6273ba9bc tcpserver: pid 23882 from 127.0.0.1
@40000000505877f6273f8dd4 tcpserver: ok 23882 mail.www.pc-freak.net:127.0.0.1:25 localhost:127.0.0.1::46769
@40000000505877fd1a3c647c qmail-smtpd[23882]: MFCHECK pass [127.0.0.1] www.pc-freak.net
@40000000505877fd1a3c935c qmail-smtpd[23882]: MAIL FROM:
@400000005058780123ba5eb4 qmail-smtpd[23882]: RCPT TO:
@4000000050587ccd179210b4 tcpserver: end 23882 status 256
@4000000050587ccd1792149c tcpserver: status: 0/30
root@qmail-host:/# tail -n 5 /var/log/qmail/qmail-smtpd/current
@40000000505877fd1a3c647c qmail-smtpd[23882]: MFCHECK pass [127.0.0.1] www.pc-freak.net

My second guess was to check /var/log/qmail/qmail-send/current, where I found errors like:

root@qmail-host:/# tail -n 10 /var/log/qmail/qmail-send/current
@4000000050584f8e0b799194 status: local 0/10 remote 0/120
@4000000050584f8e0b79957c end msg 9610091
@4000000050584fde2f5ebf44 status: qmail-todo stop processing asap
@4000000050584fde2f5ec32c status: exiting
@4000000050584fde32d2a884 status: local 0/10 remote 0/120
@4000000050584fe8136a44ac status: qmail-todo stop processing asap
@4000000050584fe8136a4894 status: exiting
@4000000050584fe8138b884c status: local 0/10 remote 0/120
@4000000050585014232903c4 status: qmail-todo stop processing asap
@4000000050585014232907ac status: exiting
@40000000505850142363e5fc status: local 0/10 remote 0/120
@40000000505851030773efa4 status: qmail-todo stop processing asap
@40000000505851030774320c status: exiting
@400000005058510307b5f214 status: local 0/10 remote 0/120
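The @4000... prefixes in these logs look like TAI64N timestamps, as written by daemontools' multilog. daemontools ships tai64nlocal to make them readable (pipe the tail output through it). Where that tool is not at hand, a rough conversion can be sketched in plain shell. tai64n_to_epoch is a hypothetical helper name, and the result should be treated as approximate because leap seconds are not handled:

```shell
#!/bin/sh
# tai64n_to_epoch LABEL   (hypothetical helper)
# Prints the approximate Unix time for a TAI64N label such as
# @4000000050587780174c60dc. The nanosecond part is dropped.
tai64n_to_epoch() {
    # Characters 2-17 are the 16 hex digits that hold the seconds.
    secs_hex=$(printf '%s' "$1" | cut -c2-17)
    # daemontools stamps are Unix time + 2^62 + 10 seconds.
    echo $(( 0x$secs_hex - 0x4000000000000000 - 10 ))
}

# Example (GNU date assumed for the -d @SECONDS form):
#   date -d "@$(tai64n_to_epoch @4000000050587780174c60dc)"
```

This makes it easier to line up the qmail-send entries with the syslog-style timestamps in /var/log/mail.log.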

As you can see for yourself, the errors give no insight into what the reason could be, so I checked /var/log/mail.log, only to find more errors there:

Sep 18 16:22:04 qmail-host qmail-scanner-queue.pl: X-Qmail-Scanner-2.10st:[pcfreak134797452279623171] d_m: output spotted from /usr/bin/reformime -x/var/spool/qscan/tmp/qmail-host/I134797452279623171/ (sh: /usr/bin/reformime: not found#012) - that shouldn't happen!

As the error points out, the whole issue is caused by a missing binary, /usr/bin/reformime. Logically I had to install reformime, so I did a quick apt-cache search reformime and saw reformime is part of the maildrop deb package. I thought it was installed, but after checking with:

dpkg -l |grep -i maildrop

I realized it was missing and installed it:

qmail-host:/# apt-get --yes install maildrop ....

That’s all; after a qmail restart, i.e.:

qmail-host:/# qmailctl restart
* Stopping qmail-smtpdssl.
* Stopping qmail-smtpd.
* Sending qmail-send SIGTERM and restarting.
* Restarting qmail-smtpd.
* Restarting qmail-smtpdssl.
* Restarting qmail-pop3d.

the qq temporary error was solved, and from there on qmail received and sent mails normally with validrcptto enabled. Cheers 😉
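Since the root cause was a single missing helper binary, a small pre-flight check run before testing delivery can save some log digging. A minimal sketch follows; missing_binaries is a hypothetical helper name, and the list of programs you pass to it depends on what your qmail-scanner setup actually calls:

```shell
#!/bin/sh
# missing_binaries NAME...   (hypothetical helper)
# Prints every NAME that cannot be found in PATH and returns
# non-zero if at least one of them is missing.
missing_binaries() {
    rc=0
    for bin in "$@"; do
        if ! command -v "$bin" >/dev/null 2>&1; then
            echo "$bin"
            rc=1
        fi
    done
    return $rc
}

# Example (the helper list here is only an assumption):
#   missing_binaries reformime clamdscan spamc
```

On Debian, apt-cache search (as used above) or apt-file search can then tell you which package ships a program the check reports as missing.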

Tags: exiting, Fixing Qmail, logs, net, processing, root, Sending, status, time, var
Posted in Qmail, System Administration | 1 Comment »

Thunderbird mail check problem fix to: “An error occurred sending mail: Unable to establish a secure link with SMTP server smtp.examplehost.com using STARTTLS since it doesn’t advertise that feature.”

Wednesday, August 22nd, 2012

Some clients of a mail domain on one of the qmail servers complained that there are problems sending e-mails with the Thunderbird (pop / imap) client.

The exact Thunderbird sending error is:

Unable to establish a secure link with SMTP server smtp.examplehost.com using STARTTLS since it doesn't advertise that feature. Switch off STARTTLS for that server or contact your service provider.

For almost half an hour I pondered why the heck this odd error happens when sending mail with a freshly (auto) configured Thunderbird mail address.
A few months back some clients were experiencing similar STARTTLS errors, so I went back to check my previous post to get an idea of what was wrong then and to determine whether the currently reported error had to do with the previous one. My previous post is here: How to work around STARTTLS Qmail Thunderbird / Outlook mail sending (error) issues

After reading up on the previous error and making some assumptions, I found out the whole problem lies in incorrectly set DNS records.
By default Thunderbird (and probably other mail clients) automatically configures smtp.examplehost.com as the SMTP server. If the DNS record for smtp.examplehost.com points to an IP address / host which belongs to another mail server, then every time Thunderbird tries to send email the incorrect smtp.examplehost.com is used, hence the mail sending fails with the error:

Unable to establish a secure link with SMTP server smtp.examplehost.com using STARTTLS since it doesn't advertise that feature. Switch off STARTTLS for that server or contact your service provider.

In my case the DNS for examplehost.com, which is the mail server host, was managed by GoDaddy’s DNS servers:

ns49.domaincontrol.com
ns50.domaincontrol.com

The A record for our domain smtp.examplehost.com was by default set in GoDaddy to point to an incorrect IP, so the fix was simply to change the smtp.examplehost.com record to point to the proper mail host.
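A quick way to confirm the record now points where it should is to compare the resolver's answer with the mail host's IP. This is just a sketch: a_record_matches is a hypothetical helper name, 203.0.113.10 is a placeholder address, and dig is assumed to be installed.

```shell
#!/bin/sh
# a_record_matches EXPECTED_IP   (hypothetical helper)
# Reads resolver output on stdin, one address per line (for example
# from `dig +short NAME A`), and succeeds only if EXPECTED_IP is
# among the answers.
a_record_matches() {
    grep -Fxq -- "$1"
}

# Example (placeholder IP, replace with the real mail host address):
#   dig +short smtp.examplehost.com A | a_record_matches 203.0.113.10 \
#       || echo "smtp.examplehost.com still points elsewhere"
```

Keep in mind that a changed record can take a while to propagate, so an old answer right after the change does not necessarily mean the edit failed.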

Another thing I had to do was change some variables in /var/qmail/supervise/qmail-smtpd/run and /var/qmail/supervise/qmail-smtpdssl/run

In both files I changed the variables:

SSL=0
ALLOW_INSECURE_AUTH=0

to:

SSL=1
ALLOW_INSECURE_AUTH=1

Also, the variables FORCE_TLS and DENY_TLS in /var/qmail/supervise/{qmail-smtpd,qmail-smtpdssl}/run should be:

FORCE_TLS=0
DENY_TLS=1
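To double-check that both run scripts ended up with the same values, the relevant assignments can be listed side by side. A minimal sketch, with show_tls_vars being a hypothetical helper name:

```shell
#!/bin/sh
# show_tls_vars FILE...   (hypothetical helper)
# Prints the SSL/TLS related assignments found in each run script,
# prefixed with the file name.
show_tls_vars() {
    # /dev/null as an extra argument makes grep print the file name
    # even when only one real file is given.
    grep -E '^[[:space:]]*(SSL|FORCE_TLS|DENY_TLS|ALLOW_INSECURE_AUTH)=' /dev/null "$@"
}

# Example:
#   show_tls_vars /var/qmail/supervise/qmail-smtpd/run \
#                 /var/qmail/supervise/qmail-smtpdssl/run
```

Any line that differs between the two files stands out immediately in the output.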

Though the problem was occurring in Mozilla Thunderbird, I’m sure the same email sending problem will be present if Microsoft Outlook Express or any other desktop pop3 client is used.
After these changes I had to restart the qmail server through qmailctl:

# qmailctl stop; sleep 5; qmailctl start

This fixed the clients' mail sending issues … hope this will help others looking for a way to remove STARTTLS, TLS, SSL qmail support …

Tags: check, ERROR, occurred, problem, Sending, Thunderbird
Posted in Qmail | 2 Comments »

Fix to mail forwarding error “Received-SPF: none (domain.com: domain at maildomain does not designate permitted sender hosts)”

Tuesday, October 18th, 2011

I’m configuring a new Exim server to relay / forward mail via a remote Qmail SMTP server.
Even though I had properly configured Exim to forward via my relaying mail server with host mail.domain.com, the mail forwarding from Exim -> Qmail still failed with the error:

Received-SPF: none (domain.com: domain at maildomain does not designate permitted sender hosts)

I pondered for a while on what might be causing this “mysterious” error, just to realize I forgot to add the IP address of my Exim mail server to the Qmail relay server.

To solve the error I had to add in /etc/tcp.smtp on my Qmail server a record for my Exim server IP address xx.xx.xx.xx, like so:

debian-server:~# echo 'xx.xx.xx.xx:allow,RELAYCLIENT="",QS_SPAMASSASSIN="0"' >> /etc/tcp.smtp

The QS_SPAMASSASSIN="0", as you might have guessed, instructs Qmail not to check received mails originating from IP xx.xx.xx.xx with spamassassin.
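Because a plain echo >> appends a new line on every run, repeating the command leaves duplicate rules in /etc/tcp.smtp. A small guard avoids that. This is only a sketch: add_tcp_rule is a hypothetical helper name, and it only checks for an identical line, not for a differently written rule covering the same IP.

```shell
#!/bin/sh
# add_tcp_rule FILE RULE   (hypothetical helper)
# Appends RULE to FILE unless an identical line is already there.
add_tcp_rule() {
    grep -Fxq -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Example:
#   add_tcp_rule /etc/tcp.smtp 'xx.xx.xx.xx:allow,RELAYCLIENT="",QS_SPAMASSASSIN="0"'
```

The rule only takes effect once /etc/tcp.smtp.cdb has been rebuilt.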

Finally, on the Qmail server, to load the new tcp.smtp settings I had to rebuild /etc/tcp.smtp.cdb and restart qmail:

- reload qmail cdb

linux-server:/var/qmail# qmailctl cdb
Reloaded /etc/tcp.smtp.

- restart qmail

linux-server:/var/qmail# qmailctl restart
Restarting qmail:
* Stopping qmail-smtpdssl.
* Stopping qmail-smtpd.
* Sending qmail-send SIGTERM and restarting.
* Restarting qmail-smtpd.
* Restarting qmail-smtpdssl.

This solved the issue and now mails are forwarded without problems via the Qmail SMTPD.

Tags: cdb, com, Configuring, domain, exim, forward mail, Forwarding, hosts, issue, Linux, mail, mail domain, mail server, none, Qmail, qmailctl, qs, quot, quot quot, relay, relay server, RELAYCLIENT, relaying mail, sender, Sending, server ip address, serverTo, SMTPD, smtpThe, spamassassin, SPF, var, while
Posted in Everyday Life, Linux, Qmail, System Administration, Various | No Comments »
