Revise README.md, and change timeout definitions. Adds sleep in ssh monitor.

Anthony J. Martinez 2021-12-04 12:02:46 -06:00
parent 64a39b77ac
commit 5413a1f8b5
Signed by: ajmartinez
GPG Key ID: A2206FDD769DBCFC
2 changed files with 28 additions and 15 deletions

View File

@ -99,9 +99,12 @@ file passed to the `-e` option on execution:
* `UP_HOST`: IP or FQDN of the host to probe during network verification. Default: `example.com`
* `UP_PORT`: The port to attempt a connection to on `UP_HOST`, integer. Default: `443`
* `UP_TIMEOUT`: The timeout duration for the connection attempt to `UP_HOST:UP_PORT`
* `UP_TIMEOUT_MIN`: The minimum timeout duration, in seconds, for the connection attempt to `UP_HOST:UP_PORT`
* This value takes an integer.
* The default value is `2`.
* `UP_TIMEOUT_MAX`: The maximum timeout duration, in seconds, for the connection attempt to `UP_HOST:UP_PORT`
* This value takes an integer.
* The default value is `8`.
* `INITIAL_MIN`: The initial random delay minimum value, in seconds. Default: `5`
* Minimum delay before verifying network connection exists after a failure
* `INITIAL_MAX`: The initial random delay maximum value, in seconds. Default: `15`
@ -114,12 +117,12 @@ file passed to the `-e` option on execution:
#### Example Manager Config
The following would override `UP_TIMEOUT` and `BACKOFF_MAX`. All other default values
The following would override `UP_TIMEOUT_MIN` and `BACKOFF_MAX`. All other default values
would remain intact.
`mgr.env`
```
UP_TIMEOUT=3
UP_TIMEOUT_MIN=3
BACKOFF_MAX=120
```
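As a rough illustration of how such an override could take effect, the sketch below sets the two timeout defaults listed above and then sources an env file on top of them. How the script actually loads the file passed to `-e` is not shown here, so `load_env` is a hypothetical helper and the temporary file only stands in for `mgr.env`.

```shell
#!/usr/bin/env bash
# Defaults as documented in this README.
UP_TIMEOUT_MIN=2
UP_TIMEOUT_MAX=8

# load_env is a hypothetical helper, not a function from the script itself.
# It sources a plain KEY=VALUE file so its assignments replace the defaults.
load_env() {
    local env_file="$1"
    [ -r "${env_file}" ] || return 1
    # shellcheck disable=SC1090
    . "${env_file}"
}

# Recreate the example mgr.env in a temporary file for demonstration only.
demo_env="$(mktemp)"
printf 'UP_TIMEOUT_MIN=3\nBACKOFF_MAX=120\n' > "${demo_env}"
load_env "${demo_env}"
rm -f "${demo_env}"

# UP_TIMEOUT_MIN and BACKOFF_MAX come from the file; UP_TIMEOUT_MAX keeps its default.
echo "UP_TIMEOUT_MIN=${UP_TIMEOUT_MIN} UP_TIMEOUT_MAX=${UP_TIMEOUT_MAX} BACKOFF_MAX=${BACKOFF_MAX}"
```

Only the values named in the file change; anything it leaves out keeps its default.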
@ -134,10 +137,12 @@ working. These include verification that the SSH PID is running, confirmation th
default route exists and is capable of carrying traffic, and finally an optional check
that indeed the SSH connection itself carries traffic to the remote host. For the last
check the presence of a `LocalForward` declaration on the line immediately following a
`# Monitor Port` comment in your tunnel configuration file is used. This is parsed to
determine the port to use to attempt receiving an SSH banner. If the banner is present
within the configured timeout, the connection is deemed OK, otherwise it's assumed
down. In this case the zombie PID is killed so a new process can be started and managed.
`# Monitor Port` comment in your tunnel configuration file is used. This is parsed to
determine the port to use to attempt receiving an SSH banner. If the banner is present
within the configured timeout, the connection is deemed OK. If no banner arrives, the
check is retried with increasing timeouts from `UP_TIMEOUT_MIN` to `UP_TIMEOUT_MAX`, with
a 3-second sleep between attempts. Should all attempts be exhausted, the connection is
deemed down, and the zombie PID is killed so a new process can be started and managed.
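To get a feel for how long the banner check can run before giving up, the retry schedule can be added up. The sketch below assumes each attempt runs for its full timeout and is followed by the sleep, which gives an approximate upper bound. `worst_case_seconds` is a hypothetical helper used only for this illustration, not a function from the script.

```shell
#!/usr/bin/env bash
# worst_case_seconds MIN MAX PAUSE
# Hypothetical helper: sums every timeout from MIN to MAX (stepping by one
# second, as the retry loop does) plus one PAUSE after each attempt.
worst_case_seconds() {
    local min="$1" max="$2" pause="$3"
    local total=0 timeout
    for ((timeout = min; timeout <= max; timeout++)); do
        total=$((total + timeout + pause))
    done
    echo "${total}"
}

# Defaults: timeouts 2..8 seconds with a 3-second sleep.
worst_case_seconds 2 8 3   # prints 56
```

With the defaults this comes to roughly one minute. Raising `UP_TIMEOUT_MAX` lengthens the loop more than linearly, which is worth comparing against any SSH alive interval settings in use.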
### Recommendations
@ -149,8 +154,8 @@ the `log()` function as suits the user.
### Contribution
Unless you explicitly state otherwise, any contribution intentionally submitted
for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any
additional terms or conditions.
for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual
licensed as above, without any additional terms or conditions.
### Contact

View File

@ -12,8 +12,8 @@ UP_HOST=example.com
UP_PORT=443
# Default Timeouts for connection verification
UP_TIMEOUT=2
UP_TIMEOUT_MAX=$(( UP_TIMEOUT * 3 ))
UP_TIMEOUT_MIN=2
UP_TIMEOUT_MAX=8
# Default Delay Settings
INITIAL_MIN=5
@ -159,10 +159,18 @@ check_ssh_monitor_port() {
return 0
else
# Try to connect to a defined host with increasing timeouts
for ((timeout=UP_TIMEOUT; timeout<=UP_TIMEOUT_MAX; timeout++)); do
if nc -vw "${timeout}" -i "${timeout}" localhost "${MON_PORT}" 2>/dev/null | grep -q SSH; then
for ((timeout=UP_TIMEOUT_MIN; timeout<=UP_TIMEOUT_MAX; timeout++)); do
if nc -vw "${timeout}" localhost "${MON_PORT}" < /dev/null 2>/dev/null | grep -q SSH; then
return 0
fi
# Rest between attempts, useful on mobile connections with high latency
# In a default configuration this will spend around one minute attempting
# to verify the SSH connection is alive. Take care that the total backoff
# and loop time here does not exceed the values set on either the client
# or server for alive intervals within SSH itself. Should you exceed those
# limits, outages may be artificially prolonged.
sleep 3
done
# Retries exhausted, the SSH connection is not passing traffic adequately
@ -182,8 +190,8 @@ verify_network() {
# Check to verify a default route exists
if ip r s | grep -q default; then
# Try to connect to a defined host with increasing timeouts
for ((timeout=UP_TIMEOUT; timeout<=UP_TIMEOUT_MAX; timeout++)); do
if nc -zw "${timeout}" "${UP_HOST}" "${UP_PORT}"; then
for ((timeout=UP_TIMEOUT_MIN; timeout<=UP_TIMEOUT_MAX; timeout++)); do
if nc -zw "${timeout}" "${UP_HOST}" "${UP_PORT}" < /dev/null; then
return 0
fi
done