Description
My network looks like this:
Win 10 PCs <-> | Samba, Ubuntu Server 20.04 (initiator) | <---- 50 Mbit DSL ----> | Ubuntu Server 20.04 (replica), Samba | <-> Win 10 PCs
The two Ubuntu servers run only osync and Samba. osync syncs two folders between the two servers.
Initiator: a nightly cron job also runs fsync (not osync) to back up the initiator folder to another local disk.
Replica: on Sunday nights a cron job also runs fsync (not osync) to back up the replica folder to another local disk.
These are the only tasks the two servers run.
osync hangs and I have to restart the service manually, so I cannot leave it unattended.
To Reproduce
Unfortunately this happens randomly, anywhere from once to ten times per day, so I don't know how to help you reproduce it.
Expected behavior
Kill the processes that are still running, then continue monitoring the folder for changes.
Deviated behavior
It kills the processes that are still running, but then hangs.
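To make the expected behavior concrete, here is a minimal sketch of "kill the task at a hard limit, then carry on with the next cycle". It uses coreutils `timeout` and is not how osync implements its own limits; the function names `run_with_hard_limit` and `run_cycles` are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: built on coreutils `timeout`, not on osync's own watchdog code.

# Run a command under a hard time limit given in seconds.
# At the limit `timeout` sends SIGTERM, then SIGKILL 5 seconds later if the
# command is still alive. The return value is the command's own exit status,
# 124 if SIGTERM ended it at the limit, or 137 if SIGKILL was needed.
run_with_hard_limit() {
    local limit=$1
    shift
    timeout --kill-after=5 "$limit" "$@"
}

# Hypothetical monitor loop: run the task COUNT times, waiting WAIT seconds
# between runs. A timed-out task is reported and the loop simply continues.
run_cycles() {
    local count=$1 limit=$2 wait=$3
    shift 3
    local i rc
    for (( i = 1; i <= count; i++ )); do
        rc=0
        run_with_hard_limit "$limit" "$@" || rc=$?
        if (( rc == 124 || rc == 137 )); then
            echo "cycle $i: task timed out after ${limit}s, continuing"
        else
            echo "cycle $i: task exited with status $rc"
        fi
        sleep "$wait"
    done
}
```

For example, `run_cycles 3 3600 600 ./my_sync_task` (with `./my_sync_task` standing in for the real sync command) would attempt three cycles, and a stuck cycle would not block the ones after it.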
Logs
I run osync as a service. It works fine most of the time, but at random it becomes unresponsive. This is what the log says when that happens:
.
.
.
TIME: 2999 - Current tasks still running with pids [3402351].
TIME: 3001 - (WARN):Max soft execution time exceeded for task [Sync] with pids [3402351].
TIME: 3004 - Sent mail using sendmail command without attachment.
TIME: 3059 - Current tasks still running with pids [3402351].
TIME: 3119 - Current tasks still running with pids [3402351].
TIME: 3179 - Current tasks still running with pids [3402351].
TIME: 3239 - Current tasks still running with pids [3402351].
TIME: 3299 - Current tasks still running with pids [3402351].
TIME: 3359 - Current tasks still running with pids [3402351].
TIME: 3419 - Current tasks still running with pids [3402351].
TIME: 3479 - Current tasks still running with pids [3402351].
TIME: 3539 - Current tasks still running with pids [3402351].
TIME: 3599 - Current tasks still running with pids [3402351].
TIME: 3601 - (ERROR):Max hard execution time exceeded for task [Sync] with pids [3402351]. Stopping task execution.
TIME: 3601 - (CRITICAL):Cannot create replica file list in [/var/fs/].
TIME: 3601 - (WARN):Command was [/usr/bin/rsync --rsync-path="(o_O) rsync" -rltD -8 --modify-window=2 --omit-dir-times --no-whole-file  -p -o -g --executability  --exclude ".osync_workdir"   -e "/usr/bin/ssh  -i /home/gaionaus/.ssh/id_rsa  -p 22" --list-only gaionaus@94.69.211.28:"/var/fs/" 2> "/tmp/osync.treeList.target.error.3400238.20220118T052539.886827061" | (grep -E "^-|^d|^l" || :) | (awk '{$1=$2=$3=$4="" ;print substr($0,5)}' || :) | (awk 'BEGIN { FS=" -> " } ; { print 
TIME: 3601 - (WARN):Command output
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(644) [Receiver=3.1.3]
TIME: 3601 - Task with pid [3402351] stopped successfully.
TIME: 3604 - Sent mail using sendmail command without attachment.
TIME: 3605 - (ERROR):osync finished with errors.
TIME: 3608 - Sent mail using sendmail command without attachment.
Tue Jan 18 06:25:47 UTC 2022 - (ERROR):osync child exited with error.
Tue Jan 18 06:25:47 UTC 2022 - #### Monitoring now.
Tue Jan 18 06:35:47 UTC 2022 - #### 600 timeout reached, running sync.
TIME: 0 - -------------------------------------------------------------
TIME: 0 - Tue Jan 18 06:35:47 UTC 2022 - osync 1.2 script begin.
TIME: 0 - -------------------------------------------------------------
TIME: 0 - Sync task [sync_link] launched as gaionaus@fs1 (PID 3613865)
Environment (please complete the following information):
Osync Version:
PROGRAM_VERSION=1.2
PROGRAM_BUILD=2017032101
IS_STABLE=yes
- OS: Ubuntu 20.04
- Bitness: x64
- Shell: bash
 
Additional context
It stays on the last line forever: "TIME: 0 - Sync task [sync_link] launched as gaionaus@fs1 (PID 3613865)"
I then have to restart the service manually.
One part of the log above looks strange to me: "/usr/bin/rsync --rsync-path="(o_O) rsync" "
Is that a placeholder pattern that never got replaced?
Some settings from the osync conf file:
RSYNC_OPTIONAL_ARGS="--modify-window=2 --omit-dir-times"
SOFT_MAX_EXEC_TIME=3000
HARD_MAX_EXEC_TIME=3600
KEEP_LOGGING=60
MIN_WAIT=120
MAX_WAIT=600
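As a stopgap until the hang is understood, a watchdog could restart the service whenever the log goes quiet. In the log above, osync writes a line at least every 60 seconds during a sync (KEEP_LOGGING=60) and waits at most 600 seconds between runs (MAX_WAIT=600), so a log untouched for noticeably longer than 600 seconds suggests a hang. The sketch below assumes that reasoning holds; the 900-second threshold, the log path, and the service name are all hypothetical and would need adjusting.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog sketch; not part of osync.

# Return 0 (stale) when FILE was last modified more than MAX_AGE seconds ago.
# Returns 2 if the file cannot be read. Relies on GNU stat (-c %Y), as on Ubuntu.
log_is_stale() {
    local file=$1 max_age=$2
    local now mtime
    now=$(date +%s)
    mtime=$(stat -c %Y "$file") || return 2
    (( now - mtime > max_age ))
}

# Restart SERVICE when LOG is stale. Both names are supplied by the caller;
# nothing here knows the real osync log path or unit name.
restart_if_stale() {
    local log=$1 max_age=$2 service=$3
    if log_is_stale "$log" "$max_age"; then
        echo "$(date) - $log idle for more than ${max_age}s, restarting $service"
        systemctl restart "$service"
    fi
}

# Example call (assumed path and unit name, e.g. from a cron job run as root):
# restart_if_stale /path/to/osync.log 900 my-osync.service
```

The threshold is deliberately larger than MAX_WAIT so that a normal idle period in monitor mode is not mistaken for a hang.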