osync hangs and the service has to be restarted manually. #234

@gaionaus

Description

My network looks like this:
Win 10 PCs <-> | Samba, Ubuntu Server 20.04 (initiator) | <---- 50 Mbit DSL ----> | Ubuntu Server 20.04 (replica), Samba | <-> Win 10 PCs

The two Ubuntu servers run only osync and Samba. osync syncs two folders between the two Ubuntu servers.
Initiator: at night, a cron job also runs fsync (not osync) to back up the initiator folder to another local disk.
Replica: on Sunday nights, a cron job also runs fsync (not osync) to back up the replica folder to another local disk.
These are the only tasks the two servers run.

osync hangs and I have to restart the service manually, so I cannot leave it unattended.

To Reproduce
Unfortunately this happens randomly, anywhere from once to ten times per day, so I don't know how to help you reproduce it.

Expected behavior
Kill the processes that are still running, then continue monitoring the folder for changes.

Deviated behavior
It kills the processes that are still running, but then hangs.

Logs
I run osync as a service. It works fine, but at random times it becomes unresponsive. This is what the log says when that happens:
.
.
.
TIME: 2999 - Current tasks still running with pids [3402351].
TIME: 3001 - (WARN):Max soft execution time exceeded for task [Sync] with pids [3402351].
TIME: 3004 - Sent mail using sendmail command without attachment.
TIME: 3059 - Current tasks still running with pids [3402351].
TIME: 3119 - Current tasks still running with pids [3402351].
TIME: 3179 - Current tasks still running with pids [3402351].
TIME: 3239 - Current tasks still running with pids [3402351].
TIME: 3299 - Current tasks still running with pids [3402351].
TIME: 3359 - Current tasks still running with pids [3402351].
TIME: 3419 - Current tasks still running with pids [3402351].
TIME: 3479 - Current tasks still running with pids [3402351].
TIME: 3539 - Current tasks still running with pids [3402351].
TIME: 3599 - Current tasks still running with pids [3402351].
TIME: 3601 - (ERROR):Max hard execution time exceeded for task [Sync] with pids [3402351]. Stopping task execution.
TIME: 3601 - (CRITICAL):Cannot create replica file list in [/var/fs/].
TIME: 3601 - (WARN):Command was [/usr/bin/rsync --rsync-path="(o_O) rsync" -rltD -8 --modify-window=2 --omit-dir-times --no-whole-file -p -o -g --executability --exclude ".osync_workdir" -e "/usr/bin/ssh -i /home/gaionaus/.ssh/id_rsa -p 22" --list-only gaionaus@94.69.211.28:"/var/fs/" 2> "/tmp/osync.treeList.target.error.3400238.20220118T052539.886827061" | (grep -E "^-|^d|^l" || :) | (awk '{$1=$2=$3=$4="" ;print substr($0,5)}' || :) | (awk 'BEGIN { FS=" -> " } ; { print $1 }' || :) | (grep -v "^.$" || :) | sort > "/tmp/osync.treeList.target.3400238.20220118T052539.886827061"].
TIME: 3601 - (WARN):Command output
rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(644) [Receiver=3.1.3]
TIME: 3601 - Task with pid [3402351] stopped successfully.
TIME: 3604 - Sent mail using sendmail command without attachment.
TIME: 3605 - (ERROR):osync finished with errors.
TIME: 3608 - Sent mail using sendmail command without attachment.
Tue Jan 18 06:25:47 UTC 2022 - (ERROR):osync child exited with error.
Tue Jan 18 06:25:47 UTC 2022 - #### Monitoring now.
Tue Jan 18 06:35:47 UTC 2022 - #### 600 timeout reached, running sync.
TIME: 0 - -------------------------------------------------------------
TIME: 0 - Tue Jan 18 06:35:47 UTC 2022 - osync 1.2 script begin.
TIME: 0 - -------------------------------------------------------------
TIME: 0 - Sync task [sync_link] launched as gaionaus@fs1 (PID 3613865)

Environment (please complete the following information):
Osync Version:
PROGRAM_VERSION=1.2
PROGRAM_BUILD=2017032101
IS_STABLE=yes

  • OS: Ubuntu 20.04
  • Bitness: x64
  • Shell: bash

Additional context
It stays on the last line forever: "TIME: 0 - Sync task [sync_link] launched as gaionaus@fs1 (PID 3613865)"
and I have to restart the service manually.
One part of the log above seems strange: "/usr/bin/rsync --rsync-path="(o_O) rsync" "
It looks like a placeholder pattern that did not get replaced?
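
To capture more detail the next time osync stays on the "launched as ... (PID ...)" line, the state of that PID could be recorded before restarting the service. This is a minimal sketch, assuming a Linux system with /proc mounted; `inspect_pid` is a hypothetical helper name and not part of osync.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of osync: show what a possibly stuck
# process is doing, using only the Linux /proc filesystem.
inspect_pid() {
    local pid=$1
    if [ ! -r "/proc/$pid/status" ]; then
        echo "pid $pid: not running (or /proc is not available)"
        return 1
    fi
    # Name, state (R running, S sleeping, D uninterruptible I/O wait,
    # Z zombie) and parent pid of the process.
    grep -E '^(Name|State|PPid):' "/proc/$pid/status"
    # Full command line; /proc stores the arguments NUL-separated.
    tr '\0' ' ' < "/proc/$pid/cmdline"
    echo
}
```

For example, `inspect_pid 3613865` would be run with the PID from the last log line while the service is hung.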

Some settings from the osync conf file:
RSYNC_OPTIONAL_ARGS="--modify-window=2 --omit-dir-times"
SOFT_MAX_EXEC_TIME=3000
HARD_MAX_EXEC_TIME=3600
KEEP_LOGGING=60
MIN_WAIT=120
MAX_WAIT=600
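
The timings in the log appear to line up with these settings: the "still running" line repeats every 60 seconds (KEEP_LOGGING), the soft warning comes just after 3000 seconds (SOFT_MAX_EXEC_TIME), the hard stop just after 3600 seconds (HARD_MAX_EXEC_TIME), and the next sync starts after the 600 second MAX_WAIT timeout. Below is a minimal sketch of that soft/hard limit pattern, as an illustration only: `run_with_limits` is a hypothetical function and this is not osync's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical watchdog sketch, NOT osync's implementation.
# Usage: run_with_limits SOFT_SECONDS HARD_SECONDS LOG_INTERVAL COMMAND [ARGS...]
run_with_limits() {
    local soft=$1 hard=$2 keep_logging=$3
    shift 3

    "$@" &                      # start the task in the background
    local pid=$!
    local start=$SECONDS elapsed=0 warned=0 next_log=$keep_logging

    # Poll once per second while the task is still alive.
    while kill -0 "$pid" 2>/dev/null; do
        sleep 1
        elapsed=$(( SECONDS - start ))

        # Periodic "still running" message, like KEEP_LOGGING.
        if (( elapsed >= next_log )); then
            echo "TIME: $elapsed - Current tasks still running with pids [$pid]."
            next_log=$(( next_log + keep_logging ))
        fi

        # Soft limit: warn once, keep waiting.
        if (( warned == 0 && elapsed >= soft )); then
            echo "TIME: $elapsed - (WARN): soft limit exceeded for pid [$pid]."
            warned=1
        fi

        # Hard limit: stop the task and report failure.
        if (( elapsed >= hard )); then
            echo "TIME: $elapsed - (ERROR): hard limit exceeded, stopping pid [$pid]."
            kill -TERM "$pid" 2>/dev/null || true
            wait "$pid" 2>/dev/null || true
            return 1
        fi
    done

    # Task ended on its own: pass its exit status through.
    wait "$pid"
}
```

With the values above this would be called as `run_with_limits 3000 3600 60 some_sync_command`, where `some_sync_command` stands in for the sync task. In this sketch control always returns to the caller after the hard stop; what I see with osync is that the task is stopped and the next run is launched, but then nothing more is logged.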
