I have a problem similar to "Chrony 3.1 refuses to sync with ntp server".

Scenario:

A newly installed server using SLES15 SP2 is running chrony 3.2. I had configured two NTP server pools that run the official ntpd 4.2.8p15 (it's all Intranet).

Problem:

Chrony "pulls" servers from the pool, but it never gets responses from the servers, and I wonder why. Is it a problem in chrony, a problem in ntpd, or a problem in my setup?

Debugging:

I'm using a hacked version of tcpdump that improves NTP packet decoding. A request from ntpd looks like this (it's actually an anycast request, monitored remotely):

10:22:29.373395 IP (tos 0xb8, ttl 4, id 21390, offset 0, flags [DF], proto UDP (17), length 100)
    172.20.16.13.123 > 239.192.123.21.123: [udp sum ok] NTP leap indicator=0 (Nominal), Version=4, Mode=3 (Client), length=72
    Stratum 2 (secondary reference), poll 6 (64s), precision -24
    Root Delay: 0.000106, Root dispersion: 0.004196, Reference-ID: 0xac140219
    Reference Timestamp:  3808714798.372973455 (2020-09-10T08:19:58.372973)
    Originator Timestamp: 0.000000000
    Receive Timestamp:    0.000000000
    Transmit Timestamp:   3808714949.372178320 (2020-09-10T08:22:29.372178)
    MAC: Key ID: 421, SHA1-Digest=48d73ad9 5b1d2401 9a8d3c02 91b849cb 28400475

In contrast the queries from chrony (monitored locally) look like this:

08:52:33.338684 IP (tos 0x0, ttl 64, id 4141, offset 0, flags [DF], proto UDP (17), length 76)
    h31.51625 > h03.ntp: [bad udp cksum 0x7894 -> 0xea6e!] NTPv4, length 48
        Client, Leap indicator:  (0), Stratum 0 (unspecified), poll 10 (1024s), precision 32
        Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: (unspec)
          Reference Timestamp:  0.000000000
          Originator Timestamp: 0.000000000
          Receive Timestamp:    0.000000000
          Transmit Timestamp:   502153526.517788040 (2052/01/06 06:33:42)
            Originator - Receive Timestamp:  0.000000000
            Originator - Transmit Timestamp: 502153526.517788040 (2052/01/06 06:33:42)

10:12:22.173989 IP (tos 0x0, ttl 64, id 58250, offset 0, flags [DF], proto UDP (17), length 76)
    h31.39573 > nm1.ntp: [bad udp cksum 0x6a92 -> 0x02d5!] NTP leap indicator=0 (Nominal), Version=4, Mode=3 (Client), length=48
    Stratum 0 (unspecified), poll 9 (512s), precision 32
    Root Delay: 0.000000, Root dispersion: 0.000000, Reference-ID: 00000000
    Reference Timestamp:  0.000000000
    Originator Timestamp: 0.000000000
    Receive Timestamp:    0.000000000
    Transmit Timestamp:   1885145870.079837521 (2095-11-03T02:06:06.079838)

At least the transmit timestamp looks odd, and I don't know whether the other fields are valid.
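To see how a decoder arrives at a date such as 2095 for these raw values, the conversion can be sketched as below. This is only a minimal sketch: it assumes the convention that a seconds value with its most significant bit clear belongs to the NTP era beginning in 2036 (the dates printed by the hacked tcpdump are consistent with that), and the name `ntp_to_utc` is made up for illustration.

```python
from datetime import datetime, timedelta, timezone

NTP_TO_UNIX = 2_208_988_800   # seconds between 1900-01-01 and 1970-01-01
ERA_LENGTH = 2 ** 32          # seconds in one NTP era (32-bit seconds field)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ntp_to_utc(ntp_seconds: int) -> datetime:
    """Convert the seconds part of an NTP timestamp to a UTC datetime.

    Assumption: values with the top bit clear are treated as belonging
    to the next era (after the 2036 rollover), not to 1900-1968.
    """
    if ntp_seconds < 2 ** 31:
        ntp_seconds += ERA_LENGTH
    return UNIX_EPOCH + timedelta(seconds=ntp_seconds - NTP_TO_UNIX)


# Transmit timestamp of the ntpd request shown above
print(ntp_to_utc(3808714949).isoformat())   # 2020-09-10T08:22:29+00:00
# Transmit timestamp of the chrony request to nm1 shown above
print(ntp_to_utc(1885145870).isoformat())   # 2095-11-03T02:06:06+00:00
```

Under that assumption, the 2095 date is simply what the raw 32-bit value decodes to; the sketch says nothing about why chrony put that value into the packet.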

The problem could be chrony's request packets, but it could also be that some filtering on the servers causes the requests to be ignored. I've verified that the packets arrive on at least one pool server, but I saw no response.

Actually one server outside of the pools (the one in the last packet shown) responds like this, keeping the odd originator timestamp:

10:12:22.174191 IP (tos 0xb8, ttl 63, id 30184, offset 0, flags [DF], proto UDP (17), length 76)
    nm1.ntp > h31.39573: [udp sum ok] NTP leap indicator=0 (Nominal), Version=4, Mode=4 (Server), length=48
    Stratum 3 (secondary reference), poll 9 (512s), precision -23
    Root Delay: 0.000518, Root dispersion: 0.025527, Reference-ID: 0xac141002
    Reference Timestamp:  3808714309.712800696 (2020-09-10T08:11:49.712801)
    Originator Timestamp: 1885145870.079837521 (2095-11-03T02:06:06.079838)
    Receive Timestamp:    3808714342.174128206 (2020-09-10T08:12:22.174128)
    Transmit Timestamp:   3808714342.174187417 (2020-09-10T08:12:22.174187)

More Debug Info

# chronyc -n
chrony version 3.2
Copyright (C) 1997-2003, 2007, 2009-2017 Richard P. Curnow and others
chrony comes with ABSOLUTELY NO WARRANTY.  This is free software, and
you are welcome to redistribute it under certain conditions.  See the
GNU General Public License version 2 for details.

chronyc> tracking
Reference ID    : 00000000 ()
Stratum         : 0
Ref time (UTC)  : Thu Jan 01 00:00:00 1970
System time     : 0.000000009 seconds slow of NTP time
Last offset     : +0.000000000 seconds
RMS offset      : 0.000000000 seconds
Frequency       : 86.905 ppm slow
Residual freq   : +0.000 ppm
Skew            : 0.000 ppm
Root delay      : 1.000000000 seconds
Root dispersion : 1.000000000 seconds
Update interval : 0.0 seconds
Leap status     : Not synchronised
chronyc> sources
210 Number of sources = 8
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^? 172.20.16.3                   0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.1                   0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.13                  0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.14                  0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.5                   0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.12                  0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.11                  0  10     0     -     +0ns[   +0ns] +/-    0ns
^- 172.20.2.1                    3  10   377   667   +16.2s[ +16.2s] +/-   36ms
chronyc> sourcestats
210 Number of sources = 8
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
172.20.16.3                 0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.1                 0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.13                0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.14                0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.5                 0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.12                0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.11                0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.2.1                 22  10  232m     -0.650      0.003   +16.2s    17us
chronyc> activity
200 OK
8 sources online
0 sources offline
0 sources doing burst (return to online)
0 sources doing burst (return to offline)
0 sources with unknown address
chronyc> ntpdata

Remote address  : [UNSPEC] (00000000)
Remote port     : 0
Local address   : [UNSPEC] (00000000)
Leap status     : Normal
Version         : 0
Mode            : Invalid
Stratum         : 0
Poll interval   : 0 (1 seconds)
Precision       : 0 (1.000000000 seconds)
Root delay      : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID    : 00000000 ()
Reference time  : Thu Jan 01 00:00:00 1970
Offset          : +0.000000000 seconds
Peer delay      : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time   : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests       : 000 000 0000
Interleaved     : No
Authenticated   : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX        : 672
Total RX        : 0
Total valid RX  : 0

Remote address  : [UNSPEC] (00000000)
Remote port     : 0
Local address   : [UNSPEC] (00000000)
Leap status     : Normal
Version         : 0
Mode            : Invalid
Stratum         : 0
Poll interval   : 0 (1 seconds)
Precision       : 0 (1.000000000 seconds)
Root delay      : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID    : 00000000 ()
Reference time  : Thu Jan 01 00:00:00 1970
Offset          : +0.000000000 seconds
Peer delay      : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time   : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests       : 000 000 0000
Interleaved     : No
Authenticated   : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX        : 672
Total RX        : 0
Total valid RX  : 0

Remote address  : [UNSPEC] (00000000)
Remote port     : 0
Local address   : [UNSPEC] (00000000)
Leap status     : Normal
Version         : 0
Mode            : Invalid
Stratum         : 0
Poll interval   : 0 (1 seconds)
Precision       : 0 (1.000000000 seconds)
Root delay      : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID    : 00000000 ()
Reference time  : Thu Jan 01 00:00:00 1970
Offset          : +0.000000000 seconds
Peer delay      : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time   : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests       : 000 000 0000
Interleaved     : No
Authenticated   : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX        : 672
Total RX        : 0
Total valid RX  : 0

Remote address  : [UNSPEC] (00000000)
Remote port     : 0
Local address   : [UNSPEC] (00000000)
Leap status     : Normal
Version         : 0
Mode            : Invalid
Stratum         : 0
Poll interval   : 0 (1 seconds)
Precision       : 0 (1.000000000 seconds)
Root delay      : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID    : 00000000 ()
Reference time  : Thu Jan 01 00:00:00 1970
Offset          : +0.000000000 seconds
Peer delay      : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time   : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests       : 000 000 0000
Interleaved     : No
Authenticated   : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX        : 672
Total RX        : 0
Total valid RX  : 0

Remote address  : [UNSPEC] (00000000)
Remote port     : 0
Local address   : [UNSPEC] (00000000)
Leap status     : Normal
Version         : 0
Mode            : Invalid
Stratum         : 0
Poll interval   : 0 (1 seconds)
Precision       : 0 (1.000000000 seconds)
Root delay      : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID    : 00000000 ()
Reference time  : Thu Jan 01 00:00:00 1970
Offset          : +0.000000000 seconds
Peer delay      : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time   : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests       : 000 000 0000
Interleaved     : No
Authenticated   : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX        : 672
Total RX        : 0
Total valid RX  : 0

Remote address  : [UNSPEC] (00000000)
Remote port     : 0
Local address   : [UNSPEC] (00000000)
Leap status     : Normal
Version         : 0
Mode            : Invalid
Stratum         : 0
Poll interval   : 0 (1 seconds)
Precision       : 0 (1.000000000 seconds)
Root delay      : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID    : 00000000 ()
Reference time  : Thu Jan 01 00:00:00 1970
Offset          : +0.000000000 seconds
Peer delay      : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time   : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests       : 000 000 0000
Interleaved     : No
Authenticated   : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX        : 672
Total RX        : 0
Total valid RX  : 0

Remote address  : [UNSPEC] (00000000)
Remote port     : 0
Local address   : [UNSPEC] (00000000)
Leap status     : Normal
Version         : 0
Mode            : Invalid
Stratum         : 0
Poll interval   : 0 (1 seconds)
Precision       : 0 (1.000000000 seconds)
Root delay      : 0.000000 seconds
Root dispersion : 0.000000 seconds
Reference ID    : 00000000 ()
Reference time  : Thu Jan 01 00:00:00 1970
Offset          : +0.000000000 seconds
Peer delay      : 0.000000000 seconds
Peer dispersion : 0.000000000 seconds
Response time   : 0.000000000 seconds
Jitter asymmetry: +0.00
NTP tests       : 000 000 0000
Interleaved     : No
Authenticated   : No
TX timestamping : Invalid
RX timestamping : Invalid
Total TX        : 672
Total RX        : 0
Total valid RX  : 0

Remote address  : 172.20.2.1 (AC140201)
Remote port     : 123
Local address   : 172.20.16.31 (AC14101F)
Leap status     : Normal
Version         : 4
Mode            : Server
Stratum         : 3
Poll interval   : 10 (1024 seconds)
Precision       : -23 (0.000000119 seconds)
Root delay      : 0.000534 seconds
Root dispersion : 0.036041 seconds
Reference ID    : AC141002 ()
Reference time  : Thu Oct 08 08:20:28 2020
Offset          : -16.152969360 seconds
Peer delay      : 0.000214426 seconds
Peer dispersion : 0.000000195 seconds
Response time   : 0.000017658 seconds
Jitter asymmetry: +0.23
NTP tests       : 111 111 1111
Interleaved     : No
Authenticated   : No
TX timestamping : Daemon
RX timestamping : Daemon
Total TX        : 1969
Total RX        : 1969
Total valid RX  : 1969
chronyc> clients
Hostname                      NTP   Drop Int IntL Last     Cmd   Drop Int  Last
===============================================================================
chronyc> serverstats
NTP packets received       : 0
NTP packets dropped        : 0
Command packets received   : 81
Command packets dropped    : 0
Client log records dropped : 0
chronyc> rtcdata
513 RTC driver not running
chronyc> quit
# journalctl -b SYSLOG_IDENTIFIER=chronyd
-- Logs begin at Wed 2020-09-30 13:32:17 CEST, end at Thu 2020-10-08 11:27:08 CEST. --
Sep 30 13:33:04 h31 chronyd[3522]: chronyd version 3.2 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER +>
Sep 30 13:33:04 h31 chronyd[3522]: Enabled HW timestamping (TX only) on em3
Sep 30 13:33:04 h31 chronyd[3522]: Enabled HW timestamping (TX only) on em4
Sep 30 13:33:04 h31 chronyd[3522]: Frequency -86.905 +/- 0.107 ppm read from /var/lib/chrony/drift

1 Answer

I solved the problem: it really was a bad mask in an ntpd restrict directive, which caused all but one server to ignore NTP time queries. In addition, I had set minsources 3 in /etc/chrony.conf.
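How a mask decides which clients a restrict entry covers can be sketched as follows. This is an illustration only: the two entries below are hypothetical (the actual directives from the servers are not reproduced here), the function name `restrict_matches` is made up, and the sketch assumes the usual rule that an entry applies when the client address ANDed with the mask equals the entry address ANDed with the mask.

```python
from ipaddress import IPv4Address


def restrict_matches(client: str, address: str, mask: str) -> bool:
    """Return True if `client` falls under a 'restrict address mask mask' entry.

    Assumed rule: (client AND mask) == (address AND mask).
    """
    m = int(IPv4Address(mask))
    return (int(IPv4Address(client)) & m) == (int(IPv4Address(address)) & m)


client = "172.20.16.31"  # the chrony host's address, as seen in the ntpdata output

# Hypothetical entry with the intended /24 mask: the client is covered.
print(restrict_matches(client, "172.20.16.0", "255.255.255.0"))    # True

# Hypothetical entry with a too-narrow mask: only .0 to .15 are covered,
# so the client falls through to whatever the default restriction says.
print(restrict_matches(client, "172.20.16.0", "255.255.255.240"))  # False
```

Running such a check against the real address and mask pairs from ntp.conf is a quick way to see which entry a given client actually hits.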

What makes this problem interesting is how chronyd handles that (see "More Debug Info" in question):

  • Reach in the output of sources is 0, which may indicate any number of different problems.

  • ntpdata outputs a lot of data when there actually is none. An important clue that I had missed is Total RX being zero, as well as Total valid RX. Even so, this could have many causes.

  • serverstats reporting zero NTP packets received seems odd, as 172.20.2.1 obviously did send responses.

  • activity saying 8 sources online and 0 sources offline is very confusing: shouldn't a source that is not responding be considered "offline" rather than "online"?
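The Total RX clue is easy to overlook in that much output, so it may help to filter for it. The following is a rough sketch, assuming the ntpdata output has been saved as text in the layout shown in the question (blocks separated by blank lines, "name : value" per line); the function name `silent_sources` is made up and the sample is abbreviated from the output above.

```python
import re


def silent_sources(ntpdata_text: str) -> list:
    """List (remote address, Total TX) for sources that never answered."""
    silent = []
    # Each source is one block; blocks are separated by blank lines.
    for block in re.split(r"\n\s*\n", ntpdata_text.strip()):
        fields = {}
        for line in block.splitlines():
            # Split at the first colon only; values may contain colons too.
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        tx = int(fields.get("Total TX", 0))
        rx = int(fields.get("Total RX", 0))
        if tx > 0 and rx == 0:
            silent.append((fields.get("Remote address", "?"), tx))
    return silent


SAMPLE = """\
Remote address  : [UNSPEC] (00000000)
Total TX        : 672
Total RX        : 0

Remote address  : 172.20.2.1 (AC140201)
Total TX        : 1969
Total RX        : 1969
"""

print(silent_sources(SAMPLE))   # [('[UNSPEC] (00000000)', 672)]
```

Any source listed here has been sent requests without a single reply, which points at the network path or the server's access rules rather than at clock quality.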

In contrast, here is the output after the problem was solved (with three sources responding):

Oct 08 11:29:32 h31 systemd[1]: Starting NTP client/server...
Oct 08 11:29:32 h31 chronyd[18823]: chronyd version 3.2 starting (+CMDMON +NTP +REFCLOCK +RTC +PRIVDROP -SCFILTER >
Oct 08 11:29:32 h31 chronyd[18823]: Enabled HW timestamping (TX only) on em3
Oct 08 11:29:32 h31 chronyd[18823]: Enabled HW timestamping (TX only) on em4
Oct 08 11:29:32 h31 chronyd[18823]: Frequency -86.905 +/- 0.107 ppm read from /var/lib/chrony/drift
Oct 08 11:29:32 h31 systemd[1]: Started NTP client/server.
Oct 09 08:09:43 h31 chronyd[18823]: Selected source 172.20.2.1
Oct 09 08:09:43 h31 chronyd[18823]: System clock wrong by -16.101294 seconds, adjustment started
Oct 09 08:09:27 h31 chronyd[18823]: System clock was stepped by -16.101294 seconds
Oct 09 08:11:36 h31 chronyd[18823]: Selected source 172.20.16.3
chronyc> tracking
Reference ID    : AC141003 (172.20.16.3)
Stratum         : 3
Ref time (UTC)  : Fri Oct 09 06:21:18 2020
System time     : 0.000007615 seconds fast of NTP time
Last offset     : +0.000007168 seconds
RMS offset      : 0.000022300 seconds
Frequency       : 87.841 ppm slow
Residual freq   : +0.002 ppm
Skew            : 0.090 ppm
Root delay      : 0.000269273 seconds
Root dispersion : 0.002195312 seconds
Update interval : 64.6 seconds
Leap status     : Normal
chronyc> sources
210 Number of sources = 9
MS Name/IP address         Stratum Poll Reach LastRx Last sample
===============================================================================
^? 172.20.16.13                  0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.1                   0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.5                   0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.12                  0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.14                  0  10     0     -     +0ns[   +0ns] +/-    0ns
^? 172.20.16.11                  0  10     0     -     +0ns[   +0ns] +/-    0ns
^- 172.20.2.1                    3   9   377   239    +15us[  +27us] +/-   27ms
^- 172.20.16.2                   2   8   377    65   +208us[ +215us] +/- 8147us
^* 172.20.16.3                   2   6   377    64    +27us[  +34us] +/- 4417us
chronyc> sourcestats
210 Number of sources = 9
Name/IP Address            NP  NR  Span  Frequency  Freq Skew  Offset  Std Dev
==============================================================================
172.20.16.13                0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.1                 0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.5                 0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.12                0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.14                0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.16.11                0   0     0     +0.000   2000.000     +0ns  4000ms
172.20.2.1                  7   5   51m     +0.254      0.070   +105us    23us
172.20.16.2                 6   3   21m     +0.219      0.218   +227us    27us
172.20.16.3                15   7   907     +0.002      0.074    +52ns    19us
chronyc> activity
200 OK
9 sources online
0 sources offline
0 sources doing burst (return to online)
0 sources doing burst (return to offline)
0 sources with unknown address
chronyc> ntpdata
...
Remote address  : 172.20.2.1 (AC140201)
Remote port     : 123
Local address   : 172.20.16.31 (AC14101F)
Leap status     : Normal
Version         : 4
Mode            : Server
Stratum         : 3
Poll interval   : 9 (512 seconds)
Precision       : -23 (0.000000119 seconds)
Root delay      : 0.000366 seconds
Root dispersion : 0.026947 seconds
Reference ID    : AC14100E ()
Reference time  : Fri Oct 09 06:11:14 2020
Offset          : -0.000026963 seconds
Peer delay      : 0.000219559 seconds
Peer dispersion : 0.000000190 seconds
Response time   : 0.000020624 seconds
Jitter asymmetry: +0.20
NTP tests       : 111 111 1111
Interleaved     : No
Authenticated   : No
TX timestamping : Daemon
RX timestamping : Daemon
Total TX        : 297
Total RX        : 296
Total valid RX  : 296

Remote address  : 172.20.16.2 (AC141002)
Remote port     : 123
Local address   : 172.20.16.31 (AC14101F)
Leap status     : Normal
Version         : 4
Mode            : Server
Stratum         : 2
Poll interval   : 8 (256 seconds)
Precision       : -23 (0.000000119 seconds)
Root delay      : 0.000305 seconds
Root dispersion : 0.007904 seconds
Reference ID    : AC140219 ()
Reference time  : Fri Oct 09 06:14:48 2020
Offset          : -0.000215189 seconds
Peer delay      : 0.000180311 seconds
Peer dispersion : 0.000000190 seconds
Response time   : 0.000057180 seconds
Jitter asymmetry: +0.50
NTP tests       : 111 111 1111
Interleaved     : No
Authenticated   : Yes
TX timestamping : Daemon
RX timestamping : Daemon
Total TX        : 466
Total RX        : 453
Total valid RX  : 453

Remote address  : 172.20.16.3 (AC141003)
Remote port     : 123
Local address   : 172.20.16.31 (AC14101F)
Leap status     : Normal
Version         : 4
Mode            : Server
Stratum         : 2
Poll interval   : 6 (64 seconds)
Precision       : -24 (0.000000060 seconds)
Root delay      : 0.000168 seconds
Root dispersion : 0.006165 seconds
Reference ID    : AC140219 ()
Reference time  : Fri Oct 09 06:18:14 2020
Offset          : -0.000028130 seconds
Peer delay      : 0.000198109 seconds
Peer dispersion : 0.000000131 seconds
Response time   : 0.000038736 seconds
Jitter asymmetry: +0.00
NTP tests       : 111 111 1111
Interleaved     : No
Authenticated   : No
TX timestamping : Daemon
RX timestamping : Daemon
Total TX        : 16
Total RX        : 16
Total valid RX  : 16
chronyc> serverstats
NTP packets received       : 0
NTP packets dropped        : 0
Command packets received   : 353
Command packets dropped    : 0
Client log records dropped : 0
chronyc> rtcdata
513 RTC driver not running

It seems as if there are a few bugs in chronyd or chronyc.

  • Can you please elaborate on how you solved this? Commented Aug 22, 2021 at 7:20
  • @ChristopherStanley What is "this"? Commented Aug 23, 2021 at 7:42
