Setting the clock ahead to see what breaks

Given that we're now within 15 years of the signed 32-bit time_t craziness, I decided to start playing around with my own stuff to see how things are doing. I wanted to see what would break and what would work.

One thing I particularly wanted to see was how my smaller systems would work. It's basically a given that my 64-bit Linux boxes are going to be fine, since time_t is already wider there and won't explode in 2038. But that's far from the whole story. 32-bit machines still exist, and are more common than some would think thanks to the existence of things like Raspberry Pis.

Unless you deliberately install the 64-bit flavor of Raspbian, you're going to get a 32-bit system. With the version of glibc it's currently running, you will hit the wall. It's easy enough to try - you'll notice that you can't actually set the clock that far ahead:

root@rpi4b:/tmp# date -s "2038-01-19 03:14:08 UTC"
date: invalid date ‘2038-01-19 03:14:08 UTC’

So, okay, put on your "time to do evil" hat, set it one second earlier, and wait for the fun to happen. Starting from scratch again, it does this:

root@rpi4b:/tmp# systemctl stop chrony
root@rpi4b:/tmp# date -s "2038-01-19 03:14:07 UTC"
Mon 18 Jan 2038 07:14:07 PM PST
root@rpi4b:/tmp#
Message from syslogd@rpi4b at Jan 18 19:14:07 ...
 systemd[1]: Failed to run main loop: Invalid argument

Broadcast message from systemd-journald@rpi4b (--- XXXX-XX-XX XX:XX:XX):

systemd[1]: Failed to run main loop: Invalid argument

Message from syslogd@rpi4b at Jan 18 19:14:07 ...
 systemd[1]: Freezing execution.

Broadcast message from systemd-journald@rpi4b (--- XXXX-XX-XX XX:XX:XX):

systemd[1]: Freezing execution.

Yee haw! Look at that sucker burn. I particularly dig the XX-XX stuff. It's like a cartoon character who's been knocked out.

Now, before you whip out the pitchforks, keep in mind that systemd is just the messenger here. It's just working with what it's been given.

Also, the system is actually still up here. systemd has just basically checked out and is not going to do much more for you. It's not even going to take an ordinary "reboot" since that's really just a request to init (pid 1, so systemd again) to reboot the box. You're going to need to use "reboot -f" and suffer whatever badness might happen to stuff on the box. It's like pulling the plug, so have fun with that.

What happened? If you dig around in the remains, you will find that an assertion in systemd fired. It's refusing to continue unless clock_gettime() returns 0. Clearly, it returned something else. systemd saw this not-zero value and decided to protect itself by effectively stopping.

So you think "I know, I'll try this again, and strace pid 1 this time, and see what was in fact returned". You get something like this right before it croaks:

clock_gettime64(CLOCK_REALTIME, {tv_sec=2147483648, tv_nsec=898182}) = 0

... what? It returned 0? Yes... and no. Look at it closely.

clock_gettime64 returned 0. But systemd called clock_gettime. strace is showing you the system call... but that system call happens by way of a C library function which in this case is being provided by glibc 2.31. If you were to open up glibc's source code and go digging around for clock_gettime(), you'd find this:

  ret = __clock_gettime64 (clock_id, &tp64);

  if (ret == 0)
    {
      if (! in_time_t_range (tp64.tv_sec))
        {
          __set_errno (EOVERFLOW);
          return -1;
        }

First, call the (64-bit capable) syscall. Assuming that succeeds (and it does, per strace), see if the result will fit in a (32-bit) time_t. It won't, so set errno to EOVERFLOW and return -1.

That's what systemd gets, and so it blows up.

glibc is saying "I can't fit this into that, so I'm failing this call".

This is wrapped in a bunch of preprocessor #if tests such that it only runs when __TIMESIZE isn't set to 64, but guess what? On this particular combination of hardware and software, __TIMESIZE is in fact 32. Grovel around in the headers if you like and follow the bouncing ball starting here:

./arm-linux-gnueabihf/bits/timesize.h:#define __TIMESIZE	__WORDSIZE

... or just write something dumb to printf(..., __TIMESIZE) and see.

To be clear, this is glibc 2.31 on the 32-bit build of Raspbian/Raspberry Pi OS 11 (bullseye) on a Pi 4B. Newer versions of the OS will almost certainly not behave this way, since glibc itself is marching down the road to having 64-bit time even on 32-bit machines. Once that's done and rolled up into a release, expect this to go away.

...

And yes, NetBSD and OpenBSD tore off this band-aid about 10 years ago, and it's a done deal now. I know. Cheers to that.
