Thursday, January 24, 2019

How the quest to prevent time from running out has led me to all corners of the Linux kernel

The y2038 problem in Linux is about how time is represented. Linux counts time in seconds since January 1, 1970, and a signed 32-bit count of seconds cannot represent times beyond January 19, 2038. The solution is to use 64-bit timestamps.
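To make the limit concrete, here is a minimal user-space sketch in C. It is not kernel code, and the helper names (fits_in_time32, wrap_to_time32, format_utc) are made up for this illustration. It assumes the machine running it has a time_t wide enough to hold the values passed in.

```c
#include <stdint.h>
#include <time.h>

/* Returns 1 if a second count fits in a signed 32-bit counter, else 0. */
static int fits_in_time32(int64_t secs)
{
    return secs >= INT32_MIN && secs <= INT32_MAX;
}

/*
 * Models what a signed 32-bit counter would hold after overflow,
 * using only well-defined arithmetic (no out-of-range signed casts).
 */
static int32_t wrap_to_time32(int64_t secs)
{
    uint32_t low = (uint32_t)secs;          /* keep the low 32 bits */

    if (low <= (uint32_t)INT32_MAX)
        return (int32_t)low;
    return (int32_t)((int64_t)low - 4294967296LL);  /* subtract 2^32 */
}

/*
 * Formats a second count as a UTC date string.
 * Returns 0 on success, -1 on failure.
 * Assumption: the host's time_t can represent secs.
 */
static int format_utc(int64_t secs, char *buf, size_t len)
{
    time_t t = (time_t)secs;
    struct tm *tm = gmtime(&t);

    if (tm == NULL)
        return -1;
    return strftime(buf, len, "%Y-%m-%d %H:%M:%S", tm) ? 0 : -1;
}
```

Calling format_utc(INT32_MAX, ...) gives the last second a signed 32-bit counter can represent, and wrap_to_time32(INT32_MAX + 1LL) shows the counter jumping to INT32_MIN, which is a date in December 1901.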

I came into the problem as Arnd Bergmann’s intern for Outreachy. Outreachy is a benevolent program that helps new programmers get into open source development. The mentors for the kernel projects are usually experienced kernel developers.

I chose the y2038 problem because it would let me touch all the subsystems in the kernel, and it actually did more than that. It also involves user space, the C library, and the POSIX and C standards. I found that the problem is really about interfaces between layers.

This article is about how solving one problem in the kernel rarely involves solving just that one problem. It covers the complexity of interrelated things in the kernel (there is always one more cleanup needed before the change) and the interactions with the community as a newcomer.

One of the problems we tackled was in the virtual file system (VFS).

VFS is a filesystem abstraction layer. So even if some filesystems, like ext4, can represent timestamps beyond the year 2038 on a 32-bit system, they cannot do so unless the VFS layer supports it as well.

This change to VFS was one of the patch series that took the longest to reach consensus and get merged.

Problem Description: The in-kernel representation of inode timestamps was struct timespec, which is not y2038 safe. The fix is to change the representation to struct timespec64, which is y2038 safe.
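The difference between the two representations can be sketched with a small user-space model in C. The struct and function names below (model_timespec32, model_timespec64, model_widen, model_narrow) are hypothetical stand-ins, and the kernel's own definitions differ in detail. The sketch assumes a 32-bit architecture, where the seconds field of the old type is 32 bits wide.

```c
#include <stdint.h>

/* Model of the old representation on a 32-bit architecture (assumption). */
struct model_timespec32 {
    int32_t tv_sec;   /* seconds: overflows in January 2038 */
    int32_t tv_nsec;  /* nanoseconds within the second */
};

/* Model of the y2038-safe representation: seconds are always 64 bits. */
struct model_timespec64 {
    int64_t tv_sec;
    int32_t tv_nsec;
};

/* Widening is always safe: every 32-bit second count fits in 64 bits. */
static struct model_timespec64 model_widen(struct model_timespec32 ts)
{
    struct model_timespec64 out = { ts.tv_sec, ts.tv_nsec };
    return out;
}

/*
 * Narrowing can lose information, so it has to be checked.
 * Returns 0 and fills *out when the value fits, -1 otherwise.
 */
static int model_narrow(struct model_timespec64 ts,
                        struct model_timespec32 *out)
{
    if (ts.tv_sec < INT32_MIN || ts.tv_sec > INT32_MAX)
        return -1;
    out->tv_sec = (int32_t)ts.tv_sec;
    out->tv_nsec = ts.tv_nsec;
    return 0;
}
```

The asymmetry is the point: going from the old type to the new one is trivial, while every place that still goes the other way is a potential y2038 bug.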

The problem is an example of how the y2038 problem touched many subsystems in the kernel.

The first version of the series was posted by Arnd in 2014. There were a few open issues and there was some feedback about adding timestamp range checking at that time.

I posted the first request for comments (RFC) for this in January 2016. The RFC was really about asking whether there was any opposition to the approach. This was not a typical RFC for the kernel community. The cover letter said we could change it this way and provided a few example changes showing how this would be done. There was some confusion as to what we were trying to get across in the series.

So I posted again, this time three series that solved the problem in three separate ways. Each was a pared-down version of the earlier series, addressing only the core issue. This was also atypical. Thomas said he slightly preferred one of the approaches.

Now we had all the patches done this way. But we had to get rid of some old time interfaces before we could actually make the change. When I posted a series for this, Linus did not like one of the interfaces (current_fs_time(sb)), since it took the superblock as an argument in order to access the timestamp granularity. But timestamps are really a feature of the inode, not the superblock. So we ended up getting rid of this API.
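The design point can be sketched with a small user-space model in C. Everything here is a hypothetical stand-in (model_ts, model_super_block, model_inode, model_trunc, model_current_time), not the kernel's actual code. The sketch assumes the granularity is stored in nanoseconds on the superblock and that the caller passes in the current time. It shows a helper that takes the inode and reaches the superblock through it, instead of asking the caller for the superblock.

```c
#include <stdint.h>

#define MODEL_NSEC_PER_SEC 1000000000u

/* Hypothetical 64-bit timestamp model. */
struct model_ts {
    int64_t tv_sec;
    int32_t tv_nsec;
};

/* Hypothetical superblock: holds the filesystem's timestamp granularity. */
struct model_super_block {
    uint32_t s_time_gran;  /* in nanoseconds; 1 means full resolution */
};

/* Hypothetical inode: knows which superblock it belongs to. */
struct model_inode {
    struct model_super_block *i_sb;
};

/* Rounds the nanosecond part down to a multiple of the granularity. */
static struct model_ts model_trunc(struct model_ts t, uint32_t gran)
{
    if (gran == MODEL_NSEC_PER_SEC)
        t.tv_nsec = 0;
    else if (gran > 1 && gran < MODEL_NSEC_PER_SEC)
        t.tv_nsec -= t.tv_nsec % (int32_t)gran;
    return t;
}

/*
 * Inode-based interface: the caller hands over the inode whose
 * timestamp is being set, and the helper looks up the granularity.
 */
static struct model_ts model_current_time(const struct model_inode *inode,
                                          struct model_ts now)
{
    return model_trunc(now, inode->i_sb->s_time_gran);
}
```

With this shape, callers never handle the superblock directly when all they want is a timestamp for an inode.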

Now the original series had to be redone yet again. Doing a flag day patch seemed like a brute-force approach to the problem, but we ended up doing just that. We even went a step further and did it with a Coccinelle script. This changed over 80 files. The challenge was to keep the changes rudimentary to avoid regressions. The patches were finally merged in June 2018. We have not yet heard of any regressions introduced by the change.

So by the end of this whole exercise, we had gotten rid of three in-kernel APIs, rearranged some of the filesystem timestamp handling, updated print formats to support larger timestamps, analyzed 32-bit architecture object dumps, and rewritten at least five versions of the series from scratch. And this was just one of the problems we solved for the kernel.

Y2038 has been one of my favorite projects.