Squid developers strive (some may say “with varying levels of success”) to keep the codebase clean and safe. One of the available tools is to rely on the compiler and instruct it to treat all warnings as errors, so that every potentially meaningful signal about a problem is caught and acted on. This is done with the -Wall -Werror compiler flags.
However, we do not pass these flags to ./configure: we set them in SQUID_CXXFLAGS, a custom variable that is later referenced by Makefile.am and used during the build phase. Why is that?
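The mechanics are roughly as follows; this is a simplified sketch, not the exact contents of the Squid build files:

# configure.ac (sketch): collect the warning flags in a custom variable
# instead of CXXFLAGS, and export it to the generated Makefiles
SQUID_CXXFLAGS="-Wall -Werror"
AC_SUBST(SQUID_CXXFLAGS)

# src/Makefile.am (sketch): apply them only to our own build targets
AM_CXXFLAGS = $(SQUID_CXXFLAGS)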
The answer is that doing so would break key autoconf macros such as AC_CHECK_LIB, AC_SEARCH_LIBS, or AC_CHECK_FUNCS. These macros try to compile and link a test executable that calls the function being probed (say, log from -lm) with a dummy signature: returning char and taking no arguments. With -Werror turned on, these checks fail like this (tested with gcc 11.4):
configure:52533: gcc -o conftest -Wall -g -O2 -Werror -g conftest.c -lm >&5
conftest.c:366:6: error: conflicting types for built-in function 'log'; expected 'double(double)' [-Werror=builtin-declaration-mismatch]
  366 | char log ();
      |      ^~~
conftest.c:1:1: note: 'log' is declared in header '<math.h>'
    1 | /* confdefs.h */
cc1: all warnings being treated as errors
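For reference, the probe behind that error looks roughly like this; a hand-reduced sketch of the conftest.c that autoconf generates for a check such as AC_CHECK_LIB([m], [log]):

/* The prototype is deliberately wrong: autoconf only cares whether the
   symbol links, not whether the signature matches the real one. With
   -Werror, gcc turns the mismatch with its built-in into a hard error. */
char log ();

int
main (void)
{
  return log ();
}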
CFLAGS="-Werror -Wall -Wno-builtin-declaration-mismatch" ./configure will force ignoring this specific error and ensure that everything builds.
This is GCC specific; Clang doesn’t support that flag and does not exhibit the same behaviour in the first place.
I could not find any mention of this in a quick internet search; hopefully this writeup will save someone else a bit of time and experimentation.
Once its entire dependency tree is expanded, a single C++ include can become huge, easily spanning tens if not hundreds of header files; these need to be parsed again for every compilation unit (every C++ source file), resulting in a large amount of duplicate work.
So compiler writers came up with the clever idea of optionally saving an intermediate state of key headers, to reduce the amount of duplicate work. GCC, Clang and MSVC all support some variant of the precompiled header idea.
How do they work in practice?
Each compiler has its own quirks.
On GCC
A precompiled header has the same name as the header it accelerates, with an additional .gch suffix, and is placed in the same directory as the header it refers to. It is generated by calling the compiler with the exact same command line arguments as used to build the code, plus the additional switch -x c++-header. If a precompiled header is present, it is used automatically.
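In shell terms, something like this (paths are illustrative, and $CXXFLAGS stands for the exact flags the real build uses, include paths and all):

# produce include/squid.h.gch next to the header
g++ $CXXFLAGS -x c++-header include/squid.h -o include/squid.h.gch
# a later compile that includes squid.h with the same flags picks up the .gch automatically
g++ $CXXFLAGS -c src/foo.cc -o src/foo.o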
On Clang
A precompiled header has the same name as the header it accelerates, with an additional .pch suffix. It is generated by calling the compiler with the same arguments as used to build the code, plus the additional switches -x c++-header -emit-pch (the latter may be implicit when the former is supplied). To use it, it is not enough for it to be present: the compiler switch -include-pch <pch-file-path> must be passed.
On top of this, clang’s internal documentation highlights that there can only be one precompiled header and that it must be included at the beginning of the translation unit.
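The clang equivalent would look something like this (again, illustrative paths):

# produce the precompiled header; -emit-pch is implied by -x c++-header here
clang++ $CXXFLAGS -x c++-header include/squid.h -o include/squid.h.pch
# unlike gcc, clang must be told explicitly to consume it
clang++ $CXXFLAGS -include-pch include/squid.h.pch -c src/foo.cc -o src/foo.o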
On MSVC
This is not yet a specific target for squid.
Could it work for squid?
Yes, in theory. Our coding guidelines mandate that each C++ file start by including “squid.h”, which in turn includes our whole portability abstraction layer, which in turn pulls in several system headers. On my Ubuntu Linux system, a total of 206 header files have to be read and parsed just for this purpose, for each of the over 800 files that make up Squid. Sounds promising!
Does it work for squid?
In short, unfortunately not. I have experimented with a feature branch, and the results are not what I was hoping for, along several dimensions.
The good: performance gains
I ran some checks on a NUC6i7KYB (Intel(R) Core(TM) i7-6770HQ CPU @ 2.60GHz, 16 GiB RAM, SSD). The test command was
git clean -fdx && ./bootstrap.sh && ./configure && time (make -s -j12 all && make -s -j12 check)
Over 6 attempts, wall clock time averaged 10 minutes and 18 seconds without precompiled headers and 9 minutes and 48 seconds with them: roughly a 30 second (or about 5%) compile time improvement with gcc. Good, but not earth-shattering.
The bad: poor integration with the autotools toolchain (gcc edition)
Autotools’ stance on precompiled headers is pretty clear: there is no built-in support for them.
This is how I’ve done it. It’s hacky, and some parts of it may not apply to other projects’ setups.
In configure.ac, define a user argument --enable-precompiled-headers, and react to it with an automake conditional ENABLE_PCH_GCC.
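A sketch of what this can look like (the shell variable name and help text are illustrative, not the exact feature branch):

# configure.ac: add an opt-in switch and expose it to automake
AC_ARG_ENABLE([precompiled-headers],
  [AS_HELP_STRING([--enable-precompiled-headers],
    [speed up the build with a precompiled squid.h (gcc only)])],
  [squid_opt_pch=$enableval],
  [squid_opt_pch=no])
AM_CONDITIONAL([ENABLE_PCH_GCC], [test "x$squid_opt_pch" = "xyes"])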
In src/Makefile.am, define a custom Makefile rule that builds the precompiled header:
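A minimal sketch of what such a rule can look like (paths and details are illustrative, not necessarily the exact rule from the feature branch):

# build the precompiled header with the same flags automake uses for
# regular C++ objects, via its CXXCOMPILE variable
$(top_srcdir)/include/squid.h.gch: $(top_srcdir)/include/squid.h
	$(CXXCOMPILE) -x c++-header $(top_srcdir)/include/squid.h -o $@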
src/Makefile.am is touching files in include/. This is necessary because include/ doesn’t have a Makefile.am of its own, and the top level Makefile.am doesn’t have access to the CXXCOMPILE variable.
srcdir shouldn’t be mucked about at build time; that’s what builddir is for
Then, add a section
if ENABLE_PCH_GCC
PCH_FILE=$(top_srcdir)/include/squid.h.gch
endif
which is then referenced in
BUILT_SOURCES = \
dnl ... \
$(PCH_FILE)
This pulls the precompiled header into the list of dependencies of squid, of the unit tests, and of the files to clean up. We can’t really control the order in which it gets built, but that isn’t a big deal: if anything needs to be compiled before the precompiled header is ready, everything will still work, just without the speed bump.
The worse: poor integration with the autotools toolchain (clang edition)
Clang has one extra problem compared to gcc: to actually use the precompiled header, it needs the -include-pch <file> option. And if the option is used, the file needs to be there, or the build will fail.
This makes our inability to control the build order a showstopper: we would need to build the precompiled header, without that flag, before anything else. Looking at the generated src/Makefile, one way to make sure we add the -include-pch option only when appropriate would be to send it down the recursive make invocation, except we don’t control that.
That’s it, I give up
The benefit is just not worth the number of hacks and the added complexity.
What could make it work?
gcc gets this behaviour right; it would be great if clang took inspiration from it. At the very least, it should not fail the build if the included PCH is missing. That would allow treating it for what it is: an optimisation.
These days, my main contribution to the Squid Web Cache project is running the project’s infrastructure, a big part of which is the CI/CD farm.
To run it, we rely on a very kind donation by DigitalOcean. We use a VM hosted there to run the main Jenkins instance and part of the jobs for the x86-64 architecture, and we use the Jenkins DigitalOcean plugin to spin up instances (droplets) on demand when we need more throughput for our build jobs.
To make the most of our resources, we rely on Docker to run all of our target Linux userlands. This lets us decouple the runtime environment from the machine running it and ensures consistency across builds (also coming up: a proper staging system).
In this post I’ll focus on how we spin up these instances; the whole setup is a bit more convoluted than that.
The DigitalOcean plugin is quite well integrated and easy to use; to be honest I haven’t tried the plugins for EC2 or GCP, but my other reference point, jclouds, was much harder to configure and set up.
Given our prerequisites, the on-demand instances only need to contain the Docker runtime and Java; the latter is needed to run the Jenkins agents because, unlike other setups I’ve found online, ours run outside the Docker containers.
To do that, we supply this config snippet in the “User Data” section:
#cloud-config
apt_upgrade: true
package_upgrade: true
packages:
  - openjdk-11-jre-headless
  - docker.io
users:
  - name: <name of the jenkins user on the executor machine>
    groups: docker
    shell: /bin/bash
    ssh-authorized-keys:
      - ssh-rsa <ssh public key of the user jenkins runs under>
These actions run when the droplet is launched and prep the executor so that Jenkins can ssh into it and run the test jobs. To give the droplet time to do that, we wait for it with this init script:
#!/bin/bash
echo "starting init script"
while ! cloud-init status|grep -qF 'done'
do
echo "waiting for cloud-init to be done"
sleep 10
done
The next tricky bit is in the Droplet section: under node Labels we define a label used to trigger the instance startup when needed (it can be anything; in our case docker-build-host), along with an instance cap.
Referencing this label in a project’s configuration matrix triggers the spin-up and imaging. Jenkins then connects to the droplet via ssh and uses docker run commands to test the various runtime environments.
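For illustration, the kind of command a job ends up running on the droplet is something like this (the image name and script are placeholders, not the exact job definition):

docker run --rm \
  -v "$WORKSPACE:/srv/squid" -w /srv/squid \
  squid-buildfarm-ubuntu:latest \
  ./test-builds.sh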
The Squid Wiki is hosted on our own instance of MoinMoin. We picked it at the time because it had fewer external dependencies than other engines, and it fit the bill.
Over time, and as the number of pages grew, its strengths became limitations, and I’m currently exploring whether to switch to a different engine. MediaWiki is the go-to choice for most people, so that’s what I investigated first.
W3C has developed a tool to convert from one to the other, but it hasn’t been updated in some time, to the point where MediaWiki API changes have bit-rotted it. It doesn’t help that this tends to be a one-off activity, so the tool sees little maintenance once a migration is done.
Open Source to the rescue! I have patched it to support the current API, and it Worked For Me™. While waiting for the PR to be approved, feel free to use my fork.
If one were not following the mailing lists, it might seem the Squid project has gone the way of the Dodo. In truth, it is quite the opposite: the dev team and Foundation board are working so hard that it has been difficult to find time for these additional updates.
So, to recap the major projects going on since the last update:
The largest change has been our move to git for the source code repository. That has been a long road, taking up a lot of time over several years. A great big thanks to the various people working on that.
Following on from that, we now have GitHub (the squid-cache account) as our code repository for public access. The Squid Project’s own repository is no longer directly available for general access. Our code submission process has changed from accepting .patch files to GitHub pull requests, so developers working on code changes please convert to that (I can still work with patch submissions, but it is significantly more trouble than having your own GitHub account for submission updates/edits). Anyone who forked one of the Squid GitHub repositories prior to the 2017 transition should fork the new repository and convert their code changes.
The new code submission process has resolved quite a few issues we had with the old submission and auditing/QA process. There are still a few quirky behaviours caused by GitHub and our automation that cause trouble from time to time, but overall it is a big improvement on what we had before.
The largest issue we now face for QA and code development is manpower. We have automated change tracking, continuous integration helping out with QA, and a committer bot taking a huge load off my shoulders as maintainer. Our submission process is open and public, so anyone can read the proposals and comment on any bugs they see that have not already been pointed out. Anyone with an interest in the Squid code is encouraged to participate in that process.
In the shadow of all those very time-consuming alterations to the Squid Project systems, the dev team has also managed to iron out several of the major bugs blocking the Squid-4 release. Just one of the long-standing bugs remains. A few regressions in recent code have brought up some new major bugs, but those are for the most part already fixed or soon will be. So watch this space for news on further progress there.