
Chapter 6: Maintenance

There are a few optional tasks that the administrator needs to perform from time to time or during the initial configuration.

6.1 Manual cache cleanup

If a package is no longer downloadable by APT clients, its files are no longer referenced in any volatile (index) file and can be removed. This rule also applies to most volatile files at the distribution level. For example, the Release file references some Packages and Sources files or a Diff-Index file, and those in turn reference most of the non-volatile files (binary packages, source packages, index diffs, ...).
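The reference chain described above can be pictured as a reachability walk starting at the Release files. The following is a minimal sketch in Python under stated assumptions, not apt-cacher-ng's actual implementation: the `references` mapping and all file names are hypothetical stand-ins for what parsing the real index files would yield.

```python
from collections import deque

def reachable_files(roots, references):
    """Return every file reachable from the given root index files.

    `references` maps an index file to the files it lists; files that
    are not index files simply have no entry in the mapping.
    """
    seen = set(roots)
    queue = deque(roots)
    while queue:
        current = queue.popleft()
        for target in references.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

# Hypothetical index relationships (Release -> Packages -> package file).
references = {
    "dists/stable/Release": ["dists/stable/main/binary-amd64/Packages"],
    "dists/stable/main/binary-amd64/Packages": ["pool/main/f/foo/foo_1.0_amd64.deb"],
}
# Hypothetical cache contents.
cached = {
    "dists/stable/Release",
    "dists/stable/main/binary-amd64/Packages",
    "pool/main/f/foo/foo_1.0_amd64.deb",
    "pool/main/f/foo/foo_0.9_amd64.deb",  # no longer listed in any index
}

referenced = reachable_files(["dists/stable/Release"], references)
unreferenced = sorted(cached - referenced)
print(unreferenced)
```

Anything in the cache that the walk does not reach is a candidate for removal; in this example only the old 0.9 package is left over.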

To run this cleanup action manually, visit the report page in a browser and trigger the Expiration operation there.

Several flags, described below, configure how this tracking works. Usually the file name alone is sufficient to consider a file in the cache valid (downloadable). This is fine in most cases but sometimes leads to false positives, e.g. when another repository in the cache refers to a file with the same name but the reference to the original location is gone. On the other hand, there can be cases where files were assigned to different repositories by mistake and the administrator would like to merge the repositories later on.

For most files the checksum values are also provided in the index files, so the file contents can be validated as well. This requires reading the whole cache archive to generate local checksums. It should also not be done while apt-cacher-ng is in use, because no file locking is used here.
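To illustrate why content validation is expensive, here is a minimal Python sketch of checking one cached file against the size and SHA256 value that an index file would list for it. It is an illustration only, not the code apt-cacher-ng runs; the function name `file_matches` and the throwaway sample file are assumptions made for this example.

```python
import hashlib
import os
import tempfile

def file_matches(path, expected_size, expected_sha256, chunk_size=1 << 20):
    """Check a file against the size and SHA256 taken from an index file.

    The whole file has to be read, which is why validating a large
    cache this way takes a long time.
    """
    digest = hashlib.sha256()
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
            size += len(chunk)
    return size == expected_size and digest.hexdigest() == expected_sha256

# Demo with a throwaway file standing in for a cached package.
payload = b"example package contents"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    sample_path = tmp.name

good = file_matches(sample_path, len(payload), hashlib.sha256(payload).hexdigest())
bad = file_matches(sample_path, len(payload), "0" * 64)
os.remove(sample_path)
print(good, bad)
```

Since every byte of every file is read, the cost grows with the total size of the cache rather than with the number of index entries.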

Usually it's necessary to bring the various index files (Release, Sources, Packages, Index) in sync with the repository first. This is needed because apt avoids downloading whole index files by fetching small patches for the original file, and this mode of operation is not supported yet by apt-cacher-ng (and might still be unreliable). When this synchronization fails, the index files might be incomplete, obsolete or damaged, and they might no longer contain references to some files in the cache. Aborting the cleanup process is advisable in this case.

There is also a precaution mechanism designed to prevent the destruction of cache contents when some volatile index files have been lost temporarily. The results of the cache examination are stored in a list together with the date when each file became orphaned. The files are only removed after a few days (configurable, see configuration file), unless they have been taken off this list in the meantime.
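The delayed removal can be sketched as follows. This is a simplified Python model of the idea, not the actual implementation or on-disk format; the function names, the `delay_days` parameter and the file names are hypothetical.

```python
from datetime import date, timedelta

def update_orphans(orphans, unreferenced_now, today):
    """Merge one scan result into the orphan list.

    `orphans` maps a file name to the date it was first seen
    unreferenced. Files that are referenced again drop off the list;
    newly unreferenced files are recorded with today's date.
    """
    return {name: orphans.get(name, today) for name in unreferenced_now}

def due_for_removal(orphans, today, delay_days):
    """Return the files that have been orphaned for at least `delay_days`."""
    threshold = timedelta(days=delay_days)
    return sorted(name for name, since in orphans.items()
                  if today - since >= threshold)

# Two files were found orphaned on 1 November.
orphans = {"a.deb": date(2007, 11, 1), "b.deb": date(2007, 11, 1)}

# A scan on 3 November: b.deb is referenced again, c.deb is newly orphaned.
orphans = update_orphans(orphans, {"a.deb", "c.deb"}, date(2007, 11, 3))

# With a four-day delay, only a.deb is old enough on 5 November.
to_remove = due_for_removal(orphans, date(2007, 11, 5), delay_days=4)
print(to_remove)
```

A file that reappears in the index files before the delay has passed is never deleted, which is what protects the cache when index files are lost only temporarily.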

Parameters of Expiration:

Stop cleanup on errors during index update step
The index files are updated first; if errors occur, the expiration is interrupted.
Validate by file name AND file directory
This option can be used to remove distribution stages. Example: to remove "oldstable", delete the "Release" files in the cache and run Expiration with this option twice. There are some issues with this mode of operation, see above for details.
Validate by file name AND file contents (through checksum)
Checks file contents where possible and also attempts to detect incorrect file size information in the cached metadata. Note: the check results are stored only once; later runs without this option can overwrite them. Use the action buttons (see below) to delete corrupted files after the scan.
Force the download of index files
Sometimes it may be necessary to redownload all index files, explicitly replacing the cached versions. This flag enables this behaviour.
Purge unreferenced files after scan
Bypasses the orphan list and deletes the files immediately. This option is dangerous and should not be used unless it is absolutely certain that no mistakes or problems can occur. Instead, it's possible to view the orphan list later and delete the files then (see below).
More verbosity
Shows more information, e.g. each scanned file when used with some of the other options. This might result in a very large HTML page that makes the browser displaying it very slow.

In addition to the default scan run, there are some "Direct Action" buttons in the Web frontend. It's possible to view the temporary list of files that have been identified as orphaned (unreferenced) and to delete all files on that list immediately. Use with care!

6.2 Automated cache cleanup

A script called expire-caller.pl is shipped with the package. This script effectively implements an HTTP client that operates like a human would when running the expiration manually (see above). It can also extract the operator password and unix socket file path from the local configuration file. On Debian installations it is called by the file /etc/cron.daily/apt-cacher-ng, so it should run automatically as a daily cron task. The results are usually not reported unless an error occurs, in which case some hints are written to the standard error output (i.e. sent in cron mails).

The operator script can take some options from the environment; see also the cron script for details:

ACNGIP=10.0.1.3
The network address for the remote connection may be guessed incorrectly by the operator script. This variable can specify an explicit target to connect to, e.g. the same IP as the one used by the clients (unless this network connection is somehow restricted in the local setup).
HOSTNAME=localOrPublicName
When an error occurs, the operator script most likely adds a URL to be opened for further investigation. The host name in this URL can be customized, e.g. set to a public domain name representing the server as accessible from the administrator's machine.
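As an illustration of how these two variables could be supplied by a wrapper, the following Python sketch builds an environment mapping with the overrides applied. The helper name `expiration_environment` is hypothetical, and the example deliberately stops short of starting the script, because its location depends on the installation.

```python
import os

def expiration_environment(base_env, acng_ip=None, hostname=None):
    """Return a copy of `base_env` with the optional overrides applied.

    The base mapping itself is left untouched.
    """
    env = dict(base_env)
    if acng_ip is not None:
        env["ACNGIP"] = acng_ip
    if hostname is not None:
        env["HOSTNAME"] = hostname
    return env

# Values taken from the examples above.
env = expiration_environment(os.environ,
                             acng_ip="10.0.1.3",
                             hostname="localOrPublicName")

# The mapping could then be passed as env= to subprocess.run() when
# launching the operator script from its installed location.
print(env["ACNGIP"], env["HOSTNAME"])
```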

6.3 Distribution release removal

Sometimes it's necessary to remove all files of a distribution, e.g. when a new release has become Stable and older package files are still lying around. Under perfect conditions the reference tracking described above should take care of this and remove them soon.

However, this solution will fail if the release files are still available on the server AND apt-cacher-ng has learned their real location (i.e. the code name instead of the release state name), so that they are refreshed during regular expiration.

If the old release is no longer used by local cache users, the extra disk usage becomes a problem. This problem only goes away after many months, when the old release files are finally deleted on the servers: the package expiration will then complain for some days (the expiration delay), and only after that are the finally unreferenced files removed.

To speed up this process, the local administrator can remove the traces of the old distribution release from the archive: either the top-level "Release" files, or even the whole index file trees relevant for certain releases.
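Before deleting anything by hand, it can help to list the candidate files first. The Python sketch below only lists top-level "Release" files for one release name and removes nothing. It assumes a layout in which each cached repository contains a `dists/<release>/` directory; the function name, the directory names and the temporary demo tree are all hypothetical, and a real cache may be organized differently.

```python
import os
import tempfile

def find_release_files(cache_root, release_name):
    """List top-level Release files found under .../dists/<release_name>/."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(cache_root):
        parts = dirpath.split(os.sep)
        in_release_dir = (len(parts) >= 2
                          and parts[-2] == "dists"
                          and parts[-1] == release_name)
        if in_release_dir and "Release" in filenames:
            matches.append(os.path.join(dirpath, "Release"))
    return sorted(matches)

# Demo on a throwaway directory tree standing in for the cache.
with tempfile.TemporaryDirectory() as cache_root:
    for rel in ("debrep/dists/oldstable/Release",
                "debrep/dists/oldstable/main/binary-amd64/Packages",
                "debrep/dists/stable/Release"):
        path = os.path.join(cache_root, *rel.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        open(path, "w").close()
    found = [os.path.relpath(p, cache_root)
             for p in find_release_files(cache_root, "oldstable")]

print(found)
```

Reviewing such a list before removing the files keeps the manual step deliberate, which matters because the removal itself cannot be undone.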

To make this task easier, a "brutal" script called distkill.pl is shipped with apt-cacher-ng. It runs interactively: it scans the package directory and presents an overview of assumed index file trees, offering the option to remove some of them immediately. The script should be used with extreme care! See section 7.2 for an example of its output.


Comments to blade@debian.org
[Eduard Bloch, Tue, 20 Nov 2007 00:03:24 +0100]