Apache logs – can they be used to see if a file is ever called/used?

user26676 asked:

This is an odd question but my gut tells me there is an easy way to do this:

I have a project that is always in development and is in PHP and it is 14 years old. Despite every attempt to keep on top of developing it there are large numbers of files in there. The PHP bit is ok, I can do what I need via a database log in every header.

I am talking about the Apache stuff – the css, the gifs, the png, the old jquery references, the old js files that I may or may not ever recruit. There are around 3,000+ of these files.

Many are image references to old images no longer ever used. Some are jQuery libs that I long since stopped using. The thing is they all look like something I remember doing way back when, and there are a lot of legacy decay routines that sometimes need these old images/css/js/{insert here} to function.

Basically this isn’t a website, it’s a PHP engine that can throw up lots of things and it is hard to track, so I simply leave these old references in.

What I want is a way to traverse the Apache logs for installations that have been live for over a year and positively ascertain whether each individual image or css or whatever has NEVER been referenced or pulled up since the server was created.

Is there a way to verify, item by item, whether Apache ever used it? I have lots of servers that run this code, so it would be nice to run this against every server; ideally it would be a way of getting distinct file calls (and a count?) from Apache logs. URLs or UNC paths would be fine.

My answer:

If your filesystem isn’t mounted with an option that discards atimes (e.g. ext3/4 with noatime), you can just use a simple find to locate files that haven’t been accessed in some time.

For instance, to find files that haven’t been accessed in a year or more:

find /srv/www/ancientproject -atime +365 -print
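Before trusting that output, it may be worth confirming that the filesystem actually records access times. The following is only a sketch: it assumes a Linux host with findmnt from util-linux, and atime_disabled is a helper name made up for illustration. The default relatime option should be good enough here, since it still refreshes an atime that is more than a day old.

```shell
#!/bin/sh
# Return success if a comma-separated mount option string contains noatime.
# (atime_disabled is a hypothetical helper, not a standard command.)
atime_disabled() {
    case ",$1," in
        *,noatime,*) return 0 ;;
        *)           return 1 ;;
    esac
}

# Example path from the find command above; adjust for your server.
dir=/srv/www/ancientproject

# findmnt (util-linux) prints the options of the filesystem that holds $dir.
opts=$(findmnt -n -o OPTIONS -T "$dir" 2>/dev/null) || opts=

if [ -z "$opts" ]; then
    echo "could not read mount options for $dir"
elif atime_disabled "$opts"; then
    echo "noatime is set: access times are not recorded, so -atime is unreliable"
else
    echo "access times are recorded (options: $opts)"
fi
```

If this reports noatime, the find approach will not tell you anything useful and the access logs are the better source.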

This may not solve your problem, though, for many of the same reasons voretaq7 points out. The file might be requested 15 minutes after you delete it, for instance.
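If you would rather work from the logs themselves, as the question asks, something along these lines could produce distinct request paths with a count. Treat it as a rough sketch under assumptions: the logs are in Apache's Common or Combined Log Format (request line is the first double-quoted field, status code follows it), count_requests is a name invented here, and the log location shown in the usage comment varies by distribution.

```shell
#!/bin/sh
# count_requests: print "count path" for every distinct path that was served
# with a 2xx or 3xx status, most requested first.
# Assumes Common/Combined Log Format, e.g.
#   1.2.3.4 - - [date] "GET /img/logo.png?v=2 HTTP/1.1" 200 2326
count_requests() {
    awk -F'"' '{
        split($2, req, " ")            # req[1]=method, req[2]=URL, req[3]=protocol
        split($3, rest, " ")           # rest[1]=status code, rest[2]=bytes
        if (rest[1] !~ /^[23]/) next   # skip 404s and other errors
        url = req[2]
        sub(/\?.*/, "", url)           # drop any query string
        if (url != "") hits[url]++
    } END {
        for (u in hits) print hits[u], u
    }' "$@" | sort -k1,1nr -k2,2
}

# Hypothetical usage; the path is an assumption, adjust it for your server:
# count_requests /var/log/apache2/access.log /var/log/apache2/access.log.1
```

This only covers logs that still exist on disk, so it cannot prove a file was never requested before the oldest retained log. Rotated logs are often gzip-compressed and would need to be decompressed first (for example piped through zcat into the same awk program). Any file under the document root whose path is absent from the output is a candidate for removal, with the same caveat as above.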

View the full question and answer on Server Fault.

This work is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.