Comparing files in PHP

2007-10-23 18:21:38

One would say this ought to be easy and straightforward, but unfortunately this is not entirely the case. Searching for it doesn't help much, as you don't get many usable hits. I hope this post helps you.

The script below takes two directories and compares them recursively at the file-contents level. That means it checks whether each file exists in both dirs and, if so, whether the contents are equal. If either condition fails, the filename is added to $checkme.
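
The two conditions can also be sketched in plain PHP, without calling diff at all. This is only a rough illustration, not the script from this post: listFiles() and compareDirs() are made-up names, the dir names are assumed to have no trailing slash, and contents are compared through md5_file() rather than byte by byte.

```php
<?php
// Hypothetical helper (not part of the original script): lists all files
// below $dir as paths relative to $dir. Assumes $dir has no trailing slash.
function listFiles($dir) {
    $files = array();
    $it = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($it as $file) {
        if ($file->isFile()) {
            // strip the search dir and the separator that follows it
            $files[] = substr($file->getPathname(), strlen($dir) + 1);
        }
    }
    return $files;
}

// Hypothetical helper: returns the relative paths that are missing from one
// of the two dirs, or whose contents differ.
function compareDirs($dir1, $dir2) {
    $checkme = array();
    $all = array_unique(array_merge(listFiles($dir1), listFiles($dir2)));
    sort($all);
    foreach ($all as $rel) {
        $a = "$dir1/$rel";
        $b = "$dir2/$rel";
        // condition 1: the file must exist in both dirs
        // condition 2: the contents must be equal (compared via an MD5 hash)
        if (!is_file($a) || !is_file($b) || md5_file($a) !== md5_file($b)) {
            $checkme[] = $rel;
        }
    }
    return $checkme;
}
```

Calling compareDirs("development", "release") would return an array in the same spirit as $checkme. Hashing every file is slower than diff on large trees, so treat this as a fallback at best.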

The script uses Linux's diff command. It works like this on my server, but I can't guarantee that it works the same on yours, and I'm pretty sure it doesn't on Windows. You're gonna have to check for yourself and find out.
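
One way to check for yourself is to look at the exit status that exec() hands back through its third argument. The sketch below is an assumption-laden illustration rather than part of the script: runDiff() and explainDiffStatus() are made-up names, and the meaning of the status codes (0, 1, 2) is the one documented for GNU diff, so other diff implementations may behave differently.

```php
<?php
// Hypothetical helper (not part of the script in this post): runs diff on two
// dirs and returns its exit status along with the lines it printed.
function runDiff($dir1, $dir2) {
    $output = array();
    $status = -1;
    // 2>&1 also captures error messages such as "No such file or directory"
    $cmd = "diff -rq " . escapeshellarg($dir1) . " " . escapeshellarg($dir2) . " 2>&1";
    exec($cmd, $output, $status);
    return array($status, $output);
}

// Hypothetical helper: translates the exit status into words.
// GNU diff documents 0 = no differences, 1 = differences, 2 = trouble;
// anything else most likely means the shell could not run diff at all.
function explainDiffStatus($status) {
    if ($status === 0) return "no differences";
    if ($status === 1) return "differences found";
    if ($status === 2) return "diff reported trouble";
    return "diff could not be run";
}

list($status, $output) = runDiff("development", "release");
echo explainDiffStatus($status), "\n";
```

If this prints anything other than the first two messages, the output lines in $output are the place to look before trusting the comparison.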

(Yes, of course there are bigger classes and stuff, but sometimes you don't want to depend on them.)

Code:


<?php
// Directory compare script
// (c) qFox, 2007
// http://qfox.nl

// NOTE: This script assumes that _no spaces or colons_ are present in any of the
// directory names or filenames it parses! This is crucial for the parsing part. But you
// can easily change this example to suit your needs if you do require spaces or colons.
// Yes, it may not be optimal, but I'm a lazy coder and I only use the script
// occasionally myself. I didn't feel like doing it the perfect, ideal way. Do it yourself.

// let your operating system work out the differences between the two dirs
// array to contain the differences reported by diff
$differences = array();
// array to collect the filenames that differ
$checkme = array();
// the two dirs to be compared, relative to the current dir (absolute paths work too)
$dir1 = "development";
$dir2 = "release";
// now let Linux execute diff and put the output into the array
// exec() returns the last line of the output as a string, if any
// if diff found nothing different it prints nothing, so exec() returns an
// empty string, which evaluates to false, hence the if
// escapeshellarg() makes sure the shell sees each dir name as a single argument
if (exec("diff -rq " . escapeshellarg($dir1) . " " . escapeshellarg($dir2), $differences)) {
    // debug results:
    // echo "["; print_r($differences); echo "]\n";

    // now process each line
    foreach ($differences as $d) {
        // On the OS of my webserver, there are only two possible results:
        // 1 "Only in development/anotherdir: somefile.php"
        // and
        // 2 "Files development/anotherdir/somefile.php and release/anotherdir/somefile.php differ"
        // we now need to determine, for each line, what type of line it is

        // nasty but easy: we explode the string on ": "; if that yields two parts,
        // the line contained a colon and was of type 1, in which case we want to
        // combine the dir and filename
        // if the result is an array of size 1, there was no colon, and it was of type 2
        if (sizeof($c = explode(": ", $d)) == 2) {
            // reports a file not present in one dir; $c[1] now contains the filename
            // the ": " is stripped by exploding; remove the user-friendly string from $c[0]
            // and implode $c with a slash to re-create the dir-file string
            $c[0] = str_ireplace("Only in ", "", $c[0]);
            $f = implode("/", $c);
            // echo "Missing file: $f\n"; // debug
        }
        else {
            // reports a change in a file
            // this line contains both paths, but they ought to be equal once
            // the search dirs $dir1 and $dir2 are removed, and that's the only part
            // we want anyway, so:
            $c = explode(" and ", $d);
            $f = str_ireplace(" differ", "", $c[1]);
            // echo "Difference in files: ".$f."\n"; // debug
        }
        // strip the two search dirs ($dir1 and $dir2)
        // we prefix a space to the filename to ensure we only remove a search dir
        // from the start, just in case the search path occurs twice.
        // (This explicitly requires no spaces in your filenames. If you
        // do require spaces, you can obviously replace the space with
        // another character...)
        // ltrim() removes the prefixed space again if neither dir matched
        $f = ltrim(str_ireplace(array(" $dir1/", " $dir2/"), "", " " . $f));
        // echo "Filename: $f\n"; // debug
        $checkme[] = $f;
    } // end foreach
}
// at this point, $checkme should contain all the files that either differ or
// do not exist in one of the two search directories
?>
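
If you do need spaces in your filenames, one possible variation is to match each line against a regular expression built from the two search dirs instead of exploding it. The function below is a hedged sketch under the same assumption as the script, namely that diff only prints the two line types shown in its comments. parseDiffLine() is a made-up name, and the dir names are assumed to be passed without a trailing slash.

```php
<?php
// Hypothetical helper (not part of the original script): returns the path
// relative to the search dirs for one line of "diff -rq" output, or null
// if the line is of neither known type.
function parseDiffLine($line, $dir1, $dir2) {
    // escape the dir names so they are matched literally inside the patterns
    $d1 = preg_quote($dir1, '~');
    $d2 = preg_quote($dir2, '~');

    // type 1: "Only in <searchdir>[/<subdir>]: <filename>"
    $only = '~^Only in (?:' . $d1 . '|' . $d2 . ')(?:/(.*?))?: (.*)$~';
    if (preg_match($only, $line, $m)) {
        // $m[1] is empty when the file sits directly in the search dir
        return $m[1] === "" ? $m[2] : $m[1] . "/" . $m[2];
    }

    // type 2: "Files <dir1>/<path> and <dir2>/<path> differ"
    // the backreference \1 requires the same relative path on both sides
    $files = '~^Files ' . $d1 . '/(.+) and ' . $d2 . '/\1 differ$~';
    if (preg_match($files, $line, $m)) {
        return $m[1];
    }

    return null;
}
```

Inside the foreach you could then call parseDiffLine($d, $dir1, $dir2) and add the result to $checkme whenever it is not null. A filename containing ": " would still confuse the first pattern, so this loosens the restriction rather than removing it.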