Indeed. Since most websites use more or less the same handful of backends (phpBB 2 or 3, WordPress, MediaWiki, to name just a few), my script could be clever enough to recognize such parts and only download the lean "meat".
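Just to illustrate the idea, recognition could boil down to probing a site for a few characteristic paths. This is only a rough Python sketch; the marker URLs and the example.org address are assumptions, not anything my script actually does yet:

    # Hypothetical sketch: guess a site's backend from characteristic URLs.
    # The marker paths below are common defaults, not an exhaustive list.
    import urllib.request

    MARKERS = {
        "wordpress": ["/wp-login.php", "/wp-content/"],
        "mediawiki": ["/index.php?title=Special:Version"],
        "phpbb":     ["/viewtopic.php", "/ucp.php"],
    }

    def probe(url):
        """Return True if the URL answers with an HTTP status below 400."""
        try:
            req = urllib.request.Request(url, method="HEAD")
            with urllib.request.urlopen(req, timeout=10) as resp:
                return resp.status < 400
        except Exception:
            return False

    def guess_backend(base_url):
        for backend, paths in MARKERS.items():
            if any(probe(base_url.rstrip("/") + p) for p in paths):
                return backend
        return None

    print(guess_backend("http://example.org"))  # e.g. "wordpress" or None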
Oh, and I switched from tar.gz to Info-ZIP and then to RAR. A website archive compressed with tar.gz has to be decompressed from the start whenever I try to access any file inside it, which sucks if the archive has hundreds of megs.
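The reason zip (and rar) avoid that problem is that each member is compressed individually, so a single file can be pulled out without streaming through the rest of the archive. A minimal Python sketch of the access pattern I mean (the archive and member names are just placeholders):

    # Minimal sketch: read one member out of a large zip archive
    # without extracting anything else. File names are placeholders.
    import zipfile

    with zipfile.ZipFile("site-archive.zip") as archive:
        with archive.open("forum/viewtopic.php.html") as member:
            data = member.read()
    print(len(data), "bytes read without touching the rest of the archive")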
Info-ZIP seems to have a problem with zip files bigger than 2 GB on Debian Lenny, which is a joke considering that it is 2009 and operating systems have been internally 64-bit safe for a decade.
Then I tried afio, cpio and finally RAR, which seems to work perfectly on both POSIX and Win32.
If time permits, I can present something to test on OS X in the middle of the week.