Hacking wget

Filed in: web

Today was a perfect day for hacking wget.
I wanted to mirror a site, and to my astonishment found out that wget doesn’t support regular expressions for selecting the paths to mirror. It only accepts lists of directories to include or exclude.
This was easily remedied using the excellent pcre library.
The patch is quick and dirty. To compile wget with it, add -lpcre to LIBS in the Makefile.
Patch for wget which adds regular expression matching for URLs
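
For readers who just want the gist of the idea, here is a minimal sketch of filtering URLs with a regular expression. It is not the patch itself: it uses POSIX regex.h instead of pcre so that it builds without extra libraries, and url_matches is a hypothetical helper name, not a function that exists in wget.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of wget): returns true when `url`
   matches the POSIX extended regular expression `pattern`. */
bool url_matches(const char *pattern, const char *url)
{
    regex_t re;

    /* REG_NOSUB: only a yes/no answer is needed, not match offsets. */
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return false;   /* assumption: an invalid pattern matches nothing */

    bool matched = (regexec(&re, url, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

A mirroring loop could call something like this on every discovered link and skip the ones that don't match. Compiling the pattern once up front, rather than on every call as above, would be the obvious next improvement.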

Update: Well, it seems that a similar patch has been available for years: http://www.mail-archive.com/wget@sunsite.dk/msg07395.html
