Grabbing all the PDF files from a website


By edwin - Posted on 12 August 2009

I found a website (not listed here) that had the whole service manual for my car at no charge (normally about $300 US). The site gave me 74 links to PDF files! As excited as I was, I wasn't in the mood to click on 74 links. So I googled around and found the following one-liner.

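wget `lynx -dump URL | grep '\.pdf$' | sed 's/[[:blank:]]\+[[:digit:]]\+\. //g'`
where URL is the address of the page that links to all the PDF files. lynx -dump prints the page followed by a numbered list of its links, grep keeps the lines that end in .pdf, and sed strips the link numbers so that wget is left with bare URLs.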

11 seconds later I had all 74 PDF files. Cheers!
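
For anyone who wants to check what the filter does before pointing wget at it, here is a minimal sketch of the same pipeline split into two shell functions. The names pdf_links and grab_pdfs are made up for this example (they are not part of lynx or wget), and the sed expression is rewritten without \+ so it does not depend on GNU sed. It assumes lynx prints its links as lines like "  12. http://...".

```shell
#!/bin/sh
# pdf_links: read `lynx -dump` output on stdin, print one PDF URL per line.
# (Hypothetical helper name, used only for this sketch.)
pdf_links() {
    # grep keeps lines ending in a literal ".pdf";
    # sed strips the leading "  12. " numbering that lynx puts before each link.
    grep '\.pdf$' |
        sed 's/^[[:blank:]]*[[:digit:]][[:digit:]]*\.[[:blank:]]*//'
}

# grab_pdfs: download every PDF linked from the page given as "$1".
# `wget -i -` reads the list of URLs from standard input.
grab_pdfs() {
    lynx -dump "$1" | pdf_links | wget -i -
}
```

Running `lynx -dump URL | pdf_links` on its own shows the list of URLs without downloading anything; `grab_pdfs URL` then fetches them.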

$ wget -r -l1 -nd -A.pdf URL
$ # read the manual next time ;)

Oh! You can also save a text manual from the website:
$ wget -A.htm -k -r -np -nH --cut-dirs=N -D domain.com domain.com/path/to/manual.htm
wget is a powerful tool...
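
The flags in that command are easier to remember once they are spelled out. Below is a rough sketch that wraps the same options in a function with one comment per flag. The function name mirror_manual and the WGET override (handy for a dry run with echo) are inventions for this example. -D is left out on the assumption that wget is not told to span hosts with -H, in which case it stays on the starting host anyway.

```shell
#!/bin/sh
# mirror_manual URL [CUT]: hypothetical wrapper around the wget call above.
# CUT is the N for --cut-dirs (defaults to 0).
mirror_manual() {
    # -r          follow links recursively
    # -np         never ascend to the parent directory
    # -k          convert links so the saved pages work offline
    # -nH         do not create a directory named after the host
    # --cut-dirs  drop this many leading directory components when saving
    # -A .htm     keep only files whose names end in .htm
    # WGET is an assumed override so the command can be previewed with echo.
    ${WGET:-wget} -r -np -k -nH --cut-dirs="${2:-0}" -A .htm "$1"
}
```

For example, `WGET=echo mirror_manual domain.com/path/to/manual.htm 2` only prints the command, and dropping WGET=echo runs it, saving the pages without the path/to/ prefix.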

P.S. Sorry for my English. Hello from .RU =)
