Good utility for HTML > PDF conversion: wkhtmltopdf

Found this on Google Code while working to convert 400+ HTML files saved by Mozilla Firefox. Firefox saves each page as an HTML file plus an associated directory containing the stylesheets, images and other content, so it was easy enough to run wkhtmltopdf against all the HTML files and get a good PDF rendering of each one. As a bonus tip, if you’ve got a bunch of HTML files in named subfolders (“Apple Custard/Index.html”, for example), you can do the following with bash in a POSIX environment:

  1. find . -iname '*.html' | awk -F '[/]+' '{print $2}' | sed 's/ /\\ /g' > list.txt
  2. while read line; do wkhtmltopdf "$line"/*.html "$line".pdf; done < list.txt
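
If you'd rather skip the intermediate list.txt, the same idea can be sketched as a single loop over the subfolders. This is only a rough alternative, not what I originally ran: the function name convert_saved_pages and the CONVERTER variable are made up for illustration (CONVERTER just lets you swap in another command for a dry run), and it assumes the pages sit directly inside each top-level folder with a lowercase .html extension.

```shell
# Sketch: turn each top-level folder's saved pages into "<folder>.pdf".
# convert_saved_pages and CONVERTER are hypothetical names, not part of wkhtmltopdf.
# The body runs in a subshell ( ... ), so its variables do not leak out.
convert_saved_pages() (
    root="${1:-.}"
    for dir in "$root"/*/; do
        dir="${dir%/}"              # drop the trailing slash
        set -- "$dir"/*.html        # this folder's pages; spaces stay intact
        [ -e "$1" ] || continue     # nothing matched, the glob stayed literal
        "${CONVERTER:-wkhtmltopdf}" "$@" "$dir.pdf"
    done
)
```

Run it as `convert_saved_pages .` from the directory that holds the subfolders. Because every path is quoted, folder names with spaces need no backslash escaping, and each folder is handled once even if it contains several HTML files.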

Happy hacking.