Arch Linux

Using wkhtmltoimage to produce screenshots of a website

wkhtmltoimage is a tool that I found to be quite useful when producing screendumps of websites. The homepage may be found here. I'm using it on several servers and have not had any difficulties yet.

With newer versions there are almost no system prerequisites. Depending on the distribution, you may have to install one or two X11 libraries to get the fonts right. For example, some users reported that they had to install urw-fonts or libx11-dev. Otherwise the static binary runs out of the box; I have tested it on Ubuntu, RHEL and Arch Linux.

Usage in the simplest case amounts to just "wkhtmltoimage [input file] [output file]" where the input file can either be a web address or a local html file. Additionally the application comes with a lot of optional parameters which include e.g. disabling JavaScript, using custom style sheets or even modifying cookies.
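To make the two cases concrete, here is a minimal sketch wrapped in two small shell functions. The function names, the URL and the file names are made up for illustration; --disable-javascript and --user-style-sheet are two of the optional parameters mentioned above.

```shell
#!/bin/sh
# Sketch only: function names and arguments are placeholders, not part of wkhtmltoimage.

# Simplest case: $1 = web address or local HTML file, $2 = output image
screenshot() {
    wkhtmltoimage "$1" "$2"
}

# Same, but with JavaScript disabled and a custom style sheet applied.
# $1 = input, $2 = output image, $3 = style sheet to use
screenshot_static() {
    wkhtmltoimage --disable-javascript \
        --user-style-sheet "$3" \
        "$1" "$2"
}
```

A call would then look like "screenshot http://example.com/ example.png". The output format should normally follow from the file extension of the output file.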

There is also a version that produces PDF instead of images called wkhtmltopdf. It can be found on the same website.

Synchronizing folders in Linux by using rsync

Linux has a nifty little tool called rsync that should be available on the distribution of your choice (provided it was updated at least once since the stone age). From the man pages:

Rsync is a fast and extraordinarily versatile file copying tool. It can copy locally, to/from another host over any remote shell, or to/from a remote rsync daemon. It offers a large number of options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.

So rsync is perfect for the job of keeping folders on different machines up to date. The proposed solution uses a central server approach and looks as follows:

  • Rsync is running in daemon mode on the server. For authentication a file with user:password pairs is used. Information on how to set up an rsync daemon can be found in the related man pages.
  • On the client machines rsync is executed by a small script. Authentication again happens through a file, passed to rsync via the --password-file parameter; on the client side this file contains only the password, while the username is given in the rsync call itself. This allows for automation, which is achieved by running the script from a cron job as well as during boot and shutdown. How to achieve the latter depends on the distribution in use. On Arch Linux, for example, rc.local and /etc/rc.local.shutdown would be good places for executing the script on boot and shutdown respectively. Information on how to set up a cron job can be found in the man pages for crontab.
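A client script along these lines could look like the sketch below. Everything specific in it is an assumption made for illustration: the user name, host name, module name, paths, and the choice to pull first and push afterwards. The comment block at the top shows what a matching module on the server might look like in rsyncd.conf.

```shell
#!/bin/sh
# Sketch only: all names and paths below are hypothetical.
#
# Assumed matching module in the server's rsyncd.conf:
#   [shared]
#       path = /srv/shared
#       read only = false
#       auth users = syncuser
#       secrets file = /etc/rsyncd.secrets   (lines of the form user:password)

RSYNC_USER="syncuser"
RSYNC_HOST="sync.example.com"
RSYNC_MODULE="shared"
LOCAL_DIR="$HOME/shared/"
# Contains only the password; must not be readable by other users (chmod 600)
PASSWORD_FILE="$HOME/.rsync-password"

sync_folder() {
    remote="${RSYNC_USER}@${RSYNC_HOST}::${RSYNC_MODULE}/"
    # -a archive mode, -z compress in transit, -u skip files that are newer on the receiving side
    # Pull changes from the server first, then push local changes back.
    rsync -azu --password-file="$PASSWORD_FILE" "$remote" "$LOCAL_DIR" &&
    rsync -azu --password-file="$PASSWORD_FILE" "$LOCAL_DIR" "$remote"
}
```

To use it, add a call to sync_folder as the last line of the script. A crontab line such as "*/15 * * * * /usr/local/bin/sync-folder.sh" (path hypothetical) would then run it every 15 minutes. Note that this simple pull-then-push variant does not propagate deletions; whether that is acceptable depends on how the folders are used.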

Caveats:

  • The user credentials for authentication are stored in a plain text file. While this is necessary for automated execution, it may be a no-go for some people. The alternatives are not using a password at all or using ssh as the connection protocol. Both options have their own pitfalls.
  • The setup (i.e. the exact calls and paths) on the client side will vary from distribution to distribution and will most likely need tweaking on each new machine; there is no solution that works out of the box.
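For comparison, the ssh alternative mentioned in the first caveat might be sketched as follows. The function name and arguments are made up; note the single colon, which makes rsync use a remote shell instead of talking to an rsync daemon.

```shell
#!/bin/sh
# Sketch only: function name and arguments are placeholders.

# $1 = user@host, $2 = remote directory, $3 = local directory
sync_over_ssh() {
    # -e ssh selects ssh as the remote shell; no rsync daemon or password file is involved
    rsync -az -e ssh "$1:$2" "$3"
}
```

For unattended runs this requires ssh to log in without prompting, typically via a key without a passphrase, which is one of the pitfalls alluded to above.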
