Posts Tagged 'kvm'

Setting KVM hostname per DHCP

The Problem

One problem with virtual machines is that cloning one also copies the complete configuration, including the hostname, static IPs, etc. To fix this you need to boot the cloned VM, edit the configuration and reboot it. Until you have done that, you will have at least temporary hostname and/or IP conflicts.


The better approach is to obtain hostnames and IPs via DHCP.
Then when you clone a machine (e.g. using virt-manager) for testing software updates or other changes, you simply remove the external NIC from the cloned VM, and the internal NIC gets a new MAC assigned automatically.
Next, update the DHCP server configuration (e.g. /etc/dnsmasq.conf) and add the new MAC, the assigned IP and the hostname there. When the new VM boots, it automatically gets the correct hostname and a new IP, without any changes to the VM’s configuration files.
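A minimal sketch of the DHCP side, assuming dnsmasq and its "dhcp-host" option. The helper name dhcp_host_line and the MAC, hostname and IP values are placeholders, not taken from my setup:

```shell
#!/bin/sh
# Print a dnsmasq "dhcp-host" line that pins a MAC address to a hostname and an IP.
# dhcp_host_line is a hypothetical helper name used only for this example.
dhcp_host_line() {
    mac=$1 name=$2 ip=$3
    printf 'dhcp-host=%s,%s,%s\n' "$mac" "$name" "$ip"
}

# Example usage on the DHCP server (all values are placeholders):
#   virsh domiflist Redmine-test     # run on the VM host, shows the clone's MAC
#   dhcp_host_line 52:54:00:12:34:56 redmine-test 192.168.1.50 >> /etc/dnsmasq.conf
#   service dnsmasq restart          # make dnsmasq re-read its configuration
```

Keeping the line generation in a small function makes it easy to review the entry before it is appended to the configuration file.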



After the changes have been tested successfully you can apply the changes to the real system (you still should have a backup).
Don’t remove the cloned ‘test-vm’. Just shut it down and keep it for the next time.

When you need to test new changes again, you already have the complete clone configuration and the DHCP setup for its MAC address. You simply need to replace the clone’s disk file (e.g. redmine-test.qcow) with the latest version of your VM’s disk file (e.g. redmine.qcow). Then you can start the test machine and everything should work without any conflicts.

Example SSH Session:

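# log into the VM's host system
ssh -l root blade7
cd /var/lib/libvirt/images
# shut down the VM before copying the file
virsh shutdown Redmine
# "virsh shutdown" returns immediately, so wait until the guest is really off
while [ "$(virsh domstate Redmine)" != "shut off" ]; do sleep 1; done
# replace the clone's disk (the clone itself must not be running)
cp redmine.qcow redmine-test.qcow
# restart the VM
virsh start Redmine
# start the clone VM
virsh start Redmine-test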

Configuring the DHCP-Client on Debian (VM Guest)

On Debian you simply need to set the hostname in /etc/hostname to “localhost” to enable receiving the hostname via DHCP.
The DHCP client itself is already configured to request “host-name” info via DHCP. See /etc/dhcp/dhclient.conf. There you should find the option “host-name” in the list of requested DHCP options.
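To verify this on a guest, a small check along the following lines can help. It is only a sketch: the function name requests_hostname is made up for this example, and it assumes the usual dhclient.conf layout with a "request" (or "also request") statement that ends in a semicolon. A plain grep for host-name is not enough, because the file may also contain a "send host-name" line.

```shell
#!/bin/sh
# Succeed if the "request" statement of a dhclient configuration lists host-name.
# requests_hostname is a hypothetical helper name used only for this example.
requests_hostname() {
    awk '
        /^[ \t]*#/ { next }                                  # skip comment lines
        /^[ \t]*(also[ \t]+)?request[ \t]/ { in_request = 1 } # statement starts
        in_request && /(^|[ \t,])host-name[ \t]*[,;]/ { found = 1 }
        in_request && /;/ { in_request = 0 }                 # statement ends
        END { exit (found ? 0 : 1) }
    ' "$1"
}

# Example usage on the guest:
#   requests_hostname /etc/dhcp/dhclient.conf && echo "host-name is requested"
```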

About DNSmasq

As a side note I should mention that using DNSmasq is a great solution. It is a DHCP server and DNS server in one application.
This means that no matter whether you add hostnames manually to the DHCP configuration or receive them via DHCP from a DHCP client, these names can also be resolved via DNS automatically, without any further configuration.


Optimizing speed in KVM image synchronization using rsync

As an addition to my last post I think it might be useful to explain how to efficiently copy KVM images over the network.

In that post I explained how to efficiently handle sparse files. But what I didn’t mention is how to get the best transfer rates.

By default rsync uses SSH, which encrypts all traffic and gives you a good amount of security. On a local Gigabit LAN, however, the CPU becomes the bottleneck due to the encryption.

Rsync also has a solution for that problem. You can start the rsync daemon on one side and use the rsync tool on the other side with the “rsync://user@host/share” syntax to transfer the data. This way rsync uses its own efficient protocol (port 873) without encryption.
This way I’m able to achieve transfer rates of over 100MB/s in my LAN.

Rsync daemon configuration:
max connections = 1
log file = /var/log/rsync.log
timeout = 300

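# module section; the name "share" is an assumption, it is the last part of the rsync:// URL
[share]
path = /root/backups
comment = Backup images
read only = yes
list = yes
uid = root
gid = root
hosts allow = backupserver
# rsync matches hostnames with wildcards, so "*" denies everybody not allowed above
hosts deny = *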

On my backupserver I extended my backup script to start the rsync daemon on the other side, run the rsync operation, and then kill the rsync daemon again.
This way the rsync port (873) is open only for a short time.
The “hosts allow” directive prevents connections from untrusted computers.
In addition it is possible to use “auth users = ” to set up password authentication. See the rsyncd.conf(5) man page for details.
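For the password authentication, the daemon reads "user:password" lines from a secrets file. The following is only a sketch: the function name, the user name "backup" and all paths are placeholders that are not part of my setup.

```shell
#!/bin/sh
# Create an rsync secrets file ("user:password" per line) that only its owner
# can read. With the default "strict modes" setting the daemon refuses secrets
# files that other users can read.
# write_secrets_file is a hypothetical helper name used only for this example.
write_secrets_file() {
    file=$1 user=$2 password=$3
    ( umask 077 && printf '%s:%s\n' "$user" "$password" > "$file" )
    chmod 600 "$file"
}

# Matching module settings in the rsync daemon configuration (placeholders):
#   auth users = backup
#   secrets file = /etc/rsyncd.secrets
# Client side, with the bare password stored in a file of mode 600:
#   rsync -a --password-file=/root/.rsync-pass rsync://backup@host/share/ /archive/host/
```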

A small excerpt of my backup script:

echo "starting remote rsync daemon for faster backups" >> /tmp/rsync_script_tmpfile
ssh "$HOSTNAME" rsync --daemon >> /tmp/rsync_script_tmpfile
echo "backing up $BACKUPDIR" >> /tmp/rsync_script_tmpfile
# $OPTIONS stays unquoted on purpose so that it splits into separate options
rsync $OPTIONS "rsync://$HOSTNAME/$BACKUPDIR" "$ARCHIVEROOT/$HOSTNAME" >> /tmp/rsync_script_tmpfile
echo "stopping remote rsync daemon" >> /tmp/rsync_script_tmpfile
ssh "$HOSTNAME" killall rsync >> /tmp/rsync_script_tmpfile
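A slightly more defensive variant of the same idea could look like the sketch below. It is not the script I actually use: the function name backup_via_daemon is made up, and it assumes that the remote daemon configuration contains "pid file = /var/run/rsyncd.pid", so that only the daemon is stopped instead of every rsync process on the host.

```shell
#!/bin/sh
# Run one backup over the rsync daemon protocol and always stop the remote
# daemon afterwards, even if the transfer fails.
# Assumption: the remote rsync daemon config sets "pid file = /var/run/rsyncd.pid".
# backup_via_daemon is a hypothetical helper name used only for this example.
backup_via_daemon() {
    host=$1 module=$2 dest=$3
    # start the daemon on the remote side; give up if that fails
    ssh "$host" rsync --daemon || return 1
    rsync -a "rsync://$host/$module/" "$dest/"
    status=$?
    # stop only the daemon, not every rsync process on the remote host
    ssh "$host" 'kill "$(cat /var/run/rsyncd.pid)"'
    # report the result of the transfer itself
    return "$status"
}

# Example usage (placeholders):
#   backup_via_daemon blade7 share /archive/blade7
```

Returning the exit status of the transfer lets the calling backup script notice a failed run even though the cleanup step succeeded.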

Rsync and sparse files

Sparse files are a great feature of Linux filesystems. They come in very handy when working with virtualization technologies like KVM. You don’t need to think long about how big to make a VM disk; just create a disk which is definitely big enough (I normally use 20GB for my Linux-based servers). If only 1GB is used, the file takes up only that amount of physical disk space and not the whole 20GB.

QEMU already creates sparse files by default when using raw images.
Example: qemu-img create myserver.img 20G
When adding the “s” option to the ls command you see the real used size in the first column.

ls -lhs
realsize                           virtualsize
0 -rw-r--r-- 1 gergap gergap 20.0G Aug 10 11:27 myserver.img
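The same difference can be shown without QEMU. The following sketch assumes GNU coreutils (truncate and du with --apparent-size); the function names are made up for this example.

```shell
#!/bin/sh
# Apparent size (what applications see) and allocated size (blocks on disk),
# both in KiB. The helper names are hypothetical, the tools are GNU coreutils.
apparent_kb()  { du -k --apparent-size "$1" | cut -f1; }
allocated_kb() { du -k "$1" | cut -f1; }

# Create a 20 MiB file that consists of one big hole, without writing any data.
make_sparse_file() { truncate -s 20M "$1"; }

# Example usage:
#   make_sparse_file demo.img
#   echo "apparent: $(apparent_kb demo.img) KiB, allocated: $(allocated_kb demo.img) KiB"
```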

However, these sparse files are a problem when copying them, especially when you need to move a disk image to another machine over the network.

Local copies: When copying files locally with tools that are not aware of sparse files, the whole 20GB will be copied. It may sound strange, but that’s the desired behavior. A 20GB sparse file should look like a normal file to applications, so they see the complete 20GB, even though most of the data is just zeros.

Luckily the “cp” command is aware of sparse files and autodetects whether a source is a sparse file. In that case the copy also becomes a sparse file and only the real data gets copied, which is much faster. If the source is not sparse you can use “cp --sparse=always source dest”, then the destination will become a sparse file.
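A small sketch of the “--sparse=always” case, assuming GNU cp and du. The function names and file names are placeholders for this example.

```shell
#!/bin/sh
# Copy a file and turn every run of zero bytes into a hole at the destination.
# --sparse=always is an option of GNU cp; sparse_copy is a hypothetical name.
sparse_copy() { cp --sparse=always "$1" "$2"; }

# Allocated size on disk in KiB.
allocated_kb() { du -k "$1" | cut -f1; }

# Example usage: write 8 MiB of real zeros, then copy them sparsely and compare.
#   dd if=/dev/zero of=full.img bs=1M count=8
#   sparse_copy full.img sparse.img
#   allocated_kb full.img; allocated_kb sparse.img
```

The content of both files stays identical; only the space allocated on disk differs.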

Now let’s come to network transfer. Most admins use rsync, which can copy a lot of files very quickly over SSH. rsync is very efficient at detecting which files have changed and only transmits those. So it’s easy to keep e.g. an FTP mirror in sync with its source or to implement backup strategies.

KVM images are different. You don’t have many files, but the files you have are huge sparse files. You don’t want to transmit 20GB over the network if only a few MB have changed in the disk image. Even transmitting 1GB of actually used data takes quite a long time.

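The solution is to use the “--inplace” option of rsync. rsync’s delta transfer already sends only the changed blocks of a file over the network, but by default the receiver builds a complete new copy of the file from them. With “--inplace” the changed blocks are written directly into the existing target file. The problem with “--inplace” is that it does not create sparse files.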

But rsync can handle sparse files when passing the “--sparse” option. Unfortunately “--sparse” and “--inplace” cannot be used together in older rsync versions (the combination is only accepted from rsync 3.1.3 on).

Solution: When copying the file for the first time, which means it does not yet exist on the target server, use “rsync --sparse”. This creates a sparse file on the target server and writes only the used data of the sparse file to disk there.

When the file already exists on the target server and you only want to update it use “rsync --inplace“. This will only transmit the changed blocks and can also append to the existing sparse file.
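The two cases can be wrapped in a small helper. This is only a sketch: the function name sync_image is made up, it assumes a transfer over SSH with a target path that contains no spaces, and the host and file names in the example are placeholders.

```shell
#!/bin/sh
# First copy: create a sparse file on the target with --sparse.
# Later updates: patch the existing target file with --inplace.
# sync_image is a hypothetical helper name used only for this example.
sync_image() {
    src=$1 host=$2 path=$3
    # check on the target server whether the image already exists
    if ssh "$host" test -e "$path"; then
        mode=--inplace
    else
        mode=--sparse
    fi
    rsync -av "$mode" "$src" "$host:$path"
}

# Example usage (placeholders):
#   sync_image redmine.qcow backupserver /root/backups/redmine.qcow
```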

I hope rsync will become smarter in the future and allow the combination of “--inplace --sparse”, or even autodetect the best strategy. But for now we have at least a working solution.

I hope this blog was helpful for understanding sparse files and rsync.