How to troubleshoot iSCSI and KVM

Abiquo uses the storage pool resource in KVM to create the iSCSI connections. However, those pools are not persisted across reboots; the connections are only persisted in iscsid. Occasionally we need to fix problems with the iSCSI connections on a KVM host, so this article explains some useful tips and procedures for doing so.

First of all, we need to know what kind of storage server we are using. With NetApp and Nexenta, all the volumes are exposed through the same iSCSI target. With LVM, every volume is exposed through its own iSCSI target. The examples in this article always use the NetApp case.

Let's first explain how iSCSI is configured in KVM. Abiquo creates a pool in KVM, which makes KVM open the iSCSI session to the storage server.

[root@bc2blade2 ~]# virsh pool-list
Name                 State      Autostart
-----------------------------------------
ABQ_dca2e9d8-13f3-4b82-b7fd-c7f1621c0746-0 active     no
default              active     yes

[root@bc2blade2 ~]# virsh pool-dumpxml ABQ_dca2e9d8-13f3-4b82-b7fd-c7f1621c0746-0
<pool type='iscsi'>
  <name>ABQ_dca2e9d8-13f3-4b82-b7fd-c7f1621c0746-0</name>
  <uuid>8841e6f6-73b4-6fb3-eebb-1817c2150d5a</uuid>
  <capacity>0</capacity>
  <allocation>0</allocation>
  <available>0</available>
  <source>
    <host name='10.60.13.18'/>
    <device path='iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12'/>
  </source>
  <target>
    <path>/dev/disk/by-path/ip-10.60.13.18:3260-iscsi-iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12-lun-1</path>
    <permissions>
      <mode>0700</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>

To see this connection in iSCSI, we use the following command:

[root@bc2blade2 ~]# iscsiadm -m session
tcp: [1] 10.60.13.18:3260,1 iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12

Now, if we add new volumes on the same NetApp, the session is rescanned to detect the new LUNs created on that target. If the session was not properly rescanned, the rescan can be performed manually with this command:

[root@bc2blade2 ~]# iscsiadm -m session -R
Rescanning session [sid: 1, target: iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12, portal: 10.60.13.18,3260]
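
When a host has several sessions, it can be useful to rescan only the session that belongs to the affected target. The following is a minimal sketch and not part of Abiquo: sid_for_target is a hypothetical helper that extracts the session ID from the output of "iscsiadm -m session", assuming the line format shown above. The -r option of iscsiadm then limits the rescan to that session ID.

```shell
# Hypothetical helper: print the session ID (SID) of every session whose
# target IQN equals $1. Reads `iscsiadm -m session` output on stdin, e.g.:
#   tcp: [1] 10.60.13.18:3260,1 iqn.2008-03.com.abiquo...
sid_for_target() {
    # Field 2 is "[SID]", field 4 is the target IQN; strip the brackets.
    awk -v iqn="$1" '$4 == iqn { print substr($2, 2, length($2) - 2) }'
}

# Usage on the KVM host (not executed here; one rescan per matching session):
#   IQN=iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12
#   for sid in $(iscsiadm -m session | sid_for_target "$IQN"); do
#       iscsiadm -m session -r "$sid" --rescan
#   done
```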

Remember that restarting KVM makes the pool definitions disappear. This shouldn't be a problem, because the iSCSI connections are persisted in iscsid.

As additional information, the iSCSI devices are symlinked under /dev/disk/by-path. The VM definitions point to that path:

[root@bc2blade2 ~]# virsh dumpxml ABQ_dca2e9d8-13f3-4b82-b7fd-c7f1621c0746 | grep "source dev"
      <source dev='/dev/disk/by-path/ip-10.60.13.18:3260-iscsi-iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12-lun-1'/>
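
To check quickly whether the device a VM expects is actually present on the host, the path can be rebuilt from its parts and tested. This is only a sketch: iscsi_by_path and check_iscsi_device are hypothetical helper names, and the path layout is assumed to be the one shown in the output above (portal IP, port, target IQN, LUN).

```shell
# Hypothetical helper: build the by-path name for an iSCSI LUN.
# $1 = portal IP, $2 = target IQN, $3 = LUN number, $4 = port (default 3260)
iscsi_by_path() {
    printf '/dev/disk/by-path/ip-%s:%s-iscsi-%s-lun-%s\n' \
        "$1" "${4:-3260}" "$2" "$3"
}

# Report whether that device exists; a dangling symlink counts as missing.
check_iscsi_device() {
    dev=$(iscsi_by_path "$@")
    if [ -e "$dev" ]; then
        echo "present: $dev"
    else
        echo "missing: $dev"
        return 1
    fi
}

# Usage on the KVM host (not executed here):
#   check_iscsi_device 10.60.13.18 \
#       iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12 1
```

If the device is reported as missing while the session is listed by iscsiadm, a manual rescan of the session is the first thing to try.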

As we can see, this path contains the service IP of the storage server. If we access the storage server through several of its IPs (so we have several sessions to the same storage server), one symlink is created per storage server IP for the same volume. In this situation we have found a few cases where this misbehaves and the sessions are not correctly refreshed for both IPs. This can leave Abiquo unable to deploy because it cannot access the volume. Manually rescanning the session should fix the problem.

Abiquo only accesses the storage server through one IP, so it only creates one session. However, restarting the iscsid service makes it connect to all the available IPs of the storage server, which can cause the error described above.
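
To find out whether a host has ended up with several sessions to the same target, the session list can be grouped by target IQN. The sketch below is an illustration only: targets_with_multiple_sessions is a hypothetical helper that reads the output of "iscsiadm -m session", assuming the line format shown earlier in this article.

```shell
# Hypothetical helper: read `iscsiadm -m session` output on stdin and print
# each target IQN that has more than one session, followed by its portals.
targets_with_multiple_sessions() {
    awk '
        NF >= 4 {
            count[$4]++                          # sessions per target IQN
            portals[$4] = portals[$4] " " $3     # remember each portal
        }
        END {
            for (t in count)
                if (count[t] > 1)
                    print t portals[t]
        }
    '
}

# Usage on the KVM host (not executed here):
#   iscsiadm -m session | targets_with_multiple_sessions
```

Any target printed by this helper is reached through more than one portal, which is the situation in which the refresh problem has been observed.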

Finally, let's see how to reconnect a session that has somehow been lost in iscsid. The symptoms are easy to recognize: the virtual machine cannot be started.
We first run a discovery against the NetApp, which shows the available targets:

[root@bc2blade2 ~]# iscsiadm -m discovery -t st -p 10.60.13.18:3260
10.60.13.18:3260,1 iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12

Then, we can connect to it like this:
[root@bc2blade2 ~]# iscsiadm -m node -T iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12 -p 10.60.13.18:3260,1 --login
Logging in to [iface: default, target: iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12, portal: 10.60.13.18,3260] (multiple)
Login to [iface: default, target: iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12, portal: 10.60.13.18,3260] successful.

Remember that when you run the discovery and login commands, you need to use the IP configured in Abiquo.
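
The discovery and login steps can be wrapped so that they only run when the session is really missing. This is a hedged sketch rather than an Abiquo tool: has_session and ensure_session are hypothetical names, the session line format is assumed to be the one shown earlier, and the portal argument must be the IP and port configured in Abiquo.

```shell
# Hypothetical helper: succeed if stdin (output of `iscsiadm -m session`)
# contains a session for target $1 on portal $2 (ip:port, without ",tpgt").
has_session() {
    awk -v iqn="$1" -v portal="$2" '
        { split($3, p, ","); if ($4 == iqn && p[1] == portal) found = 1 }
        END { exit !found }
    '
}

# Log in only when no session exists for this target and portal.
# $1 = target IQN, $2 = portal as ip:port
ensure_session() {
    if iscsiadm -m session 2>/dev/null | has_session "$1" "$2"; then
        echo "session already present"
    else
        iscsiadm -m discovery -t st -p "$2" &&
        iscsiadm -m node -T "$1" -p "$2" --login
    fi
}

# Usage on the KVM host (not executed here):
#   ensure_session \
#       iqn.2008-03.com.abiquo.localhost.localdomain:35de915e-0dca-4d20-b6ab-b4e809c1de12 \
#       10.60.13.18:3260
```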
