May 22, 2012

Configuring Cisco 1000v with NAS datastores

Should you?
Can you?

The devil is in the semantics. We’ve got an environment where the Service Console is configured on a traditional vSwitch and the VMKernel ports for NAS datastores (and VMotion) are configured on the Cisco 1000v. The toe-stepping happens when one of the hosts in the cluster is rebooted. It appears that the Service Console is triggering VMKernel to mount the datastores before the 1000v has fully brought the interfaces up. This causes the first attempt to mount the NAS datastores to time out, which in turn causes the ESX hosts to generate HA errors (especially if you are using NFS datastores for swap files). When the ESX Server is finally up, it’ll be without its swap datastores.

The 1000v Switch FAQ does not make the answer much clearer either.

Q. Should the distributed virtual switch (DVS) also manage the service console virtual network interface card (vNIC) or should the vNIC be on a traditional vSwitch?
A. The service console vNIC as well as the VMotion vNIC and Small Computer System Interface over IP (iSCSI) interfaces can be on the Cisco Nexus 1000V DVS. Be sure to create the VLANs used by these interfaces as system VLANs.

The question was: “should the DVS manage Service Console“.
The answer is: “It can“. (Huh!)

Maybe the author was thinking – “It depends”.

Based on what we are seeing, it stands to reason that in a NAS datastore environment, it is almost necessary for the Service Console to be managed by the DVS, so that the timing and order of the Service Console and VMKernel NICs can be managed, thereby preventing the above situation. Let the 1000v come up and bring up the NICs for the Service Console, which in turn can trigger VMKernel, the NAS datastore mounting, and so on.

Still need to chase this ball a bit. Stay tuned.

Never trust the written documentation. One’s own experimentation will prove what is true. – Tao of Shao

Comments

  1. Brad Hedlund says:

    Do you have the ‘system vlan’ command in the Port Profile in use by the VMKernel?

  2. Brad Hedlund says:

    Oh, also, do you have ‘system vlan’ configured in the Port Profiles used as uplinks? And is the VMKernel’s VLAN listed in this command?

    Thanks,
    Brad

  3. Yes – They are system vlans. Today we moved SC and CPM (Control, Packet and Management) over to the vDS, on the thinking that the vDS would bring the SC and VMkernel up more cleanly so we can mount the NAS datastores. That really did not help things all that much. We still see the ESX Server boot up, *hang* on the NAS datastores for about 10 minutes, and finally mount the datastores. During that time, we see the links come up *twice*, followed by the NAS datastore mounting.

    The VMKernel IP and the storage IPs are in the same subnet, and otherwise there are NO NFS errors (RPC or any other kind). It’s only at boot time that we get this delay.

  4. Brad Hedlund says:

    OK… Do you have ‘spanning-tree portfast trunk’ enabled on switchports connecting the ESX host?

    Thanks,
    Brad

  5. Christian Elsen says:

    Hi,

    there is no problem in moving the VMKernel interface used for NFS or iSCSI to the Nexus 1000V. You just need to configure the VLAN that is being used for NFS or iSCSI as a system VLAN on the uplink port-profile AND the actual port-profile used for the VMKernel interface.

    Here’s a sample config. Let’s say you will use VLAN 10 for NFS or iSCSI.

    port-profile uplink_port_profile_1
      capability uplink
      vmware port-group
      switchport trunk allowed vlan 10-14,21-22
      channel-group auto mode on sub-group cdp
      no shutdown
      system vlan 10-11,21-22
      state enabled
     
    port-profile iSCSI_or_NFS
      vmware port-group
      switchport mode access
      switchport access vlan 10
      no shutdown
      system vlan 10
      state enabled

    Same applies for the Service Console!

    Hope that helps

    Chris

  6. @Brad and @Christian – Thanks for your help. Brad – Just spoke to Pierre a few mins ago and he mentioned the same thing that @Christian also mentioned.

    To answer your question, I asked for the network configs and verified that we did not have portfast enabled. (I remember asking for it, however).

    Things to check

    1. We know that the system vlans are enabled on the uplink port profile. We need to ensure that the system vlans are enabled on the VMkernel vDS interface port profile as well. I didn’t think there was a way to check that using vCenter but Christian’s example shows me the way.

    2. Enable spanning tree portfast trunk on the port channels

    3. We are also using LACP for link aggregation. We’ll need to see if there are any config parameters that might allow us to bring these ports up any faster.

    Please stay tuned and muchas gracias!

  7. Thanks to Brad and Christian for their pointers. The missing piece was the addition of system vlans on the VMkernel port profile. We only had it on one side (the uplink port profile). With those corrections, we have a clean vSphere 4 with 1000v configuration. Below is the configuration we used on the VSMs, which works like a charm.

    Next up – NX 5Ks and 7Ks. Good times.

    VMKernel

    port-profile vmkernel-uplink
      capability uplink
      vmware port-group
      switchport mode trunk
      switchport trunk allowed vlan 151
      channel-group auto mode active
      no shutdown
      system vlan 151
      state enabled
     
    port-profile vmkernel
      vmware port-group
      switchport mode access
      switchport access vlan 151
      no shutdown
      system vlan 151
      state enabled

    VM Network

    port-profile vmnetwork-uplink
      capability uplink
      vmware port-group
      switchport mode trunk
      switchport trunk allowed vlan 106,997-998
      channel-group auto mode active
      no shutdown
      state enabled
     
    port-profile vmnetwork-106
      vmware port-group
      switchport mode access
      switchport access vlan 106
      no shutdown
      state enabled

    Service Console

    port-profile service-console-uplink
      capability uplink
      vmware port-group
      switchport mode trunk
      switchport trunk allowed vlan 105
      channel-group auto mode active
      no shutdown
      system vlan 105
      state enabled
     
    port-profile service-console
      vmware port-group
      switchport mode access
      switchport access vlan 105
      no shutdown
      system vlan 105
      state enabled
