Showing posts with label ESX 3.x Service Console Commands. Show all posts
Showing posts with label ESX 3.x Service Console Commands. Show all posts
Johnny Zhang
Many of us believe "vmware -v" will show you the current version of your ESX server (With all the patches applied). This is actually not true. This command only shows the version of a component.
In ESX 3.x "vmware -v" will show you the version number of "VMware-esx-vmx"
And in ESX 4.x "vmware -v" will show you the version of "vmware-esx-vmware-release"

In addition, vCenter will show the version of your "vmware-hostd"
Please keep in mind those numbers can all be different.
Johnny Zhang
You always want to know more about your servers, your network settings your storage options. "esxcfg-info" is the best place to dig things up. You want to know more about your storage? Type:
"esxcfg-info -s | less -RS"
You can check your VMFS alignment here, when the Starting sector set as 128 you know your VMFS is aligned correctly (VMFS aligned on 64KB boundary)
You can map the LUN back to your service console device. It will also show you the type of the file system on the LUN. In this case fb is VMFS
Johnny Zhang
There are times we want to find out what drivers are configured to load up when VMkernel boots up.
You can find this out by typing:
"esxcfg-modele -q"
You can also check what drivers are currently loaded:
" esxcfg-modele -l"
Johnny Zhang
Every time when you create a VMFS datastore, a copy of VMFS metadata will write to your LUN with following information:
  • Block size
  • Number of extents
  • Volume capacity
  • VMFS version
  • Label
  • VMFS UUID
You can use "vmkfstools -P -h /vmfs/volumes/LUN_Lable" to query the file system information

Note: information based on Infrastructure 3 DSA Manual
Johnny Zhang
Normally when you hit a VMotion issue, or a 64 bit VM can not be power on from your 64 bit ESX server. You might asked to reboot the host and check the BIOS if HV (Hardware Virtualization) is enabled. You can check this without reboot your server
esxcfg-info | grep -i "hv support'

It will return a number between 0 to 3
  • 0 is Not present

  • 1 is Not supported

  • 2 is disabled

  • 3 is enabled

So in my case my server supports HV but is disabled under the BIOS.
Johnny Zhang
Sometimes we will get a hardware device just won't work on ESX server. There are many different reasons for that. We'd like to first find out if the device is supported and the right device driver is loaded. You can check the HCL online, but what about you are in front of the ESX server and it does not have Internet connection?
You can check the device against vmware-devices.map file. this file is located at /etc/vmware directory on your ESX server.
Let's use a NIC on my server as an example
esxcfg-nics -l
We can see the device is Broadcom Corporation NetXtreme II 5706 Gigabit Ethernet and the driver is bnx2.
Now we will chaeck this against vmware-devices.map file
grep -i "bnx2" vmware-devices.map
We can see the line "device,0x14e4,0x164a,nic,NetXtreme II 5706 Gigabit Ethernet,bnx2.o" We know the device is supported, and the loaded driver is also correct. Now we need look elsewhere for the problem.
Johnny Zhang
Each ESX server will keep a list of VMotion history since it first boot up. You can find that by
cat /proc/vmware/migration/history
It will show you if the VM was migrated to this host or from this host. In my case VM1244 is migrated from this host to host 10.21.51.133. This IP is the VMotion IP address. The VMotion id is a very important part. Both hosts involved in the VMotion will have the same VMotion id. So you can track the VM from one host to another by using this id.
For example, my VMotion id is 1253064812716632. So I can use this against vmkernel logs on both hosts
On the source:
grep 1253064812716632 vmkernel
From the log we can see:
src ip = <10.21.51.132> dest ip = <10.21.51.133> Dest wid = 1111. The source will always show both VMotion IP address for source and destination hosts, it will also show the new world ID for the VM on the new host, in this case 1111.
On destination:
grep 1253064812716632 vmkernel
On destination we see:
src ip = <10.21.51.132> dest ip = <0.0.0.0> Dest wid = -1 using SHARED swap
Note, the destination IP will always show 0.0.0.0 and world id -1. Another way for us to tell this is the destination host.
We will talk about more on how to track VMotion from other files.
Johnny Zhang
By default, users can try to log into a Linux or in this case ESX server as many time as they want. Someone can sit there all day try to crack the password or just write up a script let it do the trick. You can change the behavior by add the following lines to /etc/pam.d/system-auth:

auth required /lib/security/pam_tally.so no_magic_root
account required /lib/security/pam_tally.so deny=3
no_magic_root

This will lock out the user after 3 attempts
(Keep in mind you might want to give more than 3 attempts before lock users out, just in case you forgot your password)

You can also setup the log to monitor it after this

To create the file for logging failed login attempts, execute the following commands:
touch /var/log/faillog
chown root:root /var/log/faillog
chmod 600 /var/log/faillog

Note: This will only work with VI3 since PAM on Redhat 5 (where ESX 4.x service console based on) does not work with those options

Johnny Zhang
If you are on a ESX host service console (ESX 3.x or ESX 4.0), you want to run a quick summary on the data stores this host can see, but either don't have access to VI client or just simplely lazy to launch it, you can get that by typing
vmware-vim-cmd hostsvc/datastore/listsummary

You can got more about a specific datastore by
vmware-vim-cmd hostsvc/datastore/info datastore_name
Johnny Zhang
Anyone work with ESX server would know sometimes you need to restart hostd service on the ESX server to "refresh" information on ESX. Normally we do this by typing "service mgmt-vmware restart"
However, sometimes this is not good enough. Every time you restart hostd it will map to 4 files under
/var/lib/vmware/hostd/stats.
If any of those file got corrupted by any reason, restart hostd will not help. Every time when hostd restart called, the service will check if those files exist, if not it will create them before starting the service. So we can remove those files to make sure hostd service starts from scrach
"rm -rf /var/lib/vmware/hostd/stats/*" Make sure you are in the right direct tory when you run the rm command especially with the -rf switch
Johnny Zhang
There are many useful log files on a ESX server to help you find what cause a problem. However, there may be to many of them how do you know which one to look at? I will base on my own experiences to explain what I think is important. I will start with HA logs. HA is configured on vCenter. However, once configured it will function without vCenter. It will build it's own HA domain (This domain always use the name 'vmware'). If you have more than 5 servers in a HA cluster, there will be 5 primaries and the rest will be secondary. The log file location after ESX 3.5 U2 is at /var/log/vmware/aam directory
The most impportant log would be "vmware_server_name.log". You can find many HA related info or errors from here.
Even with 5 primaries, there will be only one cluster manager which is the one holding all the rules. You can find which one is the cluster manager by

less vmware_server_name.log | grep -i "VMWareClusterManager submitted to run on node "| uniq
You can see how HA changed the cluster manager from host to host, and the last record is the current manager.
You can also find out if there was a isolation event happened by
less vmware_bs-bcs-h132.log | grep -i ISOLATED
When HA detect heartbeat failure on one host, it will try to find out if this is a agent failure or host failure. The isolation event will only trigger when there is a host failure. HA does this by ping the host after the heartbeat is gone
less vmware_bs-bcs-h132.log | grep -i "Ping Node results:"
When the host replys the ping it will mark as ALIVE. When the host is not responding it will mark as DEAD and HA event will trigger. (More to come in the future)
Johnny Zhang
The service console's /proc directory gives us some good information. How do I know my DRS setting for the VMs on this hosts?

less -RS /proc/vmware/sched/drm-stats

gid 0 is the resource group for the host. Check balloon and swap to monitor the host resource usage.
Johnny Zhang
There are times when we need to test on all vmkernel network connections. It's own vmkernel interface, iSCSI network, NFS mounts, vmkernel network gateway. You can use one command to ping them all at once

vmkping -D

Johnny Zhang
There are times we need to find out information about the physical network cards.
We all know how to list the NICs by using:

esxcfg-nics -l
Now we know we have two physical NICs. vmnic0 and vmnic1. what about if I want to know more information about them? The command ethtool will help us.

ethtool -i vmnic0
This will give us the driver version and the firmware level this NIC is currently at. You still want more information?

ethtool vmnic0
netstat -int
Will give you information on the health of your physical NICs' packets receive and transmit
Johnny Zhang
How do you know if your virtual machine's configuration file (vmname.vmx) is good? You might not find the typo in there by looking at it. And what about you have 30, 40, 50 of them on this host? You can run one command to do a quick check for you
vmware-configcheck
This command will go through all the steps necessary to power on a VM. Once it finish with all the steps it will give the VM a "PASS" tag. If you have problem with the configuration file. The problem VM will taged as "FAIL"
Johnny Zhang
Many times, we need to find storage information on ESX servers when work with the storage teams. 'vmware-vim-cmd' commands can help us to pull those information quickly
There are many useful information we can pull out from 'vmware-vim-cmd' commands. I will list three of them here.

1. vmware-vim-cmd "hostsvc/summary/fsvolume"

This command will link your data store's friendly name with the path and the LUN Ids. It will also allow you to verify the file system type and if any of them went read only on you.

2. vmware-vim-cmd "hostsvc/summary/hba"
This command gives information on the physical HBA's PCI Ids, currently used driver, model and speed.

3.
vmware-vim-cmd "hostsvc/summary/scsilun"
This is extremely useful since the UUID out put from this command is what the SAN team really understand.