Have you ever looked in OpenManage Essentials and seen the above when looking at a device? I recently had this experience while checking a number of older servers for which we were not receiving alerts properly.
Checking the iDRAC and the server, it appeared that the management agents were running and correctly configured so that the OME server could contact the device, but attempts to discover and inventory it were still failing. What was going on?
The Dell Troubleshooting Tool is an excellent utility from Dell for interrogating devices using a variety of protocols. Querying the iDRAC using WSMAN soon found the problem:
Many of the older servers had internal SSL certificates installed on them, which had subsequently expired. As most of the servers had been decommissioned, renewing the certificates had been overlooked.
Getting rid of the expired certificate is not as straightforward as it should be, as the iDRAC6 web interface has no option to delete the certificate. This was resolved by accessing the iDRAC via SSH and using the racadm command:
After the iDRAC interface rebooted to apply changes, OME was then able to discover and inventory the iDRAC interface.
The Dell Troubleshooting Tool proved to be a very useful addition to the infrastructure admin’s toolbox for dealing with non-obvious management protocol issues.
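Since expired certificates were the root cause here, it may be worth checking expiry dates before they break discovery. The following is a minimal Python sketch, not part of the Dell tooling. It assumes you have already obtained the certificate's notAfter string some other way (for example from the output of `openssl x509 -noout -enddate`), and the function names are purely illustrative.

```python
import ssl
import time
from typing import Optional


def days_until_expiry(not_after: str, now: Optional[float] = None) -> float:
    """Return the number of days until a certificate expires.

    `not_after` uses the OpenSSL text form, e.g. "Jan  5 09:34:43 2018 GMT".
    A negative result means the certificate has already expired.
    """
    expires = ssl.cert_time_to_seconds(not_after)  # seconds since the epoch (UTC)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def is_expired(not_after: str, now: Optional[float] = None) -> bool:
    """True when the notAfter date lies in the past."""
    return days_until_expiry(not_after, now) < 0
```

Running something like `is_expired("Jan  5 09:34:43 2018 GMT")` against the dates collected from each iDRAC would flag the certificates that need attention.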
iDRAC firmware 2.40.40 was released on 17th Oct 2016. Details can be found by following this link.
We recently needed to upgrade the iDRAC firmware to 2.40.40 on one of our servers while troubleshooting another issue with Dell, and found shortly afterwards that this server could no longer be discovered by Dell OpenManage Essentials.
We found that the firmware update had set the iDRAC’s TLS protocol to version 1.2, which is not supported by Windows operating systems earlier than Server 2012 R2 (our OME server runs on Windows Server 2012). There is a patch available to fix this. This is all covered in the driver release notes.
The alternative, which we have chosen for now as this firmware is only on one device, is to set the iDRAC to use the older TLS protocol. The setting can be found under the iDRAC Network Settings in the services tab:
I’ll apply the Microsoft patch to the system, and then set the TLS back to v1.2.
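To confirm which TLS versions a management interface will actually negotiate, a quick probe can help. This is a rough Python sketch under a few assumptions: the function names and the example hostname are made up for illustration, the iDRAC web service is assumed to listen on port 443, and certificate validation is deliberately switched off because the probe only cares about the protocol version (so it is for diagnostics only).

```python
import socket
import ssl


def pinned_context(version: ssl.TLSVersion) -> ssl.SSLContext:
    """Build a client context that will only negotiate one TLS version."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Diagnostics only: skip certificate checks, we just want the handshake.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def accepts(host: str, version: ssl.TLSVersion,
            port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if the host completes a handshake at exactly `version`."""
    ctx = pinned_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version() is not None
    except OSError:  # covers ssl.SSLError, timeouts and refused connections
        return False
```

For example, `accepts("idrac.example.local", ssl.TLSVersion.TLSv1_2)` would report whether that (hypothetical) iDRAC accepts TLS 1.2. Note that the local OpenSSL build may itself refuse older versions such as TLS 1.0, in which case a False result says more about the client than the iDRAC.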
One of the tasks I am working on is the configuration of our fleet of Dell servers to use Dell’s OpenManage Essentials monitoring and management platform. One of the servers, however, had been unwilling to have its SNMP configuration changed using the vSphere CLI tools and was generating the following error:
Changing notification(trap) targets list to: myserver.local@162/DELLOME…
Use of uninitialized value $sub in string eq at C:/Program Files (x86)/VMware/VMware vSphere CLI/Perl/lib/VMware/VIMRuntime.pm line 81.
Use of uninitialized value $package in concatenation (.) or string at C:/Program Files (x86)/VMware/VMware vSphere CLI/Perl/lib/VMware/VIMRuntime.pm l
Undefined subroutine &Can’t call method “ReconfigureSnmpAgent” on an undefined value at C:\Program Files (x86)\VMware\VMware vSphere CLI\bin\vicfg-snmp.pl line 297.
::fault_string called at C:\Program Files (x86)\VMware\VMware vSphere CLI\bin\vicfg-snmp.pl line 299.
Hrm, OK, fine. Let’s try logging in to the host’s ESX Shell and using esxcli to set the trap:
Community string was not specified in trap target: myserver.local
Clearly something is broken with the SNMP configuration. Luckily the VMware forums were quick to supply a solution.
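The esxcli message suggests the stored trap target had no community string attached. As a small illustration, the Python sketch below parses targets written in the host@port/community form seen in the vicfg-snmp output above and reports any that lack a community. The names `TrapTarget`, `parse_trap_target` and `missing_community` are my own, and falling back to the standard SNMP trap port 162 when no port is given is an assumption of this sketch.

```python
from typing import List, NamedTuple, Optional


class TrapTarget(NamedTuple):
    host: str
    port: int
    community: Optional[str]


def parse_trap_target(spec: str) -> TrapTarget:
    """Split 'host@port/community' into its parts.

    The port and community are optional in the input; a missing community
    is returned as None so the caller can flag it.
    """
    address, _, community = spec.strip().partition("/")
    host, _, port = address.partition("@")
    if not host:
        raise ValueError(f"no host in trap target: {spec!r}")
    return TrapTarget(host, int(port) if port else 162, community or None)


def missing_community(specs: List[str]) -> List[str]:
    """Return the hosts whose trap target has no community string."""
    return [t.host for t in map(parse_trap_target, specs) if t.community is None]
```

Calling `missing_community(["myserver.local@162/DELLOME", "myserver.local"])` would single out the second entry, which is the kind of target the error above complains about.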
The SNMP settings for ESX are stored in the XML file /etc/vmware/snmp.xml. You can either clear this file (cat /dev/null > /etc/vmware/snmp.xml) or, if you know what the setting should be, modify it. In my case I needed to update the <targets></targets> XML tag to have a community string:
When someone says their virtual machine is running slowly, the first thing we do is check the performance graphs in the vSphere client. Everything looks normal, and so it is easy to dismiss it as an application issue. However, what happens when the issue is the result of a previously undetected problem that appears to have multiple components to it?
This is the problem I am currently working on with a Dell R910 server. The Dell R910 is quite a powerful machine designed for more intensive workloads, so it certainly comes as a surprise that we are seeing performance issues.
This is an issue still being worked on, and I do not know what the outcome will be. However, I will be documenting the steps being taken to benchmark performance and how we might improve it.
Where I work we have recently acquired a number of Dell R620 servers to be used in remote locations. Personally I think they are a great all-round server for the light to medium workloads seen on the infrastructure we manage.
I’ve been pushing hard to use automated deployment systems that rely on the iDRAC interface, with the plan being that people rack and stack a previously unopened server at the remote site, use the front panel to configure the iDRAC network settings and then come back to work to provision the server remotely. The default username and password for the iDRAC are well known. That worked until recently, when we placed a server on site, confirmed we could contact the iDRAC remotely and came back, only to find the defaults not working.
Judging by the batch of servers we have, and by some anecdotal reports from friends deploying Dell servers, it appears that some servers’ iDRAC interfaces ship with a “faulty” default password. This was confirmed with a follow-up call to Dell support. The current fix is (you guessed it) to ensure you set the password before going out on site. Whilst an inconvenience, luckily this was discovered after one server went out, and not all of them.
So just a heads-up: if you are relying on your iDRAC to remotely provision brand new boxes, you may want to set the username/password yourself rather than assuming the defaults will work.
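One way to catch this before a server leaves the building is to test the login from a workstation. The Python sketch below is only an illustration under several assumptions: Dell’s remote racadm client is installed and on the PATH, the helper names are invented for this example, root/calvin are the documented factory defaults being tested, and a failed login is assumed to show up as a non-zero exit code or an “ERROR” line in the output (worth verifying against your racadm version). Be aware that passing the password on the command line exposes it to other local users.

```python
import subprocess
from typing import Callable, List

RACADM = "racadm"  # assumption: the remote racadm client is on the PATH


def build_check_command(host: str, user: str = "root",
                        password: str = "calvin") -> List[str]:
    """Build a remote racadm call that only succeeds after a valid login."""
    return [RACADM, "-r", host, "-u", user, "-p", password, "getsysinfo"]


def credentials_work(host: str, user: str = "root", password: str = "calvin",
                     run: Callable = subprocess.run) -> bool:
    """Return True if racadm could log in to the iDRAC with these credentials.

    `run` is injectable so the check can be exercised without real hardware.
    """
    result = run(build_check_command(host, user, password),
                 capture_output=True, text=True, timeout=60)
    # Assumption: login failures give a non-zero exit code or an ERROR line.
    return result.returncode == 0 and "ERROR" not in (result.stdout or "")
```

If `credentials_work("192.0.2.10")` (an example address) comes back False on a factory-fresh box, set the password by hand before it ships.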
It’s been a little while since my last post, as I have been involved with some major projects. Now that much of that work has been completed, I am getting back into the configuration of iDRACs. With an increasing number of Dell servers appearing at work, I want to get on top of a standard configuration as soon as possible.
My iDRAC PS Library has been out there for a while, so it would seem appropriate to create a bulk updating script with it.
Please be aware that in order to run this script you will need the RAC PS Library and the racadm iDRAC tools from Dell (available on the scripts page). Please make sure you configure the RACLibrary $racadmpath or the scripts will not run.
Some nice new shiny arrived at work for me this week, a Dell R620 that will mark a shift in strategy for some of our key architecture.
I have been really looking forward to playing with Dell’s new 12G servers, because as some of you know, I have had some interesting issues with the 11G Servers. So, here are some dot points on what I think about the new servers.
They are quiet – I booted this beast in our build room, and even at the pre-boot stage the fans were a LOT quieter than its predecessor. Once booted, I actually found the ambient noise in the room louder than the server itself (that said, the room isn’t exactly a soundproof chamber either).
Memory Configuration appears faster – These servers are configured with 32GB of Memory, which is substantially less than the R910s and R810s we otherwise have, but either way the time it takes to “configure memory” is a lot less.
USC Boot/Software Inventory Faster – This I REALLY like. On our 11G servers, booting into the Unified Server Configurator could take upwards of 10 minutes, and a lot of this seems to be down to the server inventory process – I am pleased to say I didn’t have to wait more than a couple of minutes for this to happen on the R620.
The USC doesn’t appear to have had any major changes, though there have been some cosmetic enhancements.
This week I hope to sit down and set one of these machines up with an OS, and will write a bit more then!