So here’s a question I want you to try answering off the top of your head: which certificate is your domain controller using for Kerberos and LDAPS, and what happens when there are multiple certificates in the crypto store?
The answer is pretty obvious once you already know it, but this was the question I faced recently, and I ended up having to do a little poking around to answer it.
The scenario in question for me: having built a new multi-tier PKI in our environment, I have reached the point of migrating services to it, including the auto-enrolled certificate templates used on Domain Controllers.
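One way to see which certificate a DC has actually picked is to look at what it serves on the LDAPS port and compare the thumbprint against what is in the machine's certificate store. Below is a minimal sketch using Python's standard library; the hostname is a placeholder, not a real machine of mine:

```python
import hashlib
import ssl


def thumbprint(der_bytes):
    """SHA-1 thumbprint of a DER-encoded certificate, in the same
    uppercase-hex form the Windows certificate MMC displays."""
    return hashlib.sha1(der_bytes).hexdigest().upper()


def ldaps_thumbprint(host, port=636):
    """Fetch the certificate the server presents on the LDAPS port
    and return its thumbprint."""
    pem = ssl.get_server_certificate((host, port))
    der = ssl.PEM_cert_to_DER_cert(pem)
    return thumbprint(der)


# Example (hypothetical host name):
#   print(ldaps_thumbprint("dc01.example.com"))
```

You can then match the printed thumbprint against the certificates listed by `certutil -store My` on the DC to see which one Schannel selected.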
Have you ever looked in OpenManage Essentials and seen the above when looking at a device? I recently had this experience while checking on a number of older servers from which we were not receiving alerts properly.
Checking the iDRAC and the server, it appeared that the management agents were running and correctly configured so that the OME server could contact the device, yet attempts to discover and inventory it were still failing. What was going on?
The Dell Troubleshooting Tool is an excellent utility from Dell for interrogating devices using a variety of protocols. Querying the iDRAC over WSMAN soon found the problem:
Many of the older servers had internal SSL certificates installed on them, which had subsequently expired. As most of the servers had been decommissioned, renewing the certificates had been overlooked.
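If you want to sweep your remaining servers for the same problem without the Dell tool, one option is to attempt a normally verified TLS handshake against each iDRAC and look at the failure reason. A rough sketch in Python follows; note that factory self-signed iDRAC certificates will also fail verification, just with a different message than an expired one:

```python
import socket
import ssl
import time


def is_expired(not_after, now=None):
    """True if a certificate 'notAfter' string (in the format used by
    Python's ssl module, e.g. 'Jun 27 12:00:00 2018 GMT') is in the past."""
    expiry = ssl.cert_time_to_seconds(not_after)
    return expiry < (time.time() if now is None else now)


def tls_verify_error(host, port=443, timeout=5.0):
    """Attempt a verified TLS handshake; return the verification failure
    message ('certificate has expired', 'self-signed certificate', ...)
    or None if the handshake succeeds."""
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return None
    except ssl.SSLCertVerificationError as exc:
        return exc.verify_message
```

Looping `tls_verify_error` over a list of iDRAC addresses quickly separates the hosts with expired certificates from the healthy ones.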
Getting rid of the expired certificate is not as straightforward as it should be, as the iDRAC6 web interface has no option to delete it. This was resolved by accessing the iDRAC via SSH and using the racadm command:
After the iDRAC interface rebooted to apply the changes, OME was able to discover and inventory it.
The Dell Troubleshooting Tool proved to be a very useful addition to the infrastructure admin’s toolbox for dealing with non-obvious management protocol issues.
Having covered the methodology in Part 1 and the bare-metal results in Part 2, we now get to what was perhaps the more controversial aspect of my testing where I worked: performance under a virtualisation platform. We primarily use VMware, so the tests were run on it.
Disclaimer: the numbers obtained below are not indicative of the true performance of the servers and should not influence any purchasing decisions.
With virtualisation I’m conservative, with a general expectation of some minimal performance degradation of under 10%. The first tests consisted of creating an empty VM and then running the PTS Live CD, as we thought this would rule out any disk I/O effects. I compared the results to bare metal for each server and was surprised by the results:
Benchmarking needs to be considered very carefully and objectively: not all benchmarks are equal. The Phoronix Test Suite was good in the sense that you can benchmark specific workloads, and I chose to focus on its Apache and PostgreSQL tests, as these closely represented the workloads whose performance I needed to improve. At this point it was decided to use PTS Desktop Live (http://www.phoronix-test-suite.com/?k=pts_desktop_live), as it was felt this would ensure all things were equal, regardless of platform.
Not everything was exactly equal, though: none of the hardware was like-for-like in specification. As I was working towards matching and improving the score of the R910 against the DL380 G5, the best I could do was ensure that the benchmarks themselves were consistent, which the Live CD achieved by guaranteeing that the same Linux build and benchmarking tools were used on every machine. I will discuss additional factors affecting the results as I continue through this series.
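For reference, the degradation figure quoted above is just the relative drop in benchmark score between bare metal and the VM. A trivial helper makes the arithmetic explicit; the numbers in the example are made up for illustration, not my actual results:

```python
def degradation_pct(bare_metal, virtual):
    """Percentage drop from the bare-metal score to the virtualised score,
    for benchmarks where a higher score is better (e.g. requests/sec)."""
    return (bare_metal - virtual) / bare_metal * 100.0


# Hypothetical example: 5000 req/s on bare metal dropping to 4600 req/s
# under the hypervisor is an 8% degradation, inside a sub-10% budget.
```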
When someone says their virtual machine is running slowly, the first thing we do is check the performance graphs in the vSphere client. Everything looks normal, so it is easy to dismiss it as an application issue. But what happens when the issue is the result of a previously undetected problem that appears to have multiple components to it?
This is the problem I am currently working on with a Dell R910 server. The R910 is quite a powerful machine designed for more intensive workloads, so it certainly comes as a surprise that we are seeing performance issues.
The issue is still being worked on, and I do not know what the outcome will be, but I will be documenting the steps taken to benchmark the performance and how we might improve it.
HP released an advisory last week detailing an issue when upgrading the firmware of iLO 3 interfaces from particular firmware versions to version 1.5. This appears to affect updates made using the hponcfg utility, the command line (CLI) interfaces and the HPSUM update tools.
The good news is that there is a fix, with HP releasing revised toolsets on the iLO 3 site and advising that versions 4.1.0 or later of the tools resolve the issue.