April 4, 2008: Nothing to See Here

Wet and rainy in New Jersey today. It is surprisingly warm today. Much warmer than I was anticipating and I had to carry my fleece most of the way to the office. It is Friday so I get to go in a little late although I am sure that it will be a busy day overall. I ate breakfast at Airlie Cafe on my way in this morning.

I finished reading the Pragmatic Programmers’ “Interface Oriented Design” this morning.  I suppose that now I should move on to reading my textbook again.

My day was mostly uneventful. Just normal work stuff. I am wearing a new shirt today though. It is blue. I like it. Poplin.

It was pretty late when I finally got the chance to leave the office. A little after eight. We had late deployments and a few problems that needed to be resolved before calling it an evening. I ate dinner at Steve’s Pizza just off of Cedar in Manhattan on my way home to save time.

It was after nine when I got home. We stayed up and watched some of the eighth season of That 70s Show and then we were off to bed. Tomorrow is an all homework day for both of us.

Linux Processor Ignored

WARNING: NR_CPUS limit of 1 reached. Processor ignored.

Not exactly the error message that you were hoping to see when you were checking your dmesg logs.  Don’t panic, this is easily remedied.  If you are wondering how to check your own Linux system for this error you can look by using this command:

dmesg | grep -i cpu

This error occurs on a multiple logical processor system when a uniprocessor kernel is loaded.  What the error indicates is that one CPU is being used and that more have been found but are being ignored.  The system should come online correctly but with only a single logical CPU.  (For a detailed discussion on logical processors see CPUs, Cores and Threads.)
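If you would rather script this check than eyeball the output, a minimal sketch might look like the following. The function name count_ignored_cpus is made up for illustration, and the pattern assumes the warning text matches the message shown above.

```shell
# count_ignored_cpus (hypothetical helper): read kernel log text on stdin
# and print how many "NR_CPUS limit ... reached" warnings it contains.
count_ignored_cpus() {
    # grep -c prints 0 but exits non-zero when nothing matches,
    # so "|| true" keeps a clean result from looking like a failure.
    grep -c 'NR_CPUS limit of [0-9]* reached' || true
}
```

You would feed it the kernel log, for example with dmesg | count_ignored_cpus. Any result above zero means that at least one logical processor was found and ignored.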

In today’s market full of multi-core CPU products and hyperthreading this error message has moved from the exclusive realm of multi-socket servers to the home desktop and laptop.  It is now a potentially common sight for many casual Linux users.

To correct this issue on a Red Hat, CentOS or Fedora Linux system all you will need to do is make a simple change to your GRUB configuration to tell it to point to a symmetric multiprocessing (smp) kernel rather than the uniprocessor kernel. The file that you will need to edit is /etc/grub.conf.  After some header comments the beginning of your file should look something like this:

default=1
timeout=5
splashimage=(hd0,0)/grub/splash.xpm.gz
hiddenmenu
title CentOS (2.6.9-67.0.7.plus.c4smp)
     root (hd0,0)
     kernel /vmlinuz-2.6.9-67.0.7.plus.c4smp ro root=/dev/VG0/LV0
     initrd /initrd-2.6.9-67.0.7.plus.c4smp.img
title CentOS (2.6.9-67.0.7.plus.c4)
     root (hd0,0)
     kernel /vmlinuz-2.6.9-67.0.7.plus.c4 ro root=/dev/VG0/LV0
     initrd /initrd-2.6.9-67.0.7.plus.c4.img

The GRUB configuration file can appear daunting at first but, in reality, it is quite simple to deal with.  The only line that we need to modify is the “default” line.  In this case it is set to 1.  The grub.conf file contains a list of available kernels for us to use.  We may have just one or possibly several, maybe even dozens.  In this case we see two.  You can see here that we have a CentOS 2.6.9 c4smp and a CentOS 2.6.9 c4 kernel.  You only need to be concerned with the “title” lines.  These are your kernel titles.  Normally the kernels of most interest will be at the top of the file.

You can check the name of the kernel that you are currently running by issuing:

uname -a

The first title line is kernel “0”, the second is kernel “1”, the next “2” and so forth.  Right now our “default” value is pointing to “1” which is the second kernel from the top and, as you will notice, not an smp kernel (therefore it is a uniprocessor kernel.)  In this case all we need to do is change the “default” value from “1” to “0” so that it now points to the first kernel option which for us is the smp kernel.
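Counting title lines by hand is easy to get wrong once the list grows, so here is a rough sketch of the same numbering done in shell. The names list_grub_entries and with_grub_default are hypothetical helpers of my own, and the sketch assumes the legacy grub.conf layout shown above with the default written as default=N.

```shell
# list_grub_entries FILE (hypothetical helper): print every "title" line
# with its zero-based index, marking the current default with "*".
list_grub_entries() {
    awk '
        BEGIN { n = 0; def = 0 }
        /^default=/ { split($0, kv, "="); def = kv[2] + 0 }
        /^[ \t]*title[ \t]/ {
            sub(/^[ \t]*title[ \t]+/, "")
            printf "%s%d: %s\n", (n == def ? "* " : "  "), n, $0
            n++
        }
    ' "$1"
}

# with_grub_default FILE INDEX (hypothetical helper): print FILE to stdout
# with the default line changed to INDEX. The file itself is not modified.
with_grub_default() {
    sed "s/^default=.*/default=$2/" "$1"
}
```

Running list_grub_entries /etc/grub.conf against the example file would show entry 1, the c4 kernel, marked as the default. with_grub_default only prints the edited text, so you can review it before saving anything over the real file.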

After your grub.conf file has been saved you may reboot the Linux system.  If all goes well it will return to you with additional logical processors enabled.  You can verify the name of the loaded kernel with the command given above.
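A quick post-reboot check could be sketched like this. Both function names are hypothetical, and the "smp" suffix test assumes the kernel naming used in the example above (older Red Hat style kernels); other kernels do not necessarily name their multiprocessor builds this way.

```shell
# count_logical_cpus [FILE] (hypothetical helper): count the "processor"
# entries the kernel reports. Defaults to /proc/cpuinfo.
count_logical_cpus() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# is_smp_release RELEASE (hypothetical helper): succeed if the release
# string ends in "smp", as in 2.6.9-67.0.7.plus.c4smp.
is_smp_release() {
    case $1 in
        *smp) return 0 ;;
        *)    return 1 ;;
    esac
}
```

For example, is_smp_release "$(uname -r)" && count_logical_cpus should print a number greater than one on a multi-core box that came up on the smp kernel.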

April 3, 2008: Vista SP1 Day

If you like IHOP, you’ll love SpyHop!

Today is my work from home day and I am taking advantage of it to do the Vista Service Pack 1 update on Dominica’s HP Compaq 6515b laptop. It is a pretty major change and includes the kernel change syncing Vista with Windows Server 2008. So I am very interested to see how the performance improvements are and how the file copy issue has been addressed. That has been affecting me so I was anxious to get SP1 put into place.

Dominica went shopping today at Target and picked up the eighth and final season of That 70s Show. I am pretty sure that I have only seen half of one episode out of this entire season and she doesn’t think that she has seen a single episode. So this will be a nice treat. The other seven seasons have been almost entirely “reruns” for us. It isn’t very often that we actually watch something that is completely “new”. Although we are wondering how good the final season is going to be. The show has been winding down and with Topher Grace gone in the final season we are wondering what they plan to do with the show to keep it going.

It was another busy day at the office. I did manage to do just a little bit of cleaning around the apartment as well. It really needed it. I seem to be spending just tons of time taking care of all of the cleaning. I can’t believe how often dishes need to be done, for example. I run the dishwasher every couple days and it is always half full of Oreo’s dishes. He goes through more dishes than Dominica and I combined, literally. He generates three or four pairs of plastic containers and lids per day, two food dishes per week and all of his food prep dishes once a week. It’s crazy.

Dominica got home and whipped up a vegetable stir-fry in an Indian coconut curry sauce. It was really good. No time for me to relax tonight, though, I have to do Object Technologies homework so that is my whole evening right there. Not fun but it has to be done. No way to put it off until tomorrow.

Before getting a chance to actually shut down for the night I was paged out for a small issue around twenty until midnight.  It is really tough to get any momentum going on homework – let me tell you.

April 2, 2008: Nothing Exciting Today

I worked at home until midnight last night. I went to bed when I was just too exhausted to be of any use. I took Oreo for a late night walk outside and discovered that it was raining. Raining quite hard in fact, but it didn’t start until the moment that Oreo and I went outside. Very bad timing. Oreo was quite unhappy about the situation and attempted to avoid his walk but it had to be done so we just stayed out there and got wet and cold.

I actually got myself up nice and early this morning. I have no idea how I managed to do that. I got up at a quarter after five and got into the shower and started the day. The walk into the office on Wall Street was very nice today. The sun was out and the air temperature was pretty nice. The wind was a bit much but not all that bad.

My morning ended up being pretty crazy. I had been watching my BlackBerry all morning on the way in and I was thinking that yesterday’s slowness was going to roll on over into today but that didn’t happen at all. Instead my morning was bonkers and I didn’t really get a chance to get up from my desk to do anything until a bit after two when I finally ran out to get myself a falafel pita sandwich just outside the building. If you follow that link to Google Maps’ street view you can’t see the falafel guys because they aren’t there but the white car is parked where my falafel guys park during business hours. Just north-east a few feet of the Wall St. – Water St. intersection.

My afternoon wasn’t so bad. Still kept me busy but things slowed down to a little more normal pace. I worked until normal time for a change and then stopped by Borders quickly on the way home and picked up a couple books. One for Min and the “Python Phrasebook” for myself.

We kind of skipped dinner tonight. I just had a bowl of cereal because I had left the “milk” out and it will go bad if I don’t drink it all right away. It isn’t real milk, just soy milk. So it doesn’t spoil like real milk nor is it nasty when it does. So not such a big deal. And we go through it very quickly anyway. Dominica drinks real milk for calcium. I stick to soy for lower cholesterol – and I just like it better.

Tonight was a homework night. I did Java programming all night and have that piece of my homework that is due on Friday completed. Now I just need to do the modeling piece tomorrow and I will be all set. I can’t do any on Friday as I always work late on Fridays and there is just no way to guarantee that I will be able to hand it in if I wait until then.

I went to bed and watched some of Allo, Allo to catch up with where Dominica was in the series. I will be working from home tomorrow.

Linux’ kscand

In Linux the kscand process is a kernel process which is a key component of the virtual memory (vm) system.  According to Unix Tech Tips & Tricks’ excellent Understanding Virtual Memory article “The kscand task periodically sweeps through all the pages in memory, taking note of the amount of time the page has been in memory since it was last accessed. If kscand finds that a page has been accessed since it last visited the page, it increments the page’s age counter; otherwise, it decrements that counter. If kscand finds a page with its age counter at zero, it moves the page to the inactive dirty state.”

For the majority of Linux users and even system administrators on large servers this kernel process requires no intervention.  It is a simple process that works in the background doing its job well.  Nonetheless, under certain circumstances it can become necessary to tune kscand in order to improve system performance in a desirable way.

Issues with kscand are most likely to arise in a situation where a Linux box has an extremely large amount of memory and will be even more noticeable on boxes with slower memory.  The most notable is probably the HP Proliant DL585 G1 which can support 128GB of memory but in doing so drops bandwidth to a paltry 266MHz.  I first came across this particular issue on a server with 32GB of memory with approximately 31.5GB of it in use.  No swap space was being used and most of the memory was being used for cache so there was no strain on the memory system but the total amount of memory being scanned by the kscand process is where the issue truly lies.

Even on a busy server with gobs of memory (that’s the technical term) it would be extremely rare that kscand would cause any issues.  It is a very light process that runs quite quickly.  You are most likely to see kscand as a culprit when investigating problems with latency sensitive applications on memory intensive servers.  The first time that I came across the need to tune kscand was while diagnosing a strange latency pattern of network traffic going to a high-performance messaging bus.  The latency was minor but small spikes were causing concern in the very sensitive environment.  kscand was spotted as the only questionable process receiving much system attention during the high latency periods.

Under normal conditions, that is default tuning, kscand will run every thirty seconds and will scan 100% of the system memory looking for memory pages that can be freed.  This sweep is quick but can easily cause measurable system latency if you look carefully.  Through careful tuning we can reduce the latency caused by this process but we do so as a tradeoff with memory utilization efficiency.  If you have a box with significantly extra memory or extremely static memory, such as large cache sizes that change very slowly, you can safely tune away from memory efficiency towards low latency with nominal pain and good results.

kscand is controlled by the proc filesystem with just the single setting of /proc/sys/vm/kscand_work_percent. Like any kernel setting this can be changed on the fly on a live system (be careful) or can be set to persist through reboots by adding it to your /etc/sysctl.conf file.  Before we make any permanent changes we will want to do some testing.  This kernel parameter tells kscand what percentage of the system memory to scan each time that a memory scan is performed.  Since it is normally set to 100 kscand normally scans all in-use memory each time that it is called.  You can verify your current setting quite easily.

cat /proc/sys/vm/kscand_work_percent
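On kernels without kscand the file simply will not exist and cat will complain. A slightly more forgiving version might look like this sketch. The function name kscand_percent is hypothetical, and the optional path argument is only there so that the function can be tried against a scratch file.

```shell
# kscand_percent [PATH] (hypothetical helper): print the current setting,
# or say so on stderr and fail if the tunable is not present.
kscand_percent() {
    path=${1:-/proc/sys/vm/kscand_work_percent}
    if [ -r "$path" ]; then
        cat "$path"
    else
        echo "no kscand tunable at $path (this kernel may not have kscand)" >&2
        return 1
    fi
}
```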

A good starting point with kscand_work_percent is to set to 50.  A very small adjustment may not be noticeable so seeing 100 and then 50 should provide a good starting point for evaluating the changes in system performance.  It is not recommended to set kscand_work_percent below 10 and I would be quite wary of dropping even below 20 unless you truly have a tremendous amount of unused memory and your usage is quite static.

echo 50 > /proc/sys/vm/kscand_work_percent
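Since a typo here goes straight into a live kernel, you may prefer to wrap the echo in a small guard. This is only a sketch: set_kscand_percent is a made-up name, and the 10 to 100 range simply encodes the floor recommended above rather than any limit enforced by the kernel.

```shell
# set_kscand_percent VALUE [PATH] (hypothetical helper): write VALUE to the
# tunable only if it is a whole number between 10 and 100.
set_kscand_percent() {
    value=$1
    path=${2:-/proc/sys/vm/kscand_work_percent}
    case $value in
        ''|*[!0-9]*)
            echo "not a whole number: $value" >&2
            return 2 ;;
    esac
    if [ "$value" -lt 10 ] || [ "$value" -gt 100 ]; then
        echo "refusing $value: outside the 10-100 range" >&2
        return 2
    fi
    echo "$value" > "$path"
}
```

Run as root, set_kscand_percent 50 does the same thing as the echo above, while set_kscand_percent 5 is rejected and leaves the current value alone.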

Once you have determined the best balance of latency and memory utilization that makes sense for your environment you can make your changes permanent.  Be sure to only use the echo technique if this is the first time that this will be added to the file. You will need to edit it by hand after that.

echo "vm.kscand_work_percent = 50" >> /etc/sysctl.conf

Keep in mind that the need to edit this particular kernel parameter is extremely uncommon and will need to be done only under extraordinary circumstances.  You will not need to do this in normal, everyday Linux life and even a senior Linux administrator could easily never have need to modify this setting.  Only very specific conditions will cause this performance characteristic to be measurable or its modification to be desirable.

All of my testing was done on Red Hat Enterprise Linux 3 Update 6.  This parameter is the same across many versions although the performance characteristics of kscand vary between kernel revisions so do not assume that the need to modify the parameters in one situation will mean that it is needed in another.

RHEL 3 prior to update 6 had a much less efficient kscand process and much greater benefit is likely to be found moving to a later 2.4 family kernel revision.  RHEL 4 and later, on the 2.6 series kernels, is completely different and the latency issues are, I believe, less pronounced.  In my own testing the same application on the same servers moving from RHEL 3 U6 to RHEL 4.5 removed all need for this tweak even under identical load.  [Edit – In RHEL 4 and later (kernel series 2.6) the kscand process has been removed and replaced with kswapd and pdflush.]

Things that are likely to impact the behavior of kscand that you should consider include the following:

  • Total Used Memory Size, regardless of total available memory size.  The more you have the more kscand will impact you.  Determined by: free -m | grep Mem | awk '{print $3}'
  • Memory Latency, check with your memory hardware vendor. Higher latency will cause kscand to have a larger impact.
  • Memory Bandwidth.  Currently in speeds ranging from 266MHz to 1066MHz.  The slower the memory the more likely a scan will impact you and tuning will be useful.
  • Value in kscand_work_percent. The lower the value the lower the latency.  The higher the value the better the memory utilization.
  • Memory Access Hops.  Number of system bus hops necessary to access memory resources.  For example a two socket AMD Opteron server (HP Proliant DL385) never has more than one hop.  But a four socket AMD Opteron server (HP Proliant DL585) can have two hops increasing effective memory latency. So a DL585 is more likely to be affected than a DL385 with all other factors being equal (as long as all three or four processor sockets are occupied.)
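To get a rough feel for the first and fourth factors together, the used-memory one-liner can be combined with the work percentage. This is a back-of-the-envelope sketch only: both function names are hypothetical, and the estimate simply takes the description above at face value (kscand covers that percentage of in-use memory on each pass).

```shell
# used_mem_mb (hypothetical helper): read "free -m" output on stdin and
# print the "used" column of the Mem: row.
used_mem_mb() {
    awk '/^Mem:/ { print $3 }'
}

# scan_mb_per_pass USED_MB PERCENT (hypothetical helper): rough number of
# megabytes one sweep would cover at the given kscand_work_percent.
scan_mb_per_pass() {
    echo $(( $1 * $2 / 100 ))
}
```

On a live box that would be scan_mb_per_pass "$(free -m | used_mem_mb)" 50. With roughly 31.5GB in use, as on the server described earlier, halving the setting halves the amount swept per pass.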