Sheep Guarding Llama :: Scott Alan Miller :: A Life Online

Solaris Dstream Package Format (Package Stream)
Thu, 20 Mar 2008

If you have worked on Solaris for a while you have probably stumbled across the package stream, or “dstream”, format sometimes used for Solaris packages. Dstreams can come as a surprise to administrators accustomed to the traditional package format, but they are very easy to work with once you know some basics.

First of all, there are two naming conventions for these packages. The most common, by far, ends the package name in .pkg, while the less common variant ends it in .dstream. Some people also leave the suffix off altogether, leaving it unclear what the file is intended to be.

Installing a dstream is only slightly different from installing a regular package. A dstream is much more similar to a Linux RPM in that it is a single, atomic file. Once installed it behaves just like any other Solaris package and can be managed and removed in the usual ways (pkginfo, pkgrm, and so on).

Installing is simple. Let’s assume that we are dealing with the package myNewSoftware.dstream which is saved in /tmp. To install simply:

pkgadd -d /tmp/myNewSoftware.dstream

But in some cases you may want access to the contents of the dstream without installing it first. If you are on Solaris this is easy: just use pkgtrans.

pkgtrans myNewSoftware.dstream .

Or you may need to get at the contents of the dstream without access to a Solaris machine or the pkgadd command. Do not fret. The solution is much simpler than you would imagine. The dstream is built around the cpio format, which we can extract using common tools. Unfortunately I have had some issues getting packages to unpack correctly using this trick. If anyone has additional insight into this process, please comment.

So to unpack, but not install, our previous example file on any UNIX box (or even Windows with cpio installed via Cygwin or a similar utility) we can simply:

cpio -idvu < myNewSoftware.dstream

or, alternatively (both work equally well for me):

dd if=myNewSoftware.dstream skip=1 | cpio -idvu

The “v” option just gives us verbose output so that we can see what we unpacked without having to look around for it. You will now have one or more directories, as contained in the cpio archive.
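As an aside on why the dd variant works: a Solaris package datastream normally begins with a short plain-text marker line ahead of the cpio data, and dd with skip=1 jumps past the first 512-byte block that holds that header. Here is a minimal sketch of checking for a marker line like that; the file and its contents are fabricated purely for illustration:

```shell
# Fabricate a file that begins with a datastream-style marker line,
# then read the first line back to confirm it is there.
printf '# PaCkAgE DaTaStReAm\nmyNewSoftware 1 180\n' > /tmp/demo.dstream
head -n 1 /tmp/demo.dstream
```

On a real package, seeing that marker as the first line is a quick way to confirm you are holding a datastream rather than a filesystem-format package.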

CPUs, Cores and Threads: How Many Processors Do I Have?
Fri, 07 Mar 2008

In my job role I am often called upon to determine how many “processors” a machine has or how many we will need for a specific task. Ten years ago this was simple, but today even the very concept of the “processor” is fuzzy at best and only a few people have a clear understanding of what it means. I spend much of my time explaining, as best I can, the terms needed to even discuss processors today, as everything processor-related must be seen in context.

Before we begin, let us look at the terms involved in discussing processors, starting from the bottom of the stack. At the bottom we have the chip carrier; this can be something as simple as a motherboard (a.k.a. mainboard, systemboard, MoBo), a processor daughtercard or a dedicated chip carrier. Any of these will qualify for our use. A chip carrier holds sockets; it can have a single socket or many.

On a standard desktop or laptop computer we would expect to find that we have a single motherboard containing a single socket. On a mid-sized server such as the HP DL360 G5 we see a single motherboard with two sockets. On a larger server such as the Sun SunFire x4600 we see several daughtercards each with a single socket but with a total of eight sockets available in the overall system.

The Intel Pentium III Slot 1 chips are a perfect example of a dedicated chip carrier. In the case of the Slot 1 Pentium III processors the processor itself was mounted directly onto a small daughterboard dedicated to carrying the Pentium III processor and its associated voltage-management electronics. This small card was enclosed in plastic for protection and attached directly to the motherboard.

Above the chip carrier layer is the socket. A socket is a physical connector allowing a chip to be connected to a board. Occasionally a chip as important as the CPU is connected to a board without a socket; this is more common in embedded systems and exceedingly rare in general-purpose computing. In that case the connection itself can be considered analogous to the socket. Including the socket in this explanation may seem confusing, but it is important because potential system capacity and classification are normally identified at this level.

It is in the counting of sockets per computer that we determine that maximum “way” of a server. For example the DL360 mentioned above is classified as a two-way server. And the x4600 is an eight-way server. This is the case when the server is at capacity. A particular server would be classified by the number of sockets in use. For example a DL385 with just one socket occupied could be considered a one-way server but with extra potential capacity. By adding another chip to the second socket we are said to be upgrading from a one-way to a two-way. Many server vendors have started advertising the “way” of their servers based on non-socket based factors but this practice is non-standard and highly misleading. Be sure to compare servers based on socket capacity and not on advertised “way”.

Each socket is capable of holding one physical processor. While sockets are purchased with the board to which they are attached, a chip can be purchased already in a socket or as a standalone product. Processors are often sold in stores in boxes just like any other product and are the most visible form of “processor” that consumers will face. This is the only “processor” that can be seen, held in the hand or bought as an item in a store: the physical manifestation of processing power. Just as socket count determines the maximum “way” of a computer, the processor count determines the current “way” of that computer. Most consumers and desktop administrators think of processors in terms of the physical chip, and if the term processor is to have an official usage this is the level at which it is most appropriate. Common examples of a processor include the AMD Opteron, Intel Core, Intel Pentium II, Sun UltraSPARC IV and IBM Power6.

The most important industry recognition of this level being the “processor” is that Microsoft, Oracle and most major software vendors use this definition to determine their per-processor licensing requirements. Because of this stance, and the word's long history of use in this context, we are likely to see “processor” remain linked to the physical entity.

Each processor chip can carry one or more dies within it. A die is not visible, as it is encased in the protective material of the processor. The die consists of the semiconductor substrate and is a discrete electrical element within the processor. A die is, in my opinion, the most difficult portion of a processor to define, as it is completely invisible unless you break the processor apart, and even then dies are extremely difficult to see because of their size and density.

A CPU, or Central Processing Unit, is, and has been, generally tied to a die: one die contains one CPU, so a die and a traditional CPU are roughly synonymous. Technically an important difference remains, because a die can contain components in addition to the CPU, such as support processing, and in a more general sense a die can contain integrated circuits other than a CPU. So the two words are not the same thing, even though they effectively are when we are only discussing general-use processors. Strangely, it is at this level that we get the term CPU, so commonly used yet so widely misunderstood.

Within a single CPU there can be one or more processing cores. A core is the real workhorse of the processor stack. It is within a core that the actual processing work is done. It is most common, today, for a CPU to contain only one core. There is a common misconception that this is not the case due to marketing efforts to convince people otherwise. Internal processor architecture should not be used as a marketing tool as it is simply confusing and misleading. Only a holistic view of processor performance characteristics can provide adequate comparisons when deciding on a processing platform. No single architectural element will have an impact large enough to be usable as a determining factor in processor selection. But more importantly it is not feasible for anyone who is not a chip architect with a solid grounding in IC design concepts to even remotely grasp the intricacies involved in the design of a microprocessor.

In traditional processors, like the Pentium III, there is one core per CPU. This is very simple. Some modern processors, such as the dual-die Intel Pentium D, still have only one core per CPU while packaging multiple CPUs per processor (each CPU is on a discrete die within the processor). Such a chip is a single processor with two dies, each with one CPU and one core, for a total of two cores per processor: it is multi-CPU rather than, strictly speaking, multi-core. In the AMD Opteron we see a single processor with a single die and a single CPU containing two cores. In this case we have a multi-core, single-CPU configuration: a true multi-core processor. Multi-core within a single die/CPU is an important distinction because it affects the ability of the components to communicate amongst themselves. Confusingly, marketing applies the word “core” loosely across both designs, which has led to a proliferation in the misuse of the term.

Cores are still an extremely important component to use in normal system discussions, however. Cores are discrete processing elements and therefore represent a very important look at our computers. By looking at cores we can see how many independent parallel actions can be taken by the processors at one time. This is very important for understanding the scaling and capacity abilities of our computers. A computer can only truly parallelize to the extent of its “core” capacity.

The final layer of our stack is the multithread (a.k.a. HyperThread, SuperThread, etc.). The best-known example is Intel's implementation in their Pentium 4 derived XEON processors, while in current use the Sun UltraSPARC T family is the poster child for multithreading. Multithreading does not truly add additional parallelism to the processing structure, but under certain loads it can make the processing pipeline more efficient and push multiple threads of execution into the processor roughly simultaneously. Multithreading is complicated, but in the absolutely simplest terms (and possibly the most useful to the layman looking to grasp the correct use of this technology) it can be thought of as allowing the processor to manage thread execution and scheduling instead of leaving this solely to the operating system. In reality what is performed is vastly more complicated than this.

Multithreading is useless for single-threaded workloads, where its mere presence can even degrade performance; it is most useful for highly threaded workloads and is currently seeing a lot of positive use with web servers and databases. To transfer decision making from the operating system to the multithreading portion of the processor, an MT processor presents each of its thread handlers to the operating system as a separate “logical processor”. It is at this point that we finally see the processor as viewed by the operating system. This “logical processor” is what we view in Microsoft's PerfMon or TaskMgr, or in top on Linux, and it is often what we think of as being the processor.
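To see the logical-processor count your own operating system is being presented with, most systems offer a one-line query. A minimal sketch (getconf works on Linux and many other UNIX-like systems; on Solaris, psrinfo reports the same information):

```shell
# Number of logical processors currently online, as seen by the OS.
getconf _NPROCESSORS_ONLN
```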

Now that we have been bombarded with terms, layers and models, we will look at a few examples to help determine how we should approach the classification of processors. We will look at the HP DL360 G2, the HP DL585 G2, the HP DL580 G4, the HP DeskPro d530 and the Sun SunFire T2000.

In our first example we will look at the very traditional and standard Hewlett-Packard/Compaq Proliant DL360 G2. This server has a single motherboard containing two processor sockets. Each socket accepts one Intel Pentium III-S processor (up to 1.4GHz.) At this level we can identify this server as a true two-way server. Each Pentium III-S processor contains a single die / CPU. Each CPU has one core and each core is natively threaded with no multithreading capabilities. So, in total, this server is a two-way server with two processors, two CPUs, two cores and two logical processors to present to the operating system. Very simple, very straightforward. Just as we expect a computer to behave.

Our second example is the Hewlett-Packard Proliant DL585 G2. This server has four processor sockets on its motherboard making it a true four-way server. Each socket can hold an AMD Rev F Opteron Dual-Core processor. Each Opteron, in this scenario, has a single die with a single CPU. Each CPU has two cores and each core has only the native thread handler providing a total of one logical processor per core. So our total is four-way, four die / CPU, eight core and eight logical processors presented to the operating system.

Our third example is the Hewlett-Packard Proliant DL580 G4. The Proliant DL580 G4 has a four socket motherboard capable of holding four Dual-Core Intel XEON 7000 series processors. This, like the DL585 G2, is a true four-way server when fully populated. Each XEON 7000 processor contains dual dies / CPUs and each CPU contains one core for a total of two cores per processor. Each core has a single native thread handler. So our total is four-way, eight die / CPU, eight core and eight logical processors presented to the operating system.

My desktop example is the Hewlett-Packard Compaq DeskPro d530. This desktop unit has the option of using the Intel Pentium 4 HyperThread processor, which is what makes it interesting for our purposes, so we will use that processor in our example. The DeskPro d530 has a motherboard that supports a single Pentium 4 (or Celeron 4) processor; like most desktops this is a one-way machine. Each Pentium 4 processor has a single die / CPU with a single execution core. A traditional Pentium 4 (or Celeron 4) core can execute just a single thread but, in our example, we use the HyperThread version of the P4, which can handle two simultaneous threads, presenting two logical processors to the operating system. So we have a one-way desktop with a single processor with a single CPU containing a single core with two multithread handlers presenting two logical processors.

To make this analysis more complicated, we must also be aware that because of single-thread performance problems on the Pentium 4 HT platform it was very common for HyperThreading to be disabled on these processors through a BIOS setting. In these cases the threading model returns to native and only a single logical processor is presented to the operating system. This is the only example of which I am aware of a processor having a selectable number of presentable logical processors. The efficacy of the HyperThread features depended on the operating system and the load characteristics. For example, Windows 98SE or ME running on the d530 could not even see the second logical processor because those systems only offer a uni-processor kernel, so HyperThreading is not even possible. With Windows 2000 or XP both logical processors were visible and usable, but some workloads, such as most video games at the time, could not take advantage of them while many business workloads could. Each user had to determine which mode made the most sense for them, adding to the complexity of the situation.

Our final example is the Sun SunFire T2000 server. The SunFire T2000 is a single socket motherboard designed to hold one UltraSparc T processor. This is a true one-way server. Each UltraSparc T processor has a single die / CPU. Each CPU contains either four, six or eight cores depending on the purchased configuration – we will use eight in our example. Each of these eight cores has four thread handlers. In this machine we therefore see a one-way server with a single processor with a single CPU containing eight cores and a total of thirty-two simultaneous multithreads being presented to the operating system as thirty-two logical processors.
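All of the totals in these examples come from one simple multiplication: sockets, times dies per processor, times cores per die, times threads per core. A quick sketch using the T2000 figures above:

```shell
# SunFire T2000: 1 socket, 1 die per processor, 8 cores, 4 threads per core.
sockets=1; dies=1; cores=8; threads=4
echo $(( sockets * dies * cores * threads ))   # prints 32 logical processors
```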

As computer systems continue to increase the number of logical processors being presented to the operating system the importance of efficient process and thread handling by the operating system kernel will continue to become more and more important.  Many traditional systems have not been able to handle multi-processor situations very efficiently, if at all, but today with the number of available logical processors skyrocketing even in desktops the need for good process and thread handling across a potentially large number of logical processors is extremely important.

As you can see the issue of determining the number of processors, cores, CPUs, etc. is extremely difficult.  It is clear why people have become confused and why marketing is playing such a significant role in determining the public’s perceptions of these architectural components.  The most important components to keep clear are the counts for way, processor, core and logical processor (virtual processor, processing thread, execution engine, etc.)  Underlying component issues, while important to be semantically correct and to understand the working of processors, are still underlying components and should not be thought of as being the defining characteristics of our computer systems today.

Rsync on Solaris 10
Sat, 09 Feb 2008

I was interested to get Rsync up and running on my Solaris 10 server – an UltraSPARC based SunFire V100. To my dismay the Rsync package is not available for Solaris 10 from SUN. So I decided to set out on a journey to discover how exactly to get Rsync “the right way” and to get it installed and working.

After much searching I discovered that the obvious SUNWrsync package does, in fact, exist, but not, at this time, for Solaris 10. Rather it is only available in Solaris Express (aka Solaris 11). This means it is not available on the standard Solaris 10 Update 4 or earlier installation CDs, and to my knowledge Rsync has not been available on any previous version of Solaris either. The version of Rsync in Solaris Express is currently 2.6.9, which is the latest release (dated November 6, 2006) according to the official Rsync project page.

Fortunately SUN has made Solaris Express available as a free download. Unfortunately it is a single DVD image that must be downloaded in three parts and then combined into a single, huge image. This is not nearly as convenient as having an online package repository from which a single package could be downloaded (hint, hint SUN!)

You will need to download all three files from SUN, unzip them and then concatenate them into a single 3.7GB ISO file from which you can extract the necessary package.

# unzip sol-nv-b64a-sparc-dvd-iso-a.zip
# unzip sol-nv-b64a-sparc-dvd-iso-b.zip
# unzip sol-nv-b64a-sparc-dvd-iso-c.zip
# cat sol*a sol*b sol*c > sol-nv-sparc.iso
# mkdir /mnt/iso
# lofiadm -a sol-nv-sparc.iso /dev/lofi/1
# mount -F hsfs -o ro /dev/lofi/1 /mnt/iso
# cd /mnt/iso/Solaris_11/Product/
# ls -l | grep rsync

You will now have the list of the two available Rsync packages: SUNWrsync and SUNWrsyncS. It is SUNWrsync that we are really interested in here. I like to move all of my packages that I am installing to my own personal repository so that I can keep track of what I am installing and to make it easier to build a matching machine or to rebuild this one. If you are going to use a repository in this way be sure to back it up or it won’t be very useful during a rebuild.

# cp -r SUNWrsync/ /data/PKG/
# pkgadd -d /data/PKG/

You will now be able to pick from the available packages in your repository and choose which to install. [Helpful hint: If you have a large number of packages in your personal repository, consider placing each package into its own directory. For example, make a directory called “rsync” or “rsync_client” and symlink (ln -s) back to the installation directory. This makes it easier and quicker to install a single package: you can simply “cd /data/PKG/rsync” and “pkgadd -d .” Much quicker and easier for large repos. By using the symlink method you maintain a single directory of the files while also having handy individual directories.]
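A quick sketch of that symlink layout; the paths under /tmp here are illustrative stand-ins for a real /data/PKG repository:

```shell
# One flat directory holding the package files, plus a friendly-named
# directory containing a symlink back to it.
rm -rf /tmp/PKG
mkdir -p /tmp/PKG/SUNWrsync /tmp/PKG/rsync
ln -s /tmp/PKG/SUNWrsync /tmp/PKG/rsync/SUNWrsync
readlink /tmp/PKG/rsync/SUNWrsync   # prints /tmp/PKG/SUNWrsync
```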

Once you have installed the Rsync client package it is ready to be used. Because we are not using Rsync in a server configuration we have no configuration to worry about. Rsync is most popularly used as a client over SSH, and unlike most network-aware tools, which require an SSH tunnel to be created for them, Rsync has SSH support built in, making it extremely easy to use.

Let's start with an example of moving files from one server to another.

/usr/bin/rsync -av remote-host:/data/home/ /data/home

In this example we are synchronizing the remote /data/home directory with the local one, pulling files from the remote host to the local. This is a one-way sync, so only files missing or changed locally are brought over, and files that exist locally but not remotely are left intact. This is a relatively safe process, but files of the same name will be overwritten, so practice with test directories until you are used to how it works. You can run a test with the -n option, which shows the output of the Rsync command without actually moving any files so that you can see what would have happened. Here is the same command run in “test” mode.

/usr/bin/rsync -avn remote-host:/data/home/ /data/home

With this particular Rsync package the default transport is SSH. This can be set explicitly, but that is not necessary. By using SSH you have the security of the SSH tunnel to protect your traffic, plus the ease of use that comes with not needing to run a daemon process on any server that you want to sync to or from. The final command that you will normally want to run involves the “-z” option, which turns on file compression and will normally decrease transfer time. The gzip algorithm used for the compression is very effective on text documents and general files and is quite fast, but on already-compressed files such as tgz, Z, zip, jpeg, jpg, png, gif, mp3, etc. it can, in the worst case, actually expand the files and will use a lot of CPU without increasing the transfer speed. So it is best to be aware of the file types that you will be transferring; for most users gzip is the right compression to use. So our final transfer command is:

/usr/bin/rsync -avz remote-host:/data/home/ /data/home
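As a footnote to the point about “-z”, the difference between compressible and already-compressed data is easy to demonstrate with gzip itself; the filenames below are made up for the demo:

```shell
# Repetitive text compresses dramatically. Running gzip over data that
# is already compressed would gain nothing and can even grow the file.
yes "a repetitive log line for the compression demo" | head -n 1000 > /tmp/sample.txt
gzip -c /tmp/sample.txt > /tmp/sample.txt.gz
wc -c < /tmp/sample.txt
wc -c < /tmp/sample.txt.gz
```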
