Performance Basics: Bottlenecks

Today’s computer systems are increasingly complex: they are built from components made by different suppliers, which are themselves built from components made by other suppliers, and so on. Even your cheapo laptop is made that way. The same applies to hardware and software alike (big software publishers license or “borrow” code from others).

All these components, hardware or software, can have an impact on performance, whether perceived (“my computer feels sluggish”) or measured (the time it takes to perform a certain task). Solving a performance problem often starts with identifying the one component that has the biggest negative impact on performance, the so-called “bottleneck”.

To take the simple example of a single computer (enterprise systems composed of a myriad of networked computers are much more complex), there are a number of basic components whose utilization must be measured in order to identify a bottleneck:

  • CPU
  • Disk (Storage)
  • Memory
  • Network

Plenty of software can measure the utilization of these components. Operating systems usually come with basic tools to monitor performance in real time: sar, iostat, vmstat, etc. on UNIX/Linux-style systems, Activity Monitor on Mac OS, and Task Manager or Resource Monitor on Windows (depending on the version).

Let’s say you are converting a video file from one format to another, which is becoming increasingly common. Here is what the iostat command reports on my Mac while iMovie is finishing the export of a video:

[Figure: iostat measurement while iMovie is converting a movie]

What matters here is the “id” subcolumn under “cpu”. It shows the idle time, that is, the percentage of time the CPU (or rather the combination of the two cores on my particular machine) spends not working. You can see that while iMovie is converting, idle time is below 20%, and when iMovie stops it goes up to about 70%. Meanwhile, the disks (first two columns) are not being used at all. This suggests that the CPU might be the “bottleneck”. Of course, you would have to use other tools such as Activity Monitor to measure memory usage and network traffic, and to double-check that the CPU is indeed the limiting factor, i.e. that the application is “CPU bound”.
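If you want to turn that eyeball check into something repeatable, here is a minimal Python sketch of the same reasoning. The function name, the 20% idle threshold, the 1 MB/s "quiet disk" cutoff and the sample numbers are all my own assumptions for illustration; they are not values read from the iostat output above, and the code does not parse iostat itself.

```python
from statistics import mean

# Hypothetical threshold: call the CPU saturated when it is idle less
# than 20% of the time on average. Tune this for your own system.
IDLE_THRESHOLD_PCT = 20.0


def looks_cpu_bound(idle_pct_samples, disk_mb_per_s_samples,
                    idle_threshold=IDLE_THRESHOLD_PCT,
                    disk_quiet_mb_per_s=1.0):
    """Return True when the CPU is mostly busy and the disks are mostly quiet."""
    cpu_busy = mean(idle_pct_samples) < idle_threshold
    disks_quiet = mean(disk_mb_per_s_samples) < disk_quiet_mb_per_s
    return cpu_busy and disks_quiet


# Made-up samples (idle %, disk MB/s), not the real measurements.
during_export = looks_cpu_bound([15, 18, 12, 17], [0.0, 0.0, 0.1, 0.0])
after_export = looks_cpu_bound([68, 71, 70, 72], [0.0, 0.0, 0.0, 0.0])
print(during_export, after_export)
```

A check like this only hints at a CPU-bound workload. Memory and network still have to be looked at separately before you can be sure.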

In a more complex enterprise environment, you would use more or less the same method: measure as much as you can and identify the “hot spots”.
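As a rough sketch of what "identify the hot spots" can look like once you have the numbers, the snippet below ranks components by utilization. The `hot_spots` function, the 80% threshold and the sample figures are hypothetical; how you collect the measurements depends entirely on your environment.

```python
def hot_spots(utilization_pct, threshold_pct=80.0):
    """Return the components at or above the threshold, busiest first."""
    busy = {name: pct for name, pct in utilization_pct.items()
            if pct >= threshold_pct}
    return sorted(busy, key=busy.get, reverse=True)


# Made-up utilization figures (percent) for the four basic components.
sample = {"cpu": 85.0, "disk": 2.0, "memory": 60.0, "network": 5.0}
print(hot_spots(sample))
```

With these sample numbers only the CPU crosses the threshold, so it is the first place to look.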

Once they are identified, you may (or may not) have a solution. The busy component might be a bottleneck simply because it has to work hard, and that is just the way it is: you might be able to upgrade it (or not, if you are already using the latest and greatest). It could also be that the busy component is a bottleneck because the software is overusing it unnecessarily. In that case, the software might be optimized. The recent audio driver problem on the Mac Pros is a prime example of a bug that put an uncalled-for strain on the CPU. It could also be that there is no solution: take the example of transmitting messages to Mars (the planet, not the chocolate bar), where you will always hit a “hard limit” imposed by physics in terms of transmission times.
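To put a number on the Mars example, here is a small back-of-the-envelope calculation of the one-way signal delay. The speed of light is exact by definition; the two Earth-Mars distances are approximate round figures that I am assuming for illustration, since the real distance changes continuously as both planets orbit the Sun.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # exact by definition

# Approximate Earth-Mars distances in km (assumed round figures).
CLOSEST_KM = 54.6e6
FARTHEST_KM = 401e6


def one_way_delay_minutes(distance_km):
    """Minimum time for a signal to cover distance_km, in minutes."""
    return distance_km / SPEED_OF_LIGHT_KM_PER_S / 60


for label, distance in (("closest", CLOSEST_KM), ("farthest", FARTHEST_KM)):
    print(f"{label}: about {one_way_delay_minutes(distance):.1f} minutes one way")
```

With these figures the delay comes out at roughly 3 minutes at best and over 22 minutes at worst, and no hardware upgrade or software optimization will shave anything off it.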

In conclusion, there is no definitive recipe. It is fun detective work, though!
