Workstations, ECC Memory and Thermal Engineering

Many desktop mainboards have dropped support for ECC (Error-Correcting Code) memory, which is an important tool for data integrity and stable computer operation. ECC lets memory bit errors be detected and corrected at the hardware level (typically correcting any single-bit error in a memory word and detecting double-bit errors), and it is standard in servers because of the greater reliability it provides. Without ECC, data can be silently corrupted, or a computer can become unstable when a pointer (a memory reference in a program) is corrupted, which can easily cause an unrecoverable crash. These bit flips are usually attributed to cosmic rays or neutron interactions, and they're more common than the exotic-sounding names might suggest.
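To make the correction mechanism concrete, here is a minimal sketch of a single-error-correcting, double-error-detecting (SECDED) code, using a textbook extended Hamming code on a 4-bit value. Real ECC memory controllers work on much wider words and may use a different code construction, so treat this as an illustration of the idea only; the function names `encode` and `decode` are made up for this example.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword.

    Index 0 holds the overall parity bit; indices 1..7 are the classic
    Hamming positions, with check bits at 1, 2 and 4.
    """
    d = [(nibble >> i) & 1 for i in range(4)]  # d[0] is the least significant bit
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]  # covers positions 1, 3, 5, 7
    code[2] = code[3] ^ code[6] ^ code[7]  # covers positions 2, 3, 6, 7
    code[4] = code[5] ^ code[6] ^ code[7]  # covers positions 4, 5, 6, 7
    code[0] = sum(code[1:]) % 2            # makes the total number of 1s even
    return code


def decode(code):
    """Return (nibble, status); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    # XOR of the positions of all set bits: 0 for a valid codeword,
    # otherwise the position of a single flipped bit.
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    parity_odd = sum(code) % 2 == 1

    if syndrome == 0 and not parity_odd:
        status = "ok"
    elif parity_odd:
        # An odd number of flips: assume exactly one and repair it.
        # syndrome == 0 here means the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # Non-zero syndrome with even parity: two bits flipped.
        return None, "uncorrectable"

    nibble = code[3] | (code[5] << 1) | (code[6] << 2) | (code[7] << 3)
    return nibble, status


word = encode(0b1011)
word[5] ^= 1                 # simulate a single bit flip in memory
print(decode(word))          # (11, 'corrected')
word[2] ^= 1                 # a second flip in the same word
print(decode(word))          # (None, 'uncorrectable')
```

The second case is the important one for stability: the error cannot be repaired, but it is detected and can be reported instead of being silently passed to the program.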

Desktop boards supported ECC more widely in the past, but the needs and specifications of desktop and server boards have seemingly diverged over time. Another factor is that ECC memory is often priced significantly higher than non-ECC. The physical differences between these memory types are relatively minor (typically 9 DRAM chips per rank instead of 8, giving a 72-bit data path instead of a 64-bit one), but the market for ECC memory and supporting boards is probably smaller than the consumer or business desktop market, so ECC volumes may be lower and prices thus higher. This perhaps forms a vicious cycle: as ECC becomes more expensive it is used less, and as it is used less, volumes fall and it becomes more expensive still.
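The "9 chips instead of 8" ratio follows from the arithmetic of a SECDED code. As a rough sketch, assuming a Hamming-style code (actual memory controllers may use a different construction with the same check-bit count), the number of check bits needed for a given data width can be computed as follows; `secded_check_bits` is a name invented for this example.

```python
def secded_check_bits(data_bits):
    """Check bits needed for single-error correction plus double-error detection.

    Hamming bound: r check bits must be able to name every bit position plus
    the "no error" case, so 2**r >= data_bits + r + 1. One extra overall
    parity bit then adds double-error detection.
    """
    r = 1
    while 2 ** r < data_bits + r + 1:
        r += 1
    return r + 1


for width in (8, 16, 32, 64):
    check = secded_check_bits(width)
    print(f"{width} data bits + {check} check bits = {width + check} total")
```

For a 64-bit memory word this gives 8 check bits, for 72 bits in total, which is one extra 8-bit-wide DRAM chip alongside the 8 that hold the data. Narrower words pay proportionally more (8 data bits need 5 check bits), which is one reason ECC is applied across the full 64-bit width.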

Another general problem with desktop boards compared to server boards is their poor thermal management. Server boards generally have airflow and heat flow that is strictly in one direction, usually front to back. Desktop boards often blow heat in seemingly random directions. It's common for CPU heatsinks to be cooled with a downward airflow that pushes the heat towards both the back and the front of the computer. DRAM modules are often oriented lengthwise towards the sides of the board, which is yet another direction and is perpendicular to a front-to-back airflow. Having air and heat trying to move in many different directions probably does not help get the heat out of the case, and probably recirculates some of it inside the case instead.

One solution to both the ECC and cooling issues is to use a server or server-derived board as a desktop board. Server specialist Super Micro seems to have done this with some of their boards such as the X8SAX, which is available as of mid-2009 using the late-2008 Intel X58 chipset and supporting late-2008, high-performance, 45nm "Nehalem" Xeon 5500 processors. The X58 connects to the processor over the QuickPath Interconnect (QPI), while the DDR3 memory controller is integrated into the Nehalem processor itself; the board supports both ECC and non-ECC DDR3 memory. The X8SAX is slightly different from Super Micro's related Intel X58 chipset server boards in that it has no onboard video, but does have onboard audio. This is essentially the opposite of what a server board needs, which would usually be minimalist console video and no audio, so it's pretty clearly designed for the desktop workstation market. Presumably other board makers have similar approaches, but Super Micro has my attention as a maker of very popular and widely used server boards.

Super Micro also has utilitarian-looking but very well-engineered mid-tower cases for these boards, with a proper front-to-back airflow supported by large thermally controlled intake and exhaust fans, multiple removable hard drive sleds, etc. An example is the