We have recently picked up a contract role for maintaining an aging SGI storage cluster.

The SGI units are re-badged and sold by other companies; Dell has them listed as a “Scalable 10Gb iSCSI storage MD3036e” (see bottom unit in image), and they are now very much end of life.

Each unit holds upwards of 60 disks. At this site many of the arrays have 900GB SAS disks installed, and around a dozen of these units are connected via a highly interconnected InfiniBand network.
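For a rough sense of scale, here is a back-of-the-envelope calculation using the numbers above. It assumes every enclosure is fully populated with the 900GB drives, which is an assumption rather than an audited figure, and usable capacity will of course be considerably lower once parity and spare capacity are taken out.

```python
# Back-of-the-envelope raw capacity from the figures mentioned above
# (60 bays per unit, 900GB SAS disks, roughly a dozen units).
# Full population is assumed; these are not measured numbers.

DISKS_PER_UNIT = 60
DISK_SIZE_TB = 0.9      # 900GB SAS drives
UNITS_ON_SITE = 12      # "about a dozen"

raw_per_unit_tb = DISKS_PER_UNIT * DISK_SIZE_TB
raw_total_tb = raw_per_unit_tb * UNITS_ON_SITE

print(f"Raw capacity per unit: {raw_per_unit_tb:.0f} TB")   # 54 TB
print(f"Raw capacity on site:  {raw_total_tb:.0f} TB")      # ~648 TB
```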

The 5500 can have expansion units daisy-chained off it, so additional disks are presented through the head unit.

Networking Overview

The storage is networked using Mellanox InfiniBand. In a typical High Performance Compute environment, three primary InfiniBand switches sit at the top of the stack, with six leaf switches beneath them and each leaf switch connecting to each head switch.

Each storage unit is connected to the leaf switches via InfiniBand, and each array has dual 40/56Gb InfiniBand links.
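To make the wiring a little more concrete, here is a small sketch that counts the cables implied by that leaf/spine layout and the figures in this post (three head switches, six leaf switches, a dozen dual-connected arrays, 34 compute nodes). It is purely illustrative; the single link per compute node is an assumption, and switch port counts aren't covered here.

```python
# Toy model of the InfiniBand fabric described above.
# Assumptions: every leaf switch uplinks to every head switch, every array
# has dual links into the leaf layer, and every compute node has one link.

HEAD_SWITCHES = 3
LEAF_SWITCHES = 6
STORAGE_ARRAYS = 12        # "about a dozen" units
LINKS_PER_ARRAY = 2        # dual 40/56Gb links
COMPUTE_NODES = 34

uplinks = LEAF_SWITCHES * HEAD_SWITCHES          # full mesh, leaf -> head
storage_links = STORAGE_ARRAYS * LINKS_PER_ARRAY
compute_links = COMPUTE_NODES                    # assuming one HCA port each

print(f"Leaf-to-head uplinks:  {uplinks}")        # 18
print(f"Storage links:         {storage_links}")  # 24
print(f"Compute links:         {compute_links}")  # 34
print(f"Approx. total cables:  {uplinks + storage_links + compute_links}")
```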

In addition to the storage, each node of the High Performance Compute cluster (34 nodes) is also connected to the InfiniBand network, giving it access to all of the data served from the file servers and from a special file system: the Lustre parallel file system.

There are additional nodes for admin and Object Storage, but that's covered in another post.

How the storage uses disk is interesting, and this paper from SGI sums it up best. If you have used an MD or DS storage array, the GUI shown will look familiar 🙂

PDF: Comparing Dynamic Disk Pools
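For readers who skip the PDF, the gist (as I understand it) is that Dynamic Disk Pools spread stripe pieces pseudo-randomly across every drive in the pool instead of binding drives into fixed RAID groups, so a rebuild after a drive failure pulls data from nearly every surviving drive at once. The toy sketch below illustrates the idea only; the 8+2 stripe width and the 60-drive pool size are assumptions taken from common DDP descriptions, not from this particular installation.

```python
import random

# Toy illustration of the Dynamic Disk Pool idea: each stripe places its
# pieces on a random subset of drives in the pool, rather than on a fixed
# RAID group. Stripe width of 10 (8 data + 2 parity) is an assumption
# based on how DDP is commonly described.

POOL_DRIVES = 60
STRIPE_WIDTH = 10
NUM_STRIPES = 5000

random.seed(0)
stripes = [random.sample(range(POOL_DRIVES), STRIPE_WIDTH)
           for _ in range(NUM_STRIPES)]

failed_drive = 17
affected = [s for s in stripes if failed_drive in s]

# Drives holding the surviving pieces of the affected stripes --
# i.e. the drives a rebuild would read from.
rebuild_sources = {d for s in affected for d in s if d != failed_drive}

print(f"Stripes touching drive {failed_drive}: {len(affected)} of {NUM_STRIPES}")
print(f"Drives involved in the rebuild: {len(rebuild_sources)} of {POOL_DRIVES - 1}")
# With a classic 10-drive RAID6 group, only the 9 surviving group members
# would take part in the rebuild; here essentially the whole pool does.
```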

Todo: image of the top-level network will go here 🙂

Close-up of the 5 drawers, each with 12 disk bays.

Our role is to maintain it until the newer replacement rolls in shortly; then we get to migrate the data off it to the new storage cluster.

Some additional links of relevance: