
Chapter 3  Installation

NOTE: The installation procedure given here for the SBUS controller is similar to that found in the manual. It has been modified so minor variations in the SPARCLinux installation may be included.

3.1  SBUS Controller Compatibility

The 5070 / Linux 2.2 combination was tested on SPARCstation (5, 10, & 20), Ultra 1, and Ultra 2 Creator. The 5070 was also tested on Linux with Symmetric Multiprocessing (SMP) support on a dual processor Ultra 2 Creator 3D with no problems. Other 5070 / Linux / hardware combinations may work as well.


3.2  Hardware Installation Procedure

If your system is already up and running, you must halt the operating system.

GNOME:
  1. From the login screen right click the "Options" button.
  2. On the popup menu select System -> Halt.
  3. Click "Yes" when the verification box appears
KDE:
  1. From the login screen right click shutdown.
  2. On the popup menu select shutdown by right clicking its radio button.
  3. Click OK
XDM:
  1. Log in as root.
  2. Left click on the desktop to bring up the pop-up menu.
  3. Select "New Shell".
  4. When the shell opens, type "halt" at the prompt and press return.
Console Login (systems without X windows):
  1. Log in as root.
  2. Type "halt".
All Systems:
Wait for the message "power down" or "system halted" before proceeding. Turn off your SPARCstation system (Note: Your system may have turned itself off following the power down directive), its video monitor, external disk expansion boxes, and any other peripherals connected to the system. Be sure to check that the green power LED on the front of the system enclosure is not lit and that the fans inside the system are not running. Do not disconnect the system power cord.
SPARCstation 4, 5, 10, 20 & UltraSPARC Systems:
  1. Remove the top cover on the CPU enclosure. On a SPARCstation 10, this is done by loosening the captive screw at the top right corner of the back of the CPU enclosure, then tilting the top of the enclosure forward while using a Phillips screwdriver to press the plastic tab on the top left corner.
  2. Decide which SBUS slot you will use. Any slot will do. Remove the filler panel for that slot by removing the two screws and rectangular washers that hold it in.
  3. Remove the SBUS retainer (commonly called the handle) by pressing outward on one leg of the retainer while pulling it out of the hole in the printed circuit board.
  4. Insert the board into the SBUS slot you have chosen. To insert the board, first engage the top of the 5070 RAIDium backpanel into the backpanel of the CPU enclosure, then rotate the board into a level position and mate the SBUS connectors. Make sure that the SBUS connectors are completely engaged.
  5. Snap the nylon board retainers inside the SPARCstation over the 5070 RAIDium board to secure it inside the system.
  6. Secure the 5070 RAIDium SBUS backpanel to the system by replacing the rectangular washers and screws that held the original filler panel in place.
  7. Replace the top cover by first mating the plastic hooks on the front of the cover to the chassis, then rotating the cover down over the unit until the plastic tab in back snaps into place. Tighten the captive screw on the upper right corner.
Ultra Enterprise Servers, SPARCserver 1000 & 2000 Systems, SPARCserver 6X0 MP Series:
  1. Remove the two Allen screws that secure the CPU board to the card cage. These are located at each end of the CPU board backpanel.
  2. Remove the CPU board from the enclosure and place it on a static-free surface.
  3. Decide which SBUS slot you will use. Any slot will do. Remove the filler panel for that slot by removing the two screws and rectangular washers that hold it in. Save these screws and washers.
  4. Remove the SBUS retainer (commonly called the handle) by pressing outward on one leg of the retainer while pulling it out of the hole in the printed circuit board.
  5. Insert the board into the SBUS slot you have chosen. To insert the board, first engage the top of the 5070 RAIDium backpanel into the backpanel of the CPU enclosure, then rotate the board into a level position and mate the SBUS connectors. Make sure that the SBUS connectors are completely engaged.
  6. Secure the 5070 RAIDium board to the CPU board with the nylon screws and standoffs provided on the CPU board. The standoffs may have to be moved so that they match the holes used by the SBUS retainer, as the standoffs are used in different holes for an MBus module. Replace the screws and rectangular washers that originally held the filler panel in place, securing the 5070 RAIDium SBus backpanel to the system enclosure.
  7. Re-insert the CPU board into the CPU enclosure and re-install the Allen-head retaining screws that secure the CPU board.
All Systems:
  1. Mate the external cable adapter box to the 5070 RAIDium and gently tighten the two screws that extend through the cable adapter box.
  2. Connect the three cables from your SCSI devices to the three 68-pin SCSI-3 connectors on the Antares 5070 RAIDium. The three SCSI cables must always be reconnected in the same order after a RAID set has been established, so you should clearly mark the cables and disk enclosures for future disassembly and reassembly.
  3. Configure the attached SCSI devices to use SCSI target IDs other than 7, as that is taken by the 5070 RAIDium itself. Configuring the target number is done differently on various devices. Consult the manufacturer's installation instructions to determine the method appropriate for your device.
  4. As you are likely to be installing multiple SCSI devices, make sure that all SCSI buses are properly terminated. This means a terminator is installed only at each end of each SCSI bus daisy chain.
Verifying the Hardware Installation:
These steps are optional but recommended. First, power-on your system and interrupt the booting process by pressing the "Stop" and "a" keys (or the "break" key if you are on a serial terminal) simultaneously as soon as the Solaris release number is shown on the screen. This will force the system to run the Forth Monitor in the system EPROM, which will display the "ok" prompt. This gives you access to many useful low-level commands, including:
ok show-devs
. . .
/iommu@f,e0000000/sbus@f,e000100SUNW, isp@1,8800000
. . .
The first line in the response shown above means that the 5070 RAIDium host adapter has been properly recognized. If you don't see a line like this, you may have a hardware problem.
Next, to see a listing of all the SCSI devices in your system, you can use the probe-scsi-all command, but first you must prepare your system as follows:
ok setenv auto-boot? false
ok reset
ok probe-scsi-all
This will tell you the type, target number, and logical unit number of every SCSI device recognized in your system. The 5070 RAIDium board will report itself attached to an ISP controller at target 0 with two Logical Unit Numbers (LUNs): 0 for the virtual hard disk drive, and 7 for the connection to the Graphical User Interface (GUI). Note: the GUI communication channel on LUN 7 is currently unused under Linux. See the discussion under "SCSI Monitor Daemon (SMON)" in the "Advanced Topics" section for more information.
REQUIRED: Perform a reconfiguration boot of the operating system:
ok boot -r
If no image appears on your screen within a minute, you most likely have a hardware installation problem. In this case, go back and check each step of the installation procedure. This completes the hardware installation procedure.

3.2.1  Serial Terminal

If you have a serial terminal at your disposal (e.g. a DEC VT420), it may be connected to the controller's serial port using a 9 pin DIN male to DB25 male serial cable. Otherwise, you will need to supplement the above cable with a null modem adapter to connect the RAID controller's serial port to the serial port on either the host computer or a PC. The terminal emulators I have successfully used include Minicom (on Linux), Kermit (on Caldera's DR-DOS), and HyperTerminal (on a Windows CE palmtop); however, any decent terminal emulation software should work. The basic settings are 9600 baud, no parity, 8 data bits, and 1 stop bit.

3.2.2  Hard Drive Plant

Choosing the brand and capacity of the drives that will form the hard drive physical plant is up to you. I do have some recommendations:

3.3  5070 Onboard Configuration

Before diving into the RAID configuration I need to define a few terms. The text based GUI can be started by typing "agui"
: raid; agui 
at the husky prompt on the serial terminal (or emulator).

Agui is a simple ASCII based GUI that can be run on the RaidRunner console port which enables one to configure the RaidRunner. The only argument agui takes is the terminal type that is connected to the RaidRunner console. Current supported terminals are dtterm, vt100 and xterm. The default is dtterm.

Each agui screen is split into two areas, data and menu. The data area, which generally uses all but the last line of the screen, displays the details of the information under consideration. The menu area, which generally is the bottom line of the screen, displays a strip menu with a title then list of options or sub-menus. Each option has one character enclosed in square brackets (e.g. [Q]uit) which is the character to type to select that option. Each menu line allows you to refresh the screen data (in case another process on the RaidRunner writes to the console). The refresh character may also be used during data entry if the screen is overwritten. The refresh character is either <Control-l> or <Control-r>.

When agui starts, it reads the configuration of the RaidRunner and probes for every possible backend. As it probes for each backend, its "name" is displayed in the bottom left corner of the screen.

3.3.1  Main Screen Options

The Main screen (Figure 3.1) is the first screen displayed. It provides a summary of the RaidRunner configuration. At the top is the RaidRunner model, version and serial number. Next is a line displaying, for each controller, the SCSI IDs for each host port (labeled A, B, C, etc) and the total and currently available amounts of memory. The next set of lines displays the ranks of devices on the RaidRunner. Each device follows the nomenclature <device_type>c.s.l, where device_type can be D for disk or T for tape, c is the internal channel the device is attached to, s is the SCSI ID (Rank) of the device on that channel, and l is the SCSI LUN of the device (typically 0).
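As an illustration of the naming scheme just described, the following shell sketch splits a backend name into its parts. The function name parse_backend is hypothetical (it is not a RaidRunner or husky command); it is meant to be run on the Linux host, for example when labelling cables and enclosures.

```shell
#!/bin/sh
# parse_backend NAME
# Hypothetical helper: splits a name such as D1.5.0 into device type,
# channel, SCSI ID (rank) and LUN, following the c.s.l nomenclature.
parse_backend() {
    name=$1
    case $name in
        [DT][0-9]*.[0-9]*.[0-9]*) ;;
        *) echo "not a backend name: $name" >&2; return 1 ;;
    esac
    type=${name%"${name#?}"}    # first character: D or T
    rest=${name#?}              # remainder, e.g. 1.5.0
    channel=${rest%%.*}
    rest=${rest#*.}
    id=${rest%%.*}
    lun=${rest#*.}
    case $type in
        D) kind=disk ;;
        T) kind=tape ;;
    esac
    echo "type=$kind channel=$channel id=$id lun=$lun"
}

parse_backend D1.5.0
```

The validation pattern is deliberately loose; it only checks the overall shape of the name.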


Figure 3.1: The main screen of the 5070 onboard configuration utility


The next set of lines provides a summary of the Raid Sets configured on the RaidRunner. The summary includes the raid set name, its type, its size, the amount of cache allocated to it and a comma separated list of its backends. See rconf in the "Advanced Topics" section for a full description of the above.

Next, the configured spare devices are listed. Each spare is named (<device_type>c.s.l format), followed by its size (in 512-byte blocks), its spin state (Hot or Warm), its controller allocation, and finally its current status (Used/Unused, Faulty/Working). If used, the raid set that uses it is nominated.
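Sizes on this screen are given in 512-byte blocks. A small sketch for converting such a figure to megabytes on the Linux host follows; blocks_to_mb is a hypothetical helper name, and a megabyte is assumed here to be 1048576 bytes (2048 blocks).

```shell
#!/bin/sh
# blocks_to_mb BLOCKS
# Hypothetical helper, not a RaidRunner command.
# Assumes 1 MB = 1048576 bytes, i.e. 2048 blocks of 512 bytes.
# The result is truncated to a whole number of megabytes.
blocks_to_mb() {
    echo $(( $1 / 2048 ))
}

# 20971200 is the sector count Linux reports for the example virtual disk
blocks_to_mb 20971200
```

For 20971200 blocks this gives 10239, which agrees with the "[10239 MB]" figure that the Linux kernel prints for the example virtual disk later in this chapter.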

At the bottom of the data area, the number of controllers, channels, ranks and devices are displayed.

The menu line allows one to quit agui or select further actions or sub-menus. These selections are described in detail below.

3.3.2  [Q]uit

Exit the agui main screen and return to the husky ( :raid; ) prompt.

3.3.3  [R]aidSets:

The Raid Set Configuration screen (Figure 3.2) displays a Raid Set in the data area and provides a menu which allows you to Add, Delete, Modify, Install (changes) and Scroll through all other raid sets (First, Last, Next and Previous). If no raid sets have been configured, only the screen title and menu are displayed. All attributes of the raid set are displayed. For information on each attribute of the raid set, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the Raid Set Configuration screen or select further actions.


Figure 3.2: The RAIDSet configuration screen.


3.3.4  [H]ostports:

The Host Port Configuration screen (Figure 3.3) displays, for each controller, each host port (labelled A, B, C, etc for port number 0, 1, 2, etc) and the assigned SCSI ID. If your RaidRunner has external switches for host port SCSI ID selection, you may only exit ([Q]uit) from this screen. If your RaidRunner does NOT have external switches for host port SCSI ID selection, you may modify (and hence install) the SCSI ID for any host port. The menu line allows one to leave the Host Port Configuration screen or select further actions (if there are no external host port switches).


Figure 3.3: The host port configuration screen.


3.3.5  [S]pares:

The Spare Device Configuration screen (Figure 3.4) displays all configured spare devices in the data area and provides a menu which allows you to Add, Delete, Modify and Install (changes) spare devices. If no spare devices have been configured, only the screen title and menu are displayed. Each spare device displayed shows its name (in <device_type>c.s.l format), its size in 512-byte blocks, its spin status (Hot or Warm), its controller allocation, and finally its current status (Used/Unused, Faulty/Working). If used, the raid set that uses it is nominated. For information on each attribute of a spare device, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the Spare Device Configuration screen or select further actions.


Figure 3.4: The spare device configuration screen.


3.3.6  [M]onitor:

The SCSI Monitor Configuration screen (Figure 3.5) displays a table of SCSI monitors configured for the RaidRunner. Up to four SCSI monitors may be configured. The table columns are entitled Controller, Host Port, SCSI LUN and Protocol and each line of the table shows the appropriate SCSI Monitor attribute. For details on SCSI Monitor attributes, see the rconf command in the "Advanced Topics" section. The menu line allows one to leave the SCSI Monitor Configuration screen or modify and install the table.


Figure 3.5: The SCSI monitor configuration screen.


3.3.7  [G]eneral:

The General screen (Figure 3.6) has a blank data area and a menu which allows one to Quit and return to the main screen, or to select further sub-menus which provide information about Devices, the System Message Logger, Global Environment variables and throughput Statistics.


Figure 3.6: The General Screen. The options accessible from here allow you to view information on the attached devices (SCSI hard drives and tape units), browse the system logs, and examine environment variables.


3.3.8  [P]robe

The probe option re-scans the SCSI channels and updates the backend list with the hardware it finds.

3.3.9  Example RAID Configuration Session

The generalized procedure for configuration consists of three steps arranged in the following order:
  1. Configuring the Host Port(s)
  2. Assigning Spares
  3. Configuring the RAID set
Note that there is a minimum number of backends required for the various supported RAID levels. In this example we will configure a RAID 5 using six 2.04 gigabyte drives. The total capacity of the virtual drive will be about 10.2 gigabytes, i.e. five drives' worth, since the equivalent of one drive is used for redundancy. This same configuration procedure can be used to configure other levels of RAID sets by changing the type parameter.
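The capacity arithmetic for this example can be sketched as a small shell function. usable_capacity is a hypothetical helper (not a RaidRunner command); it assumes equal-sized drives and covers only levels 0, 3 and 5, where either no drive or the equivalent of one drive is given up to redundancy.

```shell
#!/bin/sh
# usable_capacity LEVEL DRIVES SIZE_GB
# Hypothetical helper, not a RaidRunner command. Assumes all drives are
# the same size and handles only RAID levels 0, 3 and 5.
usable_capacity() {
    awk -v level="$1" -v n="$2" -v size="$3" 'BEGIN {
        if (level == 0)
            data = n            # striping only, no redundancy
        else if (level == 3 || level == 5)
            data = n - 1        # one drive equivalent holds parity
        else {
            print "unsupported RAID level: " level > "/dev/stderr"
            exit 1
        }
        printf "%.1f\n", data * size
    }'
}

# the six-drive RAID 5 used in this example
usable_capacity 5 6 2.04
```

With six 2.04 gigabyte drives at level 5 the function prints 10.2.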
  1. Power on the computer with the serial terminal connected to the RaidRunner's serial port.
  2. When the husky ( :raid; ) prompt appears, start the GUI by typing "agui" and pressing return.
  3. When the main screen appears, select "H" for [H]ostport configuration.
  4. On some models of RaidRunner the host port is not configurable. If you have only a [Q]uit option here, there is nothing further to be done for the host port configuration; note the values and skip to step 6. If you have add/modify options, your host port is software configurable.
  5. If there is no entry for a host port on this screen, add an entry with the parameters: controller=0, hostport=0, SCSI ID=0. Don't forget to [I]nstall your changes. If there is already an entry present, note the values (they will be used in a later step).
  6. From this point onward I will assume the following hardware configuration:
    1. There are seven 2.04 gigabyte drives connected as follows:
      1. 2 drives on SCSI channel 0 with SCSI IDs 0 and 1 (backends 0.0.0 and 0.1.0, respectively).
      2. 3 drives on SCSI channel 1 with SCSI IDs 0, 1 and 5 (backends 1.0.0, 1.1.0, and 1.5.0).
      3. 2 drives on SCSI channel 2 with SCSI IDs 0 and 1 (backends 2.0.0 and 2.1.0).
    2. Therefore:
      1. Rank 0 consists of backends 0.0.0, 1.0.0, 2.0.0
      2. Rank 1 consists of backends 0.1.0, 1.1.0, 2.1.0
      3. Rank 5 contains only the backend 1.5.0
    3. The RaidRunner is assigned to controller 0, hostport 0
  7. Press Q to [Q]uit the hostports screen and return to the Main screen.
  8. Press S to enter the [S]pares screen
  9. Select A to [A]dd a new spare to the spares pool. A list of available backends will be displayed and you will be prompted for the following information:

    Enter the device name to add to spares - from above:

    enter

    D1.5.0

  10. Select I to [I]nstall your changes
  11. Select Q to [Q]uit the spares screen and return to the Main screen
  12. Select R from the Main screen to enter the [R]aidsets screen.
  13. Select A to [A]dd a new RAID set. You will be prompted for each of the RAID set parameters. The prompts and responses are given below.
    1. Enter the name of Raid Set: cim_homes (or whatever you want to call it).
    2. Raid set type [0,1,3,5]: 5
    3. Enter initial host interface - ctlr,hostport,scsilun: 0.0.0

      Now a list of the available backends will be displayed in the form:
      0 - D0.0.0 1 - D1.0.0 2 - D2.0.0 3 - D0.1.0 4 - D1.1.0 5 - D2.1.0
    4. Enter index from above - Q to Quit:
      0 press return
      1 press return
      2 press return
      3 press return
      4 press return
      5 press return
      Q
  14. After pressing Q you will be returned to the Raid Sets screen. You should see the newly configured Raid set displayed in the data area (Figure 3.12).
  15. Press I to [I]nstall the changes


    Figure 3.12: The RaidSets screen of the GUI showing the newly configured RAID 5


  16. Press Q to exit the RaidSet screen and return to the Main screen.
  17. Press Q to [Q]uit agui and exit to the husky prompt.
  18. Type "reboot" then press enter. This will reboot the RaidRunner (not the host machine).
  19. When the RaidRunner reboots it will prepare the drives for the newly configured RAID.
    NOTE: Depending on the size of the RAID this could take a few minutes to a few hours. For the above example it takes the 5070 approximately 10 - 20 minutes to stripe the RAID set.
  20. Once you see the husky prompt again the RAID is ready for use. You can then proceed with the Linux configuration.

3.4  Linux Configuration

These instructions cover setting up the virtual RAID drives on RedHat Linux 6.1. Setting it up under other Linux distributions should not be a problem. The same general instructions apply.

If you are new to Linux you may want to consider installing Linux from scratch since the RedHat installer will do most of the configuration work for you. If so skip to section titled "New Linux Installation." Otherwise go to the "Existing Linux Installation" section (next).

3.4.1  Existing Linux Installation

Follow these instructions if you already have RedHat Linux installed on your system and you do not want to re-install. If you are installing the RAID as part of a new RedHat Linux installation (or are re-installing), skip to the "New Linux Installation" section.

QLogic SCSI Driver

The driver can either be loaded as a module or compiled into your kernel. If you want to boot from the RAID then you may want to use a kernel with compiled-in QLogic support (see the kernel-HOWTO available from http://www.linuxdoc.org). To use the modular driver, become the superuser and add the following line to /etc/conf.modules:
alias qlogicpti /lib/modules/preferred/scsi/qlogicpti 
Change the above path to wherever your SCSI modules live. Then add the following line to your /etc/fstab (with the appropriate changes for device and mount point; see the fstab man page if you are unsure):
/dev/sdc1 /home ext2 defaults 1 2
Or, if you prefer to use a SYSV initialization script, create a file called "raid" in the /etc/rc.d/init.d directory with the following contents (NOTE: while there are a few good reasons to start the RAID using a script, one of the aforementioned methods would be preferable):
#!/bin/bash
# /etc/rc.d/init.d/raid: load the QLogic driver and mount the RAID volume
case "$1" in
  start)
    echo "Loading raid module"
    /sbin/modprobe qlogicpti
    echo
    echo "Checking and Mounting raid volumes..."
    mount -t ext2 -o check /dev/sdc1 /home
    # lock file marks the service as running
    touch /var/lock/subsys/raid
    ;;
  stop)
    echo "Unmounting raid volumes"
    umount /home
    echo "Removing raid module(s)"
    /sbin/rmmod qlogicpti
    rm -f /var/lock/subsys/raid
    echo
    ;;
  restart)
    $0 stop
    $0 start
    ;;
  *)
    echo "Usage: raid {start|stop|restart}"
    exit 1
    ;;
esac
exit 0
You will need to edit this example and substitute your device name(s) in place of /dev/sdc1 and mount point(s) in place of /home. The next step is to make the script executable by root by doing:
chmod 0700 /etc/rc.d/init.d/raid
Now use your run level editor of choice (tksysv, ksysv, etc.) to add the script to the appropriate run level.

Device mappings

Linux uses dynamic device mappings. You can determine whether the drives were found by typing:
more /proc/scsi/scsi
One or more of the entries should look something like this:
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: ANTARES Model: CX106 Rev: 0109
  Type: Direct-Access ANSI SCSI revision: 02
There may also be one which looks like this:
Host: scsi1 Channel: 00 Id: 00 Lun: 07
  Vendor: ANTARES Model: CX106-SMON Rev: 0109
  Type: Direct-Access ANSI SCSI revision: 02
This is the SCSI monitor communications channel, which is currently unused under Linux (see SMON in the "Advanced Topics" section below).
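The check above can be scripted. The following sketch scans text in the /proc/scsi/scsi format and reports each ANTARES entry, marking the SMON channel so that it is not mistaken for a disk. The function name list_antares is hypothetical, and the sketch assumes the vendor and model strings contain no spaces, as in the entries shown above.

```shell
#!/bin/sh
# list_antares [FILE]
# Hypothetical helper: reads /proc/scsi/scsi style text from FILE (or
# from standard input) and prints one line per ANTARES device.
list_antares() {
    awk '
        # remember the address of the entry currently being read
        $1 == "Host:" { addr = $2 " channel " $4 " id " $6 " lun " $8 }
        $1 == "Vendor:" && $2 == "ANTARES" {
            model = $4
            note = (model ~ /SMON/) ? "SMON channel, do not partition or mount" : "virtual disk"
            print addr ": " model " (" note ")"
        }
    ' "$@"
}

# Demonstration using the two entries shown in this section
list_antares <<'EOF'
Host: scsi1 Channel: 00 Id: 00 Lun: 00
  Vendor: ANTARES Model: CX106 Rev: 0109
  Type: Direct-Access ANSI SCSI revision: 02
Host: scsi1 Channel: 00 Id: 00 Lun: 07
  Vendor: ANTARES Model: CX106-SMON Rev: 0109
  Type: Direct-Access ANSI SCSI revision: 02
EOF
```

On a live system the same function would be called as "list_antares /proc/scsi/scsi".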

To locate the drives (following reboot) type:
dmesg | more
Locate the section of the boot messages pertaining to your SCSI devices. You should see something like this:
qpti0: IRQ 53 SCSI ID 7 (Firmware v1.31.32)(Firmware 1.25 96/10/15)
[Ultra Wide, using single ended interface]
QPTI: Total of 1 PTI Qlogic/ISP hosts found, 1 actually in use.
scsi1 : PTI Qlogic,ISP SBUS SCSI irq 53 regs at fd018000 PROM node ffd746e0
This indicates that the SCSI controller was properly recognized. Below this, look for the disk section:
Vendor: ANTARES Model: CX106 Rev: 0109
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdc at scsi1, channel 0, id 0, lun 0
SCSI device sdc: hdwr sector= 512 bytes. Sectors= 20971200 [10239 MB] [10.2 GB]
Note the line that reads "Detected scsi disk sdc ...". This tells you that this virtual disk has been mapped to device /dev/sdc. After partitioning, the first partition will be /dev/sdc1, the second will be /dev/sdc2, etc. There should be one of the above disk sections for each virtual disk that was detected. There may also be an entry like the following:
Vendor: ANTARES Model: CX106-SMON Rev: 0109
Type: Direct-Access ANSI SCSI revision: 02
Detected scsi disk sdd at scsi1, channel 0, id 0, lun 7
SCSI device sdd: hdwr sector= 512 bytes. Sectors= 20971200 [128 MB] [128.2 MB]
BEWARE: This is not a drive. DO NOT try to fdisk, mkfs, or mount it! Doing so WILL hang your system.

Partitioning

A virtual drive appears to the host operating system as a large but otherwise ordinary SCSI drive. Partitioning is performed using fdisk or your favorite utility. You will have to give the virtual drive a disk label when fdisk is started. Using the choice "Custom with autoprobed defaults" seems to work well. See the man page for the given utility for details.

Installing a filesystem

Installing a filesystem is no different from any other SCSI drive:
mkfs -t <filesystem_type> /dev/<device>
for example:
mkfs -t ext2 /dev/sdc1

Mounting

If QLogic SCSI support is compiled into your kernel OR you are loading the "qlogicpti" module at boot from /etc/conf.modules, then add the following line(s) to /etc/fstab:
/dev/<device> <mount point> ext2 defaults 1 2
If you are using a SystemV initialization script to load/unload the module you must mount/unmount the drives there as well. See the example script above.

3.4.2  New Linux Installation

This is the easiest way to install the RAID since the RedHat installer program will do most of the work for you.
  1. Configure the host port, RAID sets, and spares as outlined in "Onboard Configuration." Your computer must be on to perform this step since the 5070 is powered from the SBUS. It does not matter whether the computer has an operating system installed at this point; all we need is power to the controller card.
  2. Begin the RedHat SparcLinux installation.
  3. The installation program will auto detect the 5070 controller and load the Qlogic driver.
  4. Your virtual RAID drives will appear as ordinary SCSI hard drives to be partitioned and formatted during the installation. NOTE: When using the graphical partitioning utility during the RedHat installation DO NOT designate any partition on the virtual drives as type RAID since they are already hardware managed virtual RAID drives. The RAID selection on the partitioning utilities screen is for setting up a software RAID.
    IMPORTANT NOTE: You may see a small SCSI drive (usually ~128 MB) on the list of available drives. DO NOT select this drive for use. It is the SMON communication channel, NOT a drive. If setup tries to use it, it will hang the installer.
  5. That's it, the installation program takes care of everything else!

3.5  Maintenance

3.5.1  Activating a spare

When running a RAID 3 or 5 (if you configured one or more drives to be spares) the 5070 will detect when a drive goes offline and automatically select a spare from the spares pool to replace it. The data will be rebuilt on-the-fly. The RAID will continue operating normally during the reconstruction process (i.e. it can be read from and written to just as if nothing has happened). When a backend fails you will see messages similar to the following displayed on the 5070 console:
930 secs: Redo:1:1 Retry:1 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
932 secs: Redo:1:1 Retry:2 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
933 secs: Redo:1:1 Retry:3 (DIO_cim_homes_D1.1.0_q1) CDB=28(Read_10)Re-/Selection Time-out @682400+16
934 secs: CIO_cim_homes_q3 R5_W(3412000, 16): Pre-Read drive 4 (D1.1.0) fails with result "Re-/Selection Time-out"
934 secs: CIO_cim_homes_q2 R5: Drained alternate jobs for drive 4 (D1.1.0)
934 secs: CIO_cim_homes_q2 R5: Drained alternate jobs for drive 4 (D1.1.0) RPT 1/0
934 secs: CIO_cim_homes_q2 R5_W(524288, 16): Initial Pre-Read drive 4 (D1.1.0) fails with result "Re-/Selection Time-out"
935 secs: Redo:1:0 Retry:1 (DIO_cim_homes_D1.0.0_q1) CDB=28(Read_10)SCSI Bus ~Reset detected @210544+16
936 secs: Failed:1:1 Retry:0 (rconf) CDB=2A(Write_10)Re-/Selection Time-out @4194866+128
...

Then you will see the spare being pulled from the spares pool, spun up, tested, engaged, and the data reconstructed.
937 secs: autorepair pid=1149 /raid/cim_homes: Spinning up spare device
938 secs: autorepair pid=1149 /raid/cim_homes: Testing spare device/dev/hd/1.5.0/data
939 secs: autorepair pid=1149 /raid/cim_homes: engaging hot spare ...
939 secs: autorepair pid=1149 /raid/cim_homes: reconstructing drive 4 ...
939 secs: 1054
939 secs: Rebuild on /raid/cim_homes/repair: Max buffer 2800 in 7491 reads, priority 6 sleep 500
...

The rebuild script will print out its progress after every 10% of the job is completed:
939 secs: Rebuild on /raid/cim_homes/repair @ 0/7491
1920 secs: Rebuild on /raid/cim_homes/repair @ 1498/7491
2414 secs: Rebuild on /raid/cim_homes/repair @ 2247/7491
2906 secs: Rebuild on /raid/cim_homes/repair @ 2996/7491
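If the console output is captured to a file (for example with the logging feature of your terminal emulator), the progress lines can be turned into percentages. The sketch below is a hypothetical helper, rebuild_percent, which assumes the "@ done/total" format shown above and rounds to the nearest whole percent.

```shell
#!/bin/sh
# rebuild_percent [FILE]
# Hypothetical helper: reads captured 5070 console output from FILE (or
# standard input) and prints the rebuild progress as a percentage.
rebuild_percent() {
    awk '/Rebuild on .* @ [0-9]+\/[0-9]+/ {
        split($NF, part, "/")           # last field looks like 1498/7491
        if (part[2] > 0)
            printf "%s secs: %d%%\n", $1, part[1] * 100 / part[2] + 0.5
    }' "$@"
}

# Demonstration using lines from the log above
rebuild_percent <<'EOF'
939 secs: Rebuild on /raid/cim_homes/repair: Max buffer 2800 in 7491 reads, priority 6 sleep 500
939 secs: Rebuild on /raid/cim_homes/repair @ 0/7491
1920 secs: Rebuild on /raid/cim_homes/repair @ 1498/7491
2906 secs: Rebuild on /raid/cim_homes/repair @ 2996/7491
EOF
```

Lines that do not carry a progress counter, such as the "Max buffer" line, are ignored.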

3.5.2  Re-integrating a repaired drive into the RAID (levels 3 and 5)

After you have replaced the bad drive you must re-integrate it into the RAID set using the following procedure.
  1. Start the text GUI
  2. Look at the list of backends for the RAID set(s).
  3. Backends that have been marked faulty will have a (-) to the right of their ID (e.g. D1.1.0-).
  4. If you set up spares, the ID of the faulty backend will be followed by the ID of the spare that has replaced it (e.g. D1.1.0-D1.5.0).
  5. Write down the ID(s) of the faulty backend(s) (NOT the spares).
  6. Press Q to exit agui
  7. At the husky prompt type:
    replace <name> <backend> 
    Where <name> is whatever you named the raid set and <backend> is the ID of the backend that is being re-integrated into the RAID. If a spare was in use it will be automatically returned to the spares pool. Be patient: reconstruction can take from a few minutes to several hours depending on the RAID level and the size. Fortunately, you can use the RAID as you normally would during this process.

3.6  Troubleshooting / Error Messages

3.6.1  Out of band temperature detected...

3.6.2  ... failed ... cannot have more than 1 faulty backend.

3.6.3  When booting I see: ... Sun disklabel: bad magic 0000 ... unknown partition table.

3.7  Bugs

None yet! Please send bug reports to [email protected]

3.8  Frequently Asked Questions

3.8.1  How do I reset/erase the onboard configuration?

At the husky prompt issue the following command:
rconf -init
This will delete all of the RAID configuration information but not the global variables and SCSI monitors. To remove ALL configuration information, type:
rconf -fullinit
Use these commands with caution!

3.8.2  How can I tell if a drive in my RAID has failed?

In the text GUI faulty backends appear with a (-) to the right of their ID. For example the list of backends:
D0.0.0,D1.0.0-,D2.0.0,D0.1.0,D1.1.0,D2.1.0

indicates that backend (drive) D1.0.0 is either faulty or not present. If you assigned spares (RAID 3 or 5) then you should also see that one or more spares are in use. Both the Main and the RaidSets screens will show information on faulty/not present drives in a RAID set.
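If you copy a backend list from the text GUI, the faulty entries can be picked out with a short script. The sketch below uses a hypothetical helper, faulty_backends, and assumes the two notations described in this chapter: a trailing "-" for a faulty backend, and "faulty-spare" when a spare has taken over.

```shell
#!/bin/sh
# faulty_backends LIST
# Hypothetical helper: LIST is a comma separated backend list as shown
# by agui. Prints each faulty backend, with its spare if one is in use.
faulty_backends() {
    printf '%s\n' "$1" | tr ',' '\n' | awk '
        /-/ {
            split($0, part, "-")
            if (part[2] != "")
                print part[1] " (replaced by spare " part[2] ")"
            else
                print part[1]
        }'
}

faulty_backends "D0.0.0,D1.0.0-,D2.0.0,D0.1.0,D1.1.0,D2.1.0"
```

For the list shown above the only line printed is D1.0.0.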

3.9  Advanced Topics: 5070 Command Reference

In addition to the text based GUI the RAID configuration may also be manipulated from the husky prompt ( the : raid; prompt) of the onboard controller. This section describes commands that a user can input interactively or via a script file to the K9 kernel. Since K9 is an ANSI C Application Programming Interface (API) a shell is needed to interpret user input and form output. Only one shell is currently available and it is called husky. The K9 kernel is modelled on the Plan 9 operating system whose design is discussed in several papers from AT&T (See the "Further Reading" section for more information). K9 is a kernel targeted at embedded controllers of small to medium complexity (e.g. ISDN-ethernet bridges, RAID controllers, etc). It supports multiple lightweight processes (i.e. without memory management) on a single CPU with a non-pre-emptive scheduler. Device driver architecture is based on Plan 9 (and Unix SVR4) STREAMS. Concurrency control mechanisms include semaphores and signals. The husky shell is modelled on a scaled down Unix Bourne shell.

Using the built-in commands the user can write new scripts thus extending the functionality of the 5070. The commands (adapted from the 5070 man pages) are extensive and are described below.

3.9.1  AUTOBOOT - script to automatically create all raid sets and scsi monitors

3.9.2  AUTOFAULT - script to automatically mark a backend faulty after a drive failure

3.9.3  AUTOREPAIR - script to automatically allocate a spare and reconstruct a raid set

3.9.4  BIND - combine elements of the namespace

3.9.5  BUZZER - get the state or turn on or off the buzzer

3.9.6  CACHE - display information about and delete cache ranges

3.9.7  CACHEDUMP - Dump the contents of the write cache to battery backed-up ram

3.9.8  CACHERESTORE - Load the cache with data from battery backed-up ram

3.9.9  CAT - concatenate files and print on the standard output

3.9.10  CMP - compare the contents of 2 files

3.9.11  CONS - console device for Husky

3.9.12  DD - copy a file (disk, etc)

3.9.13  DEVSCMP - Compare a file's size against a given value

3.9.14  DFORMAT - Perform formatting functions on a backend disk drive

3.9.15  DIAGS - script to run a diagnostic on a given device

3.9.16  DPART - edit a scsihd disk partition table

3.9.17  DUP - open file descriptor device

3.9.18  ECHO - display a line of text

3.9.19  ENV - environment variables file system

3.9.20  ENVIRON - RaidRunner Global environment variables - names and effects

3.9.21  EXEC - cause arguments to be executed in place of this shell

3.9.22  EXIT - exit a K9 process

3.9.23  EXPR - evaluation of numeric expressions

3.9.24  FALSE - returns the K9 false status

3.9.25  FIFO - bi-directional fifo buffer of fixed size

3.9.26  GET - select one value from list

3.9.27  GETIV - get the value of an internal RaidRunner variable

3.9.28  HELP - print a list of commands and their synopses

3.9.29  HUSKY - shell for K9 kernel

3.9.30  HWCONF - print various hardware configuration details

3.9.31  HWMON - monitoring daemon for temperature, fans, PSUs.

3.9.32  INTERNALS - Internal variables used by RaidRunner to change dynamics of running kernel

3.9.33  KILL - send a signal to the nominated process

3.9.34  LED - turn on/off LEDs on RaidRunner

3.9.35  LFLASH - flash an LED on RaidRunner

3.9.36  LINE - copies one line of standard input to standard output

3.9.37  LLENGTH - return the number of elements in the given list

3.9.38  LOG - like zero with additional logging of accesses

3.9.39  LRANGE - extract a range of elements from the given list

set list D1 D2 D3 D4 D5 # create the list

set subl `lrange 0 3 $list' # extract from indices 0 to 3
echo $subl
D1 D2 D3 D4

set subl `lrange 3 1 $list' # extract from indices 3 to 1
echo $subl
D4 D3 D2

set subl `lrange 4 4 $list' # extract from indices 4 to 4
echo $subl # equivalent to get 4 $list
D5

set subl `lrange 3 100 $list' # extract from index 3 to 100 (or end)
echo $subl
D4 D5
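The indexing rules implied by these examples (zero-based, inclusive at both ends, reversed when the first index is larger, and truncated at the end of the list) can be modelled off the controller. The sketch below is a Python approximation inferred only from the four examples above; it is not the husky implementation, and its behaviour when both indices lie past the end of the list is an assumption.

```python
def lrange(first, last, items):
    """Model of the lrange examples: inclusive, zero-based, may run backwards."""
    if not items:
        return []
    top = len(items) - 1
    # Clamp both indices into the valid range (assumption: the examples only
    # show an oversized *last* index being truncated to the end of the list).
    first = max(0, min(first, top))
    last = max(0, min(last, top))
    if first <= last:
        return items[first:last + 1]
    # first > last: same inclusive span, returned in reverse order
    return items[last:first + 1][::-1]


lst = ["D1", "D2", "D3", "D4", "D5"]
print(" ".join(lrange(0, 3, lst)))    # D1 D2 D3 D4
print(" ".join(lrange(3, 1, lst)))    # D4 D3 D2
print(" ".join(lrange(3, 100, lst)))  # D4 D5
```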

3.9.40  LS - list the files in a directory

3.9.41  LSEARCH - find a pattern in a list

set list D1 D2 D3 D4 D5 # create the list

set idx `lsearch D4 $list' # get index of D4 in list
echo $idx
3

set idx `lsearch D1 $list' # get index of D1 in list
echo $idx
0

set idx `lsearch D8 $list' # get index of D8 in list (not present)
echo $idx
-1
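For readers more familiar with host-side scripting, the lookup behaviour shown above (zero-based index of the match, or -1 when nothing matches) can be sketched in Python. This model assumes an exact string comparison; the real lsearch is described as matching a pattern, and its pattern syntax is not covered here.

```python
def lsearch(pattern, items):
    """Return the zero-based index of the first element equal to pattern, or -1."""
    for index, item in enumerate(items):
        if item == pattern:  # assumption: exact match only
            return index
    return -1


lst = ["D1", "D2", "D3", "D4", "D5"]
print(lsearch("D4", lst))  # 3
print(lsearch("D8", lst))  # -1
```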

3.9.42  LSUBSTR - replace a character in all elements of a list

set list D1 D2 D3 D4 D5 # create the list

set subl `lsubstr D x $list' # replace all D's with x's
echo $subl
x1 x2 x3 x4 x5

set subl `lsubstr D {} $list' # delete all D's
echo $subl
1 2 3 4 5

set list -L -16 # create a list with embedded braces
set subl `lsubstr {} $list' # delete all open braces
set subl `lsubstr {} $subl' # delete all close braces
echo $subl
-L 16
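The first two lsubstr examples above (replace a character in every element, or delete it by giving {} as the replacement) can be approximated in Python as follows. This is only a model of those two examples: the empty replacement {} is represented by an empty string, and the brace-handling example is not modelled.

```python
def lsubstr(old, new, items):
    """Replace every occurrence of old with new in each element of items."""
    return [item.replace(old, new) for item in items]


lst = ["D1", "D2", "D3", "D4", "D5"]
print(" ".join(lsubstr("D", "x", lst)))  # x1 x2 x3 x4 x5
print(" ".join(lsubstr("D", "", lst)))   # 1 2 3 4 5
```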

3.9.43  MEM - memory mapped file (system)

3.9.44  MDEBUG - exercise and display statistics about memory allocation

3.9.45  MKDIR - create directory (or directories)

3.9.46  MKDISKFS - script to create a disk filesystem

3.9.47  MKHOSTFS - script to create a host port filesystem

3.9.48  MKRAID - script to create a raid given a line of output of rconf

3.9.49  MKRAIDFS - script to create a raid filesystem

3.9.50  MKSMON - script to start the scsi monitor daemon smon

3.9.51  MKSTARGD - script to initialize a scsi target daemon for a given raid set

3.9.52  MSTARGD - monitor for stargd

3.9.53  NICE - Change the K9 run-queue priority of a K9 process

3.9.54  NULL - file to throw away output in

3.9.55  PARACC - display information about hardware parity accelerator

3.9.56  PEDIT - Display/modify SCSI backend Mode Parameters Pages

3.9.57  PIPE - two way interprocess communication

3.9.58  PRANKS - print or set the accessible backend ranks for the current controller

3.9.59  PRINTENV - print one or all GLOBAL environment variables

3.9.60  PS - report process status

: raid; ps
NAME____________________PID__PPID__PGRP_S_P_ST%_TIME(ms)__SEMAPHORE+name
hyena                   0      0     0 R 9  18 385930     deadbeef
init                    1      0     1 W 0   9 90         8009b1a8   pau
SCN2681_reader          4      1     4 W 0   0 0          800702a4   2rd
SCN2681_writer          5      1     5 W 0   0 0          8007029c   2wr
SCN2681_putter          6      1     6 W 0   0 0          800702ac   2tp
DIO_R_drive3_q0        391     1   391 W 0   4 40120      8021a828   Ard
DIO_R_drive0_q0        397     1   397 W 0   4 13420      8007ac64   Ard
DIO_R_drive1_q0        404     1   404 W 0   5 25570      8007b224   Ard
husky                  28      1     1 W 0  10 50         8013a138   pau
cache_flusher          424     1   424 W 0  23 17700      8030c2c4   Cfr
CIO_R_q0               426     1   426 W 0  96 2320       8030d6f4   Ard
CIO_R_q1               427   426   426 W 0  96 2420       8030d6f4   Ard
CIO_R_q2               428   426   426 W 0  96 2410       8030d6f4   Ard
CIO_R_q3               429   426   426 W 0  96 2430       8030d6f4   Ard
CIO_R_q4               430   426   426 W 0  96 2240       8030d6f4   Ard
CIO_R_q5               431   426   426 W 0  96 2130       80c37540   Ard
CIO_R_q6               432   426   426 W 0  96 2300       8030d6f4   Ard
CIO_R_q7               433   426   426 W 0  96 2180       8030d6f4   Ard
smon                    65     1     1 W 0   5 30         8008d5e4   Nsl
DIO_R_drive2_q0        326     1   326 W 0   5 27680      8007b7e4   Ard
/bin/ps                871    28     1 W 0   8 40         80cfd020   pau
stargd                 107     1     1 R 0  48 23990      8007a648   Nsl
starg_107_L_R          119   107   119 W 0   0 0          8018c608   pau
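When ps output is captured on the host, the columns can be split on whitespace for further processing. The Python sketch below assumes only the column order shown in the header line above; the dictionary keys are taken from that header, the function name parse_ps_line is hypothetical, and no meaning is assigned to the S and P columns beyond what the listing shows.

```python
def parse_ps_line(line):
    """Parse one data row of captured husky ps output into a dict."""
    fields = line.split()
    if len(fields) < 9:
        raise ValueError("not a ps data row: %r" % line)
    return {
        "NAME": fields[0],
        "PID": int(fields[1]),
        "PPID": int(fields[2]),
        "PGRP": int(fields[3]),
        "S": fields[4],
        "P": int(fields[5]),
        "ST%": int(fields[6]),
        "TIME(ms)": int(fields[7]),
        "SEMAPHORE": fields[8],
        # The short semaphore name is missing on some rows (e.g. hyena).
        "name": fields[9] if len(fields) > 9 else None,
    }


sample = """\
hyena                   0      0     0 R 9  18 385930     deadbeef
init                    1      0     1 W 0   9 90         8009b1a8   pau
stargd                 107     1     1 R 0  48 23990      8007a648   Nsl
"""
rows = [parse_ps_line(row) for row in sample.splitlines()]
# List the processes whose S column reads "R".
print([row["NAME"] for row in rows if row["S"] == "R"])
```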

3.9.61  PSCSIRES - print SCSI-2 reservation table for all or specific monikers

3.9.62  PSTATUS - print the values of hardware status registers

3.9.63  RAIDACTION - script to gather/reset stats or stop/start a raid set's stargd

3.9.64  RAID0 - raid 0 device

3.9.65  RAID1 - raid 1 device

3.9.66  RAID3 - raid 3 device

3.9.67  RAID4 - raid 4 device

3.9.68  RAID5 - raid 5 device

3.9.69  RAM - ram based file system

3.9.70  RANDIO - simulate random reads and writes

3.9.71  RCONF, SPOOL, HCONF, MCONF, CORRUPT-CONFIG - raid configuration and spares management

3.9.72  REBOOT - exit K9 on target hardware + return to monitor

3.9.73  REBUILD - raid set reconstruction utility

3.9.74  REPAIR - script to allocate a spare to a raid set's failed backend

3.9.75  REPLACE - script to restore a backend in a raid set

3.9.76  RM - remove the file (or files)

3.9.77  RMON - Power-On Diagnostics and Bootstrap

3.9.78  RRSTRACE - disassemble scsihpmtr monitor data

3.9.79  RSIZE - estimate the memory usage for a given raid set

3.9.80  SCN2681 - access a scn2681 (serial IO device) as console

3.9.81  SCSICHIPS - print various details about a controller's scsi chips

3.9.82  SCSIHD - SCSI hard disk device (a SCSI initiator)

3.9.83  SCSIHP - SCSI target device

3.9.84  SET - set (or clear) an environment variable

3.9.85  SCSIHPMTR - turn on host port debugging

3.9.86  SETENV - set a GLOBAL environment variable

3.9.87  SDLIST - Set or display an internal list of attached disk drives

3.9.88  SETIV - set an internal RaidRunner variable

3.9.89  SHOWBAT - display information about battery backed-up ram

3.9.90  SHUTDOWN - script to place the RaidRunner into a shutdown or quiescent state

3.9.91  SLEEP - sleep for the given number of seconds

3.9.92  SMON - RaidRunner SCSI monitor daemon

3.9.93  SOS - pulse the buzzer to emit sos's

3.9.94  SPEEDTST - Generate a set number of sequential writes then reads

3.9.95  SPIND - Spin up or down a disk device

3.9.96  SPINDLE - Modify Spindle Synchronization on a disk device

3.9.97  SRANKS - set the accessible backend ranks for a controller

3.9.98  STARGD - daemon for SCSI-2 target

Op  Command          Handling
00: Test Unit Ready  If backend is ready returns GOOD Status, else sets Sense Key
                     to Not Ready and returns CHECK CONDITION Status
01: Rezero Unit      Does nothing, returns GOOD Status
03: Request Sense    Sense data held on a per initiator basis (plus extra for bad
                     LUNs)
04: Format Unit      Does nothing, returns GOOD Status
07: Reassign Blocks  Consumes data but does nothing, returns GOOD Status
08: Read_6           DPO, FUA and RelAdr not supported
0a: Write_6          DPO, FUA and RelAdr not supported
0b: Seek_6           Does nothing, returns GOOD Status
12: Inquiry          Only standard 36 byte data format supported (not vital product
                     data pages)
15: Mode Select      Supports pages 1, 2, 3, 4, 8 and 10 (but none writable)
16: Reserve          Doesn't support extents + 3rd parties
17: Release          Doesn't support extents + 3rd parties
1a: Mode Sense       Supports pages 1, 2, 3, 4, 8 and 10
1b: Start Stop       If Start is requested and the Immediate bit is 0 then waits for
                     backend to become ready, else does nothing and returns GOOD
                     Status. If backend does not become ready within 20 seconds sets
                     Sense Key to Not Ready and returns CHECK CONDITION Status
1d: Send Diagnostics Returns GOOD Status when self test is requested, else complains
                     (does nothing internally)
25: Read Capacity    RelAdr, PMI and logical address > 0 are not supported
28: Read_10          Same as Read_6
2a: Write_10         Same as Write_6
2b: Seek_10          Does nothing, returns GOOD Status
2f: Verify           Does nothing, returns GOOD Status
55: Mode Select_10   Same as Mode Select
5a: Mode Sense_10    Same as Mode Sense

3.9.99  STAT - get status information on the named files (or stdin)

: raid; stat /bin/ps
ps                               ram    0 0x00049049    2 144
: raid;

3.9.100  STATS - Print cumulative performance statistics on a Raid Set or Cache Range

3.9.101  STRING - perform a string operation on a given value

set string ABCDEFGHIJ              # create the string

set subs `string length $string'   # get its length
echo $subs
10

set subs `string range $string 2 2'     # extract character at index 2
echo $subs
C

set subs `string range $string 3 6'     # extract from indices 3 to 6
echo $subs
DEFG

set subs `string range $string 6 3'     # backwards
echo $subs
GFED

set subs `string range $string 4 70'    # extract from index 4 to 70 (or end)
echo $subs
EFGHIJ

set string D1,D2,D4,D8             # create the string
set subs `string split $string ,'  # split the string
echo $subs
D1 D2 D4 D8
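The string operations shown above follow the same zero-based, inclusive indexing as the list commands. A rough Python model of the three operations, inferred only from these examples, might look like the sketch below; the function names are hypothetical, and the clamping of out-of-range indices is an assumption based on the "4 70" case.

```python
def string_length(value):
    """Model of `string length`: number of characters in value."""
    return len(value)


def string_range(value, first, last):
    """Model of `string range`: inclusive, zero-based, may run backwards."""
    if not value:
        return ""
    top = len(value) - 1
    first = max(0, min(first, top))  # assumption: indices are clamped
    last = max(0, min(last, top))
    if first <= last:
        return value[first:last + 1]
    return value[last:first + 1][::-1]


def string_split(value, separator):
    """Model of `string split`: break value into a list at each separator."""
    return value.split(separator)


s = "ABCDEFGHIJ"
print(string_length(s))                          # 10
print(string_range(s, 3, 6))                     # DEFG
print(string_range(s, 6, 3))                     # GFED
print(" ".join(string_split("D1,D2,D4,D8", ",")))  # D1 D2 D4 D8
```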

3.9.102  SUFFIX - Suffixes permitted on some big decimal numbers

3.9.103  SYSLOG - device to send system messages for logging

3.9.104  SYSLOGD - initialize or access messages in the system log area

3.9.105  TEST - condition evaluation command

3.9.106  TIME - Print the number of seconds since boot (or reset of clock)

3.9.107  TRAP - intercept a signal and perform some action

3.9.108  TRUE - returns the K9 true status

3.9.109  STTY or TTY - print the user's terminal mount point or terminfo status

3.9.110  UNSET - delete one or more environment variables

3.9.111  UNSETENV - unset (delete) a GLOBAL environment variable

3.9.112  VERSION - print out the version of the RaidRunner kernel

3.9.113  WAIT - wait for a process (or my children) to terminate

3.9.114  WARBLE - periodically pulse the buzzer

3.9.115  XD - dump given file(s) in hexadecimal to standard out

3.9.116  ZAP - write zeros to a file

SYNOPSIS: zap [-b blockSize] [-f byteVal] count offset <>[3] store

DESCRIPTION: zap writes count * 8192 bytes of zeros at byte position offset * 8192 into file store (which is opened and associated with file descriptor 3). Both count and offset may have a suffix. The optional "-b" switch allows the block size to be set to blockSize bytes. The default block size is 8192 bytes. The optional "-f" switch allows the fill character to be set to byteVal which should be a number in the range 0 to 255 (inclusive). The default fill character is 0 (i.e. zero). Every 100 write operations the current count is output (usually overwriting the previous count output). Errors on the write operations are ignored.

SEE ALSO: suffix
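The byte arithmetic in the description can be checked with a small host-side model. The Python sketch below writes count blocks of a fill byte starting at block offset, using an in-memory buffer in place of the store file. It assumes that the -b block size scales both count and offset (the description only states the 8192-byte default case explicitly), and it does not model numeric suffixes or the periodic progress output.

```python
import io


def zap(store, count, offset, block_size=8192, fill=0):
    """Write count blocks of the fill byte into store, starting at block offset.

    Returns (start_byte, length_in_bytes) of the region written.
    """
    if not 0 <= fill <= 255:
        raise ValueError("fill must be in the range 0 to 255")
    block = bytes([fill]) * block_size
    start = offset * block_size  # assumption: offset is counted in blocks
    store.seek(start)
    for _ in range(count):
        store.write(block)
    return start, count * block_size


# A four-block store filled with 0xff, then zap two blocks starting at block 1.
buf = io.BytesIO(b"\xff" * (4 * 8192))
start, length = zap(buf, count=2, offset=1)
print(start, length)  # 8192 16384
```

With the default block size, a count of 2 and an offset of 1 cover bytes 8192 through 24575 of the store, leaving the first and last blocks untouched.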

3.9.117  ZCACHE - Manipulate the zone optimization IO table of a Raid Set's cache

3.9.118  ZERO - file when read yields zeros continuously

3.9.119  ZLABELS - Write zeros to the front and end of Raid Sets

3.10  Advanced Topics: SCSI Monitor Daemon (SMON)

Another way of communicating with the onboard controller from the host operating system is the SCSI Monitor (SMON) facility. SMON provides an ASCII communication channel on an assigned SCSI ID and LUN. The commands discussed in section 3.9 may also be issued over this channel to manipulate the RAID configuration and operation. Under Solaris this mechanism provides a communication channel between an X-based GUI and the RAID controller. It is currently unused under Linux. See the description of the smon daemon in the 5070 command reference above.

3.11  Further Reading

