Posts Tagged by Solaris

Solaris – max of 100 cron jobs

I stumbled across a limitation (or feature, or behaviour) of Solaris over the weekend: by default it only allows a maximum of 100 cron jobs to run at once. I suppose this is to protect against “shooting oneself in the head”, but aaaarghhh… Reminds me why I love Monday mornings.

Anyway, the file to edit is /etc/cron.d/queuedefs (see man queuedefs).
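
As a rough sketch of what goes in that file: per queuedefs(4), each entry has the form q.[njob]j[nice]n[nwait]w, where queue c is cron and the defaults are 100 jobs, nice 2 and a 60 second wait. The helper name queuedefs_entry and the value 200 below are only illustrative, and the function syntax assumes a POSIX shell such as ksh or bash.

```shell
# Build a queuedefs(4) entry of the form q.<njob>j<nice>n<nwait>w.
# queuedefs_entry is a hypothetical helper, not a Solaris command.
queuedefs_entry() {
    queue=$1    # a = at, b = batch, c = cron
    njob=$2     # maximum number of jobs running at once in this queue
    nice=$3     # nice value given to jobs in this queue
    nwait=$4    # seconds to wait before rescheduling a deferred job
    printf '%s.%sj%sn%sw\n' "$queue" "$njob" "$nice" "$nwait"
}

# Example: allow 200 simultaneous cron jobs, keeping the documented
# defaults for nice (2) and wait (60). Prints: c.200j2n60w
queuedefs_entry c 200 2 60
```

The printed line is what would go into /etc/cron.d/queuedefs. cron probably needs a restart to pick up the change (for example svcadm restart cron), but check that on your own system.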

Thanks to Solaris Tips for pointing me in the right direction.

Solaris – convert a single root UFS to a root ZFS mirror

Scenario: the current Solaris SPARC system is booting off a single UFS disk (e.g. when cloned from an image), and you want to convert it to a ZFS mirror. This assumes that ZFS is already supported; if not, install the ZFS package first.

Connect to the console, log in, and examine the disk layout:

-> start /SP/console

root@foo # uname -a
SunOS foo 5.10 Generic_142900-11 sun4v sparc SUNW,Sun-Blade-T6320

root@foo # mount | fgrep '/dev/dsk'
/ on /dev/dsk/c0t0d0s0 read/write/setuid/devices/intr/largefiles/logging/xattr/onerror=panic/dev=800008 on Thu Feb 17 12:04:28 2011

root@foo # echo | format
Searching for disks...done
AVAILABLE DISK SELECTIONS:
       0. c0t0d0
          /pci@0/pci@0/pci@2/LSILogic,sas@0/sd@0,0
       1. c0t1d0
          /pci@0/pci@0/pci@2/LSILogic,sas@0/sd@1,0
Specify disk (enter its number): Specify disk (enter its number):

So on this system, / is currently mounted on target 0 (c0t0d0). We want to create a ZFS mirror onto target 1 (c0t1d0) and be able to boot from either drive.
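
If you want to pull the same facts out of the mount output in a script, something like the following might do. It is only a sketch: root_device and disk_target are made-up helper names, and the parsing assumes the Solaris-style "mountpoint on device options" layout and the cXtYdZsN naming shown above.

```shell
# Print the device backing / given Solaris-style `mount` output on stdin.
# root_device is a hypothetical helper name.
root_device() {
    awk '$1 == "/" && $2 == "on" { print $3; exit }'
}

# Print the target number (the Y in cXtYdZsN) of a disk or slice name.
# disk_target is likewise hypothetical.
disk_target() {
    echo "$1" | sed -n 's/^.*c[0-9][0-9]*t\([0-9][0-9]*\)d[0-9][0-9]*.*$/\1/p'
}

# Usage on a live system:
#   dev=`mount | root_device`      # e.g. /dev/dsk/c0t0d0s0
#   disk_target "$dev"             # e.g. 0
```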

Clear the current partition table on target 1, and create partition 0 covering the whole disk. Label the disk as SMI (i.e. a traditional Solaris VTOC); EFI labels are not supported for boot disks.

format -e
Specify disk (enter its number): 1
selecting c0t1d0
format> p
(etc)
format> label
[0] SMI Label
[1] EFI Label
Specify Label type[1]: 0
format>

Create the new ZFS pool:

root@foo # zpool create rpool c0t1d0s0

Create the ZFS boot environment:

root@foo # lucreate -n zfsBE -p rpool
Analyzing system configuration.
...
Populating contents of mount point .
Copying.

Check the results:

root@foo # lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
be0                        yes      yes    yes       no     -
be3                        yes      no     no        yes    -
zfsBE                      yes      no     no        yes    -

Activate the environment and reboot:

root@foo # luactivate zfsBE
A Live Upgrade Sync operation will be performed on startup of boot environment .
**********************************************************************
The target boot environment has been activated. It will be used when you
reboot. NOTE: You MUST NOT USE the reboot, halt, or uadmin commands. You
MUST USE either the init or the shutdown command when you reboot. If you
do not use either init or shutdown, the system will not boot using the
target BE.
**********************************************************************

In case of a failure while booting to the target BE, the following process
needs to be followed to fallback to the currently working boot environment:

1. Enter the PROM monitor (ok prompt).

2. Change the boot device back to the original boot environment by typing:

     setenv boot-device /pci@0/pci@0/pci@2/LSILogic,sas@0/disk@0,0:a

3. Boot to the original boot environment by typing:

     boot

**********************************************************************

Modifying boot archive service
Activation of boot environment  successful.

root@foo # init 6

After rebooting, you should see your new ZFS root environment:

root@foo # zfs list
NAME               USED  AVAIL  REFER  MOUNTPOINT
rpool             16.1G   258G    98K  /rpool
rpool/ROOT        10.0G   258G    21K  /rpool/ROOT
rpool/ROOT/zfsBE  10.0G   258G  10.0G  /
rpool/dump        2.00G   258G  2.00G  -
rpool/swap        4.01G   262G    16K  -

root@foo # zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c0t1d0s0  ONLINE       0     0     0

errors: No known data errors

As you can see, rpool only contains one hard disk (c0t1d0s0). Delete the old boot environments, otherwise format will complain:

root@foo # lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
be0                        yes      no     no        yes    -
be3                        yes      no     no        yes    -
zfsBE                      yes      yes    yes       no     -
root@foo # ludelete be0
...
root@foo # ludelete be3
Determining the devices to be marked free.
Updating boot environment configuration database.
Updating boot environment description database on all BEs.
Updating all boot environment configuration databases.
Boot environment  deleted.
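
With more than a couple of old boot environments, the list of deletable ones can be read out of lustatus rather than typed by hand. A minimal sketch, assuming the column layout shown above (fifth column is "Can Delete") and a POSIX awk such as nawk; deletable_bes is a hypothetical helper name.

```shell
# List boot environments that lustatus reports as deletable.
# Reads `lustatus` output on stdin; deletable_bes is a hypothetical helper.
deletable_bes() {
    awk '
        /^-+/ { in_table = 1; next }           # dashed rule ends the header
        in_table && $5 == "yes" { print $1 }   # column 5 is "Can Delete"
    '
}

# Dry run on a live system (prints the commands instead of running them):
#   lustatus | deletable_bes | while read be; do echo ludelete "$be"; done
```

Drop the echo once the printed list looks right.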

Repartition and mirror onto the original hard disk. Notice that the current ZFS root disk is the first argument in the attach command, and the old boot disk is the second argument. Use -f when it complains about an existing UFS filesystem:

root@foo # format -e
(repartition, relabel as SMI, etc)

root@foo # zpool status
  pool: rpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        rpool       ONLINE       0     0     0
          c0t1d0s0  ONLINE       0     0     0

errors: No known data errors

root@foo # zpool attach -f rpool c0t1d0s0 c0t0d0s0
Please be sure to invoke installboot(1M) to make 'c0t0d0s0' bootable.
Make sure to wait until resilver is done before rebooting.

root@foo # zpool status rpool
  pool: rpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
 scrub: resilver in progress for 0h1m, 11.50% done, 0h11m to go
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t1d0s0  ONLINE       0     0     0
            c0t0d0s0  ONLINE       0     0     0  1.38G resilvered

errors: No known data errors

root@foo # zpool status rpool
...
 scrub: resilver completed after 0h11m with 0 errors on Fri Feb 18 14:31:09 2011
...
            c0t0d0s0  ONLINE       0     0     0  12.0G resilvered
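
Rather than re-running zpool status by hand, the wait can be scripted. This is a sketch only: resilver_done is a made-up helper, and it simply looks for the "resilver in progress" and "resilver completed" phrases seen in the output above, which may differ on other ZFS versions.

```shell
# Succeed (exit 0) if the `zpool status` text on stdin reports a finished
# resilver and no resilver still in progress.
# resilver_done is a hypothetical helper name.
resilver_done() {
    awk '
        /resilver in progress/ { busy = 1 }
        /resilver completed/   { finished = 1 }
        END { if (finished && !busy) exit 0; exit 1 }
    '
}

# Usage on a live system: poll once a minute until the mirror is in sync.
#   until zpool status rpool | resilver_done; do sleep 60; done
```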

Install the boot block on target 0 so that it is bootable, then confirm by booting from each disk in turn:

root@foo # installboot /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t0d0s0
root@foo # init 0
...
svc.startd: 104 system services are now being stopped.
...
{0} ok devalias
...
disk1                    /pci@0/pci@0/pci@2/@/disk@1
disk0                    /pci@0/pci@0/pci@2/@/disk@0
disk                     /pci@0/pci@0/pci@2/@/disk@0
...

{0} ok boot disk0
Boot device: /pci@0/pci@0/pci@2/@/disk@0  File and args:
SunOS Release 5.10 Version Generic_142900-11 64-bit
...
root@foo # init 0
...
svc.startd: 104 system services are now being stopped.
...
{0} ok boot disk1
...
Boot device: /pci@0/pci@0/pci@2/@/disk@1  File and args:
SunOS Release 5.10 Version Generic_142900-11 64-bit
...

Resize Terminal in Solaris

Sometimes when connecting to a Solaris box via multiple hops (e.g. Citrix to PuTTY to jump host to target host), the terminal can get confused about the screen size and scramble the output. Use this command to update the environment variables with the current terminal size:

eval `resize`

or if $PATH is munged:

eval `/usr/openwin/bin/resize`
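
If resize is not installed at all, a similar effect can be had from stty, assuming the local stty accepts the size argument (not every implementation does, so treat that as an assumption). The helper size_to_env is hypothetical; it turns "rows cols" into the same style of assignments that resize prints.

```shell
# Convert "rows cols" (the format printed by `stty size`, where supported)
# into COLUMNS/LINES assignments suitable for eval.
# size_to_env is a hypothetical helper name.
size_to_env() {
    set -- $1                     # split "24 80" into $1=rows, $2=cols
    printf 'COLUMNS=%s;\nLINES=%s;\nexport COLUMNS LINES;\n' "$2" "$1"
}

# Usage on a live terminal:
#   eval "`size_to_env \"\`stty size\`\"`"
```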

Thanks to PeterK.

Solaris – increase tmpfs /tmp on the fly

A useful script from BigAdmin for increasing the size of a tmpfs /tmp without rebooting. See also SoftPanorama, Talking about RAM disks in the Solaris OS.

The RAID5 Write Hole

The latest edition of the venerable UNIX and Linux System Administration Handbook (Nemeth et al) has a good section discussing the “RAID5 Write Hole”:

Finally, RAID 5 is vulnerable to corruption in certain circumstances. Its incremental updating of parity data is more efficient than reading the entire stripe and recalculating the stripe’s parity based on the original data. On the other hand, it means that at no point is parity data ever validated or recalculated. If any block in a stripe should fall out of sync with the parity block, that fact will never become evident in normal use; reads of the data blocks will still return the correct data.

Only when a disk fails does the problem become apparent. The parity block will likely have been rewritten many times since the occurrence of the original desynchronization. Therefore, the reconstructed data block on the replacement disk will consist of essentially random data.
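
To make the failure mode concrete, here is a toy model of my own (not from the book): two data blocks and one XOR parity block, each reduced to a single small integer. It is a sketch of the arithmetic only, not of how any real controller lays out stripes, and it assumes a shell with POSIX arithmetic expansion.

```shell
# Two data blocks and their XOR parity, modelled as small integers.
d1=170                          # 0xAA
d2=85                           # 0x55
parity=$(( d1 ^ d2 ))           # 0xFF = 255

# A crash lands between the data write and the parity write:
# d1 changes on disk, but parity keeps its old value.
d1=204                          # 0xCC

# Reads of d1 and d2 still look fine, so nothing notices the stale parity.
# Later the disk holding d2 fails, and d2 is rebuilt from d1 and parity.
rebuilt_d2=$(( d1 ^ parity ))   # 0xCC ^ 0xFF = 0x33 = 51

echo "original d2=$d2 rebuilt d2=$rebuilt_d2"
```

The rebuilt block (51) no longer matches what was stored (85), which is the silent corruption the quoted passage describes.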

There is further reading in the BAARF archive (Battle Against Any Raid 5), including arguments for choosing RAID10 or RAID3 over RAID5. And then there’s ZFS and RAID-Z.
