Archive for the ‘Solaris Tips’ Category

Xsun desktop within Solaris zones

Friday, April 15th, 2005

Well, tonight I got my desktop working in a Solaris zone (Sun Java Desktop, using Xsun). My system is an Ultra 60 with a Creator3D card.

Below is how I got it working:

First of all, on the host operating system (the global zone), run ‘/usr/dt/bin/dtconfig -d’ to disable the main X server, then reboot the machine.

Next, I made a zone:
# zonecfg -z bluto-desktop
bluto-desktop: No such zone configured
Use ‘create’ to begin configuring a new zone.
zonecfg:bluto-desktop> create
zonecfg:bluto-desktop> set zonepath=/opt/zones/bluto-desktop # the path to the zone
zonecfg:bluto-desktop> add net
zonecfg:bluto-desktop:net> set physical=hme0 # my network
zonecfg:bluto-desktop:net> set address=192.168.0.12
zonecfg:bluto-desktop:net> end
zonecfg:bluto-desktop> add device
zonecfg:bluto-desktop:device> set match=/dev/mouse # mouse device
zonecfg:bluto-desktop:device> end
zonecfg:bluto-desktop> add device
zonecfg:bluto-desktop:device> set match=/dev/kbd # keyboard device
zonecfg:bluto-desktop:device> end
zonecfg:bluto-desktop> add device
zonecfg:bluto-desktop:device> set match=/dev/pm # power management
zonecfg:bluto-desktop:device> end
zonecfg:bluto-desktop> add device
zonecfg:bluto-desktop:device> set match=/dev/winlock # window lock device
zonecfg:bluto-desktop:device> end
zonecfg:bluto-desktop> add device
zonecfg:bluto-desktop:device> set match=/dev/sound/0 # sound
zonecfg:bluto-desktop:device> end
zonecfg:bluto-desktop> add device
zonecfg:bluto-desktop:device> set match=/dev/sound/0ctl # sound control
zonecfg:bluto-desktop:device> end
zonecfg:bluto-desktop> add device
zonecfg:bluto-desktop:device> set match=/dev/fbs/ffb0 # framebuffer
zonecfg:bluto-desktop:device> end
zonecfg:bluto-desktop> verify
zonecfg:bluto-desktop> commit
zonecfg:bluto-desktop> exit
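
The interactive session above could also be scripted. Here is a minimal sketch, assuming a POSIX-style shell such as ksh or bash. The helper name gen_desktop_zonecfg and the /tmp file name are made up for illustration and are not part of Solaris. The function only prints the zonecfg subcommands, so the result can be reviewed before it is fed to zonecfg with -f.

```shell
#!/bin/sh
# gen_desktop_zonecfg is a hypothetical helper, not a Solaris tool.
# Usage: gen_desktop_zonecfg ZONEPATH NIC ADDRESS DEVICE...
gen_desktop_zonecfg() {
    zonepath=$1
    nic=$2
    addr=$3
    shift 3

    echo "create"
    echo "set zonepath=$zonepath"
    echo "add net"
    echo "set physical=$nic"
    echo "set address=$addr"
    echo "end"

    # One "add device" stanza per device path given on the command line.
    for dev in "$@"; do
        echo "add device"
        echo "set match=$dev"
        echo "end"
    done

    echo "verify"
    echo "commit"
}

# Example (run in the global zone; the /tmp file name is arbitrary):
#   gen_desktop_zonecfg /opt/zones/bluto-desktop hme0 192.168.0.12 \
#       /dev/mouse /dev/kbd /dev/pm /dev/winlock \
#       /dev/sound/0 /dev/sound/0ctl /dev/fbs/ffb0 > /tmp/bluto-desktop.cfg
#   zonecfg -z bluto-desktop -f /tmp/bluto-desktop.cfg
```

Keeping the device list as arguments makes it easy to try the zone with a different framebuffer or without sound.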

Next, I installed the zone:

# zoneadm -z bluto-desktop install
Preparing to install zone <bluto-desktop>.
Creating list of files to copy from the global zone.
Copying <2583> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <911> packages on the zone.
Initialized <911> packages on zone.
Zone <bluto-desktop> is initialized.
The file contains a log of the zone installation.

Boot the zone and set up the Solaris install in it.

# zoneadm -z bluto-desktop boot

Once the initial system setup is done, halt the zone.

# zoneadm -z bluto-desktop halt
(or run init 0 inside the zone)

Now, we need to make some “fake” devices to make the X server and sound work.

# cd /opt/zones/bluto-desktop/dev
# ln -s fbs/ffb0 fb
# ln -s sound/0 audio
# ln -s sound/0ctl audioctl
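
The same links can be made by a small re-runnable script. This is only a sketch: make_dev_links is a made-up name, and it assumes the intent is for fb, audio and audioctl to be the link names pointing at the real device entries configured in zonecfg.

```shell
#!/bin/sh
# make_dev_links is a hypothetical helper.
# Usage: make_dev_links ZONEPATH/dev
make_dev_links() {
    devdir=$1
    (
        cd "$devdir" || exit 1
        # Each input line is "target linkname", relative to the dev directory.
        while read -r target link; do
            # Skip names that already exist so the script can be re-run.
            if [ -e "$link" ] || [ -h "$link" ]; then
                continue
            fi
            ln -s "$target" "$link"
        done <<EOF
fbs/ffb0 fb
sound/0 audio
sound/0ctl audioctl
EOF
    )
}

# Example, with the zone halted:
#   make_dev_links /opt/zones/bluto-desktop/dev
```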

Now, boot the zone back up
# zoneadm -z bluto-desktop boot

Enable DT:
zone# /usr/dt/bin/dtconfig -e;init 6

Once the zone is rebooted, you should get the dtgreet login screen.

The zonecfg match entries above have to name the /dev entry that points directly at the device. This is because the zone tools create each /dev entry in the zone with the major and minor numbers of the corresponding /devices entry in the global zone. Confused? Good. All this means is that if a match= value in the zonecfg config is a symlink to another file in /dev, it is not going to work.
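
One way to check a path before putting it in a match= line is to look at where its symlink goes. The sketch below is an interpretation of the rule above, not anything the zone tools provide: classify_dev_path is a made-up helper that reports whether a path is a link into /devices (usable, by this rule), a link to something else such as another /dev entry (not usable), or not a link at all. It parses ls -l output so that it does not depend on a readlink command being installed.

```shell
#!/bin/sh
# classify_dev_path is a hypothetical helper.
# Prints one of: devices-link, dev-link, not-a-link
classify_dev_path() {
    path=$1
    if [ ! -h "$path" ]; then
        echo "not-a-link"
        return 0
    fi
    # Take everything after " -> " in the long listing as the link target.
    target=$(ls -l "$path" | sed 's/.* -> //')
    case $target in
        */devices/*) echo "devices-link" ;;
        *)           echo "dev-link" ;;
    esac
}

# Example, in the global zone:
#   classify_dev_path /dev/fb
#   classify_dev_path /dev/fbs/ffb0
```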

This breaks things like /dev/fb, which Xsun and DT need in order to work. That is why the links have to be made by hand in the zonepath/dev directory, as shown above.

Issues:
Issue number one is that once you start and stop the desktop zone, the text console of the system is no longer usable. I think this is because the keyboard device is still being grabbed, even though the tty device has it.

All in all, this seems to work reasonably well. I have often had to reboot my workstation because of an X Window issue or something similar. With this, I can just reboot the zone, which is much quicker. It will also allow me to limit memory and CPU utilization.

Solaris 10 zones info page

Wednesday, April 13th, 2005

Found this page (http://users.tpg.com.au/adsln4yb/zones.html) with some cool info and very cool scripts for Solaris 10: CPU and memory caps in zones, a script to control the FSS (fair share scheduler), and other goodies.

Check it out!

How to rescue an A3500 LUN

Sunday, April 10th, 2005

Well, today we had a striped volume on our A3500 die. This volume, along with another volume, makes up a 300 GB Veritas volume. One of the A3500 disks died, and it happened to be in one of these LUNs. When recovering, rm6 did not come up, so all I had was the command line. Goodie!

First healthcheck:
# /etc/raid/bin/healthck -a

Health Check Summary Information

a3500_upper: LUN - Hot Spare In Use at Drive [4,0]
a3500_lower: Dead LUN at Drive [5,11]

As you see, we have a dead LUN, and a hot spare. My worry is the dead LUN.
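
If healthck is run from cron or over many modules, the dead LUN lines can be pulled out with sed. A small sketch, assuming the summary lines always look like the two shown above; dead_luns is a made-up function name.

```shell
#!/bin/sh
# dead_luns is a hypothetical helper.
# Reads a health check summary on stdin and prints "module channel,id"
# for every "Dead LUN at Drive [c,i]" line.
dead_luns() {
    sed -n 's/^\([^:]*\): Dead LUN at Drive \[\([0-9]*,[0-9]*\)].*/\1 \2/p'
}

# Example:
#   /etc/raid/bin/healthck -a | dead_luns
```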

Now to find my LUNs

# /etc/raid/bin/raidutil -c c13t4d0 -i
LUNs found on c13t4d0.
LUN 0 RAID 5 138771 MB
LUN 2 RAID 5 138771 MB
LUN 4 RAID 1 34692 MB
LUN 5 RAID 0 138771 MB

Vendor ID Symbios
ProductID StorEdgeA3500FCd
Product Revision 0301
Boot Level 03.01.04.00
Boot Level Date 04/05/01
Firmware Level 03.01.04.75
Firmware Date 04/11/02
Fibre Level 03.01.04.75
raidutil succeeded!

Now, LUN 5 is my striped LUN.

Now to look at my disks
# /etc/raid/bin/drivutil -I c13t4d0

Group Information for a3500_lower

Group       No. of   RAID    No. of   Total       Remaining
            LUNs     Level   Drives   Space(MB)   Space(MB)

Hot Spare   -        -       2        -           -
1           1        5       5        138771      0
2           1        5       5        138771      0
3           1        5       5        138771      0
4           1        5       5        138771      0
5           1        1       2        34692       0
6           1        0       4        138771      0

I have two hot spare disks, which could come in handy.
Raid group 6 is my striped group, contains 4 disks.

# /etc/raid/bin/drivutil -d c13t4d0

Drives in Group for a3500_lower

Group Drive List [Channel,Id]

Hot Spare [4,8]; [5,8];
Group 1: [1,0]; [2,0]; [3,0]; [4,0]; [5,0];
Group 2: [1,1]; [2,1]; [3,1]; [4,1]; [5,1];
Group 3: [1,2]; [2,2]; [3,2]; [4,2]; [5,2];
Group 4: [1,3]; [2,3]; [3,3]; [4,3]; [5,3];
Group 5: [4,9]; [5,9];
Group 6: [4,10]; [5,10]; [4,11]; [5,11];

Group 6 has those 4 disks (including the dead one at [5,11]), and my two hot spare disks are at [4,8] and [5,8].

First, get rid of the hot spare at [4,8] (drive 48) so that the disk can be reused:

# /etc/raid/bin/raidutil -c c13t4d0 -H 48
LUNs found on c13t4d0.
LUN 0 RAID 5 138771 MB
LUN 2 RAID 5 138771 MB
LUN 4 RAID 1 34692 MB
LUN 5 RAID 0 138771 MB

raidutil succeeded!

Now, delete the “bad” LUN 5:
# /etc/raid/bin/raidutil -c c13t4d0 -D 5
LUNs found on c13t4d0.
LUN 0 RAID 5 138771 MB
LUN 2 RAID 5 138771 MB
LUN 4 RAID 1 34692 MB
LUN 5 RAID 0 138771 MB
Deleting LUN 5.
Press Control C to abort.

LUNs successfully deleted

Now remake my striped LUN, using the hot spare in place of the bad disk. Keeping the disks in their original order could reduce data loss:
# /etc/raid/bin/raidutil -c c13t4d0 -n 5 -l 0 -s 138771 -g 410,510,411,48
LUNs found on c13t4d0.
LUN 0 RAID 5 138771 MB
LUN 2 RAID 5 138771 MB
LUN 4 RAID 1 34692 MB
Capacity available in drive group: 284204032 blocks (138771 MB).
Creating LUN 5

Registering new logical unit 5 with system.
Formatting logical unit 5 RAID 0 138771 MB
Formatting logical unit 5 RAID 0 138771 MB
LUNs found on c13t4d0.
LUN 0 RAID 5 138771 MB
LUN 2 RAID 5 138771 MB
LUN 4 RAID 1 34692 MB
LUN 5 RAID 0 138771 MB

LUNs successfully created

raidutil succeeded!
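
Judging from the commands above, the drive numbers given to -g and -H are just the channel and ID from drivutil run together, so [4,10] becomes 410 and [4,8] becomes 48. That reading is inferred from this session rather than taken from the raidutil documentation. A small sketch that builds the list, assuming a POSIX-style shell such as ksh or bash; drive_list is a made-up helper.

```shell
#!/bin/sh
# drive_list is a hypothetical helper.
# Usage: drive_list CHANNEL,ID [CHANNEL,ID ...]
# Prints the drives as a comma-separated list with channel and ID joined.
drive_list() {
    list=""
    for pair in "$@"; do
        chan=${pair%%,*}    # text before the comma
        id=${pair#*,}       # text after the comma
        list="${list:+$list,}${chan}${id}"
    done
    echo "$list"
}

# Example: the three surviving disks of group 6, then the freed hot spare.
#   drive_list 4,10 5,10 4,11 4,8
```

Listing the replacement disk last keeps the three surviving disks in the same stripe positions they had before.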

Now for Veritas. In vxdiskadm, remove the failed disk, then replace it.

Now, vxprint shows:
# vxprint -ht -g sas_dg
dg sas_dg default default 32000 1092403203.1591.server

dm saspool1-1 c11t4d8s2 sliced 3839 142082048 NOHOTUSE
dm saspool1-2 c13t4d5s2 sliced 4287 281001216 NOHOTUSE

v saspool1-lv - DISABLED ACTIVE 423077888 SELECT - fsgen
pl saspool1-lv-01 saspool1-lv DISABLED RECOVER 423078976 CONCAT - RW
sd saspool1-1-01 saspool1-lv-01 saspool1-1 0 142082048 0 c11t4d8 ENA
sd saspool1-2-01 saspool1-lv-01 saspool1-2 0 280996928 142082048 c13t4d5 ENA
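
For a sense of scale, the subdisk lengths can be converted to something more readable. A rough sketch, assuming vxprint is reporting lengths in 512-byte sectors; sectors_to_gib is a made-up helper and it rounds down.

```shell
#!/bin/sh
# sectors_to_gib is a hypothetical helper.
# Assumes 512-byte sectors: 2097152 sectors make one GiB.
sectors_to_gib() {
    echo $(( $1 / 2097152 ))
}

# Example:
#   sectors_to_gib 142082048    # saspool1-1-01
#   sectors_to_gib 280996928    # saspool1-2-01
```

By this arithmetic the first subdisk is about 67 GiB and the second about 133 GiB, so roughly the first third of the concatenated plex sits on c11t4d8 and the rest on c13t4d5.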

Well, the bad LUN only held the second part of the concatenated plex, so I could get some data back. Now to recover the plex:

# vxmend -o force off saspool1-lv-01
# vxmend on saspool1-lv-01
# vxmend fix clean saspool1-lv-01
# vxvol -g sas_dg start saspool1-lv
# vxprint -ht -g sas_dg
dg sas_dg default default 32000 1092403203.1591.server

dm saspool1-1 c11t4d8s2 sliced 3839 142082048 NOHOTUSE
dm saspool1-2 c13t4d5s2 sliced 4287 281001216 NOHOTUSE

v saspool1-lv - ENABLED ACTIVE 423077888 SELECT - fsgen
pl saspool1-lv-01 saspool1-lv ENABLED ACTIVE 423078976 CONCAT - RW
sd saspool1-1-01 saspool1-lv-01 saspool1-1 0 142082048 0 c11t4d8 ENA
sd saspool1-2-01 saspool1-lv-01 saspool1-2 0 280996928 142082048 c13t4d5 ENA

The above vxmend steps also work great if you have lost a SAN disk and then brought it back. They make a bad plex look good again.
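
Since these steps force the state of a plex, it can help to print them for review before running anything. A minimal sketch: plex_recover_cmds is a made-up helper that only echoes the four commands for a given disk group, plex and volume. Passing -g to vxmend as well as vxvol is an assumption added here; the session above ran vxmend without it.

```shell
#!/bin/sh
# plex_recover_cmds is a hypothetical helper; it prints commands, it does not run them.
# Usage: plex_recover_cmds DISKGROUP PLEX VOLUME
plex_recover_cmds() {
    dg=$1
    plex=$2
    vol=$3
    echo "vxmend -g $dg -o force off $plex"
    echo "vxmend -g $dg on $plex"
    echo "vxmend -g $dg fix clean $plex"
    echo "vxvol -g $dg start $vol"
}

# Example:
#   plex_recover_cmds sas_dg saspool1-lv-01 saspool1-lv
```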

fsck moved about 60 GB off to lost+found, but overall that is not bad, seeing as we could have lost much more data.

Solaris 10 SunPCI II tricks

Friday, February 4th, 2005

Ok, here is what I had to do to get my SunPCI II card working with Solaris 10 at work:
1. Install SUNWspci2 v2.3.2
2. Install patch 113616-06
3. In /opt/SUNWspci2/drivers/solaris, link sunpcidrv.2100 -> sunpcidrv.290
4. Link sunpcidrv.2100.64 -> sunpcidrv.290.64
5. Run sunpcload to load the kernel modules
6. Boot up the sunpci card as you normally would, life is good again.
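
Steps 3 and 4 might look like this as a script. It is a sketch only: link_sunpci_drivers is a made-up name, and it assumes that the .2100 names are the new links, that they point at the existing .290 driver files, and that the 64-bit name follows the same dotted pattern as the 32-bit one.

```shell
#!/bin/sh
# link_sunpci_drivers is a hypothetical helper.
# Usage: link_sunpci_drivers [DRIVER_DIR]
link_sunpci_drivers() {
    dir=${1:-/opt/SUNWspci2/drivers/solaris}
    (
        cd "$dir" || exit 1
        # ln -s TARGET LINKNAME: the .2100 names become links to the .290 files.
        ln -s sunpcidrv.290 sunpcidrv.2100 &&
        ln -s sunpcidrv.290.64 sunpcidrv.2100.64
    )
}

# Example, as root:
#   link_sunpci_drivers
```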

SunPCI2 should now work, with one exception that I found:
I ran across a problem where the passthrough Ethernet driver would cause my Sun to panic on a bad trap. To fix this, I booted Windows 2000 in Safe Mode (its version of single-user mode) and disabled the Sun NDIS device in Device Manager. The SiS network interface on the SunPCI card still functions.

Veritas Snapshots

Monday, January 24th, 2005

Well, this is the first entry in my UNIX tips section.
I have been playing with Veritas snapshots to build an off-host backup solution for several large databases. The situation is that I have a production-quality disk array which the database uses, and a cheap disk array that houses the backup data. The process uses the pre-4.0 vxassist snapshot method, as I have run into issues with the 4.0 vxsnap process. The snapshots maintain a DCO log which tracks changes. The 4.0 method uses version 20 of this log, which I do not think is up to par: I ran into problems with data being written when it should not have been. The pre-4.0 snapshot method uses a version 0 DCO log, with which I have not had this problem.
In this setup, we have prod_dg with proddisk1 as a disk. The volume prod_lv contains the production database. We will create a snapshot, prod_slv, which will end up in another disk group, snap_dg, using snapdisk1. The disk groups and the volume prod_lv are already created. The disk snapdisk1 is mapped to both the production database server and the backup server, while proddisk1 is only seen by the production server. Both hosts require a Veritas Volume Manager key, as well as a Veritas FlashSnap key.

Now the goodies.
1. Add a DCO log to the production volume. This is a bitmap type log that tracks changes to the volume.
vxassist -g prod_dg addlog prod_lv logtype=dco

2. Enable fast resync on the production volume. This sets a flag so that a snapshot resync only copies changed data.
vxvol -g prod_dg set fastresync=on prod_lv

3. Join the snapshot disk group to the production disk group.
vxdg join snap_dg prod_dg

4. Create a snapshot of prod_lv, on to snapdisk1. This will create a plex of prod_lv, as well as a disabled DCO log attached to that plex.
vxassist -b -g prod_dg snapstart prod_lv snapdisk1

5. Monitor the sync process.
vxtask list
or
vxprint -ht -g prod_dg | grep SNAPDONE

6. You now want to make prod_lv quiet.
Shut down the database, or put it into hot backup mode.

7. Perform a snapshot split. This will take the plex created by the snapstart, and make it an independent volume.
vxassist -g prod_dg snapshot prod_lv prod_slv

8. You can use prod_lv once again.
Start up the database, or take it out of hot backup mode.

9. Split the snapshot disk into its own disk group, and deport it.
vxdg split prod_dg snap_dg prod_slv
vxdg deport snap_dg

10. You can then import, mount and backup prod_slv on the off-host backup server.
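
The steps above could be wrapped in a script. Here is a minimal sketch using the same names as the example (prod_dg, prod_lv, prod_slv, snap_dg, snapdisk1). The run wrapper and the three function names are made up, and by default nothing is executed: each command is only printed until DRY_RUN is set to 0.

```shell
#!/bin/sh
# Hypothetical wrapper around the snapshot steps.
# DRY_RUN defaults to 1 (print only); set DRY_RUN=0 to really run the commands.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Steps 1 to 4: add the DCO log, enable fast resync, join, start the sync.
snap_prepare() {
    run vxassist -g prod_dg addlog prod_lv logtype=dco &&
    run vxvol -g prod_dg set fastresync=on prod_lv &&
    run vxdg join snap_dg prod_dg &&
    run vxassist -b -g prod_dg snapstart prod_lv snapdisk1
}

# Step 7: break the snapshot plex off. Quiet the database first (step 6).
snap_take() {
    run vxassist -g prod_dg snapshot prod_lv prod_slv
}

# Step 9: move the snapshot to its own disk group and deport it.
snap_detach() {
    run vxdg split prod_dg snap_dg prod_slv &&
    run vxdg deport snap_dg
}

# Example dry run:
#   snap_prepare
```

Keeping step 7 in its own function means the database can be released (step 8) right after snap_take, before the split and deport.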

To perform a snapshot refresh:
1. Import the snapshot disk group.
vxdg import snap_dg

2. Recover and start the snapshot volume.
vxrecover -g snap_dg -m prod_slv
vxvol -g snap_dg start prod_slv

3. Join the snapshot disk group to the production disk group.
vxdg join snap_dg prod_dg

4. Resync the snapshot volume from the production volume. Because of the DCO log, and fastresync, only changed blocks are copied.
vxassist -g prod_dg snapback prod_slv

5. Monitor the resync.
vxtask list
or
vxprint -ht -g prod_dg | grep SNAPDONE

6. Once resync is complete, make disks quiet.
Put Oracle into hot backup mode.

7. Perform a snapshot split.
vxassist -g prod_dg snapshot prod_lv prod_slv

8. You can once again use the disks.
Take Oracle out of hot backup mode.

9. Split and deport the snapshot disks.
vxdg split prod_dg snap_dg prod_slv
vxdg deport snap_dg
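
The "monitor" step in both lists can be automated by polling until a plex shows SNAPDONE. A rough sketch: wait_for_snapdone and the PRINT_CMD variable are made up for illustration, and the retry count and delay are arbitrary defaults. PRINT_CMD only exists so the loop can be tried against a stand-in command on a machine without Veritas.

```shell
#!/bin/sh
# wait_for_snapdone is a hypothetical helper.
# Usage: wait_for_snapdone DISKGROUP [TRIES] [DELAY_SECONDS]
# Returns 0 once SNAPDONE appears in the vxprint output, 1 if it never does.
wait_for_snapdone() {
    dg=$1
    tries=${2:-60}
    delay=${3:-30}
    i=0
    while [ "$i" -lt "$tries" ]; do
        # PRINT_CMD defaults to vxprint; override it only for testing.
        if ${PRINT_CMD:-vxprint} -ht -g "$dg" | grep SNAPDONE >/dev/null; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Example:
#   wait_for_snapdone prod_dg && echo "sync finished, quiet the database now"
```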

I do not support the above, but it works for me.