Tuesday, December 23, 2014

Setting up Logical Domains, part 2

In part 1, I talked about the server side of Logical Domains. This time, I'll cover how I set up a guest domain.

First, I have created a zfs pool called storage on the host, and I'm going to present a zvol (or maybe several) to the guests.
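
I won't cover creating the pool itself; it's just an ordinary zpool, built with something along these lines (the disk names here are illustrative and will differ on your system):

zpool create storage mirror c1t2d0 c1t3d0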

I'm going to create a little domain called ldom1.

ldm add-domain ldom1
ldm set-core 1 ldom1
ldm add-memory 4G ldom1
ldm add-vnet vnet1 primary-vsw0 ldom1


Now create and add a virtual disk. Create a dataset for the ldom and a 24GB volume inside it, add it to the virtual disk service, and set it as the boot device.

zfs create storage/ldom1
zfs create -V 24gb storage/ldom1/disk0
ldm add-vdsdev /dev/zvol/dsk/storage/ldom1/disk0 ldom1_disk0@primary-vds0
ldm add-vdisk disk0 ldom1_disk0@primary-vds0 ldom1
ldm set-var auto-boot\?=true ldom1
ldm set-var boot-device=disk0 ldom1


Then bind resources, list, and start:

ldm bind-domain ldom1
ldm list-domain ldom1
ldm start-domain ldom1


You can connect to the console by looking at the number under CONS in the list-domain output:

# ldm list-domain ldom1
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
ldom1            bound      ------  5000    8     4G
 

# telnet localhost 5000
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Connecting to console "ldom1" in group "ldom1" ....
Press ~? for control options ..


T5140, No Keyboard
Copyright (c) 1998, 2014, Oracle and/or its affiliates. All rights reserved.
OpenBoot 4.33.6.e, 4096 MB memory available, Serial #83586105.
Ethernet address 0:14:4f:fb:6c:39, Host ID: 84fb6c39.



Boot device: disk0  File and args:
Bad magic number in disk label
ERROR: /virtual-devices@100/channel-devices@200/disk@0: Can't open disk label package

ERROR: boot-read fail

Evaluating:

Can't open boot device

{0} ok


If you attempt to boot it off the net, it says:

Requesting Internet Address for 0:14:4f:f9:87:f9

which doesn't match the MAC address in the console output. So if you want to jumpstart the box (and that's easy enough), you need to do a 'boot net' first to get the actual MAC address of the LDOM, so that you can add it to your jumpstart server.
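
For example (the install path and client name here are illustrative, and you'll need the usual hosts/ethers setup as well), you'd note the MAC reported at the ok prompt, then register it on the jumpstart server using add_install_client from the install image's Tools directory:

{0} ok boot net

and then, on the jumpstart server:

cd /export/install/sol10/Solaris_10/Tools
./add_install_client -e 0:14:4f:f9:87:f9 ldom1 sun4v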

To add an iso image to boot from:

ldm add-vdsdev options=ro /var/tmp/img.iso iso@primary-vds0
ldm add-vdisk iso iso@primary-vds0 ldom1


At the ok prompt, you can issue the 'show-disks' command to see what disk devices are present. To boot off the iso:

boot /virtual-devices@100/channel-devices@200/disk@1:f

And it should work. This is how I've been testing the Tribblix images for SPARC, by the way.

Setting up Logical Domains, part 1

I've recently been playing with Logical Domains (aka LDOMs, aka Oracle VM Server for SPARC). For those unfamiliar with the technology, it's a virtualization framework built into the hardware of pretty well all current SPARC systems, more akin to VMware than Solaris zones.

For more information, see here, here, or here.

First, why use it? Especially when Solaris has zones. The answer is that it addresses a different set of problems. Individual LDOMs are more independent and much more isolated than zones. You can partition resources more cleanly, and different LDOMs don't have to be at the same patch level (to my mind, what matters isn't so much that each LDOM can be at a different patch level, but that you can do maintenance on each LDOM to a different schedule). One key advantage I find is that the virtual switch you set up with LDOMs is much better at dealing with complex network configuration (I have hosts scattered across maybe dozens of VLANs, and trying to fake that up on Solaris 10 is a bit of a bind). And some applications don't really get on with zones - I would build new systems around zones, but ill-understood and poorly documented legacy systems might be easier to drop inside an LDOM.
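
As an illustration (the VLAN IDs here are made up), the virtual switch and virtual network devices can carry tagged VLANs directly, so a guest can sit on several VLANs through a single vnet:

ldm add-vnet pvid=1 vid=12,13 vnet1 primary-vsw0 ldom1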

That dealt with, here's how I set up one of my machines (a T5140, as practice for live deployment on a bunch of replacement T4-1 systems) as an LDOM host. I'll cover setting up the guest side in a second post.

Make sure the firmware is current - these are the minimum revs:

T2 - 7.4.5
T3 - 8.3
T4 - 8.4.2c
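
To check what you currently have, on a system with an ILOM service processor something like this, run from the SP, should show the installed system firmware version (the exact property name may vary by SP firmware release):

show /HOST sysfw_version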


Then install the LDOM software.

cd /var/tmp
unzip p17291713_31_SOLARIS64.zip
cd OVM_Server_SPARC-3_1/Install
./install-ldm


You'll get asked if you want to launch the configuration assistant after installation. I chose n; you can run ldmconfig at any later time if you want (though it's best not to).

Now you need to apply the LDOM patch:

svcadm disable -s ldmd
patchadd 150817-02
svcadm enable ldmd


Verify things are working as expected:

# ldm list
NAME             STATE      FLAGS   CONS    VCPU  MEMORY   UTIL  UPTIME
primary          active     -n-c--  SP      128   32544M   0.1%  16m


You should see the primary domain.

The next step is to establish default services and configure the control domain.

The necessary services are:

Virtual console
Virtual disk server
Virtual switch service

ldm add-vds primary-vds0 primary
ldm add-vcc port-range=5000-5100 primary-vcc0 primary
ldm add-vsw net-dev=nxge0 primary-vsw0 primary


Verify with:

ldm list-services primary

Next, we need to restrict the control domain to a limited set of resources. The way I do this (this is just a personal view) is, on a system with N cores, to define N units, each with 1 core and 1/N of the total memory. Assign one of those units to the primary domain, and then build the guest domains from 1 or more of those units (sizing as necessary - note that you can resize them on the fly, so you don't have to get it perfect first time around). You can get very detailed and start allocating down to individual threads and specific amounts of memory, but it's so much better to just keep it simple.
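
As an example of resizing on the fly (the domain name and sizes here are purely illustrative), you can simply adjust a guest later with something like:

ldm set-core 2 ldom1
ldm set-memory 8G ldom1

(Memory changes on a running domain need it to support dynamic reconfiguration; otherwise they take effect at the next reboot.)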

For a T2/T2+/T3 you need to futz with crypto MAUs. This is unnecessary on later systems.

To show:

ldm list -o crypto primary

To add:

ldm set-mau 1 primary

To set the CPU resources of the control domain:

ldm set-core 1 primary

(I want to just set 1 core; allocation is best done by cores, not threads.)

Start the reconfig:

ldm start-reconf primary

Fix the memory:

ldm set-memory 4G primary

Save the config to the SP:

ldm add-config initial

Verify the config has been saved:

ldm list-config

Reboot to activate:

shutdown -y -g0 -i6

OK, so that creates a 1-core (8-thread) 4G control domain. And that all seems to work.

The next steps are to configure networking and enable terminal services. From the console (as you're reconfiguring the primary network):

ifconfig vsw0 plumb
ifconfig nxge0 down unplumb
ifconfig nxge0 inet6 down unplumb
ifconfig vsw0 inet 172.18.1.128 netmask 255.255.255.0 broadcast 172.18.1.255 up
ifconfig vsw0 inet6 plumb up
mv /etc/hostname.nxge0 /etc/hostname.vsw0
mv /etc/hostname6.nxge0 /etc/hostname6.vsw0
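
The terminal services part is the virtual network terminal server daemon, vntsd, which serves the guest consoles on the ports handled by primary-vcc0. If the installer hasn't already enabled it, you'd turn it on with:

svcadm enable vntsd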


For a T4, replace nxge with igb.
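
So, for instance, the virtual switch line from earlier would become something like:

ldm add-vsw net-dev=igb0 primary-vsw0 primary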

At this point, you have a machine with minimal resources assigned to the primary domain, which looks for all the world like a regular Solaris box, ready to create guest domains using the remaining resources.