2009 04.s10-admin-topics1
1. Solaris 10 Administration Topics Workshop
1- Administration
By Peter Baer Galvin
For Usenix
Last Revision Apr 2009
Copyright 2009 Peter Baer Galvin - All Rights Reserved
Saturday, May 2, 2009
2. About the Speaker
Peter Baer Galvin - 781 273 4100
pbg@cptech.com
www.cptech.com
peter@galvin.info
My Blog: www.galvin.info
Bio
Peter Baer Galvin is the Chief Technologist for Corporate Technologies, Inc., a leading
systems integrator and VAR, and was the Systems Manager for Brown University's
Computer Science Department. He has written articles for Byte and other magazines. He
was contributing editor of the Solaris Corner for SysAdmin Magazine, wrote Pete's
Wicked World, the security column for SunWorld magazine, and Pete’s Super Systems, the
systems administration column there. He is now Sun columnist for the Usenix ;login:
magazine. Peter is co-author of the Operating Systems Concepts and Applied Operating
Systems Concepts textbooks. As a consultant and trainer, Mr. Galvin has taught tutorials
in security and system administration and given talks at many conferences and
institutions.
3. Objectives
Cover a wide variety of topics in Solaris 10
Useful for experienced system administrators
Save time
Avoid (my) mistakes
Learn about new stuff
Answer your questions about old stuff
Won't read the man pages to you
Workshop for hands-on experience and to reinforce concepts
Note – Security covered in separate tutorial
4. More Objectives
What makes novice vs. advanced administrator?
Bytes as well as bits, tactics and strategy
Knows how to avoid trouble
How to get out of it once in it
How to not make it worse
Has reasoned philosophy
Has methodology
5. Prerequisites
Recommend at least a couple of years of
Solaris experience
Or at least a few years of other Unix
experience
Best is a few years of admin experience,
mostly on Solaris
6. About the Tutorial
Every SysAdmin has a different knowledge set
A lot to cover, but notes should make good
reference
So some covered quickly, some in detail
Setting base of knowledge
Please ask questions
But let’s take off-topic off-line
Solaris BOF
7. Fair Warning
Sites vary
Circumstances vary
Admin knowledge varies
My goals
Provide information useful for each of you at
your sites
Provide opportunity for you to learn from
each other
8. Why Listen to Me
20 Years of Sun experience
Seen much as a consultant
Hopefully, you've used:
My Usenix ;login: column
The Solaris Corner @ www.samag.com
The Solaris Security FAQ
SunWorld “Pete's Wicked World”
SunWorld “Pete's Super Systems”
Unix Secure Programming FAQ (out of date)
Operating System Concepts (The Dino Book), now 8th ed
Applied Operating System Concepts
9. Slide Ownership
As indicated per slide, some slides
copyright Sun Microsystems
Feel free to share all the slides - as long as
you don’t charge for them or teach from
them for fee
10. Overview
Lay of the Land
11. Schedule
Times and Breaks
12. Coverage
Solaris 10+, with some Solaris 9 where
needed
Selected topics that are new, different,
confusing, underused, overused, etc
13. Outline
Overview
Objectives
Solaris Versions, features, selection
Booting and Installation
SMF and FMA
Patching
Important Administration Tools
What’s Next for Solaris
Quick Performance Overview
Sysadmin Philosophy
14. Polling Time
Solaris releases in use?
Plans to upgrade?
Other OSes in use?
Use of Solaris rising or falling?
SPARC and x86
OpenSolaris?
15. Your Objectives?
16. Your Lab Environment
Apple Macbook Pro
3GB memory
Mac OS X 10.5
VMware Fusion 2.0
Solaris 10U6
50 Containers
17. Lab Preparation
Have device capable of telnet on the
USENIX network
Or have a buddy
Learn your “magic number”
Telnet to 131.106.62.100+”magic number”
User “root”, password “lisa”
It’s all very secure
18. Lab Preparation
Or...
Use VirtualBox
Use your own system
Use a remote machine you have legit
access to
19. Solaris Versions
Use the “best” one
20. Solaris 8
Many sites still running S8
Why?!
Watch project Solaris 8 Migration Assistant
Per-socket cost
But does P to V of S8 into an S8-compatible
container(!)
Fully supported by Sun as “Solaris 8”
Does not expand lifetime of S8
21. The Case of the System that Would Only Boot Sometimes
System would run without problems
Normal shutdown or system crash / reboot
System would fail to boot with “short
read” error
Ideas?!
22. Solaris 9
Improved performance (page coloring, variable page sizes, page
locality)
Solaris 9 Resource Manager in the Solaris 9 Operating System
Solaris Volume Manager
Solaris Naming and Directory Service
Sun Management Center Change Manager
Network Multipathing - Solaris IP Multipathing (IPMP)
Mobile IP for the Solaris 9 Operating System
Solaris Operating Environment and Linux Compatibility
Java 2 Platform, Standard Edition 1.4 for the Solaris 9 Operating
System
Not a developer release!
23. Solaris 10
Shipped Feb 2005
Major new features (some discussed throughout)
Dtrace
Fire Engine
Solaris Cryptography Framework
NFS V4
Solaris Privileges
ZFS (S10 Update 2)
Full history w/ details available in 817-0547.pdf
24. Solaris 10 (2)
Netscape 7
New X Windowing features
Gnome 2.0 desktop
System V IPC resource controls
Physical memory control using a new resource capping daemon
Extended accounting for IPQos
USB 2.0 support, and USB removable media support
Dynamic intimate shared memory large-page support (for databases) (SPARC only)
Memory placement optimization (on SunFire servers) (SPARC only)
Improved UFS logging performance
Unicode version 3.2
FTP client and server enhancements
PAM enhancements
Auditing enhancements
Password history checking
25. Solaris 10 (3)
Locale administrator for adding and removing locales at the command line
A new autofs configuration file
Multiterabyte volume and disk support (64-bit SPARC only)
Up to 16TB UFS file systems (64-bit SPARC only) (individual files are still limited to 1TB)
devfs dynamically attaches and detaches device entries in /devices
NCA support of multiple instances of the web server
IPv6 6to4 router and packet tunneling of IPv4 over IPv6
NFS services are only started when needed, rather than only at boot time
Sun ONE integration and availability
routeadm routing administration command
sendmail version 8.12 using TCP wrappers
BIND version 8.4.2
Availability of a reduced networking software group for selection during installation of more
secure systems
Solaris Product Registry added features and a command-line interface
Solaris Flash differential archives and configuration scripts
Customized contents of Solaris Flash archives
26. Solaris 10 (4)
Solaris Live Upgrade 2.1
Ability to boot and install software over a WAN
Improved DHCP implementation
Solaris Management Console Patches tool can now analyze, download and install recommended patches
Improved System V IPC configuration
Signed packages and patches for more secure download
NIS to LDAP transition service
Top-down volume creation in Solaris Volume Manager
Systems Management Agent implements SNMPv1, v2c, and v3
Event ports for generating and collecting events from disjoint sources
New atomic operations API included in libc
WBEM includes many updates
Solaris Privileges for programmers allows applications to be written that need specific rights, rather than
superuser rights.
Smartcard interfaces and middleware APIs
Basic Audit and Reporting Tool (BART) can compare contents of a system over time or audit an installed
package for changes
Kerberos enhancements
27. S10 U1 (1/06) Changes
Upgrade from S8, S9, or old S10
Sun Update Connection for patching
x86 GRUB booting
Performance - large pages, kernel page
relocation, memory placement optimization
(MPO)
prtconf -b prints product names
28. S10 U2 (6/06) Changes
ZFS
New ACL model (ZFS only), based on NFS V4, more
granular, chmod
Predictive self-healing for x86
iscsiadm multiple session targets
logadmin -l uses local time when renaming
volfs managed by SMF
UDP and TCP performance improvement
IPv6 for ipfilter
29. S10 U3 (11/06) Changes
smcwebserver enhancements
fsstat
SMF management of dynamic resource pools
Zones “move” and “clone” commands
Zone migration
LDOMS 1.0
Solaris Trusted Extensions
SNIA multipath management - mpathadm
30. S10 U4 (8/07) Changes
Improved nscd
iostat -y understands multipathing
Sun service tag product identifier
MPxIO path steering
raidctl
more FMA and predictive self-healing supported devices
stmsboot on SPARC and x86 (enable or disable MPxIO on fibre-channel)
Live Upgrade includes non-global-zone support
Deferred-activation patching
Networking improvements, including zone “exclusive-IP”, nge jumbo frame support
Solaris key-management framework (KMF) to manage public key objects
iSCSI target, iscsiadm, iscsitadm
Branded zones - lx
zonecfg integrated resource management (zone.max-*), temporary pools, capped memory
improvements
DTrace non-kernel use in zones
31. S10 U5 (5/08) Changes
Trusted Extensions installed by default, SMF managed, disabled by default
fwflash - new firmware manipulation tool
The PostScript™ Printer Description (PPD) file management utility, /usr/sbin/ppdmgr, plus PAPI print
commands
Client-side support for the Internet Printing Protocol (IPP)
SunVTS 7.0 includes the following features:
Introduction of the concept of purpose-based testing
Improved diagnostics effectiveness
Web-based user interface
Simplified usage
New architecture framework
Enterprise View
Resource management expansion via CPU Caps plus projmod -a to apply project DB to active project
x86 power management
iSNS support for iSCSI target
SPARC: Hardware-Accelerated Elliptical Curve Cryptography (ECC) Support
Network enhancements, desktop tools, performance, libchewing, fsexam file code converter, more drivers
32. S10 U6 (10/08) Changes
ZFS boot / root (text installer)
Zones on ZFS, Auto zone-upgrade
Live upgrade from UFS to ZFS root
Roll back ZFS dataset without unmounting
ZFS quotas and reservations for file system
data only
ZFS cachefile property controls what is cached,
where
Separate ZIL locations, iSCSI improvements...
33. Current Releases
(Source: Stephen Lau (http://whacked.net))
34. Software Express for Solaris
Get future Solaris releases, now!
Frequent updates (~1 / month)
Basically, exports of internal Solaris builds (SPARC and x86)
Regression tested by Sun for stability
Other products might be available in the future
No patches, but bug report and on-line support for paid version
Free version allows download, access to docs
Takes a couple of hours over fast link
Need to be able to create .iso CDs, DVDs
35. OpenSolaris
OpenSolaris is a source base and a distro
Solaris open source under CDDL license
Updates currently biweekly or so
One week after code checked in to kernel gate
Very recent bits
Goal is to be even closer to kernel engineering
No testing done
No support
But great stuff to play with
36. OpenSolaris (2)
Can use either gcc or (free*) Sun Studio
compiler to build
http://www.opensolaris.org/os/community/tools/sun_studio_tools/
Whole community around OpenSolaris
At http://www.opensolaris.org
This is the place that kernel developers
communicate about DTrace and other areas of
the kernel
Lots of great info at http://blogs.sun.com
37. OpenSolaris (3)
Already some interesting community work
Live discs from SchilliX - http://schillix.berlios.de/
Belenix - http://belenix.sarovar.org/belenix_home.html
Nexenta - Debian-based GNU/Solaris(!) - http://www.gnusolaris.org/gswiki
marTux - first non-Solaris Express/Solaris Express
Community Release OpenSolaris distribution for SPARC
(sun4u for now, sun4v later) - http://www.martux.org/RELEASES/
OpenSolaris live small CD / USB distro - http://www.milax.org
38. OpenSolaris (4)
Note that each release has its own
patching / upgrade methodology
Sometimes need to reinstall each time
For Sun flavors use the BFU to install a
new archive over an old
Just updates the kernel
components, not user-land stuff
39. OpenSolaris Distro
Née Project Indiana
FCS 5-May-2008
Commercial support available (just like Solaris) 13-May-2008
Solaris kernel + ZFS + modern userland + new packaging
system
Livecd
Could be the future of Solaris
x64 only for now(!)
ISV support is the open issue
40. OpenSolaris Distro (2)
IPS is new package system, SVR4 packages supported too
IPS lets you create and manage packages
Packages repositories on the web
Update all installed packages
“undo” via ZFS rollback
Search packages
Create your own repository
pkg, pkgsend, pkg.depotd
For example pkg install openoffice
Other packages: netbeans, sunstudioexpress,
clustertools, webstackui, glassfishv2
41. OpenSolaris Distro (3)
Once in OpenSolaris, upgrades are easy:
$ pfexec pkg refresh
$ pfexec pkg image-update
When done:
A clone of opensolaris exists and has been updated
and activated. On next boot the Boot Environment
opensolaris-1 will be mounted on '/'. Reboot when
ready to switch to this updated BE.
$ beadm list
BE            Active Active on Mountpoint Space
Name                 reboot               Used
----          ------ --------- ---------- -----
opensolaris-1 no     yes       -          17.06M
opensolaris   yes    no        -          33.92M
42. OpenSolaris Distro (4)
Can build your own distribution, via tool
http://www.opensolaris.org/os/project/caiman/Constructor/
The OpenSolaris Bible provides very good
coverage (mostly user-land)
New automatic installer (replacing
Jumpstart et al) - follow it at
http://www.opensolaris.org/os/project/caiman/auto_install/
44. Which OS?
8, 9, 10 are all viable operating systems
<= 2.6 for legacy environments if you can’t move
2.7 for those too lazy to upgrade(!)
8 for those seeking consistency without going through upgrade
effort, those waiting for 10
Solaris 9 for most, stable, apps available, good performance if
conservative or not ready to move
I recommend S10 latest supported release SPARC and x86
Especially as only OS on new hardware
Unless apps not available
Or company standard for previous release
Watch out for vendor support and patch cycle on x86
45. Solaris 10 Adoption
Everyone wants it
But waiting for vendor support
Given a list of apps, Sun can tell you
expected support date
Start from that, start testing a few months
before all apps expected to be supported
Some waiting for ZFS bootability (to avoid
upgrading twice)
46. Installation
Getting it Right the First Time...
47. Topics
Partitions
Installation Methods
Swap Space
Upgrading
Zones / Containers
48. How Many Partitions?
•"It depends"
•Who/what is using the system?
•Users cause problems!
•Why few partitions?
•Backups easy and fewer passes
•Easy to add VXVM
•Easy to mirror (manually or automatically)
•Less chance of misallocating
•Why many partitions?
•Finer-grain backup control
•Faster restore after a corruption
•More control over disk space use
•Solaris 8 has 1TB file system limit for UFS
•Life will be different once ZFS is bootable!
49. Partitions My Way - 72GB Disk
•On a 72GB disk, system with 32GB memory
•/ 10 GB
•swap 6 GB
•/var 10GB
•4GB unused raw partition (set aside for crash)
•2 X 9MB partitions for disksuite
•What to do with the rest?
•Leave unused for emergency or optimum performance
•Create scratch space
•Personal sysadmin space
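How much is "the rest"? A rough sketch of the arithmetic for the layout above, assuming 72GB means 72 * 1024 MB (real disks usually format to a bit less) and using made-up variable names:

```shell
# Sizes in MB, taken from the slide. Treating 72GB as 72 * 1024 MB is an assumption.
disk_mb=$((72 * 1024))
root_mb=$((10 * 1024))    # /
swapdev_mb=$((6 * 1024))  # swap
var_mb=$((10 * 1024))     # /var
crash_mb=$((4 * 1024))    # raw partition set aside for crash dumps
metadb_mb=$((2 * 9))      # two 9MB partitions for disksuite

used_mb=$((root_mb + swapdev_mb + var_mb + crash_mb + metadb_mb))
left_mb=$((disk_mb - used_mb))
echo "allocated: ${used_mb} MB, left over: ${left_mb} MB"
```

So well over half the disk is still free for scratch space, sysadmin space, or simply leaving unused.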
50. ZFS Boot / Root
It all changes once ZFS is root file system
snapshots before all changes
rollback if don’t like the changes
No partitioning needed
1 command mirroring
51. /crash?!
From the kernel developer who wrote dumpadm:
“dumpadm can either be used to configure a
swap device as the dump device, or a
dedicated dump device (e.g. a raw /dev/dsk/xxx
partition not being used as a filesystem).
We actually prefer that because you can never
have your dump swapped over if savecore
runs out of disk space, and we run savecore in
the background if you have one, improving
reboot time.” – Mike Shapiro
52. Installation Methods
•CDROM / DVDROM
•for single system or custom systems
•Jumpstart - scripted network install
•Flash Archive - image-based install, based
on jumpstart
53. Swap Devices (continued)
Performance tip: access to a swap page is 10^4 X
slower than access to a memory page
Also, disk location of swap or head
contention can cause a 10^1 X difference in
access time
Webstart requires at least 512MB swap
space
Need to mirror swap to prevent disk failure
from causing crash
54. Swap Devices (cont)
•Yes, mirroring can cause performance degradation, but
without mirroring system not proof against failed disk
causing crash
•Can be raw partitions or files in file systems
•Both work well
•Add swap space with swap -a <device>
•swap -a /dev/dsk/c2t0d0s2
•or swap -a <file>
•swap -a /swap1/swapfile
55. Swap Device (cont)
•Check use with
# swap -l
swapfile             dev     swaplo  blocks    free
/dev/dsk/c0t0d0s1    32,1        16 1049744  927360
/dev/dsk/c2t0d0s2    32,242      16 4194272 4194272
/swap1/swapfile      -           16  819184  819184
# swap -s
total: 77240k bytes allocated + 30912k reserved = 108152k used, 3012576k available
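swap -l reports sizes in 512-byte blocks, which is awkward to read. A small sketch that converts the blocks and free columns to whole megabytes; the sample is the listing above, and swap_l_sample and swap_to_mb are hypothetical helper names, not Solaris commands:

```shell
# Sample `swap -l` output copied from the slide.
swap_l_sample() {
cat <<'EOF'
swapfile dev swaplo blocks free
/dev/dsk/c0t0d0s1 32,1 16 1049744 927360
/dev/dsk/c2t0d0s2 32,242 16 4194272 4194272
/swap1/swapfile - 16 819184 819184
EOF
}

# Skip the header line; convert 512-byte blocks to MB, rounding down.
swap_to_mb() {
  awk 'NR > 1 { printf "%s %d %d\n", $1, int($4 * 512 / 1048576), int($5 * 512 / 1048576) }'
}

swap_l_sample | swap_to_mb
```

On a live system the same filter would be used as swap -l | swap_to_mb.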
56. Where to Swap?
Best to avoid swapping altogether
Spread swap among multiple controllers,
multiple disks
Can swap to raw disk partition or file system file
Make file system file with mkfile
# mkfile 100m /opt/swapfile
Performance decreases with file system
Almost never have >1 swap space per device
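Swap added with swap -a does not survive a reboot unless it is also listed in /etc/vfstab. A dry-run sketch that only prints the commands and the vfstab line for a swap file; swapfile_plan is a hypothetical helper, and the path and size are just the ones from the slide:

```shell
# Hypothetical helper: print the steps for a file-backed swap area.
# It only prints; nothing is created or changed.
swapfile_plan() {
  path=$1
  size=$2
  echo "mkfile $size $path"
  echo "swap -a $path"
  # /etc/vfstab fields: device, fsck device, mount point, FS type,
  # fsck pass, mount at boot, options
  printf '%s\t-\t-\tswap\t-\tno\t-\n' "$path"
}

swapfile_plan /opt/swapfile 100m
```

Review the printed lines, then run the first two as root and append the third to /etc/vfstab.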
57. Upgrading OS Releases
Just plopping in the new CDROM and answering
questions not recommended
Do not upgrade a potentially-security-breached system
Perform a new install instead
Why do most sites avoid upgrades?
Upgrading with zones adds complexity / limits
(from http://docs.sun.com/app/docs/doc/817-1592/6mhahuoul?q=upgrade+zone&a=view as of S10U2)
You can use either the standard Solaris interactive installation program or the custom
JumpStart installation program to upgrade your Solaris system with zones installed.
Solaris Live Upgrade is not supported for this release. For information, see Solaris 10
Installation Guide: Solaris Live Upgrade and Upgrade Planning and Solaris 10
Installation Guide: Custom JumpStart and Advanced Installations.
58. Upgrading OS Releases (2)
New technologies can help
appcert to check if given app “guaranteed” to run on new OS rev
Determine if your platform is supported
http://www.sun.com/bigadmin/hcl/
Determine the platform to use (if changing platforms)
Opteron servers great (but need to move to x86/x64)
Sun T1-based systems great for lots of threads
Will your workload run well on a T1? http://www.sun.com/bigadmin/content/cooltst_tool/
Jumpstart for upgrades – possible but not guaranteed
Liveupgrade (working as of 10/01 release)
Splits the mirror (if SVM and >= S9 8/03) or finds an available disk
Automates duplication of boot disk, upgrade to duplicate disk
Allows upgrade while system live
Easy test and fall-back to previous release
Boot alternate, if unhappy reboot primary boot disk
59. Production Server Upgrade Methodology
Check all app certifications for support under
new release
If in house or non-supported app, build
test environment and test
Perform full backup
And test it!
Record all system details via Explorer or
manually
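If Explorer is not installed, the "manually" option can be as simple as saving command output to a directory. A minimal sketch, assuming a hypothetical snapshot_system helper and a command list you would extend for your own site; commands that are not present on the host are skipped:

```shell
# Hypothetical helper: save the output of a few status commands, one file each.
snapshot_system() {
  outdir=$1
  mkdir -p "$outdir" || return 1
  for c in "uname -a" "df -k" "swap -l" "prtconf" "showrev -p" "cat /etc/system"; do
    set -- $c                       # split into words; $1 is the command name
    if command -v "$1" >/dev/null 2>&1; then
      name=$(printf '%s' "$c" | tr ' /' '__')
      $c > "$outdir/$name.txt" 2>&1 || true   # keep going if one command fails
    fi
  done
  return 0
}

snapshot_system "${TMPDIR:-/tmp}/pre-upgrade.$$"
```

Run it again after the upgrade into a second directory and diff the two.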
60. Production Server Upgrade Methodology (2)
Mirror root disk if possible (or break existing
mirror)
Upgrade one half of mirror, test
Fall back if necessary
Or re-mirror after testing period over for RAID protection
Undo DiskSuite mirroring, VXVM encapsulation
Check VX manuals for upgrade instructions, including
use of begin and end scripts to save and restore VX
state
61. Production Server Upgrade Methodology (3)
Update via CDROM
Or other method if you get it working
Restore VX state if it was saved
Test system and apps
Check log files, run usual system status commands
Analyze old /etc/system
Do not just copy it over – reset it based on new OS release
Run explorer to capture new system state
Perform full backup to record “known good state”
After test period, remirror root disk
62. Faster / Easier > S10U5
ZFS root allows multiple boot environments
Use LU on Solaris
Quite feature rich
# lucreate -A 'mydescription' -c first_disk
-m /:/dev/dsk/c02t4d0s0:ufs -m /usr:/dev/dsk/c02t4d0s1:ufs
-M /etc/lu/swapslices -n second_disk
BE on OpenSolaris / Nevada
# beadm list
BE            Active Mountpoint Space Policy Created
----          ------ ---------- ----- ------ -------
opensolaris   NR     /          2.36G static 2008-12-01 17:03
opensolaris-1 -      -          57.0K static 2008-12-01 17:55
63. Install using JET
Jumpstart Enterprise Toolkit
Unsupported extensions to Jumpstart by
Sun to make it easier / faster
1. Add the packages
# pkgadd -d SUNWjet.pkg
2. Add /opt/SUNWjet/bin to the path of the root user
3. Either:
1. run 'copy_solaris_media' to copy the Solaris image from CD/DVD to disk
2. run 'add_solaris_location' to inform the toolkit of existing Solaris images
4. Create a 'template' for a new client, using the 'make_template' command.
# make_template machine1
5. Edit the new template and configure the build
# vi /opt/SUNWjet/Templates/machine1
6. Configure the build environment for this client
# make_client machine1
7. Start the build on the client:
* (for Sparc)
ok boot net - install
* (for x86/64)
Force a PXE boot
64. Upgrading to the Same OS
If binaries deleted, corruption problems,
packages missing, other program-level
problems (not config file problems)
Perform an “upgrade” to the same OS
release as is currently running
Will refresh all packages to their original
state
Need to re-patch the system
65. Booting
Giving a system the boot
66. Topics
•Solaris 10 booting
•Service Management Facility (SMF)
67. System Boot < S10
ufsbootblock -> ufsboot -> /kernel/unix -> init -> /etc/inittab -> /etc/rcS -> /etc/rc2 -> /etc/rc3
68. Run levels
0 system shutdown
1 "systemic state"
one user, no services or daemons
2 multi-user no NFS
mount all partitions, starts services
3 multi-user with NFS
4 spare multiuser state (unused)
5 Power down
6 Reboot
kills all processes, unmounts, reboots
S,s Single user state
no daemons, system mounts
Q,q Causes init to reread inittab file
69. Solaris 10 Service Management Facility (SMF)
Part of larger predictive self-healing facility (Build 69
and beyond)
Replacing inetd, changing use of /etc/rc files, etc
Much more sophisticated management of system startup
and daemons
Builds reference tree of which processes need which, and
order to start them in
If service fails, knows how to restart the service and all
that depended on it
Startup to login prompt much faster with multithreading –
each service started when those it depends on are ready
The only mandatory difference in S10
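To get a feel for "which needs which, and order to start them in", here is a toy sketch using the standard tsort utility. This is not how svc.startd is implemented (it starts services in parallel as their dependencies come online); it only shows that a dependency list implies a start order. The three service names are made up for illustration:

```shell
# Each input line is "A B", meaning A must be started before B.
# The service names are hypothetical stand-ins, not real FMRIs.
start_order() {
  tsort <<'EOF'
filesystem network
network ssh
filesystem ssh
EOF
}

start_order
```

With these three pairs only one order is possible: filesystem, then network, then ssh.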
70. SMF (cont)
Has a repository containing state and
configuration of services, dependencies,
methods of managing services
Has manifests (in XML format) to describe
services -> input into repository
/var/svc/manifest
Changes to services can be made here
Won’t be reflected until service restarted or refreshed
Repository/database used for services
/etc/svc/repository.db
Has commands to manage services, repository
71. SMF (cont)
Booting now much “quieter”
Each service has its own log in /var/svc/log
(/etc/svc/volatile)
Services that would have hung boot now
debuggable in maintenance mode
New boot -m verbose to display message per
service
Processes will automatically restart by
svc.startd or be placed in maintenance mode
(watch out for kill -9)
Location of the scripts to be executed
/lib/svc/method
72. rc scripts
Only a few rc scripts out of the box
# ls /etc/rc3.d
README S52imq S77dmi S84appserv
S16boot.server S75seaport S80mipagent S90samba
S50apache S76snmpdx S82initsma
There for non smf-converted services
There for backward compatibility
rc scripts started after all services start, but no other SMF services provided
Inittab now sparse. It includes info on modifying ttymon for example:
# For modifying parameters passed to ttymon, use svccfg(1m) to modify
# the SMF repository. For example:
#
# # svccfg
# svc:> select system/console-login
# svc:/system/console-login> setprop ttymon/terminal_type = "xterm"
# svc:/system/console-login> exit
73. The boot process S10 SPARC
ufsbootblock -> ufsboot -> /kernel/unix -> init -> inittab -> svc.startd -> /etc/svc/repository.db
74. The boot process S10 X86
BIOS -> boot block -> GRUB -> /kernel/unix -> init -> inittab -> svc.startd -> /etc/svc/repository.db
75. GRUB
Nice, fairly standard boot management
Controlled via /boot/grub files
RAMdisk image created automatically when
system files changed, used to speed boot
Create manually if needed via
bootadm update-archive
76. svcs
Displays services and stati
# svcs
STATE STIME FMRI
legacy_run Feb_28 lrc:/etc/rcS_d/S50sk98sol
legacy_run Feb_28 lrc:/etc/rc2_d/S10lu
legacy_run Feb_28 lrc:/etc/rc2_d/S20sysetup
legacy_run Feb_28 lrc:/etc/rc2_d/S40llc2
. . .
legacy_run Feb_28 lrc:/etc/rc3_d/S84appserv
legacy_run Feb_28 lrc:/etc/rc3_d/S90samba
online Feb_28 svc:/system/svc/restarter:default
online Feb_28 svc:/network/pfil:default
online Feb_28 svc:/system/filesystem/root:default
online Feb_28 svc:/network/loopback:default
online Feb_28 svc:/milestone/name-services:default
. . .
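On a busy box the svcs listing runs to hundreds of lines, so a per-state count is a handy first look. A sketch that tallies the STATE column; the sample is a few lines from the listing above, and svcs_sample and count_states are hypothetical helper names:

```shell
# A few lines of `svcs` output copied from the slide.
svcs_sample() {
cat <<'EOF'
STATE STIME FMRI
legacy_run Feb_28 lrc:/etc/rc3_d/S84appserv
legacy_run Feb_28 lrc:/etc/rc3_d/S90samba
online Feb_28 svc:/system/svc/restarter:default
online Feb_28 svc:/network/pfil:default
online Feb_28 svc:/network/loopback:default
EOF
}

# Count services per state; only lines whose third field is an FMRI are counted,
# which skips the header.
count_states() {
  awk '$3 ~ /^(svc|lrc):/ { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}

svcs_sample | count_states
```

On a live system, svcs -a | count_states includes disabled services as well.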
77. svcs (cont)
Displays details about services (i.e. what
failed)
# svcs -x
svc:/application/print/server:default (LP print server)
State: disabled since Mon Feb 28 11:01:34 2005
Reason: Disabled by an administrator.
See: http://sun.com/msg/SMF-8000-05
See: lpsched(1M)
Impact: 2 dependent services are not running. (Use -v for list.)
Displays info on all services (even disabled
ones)
# svcs -a
78. svcs (cont)
Displays details about services (i.e. what
depends on what)
# svcs -xv ssh
STATE STIME FMRI
online Feb_28 svc:/network/ssh:default
Feb_28 366 sshd
79. svcadm
Changes service states permanently
(unless -t option used)
# svcs sendmail
STATE STIME FMRI
online Feb_28 svc:/network/smtp:sendmail
# svcadm disable sendmail
# svcs sendmail
STATE STIME FMRI
disabled 17:46:01 svc:/network/smtp:sendmail
80. svcprop
List the properties of a service via
svcprop service_name
# svcprop zones
general/enabled boolean false
general/entity_stability astring Unstable
general/single_instance boolean true
multi-user-server/entities fmri svc:/milestone/multi-user-server
multi-user-server/grouping astring require_all
multi-user-server/restart_on astring none
multi-user-server/type astring service
startd/duration astring transient
start/exec astring /lib/svc/method/svc-zones %m
start/timeout_seconds count 60
start/type astring method
stop/exec astring /lib/svc/method/svc-zones %m
stop/timeout_seconds count 500
stop/type astring method
81. svcprop (cont)
tm_common_name/C ustring Solaris zones
tm_man_zones/manpath astring /usr/share/man
tm_man_zones/section astring 5
tm_man_zones/title astring zones
tm_man_zoneadm/manpath astring /usr/share/man
tm_man_zoneadm/section astring 1M
tm_man_zoneadm/title astring zoneadm
restarter/logfile astring /var/svc/log/system-zones:default.log
restarter/start_pid count 525
restarter/start_method_timestamp time 1144642223.336907000
restarter/start_method_waitstatus integer 0
restarter/transient_contract count
restarter/auxiliary_state astring none
restarter/next_state astring none
restarter/state astring online
restarter/state_timestamp time 1144642223.379661000
82. inetadm
SMF component that manages inet services
Now inetd is a subcomponent
Original inetd.conf entries are now services
Changes to inetd.conf are reflected in services
only when inetconv is run
86. inetadm -m -M
Modify a property of an inetd service
# inetadm -m ftp tcp_trace=TRUE
Modify one of the inetd properties
# inetadm -M tcp_wrappers=TRUE
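After changing an inetd default it is worth reading it back. A minimal sketch, assuming inetadm -p lists the defaults as name=value pairs; enable_tcp_wrappers is a hypothetical helper name.

```shell
# Turn on TCP wrappers for all inetd-managed services and confirm it.
# enable_tcp_wrappers is a hypothetical helper, not an SMF command.
enable_tcp_wrappers() {
    # -M modifies an inetd-wide default property.
    inetadm -M tcp_wrappers=TRUE &&
        # -p prints the inetd defaults; show only the one just changed.
        inetadm -p | grep 'tcp_wrappers'
}
```

A per-service value set with inetadm -m still overrides the default, so inetadm -l ftp is the place to look if one service behaves differently.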
87. svccfg
Manipulates data in service configuration repository
Full, rich feature set
For example:
# svccfg
svc:> list
. . .
network/imap/tcp
network/imaps/tcp
network/pop3/tcp
network/pop3s/tcp
svc:> network/pop3/tcp
network/pop3s/tcp
svc:> select telnet
svc:/network/telnet> listprop
. . .
general framework
general/entity_stability astring Unstable
general/restarter fmri svc:/network/inetd:default
88. svccfg (cont)
svc:/network/telnet> setprop
Usage: setprop pg/name = [type:] value
setprop pg/name = [type:] ([value...])
Set the pg/name property of the currently selected entity. Values may be
enclosed in double-quotes. Value lists may span multiple lines.
svc:/network/telnet> help
General commands: help set repository end
Manifest commands: inventory validate import export archive
Profile commands: apply extract
Entity commands: list select unselect add delete
Snapshot commands: listsnap selectsnap revert
Property group commands: listpg addpg delpg
Property commands: listprop setprop delprop editprop
Property value commands: addpropvalue delpropvalue setenv unsetenv
89. Milestones
Augment, not replace, the old “run levels”
If all services for milestone running, milestone reached
If not, milestone not reached
Milestone configurations:
# ls /var/svc/manifest/milestone
multi-user-server.xml name-services.xml
single-user.xml multi-user.xml network.xml
sysconfig.xml
# svcs "svc:/milestone/*"
online Sep_22 svc:/milestone/name-services:default
online Sep_22 svc:/milestone/network:default
online Sep_22 svc:/milestone/devices:default
online Sep_22 svc:/milestone/single-user:default
online Sep_22 svc:/milestone/sysconfig:default
online Sep_22 svc:/milestone/multi-user:default
online Sep_22 svc:/milestone/multi-user-server:default
90. Milestones vs. Run Levels
Liane Praza's Weblog
Friday February 04, 2005
smf milestones, runlevels, and system maintenance
A number of questions about smf(5) milestones have been surfacing lately, so I'll try to
give a summary of the topic and answer a few common questions here.
An smf(5) milestone is really nothing more than a service which aggregates a bunch of
service dependencies. Usually, a milestone does nothing useful itself, but declares a
specific state of system-readiness which other services can depend upon. One
example is the name-services milestone. It simply depends upon the possible name
services you might be running:
$ svcs -d name-services
STATE STIME FMRI
disabled Jan_04 svc:/network/rpc/nisplus:default
disabled Jan_04 svc:/network/dns/client:default
disabled Jan_04 svc:/network/ldap/client:default
online Jan_04 svc:/network/nis/client:default
91. Milestones vs. Run Levels (cont)
and has no useful actions to perform during the start or stop method:
$ svcprop -p start name-services
start/exec astring :true
start/timeout_seconds count 3
start/type astring method
$ svcprop -p stop name-services
stop/exec astring :true
stop/timeout_seconds count 3
stop/type astring method
The name-services milestone is considered online as long as any name services which are enabled are
running. There's also nothing different about these milestones to smf(5), it just sees them as yet-
another-service.
We've implemented standard Unix system run-levels in smf(5) using milestones. The single-user, multi-
user, and multi-user-server milestones correspond to run-levels S, 2, and 3, respectively. In
addition to the runlevel milestones, there are the all and none keywords. These aren't actual
services, but shorthand for either the graph with no services, or the graph with all services. This
set of five special milestones can either be booted directly to (boot -m milestone=) or
reached by running svcadm milestone. As mentioned in a previous entry, the way we reach a
limited milestone (any special milestone but all) is to temporarily disable all services which aren't
part of the milestone's subgraph.
92. Milestones vs. Run Levels (cont)
A common question is why the console-login service is disabled if you boot to a milestone that isn't all.
This can easily be determined by looking at console-login's dependents.
$ svcs -D console-login
STATE STIME FMRI
As there are no milestones which have console-login as one of their dependencies, it won't be started
as part of any milestone but all. Fortunately, we'll always start an sulogin(1M) prompt if a login
service can't be reached.
So, why are milestones useful then? The most useful milestone is none, for the recovery/exploration
scenario I described here. The other use is when doing service development. You can use svcadm
milestone to transition to limited milestones then back up without rebooting the system.
There's a large omission in my description of milestone use above. I don't mention system maintenance
or patching anywhere. A very common question is: Should I stop using init s, boot -s, and my
other standard procedures to change runlevels and perform standard system maintenance?
Emphatically, no! Your old favorite commands continue to work as they always have. There's no
need to change procedures. There's no reason to retrain your fingers with a much longer-to-type
command when init s works just fine. The init invocations will work just like they always have,
where svcadm milestone won't. For example, running
svcadm milestone svc:/milestone/single-user:default won't change the run-level of the
system (as described by who -r). Running init s will.
93. SMF Notes
svcs -a shows all services, no matter the state
Also of interest
svcadm restart – restart the service
svcadm refresh – reread the service configuration
svcs -d FMRI – shows the services the named service depends on
svcs -D FMRI – shows the services that depend on the named service
boot -m milestone=name – boots to the named milestone
svcadm milestone – transitions to named milestone
svccfg apply /var/svc/profile/generic_limited_net.xml – disables generic
extraneous network daemons
94. SMF Notes (cont)
/var/svc/profile/site.xml –
copied by jumpstart script as default set
of services on jumpstarted systems
Check out the Q&A (FAQ):
http://mail.opensolaris.org/pipermail/smf-discuss/2006-June/000672.html
Never modify manifests in place. Always
use svccfg to modify or customize a
service
Web site to create SMF manifests
http://es.opensolaris.org/easySMF/
95. Labs
What services are running?
How do you parse the output of svcs?
Which are disabled? Failed?
What does inetd.conf look like?
What is in the rc directories?
What do the service log files show?
Kill off an unimportant service via kill
What happened?
Disable it via SMF
Where is the SMF configuration information stored?
How would you change the parameters of a service?
What does an RPC service look like now?
96. Labs (cont)
What profiles are available?
What run level are we at?
How would you enable tcp-wrappers?
97. Systems Management
Tips and Tricks
98. Topics
•Patches
•Fault management architecture (FMA)
•Crash and core dumps
•Odds and Ends
•Analyzing a system
99. Patches
•Sun patches come with installpatch and backoutpatch
routines to automate patching
•Solaris >= 2.6 includes new patch commands
•patchadd (accepts path or URL for patch!)
•patchrm
•All patch operations are logged in the /var/sadm/patch
subdirectories as well
•installpatch -u doesn’t verify file attributes
•installpatch -d doesn’t save the original files
•Big disks -> don’t use this option
Use pkgrm to remove packages to avoid them being patched
(sendmail et al)
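Before adding or removing a patch it helps to know which revision, if any, is already on the box. The sketch below filters showrev -p style output; patch_installed is a hypothetical helper name, and the patch IDs in the usage line are only sample values.

```shell
# Report the installed revision of a patch, given its base ID.
# Reads "Patch: NNNNNN-RR Obsoletes: ..." lines from stdin.
# patch_installed is a hypothetical helper, not a Solaris command.
patch_installed() {
    awk -v id="$1" '
        $1 == "Patch:" {
            split($2, p, "-")            # p[1] = base ID, p[2] = revision
            if (p[1] == id) { print $2; found = 1 }
        }
        END { exit !found }              # non-zero status if never seen
    '
}
```

Example: showrev -p | patch_installed 118833 && echo "already applied"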
100. OpenSolaris
Or run OpenSolaris and never patch again
Rather, upgrade packages
Including the kernel!
101. Patches (cont)
•Other useful patch tools
• SUC(E) Sun Update Connection (Enterprise)
• smpatch (née patchpro) (from Sunsolve)
• patchDiag (from Sunsolve)
• patchcheck (from Sunsolve)
• patchreport http://www.cs.duke.edu/~wjs/pr.html
•Watch out for Sun patchmanager
•Doesn’t warn of reboot need
•Doesn’t warn of special instructions
• Patches from
• ftp://sunsolve.sun.com
• http://sunsolve.sun.com
102. Patches (continued)
New with Nevada is the Sun Update Manager
GUI interface built into CDE(?) and Xorg
Automatic patch updates (like Windows!)
Can use proxy, caching, etc.
Possibly a command line interface as well
Future of this unclear
Also, Sun bought Aduva, which will result in
Sun Update Connection Enterprise 1.0 (Sun
UCE 1.0) - now part of Sun xVM Ops Center
103. Patch Philosophy
On new systems, install
Recommended Patches
Suggested Patches
Sometimes not the same thing!
Install security patches if you care
Install point patches only if you see symptoms
Watch out – patches can overwrite security changes
(startup script removal, sendmail, inetd.conf changes)
Retest after changes made
104. FMA
105. FMA
New with Solaris 10, Solaris Fault Management Architecture
(called predictive self-healing by marketing)
Two components – service manager and fault manager
Fault manager designed to detect faults (as before) and
analyze them
Can reduce downtime / debugging by not “waiting for that
problem to happen again”
New daemon runs by default at boot – fmd
Still logs to syslog et al, and /var/fm/fmd/fltlog
Command line interface
fmadm
fmdump
fmstat
Currently, better hw info from SPARC than Opteron CPUs
106. FMA Fault Management
Should be much more likely to catch and debug intermittent or
correctable errors and point to a correction (from bigadmin
article):
SUNW-MSG-ID: SUN4U-8000-6H, TYPE: Fault, VER: 1, SEVERITY: Major
EVENT-TIME: Sun Oct 17 14:15:50 PDT 2004
PLATFORM: SUNW,Sun-Blade-1000, CSN: -, HOSTNAME: myhost
EVENT-ID: 64fe6c23-12b7-ccd1-f0a7-b531941738f8
DESC: The number of errors associated with this CPU has exceeded acceptable
levels. Refer to http://sun.com/msg/SUN4U-8000-6H for more information.
AUTO-RESPONSE: An attempt will be made to remove the affected CPU from service.
IMPACT: Performance of this system may be affected.
REC-ACTION: Schedule a repair procedure to replace the affected CPU. Use fmdump
-v -u <EVENT_ID> to identify the CPU.
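The EVENT-ID is the handle for everything else, so it is handy to pull it out of logged messages. A minimal sketch, assuming the message text contains "EVENT-ID: <uuid>" as in the example above; fma_event_ids is a hypothetical helper name.

```shell
# Extract FMA EVENT-ID UUIDs from messages on stdin, one per line.
# fma_event_ids is a hypothetical helper, not part of the fm* tools.
fma_event_ids() {
    # A UUID is 36 characters of lowercase hex digits and hyphens.
    sed -n 's/.*EVENT-ID: *\([0-9a-f-]\{36\}\).*/\1/p' | sort -u
}
```

Example: fma_event_ids < /var/adm/messages | while read id; do fmdump -v -u "$id"; done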
107. fmadm
Main administrative interface
# fmadm
Usage: fmadm [-P prog] [-q] [cmd [args ... ]]
fmadm config - display fault manager configuration
fmadm faulty [-ai] - display list of faulty resources
fmadm flush <fmri> ... - flush cached state for resource
fmadm load <path> - load specified fault manager module
fmadm repair <fmri>|<uuid> - record repair to resource(s)
fmadm reset [-s serd] <module> - reset module or sub-component
fmadm rotate <logname> - rotate log file
fmadm unload <module> - unload specified fault manager module
# fmadm config
MODULE VERSION STATUS DESCRIPTION
cpumem-retire 1.0 active CPU/Memory Retire Agent
eft 1.12 active eft diagnosis engine
fmd-self-diagnosis 1.0 active Fault Manager Self-Diagnosis
io-retire 1.0 active I/O Retire Agent
syslog-msgs 1.0 active Syslog Messaging Agent
108. fmdump
Facility to display fault logs and detailed
information (from bigadmin article)
# fmdump -v -u 64fe6c23-12b7-ccd1-f0a7-b531941738f8
TIME                 UUID                                 SUNW-MSG-ID
Oct 17 14:15:50.1630 64fe6c23-12b7-ccd1-f0a7-b531941738f8 SUN4U-8000-6H
  100%  fault.cpu.ultraSPARC-III.l2cachedata
        FRU: hc:///component=Slot 1
       rsrc: cpu:///cpuid=1/serial=1107C270C8A
109. fmstat
Information about resource use by FMA
# fmstat
module ev_recv ev_acpt wait svc_t %w %b open solve memsz bufsz
cpumem-retire 0 0 0.0 0.0 0 0 0 0 0 0
eft 0 0 0.0 0.0 0 0 0 0 260K 0
fmd-self-diagnosis 0 0 0.0 0.0 0 0 0 0 0 0
io-retire 0 0 0.0 0.0 0 0 0 0 0 0
syslog-msgs 0 0 0.0 0.0 0 0 0 0 32b 0
110. logadm
Tool for managing log files
Configurable to automatically rotate files,
delete old files, etc
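As a rough illustration of what a rotation rule looks like, the sketch below registers a log with logadm and then validates the configuration. add_log_rotation is a hypothetical wrapper name, the log path in the usage line is made up, and the limits (8 old copies, rotate at 10 MB) are arbitrary example values.

```shell
# Register a log file for rotation and check the resulting config.
# add_log_rotation is a hypothetical wrapper around logadm.
add_log_rotation() {
    # -w writes an entry to the logadm configuration,
    # -C 8 keeps eight old copies, -s 10m rotates once the file reaches 10 MB.
    logadm -w "$1" -C 8 -s 10m &&
        # -V validates the configuration file.
        logadm -V
}
```

Example: add_log_rotation /var/log/myapp.log (logadm itself is normally run from root's crontab, so the new entry is picked up on the next run).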
111. FMA Odds and Ends
FMA supports AMD “M2” CPUs (Rev F)
Enabled by default
S10 8/07 provides predictive self-healing
on PCI-Express on x64 systems
112. Odds and Ends
113. routeadm
routeadm now the proper way to manage
use of routing and forwarding
# routeadm -e ipv4-forwarding
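One thing to watch: routeadm -e changes the configured setting, and the running system picks it up only at the next boot unless routeadm -u is run. A minimal sketch of the full sequence; enable_ipv4_forwarding_now is a hypothetical wrapper name.

```shell
# Enable IPv4 forwarding, apply it to the running system, and show the result.
# enable_ipv4_forwarding_now is a hypothetical wrapper around routeadm.
enable_ipv4_forwarding_now() {
    routeadm -e ipv4-forwarding &&   # change the configured (next-boot) setting
        routeadm -u &&               # apply the configuration to the running system
        routeadm                     # with no arguments, print the configuration
}
```

Compare the configured and current columns of the final routeadm output to confirm that both now show forwarding enabled.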
114. Sun SRS NetConnect
Has had many names over time
Now useful, free (if you have “good” support
contract), going away!?
Time to have a new look, but possibly going away in favor of xVM
Ops Center
Can send data back to Sun or to a server at your site
Provides patch info, uptime, performance monitoring,
event monitoring, etc
But does not phone home for service calls
You have to do that
http://www.sun.com/service/netconnect/
115. Must Read
Finally, Sun has documented kernel
tunables
Read
“Solaris Tunable Parameters Reference
Manual”
Unique per Solaris release, starting with
S8
At docs.sun.com (for free)
116. mpathadm
Solaris 10 Update 3 and beyond
Tool to manage multipathing via ANSI
standard API
Probably the best way to manage storage
multipathing
117. Analyzing a System
Methodology I use when approaching a
“broken” system
“Slow” system, failing applications, etc
Learned the hard way, I always regret
skipping any steps
118. Analyzing a System - Capture
Capture problem definition as succinctly as possible
Helps avoid the “death spiral”
When did the problem start
What invokes it
What avoids it
What is it
What changes were made before it started
What debugging / analyzing / testing changes made since the
start
What existing diagnosis is available (performance trends,
performance monitoring tools)
119. Analyzing a System - Capture 2
Capture available testing resources
Any dev or Q/A systems available?
Ability to reproduce the problem
Ability to test and make changes in production
Ability to test under load, load generation tools
Downtime windows, change limits (validated system, production lockdown)
Change deployment method and cycle
Production testing possible? Performance impact possible?
Capture state with explorer
Now part of “Services Tools Bundle”
http://www.sun.com/download/products.xml?id=47c7250a
Capture state with GUDS
Apparently only available from Sun Support on an as-needed basis
120. Analyzing a System - Audit
OS release and patch level
Application release and patch level
Is the application level supported on the OS level?
Scan through dmesg / /var/adm/messages
Don’t ignore anything odd - could be the canary
Check system health via
df (full disks, mount options, fs types via -n)
ifconfig (network param mismatch), kstat (grep for interface name)
/etc/system (inherited settings, system variables)
/etc/project (per-project resource settings)
Quick scan of “the usual suspects”
iostat, vmstat, netstat, prstat, mpstat, lockstat,
intrstat
Then go process-level if the problem can be narrowed down
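The quick scan is easier to compare later if the output is saved. The sketch below runs a few of the tools side by side and keeps one file per tool; capture_baseline is a hypothetical helper name, and the choice of tools, the 5-second interval, and the 3 samples are arbitrary example values.

```shell
# Run several stat tools in parallel and save each one's output.
# capture_baseline is a hypothetical helper; pass it an output directory.
capture_baseline() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    # Each line below is a tool name followed by its arguments.
    while read -r tool args; do
        # $args is left unquoted on purpose so it splits into separate words.
        $tool $args > "$outdir/$tool.out" 2>&1 < /dev/null &
    done <<EOF
vmstat 5 3
mpstat 5 3
iostat -xnz 5 3
prstat -c 5 3
EOF
    wait    # let all background samplers finish
}
```

Example: capture_baseline /var/tmp/baseline.$(date +%Y%m%d%H%M)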
121. Analyze a System - Next Steps
Have McDougall and Mauro books handy
Install the DTraceToolkit if possible
Have DTrace one-liners handy
Watch for system overhead
Is current scheduler class appropriate
If the system isn’t time-sharing, don’t run with time-sharing
scheduler
If processes stepping on each other or one running amok,
consider implementing limits as possible based on the OS
Processor sets
CPU caps, memory limits
122. Analyze a System - Drill Down
From “Performance Analysis Using DTrace” by Benoit Chaffanjon - here are some examples but
read the paper
Graph of time spent in each system call by each process
syscall:::entry
/uid != 0/
{
        /* record the entry time for non-root processes */
        self->tm = timestamp;
}
syscall:::return
/self->tm/
{
        /* latency distribution (ns) per program, pid, and system call */
        @[execname, pid, probefunc] = quantize(timestamp - self->tm);
        self->tm = 0;
}
Short lived processes
dtrace -n 'proc:::exec { printf("%s execing %s, uid/zone = %d/%s\n",
    execname, args[0], uid, zonename); }'
123. Analyze a System - Drill Down - 2
Code compiled on an old compiler may work but will not
perform well
If not stripped, detect with an old friend :
dump -c $1 |grep "WS6U2" (or SUNWspro)
If stripped, usage of obsoleted library or functions like .mul
or .div are a sign of an old v7 compiler
Recompilation is key: get Sun Studio, it's free at
http://www.sun.com/software/solaris/get.jsp
124. Analyze a System - Drill Down - 3
Error management is time consuming. Detect and fix them before moving forward
It will change your performance picture !
The most common error is Error #2 - File not found
How to detect them (or use errinfo from the DTraceToolkit):
/usr/sbin/dtrace -qn 'syscall:::return /errno != 0 && pid != $pid/
{ @Errs[execname, probefunc, errno] = count(); }
dtrace:::END { printa("%s %s %d %@d\n", @Errs); }'
Not a Number (NaN) exception handling is OS
managed on UltraSPARC III
Detect with :
# kstat -n fpu_traps
fpu_unfinished_traps 77652
Only way to fix it : upgrade the CPUs to UltraSPARC IV, SPARC64 VI, Opteron or Xeon
125. Analyze a System - Drill Down - 4
Identify application log writing impact
Who is doing what ?
dtrace -n 'io:::start { @[execname, args[2]->fi_pathname] = count(); }'
And what is the block size ?
dtrace -n 'io:::start { @[execname, args[2]->fi_pathname] =
    quantize(args[0]->b_bufsize); }'
Need hot spots or number of pending I/Os (and more)?
Beyond the DTraceToolkit: ($) Ortera Atlas http://www.ortera.com
126. Analyze a System - Drill Down - 5
Use the proper chip for the proper workload. See
http://www.spec.org and http://www.tpc.org
Single threaded workloads are common. Make sure your
application is multi-threaded and that it works.
Verify with :
profile:::profile-100hz /pid/ { @[pid, execname] = lquantize(cpu, 0, 512, 1); }
Fixed priority and FSS are good practices - priocntl
Processor sets and process binding are good tuning tools
Example : Oracle database redo log process
127. Analyze a System - Drill Down - 6
Draw your system memory map with :
mdb -k << !
::memstat
!
Then, drill down per process with pmap -sx
The memory allocator matters : libumem.so (32 or 64 bit) often yields performance gains.
If Java, think garbage collection. -XX:+UseParallelGC and -XX:+AggressiveHeap work well
on SMP
The memory block size matters
Use large 4M pages when possible. You can change it on the fly with ppgsz
LD_PRELOAD=$LD_PRELOAD:mpss.so.1 can be used to control the page size used by
any software
Setting MPSSHEAP=size will control the heap pages
Setting MPSSSTACK=size will control the stack pages
pmap will be used to verify if the change worked
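The preload settings above can be wrapped so they apply to a single command rather than the whole session. A minimal sketch: mpss_preload and run_with_large_pages are hypothetical helper names, and the program in the usage line is a placeholder.

```shell
# Build an LD_PRELOAD value that appends mpss.so.1 without leaving a
# stray leading colon when LD_PRELOAD was empty to begin with.
mpss_preload() {
    echo "${LD_PRELOAD:+$LD_PRELOAD:}mpss.so.1"
}

# Run one command with the requested heap and stack page size.
# usage: run_with_large_pages SIZE command [args...]
run_with_large_pages() {
    size=$1
    shift
    # The assignments apply only to the environment of the command being run.
    LD_PRELOAD=$(mpss_preload) MPSSHEAP=$size MPSSSTACK=$size "$@"
}
```

Example: run_with_large_pages 4M ./myapp, then check the page sizes of the running process with pmap -sx as described above.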
128. Analyze a System - Drill Down - 7
1Gb isn’t that fast any more
Difficult to spot as a bottleneck
DTrace not fully IP-stack enabled
In the meantime check out nicstat:
http://www.brendangregg.com/K9Toolkit/nicstat
All this and more in a SAGE wiki: XXX
129. Sys Admin Labs
Explore the commands in this section
130. Performance
You can tune a file system
but you can’t tuna fish
131. Overview
•DTrace
•Other Old and New Important
Performance Tools
132. Overview of Performance Tools – Process Stats
(Courtesy of McDougall and Mauro)
Process Stats
cputrack – per-process hw counters
pargs – process arguments
pflags – process flags
pcred – process credentials
pldd – process's library dependencies
psig – process signal disposition
pstack – process stack dump
pmap – process memory map
pfiles – open files and names
prstat – process statistics
ptree – process tree
ptime – process microstate times
pwdx – process working directory
133. Overview of Performance Tools – Process Control
(Courtesy of McDougall and Mauro)
Process Control
pgrep – grep for processes
pkill – kill processes list
pstop – stop processes
prun – start processes
prctl – view/set process resources
pwait – wait for process
preap – reap a zombie process
134. Overview of Performance Tools – Tracing
(Courtesy of McDougall and Mauro)
Process Tracing/debugging
abitrace – trace ABI interfaces
dtrace – trace the world
mdb – debug/control processes
truss – trace functions and system calls
Kernel Tracing/debugging
dtrace – trace and monitor kernel
lockstat – monitor locking statistics
lockstat -k – profile kernel
mdb – debug live and kernel cores
135. Overview of Performance Tools – System Stats
(Courtesy of McDougall and Mauro)
System Stats
acctcom – process accounting
busstat – Bus hardware counters
cpustat – CPU hardware counters
iostat – IO & NFS statistics
kstat – display kernel statistics
mpstat – processor statistics
netstat – network statistics
nfsstat – nfs server stats
sar – kitchen sink utility
vmstat – virtual memory stats
intrstat - interrupt stats