AusLUG2012 - Client, server and application monitoring and optimization done right!
1. AusLUG2012
Client, Server and Application Monitoring
and Optimization
done right
Florian Vogler | CEO & CTO | panagenda
Meet.Share.Learn www.panagenda.com
Efficiency describes the extent to which time or effort
is well used for an intended task or purpose.
29th & 30th March, Melbourne, Victoria, Australia
2. AusLUG2012
Agenda
Coming up next …
Who am I? … and about panagenda
Laying the basics of what is actually possible – or:
• What Admins and IT departments have to cope with
Deep Diving …
• The 30 most important server statistics (out of ~2,000)
• … and Clients?
• … and Groups?
• … and Databases?
3. AusLUG2012
About Florian Vogler
CEO & CTO – (hopefully) representative for the great work of my colleagues at panagenda
Born in Hamburg (DE), lived in London (UK),
Vienna (AT), Frankfurt (DE), Alicante (ES);
currently back in Frankfurt (DE)
Lotus Notes / Domino since 1992
Started to work with Notes at Raiffeisen Austria
• Administration and Development
• 35,000 users worldwide (today > 100,000)
Since 2002: core competency in Client Management and
Notes / Domino infrastructure analysis and optimization
I enjoy working with many great companies in many different
countries (I travel *a lot*)
4. AusLUG2012
About panagenda
We build symbiotic relationships with our customers and partners for an ongoing, joint win-win
HQ: Vienna/AT, offices in Heppenheim near Frankfurt/DE, Boston/USA
Development of standard products
> 4 million licenses in over 70 countries
IBM Lotus Notes Client Management
MarvelClient :: „99%“ manageability
(not „just“ IBM Lotus Domino) Server Analytics, Monitoring & Reporting
GreenLight :: realtime, longterm, smart
Analyze Groups, Certifiers and ACLs
GroupExplorer :: better transparency, security & automation
plus: NameChanger (Name changes), DatabaseExplorer (Design Analysis), Notes2Web (Web transformation)
6. AusLUG2012
What Admins and IT departments have to cope with
• Above all: Lack of knowledge (apologies)
• Mostly because of overload
No time (anymore) for the inner workings of clients, servers, and systems
Growing complexity of single systems
Growing number of systems
„Laying the egg“ = yes; (proactive) „nurturing“ = no.
• Unknown sources of knowledge
• Lack of time
• If you don’t take the time to do things right,
you’ll need the time to do them over
• „Wrong“ and/or missing tooling
Grown environments: large servers are fundamentally different
from small ones; new ones (8) from old ones (< 8)!
[Figure: Development stages of teddy bears: newborn bear without fur; 3 month old bear with thick fur; full-grown teddy]
7. AusLUG2012
What Admins and IT departments have to cope with
Systemic interactions / dependencies in Lotus Notes / Domino
Servers: Hardware (CPU, Memory); Data storage; Network connection; Configuration; Databases, tasks, mail traffic; …
Clients: Hardware; Data storage; NW connection; Configuration; Databases; …
Databases: ODS; Size; Reader fields; Design; # & Size of documents; …
Across all: Geographies; Network (bandwidth, structure); Online/Offline; Clustering/Loadbalancing; …
8. AusLUG2012
Lotus Domino „out of the box“ tooling
Public NAB
– Servers
– Clusters
– People/Groups
– Directory
– Messaging
– Replication
– Policies
– Web Configuration
9. AusLUG2012
Lotus Domino „out of the box“ tooling
Public NAB (8)
Log.nsf
– Miscellaneous (!!)
– Mail
– Replication
– (Database) Usage
– Passthru Connections
10. AusLUG2012
Lotus Domino „out of the box“ tooling
Public NAB (8)
Log.nsf (5)
Admin Client
– Monitoring
Tip 1: Enable Health Monitoring in Admin Preferences
Tip 2: Disable „Refresh server bookmarks“
14. AusLUG2012
Lotus Domino „out of the box“ tooling
Public NAB (8)
Log.nsf (5)
Admin Client (20)
Events & DDM (6)
Monitoring Results (statrep) (3)
8 + 5 + 20 + 6 + 3 = 42
That’s at least 42 views / areas one should monitor ...
Although 42 is „the answer to life, the universe and everything“
(according to the Hitchhiker’s Guide to the Galaxy),
that doesn’t help much for LN/D Monitoring & Analysis
Tip 3: In case you don’t know the Hitchhiker’s Guide to the Galaxy by Douglas Adams: Must Read
15. AusLUG2012
Making more of what you already have
• Many companies don‘t even use what‘s in the box already …
• (As said earlier): Realtime Server Monitoring with Health Monitoring
• DDM – Domino Domain Monitoring (sometimes a bit too much, but then again much better than nothing!)
• Frequent reviews of Groups
• Frequent checking of the most important server stats (more on that later)
• Look through Lotusphere presentations
• …
• Investigate Usage views in log.nsf; for example …
16. AusLUG2012
A sample analysis of usage information from log.nsf
(that you can do yourself easily)
Copy/Paste in Excel
Sort the data by, e.g., transactions
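If you prefer a script over Excel, the same sort can be sketched in a few lines of Python. The rows below are invented sample data, and the column names (database, user, transactions) are assumptions about what was copied from the usage view, not the actual log.nsf layout.

```python
# Hypothetical rows copied from a log.nsf usage view; the columns are assumed.
usage_rows = [
    {"database": "mail/jdoe.nsf", "user": "John Doe", "transactions": 1250},
    {"database": "apps/crm.nsf", "user": "Jane Roe", "transactions": 9800},
    {"database": "mail/jroe.nsf", "user": "Jane Roe", "transactions": 310},
]


def sort_by_transactions(rows):
    """Return the rows ordered from most to fewest transactions."""
    return sorted(rows, key=lambda row: row["transactions"], reverse=True)


top_rows = sort_by_transactions(usage_rows)
for row in top_rows:
    print(f'{row["transactions"]:>8}  {row["database"]}  ({row["user"]})')
```

Sorting in descending order puts the heaviest databases first, which is usually where a review should start.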
17. AusLUG2012
Possibilities are endless (unfortunately, time is not)
• In almost all of the aforementioned areas one can (and
should) „dig deeper“
• Unfortunately, digging deeper requires (time-consuming)
correlation of data, e.g. …
• Connection documents and log.nsf (db usage):
How much Mail- and/or Replication traffic is there between which
servers?
• Clients and log.nsf - database usage:
Which users cause what load from where?
• Database details from clients and servers:
Who has replicas of databases s/he no longer has access to?
Who has (unencrypted) replicas of critical databases?
• Network compression between servers and clients
• A lot of the data is either already there or (relatively ;-))
easy to get a hold of
• Correlation pays back (repeatedly) …
18. AusLUG2012
A picture says a thousand words …
Topological visualization of Mail- & Replication-Traffic between Servers
19. AusLUG2012
A picture says a thousand words …
One way to look at network compression
87% of your IBM Lotus Domino servers use port compression
(33 of 38 servers) [pictogram: one icon = 1 server]
75% of your IBM Notes Clients use port compression
(35,409 of 47,212 clients) [pictogram: one icon = 1,000 clients]
20. AusLUG2012
A picture says a thousand words …
Another way to look at network compression
[Bar chart: transferred vs. saved volume per day (GByte). Current setup: 2.30 transferred; no port compression: 3.30 transferred; full port compression: 1.65 transferred]
● Network transfer volume per day without port compression: 3.3 GByte
● Current settings: 60% configured „correctly“, saving ~1 GByte / 30%
● Applying port compression to all your servers and clients could save you an
additional ~0.65 GByte every day, which is an additional 28% reduction /
an absolute 50% reduction of traffic
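The percentages on this slide can be re-derived from the three chart values. The sketch below only redoes that arithmetic; reading the „60%“ as the share of the possible saving that is already realised is an assumption, not something the slide states.

```python
# Daily transfer volumes in GByte, taken from the chart on this slide.
no_compression = 3.30    # nothing compressed
current_setup = 2.30     # today's mixed configuration
full_compression = 1.65  # every server and client compressed

saved_today = no_compression - current_setup
saved_today_pct = saved_today / no_compression * 100
extra_saving = current_setup - full_compression
extra_saving_pct = extra_saving / current_setup * 100
total_saving_pct = (no_compression - full_compression) / no_compression * 100
# Assumed reading of "60% configured correctly": realised share of the possible saving.
realised_share_pct = saved_today / (no_compression - full_compression) * 100

print(f"saved today:      {saved_today:.2f} GByte ({saved_today_pct:.0f}%)")
print(f"extra possible:   {extra_saving:.2f} GByte ({extra_saving_pct:.0f}% of current)")
print(f"total possible:   {total_saving_pct:.0f}% of uncompressed traffic")
print(f"already realised: {realised_share_pct:.0f}% of the possible saving")
```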
22. AusLUG2012
Before we look at the 30 most important server statistics …
• Difficult – if not impossible – to test in the lab
• Start with the obvious / easy things
• Note down current settings before changing them
• Think about possible interdependencies
• „Too much good“ can actually harm performance
(or lead to „Out of Memory“)
• Don‘t change (too) many things at once
• Unless it‘s absolutely necessary / so „documented“
• Watch your servers for a (sensible) period of time after making
changes
• Check whether/that your servers are doing better
• „Google“
• Think along/ahead
• Have the heart to try
• This is just the beginning – stay curious!
23. AusLUG2012
And another preliminary note (last one(s), promised ;-))
• Many of the following statistics cannot be grasped with a ‚single‘ „sh sta“, but require analysis „over
time“
• Otherwise you won’t know whether you’re looking at a permanent / recurring / one-time / occasional problem
• Otherwise you won’t know whether changes actually improved things (or made things worse)
• A picture says a thousand words …
• Admin Client can be used as a starting point … (unfortunately, it is very limited)
24. AusLUG2012
ViewRebuildDir & Disk optimization(s)
Most important of all: free disk space & disk performance
(„30%“ to prevent fragmentation)
Separate, dedicated disks for …
– Translog
– Data
– If possible, own disk for page file/OS
– „ViewRebuildDir“=… : view indexing on its own disk
– From 8.5.3 on, where necessary/wanted: .ft directories on own disk
– DAOS („cheap“)
25. AusLUG2012
Server.Availability
Shows how available = „ready to respond“ a Server is (in %)
< 30% means trouble (or loadbalancing);
IF the Availability Index is correct in the first place …
(Only!) if the server is properly busy: run „sh ai“ on the server console;
this results in a recommendation on how to tune notes.ini: SERVER_TRANSINFO_RANGE
From Notes 8.5 and up, you are advised to set:
– notes.ini: Server_MinPossibleTransTime=1500
– notes.ini: Server_MaxPossibleTransTime=20000000
Important: Delete loadmon.ncf after server shutdown in order to delete old values
26. AusLUG2012
Keep an eye on Monitor.* Warnings; Examples
Monitor.Last.ADMIN PROCESS.Warning(High)Text = Disk space statistics
could not be found on Servername/Cert.
Monitor.Last.EVENT MONITOR.Warning(High)Text = Event: Error adding event
document to Domino Domain Monitoring: Event correlation cache is full. You
can increase its size via the NOTES.INI setting
EVENT_CORRELATION_POOL_SIZE.
Monitor.Last.INDEX ALL.Warning(High)Text = Error updating view '#4538' in
mailnameabc.nsf: The single copy template associated with this database
cannot be located.
Monitor.Last.SMTP SERVER.FailureText = SMTP Server: Initialization
failure: Message Queue name already in use.
Monitor.Last.STATISTICS.Warning(High)Text = Unable to update activity
document in log database for mailnamexyz.nsf: In Datenbank kann nicht
geschrieben werden, da die Datenbank die erlaubte Größe
überschreiten würde.
[German: cannot write to the database because the database would exceed its permitted size.]
27. AusLUG2012
Server.Sessions.Dropped
Tells you how many sessions have been ‚dropped‘
since last server restart
Happens when
• issuing a server-side „Drop all“
• pressing Ctrl+Break on clients („frustration meter“)
[Chart annotations: „Drop all“; „different“ problem]
Server.Sessions.Dropped = 25407
18/6 – 18/10 = 4*30 = 120 days
25407 / 120 ≈ 212 sessions dropped per day
Should be further correlated with peak # of users
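The per-day figure, and one possible way to relate it to the peak number of users, can be sketched as follows. The peak user count is a hypothetical value for illustration; the slide does not give one.

```python
def dropped_per_day(total_dropped, days_since_restart):
    """Average number of dropped sessions per day since the last restart."""
    return total_dropped / days_since_restart


def dropped_per_day_per_user(total_dropped, days_since_restart, peak_users):
    """Normalise by peak users so that servers of different size compare."""
    return dropped_per_day(total_dropped, days_since_restart) / peak_users


rate = dropped_per_day(25407, 4 * 30)  # values from the slide
print(f"{rate:.1f} sessions dropped per day")

hypothetical_peak_users = 3000  # assumption, not from the slide
per_user = dropped_per_day_per_user(25407, 4 * 30, hypothetical_peak_users)
print(f"{per_user:.3f} dropped sessions per peak user per day")
```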
28. AusLUG2012
Platform.LogicalDisk.*
Platform.LogicalDisk.1.AssignedName = D
Platform.LogicalDisk.1.AvgQueueLen = 0
Platform.LogicalDisk.1.AvgQueueLen.Avg = 0,01
Platform.LogicalDisk.1.AvgQueueLen.Peak = 1,01
Platform.LogicalDisk.1.BytesReadPerSec = 0
Platform.LogicalDisk.1.BytesWrittenPerSec = 10.172,49
Platform.LogicalDisk.1.PctUtil = 0,22
Platform.LogicalDisk.1.PctUtil.Avg = 0,86
Platform.LogicalDisk.1.PctUtil.Peak = 101,07
Platform.LogicalDisk.1.ReadsPerSec = 0
Platform.LogicalDisk.1.WritesPerSec = 2,07
Platform.LogicalDisk.2.AssignedName = C
Platform.LogicalDisk.2.AvgQueueLen = 0,01
Platform.LogicalDisk.2.AvgQueueLen.Avg = 0,73
Platform.LogicalDisk.2.AvgQueueLen.Peak = 34,74
Platform.LogicalDisk.2.BytesReadPerSec = 17.272,75
Platform.LogicalDisk.2.BytesWrittenPerSec = 63.697,52
Platform.LogicalDisk.2.PctUtil = 1,11
Platform.LogicalDisk.2.PctUtil.Avg = 72,8
Platform.LogicalDisk.2.PctUtil.Peak = 3.473,81
Platform.LogicalDisk.2.ReadsPerSec = 2,58
Platform.LogicalDisk.2.WritesPerSec = 7,3
Interpretation
AvgQueueLen: GOOD below 2%, BAD above 5% (1-100% is reported as 0,01 – 1,0!)
PctUtil: GOOD below 80% (1-100% is reported as 1-100)
NOTE: on SAN/NAS you may need to divide by the # of spindles
Solution
Various parameters (bufferpool, cache, namelookup) and OS / Disk Tuning
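A minimal sketch of how these thresholds could be applied to the values above. It assumes the statistics arrive as text in the locale shown on the slide (dot as thousands separator, comma as decimal separator); the „WATCH“ and „NOT GOOD“ labels for values the slide does not rate are additions for illustration.

```python
def parse_stat_number(text):
    """Convert '3.473,81' (dot = thousands, comma = decimal) to a float."""
    return float(text.replace(".", "").replace(",", "."))


def rate_avg_queue_len(value):
    """AvgQueueLen is reported as a fraction: 0.01 means 1%."""
    if value < 0.02:
        return "GOOD"
    if value > 0.05:
        return "BAD"
    return "WATCH"  # between the two thresholds; label is an assumption


def rate_pct_util(value):
    """PctUtil is reported as a percentage: 80 means 80%."""
    return "GOOD" if value < 80 else "NOT GOOD"


disk1_queue = rate_avg_queue_len(parse_stat_number("0,01"))
disk2_queue = rate_avg_queue_len(parse_stat_number("0,73"))
disk2_util = rate_pct_util(parse_stat_number("72,8"))
print(disk1_queue, disk2_queue, disk2_util)
```

On the sample values, disk 2 passes the PctUtil.Avg check but fails the AvgQueueLen.Avg check, which is one reason to look at both.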
30. AusLUG2012
Mail.Mailbox.*
(Mail.Mailbox.AccessConflicts / Mail.Mailbox.Accesses) x 100
Must be < 2, otherwise: add another Mailbox
(diminishing returns above 4-5 mailboxes)
Example:
Mail.Mailbox.AccessConflicts = 1636
Mail.Mailbox.Accesses = 189864
= 0,86 = ok
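The same check as a small Python sketch, using the example values from this slide. The function name and the verdict strings are illustrative only.

```python
def mailbox_conflict_pct(access_conflicts, accesses):
    """Access conflicts as a percentage of all mailbox accesses."""
    if accesses == 0:
        return 0.0  # no accesses yet, so nothing to rate
    return access_conflicts / accesses * 100


pct = mailbox_conflict_pct(1636, 189864)
verdict = "ok" if pct < 2 else "add another mailbox"
print(f"{pct:.2f}% conflicts -> {verdict}")
```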
31. AusLUG2012
Update.PendingList
Update.PendingList = number of views waiting to be updated
If Update.PendingList „is often“ > 0, then …
Notes.ini:
Update_Fulltext_Thread=1
FTUPDATE_IDLE_TIME=4
Background:
• If you have many databases/apps …
• … and a busy update task
– Full text index could be the reason for slowing down / “blocking” view indexing
• Separate FTI and view updates
– FTI then runs in its own Memory Thread
• Improves performance
• Update_Fulltext_Thread=1
Speaking of Fulltext-Indexing:
You can isolate the FTI thread from the limited Domino update pool:
ftg_use_sys_memory=1
FTI thread then gets memory from OS pool;
relieves Domino system memory
32. AusLUG2012
Database.Database.BufferPool.*
Database.Database.BufferPool.PerCentReadsInBuffer = 78,96
PerCentReadsInBuffer: BAD below 90%, PERFECT above 98%
(99.9% is bad, too!)
– A low value typically means too many read requests go to disk
– Server needs a larger BufferPool
Solution: notes.ini NSF_Buffer_Pool_Size_MB=n (in MB)
─ Default: 512 MB
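The rating can be sketched as a small function. The slide says 99.9% is bad too but gives no exact cut-off, so treating everything from 99.9% upwards as „SUSPICIOUS“ is an assumption, as is the „OK“ label for the range the slide leaves unnamed.

```python
def rate_reads_in_buffer(pct):
    """Rate Database.Database.BufferPool.PerCentReadsInBuffer."""
    if pct < 90:
        return "BAD"
    if pct >= 99.9:
        return "SUSPICIOUS"  # assumed cut-off for "99.9% is bad, too"
    if pct > 98:
        return "PERFECT"
    return "OK"  # 90-98%: not rated explicitly on the slide


sample_rating = rate_reads_in_buffer(78.96)  # value from the slide
print(sample_rating)
```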
33. AusLUG2012
Database.DbCache.*
Database.DbCache.CurrentEntries = 1647
Database.DbCache.HighWaterMark = 1691
Database.DbCache.MaxEntries = 1536
Database.DbCache.OvercrowdingRejections = 0
GOOD = HighWaterMark < MaxEntries
GOOD = 0 OvercrowdingRejections
Solution:
– notes.ini NSF_DbCache_MaxEntries = n
• Default: NSF buffer pool size (in MB) x 3
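Applied to the sample values on this slide, the two GOOD conditions can be checked as below. This is only a sketch; the dictionary layout and the wording of the findings are illustrative.

```python
# Sample values from the slide, keyed by the last part of the statistic name.
db_cache = {
    "CurrentEntries": 1647,
    "HighWaterMark": 1691,
    "MaxEntries": 1536,
    "OvercrowdingRejections": 0,
}


def check_db_cache(stats):
    """Return a list of findings; an empty list means both checks pass."""
    findings = []
    if stats["HighWaterMark"] >= stats["MaxEntries"]:
        findings.append("HighWaterMark has reached MaxEntries: consider raising "
                        "NSF_DbCache_MaxEntries")
    if stats["OvercrowdingRejections"] > 0:
        findings.append("OvercrowdingRejections > 0: cache is overcrowded")
    return findings


findings = check_db_cache(db_cache)
for finding in findings:
    print(finding)
```

In the sample, the high water mark (1691) is above MaxEntries (1536), so the first check fails even though there are no overcrowding rejections.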
34. AusLUG2012
Replica.Cluster.*
Replica.Cluster.Failed
Replica.Cluster.SecondsOnQueue
Replica.Cluster.WorkQueueDepth
SecondsOnQueue: PERFECT below 10, BAD above 15
WorkQueueDepth: PERFECT below 10, BAD above 15
Solution:
– Add more cluster replicators
– Optimize cluster load (e.g. “manually” balance users across the cluster if it is not load-balanced)
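Both cluster statistics share the same thresholds, so one function can rate them. The sample readings below are hypothetical (the slide shows none), and the „OK“ label for the 10-15 range is an assumption.

```python
def rate_cluster_queue(value):
    """Rate Replica.Cluster.SecondsOnQueue or Replica.Cluster.WorkQueueDepth."""
    if value < 10:
        return "PERFECT"
    if value > 15:
        return "BAD"
    return "OK"  # 10-15: not rated explicitly on the slide


# Hypothetical readings, for illustration only.
readings = {"SecondsOnQueue": 4, "WorkQueueDepth": 22}
report = {name: rate_cluster_queue(value) for name, value in readings.items()}
print(report)
```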
36. AusLUG2012
Database.NAMELookupCache*
Database.NAMELookupCacheCacheSize = 2.513.328
Database.NAMELookupCacheHits = 24.628.339
Database.NAMELookupCacheMisses = 48.160.502
IMPORTANT: NoHitHits!
Cache too small or too large(!)
Misses > Hits: „double-check“
notes.ini: NLCache_Size=16000000
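To put the sample numbers in perspective, the hit ratio can be computed from hits and misses. A minimal sketch; the slide itself only compares misses against hits, so the ratio is an additional way of looking at the same two values.

```python
def name_lookup_hit_ratio_pct(hits, misses):
    """Share of name lookups that were answered from the cache, in percent."""
    total = hits + misses
    if total == 0:
        return 0.0
    return hits / total * 100


hits = 24628339    # Database.NAMELookupCacheHits from the slide
misses = 48160502  # Database.NAMELookupCacheMisses from the slide
ratio = name_lookup_hit_ratio_pct(hits, misses)
needs_review = misses > hits
print(f"hit ratio {ratio:.1f}%, double-check cache size: {needs_review}")
```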
37. AusLUG2012
Server.ConcurrentTasks*
Server.ConcurrentTasks
Server.ConcurrentTasks.Waiting
Waiting should be ZERO (0)
Solution:
─ Server_Pool_Tasks = n (e.g. 80)
─ Server_Max_Concurrent_Trans = m
(e.g. Server_Pool_Tasks * # Ports)
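The rule of thumb above can be turned into notes.ini lines with a few lines of Python. The port count of 2 is a hypothetical value; use the number of ports enabled on your server.

```python
def concurrent_task_settings(pool_tasks, port_count):
    """Derive both settings from the rule of thumb on this slide."""
    return {
        "Server_Pool_Tasks": pool_tasks,
        "Server_Max_Concurrent_Trans": pool_tasks * port_count,
    }


settings = concurrent_task_settings(pool_tasks=80, port_count=2)  # 2 ports assumed
ini_lines = [f"{name}={value}" for name, value in settings.items()]
print("\n".join(ini_lines))
```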
38. AusLUG2012
Platform.PagingFile.Total.*
Platform.PagingFile.Total.PctUtil = 0,28
Platform.PagingFile.Total.PctUtil.Avg = 0,14
Platform.PagingFile.Total.PctUtil.Peak = 0,8
PctUtil.Avg: OK close to 0%, BAD above 10%
OS Level tuning, Check Memory
Note: If “sh sta” doesn’t show Platform.* stats, see the Admin Help
40. AusLUG2012
Sponsor Break – Sneak Peek during Social Evening
http://panagenda.com/giftoftransparency
• Efficient Client-Analysis is impossible without additional tooling
• FREE 4 weeks license of panagenda GreenLight – our server
monitoring and reporting solution – includes Database Analyzer for 1
year for one of your servers
• FREE one year license of panagenda MarvelClient Analyze
• The results speak for themselves on „just“ the clientside
• The results can also be used together with GreenLight
• For groups and databases, we also have GroupExplorer and
DatabaseExplorer
• Whether we may help you is up to you
41. AusLUG2012
Timeout
Spending 60 minutes
on Performance Improvements
can be compared to
a walk on the tip of the iceberg –
we have worked on
MANY more business cases
and solved MANY more problems
than those mentioned just now.
If your problem was not mentioned in this session –
be it a Client, Server, Design, Admin
or other challenge:
we would love to hear from you.
42. AusLUG2012
Thank you for listening – Questions? Answers!
Q&A
43. AusLUG2012
Contact me – I look forward to hearing from you!
panagenda GmbH
Doblhoffgasse 7 / 6a :: 1010 Vienna :: Austria
Web: http://www.panagenda.com
Email: office@panagenda.com
Fax: +43 1 89 012 89 – 15
44. AusLUG2012
Resources / Links
• Daniel Nashed, Nash!Com
• LS08: BP112
• LS11: BP102, BP110, BP118
• LS12: BP110, BP121, ID112, ID114
• Windows Indexing: http://bit.ly/ACzO6Z
• „The internet“ – google „Domino performance ibm“;
great IBM whitepapers and articles, some very good sites out there