An IBM Connect 2013 session by David O'Neal (Consultant, Infrastructure Engineering, Nationwide), Wouter Aukema (CTO, Trust Factory) and Florian Vogler (CEO, panagenda)
2. What we’ll cover today
Introduction
IBM Notes and Domino @ Nationwide
What we did
– Collect Data (what sources & some stats)
What we found
– Confirmations & Opportunities
– Configuration, Usage, Performance, Security
What it means
– Short Term Quick Wins
– Long Term Strategic Takeaways
Conclusions / What we learned
Q&A
3. Introduction – who’s who?
About Nationwide
– Nationwide Mutual Insurance Company, based in Columbus, Ohio, is one of the largest
and strongest diversified insurance and financial services organizations in the U.S. and
is rated A+ by both A.M. Best and Standard & Poor’s. The company provides customers
a full range of insurance and financial services, including auto insurance, motorcycle,
boat, homeowners, pet, life insurance, farm, commercial insurance, annuities,
mortgages, mutual funds, pensions, long-term savings plans and specialty health
services.
About Trust Factory
– Trust Factory’s DNA provides true insight into server performance and scaling
opportunities. DNA is also used by IBM worldwide as Domino DoubleCheck.
About panagenda
– With more than 5.5 million licenses of its products, panagenda helps customers in over
70 countries analyze and optimize their IBM environments.
4. IBM Notes and Domino @ Nationwide
The Nationwide Notes/Domino Environment
– Production use began in 1997 with version 3.3
– Migrated/Consolidated to Notes from cc:Mail and a variety of different mainframe email
systems
– Current environment
• 6 Domino Domains
• 200+ Domino servers on Microsoft Windows® (8.x mixture – mostly 8.5.3 for mail)
(Mail, Management, Application, Blackberry and Good servers)
• Active / Passive clustering across two data centers
• 56,000+ Notes clients (mostly 8.5.2)
• 15,000,000+ messages routed weekly
• ~20 Sametime 8.5.2 IFR1 servers using Domino and WebSphere
• ~1200-1400 Domino applications with 700 being active
5. IBM Notes and Domino @ Nationwide
What is Nationwide trying to accomplish by performing this in-depth analysis?
– Server:
• Discover inconsistent configurations, and find gaps where Domino does not readily
report items that could potentially turn into problems.
– Client:
• Discover and inventory client side settings, configurations and local databases to get
a better understanding of client health and functionality.
– Environmental:
• Combine server and client findings to get a holistic view of our Notes/Domino
environment.
6. What we did
Collect data from Domino servers
– statrep.nsf
– log.nsf
– catalog.nsf
– directories (names.nsf, DA)
Inventory Notes clients
– notes.ini
– desktop, bookmarks, names
– local databases
– various OS and Notes properties
Talk
7. What we did
Collect
– log, statrep and catalog from 151 servers
– 33,000 users used 35,000+ clients
– 690,000+ documents with 315,000+ attachments collected = 3.5 GB of raw data
– 1.6 million desktop icons, 1.5 million local databases, 5.4 million notes.ini entries,
8.5 million client and OS details
Analyze
– DNA: Compared this engagement against 2+ million other users
– 100+ views created consuming 30+ GB of disk space
Interpret and Correlate
… and now for the meat …
8. Domino Environment Overview
1 Domino Directory
39,725 Users Registered
34,057 Users Active
153 Servers Registered
151 Servers Analyzed
4 Domino Releases
4 Operating Systems
1,361,855 ACL Entries
39,369 Groups Registered
1,370,468 Group Members
133,540 Databases Deployed
47,178 Databases Touched
1,477,390 Views Defined
494,006 Views Indexed
381 View Storage (GB)
82,131 Db Storage (GB)
6 February 2013
9. DNA Benchmark
Active versus Registered Users
[Bar chart: active users as a percentage of registered users (0% to 100%) for Nationwide, Lowest Customer, DNA Average and Highest Customer]
Nationwide: 34,057 active users
Unused Licenses, Web Users, Regular Absence
10. DNA Benchmark
Time Online
[Chart: Online Time (hours per user) and Session Duration (mins per session) per customer; annotation: On average with DNA]
Measure | Nationwide | Lowest Customer | DNA Average | Highest Customer
Session Duration (mins per session) | 3 | 1 | 4 | 22
Online Time (hours per user) | 24 | 2 | 23 | 77
11. User Demand Profiling
(Nationwide, 34,057 active accounts)
[Chart: percentage of active users (0% to 25%) by Distinct Hours Online per Day (2 to 24), with segments labelled Remote Workers, Office Workers and System Accounts]
12. End User Demand Characteristics
Nationwide
Only mail servers in scope for DNA
[Stacked bar chart: share of each demand type (0% to 100%) per measure]
Notes Sessions Document Reads Document Writes Db Transactions Network Traffic Session Duration
check new mail 6% 0% 0% 1% 0% 0%
system dbs 6% 0% 0% 1% 0% 0%
mail files 80% 85% 79% 85% 86% 99%
directories 3% 1% 5% 4% 0% 0%
applications 5% 14% 16% 9% 14% 1%
13. End User Demand Characteristics
Other IBM Customer
Extreme high docreads on Directory databases
[Stacked bar chart: share of each demand type (0% to 100%) per measure]
Notes Sessions Document Reads Document Writes Db Transactions Network Traffic Session Duration
check new mail 19% 0% 0% 3% 0% 1%
system dbs 13% 4% 0% 5% 1% 1%
mail files 33% 24% 76% 54% 55% 72%
directories 16% 41% 3% 11% 11% 4%
applications 19% 31% 22% 27% 33% 23%
14. User Demand on 16,739 Databases
Nationwide
Showing only databases touched by >1 users (47,175 databases touched by all users)
Majority of apps are MC
[Bubble chart: one bubble per database, logarithmic axes from 1 to 100.000.000; horizontal axis: KiloBytes Read from Server; vertical axis label not recoverable]
Legend: 369 Application Dbs, 560 Domino Directory Dbs, 15,209 Mailfiles, 55 Mailin databases, 143 Server Mail Boxes, 403 System databases
15. End User Demand at Nationwide
Classified by Demand Level
[Stacked bar chart: share of total demand (0% to 100%) per measure, split by user class]
Measures: Document Writes, Document Reads, Database Transactions, Network Traffic (client to server), Network Traffic (server to client), User Sessions
User classes: Extreme (1), Intensive (16), Moderate (804), Light (33,236)
1 user does 15% of total network demand
16. Domino Servers at Nationwide
Classified by Maximum Session Concurrency
[Bar chart: number of servers (0 to 95) per concurrency classification]
Classification | Level | Servers
Very Low | < 50 | 87
Low | 50 - 249 | 23
Average | 250 - 749 | 23
Normal | 750 - 1499 | 17
High | >= 1500 | 1
Redistributing the load can reduce the number of servers by up to 87
18. Network Compression
How Much is Notes Network Compression Used?
[Pie chart: network traffic with compression Enabled 75%, Disabled 25%; includes traffic from users and servers]
[Bar chart: # Users making use of Notes Network Compression; % Active Users (0% to 100%), Enabled versus Disabled, for Persons and Servers]
Very few customers have this properly implemented
19. Deployment Integrity
Integrity check | # Databases
Duplicate Replica On Same Server | 380
Duplicate Template On Same Server | 341
Replicas Acting As Different Template | 610
Same Replica But Different Inheritance | -
Grand Total | 1,331
Entries appearing in multiple documents
Document Type | Item Type | Nr of Documents
group docs | listname | 3
mail-in docs | fullname | 22
person docs | fullname | 2
Grand Total | 27
PubNames, DirCat & DA at risk (!)
11 Group Cycles Detected
20. Basic Security Checks
Internet Password Strength
Variations found | Accounts
'password' | 0
'secret' | 0
firstname | 0
lastname | 0
shortname | 0
companyname | 0
Grand Total | 0
1st Customer with NO issues :)
Databases with Anonymous Access
Access Level | Databases | Templates
Author | 84 | 102
Editor | 11 | -
Manager | 3 | 302
Reader | 2,507 | 222
Grand Total | 2,605 | 626
22. Diving right into client-side analysis
The following slides dive into
various client-side details
In many cases, the Nationwide
environment is surprisingly clean
– Your environment will
most probably look very different
23. Notes 8.0.2 & 8.5.2
Although there are 1,817 clients with 8.0.2,
only 26 have Create_R8_Databases enabled =
they do not leverage the benefits of ODS 48
25. Local replicas of public addressbook
Local replicas of the public addressbook
beyond cutoff
Risk of replicating deleted documents
back into server-side replica
Enable PIRC:
26. Local addressbooks:
Version mismatch
Checked rows show configurations
where names design matches
client version
– (might still have wrong ODS, though)
In general, design mismatch of
system databases
– slows down client startup and beyond
– causes unexpected behaviour or
non-functioning of Policies
can be fixed by
– making sure clients have correct templates
– removing TemplateSetup= from notes.ini
27. Local bookmark.nsfs:
Version mismatch
Checked rows show configurations
where bookmark design matches
client version
– (might still have wrong ODS, though)
28. Local cache.ndk:
Version mismatch
Checked rows show configurations
where cache design does NOT match
client version
– (might still have wrong ODS, though)
Cache.ndk must be deleted and
re-created together with
CREATE_R85_DATABASES=1
in notes.ini - for it to have proper
design and ODS
(make sure client has correct cache.ntf)
30. Local log.nsfs:
Version mismatch
Checked rows show configurations
where log design does NOT match
client version
– (might still have wrong ODS, though)
31. Notes.ini:Log=
A couple of users have multiple log= lines in notes.ini
Since only the first entry is actually read in such a case,
logging does not work as expected for those users
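A check for this condition could be scripted against collected notes.ini files. The sketch below is a minimal illustration and not the tooling used in this session; the function name `find_duplicate_keys` and the sample content are hypothetical, and comparing keys case-insensitively is an assumption.

```python
from collections import defaultdict


def find_duplicate_keys(ini_text):
    """Return {key: [line numbers]} for keys that appear more than once.

    Assumptions: keys are compared case-insensitively, and lines
    without '=' (such as the [Notes] header) are ignored.
    """
    seen = defaultdict(list)
    for lineno, line in enumerate(ini_text.splitlines(), start=1):
        if "=" not in line:
            continue
        key = line.split("=", 1)[0].strip().lower()
        if key:
            seen[key].append(lineno)
    return {key: lines for key, lines in seen.items() if len(lines) > 1}


# Hypothetical notes.ini content, for illustration only.
sample = """[Notes]
KeyFileName=user.id
Log=log.nsf, 1, 0, 7, 40000
Timezone=5
Log=log.nsf, 1, 0, 14, 40000
"""

duplicates = find_duplicate_keys(sample)
print(duplicates)
```

Reporting the line numbers makes it easy to see which entry comes first and is therefore the one actually read.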
32. More on ODS levels
Various databases and templates do not have
an ideal ODS …:
Adding
CREATE_R85_DATABASES=1
and
NSF_UpdateODS=1
to notes.ini can help!
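To find clients that lack these two entries, inventoried notes.ini content could be checked with a few lines of Python. This is a sketch under assumptions: the helper name `missing_ods_settings` is hypothetical, keys are matched case-insensitively, and only the first occurrence of a key is considered.

```python
# The two settings named on this slide, with the value they should have.
REQUIRED = {"CREATE_R85_DATABASES": "1", "NSF_UpdateODS": "1"}


def missing_ods_settings(ini_text):
    """Return the required settings that are absent or not set to 1."""
    values = {}
    for line in ini_text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            # Assumption: keep only the first occurrence of each key.
            values.setdefault(key.strip().lower(), value.strip())
    return [name for name, wanted in REQUIRED.items()
            if values.get(name.lower()) != wanted]


# Hypothetical client configuration with only one of the two entries.
print(missing_ods_settings("[Notes]\nCREATE_R85_DATABASES=1\n"))
```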
33. More notes.ini entries …
Less than 1% of all users have port compression
disabled, but 25% of all traffic is uncompressed
must be enabled on BOTH servers and clients
identify servers that are used by users but have
port compression disabled
34. EXTMGR_ADDINS= …
Various users have
EXTMGR_ADDINS
entries in notes.ini which are
separated by a blank
surprisingly DOES work
(side-effects unknown)
35. Mail
Who encrypts email when saving emails?
Who encrypts sent email?
Who signs sent emails?
37. Cache.ndk
Users where Cache= in notes.ini points to
– A dedicated file/path
– Partly filepaths in which users
might not have write permissions
(e.g. Notes program files directory)
40. … same for ini:InstallType=
Identifying Client/Admin/Designer configurations:
41. Hardware/OS details: disk space
Users with too little free disk space
– might soon call helpdesk
– may experience stability issues
– have high disk fragmentation = slooooow
43. Locations: do not use IP addresses as mailserver names …
A couple of users have an IP address configured as their mailserver
breaks Policies
DNS names as mailservers could become a problem if the DNS domain were ever
to be renamed …
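Location documents with an IP address as mail server can be flagged mechanically once the values have been inventoried. The sketch below uses Python's standard `ipaddress` module; the location names and server values are hypothetical examples, not data from this engagement.

```python
import ipaddress


def is_ip_address(mail_server):
    """Return True if the configured mail server value is an IPv4/IPv6 literal."""
    try:
        ipaddress.ip_address(mail_server.strip())
        return True
    except ValueError:
        return False


# Hypothetical location documents: location name -> configured mail server.
locations = {
    "Office": "Mail01/Acme",
    "Home": "10.1.2.3",
    "Travel": "mail01.example.com",
}

flagged = [name for name, server in locations.items() if is_ip_address(server)]
print(flagged)
```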
44. Mailfile replicas?
98 users work off a local replica
+330 managed replica users
BUT: 3,407 users have a local replica
and: 149 users have more than one mail replica …
– 39 of these local replicas are beyond cutoff:
47. Analyzing desktop icons (details)
196,930 local databases with an icon (e.g. bookmark.nsf)
380,243 local databases without an icon (e.g. help files, cache.ndk, …)
1,266 templates on desktops
37,108 templates not on desktops (think shared data directory)
36,865 replicas without any icon
267 replicas without a local icon
2,686 replicas without a server icon
862,395 template replicas without any icon
14 template replicas without a local icon
765 template replicas without a server icon
49. Conclusions
Mission accomplished
– Provided a holistic view across servers and clients
Mission not accomplished (yet)
– This is work in progress
Nationwide is the 1st customer out of many that leverages most of the
features/potentials of the N/D 8.5 platform
– Optimization potential almost exclusively in areas “without features”
– Implemented Domino password security the way it should be
50. What we learned
Detailed data helps to leverage IBM Notes and Domino to its fullest potential
… and helps shifting from reactive to proactive
Assumptions vs. Evidence
– Eliminate best guess/hope based working
Find out and focus on what really matters
In this benchmark, DNA calculates the number of active users as a percentage of the number of person documents registered in the Domino directories of the customer. Users are considered active as soon as they have had at least one user session towards one of the servers in scope, with a rich client, during the 7 days of analysis. Web users are not included in this DNA analysis. Purpose of this slide: If the result is low, this could indicate that only a small number of servers have been placed in scope for this analysis. When all servers were included, a low score might indicate that the directories contain person documents that are no longer in use.
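As a worked example of this ratio, the snippet below applies it to the Nationwide figures from the environment overview (34,057 active users, 39,725 registered users). Treating the "Users Registered" count as the number of person documents is an assumption, and the function name is hypothetical.

```python
def active_user_percentage(active_users, registered_users):
    """Active users as a percentage of registered person documents."""
    if registered_users <= 0:
        raise ValueError("registered_users must be positive")
    return 100.0 * active_users / registered_users


# Figures from the environment overview slide; assumes "Users Registered"
# equals the number of person documents in the directories.
nationwide_pct = active_user_percentage(34_057, 39_725)
print(f"{nationwide_pct:.1f}%")
```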
This benchmark looks at the amount of time Notes clients spent online with the Domino servers. The analysis is split up into two measures: 1. total time online during the 7 day period, expressed in hours; 2. average session duration, expressed in minutes. Average session duration has a negative correlation with network bandwidth consumption. The longer an average session lasts, the lower the network consumption. High session duration may indicate performance issues in the network or at the servers. Notice that with the customer on the right, each user had sessions open to separate mail and application servers. This is why the customer scored a total online time of more than 40 hours. Usually, we see that customers who deploy local mail file replicas (where users do not work on the server but on their local replica) score significantly lower in session duration and higher in network bandwidth consumption.
Another method to profile end user demand (user segmentation) is to look at the distinct number of hours per day in which users are active on the server park. This analysis does not show the working hours of end users, but observes how many distinct hours the user showed activity, on average per day. As an example: remote non-office workers (e.g. salesmen visiting customers all day) typically replicate with their home server in the morning (1 hour observed), go on the road all day and replicate again in the evening (another hour observed). Many system accounts (monitoring workstations, fax machines operating with Notes) show activity up to 24x7. This chart expresses the percentage of the total number of active users in each category.
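The distinct-hours measure can be illustrated with a short sketch. It is not DNA's implementation: the input format (user name plus activity timestamp) is hypothetical, and averaging over the days on which a user was active, rather than over all 7 days, is an assumption.

```python
from collections import defaultdict
from datetime import datetime


def distinct_hours_per_day(events):
    """Map each user to the average number of distinct active hours per day.

    events: iterable of (user, timestamp) pairs.
    Assumption: the average is taken over days with any activity.
    """
    slots = defaultdict(set)
    for user, ts in events:
        slots[user].add((ts.date(), ts.hour))
    result = {}
    for user, user_slots in slots.items():
        active_days = {day for day, _hour in user_slots}
        result[user] = len(user_slots) / len(active_days)
    return result


# Hypothetical remote worker: replicates in the morning and in the evening.
events = [
    ("remote", datetime(2013, 1, 7, 7, 30)),
    ("remote", datetime(2013, 1, 7, 7, 45)),
    ("remote", datetime(2013, 1, 7, 19, 5)),
    ("remote", datetime(2013, 1, 8, 7, 10)),
    ("remote", datetime(2013, 1, 8, 18, 50)),
]
profile = distinct_hours_per_day(events)
print(profile)
```

Two events in the same hour of the same day count once, which is why this worker lands at two distinct hours per day.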
This analysis presents an overview of the overall user demand characteristics. Total demand is expressed in 6 columns, with each column representing 100% of that type of demand scored during the week of analysis. Each column is then split up into various types of demand: Check new mail: Notes clients checking for new mail; System dbs: access to system databases; Mail files: access to end user mail files; Directories: access to Domino Directories; Applications: access to application databases*. *Application databases are identified as follows: of all databases inventoried, DNA subtracts mail and mail-in files, ‘known’ system databases and Domino directories. What remains is a set of databases that are considered applications. Although this is not a 100% accurate method, it does provide a solid understanding of the types of user demand.
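The subtraction described for identifying application databases maps naturally onto set differences. The sketch below only illustrates that idea; the function name and the database paths are hypothetical.

```python
def identify_applications(all_dbs, mail_files, mail_in_dbs, system_dbs, directories):
    """Whatever remains after removing the known categories counts as an application."""
    return (set(all_dbs) - set(mail_files) - set(mail_in_dbs)
            - set(system_dbs) - set(directories))


# Hypothetical inventory, for illustration only.
all_dbs = ["mail/jdoe.nsf", "mailin/support.nsf", "log.nsf", "names.nsf", "apps/crm.nsf"]
applications = identify_applications(
    all_dbs,
    mail_files=["mail/jdoe.nsf"],
    mail_in_dbs=["mailin/support.nsf"],
    system_dbs=["log.nsf"],
    directories=["names.nsf"],
)
print(applications)
```

Any system database missing from the 'known' list ends up classified as an application, which is the inaccuracy the notes mention.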
This analysis reveals how end users make use of Notes databases, in terms of network traffic. Every bubble on this chart represents a database. Databases have different colors, indicating the type of database. The size of each bubble is defined by the distinct number of end users that showed activity during the 7 day period that was analyzed. The horizontal and vertical distribution of bubbles reflects the amount of network traffic (bytes read and written towards each database, logarithmic scale). Databases in the lower left corner are the lightest in terms of network consumption, while databases in the upper right-hand corner are the most network intensive. While this analysis presents up to 10,000 of the most used databases, the underlying factsheet does contain all databases that have been touched. Trust Factory offers an optional cluster plotter component that enables customers to generate a wide variety of angles in analyzing database utilization.
A significant optimization potential can be found by analyzing user accounts that show excessive demand patterns. Often, we see that very few user accounts consume one third or more of the total network and server capacity. DNA is able to classify user accounts by comparing their individual behavior with the organization average. While the underlying algorithm is rather complex, it basically comes down to the following classification: Light: at or below the overall average; Moderate: causing a load that is 10 – 100 times the average; Intensive: causing a load that is 100 – 1,000 times the average; Extreme: causing a load that is more than 1,000 times the average. For each class of user account, this chart shows their impact on the total user demand caused in the 7 days analyzed. This total demand is expressed in 6 measures. The numbers behind the legend indicate the number of users in that class. Details for the 10 heaviest accounts are given in the next slide.
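The simplified classification can be written down as a small function. This is a sketch of the rule of thumb given here, not the actual DNA algorithm, which is described as more complex. Two assumptions are made: loads between the average and 10 times the average are treated as Light, and exact boundary values fall into the higher class except at 1,000, which stays Intensive.

```python
def classify_demand(user_load, average_load):
    """Classify a user account by its load relative to the overall average."""
    if average_load <= 0:
        raise ValueError("average_load must be positive")
    ratio = user_load / average_load
    if ratio > 1000:
        return "Extreme"
    if ratio >= 100:
        return "Intensive"
    if ratio >= 10:
        return "Moderate"
    # Assumption: anything below 10x the average counts as Light.
    return "Light"


# Hypothetical loads, expressed against an average of 1.
for load in (0.5, 50, 500, 5000):
    print(load, classify_demand(load, 1))
```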
This slide gives an indication of over capacity in the server park. Each server is classified according to the maximum number of concurrent end user sessions it has served, over the 7 day analysis period. Load levels on servers in the yellow area are very low and can often be redistributed onto other servers. Functional servers (smtp, hubs, blackberry, sametime) often show very low session levels. Use the factsheet to verify which servers fall in each category. Customers with a highly centralized server park often show less over capacity than customers with a very decentralized server park.
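The concurrency bands from the server classification slide (< 50, 50 - 249, 250 - 749, 750 - 1499, >= 1500) can be applied with a simple lookup. The sketch below uses Python's `bisect` module; the function name is hypothetical.

```python
import bisect

# Lower bounds of the Low, Average, Normal and High bands from the slide.
BOUNDS = [50, 250, 750, 1500]
LEVELS = ["Very Low", "Low", "Average", "Normal", "High"]


def concurrency_level(max_sessions):
    """Return the band for a server's maximum number of concurrent sessions."""
    return LEVELS[bisect.bisect_right(BOUNDS, max_sessions)]


# Hypothetical servers: name -> maximum concurrent sessions observed.
servers = {"hub01": 12, "mail01": 820, "mail02": 1500}
for name, peak in servers.items():
    print(name, concurrency_level(peak))
```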
This analysis topic reveals the total session concurrency caused by end users working on Domino servers, in each of the 168 hours (7 days) that were analyzed. For time-series charts, the timezone reflected on the horizontal axis is equal to that of the workstation that was used for the data collection.
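A per-hour session profile over the 168 hours can be approximated as follows. This is a simplified sketch, not DNA's method: it counts every session that was open at any point during an hour, which can overstate the true peak concurrency within that hour. The input format is hypothetical.

```python
from collections import Counter
from datetime import datetime


def sessions_per_hour(sessions, period_start, hours=168):
    """Count sessions that were open at any time during each hour of the period.

    sessions: iterable of (start, end) datetimes.
    """
    counts = Counter()
    for start, end in sessions:
        first = max(0, int((start - period_start).total_seconds() // 3600))
        last = min(hours - 1, int((end - period_start).total_seconds() // 3600))
        for hour in range(first, last + 1):
            counts[hour] += 1
    return [counts[hour] for hour in range(hours)]


# Hypothetical sessions in a week starting Monday 7 January 2013, 00:00.
period_start = datetime(2013, 1, 7)
sessions = [
    (datetime(2013, 1, 7, 8, 15), datetime(2013, 1, 7, 9, 30)),
    (datetime(2013, 1, 7, 8, 45), datetime(2013, 1, 7, 8, 50)),
    (datetime(2013, 1, 7, 23, 0), datetime(2013, 1, 8, 1, 10)),
]
hourly = sessions_per_hour(sessions, period_start)
print(hourly[8], hourly[9], hourly[25])
```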
Network compression is a feature that was introduced with Lotus Notes and Domino release 6. The compression ratio we see at customers is around 40%, so the benefits of this feature are significant. For network compression to function properly, a setting needs to be in place at both ends of the connection, that is, on all servers as well as on every desktop. This is usually not the case. With this analysis, we show how much of the total network traffic was making use of compression (pie chart). In addition, DNA presents for all servers and users whether compression has been enabled or not. Customers that make use of other compression solutions in their network may want to reverse the purpose of this analysis. In these situations, customers may want to disable Notes network compression. The factsheet reveals which servers and users make use of compression.
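Because compression only takes effect when both ends have it enabled, the compressed share of traffic can be lower than the share of users with the setting enabled. The sketch below illustrates that rule; the function name, the record layout and the byte counts are hypothetical.

```python
def compressed_share(sessions):
    """Fraction of bytes sent over connections where both ends enable compression.

    sessions: iterable of (bytes_transferred, client_enabled, server_enabled).
    """
    sessions = list(sessions)
    total = sum(size for size, _client, _server in sessions)
    if total == 0:
        return 0.0
    compressed = sum(size for size, client, server in sessions
                     if client and server)
    return compressed / total


# Hypothetical traffic: only the first connection has both ends enabled.
traffic = [
    (600, True, True),
    (150, True, False),   # server port compression disabled
    (250, False, True),   # client port compression disabled
]
share = compressed_share(traffic)
print(f"{share:.0%}")
```

In this made-up example two of three clients have compression enabled, yet only 60% of the bytes travel compressed because one of them talks to a server with the setting disabled.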