071510 sun b_1515_feldman_stephen_forpublic

Scaling Blackboard for Large
Scale Distance Learning
Communities
Steve Feldman,
sfeldman@blackboard.com

online learning * Learning that takes place
partially or entirely over the Internet.

The Online Momentum Shift
•  66% of degree-granting post-secondary institutions in
the US offer online, hybrid/blended online and other
distance education courses. [1]
•  Over 4.6 million students were taking at least one online
course during the fall 2008 term; a 17 percent increase
over the number reported the previous year. [2]
•  The 17 percent growth rate for online enrollments far
exceeds the 1.2 percent growth of the overall higher
education student population.
•  By 2020, 50% of high school students will take an online
course. [1]


Communities are Getting Larger

•  State and County Initiatives
•  Consortium Programs and
strategic alliances between
institutions.
•  Content distribution networks
•  New sources of revenue to
reach markets and students
that were not historically
accessible
–  Non-traditional students are
being marketed to

Stakes are Getting Higher
•  Competition for government funding
•  Competition for student revenue
•  Learning modality changing with each
technological innovation
•  User expectations and online behavior
changing constantly
•  Hours of availability trending toward
mission critical
–  VLEs are often identified as 24x7 mission
critical systems, but the resources to support
them are closer to 8x5

What are we modeling…
[Diagram: Scalability, Performance and Availability at the center, surrounded by:]
•  Large Active Communities
•  Richer Content and User Experience
•  Connected Learning Modality
•  Heavy Adoption of Advanced Tools
•  Extended/Frequent Time in System
[Callouts:]
–  Hundreds to thousands of concurrent sessions
–  Larger pages, graphics/video, client-side interactions
–  Emphasis on asynchronous & synchronous collaboration
–  Longer clickstreams & disposable access

scalability* The ability for a distributed
system to expand by accommodating greater levels
of load while maintaining similar levels of
performance.

Scalable Deployments
•  Emphasis on adoption of virtualization technologies
–  Virtualization technology transparent to guest OS and
application.
–  Why: Take advantage of CPU and Memory expansion
•  Emphasis on fast provisioning
–  Provisioning technology such as Dell AIM, VMWare
deployment technology and XenServer deployment
technology
–  Why: These are solved problems; the tools minimize human
error and speed up deployment.
•  Emphasis on diskless systems
–  Hardware is just “rented” space for CPU, Memory and
Network.
–  Why: Network and storage are now so fast that there is
little reason to depend on “wired” solutions.

performance* The amount of useful work
accomplished by a computer system compared to
the time and resource used.
Alternative Definition: Response time plus latency.

Responsive Deployments
•  Large 64-bit address space…
–  It’s cheaper today than 4 years ago
–  Technology is heading this direction
–  It’s not a bad thing…
•  Plentiful CPU worker threads…
–  Use only what you need
–  Take advantage of hyperthreading and MT technology
–  Partition via virtualization
•  Many bigger…distributed environments
•  Continuous maintenance
–  If you want to make your systems remain fast, you have to
“service” the roads. Lots of litter and potholes out there.

What is Performance?
•  Performance is quantifiable and measurable
•  Performance is also perception
•  Mostly recognized from a cognitive perspective
–  Instantaneous
–  Immediate
–  Continuous
–  Captive

[Diagram: Response Time + Latency = Performance]

Realistic Approaches to Achieve Performance
•  Eliminate interface and resource contention.
–  Better to have more capacity than queuing
•  Know your user behavior.
•  Optimize for the saturated and low-bandwidth network
conditions.
–  Enable Compression
–  Optimize Images
–  Cache Static Content
•  Large JVM memory allocations are not a bad thing, but
rather something to expect with Java-based applications.
–  Large JVM (4GB to 16GB) with aggressive options you understand.
•  Two keys to the database
–  Continuous maintenance
–  Understand the key queries and how the CBO (cost-based
optimizer) handles them
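
To make the “Enable Compression” point concrete, here is a minimal sketch of how much a repetitive HTML payload shrinks under gzip. The sample markup and the function name `compression_savings` are assumptions made for illustration; real savings depend on the actual page content and on how the web server is configured.

```python
import gzip


def compression_savings(payload, level=6):
    """Return (original_bytes, compressed_bytes, fraction_saved) for gzip."""
    compressed = gzip.compress(payload, compresslevel=level)
    saved = 1 - len(compressed) / len(payload)
    return len(payload), len(compressed), saved


# Hypothetical page fragment: repeated list markup, as in a course listing.
page = b"<li class='course-item'>Course announcement</li>\n" * 500

original, compressed, saved = compression_savings(page)
print(f"{original} bytes -> {compressed} bytes ({saved:.0%} saved)")
```

Text-heavy responses such as HTML, CSS and JavaScript tend to compress well, which is why the gain matters most on saturated and low-bandwidth links.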

availability* The capability to service a
functional request without issue under conditions of
desired performance and workload scalability.

What is Availability?
•  High-availability offerings mask the effects of a
system failure in order to minimize the impact on
access to and functional use of a system for a
community of users.
•  Simple Definition:
–  Percentage of time the system is in its operational state.
•  You will often hear the concept of 3x9’s, 4x9’s or
even 5x9’s
–  Planned versus Unplanned
•  Availability = (Total Units of Time – Downtime) /
Total Units of Time
–  8760 hours in a year
–  Downtime = 10 hours
–  Availability = (8760 – 10)/8760 = 99.89%
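
The formula above can be checked with a few lines of Python. This is only a sketch; the function and constant names are chosen here for illustration.

```python
HOURS_PER_YEAR = 8760  # 365 days x 24 hours


def availability(total_hours, downtime_hours):
    """Availability = (total time - downtime) / total time, as a fraction."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    return (total_hours - downtime_hours) / total_hours


# Worked example from the slide: 10 hours of downtime in one year.
print(f"{availability(HOURS_PER_YEAR, 10):.2%}")  # prints 99.89%
```

The exact ratio is 8750/8760, roughly 0.99886, which rounds to 99.89% at two decimal places.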

Quick View into Availability Statistics

Availability Percentage Model    Unexpected Downtime per Year
90%                              36.5 days
95%                              18.25 days
98%                              7.30 days
99%                              3.65 days
99.5%                            1.83 days
99.8%                            17.52 hours
99.9%                            8.76 hours
99.95%                           4.38 hours
99.99%                           52.6 minutes
99.999%                          5.26 minutes
99.9999%                         31.5 seconds
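
The downtime figures follow directly from the availability percentage. The sketch below derives the yearly downtime budget for a given target, assuming a 365-day year; the function names and the choice of display units are illustrative only.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def downtime_seconds(availability_pct):
    """Allowed downtime per year, in seconds, for an availability target."""
    return (100 - availability_pct) / 100 * SECONDS_PER_YEAR


def format_downtime(seconds):
    """Render a duration in the largest unit that keeps the value >= 1."""
    if seconds >= 86400:
        return f"{seconds / 86400:.2f} days"
    if seconds >= 3600:
        return f"{seconds / 3600:.2f} hours"
    if seconds >= 60:
        return f"{seconds / 60:.2f} minutes"
    return f"{seconds:.1f} seconds"


for target in (90, 99, 99.9, 99.99, 99.999):
    print(f"{target}% -> {format_downtime(downtime_seconds(target))}")
```

Each additional nine cuts the downtime budget by a factor of ten, which is why the cost of the last few rows rises so steeply.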

Realistic Views of Availability
•  If the application is not functioning as expected, but you
can login, is it available?
–  Perception versus Reality
–  If it’s slow, do my users feel just as bad as if they received an
error?
•  How do you plan for unexpected?
–  Practice really does make perfect
•  Do I treat the calendar from a date and time perspective
differently from an availability perspective?
–  Will my users cause problems if I take the site down during low
usage periods/dates?
–  Will the users even know that something happened?
–  Can I recover fast enough?

Realistic Approaches to Achieve Availability
•  Strategically picking redundancy in the architecture.
–  Servers and storage make sense to a degree
–  Monitoring makes sense
–  Do advanced clustering architectures really make a difference?
–  Do the costs of a dedicated DR facility and site make sense?
•  Choosing the right initiatives based on the resources
available to manage
–  Don’t set your administrators up to fail.
–  If you don’t have the capabilities on-site, don’t be skeptical of
outsourcing the problem.
•  Balance costs against goals
–  Choose the right places to put your pennies.
–  Make the business drive the decision…it’s their money!

Deployment: Availability
•  VLEs are different beasts today than in the past.
–  Communities are bigger
–  Sessions last longer
–  Content is richer
–  Key point: Adoption is greater and users expect their sites up 24 x
7 x 365
•  Architecture is designed for many parallel instances of the
product scaled in a horizontal fashion.
–  Distributed physical deployments
–  Virtualization is a key element
•  Database failover more important than horizontal
database scalability.
–  Emphasis on vertical database scalability

Deployment: Advanced Monitoring

•  Measurement is the secret sauce for successful
deployments.
–  Most reliable and scalable deployments measure beyond
the server infrastructure
•  Different types of measurements
–  System/Environmental measurements
–  Business measurements
–  Synthetic measurements
•  Collecting is only part of the prize
–  The data must be analyzed before it can drive business
decisions.

Lifecycle of Measurement
[Cycle diagram with six stages:]
•  Define Metrics: Goal Setting
•  Identify Method of Gathering: Isolate Tools and Processes
•  Implement Instrumentation: Begin Measuring
•  Align to KPI/ROI: Share with Stakeholders
•  Recommend Changes: Show Business Value
•  Reset Expectations: New Initiatives

Different Types of Monitoring
•  Synthetic Monitoring
•  Real User Monitoring
•  Performance Forensic Monitoring

What is Synthetic Monitoring?
•  Automated monitoring technique to measure the
functional behavior of a system, sub-system or
component.
•  Typically a scheduled activity used to measure the
availability, responsiveness and functional attributes
of a common application scenario.
•  Can be executed from any access point to the
system in question, either internal or external.
•  Also considered “Active” Monitoring of a system
•  Not intended to supply load, but rather perform
sampling of performance and availability
•  Two methods:
–  HTTP Simulation or Real Browser Emulation

Tools for Synthetic Transactions
•  You can really use any form of HTTP emulation tool
like JMeter, Grinder, MSTS, LoadRunner,
SilkPerformer, SOASTA, etc…
•  Some monitoring software systems like Foglight,
SiteScope, Nagios, CA IntroScope, Argent
Defender
•  External services: Keynote, Gomez (Compuware),
WebMetrics, AlertSite, Pingdom, SiteUpTime
•  Browser based solution: Selenium

Strategies for Synthetic Transactions
•  Site and Host Ping Tests should run on a multi-
second basis (15s to 30s)
•  Common, yet critical paths targeting functional
systems for availability should run on a continuous
interval (x < 5 minutes).
•  Complicated paths focusing on performance and
availability should run every 30 to 60 minutes.
•  Repeat tests when the desired SLA or outcome is not
achieved
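
The intervals above can be expressed as a small schedule. In this sketch the check names are hypothetical, 240 seconds stands in for “under 5 minutes” and 1800 seconds for the 30-minute end of the “30 to 60 minutes” range; `run_with_retest` shows the repeat-on-miss idea with an injected measurement function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    name: str
    interval_s: int  # seconds between scheduled runs


# Assumed values chosen from the ranges given in the slide.
SCHEDULE = [
    Check("host_ping", 30),
    Check("critical_path", 240),
    Check("complex_path", 1800),
]


def due_checks(now_s, schedule=SCHEDULE):
    """Names of checks whose interval divides the elapsed seconds."""
    return [c.name for c in schedule if now_s % c.interval_s == 0]


def run_with_retest(measure, sla_s, max_retests=2):
    """Run measure(); repeat while the response time misses the SLA."""
    results = []
    for _ in range(1 + max_retests):
        elapsed_s = measure()
        results.append(elapsed_s)
        if elapsed_s <= sla_s:
            break
    return results


print(due_checks(240))
```

Retesting before alerting helps separate a single slow sample from a sustained SLA miss.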

What is Real User Experience Monitoring?
•  Passive web monitoring that observes web traffic to
measure the user experience.
•  Provides both quality of service and responsiveness
metrics in order to gauge service levels of performance
and availability.
•  Typically a continuous activity watching silently in a
parallel channel or as a pass through channel.
•  Able to capture characteristics about the entire HTTP
stream to be used for forensics and user incidents.
•  Most vendors package as an appliance, but beginning to
see the rise of “virtual” appliances.
•  Synthetic monitoring is just not enough…

Tools for RUM Monitoring
•  Dominated by commercial vendors who have a niche in
web performance and/or application performance
management.
–  Quest FxM
–  Coradiant TrueSight
–  Oracle Real User Experience Insight
–  Tealeaf
–  CA/NetQoS

•  Rise in new tools coming from network equipment
vendors like Cisco, Opnet and Citrix/NetScaler

Strategies for RUM Monitoring
•  Identify areas of dense usage in order to highlight
performance, availability and functional experience in
most common components of system.
•  Start with a wide lens of traffic watching and slowly
narrow the area of focus to minimize the “purge” of data.
•  The “purge” of data is going to happen, so be prepared
to move the data out of the system into an alternative
repository.
–  Some of the vendors have already solved this problem via an
Enterprise Data Warehouse (eg: Coradiant BI)
•  Most of these tools can show
–  Time to First Byte (TTFB), host latency, network latency and
end-to-end (E2E) time
•  Avoid the trap of focusing on Time to First Byte
–  You are serving an entire application from client to server

What is Performance Forensic Monitoring?
•  Deliberate instrumentation approach to capture
performance characteristics about an application
deployment.
•  Measures resource and interface statistics not typically
visible from the application directly.
•  Provides data points about application code execution
that can be tied down to both the user and/or the
application component.
•  Can’t measure everything, but can sample consistently.
–  Certain data points can be captured on a continuous basis such
as Java/J2EE container statistics

Tools for Forensic Monitoring
•  Recommended tool sets tie the PFM tool with the RUM
tool.
–  Foglight FxM seamless integration with Foglight Application
Cartridges and Database Performance Analysis
–  Coradiant TrueSight integration with Dynatrace APM (Coradiant
AV)
–  CA NetQoS integration with CA Wily IntroScope
–  Oracle RUE Insight with Oracle Enterprise Manager for
Applications and Databases.
•  Limited supply of open source tools that can perform a
fraction of the functionality.
–  No known integrations with RUM tools
–  Point based tools per container (not aggregators)
–  Example tools: JConsole, Java VisualVM

Strategies for Forensic Monitoring
•  Measure the essentials such as container interfaces and
resources.
•  Most vendors have rule agents to begin sampling with a
greater degree of instrumentation when certain rules are
broken.
•  Retain statistics for extended periods of time (greater than
1 year) for annual, monthly, weekly, daily and hourly
comparison purposes.
•  Construct trending thresholds for alert purposes to invoke
a planning exercise in advance of an incident.
–  Yes, application forensics can be used to anticipate future
events, because past events serve as the points of
reference.
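
One way to picture a trending threshold is to fit a straight line through retained samples and estimate when it will cross an alert level. The sketch below uses an ordinary least-squares fit over evenly spaced samples; the function name, the sample values and the threshold are assumptions, and commercial tools may use different trending models.

```python
def samples_until_threshold(samples, threshold):
    """Estimate how many sample intervals remain before a linear trend
    through `samples` reaches `threshold`. Returns None when there are
    fewer than two samples or the trend is flat or falling."""
    n = len(samples)
    if n < 2:
        return None
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    crossing_x = (threshold - intercept) / slope
    return max(0.0, crossing_x - (n - 1))


# Hypothetical daily heap-usage percentages and an 80% planning threshold.
daily_heap_pct = [50, 52, 54, 56]
print(samples_until_threshold(daily_heap_pct, 80))  # prints 12.0
```

With one sample per day, the result reads as the number of days left to plan capacity before the threshold is reached.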

Please provide feedback for this session by emailing
BbWorldFeedback@blackboard.com.

The subject of the email should be title of this
session:
Scaling Blackboard for Large Scale Distance Learning
Communities
