Qualcomm lte-performance-challenges-09-01-2011
1. 1
LTE Performance – Expectations & Challenges
Engineering Services Group
September 2011
2. 2
Agenda
Overview of ESG LTE Experience
ESG – AT&T Engagements for LTE
LTE Performance Expectations
Factors Impacting LTE Performance
Key Areas To Be Considered for LTE Launch
3. 3
ESG LTE Experience Overview
ESG: EUTRA Vendor IOTs; R&D; 3GPP SA5 Participation; Chipset Lab Testing
• Technology trial participations
• RFP development
• LTE Protocols trainings & hands-on optimization workshops delivered to 2600+ engineers
• LTE design guidelines
• LTE capacity & dimensioning
• Performance assessment & troubleshooting in commercial LTE networks
• Performance studies & evaluations using ESG simulation platforms
Early exposure to LTE through Qualcomm’s leadership position in technology
4. 4
ESG-AT&T LTE Partnership Highlights
Multiple engagements with NP&E and A&P teams
LTE Technology Trial (2009)
• ESG SME in Dallas for 6 months
• Participation in Phase I & II Trial
• SME support and technical oversight of execution by vendors
• Review results and progress of the trial with the vendors
RAN Architecture & Planning Team
Field testing in BAWA & Dallas FOA clusters, lab testing in Redmond
RAN Design Team
LTE Design Optimization Guidelines
LTE Design System Studies
LTE Design & ACP Tool Studies
Antenna Solutions Group
• LTE capacity calculator for venues
• IDAS/ODAS design & optimization guidelines
CSFB Performance Assessment (starting next week)
LTE Realization Group
7. 7
Key Areas of LTE Performance
LTE Call Setup and Registration
LTE Single-user Throughput
LTE Cell Throughput
User Plane Latency
Handover Success Rates and Data Interruption
8. 8
Expected LTE Performance Dependencies
LTE System Bandwidth
1.4 -> 20 MHz
FDD/TDD
Throughput expectations
LTE UE Category – Current category 3 Devices
Deployment Considerations
Number of eNodeB Transmit Antennas
Backhaul Bandwidth
System Configuration
Transmission Modes used for DL (Diversity, MIMO schemes)
Control channel reservation for DL
Resource Reservation for UL
System Parameters
10. 10
Key LTE Call Setup Metrics
Metric | Typical Expected Values | Reasons for Variability
Number of RACH Attempts and RACH Power | RACH Attempts <3; RACH Power <23 dBm | Users at cell edge, improper Preamble Initial Target Power, power ramping step
RACH Contention Procedure Success Rate | >90% | Failed Msg3/Msg4, delayed Msg4 delivery, contention timer
RRC Connection Setup Success Rate | >99% | Poor RF conditions, limited number of RRC Connected users allowed causing RRC Rejects, large RRC inactivity timers
RRC Connection Setup Duration (including RACH duration) | 30-60 ms | Multiple RACH attempts, Msg3 retransmission, delayed contention procedure
Attach and PDN Connectivity Success Rates | >99% | Failure of ATTACH procedure (EPC issues) or EPS Bearer setup, poor RF conditions, Integrity/Security failures
Attach and PDN Connectivity Duration | 250-550 ms | Multiple Attach Requests, Authentication or Security related failures, EPC issues, delayed RRC Reconfiguration to set up Default RB
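The expected values in the table above can be turned into a simple screening check for drive-test results. The following is a minimal sketch, not a standard tool: the metric names are hypothetical, and the duration ranges are treated as upper bounds only (a faster setup is not flagged). The sample uses the 60 ms RRC setup duration and the 4.533 sec duration from the pilot pollution example later in this deck, with the latter interpreted here as the attach duration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class KpiTarget:
    name: str                       # hypothetical metric key
    expected: str                   # expected value as quoted in the table
    is_ok: Callable[[float], bool]  # True when the measurement meets the target


TARGETS = [
    KpiTarget("rach_attempts", "<3", lambda v: v < 3),
    KpiTarget("rach_power_dbm", "<23 dBm", lambda v: v < 23),
    KpiTarget("rach_contention_success_pct", ">90%", lambda v: v > 90),
    KpiTarget("rrc_setup_success_pct", ">99%", lambda v: v > 99),
    # Assumption: only the upper end of a duration range counts as a miss.
    KpiTarget("rrc_setup_duration_ms", "30-60 ms", lambda v: v <= 60),
    KpiTarget("attach_success_pct", ">99%", lambda v: v > 99),
    KpiTarget("attach_duration_ms", "250-550 ms", lambda v: v <= 550),
]


def out_of_range(measured: Dict[str, float]) -> List[str]:
    """Return the names of measured KPIs that miss their expected value."""
    return [
        t.name
        for t in TARGETS
        if t.name in measured and not t.is_ok(measured[t.name])
    ]


# Values from the pilot pollution example: 60 ms RRC setup, 4.533 sec attach.
sample = {"rrc_setup_duration_ms": 60, "attach_duration_ms": 4533}
print(out_of_range(sample))  # ['attach_duration_ms']
```

Metrics missing from the input are skipped rather than flagged, so partial logs can still be screened.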
11. 11
Peak Single User DL Throughput – 10 MHz
• “Ideal” case
• 0% BLER, 100% scheduling
• Near Cell field location
• 5% BLER, 100% scheduling
Scenario
• LTE-FDD
• Cat 3 UE
• 2x2 MIMO
• Max DL MCS 28 used with 50
RBs and Spatial Multiplexing
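As a rough cross-check of the two cases on this slide, L1 throughput can be estimated from the transport block size (TBS). This is a first-order sketch under stated assumptions: it uses the TBS of 36696 bits quoted later in this deck for MCS 28 at 50 PRBs, two code words for 2x2 spatial multiplexing, one transport block per code word per 1 ms TTI, and it simply scales by scheduling rate and (1 - BLER), ignoring HARQ retransmission gains and higher-layer overhead. The function name is illustrative.

```python
TTI_PER_SECOND = 1000  # one LTE subframe (TTI) per millisecond


def l1_throughput_mbps(tbs_bits, codewords=2, sched_rate=1.0, bler=0.0):
    """First-order L1 throughput estimate in Mbps.

    tbs_bits   -- transport block size per code word, in bits
    codewords  -- 2 for spatial multiplexing with 2x2 MIMO
    sched_rate -- fraction of TTIs in which the UE is scheduled (0..1)
    bler       -- block error rate on first transmission (0..1)
    """
    bits_per_second = (
        tbs_bits * codewords * TTI_PER_SECOND * sched_rate * (1.0 - bler)
    )
    return bits_per_second / 1e6


TBS_MCS28_50RB = 36696  # bits, as quoted in this deck for MCS 28

ideal = l1_throughput_mbps(TBS_MCS28_50RB)                # 0% BLER, 100% scheduling
near_cell = l1_throughput_mbps(TBS_MCS28_50RB, bler=0.05)  # 5% BLER, 100% scheduling

print(f"Ideal:     {ideal:.2f} Mbps")      # 73.39 Mbps
print(f"Near cell: {near_cell:.2f} Mbps")  # 69.72 Mbps
```

Application-layer (TCP) throughput will be lower than these L1 figures because of protocol overhead.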
12. 12
Peak Single User UL Throughput – 10 MHz
• “Ideal” case
• 0% BLER, 100% UL scheduling
• UL MCS 23 and 50 RBs
• Near Cell field location
• 5% BLER, 100% scheduling
• UL MCS 24 and 45 RBs (some RBs reserved for PUCCH)
Scenario
• LTE-FDD
• Cat 3 UE
• Max UL MCS 23/24 depending on number of UL RBs
13. 13
LTE DL Cell Throughput – Multiple Devices
Device-Run | FTP Throughput [Mbps] | L1 Throughput [Mbps] | Norm. L1** Throughput [Mbps] | Sched. Rate [%] | BLER [%] | MCS | Num RB | CQI | RI | RSRP [dBm] | RSRQ [dB]
T2 | 13.90 | 14.44 | 46.71 | 30.91 | 5.74 | 23.31 | 49.4 | 14.18 | 2 | -73.85 | -9.06
P2 | 16.58 | 16.65 | 53.04 | 31.39 | 5.40 | 25.12 | 49.76 | 14.48 | 2 | -71.01 | -8.98
P2 | 17.34 | 17.87 | 60.0 | 29.68 | 1.52 | 26.47 | 49.80 | 14.87 | 2 | -68.87 | -9.06
Total (3 devices) | 47.82 | 48.96 | | 91.98 | | | | | | |
• All 3 devices are scheduled almost equally (~30% each)
• Device with highest CQI reported receives highest MCS and low BLER and consequently highest DL L1 Throughput
• Total L1 Cell Throughput ~49 Mbps
• Total Scheduling rate ~92% (<100%)
• Num of DL RB are ~50 for all devices
Above data is from a commercial LTE network with all 3 devices in Near cell conditions
• Peak DL Cell Throughput in close to Ideal Conditions* should be similar to Peak Single User DL Throughput
• For a 10 MHz system, ideal DL cell throughput at TCP should be ~67 Mbps
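The per-device figures in the table above can be cross-checked with a few lines of arithmetic. This sketch assumes that the normalized L1 throughput is the measured L1 throughput divided by the scheduling rate (the footnote defining it is not part of this transcript); the results come out close to the tabulated values, which supports that reading but does not confirm it.

```python
# (device-run label, L1 throughput in Mbps, scheduling rate in %) from the table
runs = [
    ("T2", 14.44, 30.91),
    ("P2", 16.65, 31.39),
    ("P2", 17.87, 29.68),
]


def normalized_l1(l1_mbps, sched_pct):
    """L1 throughput scaled to 100% scheduling (assumed definition)."""
    return l1_mbps / (sched_pct / 100.0)


total_l1 = sum(l1 for _, l1, _ in runs)
total_sched = sum(sched for _, _, sched in runs)

for label, l1, sched in runs:
    print(f"{label}: normalized L1 = {normalized_l1(l1, sched):.2f} Mbps")

print(f"Total L1 cell throughput: {total_l1:.2f} Mbps")
print(f"Total scheduling rate:    {total_sched:.2f} %")
```

The totals show how the cell throughput of ~49 Mbps and the ~92% scheduling rate follow directly from the three per-device rows.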
14. 14
User Plane Latency
Ave (ms) | Min (ms) | Max (ms) | STD (ms)
42.1 | 36 | 62 | 4.3
[Figure: PDF and CDF of user plane latency (ms); x-axis 30 to 80 ms, y-axis 0 to 1]
Stationary, Near cell conditions
Ping size = 32 Bytes
Ping Server: Internal server
• Ping Round-Trip Time (RTT) distribution from one commercial network above is concentrated between 40 and 50 ms
• Lower ping RTTs of ~25 ms have been observed in some networks
• Ping RTT can be dependent on CN delays, backhaul, system parameters and device
• Ping Round-Trip Time (RTT) in an unloaded system should be ~20-25ms
• Such Ping tests are done to an internal server one hop away from LTE PGW (avoid internet delays)
15. 15
LTE Intra-frequency Handover Success Rate
DL Test Run | Total HO | HO Failure (case)
Run 1 | 125 | 2 (A, B)
Run 2 | 108 | 0
Run 3 | 95 | 1 (A)
Total | 328 | 3
UL Test Run | Total HO | HO Failure (case)
Run 1 | 106 | 0
Run 2 | 118 | 0
Run 3 | 98 | 1 (A)
Total | 320 | 1
Some Handover failure cases:
A) RACH attempt not successful and T304 expires
B) HO command not received after Measurement Report
HO Success Rate is high in both UL and DL
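The success rate follows from the handover counts as (total - failures) / total. The sketch below applies that formula to the "Total" rows of the DL and UL tables; the function name is illustrative. The plotted percentages may have been derived from slightly different counts, so small differences from the values computed here are possible.

```python
def ho_success_rate_pct(total_ho, failures):
    """Handover success rate in percent from attempt and failure counts."""
    if total_ho <= 0:
        raise ValueError("total_ho must be positive")
    return 100.0 * (total_ho - failures) / total_ho


dl_rate = ho_success_rate_pct(328, 3)  # DL test runs, "Total" row
ul_rate = ho_success_rate_pct(320, 1)  # UL test runs, "Total" row

print(f"DL HO success rate: {dl_rate:.2f} %")
print(f"UL HO success rate: {ul_rate:.2f} %")
```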
[Figure: HO success rate (%), y-axis 98.40 to 100.00: Download 99.05, Upload 99.69, Total 99.37]
16. 16
LTE Intra-frequency Handover/Data Interruption
Ave (ms) | Min (ms) | Max (ms) | STD (ms)
78 | 38 | 199 | 34
[Figure: PDF and CDF of HO interrupt time (ms); x-axis 0 to 200 ms, y-axis 0 to 1]
HO Interrupt Time: interval between the last DATA/CONTROL RLC PDU on the source cell and the first DATA/CONTROL RLC PDU on the target cell
Data Interruption Time: the same interval measured between DATA RLC PDUs only; it becomes much higher than 199 ms
Current LTE Networks have higher HO and Data Interruption Times –
eNodeB buffer optimization and data forwarding support needed
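The two interruption metrics defined on this slide differ only in which RLC PDUs are counted. The following is a minimal sketch of that computation for a single handover, assuming a hypothetical log format of (timestamp in ms, cell, PDU kind) tuples; the sample timestamps are invented for illustration and are not measurements from this deck.

```python
from typing import List, Optional, Tuple

# Hypothetical log record: (timestamp_ms, cell, kind), kind is "DATA" or "CONTROL"
Pdu = Tuple[float, str, str]


def interruption_ms(
    pdus: List[Pdu], source: str, target: str, data_only: bool = False
) -> Optional[float]:
    """Gap between the last PDU on the source cell and the first on the target.

    With data_only=False this is the HO interrupt time (DATA or CONTROL PDUs);
    with data_only=True it is the data interruption time (DATA PDUs only).
    Returns None if either side of the handover has no matching PDU.
    """
    if data_only:
        pdus = [p for p in pdus if p[2] == "DATA"]
    source_times = [t for t, cell, _ in pdus if cell == source]
    if not source_times:
        return None
    last_on_source = max(source_times)
    target_times = [t for t, cell, _ in pdus if cell == target and t > last_on_source]
    if not target_times:
        return None
    return min(target_times) - last_on_source


# Invented example trace for one handover
log = [
    (1000.0, "source", "DATA"),
    (1010.0, "source", "CONTROL"),
    (1060.0, "target", "CONTROL"),
    (1190.0, "target", "DATA"),
]

print(interruption_ms(log, "source", "target"))                  # 50.0
print(interruption_ms(log, "source", "target", data_only=True))  # 190.0
```

In the invented trace, control signaling resumes quickly on the target cell while user data resumes much later, which is the pattern the slide describes.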
18. 18
Factors Affecting LTE Performance
Deployment
• Pilot Pollution, Interference
• Neighbor List Issues, ANR
• Parameters (Access, RRC Timers)
EUTRAN, EPC Implementation and Software Bugs
• Unexpected RRC Connection Releases
• DL MCS and BLER, Control Channel impacts
• eNodeB Scheduler limitations
Mobility
• Intra-LTE Reselection, HO Parameters (minimize ping-pongs)
• Inter-RAT HO Boundaries and Parameters
Data Performance
• Backhaul Constraints
• TCP Segment losses in CN
• MTU Size settings on devices
19. 19
RF Issues Impacting Call Setup Performance - 1
Sub-optimal RF optimization delays LTE call setup
• Mall served by PCI 367
• PCI 212 partly leaking in
20. 20
RF Issues Impacting Call Setup Performance - 2
[Figure: UE/NW message sequence chart; labels recovered from the diagram]
• UE Power Up
• Initial acquisition (incl. attempt on PCI 367): PSS, SSS, PBCH, SIBs
• Idle, not camped
• Idle, camped: PCI 212
• UL data to send
• RACH (Msg1-Msg4)
• RRC Connection Request, RRC Connection Setup, RRC Conn. Setup Complete
• RRC connected; RRC Setup Duration: 60 ms
• 1st Attach request incl. PDN connectivity request
• No attach response (accept)
• 2nd Attach request incl. PDN connectivity request
• RACH (Msg1, Msg2); RACH not successful
• UE Reselects to PCI 367
• 3rd Attach request incl. PDN connectivity request; Attach Accept is sent
• Duration: 4.533 sec
• PCI 212: RSRP = -110 dBm
• PCI 367: RSRP = -104 dBm
• Pilot pollution can impact call setup, causing intermediate failures that impact KPIs, reselections, and higher call-setup time
21. 21
RF Issues Causing LTE Radio Link Failure - 1
PCIs 426, 427, 428 are not detected (site is missing)
Lack of dominant server => Area of pilot pollution
PCI 376
PCI 42 & PCI 142
• Missing sites during the initial deployment phase require careful neighbor planning or optimal use of ANR
22. 22
RF Issues Causing LTE Radio Link Failure - 2
1. UE is connected to PCI 411
2. UE reports event A3 twice for PCI 142 (Reporting int. = 480 ms)
3. UE reports event A3 for PCI 142 & 463
4. No Neighbor relation exists between PCI 411 and 142 (clear need for ANR). UE does not receive handover command, RLF occurs
5. RRC Re-establishment is not successful, UE reselects to PCI 42
RLF: DL BLER increases to 70%; UL power increases to 23 dBm; RSRP & SINR decrease to -110 dBm & -8 dB
[Figure markers: MRM A3, RLF]
23. 23
Backhaul Limitations Reduce LTE DL Throughput
[Figure: L1 Throughput (kbps) vs SINR (dB); y-axis 0 to 50000 kbps, x-axis -10 to 35 dB]
Throughput is always lower than 50 Mbps, even at high SINR
Backhaul limitation negatively impacts the allocation of radio resources
Statistics are calculated by using metrics averaged at 1 sec intervals
24. 24
eNodeB Scheduler: MCS and BLER Relationship
[Figure: PDF and CDF of DL MCS; PDF bar labels as extracted: 0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.02, 0.12, 0.18, 0.64, 0.00, 0.00, 0.00, 0.04]
[Figure: PDF and CDF of reported CQI; PDF bar labels as extracted: 0.00 (13 times), 0.01, 0.43, 0.56]
• Highest CQI is 15 and highest DL MCS is 28
• Although we see a significant number of CQI=15 reported, scheduler hardly assigns any MCS=28!
• Whenever DL MCS 28 is scheduled BLER on 1st Tx is 100%, hence scheduler uses MCS 27
• Number of symbols for PDCCH is fixed at 2 and results in higher code-rate for MCS 28
• MCS=28: TBS = 36696 (@49&50 PRB)
• MCS=27: TBS = 31704 (@49&50 PRB)
10 Mbps L1 throughput difference! (2x2 MIMO, 2 Code Words)
• Lower than expected Peak DL throughput as eNodeB scheduler avoids MCS 28 due to high BLER and fixed
control channel symbol assignment
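The 10 Mbps difference and the code-rate effect can both be reproduced from the TBS values quoted on this slide. The sketch below is an approximation under stated assumptions: normal cyclic prefix, two CRS antenna ports, 64QAM for both MCS 27 and 28, 50 PRBs, a subframe without PSS/SSS/PBCH, and only the 24-bit transport block CRC counted as overhead. Under these assumptions MCS 28 with 2 PDCCH symbols lands at roughly 0.93, close to the 0.930 effective code rate above which 3GPP TS 36.213 allows a UE to skip decoding an initial transmission, which may be one contributing explanation for the behavior described above.

```python
TBS_BITS = {28: 36696, 27: 31704}  # transport block sizes quoted on this slide
CODEWORDS = 2                      # 2x2 MIMO, 2 code words
TTI_PER_SECOND = 1000              # one subframe per millisecond
TB_CRC_BITS = 24                   # transport block CRC
BITS_PER_SYMBOL_64QAM = 6


def l1_mbps(mcs):
    """Peak L1 throughput in Mbps at 100% scheduling and 0% BLER."""
    return TBS_BITS[mcs] * CODEWORDS * TTI_PER_SECOND / 1e6


def code_rate(mcs, pdcch_symbols, n_prb=50):
    """Approximate per-code-word code rate (assumptions in the lead-in).

    Valid for 1 to 3 PDCCH symbols: the CRS in OFDM symbol 0 then always
    falls inside the control region and is not subtracted again.
    """
    data_symbols = 14 - pdcch_symbols   # normal CP: 14 symbols per subframe
    crs_re_in_data_region = 12          # 2 ports x 2 REs x symbols 4, 7, 11
    data_re = (data_symbols * 12 - crs_re_in_data_region) * n_prb
    channel_bits = data_re * BITS_PER_SYMBOL_64QAM
    return (TBS_BITS[mcs] + TB_CRC_BITS) / channel_bits


diff = l1_mbps(28) - l1_mbps(27)
print(f"L1 throughput difference: {diff:.3f} Mbps")                  # 9.984 Mbps
print(f"MCS 28, 2 PDCCH symbols: code rate {code_rate(28, 2):.3f}")  # 0.927
print(f"MCS 27, 2 PDCCH symbols: code rate {code_rate(27, 2):.3f}")  # 0.801
print(f"MCS 28, 1 PDCCH symbol:  code rate {code_rate(28, 1):.3f}")  # 0.850
```

Reducing the control region to one symbol brings the MCS 28 code rate down noticeably, which illustrates why a fixed PDCCH symbol assignment matters for peak throughput.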
25. 25
Unexpected RRC Connection Releases
[Figure labels: RRC Releases; UE DL Inactivity Timer has not expired; RSRP ~ -102 dBm; PCI 465; PCI 237]
• 10 RRC Connections are released by PCI 465 (Release Cause: other)
• UE logs do not show high UE Tx power or high DL BLER
• DL FTP stalls due to continuous RRC Releases
• Unexpected eNodeB RRC Connection Releases impact user experience causing FTP time-outs. EUTRAN traces
needed for investigation
26. 26
Lower eNodeB Scheduling reduces DL Throughput
[Figure: time series vs internal modem time (19:11:55 to 19:13:15) of P1_AvgL1Throughput (L1 Tput, kbps, 0 to 50,000), P1_AvgScheduledRate (Scheduling, %, 0 to 100), P1_AvgMCS_DL (MCS, 12 to 26) and P1_AvgL1BLER (BLER, %, 0 to 6)]
• L1 throughput >50 Mbps
• L1 throughput follows scheduling rate and DL MCS
• Scheduling rate ~85-90% (<100%)
• Linked to lack of DL scheduling when SIB1 is transmitted and only 1 user/TTI support
• MCS ~26-27
• Low BLER: negligible impact on throughput
• Scheduling "dip" after ~78 sec
• eNodeB Scheduler implementation results in lower scheduling rate and lower DL throughput
27. 27
Impact of MTU Size and TCP Segment Losses
• TCP MSS: 1460, MTU: 1500
• TCP packet stats:
• Re-tx: 765 (0.2%)
• Out of order: 5380 (1.5%)
• MTU of 1500 can also result in fragmentation of IP packets on backhaul given GTP-U headers => Negatively impacts DL throughput
• TCP graph shows a number of slow starts and irregularities due to TCP segment losses in Core Network => Negatively impacts DL Application throughput
• Setting device MTU sizes correctly and minimizing CN packet losses is important to avoid negative
Application layer throughput impacts
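The fragmentation risk follows from header arithmetic: each user IP packet is wrapped in GTP-U, UDP and an outer IP header on the backhaul. The sketch below assumes a backhaul link MTU of 1500 bytes, IPv4 transport, the 8-byte mandatory GTP-U header without optional fields or extension headers, and no IPsec; these are illustrative assumptions, not values stated in the deck, and the function names are hypothetical.

```python
GTPU_HEADER = 8         # mandatory GTP-U header, no optional fields (assumed)
UDP_HEADER = 8
OUTER_IPV4_HEADER = 20  # IPv4 transport on the backhaul (assumed)
TUNNEL_OVERHEAD = GTPU_HEADER + UDP_HEADER + OUTER_IPV4_HEADER  # 36 bytes

INNER_IPV4_HEADER = 20
TCP_HEADER = 20         # no TCP options


def fragments_on_backhaul(ue_mtu, backhaul_mtu=1500):
    """True if a full-size user packet exceeds the backhaul MTU once tunneled."""
    return ue_mtu + TUNNEL_OVERHEAD > backhaul_mtu


def max_ue_mtu(backhaul_mtu=1500):
    """Largest device MTU that avoids fragmentation on the backhaul."""
    return backhaul_mtu - TUNNEL_OVERHEAD


def tcp_mss(mtu):
    """TCP maximum segment size for a given MTU (IPv4, no options)."""
    return mtu - INNER_IPV4_HEADER - TCP_HEADER


print(tcp_mss(1500))                # 1460, matching the MSS on this slide
print(fragments_on_backhaul(1500))  # True
print(max_ue_mtu())                 # 1464
print(tcp_mss(max_ue_mtu()))        # 1424
```

With these assumptions a device MTU of 1500 produces 1536-byte tunneled packets, so either the device MTU must be lowered or the backhaul MTU raised to avoid fragmentation.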
28. 28
Key Areas to be considered – LTE Initial Launch
Deployment
• Optimize pilot polluted areas
• Verify neighbor list planning, use ANR if available
• Optimization study of system parameters is critical for handling increased load
Data Performance
• Insufficient backhaul can reduce DL throughput
• Sporadic packet discards in Core Network
• Correct MTU size enforcement on all devices
Mobility
• Optimize HO parameters to ensure high Handover Success rates and reduce handover ping-pongs
• Unexpected Radio Link Failures can impact performance
• Inter-RAT optimization to ensure suitable user-experience during Initial build-out
Implementation
• Unexpected RRC related drops and RACH failures may need to be investigated
• Several RAN limitations exist
• Scheduler limitations must be addressed before demand increases