Presented at PEARC21.
Many scientific high-throughput applications can benefit from the elastic nature of Cloud resources, especially when there is a need to reduce time to completion. Cost considerations are usually a major issue in such endeavors, with networking often a major component; for data-intensive applications, egress networking costs can exceed the compute costs. Dedicated network links provide a way to lower the networking costs, but they do add complexity. In this paper we describe a 100 fp32 PFLOPS Cloud burst in support of IceCube production compute that used the Internet2 Cloud Connect service to provision several logically dedicated network links from the three major Cloud providers (Amazon Web Services, Microsoft Azure and Google Cloud Platform), which in aggregate enabled approximately 100 Gbps of egress capability to on-prem storage. We provide technical details about the provisioning process, the benefits and limitations of such a setup, and an analysis of the costs incurred.
1. Managing Cloud networking costs for data-intensive applications by provisioning dedicated network links
Igor Sfiligoi, Frank Würthwein, Thomas Hutton
University of California San Diego
Michael Hare, David Schultz, Benedikt Riedel, Steve Barnet, Vladimir Brik
University of Wisconsin–Madison
2. Premise
• Commercial Clouds can provide large amounts of compute capacity
• And Cloud compute costs are acceptable when using spot instances
• Network-intensive applications may however experience large networking bills
• While ingress is free, egress is a metered commodity
• Dedicated links can provide significant cost savings on egress
• Between 50% and 75%
We present our experience egressing 130 TB in half a day
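For scale, 130 TB in half a day corresponds to a substantial sustained rate; a quick back-of-the-envelope check (our arithmetic, not a figure from the deck):

\[
\frac{130 \times 10^{12}\ \text{B} \times 8\ \text{bit/B}}{12 \times 3600\ \text{s}} \approx 24\ \text{Gbit/s sustained average}
\]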
3. Doing real science
• Doing CS experiments is fun
• But demonstrating infrastructure capabilities while doing real science is even better
• We produced simulation data for the IceCube experiment
• Using Cloud compute instances
• Storage was located at UW Madison and UC San Diego
4. Optical Properties
• Combining all the possible information
• These features are included in the simulation
• We are always developing them further
Nature never tells us the perfect answer, but we obtained satisfactory agreement with the data!
The need for calibration
• Natural medium, hard to calibrate properly
• In effect, we dropped a detector into a grey box
• The ice is very clear, but…
• Is it uniform?
• How has construction changed the ice?
• Drastic changes in reconstructed position with different ice models
• Too complex for a parametrized approach
• Using ray-tracing on GPUs
5. IceCube is used to distributed computing
• On-prem, but globally distributed
• Total: 9M hours (Jul 2020 – Jul 2021)
• Results transferred via gridFTP, mostly to UW Madison
[Figure: weekly GPU payload jobs summary from GRACC — https://gracc.opensciencegrid.org/d/000000118/gpu-payload-jobs-summary?orgId=1&var-ReportableVOName=icecube&var-interval=7d]
[Figure: world map of compute sites. Credit: https://pixy.org/1338869/]
6. Adding Cloud resources thus relatively trivial
• We presented another Cloud run at PEARC20
7. The networking cost issue
• IceCube needed data-heavy simulation
• About 500 MB produced per fp32 TFLOP-hour of compute
• Egress costs start to be comparable to compute costs when using "standard networking"!
• Dedicated network links promised significant reduction in cost
[Table: average per-job cost values. All prices were valid as of December 2020.]
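To make "comparable" concrete, here is a minimal back-of-the-envelope sketch in Python. The 500 MB per fp32 TFLOP-hour and the $83/TB vs. $42/TB egress prices come from this deck (see the cost analysis near the end); the spot GPU price and peak-TFLOPS figures are purely illustrative assumptions, not the prices actually paid.

```python
# Back-of-the-envelope: egress vs. compute cost per fp32 TFLOP-hour.
# Deck figures: ~500 MB produced per fp32 TFLOP-hour;
# ~$83/TB standard egress vs. ~$42/TB over dedicated links (Dec 2020).

DATA_PER_TFLOP_HOUR_TB = 500e-6           # 500 MB expressed in TB
STANDARD_EGRESS_PER_TB = 83.0             # $/TB (from the deck)
DEDICATED_EGRESS_PER_TB = 42.0            # $/TB (from the deck)

# Illustrative spot GPU: ~$0.90/hour for ~14 peak fp32 TFLOPS (assumption)
SPOT_PRICE_PER_HOUR = 0.90
GPU_FP32_TFLOPS = 14.0

compute_cost = SPOT_PRICE_PER_HOUR / GPU_FP32_TFLOPS           # $/TFLOP-hour
egress_std = STANDARD_EGRESS_PER_TB * DATA_PER_TFLOP_HOUR_TB   # $/TFLOP-hour
egress_ded = DEDICATED_EGRESS_PER_TB * DATA_PER_TFLOP_HOUR_TB  # $/TFLOP-hour

print(f"compute:            ${compute_cost:.3f} per TFLOP-hour")
print(f"egress (standard):  ${egress_std:.3f} per TFLOP-hour")   # ~ $0.042
print(f"egress (dedicated): ${egress_ded:.3f} per TFLOP-hour")   # ~ $0.021
```

With these illustrative numbers, standard egress is roughly two thirds of the compute cost, i.e. truly comparable, while dedicated links cut it in half.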
8. The need for many dedicated links
• IceCube storage can sink 100 Gbps
• Over 80 Gbps at UW Madison
• Plus over 20 Gbps at UC San Diego
• Internet2 had mostly 2x 10 Gbps links with Cloud providers
• The only bright exception was the California link to Azure at 2x 100 Gbps
• The links are shared, so one can never get a whole link to oneself
• 5 Gbps limit in AWS and GCP
• 10 Gbps limit in Azure
• The link speeds are rigidly defined
• 1, 2, 5, 10 Gbps
• To fill an (almost) empty 10 Gbps link, one needs three links: 5 + 2 + 2
9. The need for many dedicated links
• IceCube storage can sink 100 Gbps
• Over 80 Gbps at UW Madison
• Plus over 20 Gbps at UC San Diego
• No dedicated links over 10 Gbps available
• With AWS and GCP limited to 5 Gbps
• We ended up with over 20 links (see the packing sketch below)
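A minimal sketch of the sizing arithmetic described above, in Python. The rigid link sizes and the per-provider caps are from this deck; the assumption that only about 90% of a shared Internet2 link is realistically usable is ours, chosen to reproduce the 5 + 2 + 2 split.

```python
# Greedily pack the rigid dedicated-link sizes into the available capacity,
# respecting the per-provider cap on a single attachment.
LINK_SIZES_GBPS = [10, 5, 2, 1]  # the only sizes that can be provisioned

def pack(target_gbps, per_link_cap):
    """Pick link sizes (each <= per_link_cap) whose sum stays <= target."""
    picked, remaining = [], target_gbps
    for size in LINK_SIZES_GBPS:
        if size > per_link_cap:
            continue
        while size <= remaining:
            picked.append(size)
            remaining -= size
    return picked

# Filling ~90% of an (almost) empty 10 Gbps Internet2 link through AWS/GCP
# (5 Gbps cap per attachment) gives the 5 + 2 + 2 split used in practice.
print(pack(0.9 * 10, per_link_cap=5))    # -> [5, 2, 2]

# Reaching 100 Gbps aggregate with 5 Gbps attachments already takes 20 links,
# which is why the run ended up with over 20 of them.
print(len(pack(100, per_link_cap=5)))    # -> 20
```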
10. A multi-team effort
• Final users cannot provision a dedicated link on their own.
The process involves:
• The Cloud (final) user,
• The intermediate network provider, and
• The local on-prem networking team.
• Human interaction plays a significant role in establishing and tearing down dedicated links.
• Full automation is virtually impossible in most circumstances.
11. On-site preparations
• Dedicated IPs are needed for the Cloud compute resources
• Not always trivial for the local networking team to get an allocation
• We used private IPs, which are easier to come by (see the sketch at the end of this slide)
(IPv6 may also be an easier option, but we did not consider it)
• Dedicated paths must be established to the Internet2 peering points
• UW chose to provision a set of BGP-based Layer 3 virtual private networks (L3VPNs) to Internet2 via their regional aggregator, BTAA OmniPoP.
• UCSD first provisioned a Layer 2 virtual private network (L2VPN) over their regional provider, CENIC, and then layered a BGP-based L3VPN with Internet2 on top.
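As an illustration of the private-IP approach mentioned above, a few lines of Python with the standard ipaddress module can carve a per-link subnet plan out of RFC 1918 space. The 10.200.0.0/16 supernet and the /24 per-link sizing are hypothetical, not the allocation actually used.

```python
import ipaddress

# Hypothetical plan: one private /24 per dedicated link, carved from a /16.
supernet = ipaddress.ip_network("10.200.0.0/16")
links = [f"link-{i:02d}" for i in range(1, 23)]  # 20+ links, as in the run

# zip() stops at the shorter sequence, so we only consume 22 of the /24s.
plan = dict(zip(links, supernet.subnets(new_prefix=24)))
for name, subnet in list(plan.items())[:3]:
    print(name, subnet)  # link-01 10.200.0.0/24, link-02 10.200.1.0/24, ...
```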
12. Very different provisioning in the 3 Clouds
• AWS the most complex
• And requires initiation by an on-prem network engineer
• Many steps after initial request
• Create VPC and subnets
• Accept connection request
• Create VPG (virtual private gateway)
• Associate VPG with VPC
• Create DCG (Direct Connect gateway)
• Create VIF (virtual interface)
• Relay back to on-prem the BGP key
• Establish VPC -> VPG routing
• Associate DCG -> VPG
• And don’t forget the Internet routers
• GCP the simplest
• Create VPC and subnets
• Create Cloud Router
• Create Interconnect
• Provide key to on-prem
• Azure not much harder
• Create VN (virtual network) and subnets
• Make sure the VN has a Gateway subnet
• Create ExpressRoute (ER)
• Provide key to on-prem
• Create VNG (virtual network gateway)
• Create connection between ER and VNG
• Note: Azure comes with many more options to choose from
The above steps must be performed by the final user.
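Since the GCP path is the simplest, its four steps map almost one-to-one onto gcloud CLI calls. The sketch below (Python driving gcloud via subprocess) is a minimal illustration under that assumption; the region, resource names and IP range are hypothetical, and the exact flags should be checked against current Partner Interconnect documentation.

```python
import subprocess

def run(cmd):
    """Run a gcloud command, echoing it first."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

region = "us-central1"  # hypothetical region

# 1. Create VPC and subnets
run(["gcloud", "compute", "networks", "create", "icecube-vpc",
     "--subnet-mode=custom"])
run(["gcloud", "compute", "networks", "subnets", "create", "icecube-subnet",
     f"--region={region}", "--network=icecube-vpc", "--range=10.200.0.0/24"])

# 2. Create the Cloud Router (Partner Interconnect requires ASN 16550)
run(["gcloud", "compute", "routers", "create", "icecube-router",
     f"--region={region}", "--network=icecube-vpc", "--asn=16550"])

# 3. Create the (Partner) Interconnect attachment
run(["gcloud", "compute", "interconnects", "attachments", "partner",
     "create", "icecube-attach", f"--region={region}",
     "--router=icecube-router",
     "--edge-availability-domain=availability-domain-1"])

# 4. Retrieve the pairing key and hand it to the on-prem/Internet2 side
key = subprocess.run(
    ["gcloud", "compute", "interconnects", "attachments", "describe",
     "icecube-attach", f"--region={region}", "--format=value(pairingKey)"],
    check=True, capture_output=True, text=True).stdout.strip()
print("pairing key for on-prem:", key)
```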
13. Not meant for frequent use
• In summary, provisioning (and tearing down) dedicated network connections is hard
• Involves many parties and many steps
• Once established, it works fine
• But the provisioning overhead is definitely non-negligible!
• Only pays off for major endeavours
At 130 TB, one looks at a $5.5k saving, which makes it worthwhile
14. IceCube spiky traffic
• IceCube network traffic is very spiky
• All egress happens right after compute completes
• Having many jobs allows for smoothing of the traffic (see the toy model below)
• But not if they all start at the same time!
[Figure: test provisioning run with no job startup control, showing both network congestion and overprovisioning (low utilization).]
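The smoothing effect is easy to see in a toy model. The sketch below compares peak egress when all jobs start together versus when starts are spread out; every parameter (job count, runtime, output size, burst window) is hypothetical.

```python
# Toy model of spiky egress: each job computes for `runtime` hours and then
# bursts its whole output within one `slot`-hour window after finishing.
from collections import Counter

def peak_egress_gbps(start_hours, runtime=2.0, output_gb=250.0, slot=0.25):
    """Peak egress rate given per-job start times (hours)."""
    finish_slots = Counter(round((t + runtime) / slot) for t in start_hours)
    busiest = max(finish_slots.values())            # jobs bursting together
    return busiest * output_gb * 8 / (slot * 3600)  # GB -> Gbit, hours -> s

n_jobs = 1000
together = [0.0] * n_jobs                               # all start at once
staggered = [i * 2.0 / n_jobs for i in range(n_jobs)]   # spread over 2 hours

print(f"all at once: {peak_egress_gbps(together):6.0f} Gbps peak")   # ~2222
print(f"staggered:   {peak_egress_gbps(staggered):6.0f} Gbps peak")  # ~ 278
```

With starts spread over two hours, the same total volume leaves at roughly one eighth of the peak rate, which is exactly what slow resource provisioning achieved in the main run (next slide).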
15. Slow resource provisioning in the main run
• Slow resource provisioning results in spread-out job startup times
• The strategy worked for the main Cloud run
• Ramping up to 80 PFLOPS for about 2 hours
[Figure: resource ramp-up, each color representing resources tied to one network link; plus traffic on one of the network links.]
16. Good, but not perfect execution
• Using only spot Cloud instances
• Reached region capacity at different times in different places
• With 20+ resource groups, steering was challenging
• Over-provisioned a couple of regions, resulting in saturated network links
[Figure: per-link traffic, each color representing one network link.]
17. Overall a success
• Produced 130 TB of simulation data (54k files)
• Used 225 fp32 PFLOP-hours of GPU compute
• Note: Almost saturated the UW research network link!
[Figure: aggregate transfer rates from UW Madison network monitoring, each color representing one network link.]
18. The incurred cost
• Cost of the main run:
• $31k total, all included
• Of that, $5.5k was spent on networking
• Network/data transfer cost analysis
• $5.5k / 130 TB ≈ $42/TB
• Without dedicated links we would have paid: $83/TB × 130 TB ≈ $11k
• A saving of about 50% (1 − 42/83 ≈ 0.49)
19. Summary and conclusion
• Cloud egress costs can be substantial for data-intensive applications
• Dedicated links can provide savings between 50% and 75%
• We showed that one can provision 100 Gbps in aggregate bandwidth through Internet2
• But capacity planning and workload steering can be challenging
• High network provisioning overhead
• Especially hard for spiky workloads, like the IceCube one
• All in all, a success
• And we produced valuable science
20. Acknowledgements
• This work was partially funded by the US National Science Foundation (NSF) through grants OAC-1941481, MPS-1148698, OAC-1841530, OAC-1826967 and OPP-1600823.