We investigate the dimensionality properties of the Internet delay space, i.e., the matrix of measured round-trip latencies between Internet hosts. Previous work on network coordinates has indicated that this matrix can be embedded,
with reasonably low distortion, into a 4- to 9-dimensional Euclidean space. The application of Principal Component Analysis (PCA) reveals the same dimensionality values. Our work addresses the question: to what extent is the dimensionality an intrinsic property of the delay space, defined without reference to a host metric such as Euclidean space? Is the intrinsic dimensionality of the Internet delay space
approximately equal to the dimension determined using embedding techniques or PCA? If not, what explains the discrepancy? What properties of the network contribute to
its overall dimensionality? Using datasets obtained via the King [14] method, we study different measures of dimensionality to establish the following conclusions. First, owing to its power-law behavior, the structure of the delay space is better characterized by fractal measures. Second, the intrinsic dimension is significantly smaller than the value
predicted by the previous studies; in fact, by our measures it is less than 2. Third, we demonstrate a particular way in which the AS topology is reflected in the delay space: subnetworks composed of hosts that share a common upstream Tier-1 autonomous system possess lower dimensionality than the combined delay space. Finally, we observe that
fractal measures, due to their sensitivity to non-linear structures, display higher precision for measuring the influence of subtle features of the delay space geometry.
In this work, we view the Internet as a metric space, where the metric is the round-trip time to send a packet between two hosts. One of the most important properties characterizing a metric space is its dimensionality. Properly estimating the dimensionality of a metric space is crucial for characterizing its structure and for designing algorithms based on that space: if one's estimate of the dimension is too low, it is impossible to find an embedding that preserves distances; if the estimate is too high, the algorithms become inefficient as we run into the so-called curse of dimensionality: there are too many degrees of freedom to explore and the algorithms get lost. A major application of this abstraction is a coordinate-based positioning system. Such systems aim to map the network into a metric space in such a way that the geometric distances estimate the real latencies with a low degree of error. For such systems, the dimensionality of the target space is a tunable parameter. In addition, the dimensionality is known to affect the accuracy of the predictions, the stability of coordinates over time, and the time to converge to stable coordinates. Our work uses a latency matrix obtained via the King method, a convenient way of measuring the latency between nameservers without having login access to them, to answer two fundamental questions: What is the dimensionality of the Internet delay space? And what forces contribute to its geometric properties? Apart from its implications for the performance of coordinate systems, this characterization is by itself a topic of practical interest, as it uncovers properties of and opens new questions about the nature and complexity of the network.
Measurement-based positioning systems implement the same functionality as their coordinate-based counterparts by performing measurements in a carefully chosen way.
Dimensionality notions used in prior work often reflect the assumption that the data can be approximately embedded in a low-dimensional Euclidean space, or that the distance matrix can be accurately approximated by a low-rank matrix. Thus, one could define the embedding dimension of a space by finding the lowest-dimensional Euclidean space that admits an embedding with adequate percentiles of relative error, or one could define the dimension by using Principal Component Analysis (PCA) to identify the smallest value of k for which the distance matrix has a rank-k approximation with adequate relative error. In these two plots we see the outcome of applying this process to the Internet measurements mentioned earlier. The vertical bars in the graph on the left show the relative error obtained when embedding the data into a Euclidean space of dimension 1, 2, 3, and so on. The graph on the right shows the percentage of variance explained by the first k principal components of the distance matrix for varying values of k. What are the problems with these methods? 1) The problem with the first approach is that the embedding algorithm might fail to recover the true value of the dimensionality, since the curse of dimensionality comes into play: beyond 7 dimensions we observe higher percentiles of relative error. 2) The problems with PCA are that, 2.1) as a method grounded in linear algebra, it is oblivious to non-linear relationships between the dimensions. For instance, if we use it to estimate the dimension of the surface of a sphere, it will indicate a 3-dimensional object, whereas the surface is actually 2-dimensional. 2.2) As in the case of this plot, it is not always clear where to establish the cutoff point beyond which the subsequent components explain only a negligible variance. To the extent that these methods estimate the dimensionality of the delay space, they indicate a value between 4 and 7.
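As a concrete illustration of the PCA-style rank estimate, here is a minimal sketch on synthetic data. The function name, the 95% variance cutoff, and the double-centering step (classical MDS) are illustrative choices, not taken from the study:

```python
import numpy as np

def pca_dimension(dist_matrix, variance_cutoff=0.95):
    """Estimate dimension as the smallest k whose top-k principal
    components explain at least `variance_cutoff` of the variance."""
    # Classical MDS-style double centering of squared distances
    # recovers a Gram matrix when the data are Euclidean.
    n = dist_matrix.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (dist_matrix ** 2) @ J
    eigvals = np.linalg.eigvalsh(B)[::-1]   # descending order
    eigvals = np.clip(eigvals, 0, None)     # drop negative numerical noise
    explained = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(explained, variance_cutoff) + 1)

# Points sampled from a 3-D Gaussian: the estimate recovers 3.
rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 3))
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
print(pca_dimension(D))  # 3
```

Note that this sketch inherits exactly the weaknesses described above: applied to points sampled from the surface of a sphere, it would report 3, not the intrinsic value 2.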
If the measured distances reflect a metric other than Euclidean distance (e.g., travel times over hilly terrain), the embedding algorithm fails to produce a low-distortion embedding in any dimension!
In our work, we explore the structural and statistical properties of the Internet delay space in order to better characterize its dimensionality. For instance, metric spaces that exhibit power-law behavior can be measured using fractal dimensions, which are intrinsic measures that work without reference to an external metric space. One way in which power-law behavior arises in the delay space is when we plot, in log scale, the number of nodes (y-axis) that are within a given distance (x-axis). 1) The first striking feature of this plot is a power law that persists over two orders of magnitude. Datasets that display such a property are said to behave like a fractal. 2) Another interesting observation is that this range of distances includes all RTTs between 2 and 100 ms, which in turn includes all non-oceanic distances. 3) When we observe this behavior, we can measure the intrinsic dimensionality of the dataset as the power-law exponent. The other surprising finding is that the intrinsic dimensionality measured by this method is much lower than what was estimated by the previous methods!
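The power-law exponent can be estimated as a correlation dimension: the slope of log(number of pairs within distance r) versus log(r) over the scaling range. A minimal sketch on synthetic data follows; the function name and the chosen scaling range are illustrative, not the study's parameters:

```python
import numpy as np

def correlation_dimension(points, r_min, r_max, n_scales=10):
    """Fractal dimension estimate: slope of log(#pairs within r)
    versus log(r) over the power-law scaling range."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    d = d[np.triu_indices_from(d, k=1)]          # unique pairs only
    radii = np.geomspace(r_min, r_max, n_scales)  # log-spaced scales
    counts = np.array([(d <= r).sum() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope

# Points on a 1-D line embedded in 3-D space: the estimate is
# close to the intrinsic dimension 1, not the ambient dimension 3.
rng = np.random.default_rng(1)
t = rng.uniform(0, 1, 1000)
line = np.stack([t, 2 * t, -t], axis=1)
print(correlation_dimension(line, 0.01, 0.2))  # close to 1
```

Unlike PCA, this estimator sees only pairwise distances, so it captures non-linear (e.g., curved or fractal) structure directly.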
Is the dimensionality value of less than 2 due to the fact that the Internet lives on the surface of the Earth? Is the surface of the Earth 0.9-dimensional?
Another way in which our work illuminates the structure of the delay space is by revealing that the delay space is not homogeneously 1.8-dimensional but is made up of a small number of low-dimensional pieces. What is the geometric effect of analyzing each Tier-1 network, together with its downstream customers, in isolation? One might expect the subnetworks to be simpler because the decomposition eliminates inefficient routes that go from a network up to its Tier-1 provider, over to another Tier-1 provider, and down to its customer. Or perhaps the decomposition merely splits the network into subnetworks of equal complexity. I'll now show that that's not the case. Upon decomposing the delay space into overlapping pieces, each corresponding to a Tier-1 AS and its downstream customers, we observe that each individual piece has dimensionality 10% lower than the combined delay space. This dimensionality shift cannot be achieved by other kinds of decompositions, namely by decomposing into pieces of low diameter, pieces clustered geographically, or randomly selected subsets of lower cardinality. Another interesting finding is that this dimensionality shift can only be detected by fractal measures; the embedding dimension and PCA are oblivious to it. This also demonstrates the power and applicability of fractal measures.
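A toy analogue of this finding can be sketched with synthetic data (none of this is the paper's dataset or decomposition): the union of two 1-dimensional pieces measures as slightly higher-dimensional than either piece alone at finite scales, because cross-piece pair counts grow faster with radius than within-piece counts.

```python
import numpy as np

def correlation_dimension(points, r_min, r_max, n_scales=10):
    """Fractal dimension: slope of log(#pairs within r) vs log(r)."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    d = d[np.triu_indices_from(d, k=1)]
    radii = np.geomspace(r_min, r_max, n_scales)
    counts = np.array([(d <= r).sum() for r in radii])
    slope, _ = np.polyfit(np.log(radii), np.log(counts), 1)
    return slope

# Two 1-D segments meeting at the origin, playing the role of two
# "subnetworks"; the combined set is their union.
rng = np.random.default_rng(2)
seg_x = np.stack([rng.uniform(0, 1, 800), np.zeros(800)], axis=1)
seg_y = np.stack([np.zeros(800), rng.uniform(0, 1, 800)], axis=1)
union = np.concatenate([seg_x, seg_y])

dim_piece = correlation_dimension(seg_x, 0.01, 0.2)
dim_union = correlation_dimension(union, 0.01, 0.2)
print(dim_piece < dim_union)  # each piece measures lower than the union
```

The gap here is small, which echoes the point above: a distance-based fractal estimator registers the shift, while linear methods such as PCA would not.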
So far, this presentation has offered no evidence that this study may lead to better network embeddings; in fact, the evidence so far suggests the opposite. Now, however, we'll see that it does lead to better non-linear embeddings.