Hairball Buster: A Graph Triage Method for Viewing and Comparing Graphs

Patrick Allen; Mark Matties; Elisha Peterson

Open access


Volume 40 (2020): Issue 1 (January 2020)


Published online: 17 Jun 2020

Pages: 1-24

DOI: https://doi.org/10.21307/connections-2019.009

Keywords
Graph analytic triage, Node-neighbor centrality, Standard canonical form for graphs, Comparing graphs

© 2020 Patrick Allen et al., published by Sciendo

This work is licensed under the Creative Commons Attribution 4.0 International License.

Figure 1: Sample ‘Hairball’ showing jazz players that performed with each other.

Figure 2: Visone backbone layout of jazz player data set.

Figure 3: Sample HB curve for jazz players that performed with each other.

Figure 4: Neighbors plot for jazz players that performed with each other.

Figure 5: Questions addressed by location of neighbor nodes.

Figure 6: Sample directed neighbors plot for jazz player data set (Green = In, Red = Out).

Figure 7: Force-directed representation of the Toaster data set.

Figure 8: Backbone layout representation of the Toaster data set.

Figure 9: HB representation of the Toaster data set (directionality ignored).

Figure 10: HB representation of the inverse of neighbor nodes (e.g. gaps).

Figure 11: HB inverse representation of just the top 100 ranked nodes with each other in Toaster data set.

Figure 12: Force Atlas 2 on top 20 nodes in Toaster data set.

Figure 13: HB chart of first 3,500 connections in Toaster data set.

Figure 14: HB chart of second 3,500 connections in Toaster data set.

Figure 15: HB chart of third 3,500 connections in Toaster data set.

Figure 16: HB chart of suspended Iranian Twitter™ accounts, user-id replies, and no retweets.

Figure 17: HB chart of suspended Iranian Twitter™ accounts, user-id replies, no retweets, first 200 nodes showing gaps among the top 3 and the next 40 nodes.

Figure 18: Sample chart of CodeDNA™ cluster outputs of malware binaries.

Figure 19: Sample CodeDNA™ cluster outputs of Linux coreutils binaries.

Figure 20: Sample CodeDNA™ cluster output in standard hairball buster (blue = nodes, gray dots = links).

Figure 21: Sample CodeDNA™ cluster output in HB with vertical offset.

Figure 22: Sample CodeDNA™ cluster output in HB with vertical offset and highlighting nodes with highest similarity scores.

Figure 23: Displaying different measures of centrality in HB.

Figure 24: Comparing different types of graphs and algorithms.

Figure A1: Sample log10–log10 plot of jazz player data set with no offset.

Figure A2: Sample offset of origin to 10,10 for log10–log10 plot of jazz player data set.

Figure A3: Sample offset of origin to 10,10 for semi–log plot of Toaster data set.

Performance calculations comparisons for HB vs backbone layout.

| Filename | File size (B) | No. of nodes | No. of edges | HB run 1 (s) | HB run 2 (s) | HB run 3 (s) | HB avg (s) | Visone quad Sim run 1 (s) | Visone quad Sim run 2 (s) | Visone quad Sim run 3 (s) | Visone quad Sim avg (s) | Visone tri Sim run 1 (s) | Visone tri Sim run 2 (s) | Visone tri Sim run 3 (s) | Visone tri Sim avg (s) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| random-1000-nodes.graphml | 341,365 | 1,000 | 5,002 | 0.25 | 0.25 | 0.25 | 0.25 | 2.0 | 1.7 | 1.6 | 1.8 | 1.5 | 1.1 | 1.3 | 1.3 |
| random-10000-nodes.graphml | 3,555,915 | 10,000 | 49,826 | 0.67 | 0.69 | 0.70 | 0.69 | 7.3 | 6.9 | 6.8 | 7.0 | 7.0 | 7.1 | 6.8 | 7.0 |
| random-100000-nodes.graphml | 37,271,224 | 100,000 | 500,061 | 10.01 | 11.74 | 6.55 | 9.43 | 139.4 | 120.1 | 118.5 | 126.0 | 129.0 | 119.7 | 119.1 | 122.6 |
| random-250000-nodes.graphml | 95,452,841 | 250,000 | 1,250,487 | 16.84 | 15.36 | 15.24 | 15.81 | 349.3 | 357.3 | 361.3 | 356.0 | 356.8 | 352.7 | 334.5 | 348.0 |
| random-500000-nodes.graphml | 193,263,339 | 500,000 | 2,501,346 | 26.21 | 25.71 | 24.47 | 25.46 | >1,200 | | | | | | | |
| random-1000000-nodes.graphml | 388,461,043 | 1,000,000 | 4,997,089 | 44.25 | 43.75 | 45.19 | 44.40 | Visone could not load graphml file. Insufficient memory | | | | | | | |
| code-dna.graphml | 155,222 | 28 | 292 | <1 | <1 | <1 | | <1 | <1 | <1 | | <1 | <1 | <1 | |
| jazz-directed.graphml | 361,796 | 198 | 4,113 | <1 | <1 | <1 | | <1 | <1 | <1 | | <1 | <1 | <1 | |
| toster_CA_Edge.graphml | 5,349,861 | 23,916 | 75,050 | 1.02 | 0.96 | 0.96 | 0.98 | 20.6 | 19.8 | 20.1 | 20.2 | 17.1 | 18.9 | 17.8 | 17.9 |
| iran-tweet-replies.no-retweet.by-userid.graphml | 294,153,484 | 228,626 | 440,244 | 1.26 | 1.12 | 1.13 | 1.17 | >1,200 | | | | | | | |
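
To make the run-time gap in the table easier to read, the following minimal sketch divides each Visone average by the HB average for the same file. The averages are copied from the table; the dictionary layout and the `speedups` helper are illustrative names, not part of the article, and rows without numeric averages for all three methods are left out.

```python
# Average run times in seconds, copied from the performance table:
# (HB avg, Visone quad Sim avg, Visone tri Sim avg).
# Rows reported as "<1 sec", ">1,200", or "could not load" are omitted.
avg_runtime_s = {
    "random-1000-nodes.graphml": (0.25, 1.8, 1.3),
    "random-10000-nodes.graphml": (0.69, 7.0, 7.0),
    "random-100000-nodes.graphml": (9.43, 126.0, 122.6),
    "random-250000-nodes.graphml": (15.81, 356.0, 348.0),
    "toster_CA_Edge.graphml": (0.98, 20.2, 17.9),
}


def speedups(hb, quad, tri):
    """Return (quad / hb, tri / hb): how many times longer Visone took than HB."""
    return quad / hb, tri / hb


# Hypothetical helper output: file name -> (quad ratio, tri ratio).
ratios = {name: speedups(*times) for name, times in avg_runtime_s.items()}

for name, (quad_ratio, tri_ratio) in ratios.items():
    print(f"{name}: quad Sim {quad_ratio:.1f}x, tri Sim {tri_ratio:.1f}x")
```

The ratios only summarize the reported averages; they say nothing about the files for which Visone exceeded 1,200 s or could not load the graph.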

Comparing HB features to other graph analytic and visualization algorithms.

| Feature | Hairball buster | Histogram/node-degree display | Force-directed | Visone backbone | Adjacency matrix | Block modeling |
|---|---|---|---|---|---|---|
| Understanding node relationships and graph characteristics | | | | | | |
| 1. Distribution of nodes by degree | Yes | Yes | No | No | No^f | No^f |
| 2. Quickly determine the number of high-degree nodes | Yes | Yes | No | No | Yes | No^f |
| 3. Quickly identify which are the highest degree nodes | Yes | Yes^a | No^b | No | Yes | Yes |
| 4. Determine if the highest degree nodes are directly connected to other high-degree nodes | Yes | No | Yes^c | No^b | Yes | Yes |
| 5. Determine whether the highest degree nodes are connected to each other indirectly via two hops | Yes | No | Yes | Yes^c | Yes | Yes |
| 6. Determine which lower-degree nodes are directly connected to the high-degree nodes | Yes | No | Yes | Yes | Yes | Yes |
| 7. Provide visual cue of how much difference exists between the degree of the nodes, especially high-degree nodes | Yes | Yes | No | No | No | Yes |
| 8. Determine if there is one central cluster or many clusters that contain the highest degree nodes | Yes | No | Yes | Yes | No | Yes |
| Representing large or directed networks, or with weighted links | | | | | | |
| 9. Provide log–log or semi–log representation for very large data sets | Yes | Yes | No | No | No | No |
| 10. Can visualize both directed and undirected graphs | Yes | No | Yes^e | Yes^e | Yes | Yes |
| 11. Determine which nodes connect to the highest weighted links | Yes | No | Yes^d | Yes | Yes^g | Yes^g |
| Other centrality measures, standard format, low calculation cost | | | | | | |
| 12. Distribution of nodes by other centrality measures | Yes | Yes | No | No | No | No |
| 13. Provide a canonical representation of the graph | Yes | Yes | No | No | Yes | No |
| 14. Low calculation cost | Yes | Yes | No | No | Yes | No^h |
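
Features 1 to 3 in the table concern the distribution of nodes by degree and the identification of high-degree nodes. As a rough illustration only, and not the authors' implementation, the sketch below ranks nodes by descending degree from an undirected edge list. The toy edge list, the `degree_ranking` function, and the degree threshold of 2 are all hypothetical.

```python
from collections import Counter


def degree_ranking(edges):
    """Return [(node, degree), ...] sorted by descending degree, ties broken by node name."""
    degree = Counter()
    for u, v in edges:
        # Each undirected edge adds one to the degree of both endpoints.
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy edge list; not taken from any data set in the article.
toy_edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("d", "e")]

ranking = degree_ranking(toy_edges)

# Nodes at or above an arbitrary illustrative degree threshold.
high_degree = [node for node, deg in ranking if deg >= 2]

for rank, (node, deg) in enumerate(ranking, start=1):
    print(rank, node, deg)
```

Sorting with a deterministic tie-break gives the same ordering for the same graph on every run, which is one simple way to obtain a repeatable degree-ranked listing.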

eISSN: 0226-1766
Language: English

Publication frequency: Volume Open
Journal subjects: Social Sciences, other

