There are numerous examples of two-way, two-mode data in the context of social networks. In affiliation networks (Faust, 1997; Wasserman & Faust, 1994), the rows of the network matrix commonly pertain to individuals (actors), while the columns often correspond to events the actors attended (Davis, Gardner, & Gardner, 1941) or clubs of which the actors are members (Galaskiewicz, 1985). Other examples of two-mode network data include the participation of social/political groups in community projects (Mische & Pattison, 2000), the participation of individuals in electronic forums (Opsahl, 2013), the ratings viewers provided for movies (Harper & Konstan, 2015), the endorsement of political advertisements by organizations (Brusco & Doreian, 2015a), subject-by-item response data (Coombs, 1964; Hubert, 1974), U.S. Supreme Court voting data (Doreian, Batagelj, & Ferligoj, 2004, 2005), United Nations General Assembly (UNGA) voting data (Brusco, Doreian, Lloyd, & Steinley, 2013a; Doreian, Lloyd, & Mrvar, 2013), and the presence of words in documents (Xu, Liu, & Gong, 2001).
A variety of methods have been used to analyze two-mode social network data, including centrality analysis (Faust, 1997),
As formalized by Doreian et al. (2004, 2005), the goal of the deterministic two-mode blockmodeling problem (DTMBP) is to produce
One of the limitations of the RH is that its computational requirement increases significantly as the number of objects (
The next section presents the motivation for the intended contribution of our paper. This is followed by a section providing a precise formulation of the DTMBP and a section that describes two competitive algorithms for its solution: RH and TMKLMedH. A simulation-based computational comparison of RH and TMKLMedH is subsequently provided, followed by a comparison of RH and TMKLMedH for voting data from the UNGA (Brusco et al., 2013a) and a larger network corresponding to movie ratings. The paper concludes with a summary of our findings and an identification of extensions for future research.
There are two primary sources of motivation for this paper: (i) our discovery that TMKLMedH is a direct competitor of RH for solving the DTMBP; and (ii) previous findings concerning the relationships among different implementation of
There are a variety of different implementations of
The advantage of
The relationship between TMKLMedH and RH is
To summarize, the contributions of our paper are:
The introduction of two-mode
The recognition that, for binary networks, TMKLMedH is a solution procedure for the DTMBP and, therefore, a direct competitor for RH.
The identification that TMKLMedH is to RH what
The provision of computational comparisons between TMKLMedH and RH across a diverse array of datasets, which shows the superiority of TMKLMedH over RH.
We use the following notation to formulate the DTMBP:
Π := the set of all partitions of the row objects into
π := a partition, (π = {
Ω := the set of all partitions of the column objects into
ω := a partition, (ω = {
Using this notation, a formulation of the DTMBP follows: (See also Brusco et al., 2013; Brusco & Steinley, 2007a, 2011).
where:
and
The goal of the DTMBP is finding the row-object (π) and column-object (ω) partitions minimizing the number of inconsistencies with perfect structural equivalence, whereby all blocks are either null or complete. The values of
Next, we propose an alternative formulation of the DTMBP using submatrix medians. The following additional terms are necessary:
With these definitions in place, an alternative formulation of the DTMBP based on submatrix medians is as follows:
Establishing the equivalence of
A relocation heuristic (RH) for the DTMBP was developed by Doreian et al. (2004, 2005, Chapter 8). In its original design, the RH used two neighborhood search operations: (i) transfers of objects from one cluster to another; and (ii) exchanges of cluster memberships for pairs of objects. Brusco and Steinley (2011) found the second of these operations to be ineffective when RH was implemented in a multistart framework using a fixed time limit. More specifically, the computational requirement associated with the examination of all possible exchanges is high, which implies fewer restarts of the algorithm can be realized within any prescribed time limit. Based on this finding, we implement a version of the RH using only transfers (see also Brusco et al., 2013a). The precise steps of the relocation algorithm are described below. In addition to previously defined terms, the description uses
Set
Set
If ∣
Set
If
If
Set
If
Set
Set
If ∣
Set
If
If
Set
If
If
If
The RH begins in Step 0 with the selection of an initial blockmodeling solution. This solution can be randomly-generated or produced via some other type of heuristic procedure (see Brusco et al., 2013a). A cycle through Step 1 provides a search of all possible relocations of the
This formulation of RH is not without its limitations. First, a potential source of bias can accrue from column-object relocations that are not considered until all improvements via row-object relocations have been made. Second, as noted by Brusco et al. (2013a), the algorithm is left-biased in the sense that row objects are evaluated in their natural order (1, 2,…,
We apply the TMKLMedH, an adaptation of a heuristic for two-mode
Randomly assign the row objects to
Row-object reassignment.
Compute
Compute
Update π by setting
If
Column-object reassignment.
Compute
Compute
Update ω by setting
If
Compute
The heuristic begins with the random assignment of row objects and column objects to clusters in Step 0. Step 1 is a refinement process that, holding the column assignments fixed, assigns row objects to their best cluster based on minimum deviation from the current submatrix medians. Step 2 is similar, but holds the row assignments fixed and reassigns the column objects. The algorithm terminates in Step 3 when a complete cycle through Steps 1 and 2 does not produce an improvement in the criterion function value. Because of the potential for empty clusters after Steps 1c and 2c, Steps 1d and 2d are positioned to rectify this problem by reassigning the row object or column object that is farthest from its current cluster centroid to fill the empty cluster.
The TMKLMedH is sensitive to the initial partitions in Step 0 and does not necessarily converge to a globally-optimal solution for the problem posed in Equation (3). As with RH, a common practice is to restart the algorithm many times using different random initial partitions in Step 0, and selecting the results from a restart that produces the best value of the criterion function (see Brusco & Doreian, 2015a, b; Brusco & Steinley, 2007a; Van Rosmalen et al., 2009). This multistart restart practice was adopted in our implementation.
We conducted a simulation-based comparison of RH and TMKLMedH across test problems that were generated through the manipulation of eight design features: (i) number of row objects (
The two levels for the first design feature were
The two levels for the third design feature were
The two levels for the fifth design feature were: (i) an equal distribution of the row objects across the row clusters; and (ii) 60% of the row objects in one row cluster and an equal distribution of row objects across the remaining row clusters. The same two levels were used for the sixth design feature, which corresponded to the distribution of column objects across column clusters. These settings for distributions of objects across clusters are common in both the general clustering literature (Milligan, 1980) and the two-mode blockmodeling literature (Brusco & Doreian, 2015a; Brusco & Steinley, 2007a).
The seventh design feature pertained to the density of the image matrix. The image matrix,
The eighth design feature pertained to the strength of the null and complete blocks. Under perfect structural equivalence, the null and complete blocks would have 100% zeros and ones, respectively. However, the generation of such problems would be much too easy for the algorithms. Even reducing the percentage to 90% or 80% was judged insufficient to differentiate between the algorithms. Therefore, the two settings for the eighth design feature were 70% and 60%. For example, at the 70% setting, complete blocks in the image matrix were generated to have roughly 70% ones and 30% zeros, whereas null blocks were generated to have roughly 70% zeros and 30% ones. At the 60% setting of the design feature, the strength of the blocks was further weakened, which tends to make the problems more formidable for the algorithms.
The RH and TMKLMedH algorithms were programmed in Fortran 90 and implemented on a desktop computer with an Intel ® CoreTM i7-4790 CPU @ 3.6GHz with 8 GB of RAM. Both algorithms were allotted a fixed time limit of three CPU minutes per test problem. For each test problem, the following performance measures were collected: (i) the value of the criterion function; (ii) the total number of restarts realized within the time limit; (iii) the adequacy of the recovery of the underlying cluster structure of the row objects, as measured by the adjusted Rand index (ARI: Hubert & Arabie, 1985); and (iv) the adequacy of the recovery of the underlying cluster structure of the column objects as measured by the ARI.
There are two performance measures in the simulation study that are particularly important. One is value of the criterion function as the minimized value is sought. The second is the extent to which the real partition structure is identified as captured by the ARI values. Accordingly, our primary emphasis is on these measures. We knew in advance that TMKLMedH would have far more restarts than RH within the allotted time limit. However, we did not know how the ratio of the numbers of restarts for these two algorithms would behave as a function of
The key finding of the simulation study was that TMKLMedH
Simulation results: (i) MPICF: mean percentage improvement in the criterion function realized from using TMKLMedH instead of RH; (ii) MPbetter: Mean percentage of test problems for which TMKLMedH provided a better criterion function value than RH; (iii) MRR: mean ratio of the number of restarts for TMKLMedH to the number for RH within the three-minute time limit; and (iv) ARI recovery measures for row and column clusters for RH and TMLKMedH.
Design feature levels | MPICF | MPbetter | MRR | RH (Row-ARI) | TMKLMedH (Row-ARI) | RH (Col-ARI) | TMKLMedH (Col-ARI) |
---|---|---|---|---|---|---|---|
Overall average |
|
|
|
|
|
|
|
.264 | 48.698 |
| .748 | .810 | .715 | .778 | |
.425 | 46.875 |
| .741 | .852 | .767 | .880 | |
.300 | 50.521 |
| .710 | .779 | .736 | .807 | |
.388 | 45.052 |
| .780 | .853 | .746 | .850 | |
|
| 23.709 |
|
| .695 | .791 | |
|
| 25.938 |
|
| .787 | .866 | |
|
| 21.900 | .698 | .790 |
|
| |
|
| 27.747 | .792 | .872 |
|
| |
Even row cluster density | .280 | 41.146 | 27.886 |
|
| .765 | .849 |
60% row cluster density | .409 | 54.427 | 21.762 |
|
| .717 | .808 |
Even column cluster density | .240 | 41.667 | 28.412 | .771 | .853 |
|
|
60% column cluster density | .449 | 53.906 | 21.236 | .719 | .809 |
|
|
33% Image matrix density | .367 | 48.438 | 26.753 | .745 | .829 | .740 | .830 |
66% Image matrix density | .321 | 47.135 | 22.895 | .744 | .833 | .742 | .827 |
70% block strength | .364 | 26.042 | 24.647 | .808 | .911 | .804 | .911 |
60% block strength | .325 | 69.531 | 25.001 | .682 | .751 | .678 | .747 |
Note – the overall averages and the results for the design features with the two biggest effects in each column are highlighted for emphasis.
Across the complete set of 768 test problems, TMKLMedH produced a blockmodel with a better criterion function value than the one obtained by RH for 347 (47.786%) of the 768 test problems (i.e., MPBetter = 47.786%). Likewise, across all test problems, the mean improvement in the criterion function resulting from the use of TMKLMedH instead of RH was MPRCF = .344%, with a maximum improvement of 4.672%. Table 1 reveals that the number of row (
The superior performance of TMKLMedH stems from getting far more restarts than RH within the same time limit. Across the 768 test problems, the minimum, mean, and maximum ratios of the number of TMKLMedH restarts to RH restarts were 3.0, 24.8, and 72.2, respectively. Table 1 shows the
We turn our attention to the recovery of the cluster structures of the row and column objects. Across the 768 test problems, the average ARI values for row objects were .745 and .831 for RH and TMKLMedH, respectively. The average ARI values for column objects (.741 for RH and .829 for TMKLMedH) were markedly like those of the row objects. These differences are nontrivial, as the TMKLMedH averages exceed the threshold for good recovery of 0.8 recommended by Steinley (2004). In contrast, the RH averages seldom met this threshold. The ARI results in Table 1 reveal the superior recovery of TMKLMedH was systematic across most design feature levels. The two exceptions are the recovery of the row-cluster structure when
Our first example for demonstrating the TMKLMedH uses United Nations General Assembly voting data, originally compiled by Doreian et al. (2013). Two network matrices were analyzed. The first was a 141 × 276 binary network matrix pertaining to votes of 141 countries on 276 ideological resolutions. The second was a 153 × 129 matrix for the votes of 153 countries on 129 military resolutions. Brusco et al. (2013a) performed a comparison of the performance of three two-mode blockmodeling algorithms on these two networks: (i) the relocation heuristic (RH) described in Section 3; (ii) a tabu search (TS) heuristic developed by Brusco and Steinley (2011); and (iii) a variable neighborhood search (VNS) heuristic developed by Brusco et al. (2013a) for two-mode blockmodeling. Each algorithm was applied to these network matrices for all numbers of clusters in the intervals 2 ≤
For comparison to the results obtained by Brusco et al. (2013a), we ran the TMKLMedH heuristic for all combinations of 4 ≤
Comparison of criterion function values for the UNGA networks.
UNGA Military resolutions network | UNGA Ideological resolutions network | ||||||||
TMKLMedK | RH | TS | VNS | TMKLMedH | RH | TS | VNS | ||
4 | 4 | 1743 | 1743 | 1743 | 1743 | 4220 | 4220 | 4220 | 4220 |
4 | 5 | 1730 | 1730 | 1730 | 1730 | 4144 | 4144 | 4144 | 4144 |
4 | 6 | 1730 | 1730 | 1730 | 1730 | 4136 | 4136 |
| 4136 |
4 | 7 | 1730 | 1730 | 1730 | 1730 | 4131 |
|
|
|
5 | 4 | 1713 | 1713 | 1713 | 1713 | 4200 | 4200 | 4200 | 4200 |
5 | 5 | 1663 | 1663 | 1663 | 1663 | 4020 | 4020 | 4020 | 4020 |
5 | 6 | 1649 |
| 1649 | 1649 | 3947 |
| 3947 | 3947 |
5 | 7 | 1646 |
| 1646 |
| 3890 |
|
| 3890 |
6 | 4 | 1707 | 1707 |
|
| 4194 |
| 4194 |
|
6 | 5 | 1633 | 1633 | 1633 | 1633 | 4001 | 4001 | 4001 | 4001 |
6 | 6 |
|
| 1612 | 1612 | 3841 | 3841 | 3841 | 3841 |
6 | 7 | 1599 |
|
| 1599 | 3763 |
| 3763 | 3763 |
7 | 4 | 1702 |
|
|
| 4194 | 4194 | 4194 |
|
7 | 5 | 1627 |
| 1627 |
| 3997 |
|
|
|
7 | 6 | 1577 |
| 1577 | 1577 | 3822 | 3822 |
| 3822 |
7 | 7 | 1565 |
| 1565 | 1565 | 3691 |
| 3691 | 3691 |
Note: The criterion function values for the RH, TS, and VNS methods were taken from the article by Brusco et al. (2013a, Tables 2 and 3, pp. 204-205). The cell values shown in bold font reflect instances where the heuristic method failed to match the best criterion function value found across all four methods
The results in Table 2 are quite surprising. Across the 32 test problems, TMKLMedH matched the best-found criterion function value for 31 of the 32 test problems. By contrast, the TS, VNS, and RH methods only matched the best-found criterion function value for 24, 24, and 17 of the test problems, respectively. It is also noteworthy that, for three test problems (military resolution network with
We also collected, for each test problem, the elapsed time for TMKLMedH to find its best-found solution. For 22 of the 32 problems, TMKLMedH found its best solution within the first minute of its allotted 10-minute time limit. Moreover, there were only three problems where the best-found solution was not realized until after five minutes of elapsed time.
To summarize, the results using the UNGA data strongly support the simulation study. The TMKLMedH yielded a better criterion function value than RH for 15 problems, and the same criterion function value for the remaining 17 problems. The RH never produced a better result than TMKLMedH within the allotted time. Although the performance of TMKLMedH relative to TS and VNS is very favorable, considerable care is warranted. The TS and VNS procedures use relocation heuristics in their implementation and, therefore, are significantly slower than TMKLMedH. If the TS and VNS algorithms were redesigned to use the same process of finding medians and reassigning cases, perhaps their performances could improve substantially.
The second comparison of RH and TMKLMedH for a real-world network was undertaken using movie-rating data from the MovieLens database (Harper & Konstan, 2015). This data was retrieved from the KONECT (the Koblenz Network Collection:
We ran the RH and TMKLMedH algorithms for all combinations of 2 ≤
Comparison of criterion function values and number of restarts for the MovieLens network.
Criterion function values | Number of restarts | ||||||
TMKLMedH | RH | PICF | TMKLMedH | RH | RatioTMKLMedH / RH | ||
2 | 2 | 90971 | 90971 | 0.000 | 2000 | 342 | 5.848 |
2 | 3 | 90889 | 90901 | 0.013 | 1980 | 201 | 9.851 |
2 | 4 | 90875 | 90900 | 0.028 | 1863 | 171 | 10.895 |
2 | 5 | 90875 | 90899 | 0.026 | 1799 | 129 | 13.946 |
2 | 6 | 90875 | 90892 | 0.019 | 1749 | 113 | 15.478 |
2 | 7 | 90875 | 90889 | 0.015 | 1687 | 103 | 16.379 |
3 | 2 | 90971 | 90971 | 0.000 | 1494 | 129 | 11.581 |
3 | 3 | 88846 | 88859 | 0.015 | 1345 | 86 | 15.640 |
3 | 4 | 88799 | 88858 | 0.066 | 1176 | 80 | 14.700 |
3 | 5 | 88783 | 88864 | 0.091 | 1130 | 59 | 19.153 |
3 | 6 | 88781 | 88852 | 0.080 | 1109 | 56 | 19.804 |
3 | 7 | 88773 | 88823 | 0.056 | 1111 | 52 | 21.365 |
4 | 2 | 90971 | 90971 | 0.000 | 1218 | 86 | 14.163 |
4 | 3 | 88846 | 88864 | 0.020 | 1079 | 57 | 18.930 |
4 | 4 | 87603 | 87907 | 0.347 | 936 | 42 | 22.286 |
4 | 5 | 87205 | 87237 | 0.037 | 893 | 42 | 21.262 |
4 | 6 | 87142 | 87550 | 0.468 | 806 | 36 | 22.389 |
4 | 7 | 87124 | 87748 | 0.716 | 782 | 36 | 21.722 |
5 | 2 | 90971 | 90971 | 0.000 | 1041 | 68 | 15.309 |
5 | 3 | 88850 | 88863 | 0.015 | 897 | 42 | 21.357 |
5 | 4 | 87165 | 87662 | 0.570 | 779 | 34 | 22.912 |
5 | 5 | 86498 | 87146 | 0.749 | 747 | 33 | 22.636 |
5 | 6 | 86043 | 86514 | 0.547 | 680 | 28 | 24.286 |
5 | 7 | 86010 | 86082 | 0.084 | 647 | 24 | 26.958 |
6 | 2 | 90971 | 90971 | 0.000 | 897 | 56 | 16.018 |
6 | 3 | 88846 | 88876 | 0.034 | 784 | 32 | 24.500 |
6 | 4 | 87172 | 87720 | 0.629 | 673 | 29 | 23.207 |
6 | 5 | 86105 | 87338 | 1.432 | 651 | 24 | 27.125 |
6 | 6 | 85790 | 86678 | 1.035 | 597 | 21 | 28.429 |
6 | 7 | 85529 | 86197 | 0.781 | 567 | 18 | 31.500 |
7 | 2 | 90971 | 90971 | 0.000 | 792 | 46 | 17.217 |
7 | 3 | 88846 | 88883 | 0.042 | 690 | 30 | 23.000 |
7 | 4 | 87169 | 87456 | 0.329 | 603 | 22 | 27.409 |
7 | 5 | 85999 | 86330 | 0.385 | 572 | 20 | 28.600 |
7 | 6 | 85573 | 85982 | 0.478 | 511 | 15 | 34.067 |
7 | 7 | 85188 | 85617 | 0.504 | 498 | 14 | 35.571 |
Note: PICF is the percentage improvement in the criterion function from using TMKLMedH instead of RH.
Table reveals that TMKLMedH produced a better criterion function value than RH for 30 of the 36 combinations of
We recognized that the small percentage differences in the criterion values obtained by TMKLMedH and RH might not necessarily translate into meaningful differences with respect to blockmodel interpretation. The first step in this process was to select appropriate values for
Heatmaps for the criterion function values obtained by RH and TMKLMedH are displayed in Figure 1. These heatmaps were produced using the conditional formatting capabilities in Microsoft Excel using a red-yellow-green color scheme. Deep red values reflect the poorest of the criterion function values obtained, whereas deep greens signify the finest of the criterion values obtained. The progression in the heatmaps from red to reddish-yellow to yellow (and from yellow to greenish-yellow to green) visually illustrates improvement in the criterion function.
Figure 1.
Heatmaps for RH (top panel) and TMKLMedH (bottom panel) criterion values for the MovieLens network.

An examination of the heatmaps in Figure 1 reveals some similarities and differences. Both heatmaps show deep red cells in the first row (
The criterion function values for RH and TMKLMedH at
Figure 2.
RH (top panel) and TMKLMedH (bottom panel) image matrices for blockmodels obtained using

The most striking similarity of the blockmodels obtained by RH and TMKLMedH is that each has one large cluster of individuals and one large cluster of movies that are loosely tied to the network (i.e., not associated with any
We have introduced a model known as two-mode
Several comparisons of TMKLMedH and RH were undertaken. The first comparison was based on a simulation experiment that showed TMKLMedH dramatically outperformed RH regarding criterion function values and partition recovery. The superior performance of TMKLMedH can be attributed to it realizing anywhere from 3 to 72 times more restarts than RH within the same three-minute time limit. The second evaluation based on previously published results for RH (and two other procedures) for two UNGA voting networks (Brusco et al., 2013a; Doreian et al., 2013). They revealed TMKLMedH as always outperforming RH. Moreover, TMKLMedH also outperformed both the tabu search and variable neighborhood search heuristics evaluated by Brusco et al. (2013a). Finally, TMKLMedH and RH were applied to a larger two-mode network from the KONECT database. The results reinforced the findings from the simulation and UNGA analyses, but also showed that the improvements realized from TMKLMedH relative to RH could lead to salient differences with respect to model selection and blockmodel interpretation.
Some limitations and extensions associated with the research presented in this paper can broadly be divided into three categories: (i) restriction of our attention to binary two-mode networks; (ii) limitations of the algorithmic and experimental implementations; and (iii) the issue of metaheuristics.
As noted in Sections 1 and 2, a key focus of this paper showed TMKLMedH as a direct competitor for RH for binary network matrices. However, TMKLMedH is not restricted to binary network data. For example, TMKLMedH could be directly applied to the movie rating data using the actual 1 to 5 ratings as weighted edges. In fact, in circumstances where edge weights are restricted to a narrow range, TMKLMedH could well be an effective alternative to the two-mode
The RH and TMKLMedH algorithms were both written in Fortran 90 and programmed using similar conventions. Nevertheless, it is possible that the programming of the same algorithms by other researchers on different software platforms could results in different findings with respect to the number of restarts within the prescribed limits. Moreover, it is reasonable to expect that, as the time limits are increased, the relative performance differences between RH and TMKLMedH are apt to shrink. Nevertheless, we are confident about the general findings of our simulation experiment and real-world examples being sound. The TMKLMedH is much faster than RH and allows for many more restarts within a specified time constraint. This is likely to be important for larger, more computationally demanding networks.
While RH and TMKLMedH were identified as competitors for the DTMBP problem, however, the use of a metaheuristic approach uses the two methods in tandem may have additional value. Such a pairing of the two methods would mimic the
Figure 1.

Figure 2.

Comparison of criterion function values for the UNGA networks.
UNGA Military resolutions network | UNGA Ideological resolutions network | ||||||||
TMKLMedK | RH | TS | VNS | TMKLMedH | RH | TS | VNS | ||
4 | 4 | 1743 | 1743 | 1743 | 1743 | 4220 | 4220 | 4220 | 4220 |
4 | 5 | 1730 | 1730 | 1730 | 1730 | 4144 | 4144 | 4144 | 4144 |
4 | 6 | 1730 | 1730 | 1730 | 1730 | 4136 | 4136 |
|
4136 |
4 | 7 | 1730 | 1730 | 1730 | 1730 | 4131 |
|
|
|
5 | 4 | 1713 | 1713 | 1713 | 1713 | 4200 | 4200 | 4200 | 4200 |
5 | 5 | 1663 | 1663 | 1663 | 1663 | 4020 | 4020 | 4020 | 4020 |
5 | 6 | 1649 |
|
1649 | 1649 | 3947 |
|
3947 | 3947 |
5 | 7 | 1646 |
|
1646 |
|
3890 |
|
|
3890 |
6 | 4 | 1707 | 1707 |
|
|
4194 |
|
4194 |
|
6 | 5 | 1633 | 1633 | 1633 | 1633 | 4001 | 4001 | 4001 | 4001 |
6 | 6 |
|
|
1612 | 1612 | 3841 | 3841 | 3841 | 3841 |
6 | 7 | 1599 |
|
|
1599 | 3763 |
|
3763 | 3763 |
7 | 4 | 1702 |
|
|
|
4194 | 4194 | 4194 |
|
7 | 5 | 1627 |
|
1627 |
|
3997 |
|
|
|
7 | 6 | 1577 |
|
1577 | 1577 | 3822 | 3822 |
|
3822 |
7 | 7 | 1565 |
|
1565 | 1565 | 3691 |
|
3691 | 3691 |
Comparison of criterion function values and number of restarts for the MovieLens network.
Criterion function values | Number of restarts | ||||||
TMKLMedH | RH | PICF | TMKLMedH | RH | RatioTMKLMedH / RH | ||
2 | 2 | 90971 | 90971 | 0.000 | 2000 | 342 | 5.848 |
2 | 3 | 90889 | 90901 | 0.013 | 1980 | 201 | 9.851 |
2 | 4 | 90875 | 90900 | 0.028 | 1863 | 171 | 10.895 |
2 | 5 | 90875 | 90899 | 0.026 | 1799 | 129 | 13.946 |
2 | 6 | 90875 | 90892 | 0.019 | 1749 | 113 | 15.478 |
2 | 7 | 90875 | 90889 | 0.015 | 1687 | 103 | 16.379 |
3 | 2 | 90971 | 90971 | 0.000 | 1494 | 129 | 11.581 |
3 | 3 | 88846 | 88859 | 0.015 | 1345 | 86 | 15.640 |
3 | 4 | 88799 | 88858 | 0.066 | 1176 | 80 | 14.700 |
3 | 5 | 88783 | 88864 | 0.091 | 1130 | 59 | 19.153 |
3 | 6 | 88781 | 88852 | 0.080 | 1109 | 56 | 19.804 |
3 | 7 | 88773 | 88823 | 0.056 | 1111 | 52 | 21.365 |
4 | 2 | 90971 | 90971 | 0.000 | 1218 | 86 | 14.163 |
4 | 3 | 88846 | 88864 | 0.020 | 1079 | 57 | 18.930 |
4 | 4 | 87603 | 87907 | 0.347 | 936 | 42 | 22.286 |
4 | 5 | 87205 | 87237 | 0.037 | 893 | 42 | 21.262 |
4 | 6 | 87142 | 87550 | 0.468 | 806 | 36 | 22.389 |
4 | 7 | 87124 | 87748 | 0.716 | 782 | 36 | 21.722 |
5 | 2 | 90971 | 90971 | 0.000 | 1041 | 68 | 15.309 |
5 | 3 | 88850 | 88863 | 0.015 | 897 | 42 | 21.357 |
5 | 4 | 87165 | 87662 | 0.570 | 779 | 34 | 22.912 |
5 | 5 | 86498 | 87146 | 0.749 | 747 | 33 | 22.636 |
5 | 6 | 86043 | 86514 | 0.547 | 680 | 28 | 24.286 |
5 | 7 | 86010 | 86082 | 0.084 | 647 | 24 | 26.958 |
6 | 2 | 90971 | 90971 | 0.000 | 897 | 56 | 16.018 |
6 | 3 | 88846 | 88876 | 0.034 | 784 | 32 | 24.500 |
6 | 4 | 87172 | 87720 | 0.629 | 673 | 29 | 23.207 |
6 | 5 | 86105 | 87338 | 1.432 | 651 | 24 | 27.125 |
6 | 6 | 85790 | 86678 | 1.035 | 597 | 21 | 28.429 |
6 | 7 | 85529 | 86197 | 0.781 | 567 | 18 | 31.500 |
7 | 2 | 90971 | 90971 | 0.000 | 792 | 46 | 17.217 |
7 | 3 | 88846 | 88883 | 0.042 | 690 | 30 | 23.000 |
7 | 4 | 87169 | 87456 | 0.329 | 603 | 22 | 27.409 |
7 | 5 | 85999 | 86330 | 0.385 | 572 | 20 | 28.600 |
7 | 6 | 85573 | 85982 | 0.478 | 511 | 15 | 34.067 |
7 | 7 | 85188 | 85617 | 0.504 | 498 | 14 | 35.571 |
Simulation results: (i) MPICF: mean percentage improvement in the criterion function realized from using TMKLMedH instead of RH; (ii) MPbetter: Mean percentage of test problems for which TMKLMedH provided a better criterion function value than RH; (iii) MRR: mean ratio of the number of restarts for TMKLMedH to the number for RH within the three-minute time limit; and (iv) ARI recovery measures for row and column clusters for RH and TMLKMedH.
Design feature levels | MPICF | MPbetter | MRR | RH (Row-ARI) | TMKLMedH (Row-ARI) | RH (Col-ARI) | TMKLMedH (Col-ARI) |
---|---|---|---|---|---|---|---|
Overall average |
|
|
|
|
|
|
|
.264 | 48.698 |
|
.748 | .810 | .715 | .778 | |
.425 | 46.875 |
|
.741 | .852 | .767 | .880 | |
.300 | 50.521 |
|
.710 | .779 | .736 | .807 | |
.388 | 45.052 |
|
.780 | .853 | .746 | .850 | |
|
|
23.709 |
|
|
.695 | .791 | |
|
|
25.938 |
|
|
.787 | .866 | |
|
|
21.900 | .698 | .790 |
|
|
|
|
|
27.747 | .792 | .872 |
|
|
|
Even row cluster density | .280 | 41.146 | 27.886 |
|
|
.765 | .849 |
60% row cluster density | .409 | 54.427 | 21.762 |
|
|
.717 | .808 |
Even column cluster density | .240 | 41.667 | 28.412 | .771 | .853 |
|
|
60% column cluster density | .449 | 53.906 | 21.236 | .719 | .809 |
|
|
33% Image matrix density | .367 | 48.438 | 26.753 | .745 | .829 | .740 | .830 |
66% Image matrix density | .321 | 47.135 | 22.895 | .744 | .833 | .742 | .827 |
70% block strength | .364 | 26.042 | 24.647 | .808 | .911 | .804 | .911 |
60% block strength | .325 | 69.531 | 25.001 | .682 | .751 | .678 | .747 |