Open Access

Multi-Aspect Incremental Tensor Decomposition Based on Distributed In-Memory Big Data Systems

May 20, 2020


Figure 1

Schematic representation of tensor matricization.

Figure 2

Schematic representation of incremental tensor matricization.

Figure 3

Schematic representation of the multi-dimensional incremental PARAFAC tensor decomposition method.

Figure 4

Factor matrices obtained by the decomposition of the incoming tensors.

Figure 5

Comparison of relative errors according to the different densities of incoming tensor datasets.

Figure 6

Comparison of execution time (min) according to the different initial sizes.

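Comparison of execution time (min) according to the different initial sizes

Column headers give the synthetic tensor size followed by the initial dimension size (I=J=K).

| Initial density | Type | Method | 100×100×100, 10 | 100×100×100, 50 | 100×100×100, 80 | 500×500×500, 100 | 500×500×500, 250 | 500×500×500, 400 | 800×800×800, 100 | 800×800×800, 400 | 800×800×800, 640 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 60% | Static | IM-PARAFAC | 1.76 | 1.62 | 2.66 | 126.84 | 131.49 | 270.04 | 540.79 | 543.60 | 1050.21 |
| 60% | Static | SamBaTen (CP-ALS) | 1.97 | 1.98 | 2.77 | 235.55 | 211.94 | 553.88 | 1398.33 | 1323.99 | 2416.36 |
| 60% | Static | InParTen2-Static | 0.74 | 0.77 | 1.08 | 36.94 | 48.28 | 72.14 | 180.63 | 177.64 | 357.48 |
| 60% | Dynamic | SamBaTen (Incremental) | 1.16 | 1.13 | 1.21 | 42.97 | 66.47 | 89.53 | 351.21 | 477.21 | 652.03 |
| 60% | Dynamic | InParTen2 | 0.68 | 0.57 | 0.46 | 37.08 | 29.79 | 17.11 | 185.99 | 111.21 | 83.01 |
| 20% | Static | IM-PARAFAC | 1.76 | 1.27 | 1.39 | 125.80 | 98.03 | 110.03 | 570.94 | 397.80 | 459.99 |
| 20% | Static | SamBaTen (CP-ALS) | 1.97 | 1.98 | 2.72 | 264.09 | 194.66 | 265.96 | 1403.27 | 973.12 | 990.81 |
| 20% | Static | InParTen2-Static | 0.97 | 0.86 | 0.88 | 46.03 | 35.58 | 39.55 | 146.53 | 108.07 | 119.39 |
| 20% | Dynamic | SamBaTen (Incremental) | 1.19 | 1.04 | 0.99 | 39.69 | 35.18 | 30.91 | 293.2 | 250.71 | 182.81 |
| 20% | Dynamic | InParTen2 | 0.67 | 0.54 | 0.44 | 37.17 | 24.11 | 12.40 | 158.21 | 95.45 | 46.8 |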

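Comparison of relative errors according to the different initial sizes

Column headers give the synthetic tensor size followed by the initial size (I=J=K).

| Initial density | Type | Method | 100×100×100, 10 | 100×100×100, 50 | 100×100×100, 80 | 500×500×500, 100 | 500×500×500, 250 | 500×500×500, 400 | 800×800×800, 100 | 800×800×800, 400 | 800×800×800, 640 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 60% | Static | IM-PARAFAC | 0.82 | 0.69 | 0.57 | 0.82 | 0.73 | 0.57 | 0.83 | 0.70 | 0.57 |
| 60% | Static | SamBaTen (CP-ALS) | 0.82 | 0.71 | 0.56 | 0.82 | 0.72 | 0.57 | 0.83 | 0.72 | 0.57 |
| 60% | Static | InParTen2-Static | 0.82 | 0.69 | 0.56 | 0.82 | 0.72 | 0.57 | 0.82 | 0.72 | 0.56 |
| 60% | Dynamic | SamBaTen (Incremental) | 0.85 | 0.78 | 0.57 | 0.83 | 0.80 | 0.57 | 0.83 | 0.79 | 0.58 |
| 60% | Dynamic | InParTen2 | 0.82 | 0.69 | 0.56 | 0.82 | 0.71 | 0.57 | 0.83 | 0.72 | 0.56 |
| 20% | Static | IM-PARAFAC | 0.82 | 0.82 | 0.82 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 |
| 20% | Static | SamBaTen (CP-ALS) | 0.82 | 0.82 | 0.82 | 0.82 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 |
| 20% | Static | InParTen2-Static | 0.82 | 0.82 | 0.82 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 |
| 20% | Dynamic | SamBaTen (Incremental) | 0.84 | 0.85 | 0.83 | 0.85 | 0.98 | 0.87 | 1.02 | 0.88 | 0.86 |
| 20% | Dynamic | InParTen2 | 0.82 | 0.82 | 0.83 | 0.82 | 0.84 | 0.84 | 0.84 | 0.84 | 0.84 |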
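
The page does not spell out how the relative error in these tables is computed. A minimal sketch, assuming the common definition ‖X − X̂‖_F / ‖X‖_F, where X̂ is the tensor rebuilt from the three PARAFAC factor matrices, might look like the following. The function names `cp_reconstruct` and `relative_error` are illustrative and do not come from the paper, and the dense NumPy arrays are only practical for small tensors.

```python
import numpy as np

def cp_reconstruct(a, b, c):
    """Rebuild a 3-way tensor from PARAFAC factor matrices (I x R, J x R, K x R)."""
    # Sum over the rank index r of the outer products a[:, r] o b[:, r] o c[:, r].
    return np.einsum("ir,jr,kr->ijk", a, b, c)

def relative_error(x, a, b, c):
    """Assumed definition: ||X - X_hat||_F / ||X||_F (not confirmed by the source)."""
    x_hat = cp_reconstruct(a, b, c)
    return np.linalg.norm(x - x_hat) / np.linalg.norm(x)

# Usage on a small synthetic rank-3 tensor.
rng = np.random.default_rng(0)
A, B, C = rng.random((10, 3)), rng.random((8, 3)), rng.random((6, 3))
X = cp_reconstruct(A, B, C)

exact = relative_error(X, A, B, C)               # factors reproduce X
empty = relative_error(X, 0 * A, 0 * B, 0 * C)   # all-zero reconstruction
```

Under this definition a value near 0 means the factors reproduce the tensor, and a value of 1 corresponds to an all-zero reconstruction.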

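Comparison of execution time and relative error using real-world datasets

| Type | Method | Execution time (min), YELP | Execution time (min), MovieLens | Execution time (min), Netflix | Relative error, YELP | Relative error, MovieLens | Relative error, Netflix |
|---|---|---|---|---|---|---|---|
| Static | IM-PARAFAC | 7.44 | 76.03 | 924.91 | 0.98 | 0.75 | 0.79 |
| Static | SamBaTen (CP-ALS) | 44.71 | 518.73 | N.A. | 0.98 | 0.99 | N.A. |
| Static | InParTen2-Static | 2.047 | 29.03 | 893.33 | 0.96 | 0.76 | 0.79 |
| Dynamic | SamBaTen (Incremental) | 36.42 | 36.65 | N.A. | 0.98 | 0.91 | N.A. |
| Dynamic | InParTen2 | 1.94 | 23.50 | 246.63 | 0.97 | 0.80 | 0.80 |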

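Comparison of execution time (min) according to the different densities of the incoming tensor

Column headers give the synthetic tensor size followed by the density of the incoming tensor.

| Initial density | Type | Method | 100×100×100, 5% | 100×100×100, 10% | 100×100×100, 20% | 500×500×500, 5% | 500×500×500, 10% | 500×500×500, 20% | 800×800×800, 5% | 800×800×800, 10% | 800×800×800, 20% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 60% | Static | IM-PARAFAC | 1.19 | 1.37 | 1.62 | 78.5 | 95.5 | 131.5 | 318.9 | 402.3 | 543.6 |
| 60% | Static | SamBaTen (CP-ALS) | 1.55 | 1.74 | 1.98 | 134.9 | 183.8 | 211.9 | 657.6 | 836.2 | 1323.9 |
| 60% | Static | InParTen2-Static | 0.56 | 0.64 | 0.77 | 26.9 | 34.2 | 48.3 | 106.2 | 130.2 | 177.6 |
| 60% | Dynamic | SamBaTen | 1.07 | 1.07 | 1.13 | 35.5 | 45.7 | 66.5 | 301.5 | 354.2 | 477.2 |
| 60% | Dynamic | InParTen2 | 0.5 | 0.46 | 0.57 | 10.9 | 17.6 | 29.8 | 37.18 | 60.95 | 111.2 |
| 20% | Static | IM-PARAFAC | 0.87 | 0.99 | 1.27 | 40.9 | 58.5 | 98.03 | 167.9 | 246.3 | 397.8 |
| 20% | Static | SamBaTen (CP-ALS) | 1.42 | 1.48 | 1.98 | 41.9 | 116.2 | 194.66 | 345.6 | 554.3 | 973.1 |
| 20% | Static | InParTen2-Static | 0.64 | 0.72 | 0.86 | 14.9 | 21.4 | 35.6 | 59.8 | 84.32 | 108.1 |
| 20% | Dynamic | SamBaTen | 0.92 | 0.96 | 1.04 | 18.3 | 24.9 | 35.2 | 87.7 | 125.9 | 250.7 |
| 20% | Dynamic | InParTen2 | 0.36 | 0.44 | 0.54 | 9.3 | 16.7 | 24.1 | 30.2 | 51.5 | 95.4 |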

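Comparison of relative errors according to the different densities of the incoming tensor

Column headers give the synthetic tensor size followed by the density of the incoming tensor.

| Initial density | Type | Method | 100×100×100, 5% | 100×100×100, 10% | 100×100×100, 20% | 500×500×500, 5% | 500×500×500, 10% | 500×500×500, 20% | 800×800×800, 5% | 800×800×800, 10% | 800×800×800, 20% |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 60% | Static | IM-PARAFAC | 0.63 | 0.67 | 0.69 | 0.64 | 0.72 | 0.73 | 0.62 | 0.70 | 0.70 |
| 60% | Static | SamBaTen (CP-ALS) | 0.65 | 0.70 | 0.69 | 0.66 | 0.71 | 0.72 | 0.66 | 0.71 | 0.72 |
| 60% | Static | InParTen2-Static | 0.63 | 0.67 | 0.71 | 0.60 | 0.73 | 0.72 | 0.61 | 0.69 | 0.72 |
| 60% | Dynamic | SamBaTen (Incremental) | 0.67 | 0.69 | 0.78 | 0.67 | 0.75 | 0.80 | 0.67 | 0.74 | 0.79 |
| 60% | Dynamic | InParTen2 | 0.61 | 0.67 | 0.69 | 0.65 | 0.73 | 0.71 | 0.63 | 0.72 | 0.72 |
| 20% | Static | IM-PARAFAC | 0.87 | 0.87 | 0.82 | 0.90 | 0.89 | 0.83 | 0.89 | 0.88 | 0.83 |
| 20% | Static | SamBaTen (CP-ALS) | 0.88 | 0.87 | 0.82 | 0.89 | 0.89 | 0.83 | 0.89 | 0.89 | 0.83 |
| 20% | Static | InParTen2-Static | 0.87 | 0.87 | 0.82 | 0.87 | 0.89 | 0.83 | 0.90 | 0.88 | 0.83 |
| 20% | Dynamic | SamBaTen (Incremental) | 0.88 | 0.89 | 0.85 | 0.91 | 0.93 | 0.98 | 0.89 | 0.93 | 0.88 |
| 20% | Dynamic | InParTen2 | 0.86 | 0.89 | 0.82 | 0.91 | 0.90 | 0.84 | 0.87 | 0.90 | 0.86 |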

Summary of different incremental PARAFAC tensor decomposition algorithms

Algorithm | Static tensor decomposition | Incremental (Single-aspect) | Incremental (Multi-aspect) | Distributed | Scalability
Zhou et al. (2018)
Ma et al. (2018)
Gujral et al. (2018)
Yang and Yong (2019b)
Song et al. (2017)
Najafi et al. (2019)
InParTen2

Real-world tensor datasets

| Data name | Data type | Total dimension | Initial dimension | Non-zero | File size |
|---|---|---|---|---|---|
| YELP | User×Location×Time | 70,817×15,579×108 | 70,815×15,572×20 | 334,166 | 5.6MB |
| MovieLens | User×Movie×Time | 71,567×10,681×157 | 71,559×9,717×5 | 10,000,054 | 187MB |
| Netflix | User×Movie×Time | 2,649,429×17,770×2,182 | 2,648,623×17,764×50 | 98,754,343 | 1.8GB |

Table of symbols

| Symbol | Definition |
|---|---|
| X | Tensor |
| X_(n) | Mode-n unfolding matrix of tensor X |
| ‖X‖_F | Frobenius norm of tensor X |
| nnz(X) | Number of non-zero elements in tensor X |
| ⊗ | Kronecker product |
| ⊙ | Khatri-Rao product |
| * | Hadamard product (element-wise product) |
| † | Pseudo inverse |
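
The operations in the table of symbols can be illustrated on a small dense tensor. The following is a sketch only, assuming NumPy and the usual convention that the mode-n unfolding lists the remaining indices with the earliest mode varying fastest. The helpers `unfold` and `khatri_rao` are hypothetical names written for this example, not code from the paper, and a distributed in-memory implementation would not materialize dense arrays this way.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding X_(n): the mode-n fibers become the columns."""
    moved = np.moveaxis(tensor, mode, 0)
    # Fortran order makes the earliest remaining mode vary fastest.
    return np.reshape(moved, (tensor.shape[mode], -1), order="F")

def khatri_rao(a, b):
    """Khatri-Rao product: column-wise Kronecker product of a and b."""
    rows = a.shape[0] * b.shape[0]
    return np.einsum("ir,jr->ijr", a, b).reshape(rows, -1)

rng = np.random.default_rng(0)
A, B, C = rng.random((4, 2)), rng.random((3, 2)), rng.random((5, 2))
X = np.einsum("ir,jr,kr->ijk", A, B, C)   # rank-2 PARAFAC tensor, 4 x 3 x 5

X1 = unfold(X, 0)                    # mode-1 unfolding, shape (4, 15)
fro = np.linalg.norm(X)              # Frobenius norm of X
nnz = np.count_nonzero(X)            # number of non-zero elements
kron = np.kron(A, B)                 # Kronecker product, shape (12, 4)
gram = (C.T @ C) * (B.T @ B)         # Hadamard product of two Gram matrices
gram_pinv = np.linalg.pinv(gram)     # pseudo inverse

# For a PARAFAC tensor, the mode-1 unfolding factors through the Khatri-Rao product.
assert np.allclose(X1, A @ khatri_rao(C, B).T)
# Solving the least-squares problem for A recovers the first factor matrix.
A_est = X1 @ khatri_rao(C, B) @ gram_pinv
```

The last line combines the unfolding, the Khatri-Rao product, the Hadamard product and the pseudo inverse in the way a standard alternating least squares update for PARAFAC does. It is shown here only to connect the symbols and is not taken from the paper's algorithm.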
Language: English
Publication timeframe: 4 times per year
Journal Subjects: Computer Sciences, Information Technology, Project Management, Databases and Data Mining