Open access

Multi-Aspect Incremental Tensor Decomposition Based on Distributed In-Memory Big Data Systems

20 May 2020

Figure 1

Schematic representation of tensor matricization.

Figure 2

Schematic representation of incremental tensor matricization.

Figure 3

Schematic representation of the multi-dimensional incremental PARAFAC tensor decomposition method.

Figure 4

Factor matrices obtained by the decomposition of the incoming tensors.

Figure 5

Comparison of relative errors according to the different densities of incoming tensor datasets.

Figure 6

Comparison of execution time (min) for different initial sizes.

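Comparison of execution time (min) for different initial sizes

| Initial density | Type | Method | 100×100×100 | 100×100×100 | 100×100×100 | 500×500×500 | 500×500×500 | 500×500×500 | 800×800×800 | 800×800×800 | 800×800×800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Initial dimension size (I=J=K) | 10 | 50 | 80 | 100 | 250 | 400 | 100 | 400 | 640 |
| 60% | Static | IM-PARAFAC | 1.76 | 1.62 | 2.66 | 126.84 | 131.49 | 270.04 | 540.79 | 543.60 | 1050.21 |
| 60% | Static | SamBaTen (CP-ALS) | 1.97 | 1.98 | 2.77 | 235.55 | 211.94 | 553.88 | 1398.33 | 1323.99 | 2416.36 |
| 60% | Static | InParTen2-Static | 0.74 | 0.77 | 1.08 | 36.94 | 48.28 | 72.14 | 180.63 | 177.64 | 357.48 |
| 60% | Dynamic | SamBaTen (Incremental) | 1.16 | 1.13 | 1.21 | 42.97 | 66.47 | 89.53 | 351.21 | 477.21 | 652.03 |
| 60% | Dynamic | InParTen2 | 0.68 | 0.57 | 0.46 | 37.08 | 29.79 | 17.11 | 185.99 | 111.21 | 83.01 |
| 20% | Static | IM-PARAFAC | 1.76 | 1.27 | 1.39 | 125.80 | 98.03 | 110.03 | 570.94 | 397.80 | 459.99 |
| 20% | Static | SamBaTen (CP-ALS) | 1.97 | 1.98 | 2.72 | 264.09 | 194.66 | 265.96 | 1403.27 | 973.12 | 990.81 |
| 20% | Static | InParTen2-Static | 0.97 | 0.86 | 0.88 | 46.03 | 35.58 | 39.55 | 146.53 | 108.07 | 119.39 |
| 20% | Dynamic | SamBaTen (Incremental) | 1.19 | 1.04 | 0.99 | 39.69 | 35.18 | 30.91 | 293.2 | 250.71 | 182.81 |
| 20% | Dynamic | InParTen2 | 0.67 | 0.54 | 0.44 | 37.17 | 24.11 | 12.40 | 158.21 | 95.45 | 46.8 |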

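Comparison of relative errors for different initial sizes

| Initial density | Type | Method | 100×100×100 | 100×100×100 | 100×100×100 | 500×500×500 | 500×500×500 | 500×500×500 | 800×800×800 | 800×800×800 | 800×800×800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Initial size (I=J=K) | 10 | 50 | 80 | 100 | 250 | 400 | 100 | 400 | 640 |
| 60% | Static | IM-PARAFAC | 0.82 | 0.69 | 0.57 | 0.82 | 0.73 | 0.57 | 0.83 | 0.70 | 0.57 |
| 60% | Static | SamBaTen (CP-ALS) | 0.82 | 0.71 | 0.56 | 0.82 | 0.72 | 0.57 | 0.83 | 0.72 | 0.57 |
| 60% | Static | InParTen2-Static | 0.82 | 0.69 | 0.56 | 0.82 | 0.72 | 0.57 | 0.82 | 0.72 | 0.56 |
| 60% | Dynamic | SamBaTen (Incremental) | 0.85 | 0.78 | 0.57 | 0.83 | 0.80 | 0.57 | 0.83 | 0.79 | 0.58 |
| 60% | Dynamic | InParTen2 | 0.82 | 0.69 | 0.56 | 0.82 | 0.71 | 0.57 | 0.83 | 0.72 | 0.56 |
| 20% | Static | IM-PARAFAC | 0.82 | 0.82 | 0.82 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 |
| 20% | Static | SamBaTen (CP-ALS) | 0.82 | 0.82 | 0.82 | 0.82 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 |
| 20% | Static | InParTen2-Static | 0.82 | 0.82 | 0.82 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 | 0.83 |
| 20% | Dynamic | SamBaTen (Incremental) | 0.84 | 0.85 | 0.83 | 0.85 | 0.98 | 0.87 | 1.02 | 0.88 | 0.86 |
| 20% | Dynamic | InParTen2 | 0.82 | 0.82 | 0.83 | 0.82 | 0.84 | 0.84 | 0.84 | 0.84 | 0.84 |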

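Comparison of execution time and relative error using real-world datasets

| Type | Method | Execution time (min): YELP | Execution time (min): MovieLens | Execution time (min): Netflix | Relative error: YELP | Relative error: MovieLens | Relative error: Netflix |
|---|---|---|---|---|---|---|---|
| Static | IM-PARAFAC | 7.44 | 76.03 | 924.91 | 0.98 | 0.75 | 0.79 |
| Static | SamBaTen (CP-ALS) | 44.71 | 518.73 | N.A. | 0.98 | 0.99 | N.A. |
| Static | InParTen2-Static | 2.047 | 29.03 | 893.33 | 0.96 | 0.76 | 0.79 |
| Dynamic | SamBaTen (Incremental) | 36.42 | 36.65 | N.A. | 0.98 | 0.91 | N.A. |
| Dynamic | InParTen2 | 1.94 | 23.50 | 246.63 | 0.97 | 0.80 | 0.80 |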

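Comparison of execution time (min) for different densities of the incoming tensor

| Initial density | Type | Method | 100×100×100 | 100×100×100 | 100×100×100 | 500×500×500 | 500×500×500 | 500×500×500 | 800×800×800 | 800×800×800 | 800×800×800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Density of incoming tensor | 5% | 10% | 20% | 5% | 10% | 20% | 5% | 10% | 20% |
| 60% | Static | IM-PARAFAC | 1.19 | 1.37 | 1.62 | 78.5 | 95.5 | 131.5 | 318.9 | 402.3 | 543.6 |
| 60% | Static | SamBaTen (CP-ALS) | 1.55 | 1.74 | 1.98 | 134.9 | 183.8 | 211.9 | 657.6 | 836.2 | 1323.9 |
| 60% | Static | InParTen2-Static | 0.56 | 0.64 | 0.77 | 26.9 | 34.2 | 48.3 | 106.2 | 130.2 | 177.6 |
| 60% | Dynamic | SamBaTen | 1.07 | 1.07 | 1.13 | 35.5 | 45.7 | 66.5 | 301.5 | 354.2 | 477.2 |
| 60% | Dynamic | InParTen2 | 0.5 | 0.46 | 0.57 | 10.9 | 17.6 | 29.8 | 37.18 | 60.95 | 111.2 |
| 20% | Static | IM-PARAFAC | 0.87 | 0.99 | 1.27 | 40.9 | 58.5 | 98.03 | 167.9 | 246.3 | 397.8 |
| 20% | Static | SamBaTen (CP-ALS) | 1.42 | 1.48 | 1.98 | 41.9 | 116.2 | 194.66 | 345.6 | 554.3 | 973.1 |
| 20% | Static | InParTen2-Static | 0.64 | 0.72 | 0.86 | 14.9 | 21.4 | 35.6 | 59.8 | 84.32 | 108.1 |
| 20% | Dynamic | SamBaTen | 0.92 | 0.96 | 1.04 | 18.3 | 24.9 | 35.2 | 87.7 | 125.9 | 250.7 |
| 20% | Dynamic | InParTen2 | 0.36 | 0.44 | 0.54 | 9.3 | 16.7 | 24.1 | 30.2 | 51.5 | 95.4 |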

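Comparison of relative errors for different densities of the incoming tensor

| Initial density | Type | Method | 100×100×100 | 100×100×100 | 100×100×100 | 500×500×500 | 500×500×500 | 500×500×500 | 800×800×800 | 800×800×800 | 800×800×800 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | | Density of incoming tensor | 5% | 10% | 20% | 5% | 10% | 20% | 5% | 10% | 20% |
| 60% | Static | IM-PARAFAC | 0.63 | 0.67 | 0.69 | 0.64 | 0.72 | 0.73 | 0.62 | 0.70 | 0.70 |
| 60% | Static | SamBaTen (CP-ALS) | 0.65 | 0.70 | 0.69 | 0.66 | 0.71 | 0.72 | 0.66 | 0.71 | 0.72 |
| 60% | Static | InParTen2-Static | 0.63 | 0.67 | 0.71 | 0.60 | 0.73 | 0.72 | 0.61 | 0.69 | 0.72 |
| 60% | Dynamic | SamBaTen (Incremental) | 0.67 | 0.69 | 0.78 | 0.67 | 0.75 | 0.80 | 0.67 | 0.74 | 0.79 |
| 60% | Dynamic | InParTen2 | 0.61 | 0.67 | 0.69 | 0.65 | 0.73 | 0.71 | 0.63 | 0.72 | 0.72 |
| 20% | Static | IM-PARAFAC | 0.87 | 0.87 | 0.82 | 0.90 | 0.89 | 0.83 | 0.89 | 0.88 | 0.83 |
| 20% | Static | SamBaTen (CP-ALS) | 0.88 | 0.87 | 0.82 | 0.89 | 0.89 | 0.83 | 0.89 | 0.89 | 0.83 |
| 20% | Static | InParTen2-Static | 0.87 | 0.87 | 0.82 | 0.87 | 0.89 | 0.83 | 0.90 | 0.88 | 0.83 |
| 20% | Dynamic | SamBaTen (Incremental) | 0.88 | 0.89 | 0.85 | 0.91 | 0.93 | 0.98 | 0.89 | 0.93 | 0.88 |
| 20% | Dynamic | InParTen2 | 0.86 | 0.89 | 0.82 | 0.91 | 0.90 | 0.84 | 0.87 | 0.90 | 0.86 |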

Summary of different incremental PARAFAC tensor decomposition algorithms

Method | Static tensor decomposition | Incremental (single-aspect) | Incremental (multi-aspect) | Distributed | Scalability
Zhou et al. (2018)
Ma et al. (2018)
Gujral et al. (2018)
Yang and Yong (2019b)
Song et al. (2017)
Najafi et al. (2019)
InParTen2

Real-world tensor datasets

| Data name | Data type | Total dimension | Initial dimension | Non-zero | File size |
|---|---|---|---|---|---|
| YELP | User×Location×Time | 70,817×15,579×108 | 70,815×15,572×20 | 334,166 | 5.6 MB |
| MovieLens | User×Movie×Time | 71,567×10,681×157 | 71,559×9,717×5 | 10,000,054 | 187 MB |
| Netflix | User×Movie×Time | 2,649,429×17,770×2,182 | 2,648,623×17,764×50 | 98,754,343 | 1.8 GB |

Table of symbols

| Symbol | Definition |
|---|---|
| X | Tensor |
| X_(n) | Mode-n unfolding matrix of tensor X |
| ‖X‖_F | Frobenius norm of tensor X |
| nnz(X) | Number of non-zero elements in tensor X |
| ⊗ | Kronecker product |
| ⊙ | Khatri-Rao product |
| * | Hadamard product (element-wise product) |
| † | Pseudo-inverse |
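
The operations listed in the table of symbols can be illustrated with a small sketch. The code below is not taken from the article: it is a minimal pure-Python illustration for 3-way tensors stored as nested lists, and all function names (`unfold`, `frobenius_norm`, `nnz`, `kronecker`, `khatri_rao`, `hadamard`) are hypothetical. The column ordering used in the unfolding follows the common Kolda-Bader convention, which is an assumption here, and the pseudo-inverse is omitted because it needs a linear algebra library.

```python
from math import sqrt


def unfold(X, mode):
    """Mode-n unfolding of a 3-way tensor given as nested lists X[i][j][k].

    Assumption: Kolda-Bader column ordering, i.e. of the two remaining
    modes the lower-numbered one varies fastest. `mode` is 0, 1 or 2.
    """
    dims = (len(X), len(X[0]), len(X[0][0]))
    others = [m for m in range(3) if m != mode]
    cols = dims[others[0]] * dims[others[1]]
    M = [[0] * cols for _ in range(dims[mode])]
    for i in range(dims[0]):
        for j in range(dims[1]):
            for k in range(dims[2]):
                idx = (i, j, k)
                col = idx[others[0]] + idx[others[1]] * dims[others[0]]
                M[idx[mode]][col] = X[i][j][k]
    return M


def frobenius_norm(X):
    """Square root of the sum of squared entries of the tensor."""
    return sqrt(sum(v * v for plane in X for row in plane for v in row))


def nnz(X):
    """Number of non-zero entries of the tensor."""
    return sum(1 for plane in X for row in plane for v in row if v != 0)


def kronecker(A, B):
    """Kronecker product of an (m x n) and a (p x q) matrix: (mp x nq)."""
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]


def khatri_rao(A, B):
    """Column-wise Kronecker product; A and B need equal column counts."""
    cols = len(A[0])
    return [[ra[c] * rb[c] for c in range(cols)] for ra in A for rb in B]


def hadamard(A, B):
    """Element-wise product of two matrices of the same shape."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


# Small illustrative inputs (made up for this sketch).
X = [[[1, 0], [2, 3]], [[0, 4], [5, 0]]]   # 2 x 2 x 2 tensor
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

X1 = unfold(X, 0)          # mode-1 unfolding, shape 2 x 4
KR = khatri_rao(A, B)      # shape 4 x 2
```

In PARAFAC-style alternating least squares, a factor matrix is typically updated from an unfolding such as `X1` multiplied by a Khatri-Rao product of the other factors, which is why these operations appear together in the table.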
Language:
English
Publication frequency:
4 times per year
Journal subjects:
Computer Science, Information Technology, Project Management, Databases and Data Mining