This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Szmuc, T., Mrówka, R., Brańka, M., Ficoń, J., Pieta, P., A Novel Method for Fast Generation of 3D Objects from Multiple Depth Sensors, Journal of Artificial Intelligence and Soft Computing Research, 2023, 13(2): 95–105.
Martin-Gomez, A., Li, H., Song, T., Yang, S., Wang, G., Ding, H., Navab, N., Zhao, Z., Armand, M., STTAR: Surgical tool tracking using off-the-shelf augmented reality head-mounted displays, IEEE Transactions on Visualization and Computer Graphics, 2023, 1–16.
Rodrigues, R.T., Miraldo, P., Dimarogonas, D.V., Aguiar, A.P., A framework for depth estimation and relative localization of ground robots using computer vision, IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2019, 3719–3724.
Silva, R., Cielniak, G., Gao, J., Leaving the Lines Behind: Vision-Based Crop Row Exit for Agricultural Robot Navigation, arXiv preprint arXiv:2306.05869, 2023.
Sharma, A., Nett, R., Ventura, J., Unsupervised learning of depth and ego-motion from cylindrical panoramic video with applications for virtual reality, International Journal of Semantic Computing, 2020, 14(3): 333–356.
Rasla, A., Beyeler, M., The relative importance of depth cues and semantic edges for indoor mobility using simulated prosthetic vision in immersive virtual reality, Proceedings of the 28th ACM Symposium on Virtual Reality Software and Technology, 2022, 1–11.
Patakin, N., Vorontsova, A., Artemyev, M., Konushin, A., Single-stage 3D geometry-preserving depth estimation model training on dataset mixtures with uncalibrated stereo data, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 1705–1714.
Peng, R., Wang, R., Wang, Z., Lai, Y., Wang, R., Rethinking depth estimation for multi-view stereo: A unified representation, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 8645–8654.
Choe, J., Joo, K., Imtiaz, T., Kweon, I.S., Volumetric propagation network: Stereo-LiDAR fusion for long-range depth estimation, IEEE Robotics and Automation Letters, 2021, 6(3): 4672–4679.
Hirschmuller, H., Accurate and efficient stereo processing by semi-global matching and mutual information, IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2005, 2: 807–814.
Chang, J.-R., Chen, Y.-S., Pyramid stereo matching network, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, 5410–5418.
Liu, P., King, I., Lyu, M.R., Xu, J., Flow2Stereo: Effective self-supervised learning of optical flow and stereo matching, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020, 6648–6657.
Ullman, S., The interpretation of structure from motion, Proceedings of the Royal Society of London. Series B. Biological Sciences, 1979, 203(1153): 405–426.
Zhou, T., Brown, M., Snavely, N., Lowe, D.G., Unsupervised learning of depth and ego-motion from video, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2017, 1851–1858.
Godard, C., Mac Aodha, O., Firman, M., Brostow, G.J., Digging into self-supervised monocular depth estimation, Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, 3828–3838.
Zhou, Z., Fan, X., Shi, P., Xin, Y., R-MSFM: Recurrent multi-scale feature modulation for monocular depth estimating, Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, 12777–12786.
Zhang, N., Nex, F., Vosselman, G., Kerle, N., Lite-Mono: A lightweight CNN and transformer architecture for self-supervised monocular depth estimation, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023, 18537–18546.
Zhang, X., Zhou, X., Lin, M., Sun, J., ShuffleNet: An extremely efficient convolutional neural network for mobile devices, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, 6848–6856.
Eigen, D., Fergus, R., Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture, Proceedings of the IEEE/CVF International Conference on Computer Vision, 2015, 2650–2658.
Hui, T.-W., RM-Depth: Unsupervised learning of recurrent monocular depth in dynamic scenes, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 1675–1684.
Yan, J., Zhao, H., Bu, P., Jin, Y., Channel-wise attention-based network for self-supervised monocular depth estimation, 2021 International Conference on 3D Vision (3DV), 2021, 464–473.
Zhao, C., Zhang, Y., Poggi, M., Tosi, F., Guo, X., Zhu, Z., Huang, G., Tang, Y., Mattoccia, S., MonoViT: Self-supervised monocular depth estimation with a vision transformer, 2022 International Conference on 3D Vision (3DV), 2022, 668–678.
He, M., Hui, L., Bian, Y., Ren, J., Xie, J., Yang, J., RA-Depth: Resolution adaptive self-supervised monocular depth estimation, European Conference on Computer Vision, 2022, 565–581.
Shim, D., Kim, H.J., SwinDepth: Unsupervised depth estimation using monocular sequences via Swin transformer and densely cascaded network, arXiv preprint arXiv:2301.06715, 2023.
Jaderberg, M., Vedaldi, A., Zisserman, A., Speeding up convolutional neural networks with low rank expansions, Proceedings of the British Machine Vision Conference, 2014.
Howard, A.G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., Andreetto, M., Adam, H., MobileNets: Efficient convolutional neural networks for mobile vision applications, arXiv preprint arXiv:1704.04861, 2017.
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., Chen, L.-C., MobileNetV2: Inverted residuals and linear bottlenecks, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, 4510–4520.
Howard, A., Sandler, M., Chu, G., Chen, L.-C., Chen, B., Tan, M., Wang, W., Zhu, Y., Pang, R., Vasudevan, V., et al., Searching for MobileNetV3, Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019, 1314–1324.
Ma, N., Zhang, X., Zheng, H.-T., Sun, J., ShuffleNet V2: Practical guidelines for efficient CNN architecture design, Proceedings of the European Conference on Computer Vision, 2018, 116–131.
Mehta, S., Rastegari, M., MobileViT: Light-weight, general-purpose, and mobile-friendly vision transformer, International Conference on Learning Representations, 2021.
Yang, R., Ma, H., Wu, J., Tang, Y., Xiao, X., Zheng, M., Li, X., ScalableViT: Rethinking the context-oriented generalization of vision transformer, Proceedings of the European Conference on Computer Vision, 2022, 480–496.
Chen, Y., Dai, X., Chen, D., Liu, M., Dong, X., Yuan, L., Liu, Z., Mobile-Former: Bridging MobileNet and transformer, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022, 5270–5279.
Ho, J., Kalchbrenner, N., Weissenborn, D., Salimans, T., Axial attention in multidimensional transformers, arXiv preprint arXiv:1912.12180, 2019.
Mehta, S., Rastegari, M., Separable self-attention for mobile vision transformers, Transactions on Machine Learning Research, 2022.
Ronneberger, O., Fischer, P., Brox, T., U-Net: Convolutional networks for biomedical image segmentation, Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015, 234–241.
Krizhevsky, A., Sutskever, I., Hinton, G.E., ImageNet classification with deep convolutional neural networks, Advances in Neural Information Processing Systems, 2012, 25.
Xie, S., Girshick, R., Dollar, P., Tu, Z., He, K., Aggregated residual transformations for deep neural networks, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2017, 1492–1500.
Glorot, X., Bordes, A., Bengio, Y., Deep sparse rectifier neural networks, Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, 2011, 315–323.
Maas, A.L., Hannun, A.Y., Ng, A.Y., et al., Rectifier nonlinearities improve neural network acoustic models, Proc. ICML, 2013, 30(1): 3.
Wang, Z., Bovik, A.C., Sheikh, H.R., Simoncelli, E.P., Image quality assessment: From error visibility to structural similarity, IEEE Transactions on Image Processing, 2004, 13(4): 600–612.
Girshick, R., Fast R-CNN, Proceedings of the IEEE/CVF International Conference on Computer Vision, 2015, 1440–1448.
Zhou, H., Greenwood, D., Taylor, S., Self-supervised monocular depth estimation with internal feature fusion, arXiv preprint arXiv:2110.09482, 2021.
Geiger, A., Lenz, P., Stiller, C., Urtasun, R., Vision meets robotics: The KITTI dataset, The International Journal of Robotics Research, 2013, 32(11): 1231–1237.
Saxena, A., Sun, M., Ng, A.Y., Make3D: Learning 3D scene structure from a single still image, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2008, 31(5): 824–840.
Eigen, D., Puhrsch, C., Fergus, R., Depth map prediction from a single image using a multi-scale deep network, Advances in Neural Information Processing Systems, 2014, 27.
Wang, C., Buenaposada, J.M., Zhu, R., Lucey, S., Learning depth from monocular videos using direct methods, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018, 2022–2030.