Open Access

DeConvolve: Towards Textually Explainable and Human Cognizable Convolutional Networks

25 September 2025
About this article

Convolutional Neural Networks (CNNs) achieve remarkable accuracy and are employed across a wide range of applications. However, incorporating existing CNNs into physics-aware frameworks can distort image features and reduce classification accuracy. To overcome this, a new term is added to the loss function that reduces these distortions and highlights human-recognizable structures in the feature maps. The proposed DeConvolve is an explainability methodology that applies multimodal Large Language Models (LLMs) to feature maps to extract human-understandable sub-steps and provide textual explanations of model inference. DeConvolve identifies three major impediments to describing feature maps with LLMs: scattered regions of interest within a feature map, large areas of interest, and conflicting learning across the filters of each convolutional layer. Explanations for specific toy examples are then derived through weighted semantic averaging. Finally, data is curated as triples of image, class, and the rationale behind a professional's classification, and used to train a Contrastive Language–Image Pre-training (CLIP)-based model that generates robust explanations.
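The abstract does not specify the form of the added loss term. A minimal sketch of one plausible realization follows, assuming a total-variation-style penalty on intermediate feature maps to suppress distortion and favor contiguous, human-recognizable structures; the function names, the weight lam, and the penalty itself are illustrative assumptions, not the paper's formulation.

import torch
import torch.nn.functional as F

def total_variation(fmap: torch.Tensor) -> torch.Tensor:
    # Mean absolute difference between neighboring activations of a
    # (B, C, H, W) feature map; smaller values mean smoother, more
    # contiguous structure.
    dh = (fmap[..., 1:, :] - fmap[..., :-1, :]).abs().mean()
    dw = (fmap[..., :, 1:] - fmap[..., :, :-1]).abs().mean()
    return dh + dw

def loss_with_structure_term(logits, targets, feature_maps, lam=0.1):
    # Standard cross-entropy plus the assumed structure-preserving
    # penalty, averaged over the monitored convolutional layers.
    ce = F.cross_entropy(logits, targets)
    reg = sum(total_variation(f) for f in feature_maps) / len(feature_maps)
    return ce + lam * reg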
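"Weighted semantic averaging" is likewise only named, not defined. One reasonable reading, sketched below, is to embed candidate per-filter explanation texts, average the embeddings weighted by filter importance (for example, mean activation), and return the candidate closest to that average; the embedding source and the weighting scheme are assumptions.

import numpy as np

def weighted_semantic_average(embeddings: np.ndarray,
                              weights: np.ndarray) -> np.ndarray:
    # Weighted mean of explanation embeddings: rows (N, D), weights (N,).
    w = weights / weights.sum()
    return (w[:, None] * embeddings).sum(axis=0)

def nearest_explanation(embeddings: np.ndarray, texts: list,
                        weights: np.ndarray) -> str:
    # Return the candidate text whose embedding is most cosine-similar
    # to the weighted semantic average.
    avg = weighted_semantic_average(embeddings, weights)
    sims = embeddings @ avg / (
        np.linalg.norm(embeddings, axis=1) * np.linalg.norm(avg) + 1e-9
    )
    return texts[int(np.argmax(sims))]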
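For training the CLIP-based model on (image, class, rationale) triples, the standard symmetric contrastive objective would pair each image with its rationale text. A sketch under that assumption follows; the temperature and one-to-one pairing are the usual CLIP defaults, not details from the abstract.

import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_feats: torch.Tensor,
                          text_feats: torch.Tensor,
                          temperature: float = 0.07) -> torch.Tensor:
    # Symmetric InfoNCE over a batch of (image, rationale-text) pairs:
    # the i-th image should match the i-th rationale and vice versa.
    image_feats = F.normalize(image_feats, dim=-1)
    text_feats = F.normalize(text_feats, dim=-1)
    logits = image_feats @ text_feats.t() / temperature
    labels = torch.arange(logits.size(0), device=logits.device)
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2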

Language:
English
Frequency:
4 issues per year
Journal subjects:
Computer Science