
Introduction

Nowadays, multimedia presentations are becoming increasingly important, particularly because our world is ever more digitized and always connected. Multimedia presentations may include different modalities, ranging from simple text, audio, speech, sound, and images to more complex content such as touch and smell (Rahayu, 2011).

Visual-based multimedia presentations aim at reconstructing visual information that corresponds to the perception of the human visual system (HVS). Recently, high dynamic range (HDR) imaging has come to be regarded as one of the technological advances that can accomplish this purpose (Narwaria et al., 2015). Ideally, HDR imaging requires special tools, devices, and processing pipelines that differ from those used in today's ordinary image processing of low dynamic range (LDR)/standard dynamic range (SDR) images.

However, considering the prevalence of today's conventional imaging technology, HDR can also take advantage of SDR/LDR image processing methods. This can be seen, for example, in the rise over the past few years of smartphone and DSLR cameras that can capture HDR-processed images (Kundu et al., 2017a, b; Mantiuk et al., 2016). HDR images obtained in this way are usually created using inverse tone mapping operator (ITMO) and multi-exposure fusion (MEF) methods, because these two approaches can produce images covering a very wide range of lighting conditions (Azimi et al., 2015) and yield visual information with a range similar to that of the HVS. In addition, the resulting HDR images can look natural, more attractive and informative, and can even exhibit reduced noise compared with the source images (Kundu et al., 2017a, b; Ma et al., 2015; Rovid et al., 2007; Varkonyi-Koczy et al., 2008).

The development of HDR technologies will certainly require special image quality assessment (IQA) methods tailored to the characteristics of HDR images. HDR technologies place more demands on quality measurement methods because of the very high sensitivity of the HVS to errors and distortions in the images. IQA algorithms usually play an important role in the image processing pipeline (Opozda and Sochan, 2014; Zhu et al., 2018a, b), with the aim that the measured quality consistently reflects the quality of the image as perceived by the HVS.

There are two broad categories of image quality measurement methods, namely subjective and objective methods. Subjective image quality measurement is considered the most reliable approach because it directly involves human viewers in evaluating the quality of the displayed images, and it can therefore represent how the HVS responds to given visual stimuli. Unfortunately, subjective methods have major drawbacks: they are quite expensive to run consistently and require a lot of time to carry out. Consequently, objective image quality measurement methods that do not involve human viewers have developed quite rapidly.

Based on the availability of the original images used as a reference for quality measurement, objective image quality assessment can be classified into three categories: full-reference (FR), no-reference (NR), and reduced-reference (RR) methods. For conventional LDR/SDR images, a wide variety of FR, RR, and NR objective measurement methods have appeared over the past few years. For HDR images, however, many FR/NR methods have been developed, while very few, if any, address the RR case.

An RR method that measures the quality of visual services using only reduced information can nevertheless be very useful, for example under current conditions in which video streaming services are on the rise: service providers such as telcos or ISPs can monitor the quality of their products, while clients can verify that the quality of service they receive is really as promised by the content provider.

Considering the various explanations that we have given above, this paper will provide a review of the objective quality evaluation method for HDR images. The presentation of this paper will be organized as follows. In the following sections, we will briefly describe the HDR image processing flow in general. Subsequently, an explanation about the image quality assessment method will be given in the third section, which will be followed in the fourth section by a further explanation of some of the HDR image quality measurement models found in the literature. Our concept of quality assessment for HDR images in a reduced-reference fashion is outlined in the fifth section. Finally, this paper will conclude with some closing remarks in the sixth section.

HDR imaging
HDR imaging pipeline

An illustration of HDR image formation and processing is given in Figure 1. It shows how HDR images are acquired from the source, processed by encoding/decoding methods involving data compression techniques, and then displayed and evaluated for their quality (Artusi et al., 2017; Mantiuk et al., 2016). First of all, HDR images can be produced either by a camera capturing objects from the real world or by computer graphics that create a model-based image. The HDR images can then be compressed and encoded so that they can be stored or transmitted more efficiently, by converting the image data format so that it requires less storage capacity or less transmission bandwidth. Subsequently, the images can be displayed on various types of display devices, either natively or using conventional LDR/SDR displays.

Figure 1:

HDR imaging pipeline; redrawn from Artusi et al. (2017) and Mantiuk et al. (2016).

The display of HDR image content is still largely limited by the capabilities of the display device used. For devices with lower specifications to display HDR images properly, a tone mapping method is needed that captures the wider dynamic range of HDR images and converts it into the narrower dynamic range of conventional devices. A color correction method can also be employed to resolve any mismatch between HDR content and the capabilities of the display device. Conversely, an inverse tone mapping algorithm can be used to reconstruct HDR content from a single SDR image, or the multi-exposure image fusion (MEF) method can produce HDR content by combining several SDR images with different exposures.

Last but not least, HDR image quality assessment is performed with the main objective to assess the various algorithms used in the pipeline.

HDR creation

Methods of constructing HDR images using MEF and ITMO have been widely described in previous studies that can be found in the literature.

MEF belongs to the family of image-combination methods that have been studied since the 1980s, but it has recently received renewed research attention (Gu et al., 2012; Li et al., 2012; Song et al., 2012; Zhang and Cham, 2012). Since humans are the end users of most applications that apply MEF, these methods require an easy and simple but reliable quality assessment (Shen et al., 2013; Song et al., 2012; Zeng et al., 2014). A list of MEF-based methods relevant to HDR image construction is given in Table 1.

Table 1: Summary of MEF-based HDR image construction methods.

Paper | Method | Strength | Weakness
Ma et al. (2015) | Multi-exposure fusion algorithm | Correlates well with subjective judgments and significantly outperforms existing IQA models for general image fusion | Cannot be applied to varied image content
Rovid et al. (2007) | Gradient-based synthesized multiple exposure | Produces good-quality HDR images from a series of poor-quality photos taken at various exposures | Cannot be applied to color images
Varkonyi-Koczy et al. (2008) | New multiple-exposure-time image synthesization technique | High-quality color HDR image containing the maximum level of detail and RGB color information | The current implementation is limited to processing static scenes
Gu et al. (2012) | Fused gradient field | Efficient and effective | Can only handle small movements in the scene
Li et al. (2012) | New quadratic optimization | Enhances fine detail to produce images as sharp as existing high dynamic range imaging schemes | Image saturation is sometimes reduced by both proposed exposure fusion schemes
Song et al. (2012) | New probabilistic exposure fusion scheme | Advantageous compared with representative existing tone mapping operators | Rating and ranking are not suitable because both are too complex for an observer
Shen et al. (2013) | A novel fusion algorithm based on perceptual quality measures | Experiments demonstrated better performance than other methods | Relatively difficult to extend these metrics to cases with several image sources
Goshtasby (2005) | Fusion of multi-exposure images of a static scene taken by a stationary camera | Has no side effects; the local color and contrast of the input are preserved | The block size used to select and fuse the images must be chosen appropriately
Mertens et al. (2009) | Fusion of a bracketed exposure sequence | Comparable to existing tone mapping operators | The unoptimized software implementation takes seconds to perform the fusion
Yun et al. (2012) | Single-exposure-based image fusion using multi-transformation | Shows a more visually pleasing output with a perceptually increased dynamic range |
Huang et al. (2018a, b) | A new color multi-exposure image fusion | Produces better color rendition and more texture detail than other existing exposure fusion techniques | The proposed approach cannot yet fuse dynamic multi-exposure images and eliminate the associated artifacts
Kinoshita et al. (2018) | A new multi-exposure image fusion method based on exposure compensation | Better than other methods in terms of TMQI, statistical naturalness, and discrete entropy | It is unclear how to determine appropriate exposure values, which are difficult to set at the time of photography

A number of multi-exposure fusion methods (Goshtasby, 2005; Reinhard et al., 2010) used localized fusion weights without sufficient consistency considerations over larger areas, which could lead to an unnatural appearance of the fusion result. Other general-purpose fusion methods are also not optimal for individual applications and only apply to grayscale images. Goshtasby (2005) used a merging method based on maximum information content, with a stationary camera capturing multiple exposure images of static objects/scenes. The method divides the image into blocks of uniform size and then, for each block, selects the image with the maximum information in that block. Mertens et al. (2009) proposed a technique that fuses a bracketed exposure sequence into a high-quality image without converting it to the HDR domain. Because this method does not require the physical formation of HDR images, the process is simpler and more computationally efficient, and no camera response curve calibration is required. The method combines several different exposures, using suitable image contrast, high saturation, and good exposure to guide the merging process. In Song et al. (2012), an initial image is estimated by maximizing the visual contrast and scene gradients, and the fused image is synthesized by suppressing reversals in the image gradient. A similar gradient-based MEF method is proposed in the study of Gu et al. (2012). Li et al. (2012) built on Mertens et al. (2009) and improved the details of the fused image by solving a quadratic optimization problem. A median and recursive filter-based MEF method was developed in the study of Li and Kang (2012), taking into account local contrast, brightness, and color differences; the use of median filters also allows dynamic scenes to be handled. A new gradient-based approach to extracting image details was introduced by Rovid et al. (2007), which uses multiple exposure images of the same scene as input and divides the images into regions during processing.
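To make the weighting idea concrete, the following minimal sketch (Python with NumPy/OpenCV) computes per-pixel fusion weights from contrast, saturation, and well-exposedness in the general spirit of Mertens et al. (2009) and blends an exposure stack with a naive weighted average; the constants and the single-scale blending are our own illustrative assumptions, not the published implementations.

```python
import cv2
import numpy as np

def fusion_weights(img, wc=1.0, ws=1.0, we=1.0, sigma=0.2):
    """Per-pixel weight from contrast, saturation, and well-exposedness.

    img: float32 RGB image scaled to [0, 1]. The three measures follow the
    general recipe described above; exponents and sigma are illustrative.
    """
    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    contrast = np.abs(cv2.Laplacian(gray, cv2.CV_32F))            # favors edges/texture
    saturation = img.std(axis=2)                                  # spread across RGB channels
    well_exposed = np.prod(np.exp(-((img - 0.5) ** 2) / (2.0 * sigma ** 2)), axis=2)
    return (contrast ** wc) * (saturation ** ws) * (well_exposed ** we) + 1e-12

def naive_exposure_fusion(images):
    """Weighted average of an exposure stack (no pyramid blending)."""
    stack = np.stack(images).astype(np.float32)                   # (N, H, W, 3)
    weights = np.stack([fusion_weights(im) for im in stack])      # (N, H, W)
    weights /= weights.sum(axis=0, keepdims=True)                 # normalize per pixel
    return np.sum(weights[..., None] * stack, axis=0)             # fused (H, W, 3)
```

In practice, OpenCV's cv2.createMergeMertens() offers a multi-scale implementation of this type of fusion, which avoids the seams that a single-scale weighted average can produce.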

On the other hand, over the last two decades, HDR image formation techniques that start from SDR content using ITMO methods have also been proposed. A list of ITMO-based methods relevant to HDR image construction is given in Table 2. For example, in the study of Landis (2002), a global expansion technique was introduced for the first time, applying an exponential function to SDR image pixel values above a certain threshold to form an HDR image. This method works well for image-based lighting (IBL), but the results are not satisfactory for HDR image visualization. Subsequently, Banterle et al. (2006) applied the inverse of the photographic tone reproduction method developed by Reinhard et al. (2002) to expand SDR images. In this method, median cut is used to estimate areas of the image with high luminance values, which are then used to map the pixel expansion, and linear interpolation is used to obtain the final HDR image. In this way, good-quality images can be produced because the method can remove noise and blocky effects. Unfortunately, while this method works well for still images, it is not sufficient for video processing.

Table 2: Summary of ITMO-based HDR image construction methods.

Paper | Method | Strength | Weakness
Larson et al. (1997) | Tone reproduction curves (TRC) | Performs well on a wide category of images | Produces visually accurate but not enhanced images
Reinhard et al. (2002) | Zone system with automatic dodging and burning | Well suited to a broad range of HDR images | Only brings textured areas within range, which is a relatively simple treatment
Durand and Dorsey (2000) | Extended version of Ferwerda et al. (1996) | Solves an interesting problem in tone mapping | Slower than state-of-the-art methods
Fattal et al. (2002) | A gradient-based tone mapping operator | Able to compress a very wide dynamic range while presenting every detail, with less of the common noise or artifacts | Does not enhance global features
Mantiuk et al. (2006) | Contrast mapping and contrast equalization | Provides high visual quality output with appealing brightness and contrast and no artifacts | Does not run in real time and does not include color information
Qiu et al. (2006) | Optimized tone reproduction curve (TRC) | Simpler, faster, and easier to implement than previous methods | May destroy spatial details
Eilertsen et al. (2015) | A real-time noise-aware TMO | Minimizes contrast distortions, controls the perceptibility of noise, adapts to changing light, and can be applied in real time | Weaker in scene reproduction and does not achieve the best subjective scores
Rana et al. (2019) | SVR | Achieves consistent results under complex real-world illumination transitions | Execution time is the longest among state-of-the-art methods
El Mezeni and Saranovac (2018) | Local tone mapping | Preserves details and good local and global contrast in processed images, with better overall image quality | Produces a small amount of noise

Furthermore, an ITMO method with simple linear expansion has also been proposed by Akyüz et al. (2007). Psychophysical experiments in that work showed that SDR image content can be displayed properly on an HDR screen. The disadvantage of this method is that it is not able to boost contrast in saturated regions. Another ITMO operator proposed in the study of Kinoshita et al. (2017) uses Reinhard's global operator (Reinhard et al., 2002) and shows that the resulting images have good structural similarity and lower computation costs than other methods. Kovaleski and Oliveira (2009) also implemented a fully automated ITMO process with the aid of a neural network approach. In this approach, a cross-bilateral technique is used that can improve image details over a wide exposure range, especially in over-exposed areas, which were usually a problem in previous studies. The image brightness correction function is commonly used for reverse tone mapping, and the results show superior quality compared with conventional methods because fewer distortion artifacts are displayed. Unfortunately, there are still minor problems with color loss and with difficulty in recovering texture in under- or over-exposed image areas.
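As a rough illustration of the global expansion idea behind operators such as Landis (2002) and the simple scaling of Akyüz et al. (2007), the sketch below linearizes an SDR image and applies a power-law expansion before scaling to an assumed peak luminance; the gamma, expansion exponent, and peak value are illustrative assumptions, not the parameters of the published operators.

```python
import numpy as np

def toy_inverse_tone_map(sdr, gamma=2.2, expansion=1.5, peak_nits=1000.0):
    """Toy global ITMO sketch (all constants are illustrative assumptions).

    sdr: display-referred image in [0, 1]. We first undo the display gamma to
    obtain linear light, then apply a power-law expansion (expansion > 1),
    which stretches highlights relative to shadows, and finally scale to an
    assumed peak display luminance in cd/m^2.
    """
    linear = np.power(np.clip(sdr, 0.0, 1.0), gamma)   # linearize
    expanded = np.power(linear, expansion)             # stretch the dynamic range
    return peak_nits * expanded                        # HDR values at the assumed peak
```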

False contour/edge artifacts

HDR creation as described in the previous section is prone to distortion artifacts in the form of false edges/contours. The edge image derived from HDR-processed images should therefore be further analyzed to provide more useful information related to false edges/contours. Contour detection is usually done after edge detection (Lokmanwar and Bhalchandra, 2019).

Contour detection is one of the most important early steps in segmentation and object detection, as well as in understanding image scenes/content. Contour analysis, which begins with the detection process, is increasingly being used to produce high-quality image segmentation. It is also used to handle more complex contours efficiently, even for images with the cluttered backgrounds that are common in real-life photographs (Manno-Kovacs, 2019). For example, the modified Harris for edges and corners (MHEC) method is considered efficient for contour detection purposes. Unfortunately, this method still has drawbacks: it involves an iterative process that is slower than other approaches.

Contour detection can also lead to image edge detection errors identified as false contours. Typically, such false contours are found in low-frequency and smooth-gradient image areas (Ahn and Kim, 2005). A number of image processing techniques can make these false contours more visible than before, for example contrast enhancement, sharpness enhancement, and color modification, some of which are actually used in the HDR image generation process.
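A simple way to see how such regions can be flagged is the heuristic sketch below, which marks pixels whose gradient magnitude is weak but non-negligible inside otherwise smooth areas; the thresholds and window size are illustrative assumptions and are not taken from any of the cited detectors.

```python
import numpy as np
from scipy import ndimage

def false_contour_candidates(gray, low=0.002, high=0.02, var_thresh=1e-4, win=9):
    """Heuristic false-contour candidate map for a grayscale image in [0, 1].

    False contours tend to appear as faint, step-like edges inside smooth,
    low-frequency regions, so we keep pixels whose Sobel gradient magnitude
    lies in a small band and whose local variance is low.
    """
    gx = ndimage.sobel(gray, axis=1)
    gy = ndimage.sobel(gray, axis=0)
    grad_mag = np.hypot(gx, gy)
    local_mean = ndimage.uniform_filter(gray, size=win)
    local_var = ndimage.uniform_filter(gray ** 2, size=win) - local_mean ** 2
    return (grad_mag > low) & (grad_mag < high) & (local_var < var_thresh)
```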

A summary of various contour detection methods is given in Table 3.

Table 3: Summary of contour detection methods.

Paper | Method | Advantage | Drawback
Huang et al. (2018a, b) | False contour candidates in HEVC | Detects very noticeable false contours and removes them while preserving texture and details | Fails to remove larger-sized false contours
Ahn and Kim (2005) | Flat-region and bit-depth extension | Removes false contours effectively while preserving sharpness | Cannot remove local hole-like patterns effectively
Lokmanwar and Bhalchandra (2019) | Gaussian filter and spectral clustering | Enhances peak level and smoothing direction | Contours are only generated around strong boundaries
Manno-Kovacs (2019) | MHEC (modified Harris for edges and corners) point set | Handles complex contours; capable of multiple object detection | The iterative active contour is still slower than other methods
Chua and Shen (2017) | CNN patch-level measurement | No need to precisely predict boundary pixels | Still erroneous in large textured regions
Image quality assessment

Image quality measurement has become an active research topic in recent years. Its applications are very broad, ranging from the quality assessment of image coding techniques, quality-of-service monitoring, watermarking, and image enhancement to applications in the medical and entertainment fields, among others. One of the fundamental approaches is subjective measurement, which, although expensive and time consuming, is still used as a reference for objective methods. Objective methods are usually used as alternatives to reduce cost and time, apart from being easy to implement (Chandler, 2013; Ma et al., 2015). In the following subsections we describe subjective and objective quality evaluation methods in more detail.

Subjective methods

A short list of subjective assessment methods is given in Table 4. Subjective quality measurement is a controlled experiment with human participants to measure the perceived quality of the image or video displayed to the user. In such experiments, the gold standard for benchmarking purposes is human judgment made without the advice of others (Patil and Patil, 2017). Subjective methods can also give insight into human behavior in the context of image quality assessment (Ma et al., 2015). It has long been realized that the task of evaluating image quality involves not only a physiological process but also a psychological aspect. As a consequence, subjective methods also lend themselves to use as a benchmark for various algorithms and methods in image/video processing, including image quality assessment algorithms.

Table 4: Summary of several subjective assessment methods.

Paper | Method | Description
van Dijk et al. (1995) | Category scaling | Numerical category scaling techniques provide an efficient and valid way to obtain a compression ratio versus quality curve and to assess perceived image quality on a much smaller scale
Alpert and Evain (1997) | SSCQE and DSCQE | SSCQE is used to evaluate subjective quality, while DSCQE is used to maintain image quality and the information transmitted
Sheikh et al. (2006) | Double stimulus | The experiment used a double-stimulus methodology to measure quality more accurately for realignment purposes
Redi et al. (2010) | SS and QR | The single stimulus (SS) method presents several weaknesses; the quality ruler (QR) method is worth the implementation effort in terms of consistency and repeatability of scores
Mantiuk et al. (2012) | Forced-choice pairwise comparison | The forced-choice pairwise comparison method results in the smallest measurement variance and thus produces the most accurate results; it is also the most time-efficient, assuming a moderate number of compared conditions
Persson (2014) | QR | The difference in assessment in the study appeared to depend significantly on the perceived similarity between the ruler image and the test image
Nuutinen et al. (2016) | Dynamic reference | The DR method is well suited to experiments that require very accurate results in a short time, because it is more accurate than the ACR method and faster than the PC method
Zhu et al. (2018a, b) | AIT-inspired MOS and PC | Using Arrow's impossibility theorem (AIT), it is shown that the combination of unanimity and independence of irrelevant alternatives (IIA) will produce an 'important subject', which in fact determines the final rating of image quality

Subjective assessment may employ either a single-stimulus or a double-stimulus protocol (Patil and Patil, 2017). In the assessment process, a group of observers is exposed to images of various quality and asked to evaluate them. Each evaluation is recorded as a subjective score, and, for the same image, the scores recorded by different observers are averaged.
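For concreteness, the averaging step can be written in a few lines; the per-observer z-score normalization shown as an option is a common preprocessing step we assume here, not a requirement of any particular protocol.

```python
import numpy as np

def mean_opinion_score(scores):
    """Per-image MOS from raw ratings of shape (num_observers, num_images)."""
    return scores.mean(axis=0)

def z_normalized_mos(scores):
    """Optional variant: z-score each observer's ratings first to reduce
    individual rating bias, then average across observers."""
    z = (scores - scores.mean(axis=1, keepdims=True)) / scores.std(axis=1, keepdims=True)
    return z.mean(axis=0)
```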

Subjective methods are not without their own shortcomings (Hands, 1998). First of all, they can take a long time to carry out, and they are also costly. Since a subjective experiment requires each subject to evaluate every image in the dataset, a session may take hours to finish. To achieve statistical validity, the number of human viewers involved must also be large enough that the results are not obtained by chance. Lately, however, crowdsourcing has also been employed to involve more viewers in the evaluation process in a much shorter time than in a traditional subjective experiment (Kundu et al., 2017a, b). Such crowdsource-based methods are not without their challenges; for example, unlike a traditional subjective experiment in a laboratory environment, there is only limited or even no control over the experimental setup (display device, illumination condition, viewing distance, etc.).

Regardless of the experimental setup used, subjective evaluation is costly because the observers acting as test subjects must be recruited and paid. A traditional subjective experiment may cost even more because the measurement may require a laboratory setup with calibrated, specialized equipment that can be difficult to organize. Subjective evaluation may also not be suitable for certain applications (Winkler, 2005), for example real-time situations where immediate responses are expected.

These problems are the main reasons why researchers turned to objective tests that can provide faster and more practical results.

Objective methods

Objective measurements are increasingly popular for image/video coding comparison. The evaluation is expressed as a mathematical formula that can be computed without human intervention. To obtain a better evaluation, subjective scores from subjective experiments may be used as a reference for these objective models. Objective quality measurement usually takes into account the various types of distortion that may be present in the images: blur, motion blur, edge and contouring artifacts, blocking artifacts, granular noise, jerkiness, the dirty-window effect, and so on. Objective image quality assessment methods lend themselves to various applications such as quality control systems, image processing algorithm benchmarks, and transmission system optimization.

Objective image quality measurement methods can be differentiated based on the technique used to quantify image quality. Quantification can be based on error differences (Narwaria et al., 2015), structural information (Aydin et al., 2008; Ma et al., 2015; Yeganeh and Wang, 2013), or machine learning (Jia et al., 2017).

Quality metrics based on error differences (Narwaria et al., 2015) can benefit from various image processing methods/algorithms in both the spatial and frequency domains; the quality is then quantified based on spatial-temporal or frequency-domain analysis. Other methods (Ma et al., 2015; Yeganeh and Wang, 2013) based on structural similarity information use multi-scale analysis as a measure of signal quality; they use a structural similarity index (SSIM) metric modified with a natural scene statistics (NSS) approach. More recently, there have also been approaches such as Jia et al. (2017) that use saliency-map-based machine learning to improve the performance of NR methods. Such models are not without problems; for example, a significant gap in luminance values was discovered when such a model was applied to HDR images.
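Since several of the structural methods above build on SSIM, it is worth recalling the standard single-scale SSIM index of Wang et al., written for two aligned image patches x and y as:

```latex
\mathrm{SSIM}(x, y) =
\frac{\left(2\mu_x \mu_y + C_1\right)\left(2\sigma_{xy} + C_2\right)}
     {\left(\mu_x^2 + \mu_y^2 + C_1\right)\left(\sigma_x^2 + \sigma_y^2 + C_2\right)},
\qquad C_1 = (K_1 L)^2,\quad C_2 = (K_2 L)^2,
```

where the μ and σ² terms denote local means and variances, σ_xy the local covariance, L the dynamic range of the pixel values, and K1, K2 are small constants (conventionally 0.01 and 0.03). Multi-scale variants apply this index at several resolutions and combine the results.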

The Video Quality Expert Group (VQEG) has listed three basic categories for image/video quality assessment methods. Their categories are based on the availability of reference images. These are full-reference, reduced-reference, and no-reference methods (RRNR-TV Group, 2004; Video Quality Experts Group, 2002; VQEG, 2000).

Full-reference (FR) methods evaluate image/video quality by comparing test images/videos with the original, undistorted versions (Opozda and Sochan, 2014). No-reference (NR) image quality models, on the other hand, try to mimic how the HVS perceives image quality without needing the original reference image; this is sometimes referred to as blind image quality assessment (Patil and Patil, 2017). Reduced-reference (RR) image quality assessment provides a balanced trade-off between the two extremes represented by FR and NR models. RR methods are designed to use only partial data about the reference image to evaluate the processed one. The partial data can take the form of features extracted from the undistorted signal, which are then compared with features extracted from the processed or degraded images (Gunawan, 2006). RR quality assessment was originally proposed to track changes in visual quality that may occur when video is distributed through communication networks.

As a method that relies on overhead data, RR quality evaluation is concerned with the data rate used to transmit this side information. If, for example, a high-data-rate side channel is available, the RR method can use a larger amount of information about the reference images; if the side channel is large enough, it may even be possible to send the whole original reference picture. On the other hand, if the data rate of the side channel is small, the RR method must also be able to work with only a small amount of side information.
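A quick back-of-the-envelope comparison illustrates why the side-channel rate matters; the frame size, bit depth, and feature length below are illustrative assumptions for a hypothetical deployment.

```python
def side_information_bits(width=1920, height=1080, bit_depth=8,
                          feature_len=64, bits_per_feature=32):
    """Bits needed to send a full grayscale reference frame versus a compact
    RR feature vector (e.g., a 64-bin histogram stored as 32-bit values)."""
    full_reference = width * height * bit_depth      # 16,588,800 bits (~16.6 Mbit)
    rr_feature = feature_len * bits_per_feature      # 2,048 bits
    return full_reference, rr_feature

full_bits, rr_bits = side_information_bits()
print(f"full reference: {full_bits / 1e6:.1f} Mbit, RR feature: {rr_bits} bit")
```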

HDR image quality assessment

In this section, some HDR IQA models found in the literature are outlined. The outline covers only the important elements of various FR and NR methods. To the best of the authors' knowledge, there has been no literature to date on HDR IQA in an RR framework. A summary of HDR image quality assessment methods is given chronologically in Table 5.

Table 5: Summary of several HDR IQA methods.

Authors | Methods | Databases | Metrics
Mantiuk et al. (2011) | Full-reference error metrics | LIVE, TID2008 | HDR-VDP-2
Yeganeh and Wang (2013) | Full-reference, tone-mapped images, multi-scale SSIM | Own dataset (Yeganeh and Wang, 2013) | TMQI
Ma et al. (2015) | Full-reference, MEF images | Own dataset (Ma et al., 2015) | MEF-IQA
Kundu et al. (2017a, b) | No-reference, natural scene statistics | ESPL-LIVE | HIGRADE
Jia et al. (2017) | No-reference, DL, convolutional neural networks with saliency maps | LIVE and CSIQ (SDR) | DL-NRIQA
Guan et al. (2018) | No-reference, tensor space, image manifold | Publicly available dataset | TDML with SVR
Ravuri et al. (2019) | Convolutional neural nets, SVM, tone mapping, deep no-reference tone-mapped image quality assessment, NRIQA | ESPL-LIVE and Yeganeh | RcNet
Yue et al. (2020) | Feature extraction; support vector machines; tone-mapped HDR; multi-exposure fused images; no-reference (NR); colorfulness, exposure, naturalness | Publicly available dataset | SVM-based features
Duan et al. (2020) | Local dimming algorithms, image contrast ratio, subjective, objective | Fairchild's | BLD algorithms
Fang et al. (2020) | MEF algorithms; objective quality model; reduced ghosting artifacts; heuristic algorithms; structural similarity | Own dataset and Mantiuk's MEF deghosting images | MEF-SSIM_d
Kim and Kim (2020) | Convolutional neural nets; learning-based RTM scheme; low-complexity reverse tone mapping | Own dataset | RTM scheme, HDR-VQM
Jiang et al. (2020) | Entropy; feature extraction; support vector machines; colorfulness index; tone mapping operators; luminance partition; NRIQA | TMID and ESPL-LIVE | SVR-based
Ellahi et al. (2020) | HMM, TMO, FR | ETHyma | HMM-based similarity measure
Krasula et al. (2020) | TMO, FR, NR, feature naturalness, structural similarity, and feature similarity | Yeganeh, Cadik, and TMIQD | FFTMI, based on SS-II, FN, and FSITM
Wang et al. (2021) | NRIQA, tone-mapped images | TMID and ESPL-LIVE | SVR-based with RBF kernel
Fang et al. (2021) | NRIQA, tone-mapped images, gradient, chromatics, statistics | ESPL-LIVE | VQGC
Full-reference model

There are several FR (full-reference) models for HDR image quality assessment, for example Duan et al. (2020), Krasula et al. (2020), Ma et al. (2015), Mantiuk et al. (2011), and Yeganeh and Wang (2013).

The HDR visual difference predictor (HDR-VDP), proposed by Mantiuk et al. (2005), and its successor HDR-VDP-2 (Mantiuk et al., 2011) are FR methods based on error metrics. These metrics use visual models based on contrast sensitivity under diverse lighting conditions. The models were also tested against psychophysical measurements to select the best parameters that could be adjusted to the data. Some feature-invariant metrics based on structural similarity were also employed by this model.

The tone-mapped quality index (TMQI), proposed by Yeganeh and Wang (2013), is an objective quality evaluation for tone-mapped images in an FR framework. This method combines the multi-scale capability of the structural similarity measure (SSIM) (Wang et al., 2003) with a measure of naturalness. The SSIM component of TMQI is used to evaluate structural weaknesses across images, based on contrast, lighting, and local structure. Naturalness is based on statistics of thousands of images portraying various types of natural scenery. These two parameters are then combined in a manner similar to a weighted sum, taking into account the sensitivity of each parameter to the overall quality.
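The combination used in TMQI is usually written as a weighted power-sum of the structural fidelity term S and the statistical naturalness term N:

```latex
Q = a\,S^{\alpha} + (1 - a)\,N^{\beta},
```

where a balances the two terms and the exponents α and β control the sensitivity of the overall score to each of them; Yeganeh and Wang (2013) report fitted values of roughly a ≈ 0.8, α ≈ 0.3, and β ≈ 0.7.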

MEF-IQA, proposed by Ma et al. (2015), is an FR method specialized for MEF-based images. It also uses multi-scale structural similarity, now combined with structural consistency, and builds on the observation that the HVS is adapted to extract structural information from natural images. MEF algorithms can use MEF-IQA to tune their parameters. MEF-IQA also came with its own subjective data for evaluation; the dataset consists of 17 original pictures subjected to various exposure levels, with both classical and more sophisticated MEF algorithms used to create the resulting fused images.

Another FR model of HDR image quality assessment addresses local dimming algorithms (Duan et al., 2020). This is a full-reference quality assessment technique applied to a number of backlight local dimming (BLD) algorithms, which are usually used to improve the image contrast ratio and provide power efficiency for modern displays. The paper also offers a subjective evaluation procedure for the images generated by each BLD algorithm, in which subjects rank the images according to how natural they look.

Features fusion for natural tone-mapped images quality evaluation (FFTMI) (Krasula et al., 2020) is another tone-mapped HDR image quality assessment method, based on carefully selected perceptually relevant features. The features are combined in a linear fashion to avoid the overfitting that can occur when they are combined using a machine learning technique. Features are grouped into several categories based on the availability of the reference image/feature. From FR models, they used contrast/structure similarity and locally weighted mean phase angle (LWMPA) similarity measures; from NR models, they took contrast, colorfulness, sharpness, aesthetics, saliency, and other estimators not belonging to any previous category. Based on their selection procedure, they arrived at an FFTMI metric derived from the FR TMQI-II structural similarity, the FR feature similarity index for tone-mapped images (FSITM), and NR feature naturalness.

In the study of Ellahi et al. (2020), the hidden Markov model (HMM) is proposed as a similarity measure to assess the perceived quality of TMOs. The findings suggest that the proposed HMM-based method, which emphasizes temporal information, yields better evaluation metrics than traditional approaches based solely on visual-spatial information.

No-reference model

As can be seen from Table 5, there are more NR models than FR models available in the literature; for example Guan et al. (2018), Kundu et al. (2017a, b), Ravuri et al. (2019), Yue et al. (2020), among others.

Blind high dynamic range image quality assessment using deep learning (DL-NRIQA), proposed by Jia et al. (2017), is a no-reference image quality assessment (NRIQA) method that combines deep convolutional neural networks (CNNs) with saliency maps for high dynamic range (HDR) images. Similarly, the HDR image GRADient evaluator (HIGRADE) is an NR model proposed by Kundu et al. (2017a, b). It is based on bandpass measurements in addition to natural scene statistics (NSS), with NSS descriptors employed to construct features. It works on the assumption that HDR processing usually alters the gradient-domain NSS of an image, and the resulting discrepancy can be used by the model to infer quality predictions.

In the study of Ravuri et al. (2019), a no-reference quality assessment technique for tone-mapped images was proposed. The method consists of two stages. In the first, a convolutional neural network (CNN) produces a distortion map from the tone-mapped image. In the second, the distortion map is modeled using an asymmetric generalized Gaussian distribution (AGGD), and the quality score is then estimated from the AGGD parameters with the help of support vector regression (SVR). The distortion map can also be used as a feature to estimate the quality index of tone-mapped images.

The method presented in the study of Yue et al. (2020) uses multiple quality-sensitive features for both MEF- and ITMO-based HDR images. The features are based on colorfulness, exposure, and naturalness, and the metric is developed in the absence of any reference images. SVR is used to bridge the extracted features and the associated subjective ratings for the quality model.
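As an example of the kind of quality-sensitive color feature such a model can feed to an SVR, the sketch below computes a Hasler-Süsstrunk-style colorfulness statistic; it is shown only for illustration and is not necessarily the exact definition used by Yue et al. (2020).

```python
import numpy as np

def colorfulness(rgb):
    """Hasler-Suesstrunk style colorfulness for an RGB image scaled to [0, 1]."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    rg = r - g                          # red-green opponent component
    yb = 0.5 * (r + g) - b              # yellow-blue opponent component
    std_root = np.sqrt(rg.std() ** 2 + yb.std() ** 2)
    mean_root = np.sqrt(rg.mean() ** 2 + yb.mean() ** 2)
    return std_root + 0.3 * mean_root
```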

In the study of Fang et al. (2021), a robust blind quality evaluation method for analyzing the visual characteristics of TMIs using gradient and chromatic statistics (VQGC) is proposed. The method is motivated by the perception mechanism whereby the human visual system (HVS) is sensitive to variations in image structure. They used the gradient magnitude to predict structural distortion accurately, the gradient orientation to measure variation in image structure, and the magnitude and orientation of relative gradients to capture microstructural changes. They also used color-invariant descriptors with local binary patterns (LBP) on four color feature maps to capture the visual degradation of colors. The final quality-aware feature vector is obtained by combining the gradient and chromatic features and is used to assess the perceived quality of TMIs through support vector regression (SVR).

Proposed method framework
Motivation

We can see from the previous section that for HDR imaging there are numerous FR/NR methods, whereas none so far address RR. In contrast, for LDR/SDR images there have been plenty of FR/NR/RR methods for quite some time, as illustrated in the research roadmap in Figure 2. Therefore, our present study focuses on the investigation of reduced-reference objective quality evaluation for HDR images, and in particular on the investigation of usable features for the RR model.

Figure 2:

Our proposed research road map.

Based on the research roadmap, we use a framework like the one given in Figure 3. Our proposed method uses a simple feature based on some derivative of the gradient image (for example, edges, false edges, or contours) as the RR feature. Features built with the framework described in Figure 3 can use not only edge strength but also false contour/edge map information, histograms, or local features in a desired image area (region of interest, ROI) with certain criteria. As part of the RR feature, we may use a false contour/edge map extracted from the luminance image; therefore, a color image must first be converted into a grayscale image before the subsequent steps.

Figure 3:

Research framework for current proposed method.

We noted that similar features based on gradient have been used in previous works on HDR-related quality evaluation reported by others, but only in a full-reference or no-reference framework. In our present study, we would like to investigate how this simple feature can be adopted as an RR feature in an HDR-related quality assessment framework.

We are interested in this feature because we noted that there are notable changes in the edges of HDR images generated using MEF and ITMO. This is illustrated, for example, in Figures 4 and 5, which show an original image in HDR format and its associated globally adjusted MEF-based processed image, taken from the dataset (The University of Texas at Austin, 2006). Comparing these figures, we can see that the global brightness of the processed image is shifted compared with that of the original, which is also reflected in the global shift of their histograms. The gradient images also show differences between the original and the processed image in terms of strength and thickness. Histograms of the gradient, on the other hand, exhibit little difference; they only demonstrate some minor changes.

Figure 4:

Original and test/processed images and their histograms from the dataset. The test images were processed using the global adjustment method.

Figure 5:

The gradient of the original and test/processed images and their histograms. The test images were processed using the global adjustment method.

Therefore, it is reasonable that the reduced-reference approach presented in this paper makes use of a relative comparison of derivatives of gradient images. For example, by comparing the false contour/edge map (FCEM) of a processed image (resulting from MEF or ITMO, for example) with the gradient image derived from the reference image, which we assume contains no artifacts or distortions, one may be able to estimate the quality of the processed image relative to the reference. Any discrepancy in the processed image will show up as an increase or decrease in FCEM strength/magnitude.
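A minimal sketch of this comparison is given below, assuming the RR side information is simply a normalized histogram of gradient magnitudes computed from the reference luminance; the feature choice, bin count, and the L1 histogram distance used as a quality-deviation score are our own illustrative assumptions, not finalized components of the proposed method.

```python
import numpy as np
from scipy import ndimage

def luminance(rgb):
    """Rec. 709-style luminance from an RGB image in [0, 1]."""
    return 0.2126 * rgb[..., 0] + 0.7152 * rgb[..., 1] + 0.0722 * rgb[..., 2]

def gradient_histogram(gray, bins=64, max_mag=8.0):
    """Compact RR feature: normalized histogram of Sobel gradient magnitudes.

    The fixed max_mag keeps reference and test histograms comparable; its
    value is an assumption tied to the Sobel kernel and [0, 1] input range.
    """
    gx = ndimage.sobel(gray, axis=1)
    gy = ndimage.sobel(gray, axis=0)
    mag = np.hypot(gx, gy)
    hist, _ = np.histogram(mag, bins=bins, range=(0.0, max_mag))
    return hist / (hist.sum() + 1e-12)

def rr_gradient_deviation(ref_hist, processed_rgb, bins=64):
    """Relative comparison of edge/contour statistics, as outlined above.

    ref_hist is the side information extracted from the reference image; a
    larger L1 distance indicates a larger deviation in edge strength and, by
    assumption, lower quality of the processed image.
    """
    test_hist = gradient_histogram(luminance(processed_rgb), bins=bins)
    return float(np.abs(ref_hist - test_hist).sum())
```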

Conclusions

We have reviewed various HDR image quality assessment methods in the literature and found that many have focused on the development of FR and NR models. From these models, several perceptual attributes are beneficial for quality assessment purposes: contrast, details, color, and artifacts. Many algorithms also use natural statistics descriptors, feature naturalness, and feature similarity, which lend themselves to no-reference methods. However, we believe that the RR model is also useful in several application scenarios, notably for monitoring purposes, so development of an RR model is still considered necessary. In line with that argument, we have initiated research on the development of an RR model for HDR IQA, following the research roadmap presented in Figure 2. Some of our preliminary results using features based on simple calculations on the images were also given in the previous section, and they show that the proposed method is promising, although there is still room for further improvement.
