Cervical cancer is the third most common cancer for women in the world [1]. Screening for cervical cancer is usually performed using a multi-tiered paradigm which begins with the Papanicolaou (Pap) smear with human papillomavirus (HPV) co-testing, followed by colposcopy guided biopsy and prevention of cervical cancer depends on colposcopic detection and treatment of high-grade cervical intraepithelial neoplasia (HG-CIN) in women referred with abnormal cytology. Cervical epithelium is a highly structured and stratified tissue that exhibits changes as it progresses from normal epithelium to HG-CIN. These changes are associated with losses in the layer of flattened epithelial cells close to the surface of the cervix, and increases in both the nuclear/cytoplasmic ratio and the extracellular space. All of these changes caused by the disease will eventually lead to a change in the impedance compared with a normal cervix. As a result, in contrast to colposcopy, the Electrical Impedance Spectroscopy (EIS) can be used as a non-visual technique to image epithelia. The research on HG-CIN detection using EIS had been carried out for many years [24] and the EIS measurement device ZedScanTM (see Fig.1) has been developed for real-time diagnostics [5].
Fig.1
The ZedScan handset for making the EIS measurements used in this paper. The handset is shown placed on the base.

Currently, EIS has been used as an adjunct to colposcopy for HG-CIN detection to improve the diagnostic performance [6]. The impedances are measured with ZedScanTM at 14 frequencies, logarithmically spaced between 76 Hz and 625 kHz. The template matching method has been used for analysing the 14-frequency EIS spectra measured from a maximum of 12 reading sites around the cervix for diagnosis [5,6] where the measured spectra are compared with the ‘template’ spectra generated from the 3-D finite element models of the normal and abnormal cervical tissues and matching between the measured spectra and the templates is made using the least squares method, finally the results from matching are then used to generate a probability index for the detection of HG-CIN.
Complementary to colposcopy, EIS has been shown able to differentiate between normal, pre-cancerous and cancerous tissues. It plays an important role in improving performance of colposcopy-only diagnosis as shown in previous studies [5,6]. The template matching method used in the previous studies for EIS data analysis relies on the template spectra generated from the 3-D finite element models of the normal and abnormal cervical tissues. These 3-D finite element models can be viewed as the computational versions of the associated physical tissue models. Building quality 3-D finite element model to obtain template spectra is a time and effort demanding job, requiring extensive domain knowledge and involving detailed histopathological analysis of normal and diseased tissues, and in some cases this may be difficult. This hinders the extension of EIS-based technique in new areas of medical diagnosis where template spectra are not available. In addition to producing a probability index used for HG CIN detection, it would also be desirable to be able to establish a direct link between this probability index and the associated tissue structure properties as this will provide important information for us to understand the histopathological mechanism that underpins the EIS-based cervical cancer prognosis and diagnosis, and to improve interpretability of the diagnostic results.
The problems mentioned above are addressed in this paper. A EIS data-driven modelling based approach was developed for EIS measurement data analysis. The new approach does not rely on the template spectra and HG CIN detection was formulated as a classification problem where a Cole model-based spectrum curve fitting method was proposed to extract the features from EIS readings and a logistic regression model was employed for performing the classification which revealed the association between the tissue structure changes caused by disease and the changes in the measured EIS through the Cole parameter estimates. The developed approach was also used for a longitudinal EIS data analysis to evaluate prognostic value of the EIS for cervical cancer diagnosis and the results are reported in this paper.
The EIS measurements with confirmed diagnostic outcomes used in the study presented in this paper were taken from 1704 women and there were at least 8 impedance spectra (taken from different reading sites around the cervix) for each individual. For HG CIN detection, the entire population was divided into two groups, those women with confirmed HG CIN which had N=528 ( 30.99% ) and those women without confirmed HG CIN which had N=1176, The objective of the study is to develop a template-free method for separating these two groups using the measured EIS. Among 1176 women with non-HG CIN, 569 women were followed up to three years after their initial colposcopy. Of these, 35 (6.15%) women were found to develop HG-CIN within three follow-up years and 534 women were not. The EIS data of these 569 women were used for a longitudinal study to evaluate prognostic value of the EIS for cervical cancer diagnosis. In this case, the entire population of size 569 was divided into two groups, with one group including all women who had developed HG-CIN within three follow-up years and another group including women who had not. The objective of this longitudinal study is to see if it is possible to identify women who are likely to develop HG-CIN within three follow-up years based on the EIS measurement taken at their initial colposcopy so as to evaluate the prognostic value of EIS for cervical cancer diagnosis.
The basic idea behind the EIS-based template match method for HG-CIN detection as mentioned above is to identify the difference in spectrum shapes between diseased and non-diseased tissues by directly comparing the measured spectra with the template spectra to generate features for diagnosis. In contrast to direct comparison, a model-based spectrum curve fitting approach was proposed in this paper to extract features from EIS readings for diagnosis with an aim to reveal how the tissue structure changes due to disease might be reflected in the measured
EIS, in addition to detecting HG CIN. Specifically, we try to fit a model to the measured spectrum, and then derive the required features for disease detection from the fitted model parameters.
This study was a service evaluation carried out in the Jessop Wing Colposcopy clinic in Sheffield and so no ethical approval was required [6]. All patient data mentioned above was anonymised.
The commonly used model for biological tissue impedance is the Cole equation of the following form [7,2]:
This is an equivalent model that is commonly used by researchers in the field to describe the relationship between the measured tissue impedance
Equation (1) is a nonlinear complex model and spectrum curve fitting for determination of the model parameters can be formulated as a complex nonlinear optimization problem. This can be solved using the trust-region-reflective algorithm [8], subject to the bounds determined with the measured EIS spectra. Fig. 2 below shows some typical results of Cole model-based EIS fitting with the aforementioned algorithm, where solid lines represent the measured spectra and dashed lines represent the model fitted spectra.
Fig.2
Comparison between measured and model fitted EIS

The structure of biological tissue is complex and the impedance change with frequency will depend upon many factors, such as cellular arrangement (layering of cells), extracellular space, cell size, conductivity of extracellular fluid, thickness of cell membrane, electrical properties of cell membrane and so on. When cervical epithelium progresses from normal epithelium to high-grade CIN, the tissue properties mentioned above will also be altered which are reflected in the changes in the measured EIS spectra, hence allowing EIS to be used for disease detection [9]. Ultimately, these changes will lead to changes in the four estimated parameters
A commonly used interpretation [10] of the four Cole model parameters for tissue structure is that the inverse of extracellular volume determines
When used as an adjunct to colposcopy, EIS spectral measurements are made at up to 12 reading sites around cervix (minimum number of sites is 8) of individual women. As the lesion can either be large to cover many sites or be small covering a few sites, or even a single site on cervix, two types of feature were derived from Cole model parameter estimates. The first type of feature consists of the four Cole model parameter estimates (denoted as:
The features defined by (2) can be viewed as a measure of spatial inhomogeneity of the tissue around cervix and are expected to provide information for detecting small lesions presented in a few or just a single reading site. The rationale behind this is that, if there are no lesions around cervix, EIS taken at all sites will have approximately the same shape, thus similar Cole model parameter estimates are expected when performing spectrum curve fitting and the differences defined by equation (2) will be small. However, if a lesion does exist and only presents in a few or a single site, the EIS taken at these sites will significantly differ from those taken at sites where no lesions were present. Hence, the differences defined in equation (2) will be large. To sum up, using both Cole model parameter estimates associated with the mean spectrum and the differences defined by equation (2) (i.e.
Cervical cancer diagnosis, or more specifically, HG CIN detection using EIS can be viewed as a problem of detecting changes in the measured EIS taken around cervix which are caused by the change in tissue structure due to HG CIN. Whereas, the evaluation of prognostic value of EIS for cervical cancer diagnosis can be viewed as a problem of detecting early signs in the EIS taken at the initial colposcopy which is caused by the incipient change in tissue structure as neoplasia develops (i.e. the early stage in the evolution of neoplasia). Both problems are formulated as a classification problem in this paper and machine learning technique (see e.g. [11, 12]) was employed to solve the problem. Specifically, the feature/predictor vector defined as:
derived from the fitted Cole model in last subsection will be used to build a predictive model for solving this classification problem.
The complexity of any predictive model for classification depends on the number of input dimensions (i.e. the number of features to be used). In the last subsection, eight handcrafted features defined in (3) have been derived from the EIS measurements. However, the effect of neoplasia on the four Cole model parameters, hence the features derived, is complex and some of these features may be redundant or not informative. Statistically, it is often more attractive to estimate a simpler model with non-informative features being removed as this usually leads to a reduced estimation variance and improved robustness in prediction, and also prevents over fitting for the given data set of fixed size. From a practical point of view, a simpler model may also be more interpretable. Our early study [13] had shown that using any single feature collected in x was not statistically sufficient to allow a separation of two groups. To this end, multivariate analysis of variance (MANOVA) [14] was used for evaluating and ranking the capability of the various combinations of the derived features collected in
Feature combinations | Feature combinations | ||
---|---|---|---|
1.1003 × 10−31 | 5.0124 × 10−31 | ||
1.5861 × 10−31 | 5.3276 × 10−31 | ||
2.6287 × 10−31 | 6.5955 × 10−31 | ||
3.1665 × 10−31 | 7.2683 × 10−31 | ||
3.4687 × 10−31 | 7.4318 × 10−31 |
Feature combinations | Feature combinations | ||
---|---|---|---|
0.0168 | 0.0286 | ||
0.0231 | 0.0295 | ||
0.0256 | 0.0296 | ||
0.0274 | 0.0314 | ||
0.0275 | 0.0335 |
Table 1 shows the results for comparing the multivariate means of the different combination of features from the two groups of women (i.e. HG CIN
Similarly, Table 2 below shows the results for comparing the multivariate means of the different combinations of features from the two groups of women (i.e. HG CIN developed
It can be seen that, in comparison with Table 1, the
As discussed above, the problems to be solved in this study can be viewed as a binary classification problem, once the features to be used for classification are determined, classification models can be trained using the available EIS measurements. There are many machine learning algorithms that can be used to solve the classification problem. Logistic regression, an established and widely used classification method in medical/clinical data analysis [15] for disease diagnosis, was selected in this study to solve our problems of HG CIN detection and evaluating the prognostic value of EIS due to its simplicity and interpretability.
Logistic regression is concerned with direct modelling the posterior probability
where a(
and the regression coefficient vector 𝛽 (with up to 9 elements i.e.
where
However, a major challenge for using the above model in this study is the determination of the model structure and evaluating the performance of the corresponding model with a class-imbalanced EIS data set of limited size. The problem is particularly severe in the data set used for the longitudinal study to evaluate the prognostic value of EIS, where the number of women who developed HG-CIN within three follow-up years in the whole population is very small (35 of 569). Hence simple partitioning of the data into two (i.e. training and test) sets for building and validating model may not work as this is likely to result in substantially different class distributions between the training and test sets and even no HG-CIN sample at all in some sets. To overcome this difficulty,
As can be seen, the logistic regression model defined by equations (4) and (6) is computationally simple. The posterior probability
Informed consent has been obtained from all individuals included in this study.
The research related to human use has been complied with all relevant national regulations, institutional policies and in accordance with the tenets of the Helsinki Declaration, and has been approved by the authors’ institutional review board or equivalent committee.
Following the discussion in the last section, the area under the receiver operating characteristic (ROC) curve (abbreviated as AUC), a commonly used index for measuring the performance of classifier [16], together with the stratified
The EIS device ZedScanTM has been developed as an adjunct diagnostic device to be used alongside colposcopy to provide an objective assessment of the cervical epithelial tissue in real time so that the colposcopist can take the ZedScan results into account when reaching their decision on patient management. With the current template matching method for HG CIN detection, the EIS device is programmed so that the threshold used for any given patient will depend upon the referral cytology result and also whether the colposcopist has identified the presence of HG CIN (i.e. colposcopic impression (CI), see [5]). In other words, the clinical information (i.e. referral cytology result and CI which can be viewed as two qualitative variables that taken values of either HG CIN or non-HG CIN) has been integrated into the template matching-based diagnostic decision making procedure when using ZedScanTM in clinic practice. The new logistic regression classification-based method for HG CIN detection developed in this paper will be used in a similar way with the same setting. Hence in addition to the quantitative features/predictors defined in (3), the qualitative clinical information mentioned above also need to be incorporated into the logistic regression model. This can be done with two dummy variables (denoted as CI and Ref hereafter) that take on two numerical values (e.g. 1= HG CIN and 0=non-HG CIN) and the full expression of a(
where
Extensive studies have been carried out to evaluate the diagnostic performance of logistic regression models with different structures, i.e. the models constructed with different polynomial terms (up to degrees 3) of the selected features in x using the stratified 2-fold cross validation procedure with a 1000/704 training/testing split described previously. To reduce the uncertainty in the performance estimates, the procedure was repeated 10 times for each model and a different splitting of the dataset into 2 folds was implemented (via random permutation of data points in two groups respectively) for each repetition. The AUCs of ROC for the testing data sets of each repetition were summarized in Table 3 below, where three models with the best mean AUC values over 10 repetitions for polynomial degrees 1, 2, and 3 respectively are listed and the terms of the regression model are specified in the first row of the table.
AUC values for testing sets from 10 repeated two-fold cross validation runs with three logistic regression models
Repetitions | |||
---|---|---|---|
1 | 0.9127 | 0.9160 | 0.9165 |
2 | 0.9177 | 0.9178 | 0.9190 |
3 | 0.9013 | 0.9034 | 0.9045 |
4 | 0.9165 | 0.9210 | 0.9238 |
5 | 0.8830 | 0.8840 | 0.8858 |
6 | 0.9206 | 0.9222 | 0.9230 |
7 | 0.9164 | 0.9165 | 0.9172 |
8 | 0.9053 | 0.9061 | 0.9075 |
9 | 0.9122 | 0.9146 | 0.9155 |
10 | 0.9181 | 0.9215 | 0.9222 |
Mean AUC | 0.9104 | 0.9123 | 0.9135 |
It can be seen that the model corresponding to the Column 4 of Table 3 has the largest mean AUC value among three models, so the final model structure is specified by the first row of Column 4 and a(
Once the final model structure is determined, all the coefficients in (7) can be estimated using the entire 1704 EIS data and the resulting estimated coefficients and the associated
Regression coefficient estimates and the associated
estimates | ||
---|---|---|
-2.9619 | 3.9518 × 10−32 | |
−7.4684 × 10−11 | 0.0047 | |
3.3987 | 0.0090 | |
3.0025 × 10−11 | 0.0044 | |
2.3621 | 3.9281 × 10−47 | |
2.2241 | 5.8068 × 10−35 |
To validate the new method for HG CIN detection developed above and to compare the performance of the new method with that of the template match method currently used, the new method with the final model (7) and coefficients given in Table 4 was applied to a new set of EIS data from Royal Free Hospital in London. The size of this new data set was relatively small with severe class-imbalance (17.12% of HG CIN), N= 111 patients. The ROC curve from new method is shown in Figure 3, For comparison, the ROC curves from the template match method currently used, as well as colposcopy only are also displayed in Figure 3, where the blue line is the ROC from the new method with the logistic regression model specified by equation (7) and AUC=0.83524, the red line was the ROC from template match method with AUC=0.81808.
Fig.3
ROC comparison between new method, template match method and colposcopy only.

It can be seen that the new method can achieve similar performance as the template match method and both of them outperform colposcopy alone. Figure 3 shows a clear improvement in diagnostic performance when EIS is used (with either the new method developed in this paper or the current template match method) alongside colposcopy in comparison with colposcopy alone.
The research on the evaluation of prognostic value of EIS carried out in this paper is the continuation of the study presented in [10]. All the women in the study had a negative outcome at their initial colposcopy and were then followed up for three years. The main objective of the research was to see if we were able to identify any increased risk of HG-CIN developing over the follow-up years based on the EIS readings taken at the initial colposcopy so as to evaluate the prognostic value of EIS readings.
The stratified 5-fold cross validation procedure discussed previously was applied to the data set taken from 569 women who had been followed up to three years so as to determine the final model to be used for evaluating the prognostic value of the EIS and the results were summarized in Table 5 and 6 below, where Columns 1 and 3 of these tables specify the feature combinations used for building the logistic regression models and columns 2 and 4 show the corresponding mean AUC values from 100 repeated 5-fold cross validation runs. A different partitioning of the dataset into 5 folds was implemented (via random permutation of data points in two groups respectively) for each run.
Mean AUC values from 100 5-fold cross validation runs with linear logistic regression models
Feature combinations | Mean AUC | Feature combinations | Mean AUC |
---|---|---|---|
0.5870 | 0.5723 | ||
0.5777 | 0.5716 | ||
0.5745 | 0.5715 | ||
0.5744 | 0.5686 | ||
0.5736 | 0.5678 |
Mean AUC values from 100 5-fold cross validation runs with nonlinear logistic regression models
Feature combinations | Mean AUC | Feature combinations | Mean AUC |
---|---|---|---|
0.6103 | 0.5911 | ||
0.5992 | 0.5899 | ||
0.5989 | 0.5895 | ||
0.5946 | 0.5891 | ||
0.5939 | 0.5885 |
Table 5 shows the ten linear combinations of features for building logistic regression models that have the largest mean AUC values among all possible linear combinations of 8 features defined in (3). As can be seen from Table 5, including more features in the linear logistic regression model does not necessarily improve the classification performance and the best linear logistic regression model (in terms of mean AUC value) is constructed with
Table 6 shows the ten nonlinear combinations of features for building the logistic regression models that have the largest mean AUC values among all possible nonlinear combinations of (up to the second order polynomial) 8 features. Similarly, it can be seen from Table 6, using more features/or polynomial terms in the logistic regression model does not necessarily improve the classification performance and the best nonlinear logistic regression model (in terms of mean AUC value) is constructed with the polynomial terms
Fig.4
2-D histogram of α̅-Δα data points from two groups

Once the “winning” model structure was determined, we could then train this model with the whole date set to finalize our classification model and determine the optimal operating point (OOP). In this case, the “winning” model was constructed with the polynomial terms
The OOP was chosen in this study such that Youden index [17]
Fig.5
An ROC curve of final model for separating two groups with OOP and the associated performance indices

In the previous study reported in [10], two single features derived directly from mean spectra of individual women, i.e. the impedance at 152Hz and the slope of the EIS spectra between frequencies 1.22 and 2.44kHz (used as a proxy for
Classification performance comparison between the new classifier developed and the previous classifiers
Classifier | AUC | Sensitivity | Specificity |
---|---|---|---|
Logistic regression | 0.628 | 45.714% | 82.022% |
Impedance at 152Hz | 0.621 | 38.7% | 83.4% |
Slope (between 1.22 | 0.596 | 45.2% | 70.1% |
and 2.44kHz) as α |
In Table 7, the sensitivity and specificity were calculated at the OOP determined from ROC curves of the corresponding classifiers. It can be seen from Table 7, overall, the performances from the logistic regression classifier developed in this paper and the previous classifiers are comparable, but the new classifier can achieve relatively balanced sensitivity and specificity. More importantly, with
the new classifier, the possibility of women who could develop HG CIN within follow-up years is expressed as an explicit function of the features
The two objectives of the research in this paper were: 1.) to develop a template-free EIS data analysis method for disease detection to enable the EIS–based techniques to be used for new areas of medical diagnosis where the template spectra are not available; 2.) In addition to being template-free, the developed method should also provide information on how the changes in cervical tissue structure/property due to disease could be reflected in the changes of the observed EIS spectra, this would ultimately help us to better-understand the mechanism that underpins the EIS-based disease detection. To achieve the first objective, a data-driven approach in combination with machine learning, or more specifically classification, techniques were employed in this study to develop the new EIS data analysis method. To achieve the second objective, a Cole model-based spectrum curve fitting approach was developed to extract features from EIS readings for classification and a logistic regression technique was used to build interpretable classification models for HG CIN detection and evaluation of prognostic value of EIS. This enabled us to associate the probability of HG CIN being present, or developing HG CIN later, with the change in tissue structure due to disease via Cole parameter estimates.
Two logistic regression models, as specified by equations (7) and (8), were developed using real service EIS data from the Jessop Wing Colposcopy clinic in Sheffield, one for HG CIN detection and another one for evaluation of prognostic value of EIS. With the logistic regression model specified by (7), the probability of HG CIN being present given the EIS readings is expressed as an explicit function of the features
The new method had been validated using a set of real EIS data from Royal Free Hospital in London and the classification performance was comparable to that of the template match method currently used with the EIS device ZedScanTM , This demonstrates the usefulness of the methodology and the associated core algorithms developed. As the new method is purely data driven, it can readily be extend to other areas of medical diagnosis where the template spectra are not available, e.g. oral cancer diagnosis [18]. In addition, it can be observed from Figure 3 that, though the new method and the template match method offer similar classification performance overall, there are some subtle differences. The new method tends to have a slightly higher specificity, whereas template match method tends to have a slightly higher sensitivity. Based on this observation, it might be possible to improve the overall diagnostic performance by combining or integrating two methods together. This can be done by taking the score from the template match method as an extra feature to build and train a new logistic regression model for classification. This is another research topic that is being carried out by the authors, but it is out of the scope of this paper.
The logistic regression model specified by equation (8) had been developed with the data set for evaluation of the prognostic value of EIS. It shows that the increased risk of developing HG CIN within follow-up years essentially depends on the handcrafted features
A weakness in this study is that the classification performance with the model specified by equation (8) was not so great as can be seen in Figure 5 in comparison with that of Figure 3 for HG CIN detection. This is, in some extent, expected and in agreement with the previous result from multivariate analysis of variance because it is more difficult to identify early signs caused by the incipient change in tissue properties as neoplasia evolves at its early stage than to detect signs that caused by a severe neoplasia or HG CIN. Another contributing factor is the limited data set available, in particular, the small portion of women who developed HG CIN within the follow-up years in the study population. It also needs to be pointed out that the result presented in Figure 5 is based on EIS only, it may be possible to further improve performance by incorporating some clinical information into the classification model as we did for HG CIN detection. Nevertheless, the results do reach statistical significance, hence EIS does contain prognostic information on evolving cervical neoplasia, which provides important information that should be useful for the development of a practical patient management scheme following a negative colposcopy.
To sum up, the two main novelties of the methodology developed in this paper are: 1.) to introduce a Cole model-based spectrum curve fitting approach to extract features from EIS readings for classification. This allows the increased risk of HG CIN being present or developing HG CIN to be associated with the changes in tissue structure due to disease and helps us to understand the underpinning mechanism of EIS-based disease detection. 2.) to introduce the maximum differences of the Cole model parameter estimates over all reading sites around the cervix as features, in addition to the Cole parameter estimates of the mean spectra. These maximum differences can be viewed as a measure for the spatial inhomogeneity of tissue around cervix and allow the small or incipient lesions to be detected. The signs due to these small or incipient lesions could be smoothed out by averaging or covered by diversity of conditions between individual patients, hence may be difficult to be detected using features derived from the mean spectra alone.
The single dispersion Cole equation had been used in this study for feature extraction. Because it appeared to be the case that EIS spectra taking from cervical tissue were dominated by a single dispersion. However, it needs to be pointed out that in other applications such as oral cancer diagnosis, there may be more than one identifiable dispersion. In such a case, it may be necessary to use a multiple dispersion Cole equation for feature extraction.
Fig.1

Fig.2

Fig.3

Fig.4

Fig.5

AUC values for testing sets from 10 repeated two-fold cross validation runs with three logistic regression models
Repetitions | |||
---|---|---|---|
1 | 0.9127 | 0.9160 | 0.9165 |
2 | 0.9177 | 0.9178 | 0.9190 |
3 | 0.9013 | 0.9034 | 0.9045 |
4 | 0.9165 | 0.9210 | 0.9238 |
5 | 0.8830 | 0.8840 | 0.8858 |
6 | 0.9206 | 0.9222 | 0.9230 |
7 | 0.9164 | 0.9165 | 0.9172 |
8 | 0.9053 | 0.9061 | 0.9075 |
9 | 0.9122 | 0.9146 | 0.9155 |
10 | 0.9181 | 0.9215 | 0.9222 |
Mean AUC | 0.9104 | 0.9123 | 0.9135 |
Mean AUC values from 100 5-fold cross validation runs with linear logistic regression models
Feature combinations | Mean AUC | Feature combinations | Mean AUC |
---|---|---|---|
0.5870 | 0.5723 | ||
0.5777 | 0.5716 | ||
0.5745 | 0.5715 | ||
0.5744 | 0.5686 | ||
0.5736 | 0.5678 |
p-values from MANOVA using EIS data taken from 1704 women for HG CIN detection
Feature combinations | Feature combinations | ||
---|---|---|---|
1.1003 × 10−31 | 5.0124 × 10−31 | ||
1.5861 × 10−31 | 5.3276 × 10−31 | ||
2.6287 × 10−31 | 6.5955 × 10−31 | ||
3.1665 × 10−31 | 7.2683 × 10−31 | ||
3.4687 × 10−31 | 7.4318 × 10−31 |
Mean AUC values from 100 5-fold cross validation runs with nonlinear logistic regression models
Feature combinations | Mean AUC | Feature combinations | Mean AUC |
---|---|---|---|
0.6103 | 0.5911 | ||
0.5992 | 0.5899 | ||
0.5989 | 0.5895 | ||
0.5946 | 0.5891 | ||
0.5939 | 0.5885 |
Regression coefficient estimates and the associated p-values for the final logistic regression model
estimates | ||
---|---|---|
-2.9619 | 3.9518 × 10−32 | |
−7.4684 × 10−11 | 0.0047 | |
3.3987 | 0.0090 | |
3.0025 × 10−11 | 0.0044 | |
2.3621 | 3.9281 × 10−47 | |
2.2241 | 5.8068 × 10−35 |
Classification performance comparison between the new classifier developed and the previous classifiers
Classifier | AUC | Sensitivity | Specificity |
---|---|---|---|
Logistic regression |
0.628 | 45.714% | 82.022% |
Impedance at 152Hz | 0.621 | 38.7% | 83.4% |
Slope (between 1.22 | 0.596 | 45.2% | 70.1% |
and 2.44kHz) as α |
p-values from MANOVA using EIS data taken at initial colposcopy of 569 women for evaluation of prognostic value of EIS
Feature combinations | Feature combinations | ||
---|---|---|---|
0.0168 | 0.0286 | ||
0.0231 | 0.0295 | ||
0.0256 | 0.0296 | ||
0.0274 | 0.0314 | ||
0.0275 | 0.0335 |