Revisiting Strategies for Fitting Logistic Regression for Positive and Unlabeled Data

Adam Wawrzeńczyk; Jan Mielniczuk

Open Access

International Journal of Applied Mathematics and Computer Science

Volume 32 (2022): Issue 2 (June 2022)

Towards Self-Healing Systems through Diagnostics, Fault-Tolerance and Design (Special section, pp. 171-269), Marcin Witczak and Ralf Stetter (Eds.)

Published online: 4 July 2022

Page range: 299 - 309

Received: 5 Nov. 2021

Accepted: 10 Feb. 2022

DOI: https://doi.org/10.34768/amcs-2022-0022

Keywords
positive and unlabeled learning, empirical risk, logistic regression, concave-convex optimization

This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 3.0 License.

Positive unlabeled (PU) learning is an important problem motivated by the occurrence of this type of partial observability in many applications. The present paper reconsiders recent advances in parametric modeling of PU data based on empirical likelihood maximization and argues that they can be significantly improved. The proposed approach is based on the fact that the likelihood for the logistic fit and an unknown labeling frequency can be expressed as the sum of a convex and a concave function, which is explicitly given. This allows methods such as the concave-convex procedure (CCCP) or its variant, the disciplined convex-concave procedure (DCCP), to be applied. We show by analyzing real data sets that, by using the DCCP to solve the optimization problem, we obtain significant improvements in the posterior probability and the label frequency estimation over the best available competitors.
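The convex-plus-concave structure mentioned in the abstract can be illustrated numerically. The sketch below is not taken from the paper. It assumes the model P(s = 1 | x) = c · σ(βᵀx) with label frequency 0 < c < 1, as is common in the PU literature, and it uses a hypothetical reparametrization a = −log(1 − c) chosen here so that each part is jointly convex in (β, a); the decomposition given explicitly in the paper may differ. The function names are illustrative only.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def softplus(z):
    """log(1 + exp(z)), computed in a numerically stable way (convex in z)."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def linear_score(beta, x):
    """Inner product beta^T x."""
    return sum(b * xj for b, xj in zip(beta, x))


def pu_nll_direct(beta, c, X, s):
    """Negative log-likelihood of labels s under P(s=1|x) = c * sigmoid(beta^T x)."""
    total = 0.0
    for x, si in zip(X, s):
        p = c * sigmoid(linear_score(beta, x))
        total -= math.log(p) if si == 1 else math.log(1.0 - p)
    return total


def pu_nll_split(beta, c, X, s):
    """Same objective written as convex_part - subtracted_part.

    Assumption (not from the paper): a = -log(1 - c), so c = 1 - exp(-a), a > 0.
    Both returned parts are convex in (beta, a); minus the second is concave.
    """
    a = -math.log(1.0 - c)  # requires 0 < c < 1
    convex_part = 0.0
    subtracted_part = 0.0
    for x, si in zip(X, s):
        z = linear_score(beta, x)
        if si == 1:
            # -log(c * sigmoid(z)) = softplus(-z) - log(1 - exp(-a))
            convex_part += softplus(-z) - math.log(1.0 - math.exp(-a))
        else:
            # -log(1 - c * sigmoid(z)) = softplus(z) - softplus(z - a)
            convex_part += softplus(z)
            subtracted_part += softplus(z - a)
    return convex_part, subtracted_part


if __name__ == "__main__":
    # Toy data for illustration only: first column is an intercept.
    X = [[1.0, 0.5], [1.0, -1.2], [1.0, 2.0], [1.0, 0.1]]
    s = [1, 0, 0, 1]
    beta, c = [0.3, -0.7], 0.4
    convex_part, subtracted_part = pu_nll_split(beta, c, X, s)
    print(pu_nll_direct(beta, c, X, s), convex_part - subtracted_part)
```

A procedure of the CCCP type would repeatedly linearize the subtracted part around the current iterate and minimize the resulting convex surrogate; the sketch only checks that the split reproduces the objective.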

eISSN: 2083-8492
Language: English

Publication timeframe: 4 issues per year
Journal subjects: Mathematics, Applied Mathematics

© 2022 Adam Wawrzeńczyk et al., published by Sciendo
