Predicting AI job market dynamics: a data mining approach to machine learning career trends on Glassdoor
11 Jul 2025
About this article
Article category: Research Article
Published online: 11 Jul 2025
Received: 17 Mar 2025
DOI: https://doi.org/10.2478/ijssis-2025-0034
© 2025 Renuka Agrawal et al., published by Sciendo
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Description of attributes present in dataset
S. No. | Name of attribute | Description |
---|---|---|
1. | Job title | The designation of the job being listed, e.g., data scientist, data engineer, machine learning engineer, manager, director, other |
2. | Salary estimate | The estimated salary range for the job, provided by Glassdoor or the employer |
3. | Job description | The full description of the job, including roles, responsibilities, and qualifications |
4. | Rating | Rating of the company, from employee reviews on Glassdoor. Initial reviews range from −1 to 5 |
5. | Company name | The name of the company offering the job, e.g., IBM, New York (United States of America), Adobe, Microsoft |
6. | Location | The location of the job, e.g., Remote (San Jose, CA, USA), (Atlanta, GA, USA) |
7. | Size | The number of employees at the company, e.g., 1,001–5,000 employees, 10,000+ employees |
8. | Founded | The year the company was founded |
9. | Type of ownership | The ownership structure of the company, e.g., private, public, government |
10. | Industry | The specific industry the company operates in, e.g., Telecommunications Services, Chemical Manufacturing, Computer Hardware Development |
11. | Sector | The broader sector associated with the company’s operations, e.g., Education, Information Technology, Manufacturing |
12. | Revenue | The estimated annual revenue of the company in US$ |
Comparison of model performance
Random Forest | 0.9853 | 0.0646 | 0.1966 | 0.8133 | 0.0166 | 0.0646 |
Lasso | 0.8750 | 0.2103 | 0.6402 | 0.0061 | 0.0888 | 0.2103 |
LightGBM | 0.9559 | 0.4441 | 1.3520 | 0.5373 | 0.1160 | 0.4430 |
XGBoost | 0.9963 | 0.9224 | ||||
Voting | 0.0646 | 0.5501 | 0.0117 | 0.1803 |
Work done by different researchers in similar domain
Ref. No. | Methodology used | Domain | Dataset used | Performance/outcome |
---|---|---|---|---|
[ | Linear regression, Lasso, random forest | Salary prediction for data science jobs | Kaggle (Glassdoor) | MAE: random forest 11.22, linear regression 18.86, ridge regression 19.67 |
[ | SVM | Skill-based job recommendation system | Job portals, company websites, and data scraped from other online sources | Accuracy, precision, recall, and F1 score were calculated |
[ | Bidirectional, decoder-encoder, stacked, and Conv LSTM | Trend analysis system to predict future job markets using historical data | Web scraping, manually collected data, government sources | Accuracy: bidirectional LSTM 95.71%, decoder-encoder LSTM 91.56%, stacked LSTM 87.24%, Conv LSTM 83.7% |
[ | NB, KNN, NBST | Predictive analysis | Student employment in the employment market of Chongqing S colleges and universities in the past 3 years | Mean test time (ms): NB 18.607, KNN 22.224, NBST 49.026 |
[ | MNB, SVM, DT, KNN, RF | Job posting classification | Kaggle dataset titled "[Real or Fake] Fake Job Posting Prediction" | MNB 95.6%, SVM 97.7%, DT 97.4%, KNN 97.8%, RF 98.2% |
[ | LR, SVM, KNN, DT, RF, AdaBoost(DT), GB, voting classifier (soft and hard), XGBoost | Campus placement analyzer using supervised machine learning algorithms | Training and placement department of MIT, covering all Bachelor of Engineering (B.E.) students from three colleges on its campus | Accuracy: logistic regression 58%, support vector machine 69%, KNN 63.22%, decision tree 69%, random forest 75.25%, AdaBoost(DT) 77%, gradient boosting 77%, voting classifier (soft) 69.11%, voting classifier (hard) 68.43%, XGBoost 78% |
[ | Voting classifier | Ensemble approach for classifying job positions | Glassdoor website | Voting classifier (soft): 100% |
[ | NB, SGD, LR, KNN, RF classifier | Detecting and preventing fake job offers | Kaggle (real/fake job posting prediction) | Random forest classifier: 97.48% |
[ | NLP, KNN | Resume-based job recommendation system using NLP and deep learning | Combined from multiple sources | Improved efficiency and success rate of the hiring process |
Normalization of column—salary estimate
Original value | Value after normalization (US$ thousands per year) |
---|---|
–1 | 116.0 (median) |
$100 K–$151 K (Glassdoor est.) | 125.5 |
Employer provided salary: $100 K–$120 K | 110 |
Employer provided salary: $107 K | 107 |
Employer provided salary: $60.00 per hr | 140.4 |
Employer provided salary: $53.62–$64.58 per hr | 138.3 |
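
The normalization rows above can be reproduced with a short parsing routine. The following is a minimal sketch, not the authors' implementation. The function name `normalize_salary` and the constant names are hypothetical. The factor of 2,340 working hours per year is not stated in the source; it is inferred from the table, since $60.00 per hr maps to 140.4. The sketch takes the midpoint of a salary range, converts hourly rates to annual figures in thousands of US$, and substitutes the reported median of 116.0 for missing entries coded as −1.

```python
import re

# Assumption: inferred from the table ($60.00 per hr -> 140.4), not stated in the source.
HOURS_PER_YEAR = 2340
# Median reported in the table for missing (-1) salary entries.
MEDIAN_SALARY_K = 116.0

# Matches dollar amounts such as "$100 K", "$107 K", or "$53.62".
_AMOUNT = re.compile(r"\$\s*(\d+(?:\.\d+)?)")


def normalize_salary(raw: str, median_k: float = MEDIAN_SALARY_K) -> float:
    """Return an annual salary in thousands of US$ for one raw salary string."""
    # Treat the typographic minus and en dash as a plain hyphen.
    text = raw.strip().replace("\u2212", "-").replace("\u2013", "-")
    if text == "-1":
        # Missing value: fall back to the median.
        return median_k

    amounts = [float(m.group(1)) for m in _AMOUNT.finditer(text)]
    if not amounts:
        raise ValueError(f"no salary amount found in {raw!r}")

    # A single value is its own midpoint; a range averages its two ends.
    midpoint = sum(amounts) / len(amounts)

    if "per hr" in text.lower():
        # Hourly rate in dollars -> annual salary in thousands of dollars.
        midpoint = midpoint * HOURS_PER_YEAR / 1000.0
    # Otherwise the amount is assumed to be annual and already in thousands ("K").

    return round(midpoint, 1)


if __name__ == "__main__":
    for value in [
        "-1",
        "$100 K\u2013$151 K (Glassdoor est.)",
        "Employer provided salary: $100 K\u2013$120 K",
        "Employer provided salary: $107 K",
        "Employer provided salary: $60.00 per hr",
        "Employer provided salary: $53.62\u2013$64.58 per hr",
    ]:
        print(f"{value!r} -> {normalize_salary(value)}")
```

Taking the midpoint keeps a single representative number per posting, at the cost of discarding the width of the range. If the spread matters for a downstream model, the lower and upper bounds could be kept as separate features instead.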