Presenting an Innovative Hybrid Machine Learning Model Based on Deep Learning for Predicting Recruitment Decisions

Article type: Research article

Authors

1 Professor, Department of Industrial Management, Faculty of Management, University of Tehran, Tehran, Iran.

2 Ph.D. Candidate, Department of Industrial Management, Faculty of Management, University of Tehran, Tehran, Iran.

10.48308/jimp.16.1.170

Abstract

Introduction and Objectives: In today's competitive era, recruitment decisions can no longer rely solely on human judgment. With the growing volume of data, the complexity of applicant attributes, and the need for high precision in personnel selection, the use of artificial intelligence and machine learning has become a strategic necessity for organizations. Although classical machine learning models such as decision trees and logistic regression have produced acceptable results, they face serious limitations when dealing with imbalanced data, complex structures, and high accuracy requirements. This study aims to design a hybrid machine learning model based on deep learning that combines the strengths of neural networks and advanced ensemble learning algorithms to provide a powerful, accurate, and interpretable model for predicting recruitment decisions.
Methods: The proposed model uses a multilayer stacking structure in which a deep neural network (DNN) and four strong algorithms, namely Random Forest, Gradient Boosting, LightGBM, and CatBoost, serve as base models. Their outputs are passed to XGBoost as the meta-model, which produces the final prediction. The NearMiss method was used to balance the imbalanced dataset, and the TPE algorithm within the Optuna framework was used to tune the hyperparameters. In addition, feature selection was carried out using recursive feature elimination with cross-validation (RFECV) to identify the variables that most influence the hiring decision.
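The stacking idea in the Methods paragraph can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the two base learners (a nearest-centroid scorer and a k-nearest-neighbour scorer) are simple stand-ins for the DNN, Random Forest, Gradient Boosting, LightGBM, and CatBoost models, a small logistic regression stands in for the XGBoost meta-model, and the data are synthetic. Training the meta-model on out-of-fold base predictions is a common stacking practice and is assumed here; the abstract does not state how the base outputs were generated. All function names are hypothetical.

```python
import math
import random

def centroid_score(train_X, train_y, x):
    """Stand-in base learner 1: score in [0, 1], higher when x is nearer the class-1 centroid."""
    centroids = []
    for label in (0, 1):
        rows = [r for r, t in zip(train_X, train_y) if t == label]
        centroids.append([sum(col) / len(rows) for col in zip(*rows)])
    d0, d1 = math.dist(x, centroids[0]), math.dist(x, centroids[1])
    return d0 / (d0 + d1 + 1e-12)

def knn_score(train_X, train_y, x, k=5):
    """Stand-in base learner 2: share of class-1 labels among the k nearest training rows."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(x, pair[0]))[:k]
    return sum(label for _, label in nearest) / k

BASE_LEARNERS = [centroid_score, knn_score]

def out_of_fold_features(X, y, n_folds=5):
    """Level-1 features: every row is scored by base learners fitted without that row's fold."""
    meta = [None] * len(X)
    for fold in range(n_folds):
        kept = [i for i in range(len(X)) if i % n_folds != fold]
        fold_X = [X[i] for i in kept]
        fold_y = [y[i] for i in kept]
        for i in range(fold, len(X), n_folds):
            meta[i] = [learner(fold_X, fold_y, X[i]) for learner in BASE_LEARNERS]
    return meta

def fit_logistic(Z, y, lr=0.5, epochs=300):
    """Stand-in meta-model: logistic regression trained by batch gradient descent."""
    w, b = [0.0] * len(Z[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for z, t in zip(Z, y):
            p = 1.0 / (1.0 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))
            for j, zj in enumerate(z):
                grad_w[j] += (p - t) * zj
            grad_b += p - t
        w = [wi - lr * g / len(Z) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(Z)
    return w, b

def make_blobs(n_per_class, rng):
    """Synthetic two-class data (an assumption; the study used a 1,500-sample recruitment dataset)."""
    X, y = [], []
    for label, center in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    order = list(range(len(X)))
    rng.shuffle(order)
    return [X[i] for i in order], [y[i] for i in order]

rng = random.Random(0)
train_X, train_y = make_blobs(100, rng)
test_X, test_y = make_blobs(50, rng)

# Level 1: out-of-fold base predictions become the meta-model's training features.
train_meta = out_of_fold_features(train_X, train_y)
w, b = fit_logistic(train_meta, train_y)

# At prediction time the base learners use the full training set.
test_meta = [[learner(train_X, train_y, x) for learner in BASE_LEARNERS] for x in test_X]
predictions = [int(sum(wi * zi for wi, zi in zip(w, z)) + b > 0) for z in test_meta]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"stacked accuracy on synthetic data: {accuracy:.2f}")
```

The point of the out-of-fold step is that the meta-model never sees a base prediction made on a row the base learner was fitted on, which limits leakage from the first level into the second.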
Findings: The proposed hybrid model was evaluated on a sample dataset of 1,500 records against 16 well-known machine learning models. The results showed that the proposed model, with an accuracy of 92.47% and an F1 score of 92.12%, outperformed the other models on all key performance metrics: accuracy, precision, recall, and F1 score. Some other models, such as CatBoost and LightGBM, also scored well, but none exceeded the metrics reported for the proposed model. Furthermore, a feature importance analysis performed with XGBoost showed that variables such as "recruitment strategy", "education level", and "interview score" contributed most to predicting the final hiring outcome. These results not only improved the model's effectiveness in predicting recruitment decisions but also, by clarifying the influential factors, gave human resources decision-makers valuable information on which to redesign their recruitment and evaluation policies.
Conclusion: By coherently integrating classical algorithms and deep learning structures in a multilayer stacking architecture, the hybrid machine learning model presented in this study provides a novel and effective framework for accurately predicting recruitment decisions. The model not only showed excellent and stable performance in numerical and comparative tests but also offers practical properties such as interpretability, generalizability, and flexibility. The findings indicate that such hybrid models can fundamentally transform human resources decision-support systems and make the selection and evaluation of job applicants smarter, faster, and fairer. Moreover, combining feature analysis with prediction techniques makes it possible to give recruitment managers targeted, data-driven feedback. Given these promising results, future research is encouraged to use larger datasets and unstructured data such as text resumes and video interviews.


Article Title [English]

A Novel Hybrid Machine Learning Model Based on Deep Learning for Predicting Recruitment Decisions

Authors [English]

  • Mehregan Mohammad Reza 1
  • Arman Rezasoltani 2
  • Amir Mohammad Khani 2
1 Professor, Department of Industrial Management, Faculty of Industrial Management and Technology, College of Management, University of Tehran, Tehran, Iran.
2 Ph.D. Candidate, Department of Industrial Management, Faculty of Management, University of Tehran, Tehran, Iran.
Abstract [English]

Introduction and Objectives: In today's highly competitive environment, recruitment decisions can no longer rely only on human judgment. The increasing volume of applicant data, the complexity of candidate attributes, and the need for high precision in personnel selection have made artificial intelligence (AI) and machine learning (ML) a strategic necessity for organizations. Classical ML models, such as decision trees and logistic regression, give acceptable results, but they are greatly limited when applied to imbalanced datasets, complex data structures, and high accuracy requirements. This study aims to develop a hybrid machine learning model that draws on the strengths of both neural networks and classical algorithms to deliver a powerful, accurate, and interpretable solution for predicting recruitment outcomes.
Methods: A multilayer stacking architecture was used to develop the proposed model, in which a deep neural network (DNN) is employed alongside four high-performing base learners: Random Forest, Gradient Boosting, LightGBM, and CatBoost. XGBoost was used as the meta-learner to produce the final prediction from the outputs of these base models. To handle class imbalance, the NearMiss undersampling technique was applied, and the Tree-structured Parzen Estimator (TPE) algorithm provided by the Optuna framework was used for hyperparameter optimization. Additionally, Recursive Feature Elimination with Cross-Validation (RFECV) was used for feature selection to find the variables most relevant to hiring decisions.
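As an illustration of the undersampling step, the sketch below implements the NearMiss-1 rule in plain Python: majority-class rows are ranked by their mean distance to their k nearest minority-class rows, and only the closest ones are kept until both classes have the same size. The abstract does not say which NearMiss variant or neighbour count was used, so version 1 and k=2 are assumptions, and the function name and toy data are hypothetical.

```python
import math

def near_miss_1(X, y, majority=0, minority=1, k=3):
    """Undersample the majority class down to the minority-class size (NearMiss-1 rule)."""
    minority_rows = [x for x, t in zip(X, y) if t == minority]
    majority_rows = [x for x, t in zip(X, y) if t == majority]

    def mean_dist_to_nearest_minority(x):
        # Mean Euclidean distance from x to its k closest minority rows.
        dists = sorted(math.dist(x, m) for m in minority_rows)[:k]
        return sum(dists) / len(dists)

    # Keep the majority rows that lie closest to the minority class.
    kept = sorted(majority_rows, key=mean_dist_to_nearest_minority)[:len(minority_rows)]
    X_bal = kept + minority_rows
    y_bal = [majority] * len(kept) + [minority] * len(minority_rows)
    return X_bal, y_bal

# Toy data (hypothetical): six "not hired" rows (label 0) and two "hired" rows (label 1).
X = [[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [1.2, 1.0],
     [5.0, 5.0], [6.0, 6.0], [1.5, 1.5], [1.6, 1.4]]
y = [0, 0, 0, 0, 0, 0, 1, 1]

X_bal, y_bal = near_miss_1(X, y, k=2)
print(X_bal)
print(y_bal)
```

On this toy input the two majority rows nearest the hired rows, [1.2, 1.0] and [1.0, 1.0], are retained, giving a balanced set of two rows per class. Because the rule discards majority rows far from the class boundary, it shrinks the training set, which matters for a dataset of only 1,500 samples.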
Findings: The proposed hybrid model was evaluated on a sample dataset of 1,500 samples against 16 well-known machine learning models. The proposed model outperformed the compared models on all key performance metrics (accuracy, precision, recall, and F1 score), reaching an accuracy of 92.47% and an F1 score of 92.12%. Some other models, such as CatBoost and LightGBM, also scored well, but none exceeded the metrics reported for the proposed model. In addition, a feature importance analysis of the same dataset using XGBoost showed that recruitment strategy, education level, and interview score were the major predictors of final hiring decisions. These findings not only helped improve model performance but also give HR decision-makers useful evidence for reviewing the policies and criteria used in recruitment.
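For reference, the four metrics named in the Findings paragraph follow from the confusion-matrix counts in the standard way. The sketch below uses a hypothetical helper name and made-up labels; it does not reproduce the study's data or its reported figures.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 for a binary task, from confusion-matrix counts."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Made-up labels: 1 = hired, 0 = not hired.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.8; precision, recall, and F1 all 0.75
```

On imbalanced data accuracy alone can look high while recall on the minority class is poor, which is one reason to report F1 alongside it.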
Conclusion: This research develops a hybrid machine learning model that combines classical algorithms and deep learning in a stacked architecture, providing an effective framework for accurately predicting hiring outcomes. The model outperformed the benchmark models in comparisons and also offers practical benefits. These findings imply that such hybrid models can reshape intelligent HR systems by making candidate evaluation faster, fairer, and more data-driven. In addition, combining feature analysis with predictive modeling gives HR managers focused, evidence-based feedback. Future work is encouraged to use larger datasets and unstructured data such as resumes and interview videos, together with explainability tools such as SHAP or LIME, to add transparency and build organizational trust in AI-based decision-making systems.

Keywords [English]

  • Hybrid Machine Learning
  • Deep Learning
  • Recruitment Prediction
  • Parameter Optimization
  • Intelligent Human Resources