Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/595259
Title: | A model for churn prediction based on qualitative support interaction features for hotel technology provider |
Researcher: | De, Soumi |
Guide(s): | Prabu, P |
Keywords: | Active Learning, Churn Prediction, Computer Science, Computer Science Artificial Intelligence, Engineering and Technology, Topic Classification, Uncertainty Sampling. |
University: | CHRIST University |
Completed Date: | 2024 |
Abstract: | Customer retention is a significant driver of a company's growth. Machine learning has gained immense popularity as a means to predict customers at risk of churn. Churn prediction models can highlight customers who are at high risk of churn well in advance. A popular approach to improve the performance of churn prediction models is to use input variables that are mainly quantitative and structured in nature. Few works in the literature investigate smart means to effectively utilize and integrate unstructured data into churn prediction models and study the impact on model efficacy. One of the roadblocks to effectively utilizing unstructured data is the associated cost of annotation, which is time consuming and requires intensive manual effort. To overcome this obstacle, researchers often adopt a semi-supervised approach called active learning that aims to achieve state-of-the-art performance using a minimal number of samples. Although active learning boosts classifier performance, the underlying query strategies are unable to eliminate redundancy in the samples selected for manual annotation. Redundant samples lead to increased cost and sub-optimal performance of the learner. Inspired by this challenge, the study proposes a new representation-based query strategy that selects highly informative and representative subsets of samples for manual annotation. The data comprises messages sent by a set of customers to a service provider. A series of experiments is conducted to analyse the effectiveness of the proposed query strategy, called Entropy-based Min Max Similarity (E-MMSIM), in the context of topic classification for churn prediction. The foundation of E-MMSIM is an algorithm that is popularly used to sequence proteins in protein databases. The algorithm is modified and utilized to select the most representative and informative samples. The performance is evaluated using F1-score, AUC and accuracy. |
Pagination: | x, 89p.; |
URI: | http://hdl.handle.net/10603/595259 |
Appears in Departments: | Department of Data Science |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 166.39 kB | Adobe PDF | View/Open |
02_prelim pages.pdf | | 920.33 kB | Adobe PDF | View/Open |
03_abstract.pdf | | 127.74 kB | Adobe PDF | View/Open |
04_table_of_contents.pdf | | 129.31 kB | Adobe PDF | View/Open |
05_introduction_1.pdf | | 172.41 kB | Adobe PDF | View/Open |
06_literature_review_2.pdf | | 406.25 kB | Adobe PDF | View/Open |
07_research_methodology_3.pdf | | 453.99 kB | Adobe PDF | View/Open |
08_model_for_churn_prediction_4.pdf | | 426.97 kB | Adobe PDF | View/Open |
09_results_and_discussion_5.pdf | | 2 MB | Adobe PDF | View/Open |
10_conclusion_6.pdf | | 152.55 kB | Adobe PDF | View/Open |
11_annexures.pdf | | 202.56 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 314.99 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).