Task
Students will develop a data mining (DM) solution that lowers the cost of a direct marketing campaign by reducing false positive (wasted call) and false negative (missed customer) decisions. For this assignment, students can consider the following scenario. A bank has decided to reduce the cost of a direct marketing campaign in which a product is offered to clients by phone. A cost-efficient solution is expected to support the campaign by predicting, for a given client profile, whether the client will buy the product.
Examples of cost-efficient DM solutions for direct marketing are provided in the UCI Machine Learning Repository, which describes a Bank Marketing problem.
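The cost of false decisions described in the task can be sketched as a weighted count of the two error types. The Python sketch below is illustrative only: the unit costs `cost_fp` and `cost_fn`, the function names, and the toy labels are assumptions, not values taken from the assignment or from the Bank Marketing data.

```python
def confusion_counts(actual, predicted):
    """Count (TP, FP, FN, TN) for binary labels: 1 = client buys, 0 = does not."""
    tp = fp = fn = tn = 0
    for a, p in zip(actual, predicted):
        if p == 1 and a == 1:
            tp += 1
        elif p == 1 and a == 0:
            fp += 1  # wasted call
        elif p == 0 and a == 1:
            fn += 1  # missed customer
        else:
            tn += 1
    return tp, fp, fn, tn


def campaign_cost(actual, predicted, cost_fp=1.0, cost_fn=5.0):
    """Total cost of false decisions; the unit costs are hypothetical."""
    _, fp, fn, _ = confusion_counts(actual, predicted)
    return fp * cost_fp + fn * cost_fn


# Toy example (made-up labels, not real campaign data).
actual = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
print(confusion_counts(actual, predicted))  # (2, 2, 1, 3)
print(campaign_cost(actual, predicted))     # 7.0
```

With these assumed unit costs a missed customer is five times as expensive as a wasted call, so a solution that trades a few extra calls for fewer missed customers can come out cheaper overall.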
How students will work
Each student is expected to run individual experiments to find an efficient solution and to describe the experimental results in an individual report. Students may work on the assignment task as (i) a group manager, (ii) a group member, or (iii) an individual. If students work in a group, the group manager arranges the comparison and ranking of the designed solutions.
Method and Technology
To design a solution, students will use Data Mining techniques such as Decision Trees. Students are recommended to use R scripting in (i) the cloud platform CoCalc, or (ii) the RStudio development suite or RStudio Cloud, which is free for students. Other scripting languages, such as Python (supported, for example, by the Google Colab online platform), may also be used.
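As a rough illustration of how a Decision Tree chooses a split, the sketch below fits a one-level tree (a decision stump) by minimising the weighted Gini impurity. It is written in plain Python so that it runs without extra packages; it is not the assignment's project script, and the two toy columns (age, call duration in seconds) and all values are hypothetical. In practice a library implementation, such as `rpart` in R or `DecisionTreeClassifier` in scikit-learn, would normally be used.

```python
def gini(labels):
    """Gini impurity of a list of binary labels (0 or 1)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best rule x[f] <= t."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


def majority(labels):
    """Majority class of a leaf; ties go to class 1."""
    return 1 if 2 * sum(labels) >= len(labels) else 0


def fit_stump(rows, labels):
    """Fit a one-level decision tree with a majority-class prediction per leaf."""
    f, t, _ = best_split(rows, labels)
    left = [y for r, y in zip(rows, labels) if r[f] <= t]
    right = [y for r, y in zip(rows, labels) if r[f] > t]
    return {"feature": f, "threshold": t,
            "left": majority(left), "right": majority(right)}


def predict(stump, row):
    """Predict 1 (buys) or 0 (does not buy) for one client profile."""
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]


# Hypothetical toy profiles: (age, call duration in seconds).
rows = [(25, 60), (52, 300), (35, 120), (28, 400), (47, 90), (41, 350)]
labels = [0, 1, 0, 1, 0, 1]
stump = fit_stump(rows, labels)
print(stump)
print(predict(stump, (30, 500)), predict(stump, (60, 30)))
```

On this toy data the stump splits on the second column, because that column separates the two classes perfectly while age does not. A full decision tree repeats the same search recursively inside each leaf.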
|
Project Code and Data
The assignment project code is available as an R Script. The Bank Marketing data set is available as a csv file. Other data sets (Kaggle or UCI) could also be used.
|
Is there a size limit? |
|
2500 words (task 1) & 2500 words (task 2) |
|
What do I need to do to pass? (Threshold Expectations from UIF) |
|
|
How do I produce high quality work that merits a good grade? |
|
|
How does the assignment relate to what we are doing in scheduled sessions? |
|
Data Mining techniques and use cases developed in R will be considered during lectures and tutorials. |
Report submission and report template
Each solution will be evaluated in terms of the cost of the false decisions it makes on the validation data. Reports will be submitted via BREO and can be prepared using a template. BREO similarity in a report must be below 20% (scripting is not counted).
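One way to compare candidate solutions by the cost of their false decisions on validation data is sketched below in Python. The example is a sketch under stated assumptions: the validation scores, the candidate decision thresholds, and the unit costs `cost_fp` and `cost_fn` are all hypothetical and are not taken from the assignment or its template.

```python
def validation_cost(scored, threshold, cost_fp=1.0, cost_fn=5.0):
    """Cost of false decisions when clients with score >= threshold are called.

    `scored` is a list of (predicted_score, actual_label) pairs from the
    validation data; actual_label is 1 if the client bought, else 0.
    """
    cost = 0.0
    for score, actual in scored:
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and actual == 0:
            cost += cost_fp  # wasted call
        elif predicted == 0 and actual == 1:
            cost += cost_fn  # missed customer
    return cost


def rank_by_cost(scored, thresholds, cost_fp=1.0, cost_fn=5.0):
    """Return (cost, threshold) pairs sorted from cheapest to most expensive."""
    return sorted((validation_cost(scored, t, cost_fp, cost_fn), t)
                  for t in thresholds)


# Hypothetical validation scores (e.g. predicted probabilities) and true labels.
validation = [(0.9, 1), (0.8, 1), (0.7, 0), (0.6, 1),
              (0.4, 0), (0.3, 1), (0.2, 0), (0.1, 0)]
ranking = rank_by_cost(validation, thresholds=[0.25, 0.5, 0.75])
print(ranking)  # [(2.0, 0.25), (6.0, 0.5), (10.0, 0.75)]
```

Under these assumed costs the lowest threshold is cheapest, because each missed customer costs more than several wasted calls. With different unit costs the ranking can change, so the costs should be stated explicitly in the report.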