Banking Campaign Predictor

Using the Banking Data - Marketing Targets dataset from Kaggle, my group and I predicted whether a marketing campaign successfully acquired a customer. Here, success was defined as the customer subscribing to a term deposit.

I collaborated on preprocessing and exploratory data analysis (EDA). For preprocessing, I decided which columns to drop and why. I used Label Encoding to convert unranked categorical features into numerical values. I did the same for ranked categorical features, but used maps to assign each unique value a rank. I used the pandas get_dummies method to convert the binary features into usable data. For the EDA portion, I coded a seaborn heatmap to visualize the correlations and used matplotlib to create histograms and boxplots to understand the distribution of the data.
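The three encoding steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample rows are made up, and the column names (job, education, housing, y) and the rank order in education_rank are assumptions chosen for the example.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up sample rows; the column names are assumptions for illustration only.
df = pd.DataFrame({
    "job": ["admin.", "technician", "services", "admin."],
    "education": ["primary", "tertiary", "secondary", "tertiary"],
    "housing": ["yes", "no", "yes", "no"],
    "y": ["no", "yes", "no", "yes"],
})

# Unranked categorical feature: Label Encoding assigns one integer per category
# (categories are numbered in sorted order).
df["job"] = LabelEncoder().fit_transform(df["job"])

# Ranked categorical feature: an explicit map keeps the intended order.
# The ranking below is a hypothetical choice.
education_rank = {"primary": 0, "secondary": 1, "tertiary": 2}
df["education"] = df["education"].map(education_rank)

# Binary features: get_dummies with drop_first=True leaves a single 0/1 column each.
df = pd.get_dummies(df, columns=["housing", "y"], drop_first=True, dtype=int)

print(df)
```

The explicit map is used for the ranked column because Label Encoding numbers categories alphabetically, which would not necessarily match the intended ranking.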

I split the data into training and testing sets and spearheaded the coding of the Random Forest Classifier, which achieved a 92% accuracy score. I used an iterative approach to find the best hyperparameters for the model. I elected to include features that improved the model's accuracy and precision, especially for the minority class. My groupmate and I tested the impact of various hyperparameter combinations on the confusion matrix and classification report to refine the model. After determining the most important features, I used seaborn to visualize the correlation between a successful outcome and these features.
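As a rough sketch of that workflow, the example below splits the data, fits a Random Forest for a few hyperparameter combinations, and prints the confusion matrix and classification report for each. Everything specific here is an assumption: the data is synthetic (generated with make_classification to mimic an imbalanced target), the feature names are placeholders, and the candidate hyperparameter values are hypothetical, not the ones the project settled on.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in for the preprocessed banking data (assumption).
X, y = make_classification(
    n_samples=500, n_features=6, n_informative=3,
    weights=[0.88, 0.12], random_state=0,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # placeholder names

# Stratify so the minority class appears in both the training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical hyperparameter combinations to compare one at a time.
candidates = [
    {"n_estimators": 50, "max_depth": 5},
    {"n_estimators": 100, "max_depth": None},
]

results = []
for params in candidates:
    model = RandomForestClassifier(random_state=0, **params)
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    cm = confusion_matrix(y_test, y_pred)
    print(params)
    print(cm)
    print(classification_report(y_test, y_pred, zero_division=0))
    results.append((params, cm))

# Feature importances from the last fitted model, highest first.
importances = pd.Series(model.feature_importances_, index=feature_names)
importances = importances.sort_values(ascending=False)
print(importances)
```

Reading the per-class precision and recall in the report alongside the confusion matrix shows how each combination treats the minority class, which overall accuracy can hide on imbalanced data.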

When both models were complete, I analyzed the results and synthesized the findings into a concise conclusion about the models' performances. As a group, we created a PowerPoint presentation to explain our approach and findings.


GitHub Repo