Loan Eligibility Prediction Python Machine Learning Project

Subscribe YouTube For Latest Update Click Here

Latest Machine Learning Project with Source Code

Buy Now ₹1501

Loan approval is a very important process for banking organizations. The system approves or rejects loan applications. Recovery of loans is a major contributing parameter in the financial statements of a bank, and it is very difficult to predict the likelihood that a customer will repay a loan. In recent years many researchers have worked on loan approval prediction systems. Machine Learning (ML) techniques are very useful for predicting outcomes from large amounts of data.

Key Features

  • Interface to predict loan application approval
  • Data insights within Jupyter Notebook
  • Trained model
  • Multiple machine learning algorithms

Technology :

  • Flask==1.1.1
  • html5lib==1.0.1
  • json5==0.8.5
  • jsonify==0.5
  • numpy==1.16.5
  • pandas==0.25.1
  • scikit-image==0.15.0
  • scikit-learn==0.21.3
  • scipy==1.3.1
  • gunicorn==19.9.0
  • requests==2.22.0

Loan Defaulter Prediction Machine Learning Projects

Buy Now ₹1501

This project uses supervised machine learning to train a model on credit default data to determine the probability and/or classification (“default” vs “non-default”) of the user’s liability. The UI takes user input such as education level, sex, marital status, payment history and income, and returns a classification.
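
The step from a probability to a “default” / “non-default” label can be sketched as below. This is a minimal illustration in plain Python that assumes a logistic model; the feature names, weights, bias and the 0.5 threshold are hypothetical placeholders, not values from the project's trained model.

```python
import math

# Hypothetical coefficients for illustration only; a real model would learn
# these from credit default data.
WEIGHTS = {"late_payments": 0.8, "income_lakh": -0.3}
BIAS = -1.0


def default_probability(applicant):
    """Return P(default) from a logistic model over the applicant's features."""
    score = BIAS + sum(WEIGHTS[name] * applicant[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-score))


def classify(applicant, threshold=0.5):
    """Map the probability to the two labels used in the description."""
    if default_probability(applicant) >= threshold:
        return "default"
    return "non-default"


risky = {"late_payments": 4, "income_lakh": 2}
safe = {"late_payments": 0, "income_lakh": 10}
print(classify(risky), classify(safe))
```

A lender could raise or lower the threshold depending on how costly a missed default is compared with a wrongly rejected applicant.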

An app like this would be useful for financial and lending institutions to understand and manage the risk of their loans and lending portfolios.

Goals/Outcome

  • Determining probability of user liability
  • Creating an interactive UI that will take users input and return an output
  • Determining whether a neural network or logistic regression is the better model for classification

Models Created

  • Logistic Regression
  • Random Forest Model
  • Deep Neural Network

About

Probability of Credit Card Default, Machine Learning

Technologies Used : -

  • beautifulsoup4==4.6.0
  • certifi==2018.4.16
  • chardet==3.0.4
  • click==6.7
  • Flask==1.0
  • gunicorn==19.8.0
  • idna==2.6
  • itsdangerous==0.24
  • Jinja2==2.10
  • MarkupSafe==1.0
  • numpy==1.14.3
  • pandas==0.22.0
  • python-dateutil==2.7.2
  • pytz==2018.4
  • requests==2.18.4
  • scikit-learn==0.19.1
  • scipy==1.0.1
  • six==1.11.0
  • SQLAlchemy==1.2.7
  • urllib3==1.22
  • Werkzeug==0.14.1

Used Car Price Prediction Using Machine Learning

Buy Now ₹1501

Car price prediction is an interesting machine learning problem, as many factors influence the price of a car in the second-hand market. In this competition, we look at a dataset based on the sale/purchase of cars, where our end goal is to predict the price of a car given its features in order to maximize profit.
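
As a rough illustration of predicting a price from a feature, here is a minimal least-squares sketch in plain Python with a single feature. The toy data (car age vs. resale price) is hypothetical, and the project's own model and feature set may differ.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x


# Hypothetical toy data: car age in years vs. resale price in lakh rupees.
ages = [1, 2, 3, 4, 5]
prices = [9.0, 8.0, 7.0, 6.0, 5.0]

slope, intercept = fit_line(ages, prices)
predicted = slope * 6 + intercept  # estimated price of a 6-year-old car
print(slope, intercept, predicted)
```

With real listings, more features (mileage, fuel type, brand) would be combined in the same way by a multi-feature regressor.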

Datasets Link - Kaggle Data 

Technologies Used : -

  1. Python 3.7
  2. Pandas
  3. Numpy
  4. Flask

Running the web app

Locally

  • Install requirements
    pip install -r requirements.txt
  • Run flask web app
    python app.py

Skin cancer Detection using Machine learning

Buy Now ₹1501

Buy Now Project Report ₹1001

The purpose of this project is to create a tool that, given the image of a mole, can calculate the probability that the mole is malignant.

Skin cancer is a common disease that affects a large number of people. Some facts about skin cancer:

Every year there are more new cases of skin cancer than the combined incidence of cancers of the breast, prostate, lung and colon.

An estimated 87,110 new cases of invasive melanoma will be diagnosed in the U.S. in 2017.

The estimated 5-year survival rate for patients whose melanoma is detected early is about 98 percent in the U.S. The survival rate falls to 62 percent when the disease reaches the lymph nodes, and 18 percent when the disease metastasizes to distant organs.

Development process and Data

The idea of this project is to construct a CNN model that can predict the probability that a specific mole is malignant.

Data: Skin cancer Detection using Machine learning

To train this model I'm planning to use a set of images from the International Skin Imaging Collaboration:

Melanoma Project ISIC https://isic-archive.com.

The specific datasets to use are:

ISICUDA-21: Moles and melanomas. Biopsy-confirmed melanocytic lesions. Both malignant and benign lesions are included.

Benign: 23

Malign: 37

ISICUDA-11: Moles and melanomas. Biopsy-confirmed melanocytic lesions. Both malignant and benign lesions are included.

Benign: 398

Malign: 159

ISICMSK-21: Benign and malignant skin lesions. Biopsy-confirmed melanocytic and non-melanocytic lesions.

Benign: 1167 (Not used)

Malign: 352

ISICMSK-12: Both malignant and benign melanocytic and non-melanocytic lesions. Almost all images confirmed by histopathology. Images not taken with modern digital cameras.

Benign: 339

Malign: 77

ISICMSK-11: Moles and melanomas. Biopsy-confirmed melanocytic lesions, both malignant and benign.

Benign: 448

Malign: 224

In summary, the total images to use are:

Benign images: 1208

Malign images: 849
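
The totals can be checked from the per-dataset counts listed above. A short sketch, assuming (as the "Not used" note suggests) that only the benign images of ISICMSK-21 are excluded:

```python
# (benign, malign) counts per dataset, copied from the list above.
counts = {
    "ISICUDA-21": (23, 37),
    "ISICUDA-11": (398, 159),
    "ISICMSK-21": (1167, 352),
    "ISICMSK-12": (339, 77),
    "ISICMSK-11": (448, 224),
}
# The benign images of ISICMSK-21 are marked "Not used" in the list.
unused_benign = {"ISICMSK-21"}

total_benign = sum(
    benign for name, (benign, _) in counts.items() if name not in unused_benign
)
total_malign = sum(malign for _, malign in counts.values())
print(total_benign, total_malign)  # 1208 849
```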

Some sample images are shown below:

  1. Sample images of benign moles
  2. Sample images of malign moles

Preprocessing:

The following preprocessing tasks are going to be developed for each image:

  1. Visual inspection to detect low-quality or unrepresentative images
  2. Image resizing: transform images to 128x128x3
  3. Crop images: automatic or manual crop
  4. Others to be defined later in order to improve model quality
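
The cropping and resizing steps can be sketched in plain Python as below. This is only an illustration: the centered square crop and nearest-neighbour resize are assumptions (the text leaves the crop method open), the input image is synthetic, and a real pipeline would normally use an image library.

```python
def center_crop(image):
    """Crop the largest centered square from an image stored as rows of pixels."""
    height, width = len(image), len(image[0])
    side = min(height, width)
    top = (height - side) // 2
    left = (width - side) // 2
    return [row[left:left + side] for row in image[top:top + side]]


def resize_nearest(image, out_h, out_w):
    """Nearest-neighbour resize; each pixel is an (R, G, B) tuple."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


# Hypothetical 200x300 RGB image filled with a gradient, standing in for a photo.
photo = [[(x % 256, y % 256, 0) for x in range(300)] for y in range(200)]
square = center_crop(photo)
resized = resize_nearest(square, 128, 128)
print(len(resized), len(resized[0]), len(resized[0][0]))  # 128 128 3
```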

CNN Model:

The idea is to develop a simple CNN model from scratch and evaluate its performance to set a baseline. The next steps to improve the model are:

  1. Data augmentation: rotations, noising, scaling to avoid overfitting
  2. Transfer learning: using a pre-trained network (VGG-16 or other), add some additional layers at the end and fine-tune the model
  3. Others to be defined
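
Two of the augmentations mentioned above, rotation and noising, can be sketched on a small grayscale grid as below. The patch values, the noise level and the fixed seed are hypothetical; a real training pipeline would typically rely on the augmentation utilities of its deep learning framework.

```python
import random


def rotate90(image):
    """Rotate a 2D grid of pixel values 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def add_noise(image, sigma, rng):
    """Add Gaussian noise to each pixel and clip to the 0..255 range."""
    return [
        [min(255, max(0, round(p + rng.gauss(0, sigma)))) for p in row]
        for row in image
    ]


# Hypothetical 2x3 grayscale patch.
patch = [[10, 20, 30], [40, 50, 60]]
rotated = rotate90(patch)
noisy = add_noise(patch, sigma=5.0, rng=random.Random(0))
print(rotated)  # [[40, 10], [50, 20], [60, 30]]
```

Each augmented copy keeps the original label, so the model sees more variation without new biopsies being needed.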

Model Evaluation:

To evaluate the different models we will use ROC curves and the AUC score. To choose the final model we will evaluate precision and accuracy, and set the threshold level that represents a good tradeoff between TPR and FPR.
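
A minimal sketch of the ROC/AUC evaluation in plain Python follows. The labels and scores are hypothetical, and picking the threshold with the largest TPR minus FPR is just one possible way to define a "good tradeoff"; the project may use a different rule.

```python
def roc_points(labels, scores):
    """Return (threshold, FPR, TPR) per distinct score, highest threshold first."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule, starting from (0, 0)."""
    area, prev_fpr, prev_tpr = 0.0, 0.0, 0.0
    for _, fpr, tpr in points:
        area += (fpr - prev_fpr) * (tpr + prev_tpr) / 2
        prev_fpr, prev_tpr = fpr, tpr
    return area


# Hypothetical labels (1 = malign) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]

points = roc_points(labels, scores)
best = max(points, key=lambda p: p[2] - p[1])  # largest TPR - FPR
print(auc(points), best[0])  # 0.875 0.7
```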

python 3.6.8

Heart Disease Prediction using Machine Learning Project

Buy Now ₹1501

Buy Now Project Report ₹1001

Introduction

Heart disease is a term covering any disorder of the heart. Heart diseases have become a major concern, as studies show that the number of deaths due to heart diseases has increased significantly over the past few decades in India; in fact, heart disease has become the leading cause of death in India.

A study shows that from 1990 to 2016 the death rate due to heart diseases increased by around 34 per cent, from 155.7 to 209.1 deaths per one lakh population in India.

Preventing heart diseases has therefore become essential. Good data-driven systems for predicting heart diseases can improve the entire research and prevention process, making sure that more people can live healthy lives. This is where Machine Learning comes into play: it helps in predicting heart diseases, and the predictions made are quite accurate.

Problem Description :

A dataset is formed from information about 920 individuals. The problem is: based on the given information about each individual, we have to predict whether that individual will suffer from heart disease.

Dataset :

The heart disease data set consists of patient data from Cleveland, Hungary, Long Beach and Switzerland. The combined dataset consists of 14 features and 916 samples with many missing values. The features used here are:

  1. Age : displays the age of the individual.
  2. Sex : displays the gender of the individual using the following format : 1 = male 0 = female.
  3. Chest-pain type : displays the type of chest-pain experienced by the individual using the following format : 1 = typical angina 2 = atypical angina 3 = non-anginal pain 4 = asymptomatic
  4. Resting Blood Pressure : displays the resting blood pressure value of an individual in mmHg (unit)
  5. Serum Cholesterol : displays the serum cholesterol in mg/dl (unit)
  6. Fasting Blood Sugar : compares the fasting blood sugar value of an individual with 120mg/dl. If fasting blood sugar > 120mg/dl then : 1 (true) else : 0 (false)
  7. Resting ECG : 0 = normal 1 = having ST-T wave abnormality 2 = left ventricular hypertrophy
  8. Max heart rate achieved : displays the max heart rate achieved by an individual.
  9. Exercise induced angina : 1 = yes 0 = no
  10. ST depression induced by exercise relative to rest : displays the value which is integer or float.
  11. Peak exercise ST segment : 1 = upsloping 2 = flat 3 = downsloping
  12. Number of major vessels (0-3) colored by fluoroscopy : displays the value as integer or float.
  13. Thal : displays the thalassemia : 3 = normal 6 = fixed defect 7 = reversible defect
  14. Diagnosis of heart disease : Displays whether the individual is suffering from heart disease or not : 0 = absence 1,2,3,4 = present.
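
Two of the encodings in the list above (feature 6 and feature 14) can be written as small helper functions. The function names are hypothetical; the rules are taken directly from the feature descriptions.

```python
def encode_fasting_blood_sugar(mg_per_dl):
    """Feature 6: 1 if fasting blood sugar > 120 mg/dl, else 0."""
    return 1 if mg_per_dl > 120 else 0


def has_heart_disease(diagnosis):
    """Feature 14: 0 means absence; 1, 2, 3 or 4 means disease is present."""
    if diagnosis not in (0, 1, 2, 3, 4):
        raise ValueError(f"unexpected diagnosis code: {diagnosis}")
    return int(diagnosis > 0)


print(encode_fasting_blood_sugar(135), has_heart_disease(3))  # 1 1
```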

Technologies Used : -

  1. Python 3.7
  2. Pandas
  3. Numpy
  4. Flask

Running the web app

Locally

  • Install requirements
    pip install -r requirements.txt
  • Run flask web app
    python main_file.py

Models used and accuracy

A random forest classifier achieves an average multi-class classification accuracy of 56-60% (183 test samples). It achieves 75-80% average binary classification accuracy (heart disease or no heart disease).
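
The gap between the two figures is expected: a prediction of the wrong severity level still counts as correct once the labels are collapsed to disease / no disease. A small sketch with hypothetical predictions (not the project's results) shows the effect:

```python
def accuracy(true_labels, predicted_labels):
    """Fraction of positions where the prediction equals the true label."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


# Hypothetical diagnosis codes (0 = absence, 1-4 = present) for ten test samples.
actual = [0, 0, 1, 2, 3, 4, 0, 1, 2, 0]
predicted = [0, 0, 2, 2, 1, 4, 1, 1, 3, 0]

multi_class = accuracy(actual, predicted)
binary = accuracy([int(a > 0) for a in actual], [int(p > 0) for p in predicted])
print(multi_class, binary)  # 0.6 0.9
```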

Diabetes Prediction using Machine Learning Project Code

Buy Now Source Code ₹1501

Buy Now Project Report ₹1001

In this Diabetes Prediction using Machine Learning Project Code, the objective is to predict whether a person has diabetes based on features such as number of pregnancies, insulin level, age and BMI. The data set used in this project is taken from Kaggle: "This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage." The project uses a simple random forest classifier.

Learning Objectives : -

The following points were the objective of the project (The main intention was to create an end-to-end ML project.)

  1. Data gathering
  2. Descriptive Analysis
  3. Data Visualizations
  4. Data Preprocessing
  5. Data Modelling
  6. Model Evaluation
  7. Model Deployment

Technical Aspect : -

  1. Training a machine learning model using scikit-learn.
  2. Building and hosting a Flask web app.
  3. The user enters details such as number of pregnancies, insulin level, age, BMI, etc.
  4. Once all the fields are filled in, the prediction is displayed on a new page.
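
Before the model can predict, the submitted form values have to be turned into a numeric feature row. A minimal sketch of that step follows; the keys mirror the column names of the Kaggle Pima Indians Diabetes CSV, but the project's actual form field names are not given here, so treat them and the `parse_form` helper as assumptions.

```python
# Column order of the Kaggle Pima Indians Diabetes CSV (without the Outcome label).
# The project's actual form field names are not given, so these keys are assumptions.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def parse_form(form):
    """Convert submitted form strings into a numeric feature row for the model."""
    missing = [name for name in FEATURES if not form.get(name, "").strip()]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    return [float(form[name]) for name in FEATURES]


submitted = {
    "Pregnancies": "2", "Glucose": "120", "BloodPressure": "70",
    "SkinThickness": "20", "Insulin": "80", "BMI": "25.5",
    "DiabetesPedigreeFunction": "0.35", "Age": "33",
}
row = parse_form(submitted)
print(row)
```

In a Flask view, the resulting row would be passed to the trained model's predict method and the result rendered on the results page.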

Technologies Used : -

  1. Python 3.7
  2. Pandas
  3. Numpy
  4. Flask

Datasets

https://www.kaggle.com/uciml/pima-indians-diabetes-database

Installation

  1. Download the project and unzip it.
  2. After downloading, cd into the flask directory.
  3. Begin a new virtual environment with Python 3 and activate it.
  4. Install the required packages using pip install -r requirements.txt

RUN

  1. Execute the command: python app.py

Black Friday Sales Prediction project with source code

Buy Source Code ₹1501

Buy Project Report ₹1001

In this project, we predict how much customers will spend during Black Friday, using features such as age, gender and marital status. The dataset we use is the Black Friday dataset from Kaggle, which contains about 550,068 rows and 12 features. We follow all the steps of a data science lifecycle, from data collection to model deployment.

This project contains a Jupyter notebook used to train a CatBoostRegressor model for predicting the amount of sales on Black Friday based on several features.
The model was then integrated into a Flask web application.

Technologies Used

Web Technologies

HTML, CSS, JavaScript, Bootstrap, Django

Machine Learning Library In Python

NumPy, pandas, SciPy, matplotlib, scikit-learn, seaborn

Dataset Link: https://www.kaggle.com/c/black-friday/data

Training Model File 

model.ipynb

model-checkpoint.ipynb

Output Generated File

catBoost.pkl

Read Before Purchase  :

  1. One Time Free Installation Support.
  2. Terms and Conditions on this page: https://projectworlds/terms
  3. We offer paid customization and installation support.
  4. If you have any questions, please contact the Support Section.
  5. Please note that any digital products presented on the website do not contain malicious code, viruses or advertising. You buy the original files from the developers. We do not sell any products downloaded from other sites.
  6. You can download the product after the purchase by a direct link on this page.

Fake Product Review Detection using Machine Learning

Buy Now ₹1501

Buy Now Project Report ₹1001

Online reviews play a very important role in decision-making in today's e-commerce. A large part of the population, i.e. customers, read product or store reviews before deciding what to buy, where to buy, and whether to buy at all. Because writing fake or fraudulent reviews comes with monetary gain, there has been a huge increase in deceptive opinion spam on online review websites. An untruthful review is, basically, a fake review, fraudulent review, or opinion spam. Positive reviews of a target object can attract more customers and increase sales; negative reviews of a target object can result in lower demand and lower sales. Fake review detection has attracted considerable attention in recent years. Most review sites, however, still do not publicly filter fake reviews. Yelp is an exception that has filtered reviews over the past few years. Yelp's algorithm, however, is a business secret. In this work, by analyzing Yelp's filtered reviews, we try to find out what Yelp might be doing. The results will be useful to other review hosting sites in their filtering efforts. Filtering has two main approaches: supervised and unsupervised learning. There are also roughly two types of features used: linguistic features and behavioral features. Using a supervised learning approach, we have built a model that can identify fake reviews with almost 70 percent accuracy.

As the Internet continues to grow in size and importance, the quantity and impact of online reviews is increasing continuously. Reviews can influence people across a wide range of industries, but they are particularly important in e-commerce, where comments and reviews on products and services are often the most convenient, if not the only, way for a buyer to decide whether to buy them.

Model training

Refer to the Jupyter notebooks in the research folder for the steps taken for preprocessing, model development and the algorithms used. Although we experimented with different models, we found Naive Bayes to be the most accurate, with an F1 score of 77%.
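
To give a feel for how a Naive Bayes text classifier works, here is a minimal multinomial Naive Bayes sketch in plain Python with add-one smoothing. It is not the project's notebook code: the `train` and `predict` helpers and the toy reviews are hypothetical, and the real model would normally be built with a machine learning library on labelled review data.

```python
import math
from collections import Counter


def train(documents):
    """documents: list of (text, label). Returns word counts and doc counts per label."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in documents:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    return word_counts, doc_counts


def predict(text, word_counts, doc_counts):
    """Pick the label with the highest log posterior, using add-one smoothing."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)
        denominator = sum(counts.values()) + len(vocabulary)
        for word in text.lower().split():
            score += math.log((counts[word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy reviews; the real project trains on labelled review data.
reviews = [
    ("best product ever buy now amazing", "fake"),
    ("amazing amazing best deal ever", "fake"),
    ("battery lasts two days screen is fine", "real"),
    ("screen scratched after two weeks", "real"),
]
model = train(reviews)
print(predict("amazing best ever", *model), predict("screen battery is fine", *model))
```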

Installing and running this app:

Requirements: use pip install or conda install to install the following packages:

  • Numpy, pandas
  • scikit-learn (imported as sklearn)
  • spacy
  • Django 2.1
  • pickle (included with Python, no installation needed)
  • tqdm

Running the app:

  1. Go to folder containing manage.py and run command: python manage.py runserver
  2. Once the server starts, open browser. The app runs on http://127.0.0.1:8000/
  3. fake_reviews.txt and real_reviews.txt contain some reviews that can be used to test the model.

Fake News Detection using Machine Learning Natural Language Processing

Buy Source Code ₹1501

Buy Project Report  ₹1001

An NLP and Machine Learning based web application for detecting fake news. It uses NLP for preprocessing the input text and an XGBoost model for predicting whether the input news is fake or real.

There are tons of news articles where the news is fake or cooked up. With numerous advances in Natural Language Processing and machine learning, we can build an ML model that is able to detect whether a piece of news ... Here we use artificial neural network models to verify the genuineness of the article.
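
The text preprocessing step can be sketched as below. This is a simplified illustration, not the project's pipeline: the tiny stop-word list and the `preprocess` helper are assumptions, and real pipelines often add stemming or lemmatization from an NLP library.

```python
import re

# A deliberately tiny stop-word list for illustration; real pipelines typically
# use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "that"}


def preprocess(text):
    """Lowercase, keep letters only, split into tokens, and drop stop words."""
    letters_only = re.sub(r"[^a-z\s]", " ", text.lower())
    return [token for token in letters_only.split() if token not in STOP_WORDS]


print(preprocess("BREAKING: The Moon is made of cheese, scientists say!"))
# ['breaking', 'moon', 'made', 'cheese', 'scientists', 'say']
```

The cleaned tokens would then be vectorized (for example as word counts or TF-IDF weights) before being passed to the classifier.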

Technologies Used

Web Technologies

HTML, CSS, JavaScript, Bootstrap, Django

Machine Learning Library In Python3

NumPy, pandas, SciPy, matplotlib, scikit-learn, seaborn

Database

SQLite

Dataset Link: https://www.kaggle.com/c/fake-news/data

Training Model File 

Fake_News_Classifier_Using_LSTM.ipynb

Fake_News_Classifier_using_Machine_Learning.ipynb

Output Generated File

xgb_fake_news_predictor.pkl

Driver Distraction Prediction Using Deep Learning, Machine Learning

Buy Now ₹1501

Buy Now Project Report ₹1001

In Driver Distraction Prediction Using Machine Learning, we are given driver images, each taken in a car with the driver doing something in the car (texting, eating, talking on the phone, applying makeup, reaching behind, etc). The goal is to predict the likelihood of what the driver is doing in each picture.

Driving a car is a complex task that requires complete attention. Distracted driving is any activity that takes the driver’s attention away from the road. Several studies have identified three main types of distraction: visual distractions (driver’s eyes off the road), manual distractions (driver’s hands off the wheel) and cognitive distractions (driver’s mind off the driving task).

Dataset details -

  • Image Size - 480 X 640 pixels
  • Training Images count - 22424 images
  • Test Images count - 79726 images
  • Image type - RGB
  • Image field of view - Dashboard images with view of Driver and passenger
  • The 10 classes to predict are:
    • c0: safe driving
    • c1: texting - right
    • c2: talking on the phone - right
    • c3: texting - left
    • c4: talking on the phone - left
    • c5: operating the radio
    • c6: drinking
    • c7: reaching behind
    • c8: hair and makeup
    • c9: talking to passenger
  • Loss - multi-class logarithmic loss
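
Multi-class logarithmic loss averages the negative log of the probability the model assigns to each image's true class. A minimal sketch in plain Python follows; the clipping value `eps` and the example predictions are assumptions for illustration, and the competition's exact scoring rules may add further normalization.

```python
import math


def multiclass_log_loss(true_classes, predicted_probs, eps=1e-15):
    """Mean negative log of the probability assigned to each image's true class."""
    total = 0.0
    for true_class, probs in zip(true_classes, predicted_probs):
        p = min(max(probs[true_class], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(true_classes)


# Hypothetical predictions for two images over the 10 classes c0..c9.
uniform = [0.1] * 10
confident_c6 = [0.01] * 10
confident_c6[6] = 0.91

print(multiclass_log_loss([0, 6], [uniform, confident_c6]))
```

A model that always predicts 0.1 for every class scores ln(10), about 2.30, so any useful model should land well below that.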

State-Farm-Distracted-Driver-Detection

Kaggle hosted this challenge a few years ago, focused on identifying distracted drivers using computer vision.
Details of the challenge can be found here:
https://www.kaggle.com/c/state-farm-distracted-driver-detection

Implementation Details

  • DL Model - CNN built from scratch (6 Conv layers, 5 Dropout layers, 3 Dense layers)
  • Framework - Keras (a PyTorch version is in progress)
  • CNN Model Visualization/Model Interpretability - GradCAM
  • Final Accuracy - Train acc: 99.06%, Val acc: 99.46%

GRAD-CAM implementation for a test image with label drinking

GRAD-CAM is a technique that shows how a model classifies new instances by creating a heat map that highlights only the area which has contributed the most to the prediction.
As seen in the image below, the model classifies the driver as distracted by drinking, highlighting the hand and the glass.
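
The core of the Grad-CAM computation can be sketched in plain Python as below: each feature map of the last convolutional layer is weighted by the average of its gradient for the chosen class, the weighted maps are summed, and a ReLU is applied. The tiny activations and gradients here are hypothetical; in practice they come from the deep learning framework, and the heat map is upsampled to the image size.

```python
def grad_cam(activations, gradients):
    """Combine feature maps into a heat map.

    activations, gradients: lists of K feature maps, each an H x W grid, taken
    from the last convolutional layer for the class being explained.
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = global average of that channel's gradient.
    weights = [sum(map(sum, grad)) / (height * width) for grad in gradients]
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, feature_map in zip(weights, activations):
        for r in range(height):
            for c in range(width):
                heatmap[r][c] += weight * feature_map[r][c]
    # ReLU keeps only regions that push the class score up.
    return [[max(0.0, value) for value in row] for row in heatmap]


# Hypothetical 2-channel, 2x2 feature maps and gradients.
activations = [[[1.0, 0.0], [0.0, 2.0]], [[0.0, 3.0], [1.0, 0.0]]]
gradients = [[[1.0, 1.0], [1.0, 1.0]], [[-1.0, -1.0], [-1.0, -1.0]]]
print(grad_cam(activations, gradients))  # [[1.0, 0.0], [0.0, 2.0]]
```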