Credit Card Fraud Detection Machine Learning Project

Posted on June 30, 2021 (updated March 9, 2024) by Yugesh Verma

Fraud in credit card transactions is common today, as more people use credit cards for payment. This is due to advances in technology and the growth of online transactions, which have led to fraud causing large losses. Effective methods are therefore needed to reduce these losses. In addition, fraudsters steal users' credit card information by sending fake SMS messages and calls, and through masquerading attacks, phishing attacks, and so on. This project uses several machine learning algorithms, such as support vector machine (SVM), k-nearest neighbors (KNN), and artificial neural network (ANN), to predict the occurrence of fraud. We also compare the supervised machine learning and deep learning techniques on how well they distinguish fraudulent from non-fraudulent transactions.

Dataset link: https://www.kaggle.com/mlg-ulb/creditcardfraud

The dataset contains credit card transactions made by European cardholders over a two-day collection period in September 2013. There are a total of 284,807 transactions, of which 492 (0.172%) are fraudulent.

The dataset contains numerical variables that are the result of a principal component analysis (PCA) transformation. The original authors applied this transformation to maintain the confidentiality of sensitive information. The dataset also contains Time and Amount, which were not transformed by PCA. The Time variable contains the seconds elapsed between each transaction and the first transaction in the dataset. The Amount variable is the transaction amount; this feature can be used for example-dependent cost-sensitive learning. The Class variable is the response variable and indicates whether the transaction was fraudulent.

The dataset was collected and analysed during a research collaboration of Worldline and the Machine Learning Group of Université Libre de Bruxelles (ULB) on big data mining and fraud detection.
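
With only 492 fraudulent rows out of 284,807, the classes are heavily imbalanced, which affects how a model should be judged. The short sketch below uses only the two counts quoted above; the "always predict non-fraud" baseline is an illustrative assumption, not one of the models in this project.

```python
# Counts quoted in the dataset description above.
total_transactions = 284_807
fraud_transactions = 492

fraud_rate = fraud_transactions / total_transactions

# Hypothetical baseline: label every transaction as non-fraud.
# It is correct on every genuine transaction, so accuracy is 1 - fraud_rate.
baseline_accuracy = 1 - fraud_rate

# The same baseline catches no fraud at all, so recall on the fraud class is 0.
baseline_recall = 0 / fraud_transactions

print(f"fraud rate: {fraud_rate:.4%}")
print(f"always-non-fraud accuracy: {baseline_accuracy:.4%}")
print(f"always-non-fraud recall: {baseline_recall:.0%}")
```

Because a model that never flags fraud already scores very high accuracy, metrics such as precision and recall on the fraud class are usually more informative on this kind of data.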

Models

Applied various classification techniques, such as:
Logistic Regression
Light GBM
K Nearest Neighbors (KNN)
Classification Trees
Random Forest
SVM
XGBoost Classifier

Technology Used in the project :-

We developed this project using the following technologies:
HTML : The page layout is designed in HTML
CSS : CSS is used for all of the design
JavaScript : All validation tasks and animations are written in JavaScript
Python : All business logic is implemented in Python
Flask : The project is built on the Flask framework

Supported Operating System :-

This project can be configured on the following operating systems:
Windows : This project can easily be configured on Windows. To run it on Windows, you will have to install Python 3.6.10, PIP, Django.
Linux : This project also runs on all versions of Linux.
Mac : This project can also easily be configured on Mac.

Installation Step : -

python 3.6.8
command 1 - python -m pip install --user -r requirements.txt
command 2 - python app.py

Read Before Purchase :

One Time Free Installation Support.
Terms and Conditions on this page: https://projectworlds/terms
We offer Paid Customization installation Support
If you have any questions please contact Support Section
Please note that any digital products presented on the website do not contain malicious code, viruses or advertising. You buy the original files from the developers. We do not sell any products downloaded from other sites.
You can download the product after the purchase by a direct link on this page.

Hypo Thyroid Disease prediction Machine Learning Project

Posted on June 30, 2021 (updated January 21, 2024) by Yugesh Verma

Buy Now ₹1501

Hypothyroidism (underactive thyroid) is a condition in which the body doesn't produce enough of certain important thyroid hormones. The condition may lead to various symptoms later in life. More information about the disease is available at https://www.mayoclinic.org/diseases-conditions/hypothyroidism/symptoms-causes/syc-20350284 .

The Data

The data comes from http://archive.ics.uci.edu/ml/datasets/thyroid+disease. I used "allhypo.data" for the analysis; "allhypo.names" contains the column names of the data. Details of the primary data processing are included in the Jupyter notebook.

A set of algorithms was applied to analyse the "thyroid-disease" database published on the UCI page.
Data source URLs:
data: https://archive.ics.uci.edu/ml/machine-learning-databases/thyroid-disease/sick-euthyroid.data
names: https://archive.ics.uci.edu/ml/machine-learning-databases/thyroid-disease/sick-euthyroid.names

Algorithms

Naïve Bayes
KNN
ANN
Random Forest
SVM
FSF
PCA
LCA
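
As an illustration of one entry in the list above, here is a minimal k-nearest-neighbours sketch in plain Python. The two numeric features, the sample points, and the label names are made up for the example; they are not taken from the thyroid dataset, and the project itself may well use a library implementation instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points closest to query."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Made-up two-feature samples forming two well-separated groups.
train_points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.2), (5.1, 4.9), (4.8, 5.0),
]
train_labels = ["negative"] * 3 + ["hypothyroid"] * 3

near_first_group = knn_predict(train_points, train_labels, (1.1, 1.0))
near_second_group = knn_predict(train_points, train_labels, (5.0, 5.0))
print(near_first_group, near_second_group)
```

In practice the features would need scaling first, since KNN relies on distances and a feature with a large numeric range would otherwise dominate the vote.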

Live Face Mask Detection Project in Machine Learning

Posted on May 13, 2021 (updated January 21, 2024) by Yugesh Verma

Buy Now ₹1501

Buy Now Project Report ₹1001

Face Mask Detection web application built with Flask, Keras-TensorFlow, and OpenCV. It can be used to detect face masks both in images and in real-time video.

The goal is to create a mask detection system that can recognize face masks both in images and in real-time video, drawing a bounding box around each face. To do so, I fine-tuned MobileNetV2 pretrained on ImageNet and used it in conjunction with the OpenCV face detection algorithm, which allows me to turn a classifier model into an object detection system.
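
The idea of turning a classifier into a detector can be sketched as a two-stage loop: a face detector proposes boxes, and the classifier labels each cropped face. The code below is a toy illustration in plain Python, not the project's code. `fake_face_detector` and `fake_mask_classifier` are hypothetical stand-ins for the OpenCV detector and the fine-tuned MobileNetV2, and the image is a small grid of grayscale values.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x, y, width, height)
Image = Sequence[Sequence[int]]   # toy grayscale image: rows of pixel values


def crop(image: Image, box: Box) -> List[List[int]]:
    """Cut the region described by box out of the image."""
    x, y, w, h = box
    return [list(row[x:x + w]) for row in image[y:y + h]]


def detect_masks(
    image: Image,
    face_detector: Callable[[Image], List[Box]],
    mask_classifier: Callable[[List[List[int]]], str],
) -> List[Tuple[Box, str]]:
    """Run the detector, then classify each detected face crop."""
    return [(box, mask_classifier(crop(image, box))) for box in face_detector(image)]


# Hypothetical stand-ins so the sketch runs without OpenCV or TensorFlow.
def fake_face_detector(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 0, 2, 2)]


def fake_mask_classifier(face: List[List[int]]) -> str:
    pixels = [p for row in face for p in row]
    return "mask" if sum(pixels) / len(pixels) > 128 else "no mask"


image = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
results = detect_masks(image, fake_face_detector, fake_mask_classifier)
print(results)
```

Each (box, label) pair is what a drawing step would need to render a bounding box with its label on the image or video frame.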

Technologies

Keras/Tensorflow
OpenCV
Flask
MobilenetV2

Installation:

Install the required packages in one of two ways:

via pip: pip install -r requirements.txt
or via conda: conda env create -f environment.yml

Once all the required packages are installed, run the following from the root folder:

python app.py

Then click the link shown in the prompt.

Datasets

The dataset used for training the model is available here.

Loan Eligibility Prediction Python Machine Learning Project

Posted on May 9, 2021 (updated January 21, 2024) by Yugesh Verma

Buy Now ₹1501

Loan approval is a very important process for banking organizations. The system approves or rejects loan applications. Recovery of loans is a major contributing parameter in the financial statements of a bank, and it is very difficult to predict whether a customer will repay a loan. In recent years, many researchers have worked on loan approval prediction systems. Machine learning (ML) techniques are very useful for predicting outcomes from large amounts of data.

Key Features

Interface to predict loan application approval
Data insights within a Jupyter Notebook
Trained model
Multiple machine learning algorithms

Technology :

Flask==1.1.1
html5lib==1.0.1
json5==0.8.5
jsonify==0.5
numpy==1.16.5
pandas==0.25.1
scikit-image==0.15.0
scikit-learn==0.21.3
scipy==1.3.1
gunicorn==19.9.0
requests==2.22.0

Loan Defaulter Prediction Machine Learning Projects

Posted on April 13, 2021 (updated January 21, 2024) by Yugesh Verma

Buy Now ₹1501

This project uses supervised machine learning to train a model on credit default data to determine the probability and/or classification ("default" vs "non-default") of the user's liability. The UI takes user input such as education level, sex, marital status, payment history, and income, and returns a classification.

An app like this would be useful for financial and lending institutions to understand and manage the risk of their loans and lending portfolios.
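
The difference between returning a probability and returning a classification can be shown with a small logistic-regression-style sketch. The weights, bias, feature values, and the 0.5 threshold below are invented for illustration; they are not the trained parameters of this project.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def default_probability(features, weights, bias):
    """Linear score followed by a sigmoid, as in logistic regression."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


def classify(probability: float, threshold: float = 0.5) -> str:
    """Turn a probability into one of the two class labels."""
    return "default" if probability >= threshold else "non-default"


# Hypothetical parameters and two made-up, already-scaled applicants.
weights = [0.8, -0.5]
bias = -0.2

p_risky = default_probability([2.0, 1.0], weights, bias)
p_safe = default_probability([0.0, 3.0], weights, bias)

print(f"{p_risky:.3f} -> {classify(p_risky)}")
print(f"{p_safe:.3f} -> {classify(p_safe)}")
```

A lender that is more cautious about risk could lower the threshold, which flags more applicants as likely defaulters at the cost of more false alarms.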

Goals/Outcome

Determining the probability of user liability
Creating an interactive UI that takes user input and returns an output
Determining whether a neural network or logistic regression is the better model for classification

Models Created

Logistic Regression
Random Forest Model
Deep Neural Network

About

Probability of Credit Card Default, Machine Learning

Technologies Used : -

beautifulsoup4==4.6.0
certifi==2018.4.16
chardet==3.0.4
click==6.7
Flask==1.0
gunicorn==19.8.0
idna==2.6
itsdangerous==0.24
Jinja2==2.10
MarkupSafe==1.0
numpy==1.14.3
pandas==0.22.0
python-dateutil==2.7.2
pytz==2018.4
requests==2.18.4
scikit-learn==0.19.1
scipy==1.0.1
six==1.11.0
SQLAlchemy==1.2.7
urllib3==1.22
Werkzeug==0.14.1

Used Car Price Prediction Using Machine Learning

Posted on April 10, 2021 (updated January 21, 2024) by Yugesh Verma

Buy Now ₹1501

Car price prediction is an interesting machine learning problem, as many factors influence the price of a car in the second-hand market. In this project, we look at a dataset of car sales and purchases, where the end goal is to predict the price of a car from its features in order to maximize profit.

Datasets Link - Kaggle Data

Technologies Used : -

Python 3.7
Pandas
Numpy
Flask

Running the web app

Locally

Install requirements
pip install -r requirements.txt
Run flask web app
python app.py

Skin cancer Detection using Machine learning

Posted on April 3, 2021 (updated January 21, 2024) by Yugesh Verma

Buy Now ₹1501

Buy Now Project Report ₹1001

The purpose of this project is to create a tool that, given the image of a mole, can calculate the probability that the mole is malignant.

Skin cancer is a common disease that affects a large number of people. Some facts about skin cancer:

Every year there are more new cases of skin cancer than the combined incidence of cancers of the breast, prostate, lung and colon.

An estimated 87,110 new cases of invasive melanoma will be diagnosed in the U.S. in 2017.

The estimated 5-year survival rate for patients whose melanoma is detected early is about 98 percent in the U.S. The survival rate falls to 62 percent when the disease reaches the lymph nodes, and 18 percent when the disease metastasizes to distant organs.

Development process and Data

The idea of this project is to construct a CNN model that can predict the probability that a specific mole is malignant.

Data:

To train this model I'm planning to use a set of images from the International Skin Imaging Collaboration:

Melanoma Project, ISIC: https://isic-archive.com.

The specific datasets to use are:

ISICUDA-21: Moles and melanomas. Biopsy-confirmed melanocytic lesions. Both malignant and benign lesions are included.
Benign: 23
Malign: 37

ISICUDA-11: Moles and melanomas. Biopsy-confirmed melanocytic lesions. Both malignant and benign lesions are included.
Benign: 398
Malign: 159

ISICMSK-21: Benign and malignant skin lesions. Biopsy-confirmed melanocytic and non-melanocytic lesions.
Benign: 1167 (Not used)
Malign: 352

ISICMSK-12: Both malignant and benign melanocytic and non-melanocytic lesions. Almost all images confirmed by histopathology. Images not taken with modern digital cameras.
Benign: 339
Malign: 77

ISICMSK-11: Moles and melanomas. Biopsy-confirmed melanocytic lesions, both malignant and benign.
Benign: 448
Malign: 224

In summary, the total number of images to use is:

Benign Images	Malign Images
1208	849
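
The totals can be checked against the per-dataset counts listed above. In this short check, the 1167 benign images of ISICMSK-21 are entered as 0 because the listing marks them as not used.

```python
# (benign, malign) image counts per dataset, copied from the list above.
# ISICMSK-21 benign images are marked "Not used", so they contribute 0.
counts = {
    "ISICUDA-21": (23, 37),
    "ISICUDA-11": (398, 159),
    "ISICMSK-21": (0, 352),
    "ISICMSK-12": (339, 77),
    "ISICMSK-11": (448, 224),
}

total_benign = sum(benign for benign, _ in counts.values())
total_malign = sum(malign for _, malign in counts.values())
malign_share = total_malign / (total_benign + total_malign)

print(total_benign, total_malign)
print(f"malign share of all images: {malign_share:.1%}")
```

Leaving out the large unused benign set keeps the two classes reasonably balanced, which may be why it was excluded.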

Preprocessing:

The following preprocessing tasks are going to be developed for each image:
1. Visual inspection to detect images with low quality or not representative
2. Image resizing: Transform images to 128x128x3
3. Crop images: Automatic or manual crop
4. Other to define later in order to improve model quality

CNN Model:

The idea is to develop a simple CNN model from scratch and evaluate its performance to set a baseline. The next steps to improve the model are:
1. Data augmentation: rotations, noising, and scaling to avoid overfitting
2. Transfer learning: using a pre-trained network (VGG-16 or other), construct some additional layers at the end to fine-tune our model
3. Others to define.

Model Evaluation:

To evaluate the different models we will use ROC curves and the AUC score. To choose the final model, we will evaluate precision and accuracy and set the threshold at a level that represents a good tradeoff between TPR and FPR.
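
As a rough illustration of this evaluation step, the sketch below computes ROC points and the AUC in plain Python from a handful of made-up labels and scores. Picking the threshold that maximizes TPR minus FPR (Youden's J statistic) is an assumption made for the example; the project does not state which rule it uses.

```python
def roc_curve_points(labels, scores):
    """Return (threshold, FPR, TPR) triples, from strictest to loosest threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(float("inf"), 0.0, 0.0)]  # threshold above every score
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for y, p in zip(labels, predicted) if p and y == 1)
        fp = sum(1 for y, p in zip(labels, predicted) if p and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve using the trapezoidal rule."""
    area = 0.0
    for (_, x0, y0), (_, x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up example: 1 = malignant, 0 = benign, with model scores.
labels = [1, 1, 0, 1, 1, 0, 0, 0]
scores = [0.95, 0.9, 0.8, 0.7, 0.6, 0.5, 0.3, 0.1]

points = roc_curve_points(labels, scores)
roc_auc = auc(points)

# Assumed selection rule: maximize TPR - FPR.
best_threshold, best_fpr, best_tpr = max(points, key=lambda p: p[2] - p[1])
print(roc_auc, best_threshold, best_fpr, best_tpr)
```

For a medical screening tool, a lower threshold that favours TPR over FPR may be preferable, since a missed melanoma is usually costlier than a false alarm.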

python 3.6.8

Heart Disease Prediction using Machine Learning Project

Posted on March 14, 2021 (updated January 21, 2024) by Yugesh Verma

Buy Now ₹1501

Buy Now Project Report ₹1001

Introduction

Heart disease is a term covering any disorder of the heart. It has become a major concern, as studies show that the number of deaths due to heart disease has increased significantly over the past few decades in India; in fact, it has become the leading cause of death in India.

A study shows that from 1990 to 2016 the death rate due to heart disease increased by around 34 per cent, from 155.7 to 209.1 deaths per one lakh population in India.

Thus, preventing heart disease has become more than necessary. Good data-driven systems for predicting heart disease can improve the entire research and prevention process, making sure that more people can live healthy lives. This is where machine learning comes into play: it helps in predicting heart disease, and the predictions can be reasonably accurate.

Problem Description :

A dataset is formed from information about 920 individuals. The problem is: based on the given information about each individual, we have to predict whether that individual will suffer from heart disease.

Dataset :

The heart disease data set consists of patient data from Cleveland, Hungary, Long Beach, and Switzerland. The combined dataset consists of 14 features and 916 samples with many missing values. The features used here are:

Age : displays the age of the individual.
Sex : displays the gender of the individual using the following format : 1 = male, 0 = female.
Chest-pain type : displays the type of chest pain experienced by the individual using the following format : 1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic
Resting Blood Pressure : displays the resting blood pressure value of an individual in mmHg (unit)
Serum Cholesterol : displays the serum cholesterol in mg/dl (unit)
Fasting Blood Sugar : compares the fasting blood sugar value of an individual with 120 mg/dl. If fasting blood sugar > 120 mg/dl then : 1 (true) else : 0 (false)
Resting ECG : 0 = normal, 1 = having ST-T wave abnormality, 2 = left ventricular hypertrophy
Max heart rate achieved : displays the max heart rate achieved by an individual.
Exercise induced angina : 1 = yes, 0 = no
ST depression induced by exercise relative to rest : displays the value, which is an integer or float.
Peak exercise ST segment : 1 = upsloping, 2 = flat, 3 = downsloping
Number of major vessels (0-3) colored by fluoroscopy : displays the value as an integer or float.
Thal : displays the thalassemia : 3 = normal, 6 = fixed defect, 7 = reversible defect
Diagnosis of heart disease : displays whether the individual is suffering from heart disease or not : 0 = absence; 1, 2, 3, 4 = present.

Technologies Used : -

Python 3.7
Pandas
Numpy
Flask

Running the web app

Locally

Install requirements
pip install -r requirements.txt
Run flask web app
python main_file.py

Models used and accuracy

A random forest classifier achieves an average multi-class classification accuracy of 56-60% (183 test samples). It achieves 75-80% average accuracy on binary classification (heart disease or no heart disease).
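
The gap between the two figures follows from how the diagnosis label is encoded (0 = absence; 1, 2, 3, 4 = present): a prediction of the wrong severity level is a multi-class error but still a correct binary answer. The sketch below shows this with made-up labels and predictions, not results from the project.

```python
def to_binary(diagnosis: int) -> int:
    """Collapse the 0-4 diagnosis label to 0 (absence) or 1 (present)."""
    return 1 if diagnosis > 0 else 0


def accuracy(y_true, y_pred) -> float:
    """Fraction of positions where the prediction equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up true labels and predictions on the 0-4 scale.
y_true = [0, 1, 2, 3, 4, 0, 1, 0]
y_pred = [0, 2, 2, 1, 4, 0, 0, 1]

multi_class_accuracy = accuracy(y_true, y_pred)
binary_accuracy = accuracy(
    [to_binary(y) for y in y_true],
    [to_binary(y) for y in y_pred],
)
print(multi_class_accuracy, binary_accuracy)
```

Here two of the four multi-class mistakes only confuse severity levels, so they disappear once the labels are collapsed to disease versus no disease.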

Diabetes Prediction using Machine Learning Project Code

Posted on March 13, 2021 (updated January 21, 2024) by Yugesh Verma

Buy Now Source Code ₹1501

Buy Now Project Report ₹1001

In this project, the objective is to predict whether a person has diabetes based on features such as number of pregnancies, insulin level, age, and BMI. The dataset used in this project is taken from Kaggle: "This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage." The model is a simple random forest classifier.

Learning Objectives : -

The following points were the objective of the project (The main intention was to create an end-to-end ML project.)

Data gathering
Descriptive Analysis
Data Visualizations
Data Preprocessing
Data Modelling
Model Evaluation
Model Deployment

Technical Aspect : -

Training a machine learning model using scikit-learn.
Building and hosting a Flask web app.
The user enters details such as number of pregnancies, insulin level, age, and BMI.
Once all the fields are filled in, the prediction is displayed on a new page.
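
The step between the form and the model can be sketched as a small validation function that turns submitted text fields into a numeric feature list. This is a hedged illustration only: the field names in `FIELDS` are hypothetical, only the four inputs named above are covered, and the real app may validate its inputs differently.

```python
# Hypothetical form field names; the real app may use different ones.
FIELDS = ("pregnancies", "insulin", "bmi", "age")


def parse_form(form: dict) -> list:
    """Validate submitted form values and return them as floats in FIELDS order."""
    missing = [name for name in FIELDS if not str(form.get(name, "")).strip()]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))

    values = []
    for name in FIELDS:
        value = float(form[name])  # raises ValueError for non-numeric input
        if value < 0:
            raise ValueError(f"{name} must be non-negative")
        values.append(value)
    return values


features = parse_form(
    {"pregnancies": "2", "insulin": "85", "bmi": "26.6", "age": "31"}
)
print(features)
```

The returned list would then be passed to the trained classifier, with the order of `FIELDS` matching the column order used during training.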

Technologies Used : -

Python 3.7
Pandas
Numpy
Flask

Datasets

https://www.kaggle.com/uciml/pima-indians-diabetes-database

Installation

Download and unzip it.
After downloading, cd into the flask directory.
Begin a new virtual environment with Python 3 and activate it.
Install the required packages using pip install -r requirements.txt

RUN

Execute the command: python app.py

Black Friday Sales Prediction project with source code

Posted on March 3, 2021 (updated April 3, 2024) by Yugesh Verma

Buy Source Code ₹1501

Buy Project Report ₹1001

In this project, we predict how much customers will spend during Black Friday, using features such as age, gender, and marital status. The dataset is the Black Friday dataset from Kaggle, which contains about 550068 rows and 12 features. We follow all the steps of a data science lifecycle, from data collection to model deployment.

This project contains a Jupyter notebook used to train a CatBoostRegressor model that predicts the amount of sales on Black Friday based on several features.
The model was then integrated into a Flask web application.

Technologies Used

Web Technologies

Html , Css , JavaScript , Bootstrap , Django

Machine Learning Library In Python

Numpy , Pandas , Scipy
matplotlib
scikit-learn
seaborn

Dataset Link: https://www.kaggle.com/c/black-friday/data

Training Model File

model.ipynb

model-checkpoint.ipynb

Output Generated File

catBoost.pkl
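
A .pkl file like this is typically written once after training and read back when the web app starts. The sketch below shows that round trip with Python's standard pickle module. To stay runnable without CatBoost, it pickles a stand-in object (a plain dict of made-up coefficients) instead of a trained CatBoostRegressor, and writes to a temporary directory; the `predict` helper and feature names are hypothetical.

```python
import os
import pickle
import tempfile

# Stand-in for the trained model: made-up coefficients, not real results.
model = {
    "intercept": 5000.0,
    "coefficients": {"age_group": 120.0, "gender": -300.0},
}

# Training side: serialize the model to a .pkl file.
path = os.path.join(tempfile.mkdtemp(), "catBoost.pkl")
with open(path, "wb") as f:
    pickle.dump(model, f)

# Web-app side: load the file once at startup and reuse it per request.
with open(path, "rb") as f:
    restored = pickle.load(f)


def predict(loaded_model: dict, features: dict) -> float:
    """Hypothetical prediction helper for the stand-in model."""
    return loaded_model["intercept"] + sum(
        loaded_model["coefficients"][name] * value
        for name, value in features.items()
    )


prediction = predict(restored, {"age_group": 3, "gender": 1})
print(prediction)
```

Pickle files can execute code when loaded, so an app should only load .pkl files that it produced itself or that come from a trusted source.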
