
Skin cancer Detection using Machine learning

Buy Now ₹1501

The purpose of this project is to create a tool that, given an image of a mole, calculates the probability that the mole is malignant.

Skin cancer is a common disease that affects a large number of people. Some facts about skin cancer:

Every year there are more new cases of skin cancer than the combined incidence of cancers of the breast, prostate, lung and colon.

An estimated 87,110 new cases of invasive melanoma will be diagnosed in the U.S. in 2017.

The estimated 5-year survival rate for patients whose melanoma is detected early is about 98 percent in the U.S. The survival rate falls to 62 percent when the disease reaches the lymph nodes, and 18 percent when the disease metastasizes to distant organs.

Development process and Data

The idea of this project is to construct a CNN model that can predict the probability that a specific mole is malignant.

Data: Skin cancer Detection using Machine learning

To train this model I'm planning to use a set of images from the International Skin Imaging Collaboration:

Melanoma Project ISIC: https://isic-archive.com

The specific datasets to use are:

ISICUDA-21: Moles and melanomas. Biopsy-confirmed melanocytic lesions. Both malignant and benign lesions are included.

Benign: 23

Malign: 37

ISICUDA-11: Moles and melanomas. Biopsy-confirmed melanocytic lesions. Both malignant and benign lesions are included.

Benign: 398

Malign: 159

ISICMSK-21: Benign and malignant skin lesions. Biopsy-confirmed melanocytic and non-melanocytic lesions.

Benign: 1167 (Not used)

Malign: 352

ISICMSK-12: Both malignant and benign melanocytic and non-melanocytic lesions. Almost all images confirmed by histopathology. Images not taken with modern digital cameras.

Benign: 339

Malign: 77

ISICMSK-11: Moles and melanomas. Biopsy-confirmed melanocytic lesions, both malignant and benign.

Benign: 448

Malign: 224

In summary, the total images to use are:

Benign images: 1208

Malign images: 849

Sample images (not reproduced here): benign moles and malign moles.

Preprocessing:

The following preprocessing tasks will be carried out for each image:

  1. Visual inspection to detect low-quality or unrepresentative images
  2. Image resizing: transform images to 128x128x3
  3. Crop images: automatic or manual crop
  4. Other tasks, to be defined later, to improve model quality
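
The crop and resize steps above can be sketched as follows. This is a minimal, dependency-free illustration on nested lists, not the project's code: the function names and the centre-crop strategy are assumptions, and in practice an imaging library such as Pillow or OpenCV would normally do this work.

```python
def center_crop(image, size):
    """Crop a size x size square from the centre of an H x W x C image
    stored as nested lists (rows of pixels, each pixel a tuple)."""
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def resize_nearest(image, out_h, out_w):
    """Resize with nearest-neighbour sampling."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def preprocess(image, target=128):
    """Centre-crop to the shorter side, then resize to target x target."""
    side = min(len(image), len(image[0]))
    return resize_nearest(center_crop(image, side), target, target)


# Synthetic 200 x 300 RGB image standing in for a real mole photo.
dummy = [[(r % 256, c % 256, 0) for c in range(300)] for r in range(200)]
result = preprocess(dummy)
shape = (len(result), len(result[0]), len(result[0][0]))
```

Cropping before resizing keeps the aspect ratio of the lesion intact, which a direct resize of a non-square photo would distort.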

CNN Model:

The idea is to develop a simple CNN model from scratch and evaluate its performance to set a baseline. The next steps to improve the model are:

  1. Data augmentation: rotations, noising, and scaling to avoid overfitting
  2. Transfer learning: using a pre-trained network (VGG-16 or other), add some layers at the end and fine-tune the model
  3. Others to be defined

Model Evaluation:

To evaluate the different models we will use ROC curves and the AUC score. To choose the final model we will evaluate precision and accuracy, and set the threshold at a level that represents a good tradeoff between TPR (true positive rate) and FPR (false positive rate).
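
As a rough illustration of this evaluation, the sketch below builds ROC points from scores, integrates them to get the AUC, and picks a threshold. The labels and scores are made up, and using Youden's J (TPR minus FPR) to pick the threshold is an assumption; the project may settle on a different criterion.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR, threshold) triples, predicting positive
    whenever score >= threshold, for every distinct score."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0, float("inf"))]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives, threshold))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0, _), (x1, y1, _) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def best_threshold(points):
    """Threshold maximising Youden's J = TPR - FPR (one possible criterion)."""
    return max(points[1:], key=lambda p: p[1] - p[0])[2]


# Made-up labels (1 = malign) and model scores, for illustration only.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
points = roc_points(y_true, scores)
roc_auc = auc(points)
threshold = best_threshold(points)
```

For a screening tool, a criterion that weights missed malign cases more heavily than false alarms may be preferable to the symmetric one used here.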

Heart Disease Prediction using Machine Learning Project

Buy Now ₹1501

Introduction

Heart disease is a term covering any disorder of the heart. It has become a major concern: studies show that the number of deaths due to heart disease in India has increased significantly over the past few decades, and it is now the leading cause of death in the country.

One study shows that from 1990 to 2016 the death rate due to heart disease in India increased by around 34 per cent, from 155.7 to 209.1 deaths per one lakh population.

Preventing heart disease has therefore become essential. Good data-driven systems for predicting heart disease can improve the entire research and prevention process, helping more people live healthy lives. This is where machine learning comes into play: it can help predict heart disease from patient data.

Problem Description :

The dataset contains information on 920 individuals. The problem is: based on the given information about each individual, predict whether that individual will suffer from heart disease.

Dataset :

The heart disease data set consists of patient data from Cleveland, Hungary, Long Beach and Switzerland. The combined dataset consists of 14 features and 916 samples with many missing values. The features used here are:

  1. Age : the age of the individual.
  2. Sex : the gender of the individual, in the following format : 1 = male, 0 = female.
  3. Chest-pain type : the type of chest pain experienced by the individual, in the following format : 1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic
  4. Resting Blood Pressure : the resting blood pressure of the individual in mmHg
  5. Serum Cholesterol : the serum cholesterol in mg/dl
  6. Fasting Blood Sugar : compares the fasting blood sugar of the individual with 120 mg/dl. If fasting blood sugar > 120 mg/dl then : 1 (true), else : 0 (false)
  7. Resting ECG : 0 = normal, 1 = having ST-T wave abnormality, 2 = left ventricular hypertrophy
  8. Max heart rate achieved : the maximum heart rate achieved by the individual.
  9. Exercise induced angina : 1 = yes, 0 = no
  10. ST depression induced by exercise relative to rest : an integer or float value.
  11. Peak exercise ST segment : 1 = upsloping, 2 = flat, 3 = downsloping
  12. Number of major vessels (0-3) colored by fluoroscopy : an integer or float value.
  13. Thal : the thalassemia status : 3 = normal, 6 = fixed defect, 7 = reversible defect
  14. Diagnosis of heart disease : whether the individual is suffering from heart disease or not : 0 = absence; 1, 2, 3, 4 = present.
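
A minimal sketch of how rows in this 14-column layout might be loaded, with missing values recorded and the diagnosis collapsed to a binary label. The three rows are made up, and the use of "?" as the missing-value marker is an assumption about the raw files rather than something stated above.

```python
import csv
import io

# Made-up rows in the 14-column order listed above.
# "?" as the missing-value marker is an assumption.
RAW = """63,1,1,145,233,1,2,150,0,2.3,3,0,6,0
67,1,4,160,286,0,2,108,1,1.5,2,3,3,2
41,0,2,130,?,0,0,172,0,1.4,1,?,?,3
"""


def parse_row(fields):
    """Convert one CSV row to (features, binary_label).

    Missing features become None. The diagnosis column (last field)
    is assumed never to be missing: 0 stays 0, and 1-4 all map to 1.
    """
    values = [None if f.strip() == "?" else float(f) for f in fields]
    features, diagnosis = values[:-1], values[-1]
    label = 0 if diagnosis == 0 else 1
    return features, label


rows = [parse_row(r) for r in csv.reader(io.StringIO(RAW)) if r]
labels = [label for _, label in rows]
missing_per_row = [sum(v is None for v in feats) for feats, _ in rows]
```

Keeping missing values as None rather than dropping the row leaves the choice of imputation strategy open for a later step.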

Technologies Used : -

  1. Python 3.7
  2. Pandas
  3. Numpy
  4. Flask

Running the web app

Locally

  • Install requirements
    pip install -r requirements.txt
  • Run flask web app
    python main_file.py

Models used and accuracy

A random forest classifier achieves an average multi-class classification accuracy of 56-60% (183 test samples). It achieves 75-80% average binary classification accuracy (heart disease or no heart disease).
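
The gap between the two figures is expected: a prediction of the wrong severity level counts as an error in the multi-class setting but as a hit in the binary one. The sketch below shows this on made-up labels (not the project's results).

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where prediction and truth agree."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def to_binary(labels):
    """Collapse diagnosis levels: 0 = absence, 1-4 = present."""
    return [0 if v == 0 else 1 for v in labels]


# Made-up diagnosis levels, for illustration only.
y_true = [0, 0, 1, 2, 3, 4, 0, 1]
y_pred = [0, 0, 2, 2, 1, 3, 1, 1]

multi_acc = accuracy(y_true, y_pred)
binary_acc = accuracy(to_binary(y_true), to_binary(y_pred))
```

Here three of the four multi-class errors confuse one severity level with another, so they disappear once the labels are collapsed.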

Diabetes Prediction using Machine Learning Project Code

Buy Now ₹1501

In this Diabetes Prediction using Machine Learning Project Code, the objective is to predict whether a person has diabetes based on features such as number of pregnancies, insulin level, age, and BMI. The data set used in this project is taken from Kaggle: "This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective of the dataset is to diagnostically predict whether or not a patient has diabetes, based on certain diagnostic measurements included in the dataset. Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage." The project uses a simple random forest classifier.

Learning Objectives : -

The following were the objectives of the project (the main intention was to create an end-to-end ML project):

  1. Data gathering
  2. Descriptive Analysis
  3. Data Visualizations
  4. Data Preprocessing
  5. Data Modelling
  6. Model Evaluation
  7. Model Deployment

Technical Aspect : -

  1. Training a machine learning model using scikit-learn.
  2. Building and hosting a Flask web app.
  3. The user enters details such as Number of Pregnancies, Insulin Level, Age, BMI, etc.
  4. Once all the fields are filled in, the prediction is displayed on a new page.

Technologies Used : -

  1. Python 3.7
  2. Pandas
  3. Numpy
  4. Flask

 

Datasets

https://www.kaggle.com/uciml/pima-indians-diabetes-database

 

Installation

  1. Download  and unzip it.
  2. After downloading, cd into the flask directory.
  3. Begin a new virtual environment with Python 3 and activate it.
  4. Install the required packages using pip install -r requirements.txt

RUN

  1. Execute the command: python app.py

Black Friday Sales Prediction project with source code

Buy Now ₹1501

In this project, we predict how much customers will spend during Black Friday, using features such as age, gender, and marital status. The dataset we use is the Black Friday dataset from Kaggle, which contains 550,068 rows and 12 features. We follow all the steps of a data science lifecycle, from data collection to model deployment.

This project contains a Jupyter notebook used to train a CatBoostRegressor model that predicts the amount of sales on Black Friday based on several features.
The model was then integrated into a Flask web application.

Technologies Used

Web Technologies

HTML, CSS, JavaScript, Bootstrap, Django

Machine Learning Library In Python

NumPy, Pandas, SciPy
matplotlib
scikit-learn
seaborn

Dataset Link: https://www.kaggle.com/c/black-friday/data

Training Model File 

model.ipynb

model-checkpoint.ipynb

Output Generated File

catBoost.pkl

Read Before Purchase  :

  1. One Time Free Installation Support.
  2. Terms and Conditions on this page: https://projectworlds/terms
  3. We offer Paid Customization installation Support
  4.  If you have any questions please contact  Support Section
  5. Please note that any digital products presented on the website do not contain malicious code, viruses or advertising. You buy the original files from the developers. We do not sell any products downloaded from other sites.
  6. You can download the product after the purchase by a direct link on this page.

 

 

 

Fake Product Review Detection using Machine Learning

Buy Now ₹1501

Online reviews play a very important role in decision-making in today's e-commerce. A large part of the population, i.e. customers, read product or store reviews before deciding what to buy, where to buy, and whether to buy at all. Because writing fake or fraudulent reviews brings monetary gain, online review websites have seen a huge increase in deceptive opinion spam. An untruthful review is variously called a fake review, a fraudulent review, or opinion spam. Positive reviews of a target object can attract more customers and increase sales; negative reviews of a target object can result in lower demand and lower sales. Fake review detection has attracted considerable attention in recent years. Most review sites, however, still do not filter fake reviews publicly. Yelp is an exception that has filtered reviews over the past few years. Yelp's algorithm, however, is a business secret. In this work, by analyzing Yelp's filtered reviews, we try to find out what Yelp might be doing. The results will be useful to other review-hosting sites in their filtering efforts. Filtering has two main approaches: supervised and unsupervised learning. The features used also fall into roughly two types: linguistic and behavioral. Using a supervised learning approach, we built a model that can identify fake reviews with almost 70 percent accuracy.

As the Internet continues to grow in size and importance, the quantity and impact of online reviews is increasing continuously. Reviews can influence people across a wide range of industries, but they are particularly important in e-commerce, where comments and reviews on products and services are often the most convenient, if not the only, way for a buyer to decide whether to buy them.

 

Model training

Refer to the Jupyter notebooks in the research folder for the preprocessing steps, model development, and algorithms used. Although we experimented with different models, we found Naive Bayes to be the most accurate, with an F1 score of 77%.
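
To give a feel for the approach, here is a minimal multinomial Naive Bayes classifier with Laplace smoothing, plus an F1 computation. It is a sketch under stated assumptions and not the notebooks' code: the tokenisation is plain whitespace splitting, and the reviews and labels are made up.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Count words per class on whitespace-tokenised, lower-cased text."""
    word_counts = {c: Counter() for c in set(labels)}
    class_counts = Counter(labels)
    for text, label in zip(docs, labels):
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Made-up reviews, for illustration only.
train_docs = [
    "best product ever buy now amazing amazing",
    "amazing deal best ever must buy",
    "battery lasts two days screen is sharp",
    "arrived late but works as described",
]
train_labels = ["fake", "fake", "real", "real"]
model = train_nb(train_docs, train_labels)

test_docs = ["amazing best buy ever", "screen works battery is fine"]
test_labels = ["fake", "real"]
predictions = [predict_nb(model, d) for d in test_docs]
f1 = f1_score(test_labels, predictions, positive="fake")
```

F1 is a reasonable headline metric here because fake and real reviews are rarely balanced, and accuracy alone can hide poor recall on the fake class.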

Installing and running this app:

  1. Requirements: use pip install or conda install to install the following packages
  • NumPy, pandas
  • scikit-learn (imported as sklearn)
  • spaCy
  • Django 2.1
  • pickle (part of the Python standard library, no installation needed)
  • tqdm
  2. Running the app:
  • Go to the folder containing manage.py and run the command: python manage.py runserver
  • Once the server starts, open a browser. The app runs on http://127.0.0.1:8000/
  • fake_reviews.txt and real_reviews.txt contain some reviews that can be used to test the model.

Fake News Detection using Machine Learning Natural Language Processing

Buy Now ₹1501

An NLP and machine learning based web application for detecting fake news. It uses NLP to preprocess the input text and an XGBoost model to predict whether the input news is fake or real.

There are tons of news articles where the news is fake or made up. With numerous advances in Natural Language Processing and machine learning, we can actually build an ML model that is able to detect whether a piece of news ... Here we'll be using artificial neural network models to verify the genuineness of the article.
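
The text preprocessing step can be sketched roughly as below: lower-casing, stripping everything that is not a letter, and dropping stop words. The stop-word list and the function name are illustrative assumptions; the actual notebooks may use a fuller list, stemming, or a library tokenizer.

```python
import re

# A tiny illustrative stop-word list; real projects typically use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "was", "of", "to", "in", "and", "that"}


def preprocess(text):
    """Lower-case, replace non-letters with spaces, drop stop words."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


tokens = preprocess("BREAKING: The Moon is made of cheese, scientists say!!!")
```

The resulting token list is what a vectorizer (bag of words, TF-IDF, or an embedding layer) would consume before the classifier sees the article.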

Technologies Used

Web Technologies

HTML, CSS, JavaScript, Bootstrap, Django

Machine Learning Library In Python3

NumPy, Pandas, SciPy
matplotlib
scikit-learn
seaborn

Database

SQLite

Dataset Link: https://www.kaggle.com/c/fake-news/data

Training Model File 

Fake_News_Classifier_Using_LSTM.ipynb

Fake_News_Classifier_using_Machine_Learning.ipynb

Output Generated File

xgb_fake_news_predictor.pkl

 

 

Driver Distraction Prediction Using Deep Learning, Machine Learning

Buy Now ₹1501

In this project, we are given driver images, each taken in a car with the driver doing something (texting, eating, talking on the phone, applying makeup, reaching behind, etc.). The goal is to predict the likelihood of what the driver is doing in each picture.

Driving a car is a complex task, and it requires complete attention. Distracted driving is any activity that takes the driver's attention away from the road. Several studies have identified three main types of distraction: visual distractions (driver's eyes off the road), manual distractions (driver's hands off the wheel) and cognitive distractions (driver's mind off the driving task).

Dataset details -

  • Image Size - 480 X 640 pixels
  • Training Images count - 22424 images
  • Test Images count - 79726 images
  • Image type - RGB
  • Image field of view - Dashboard images with view of Driver and passenger
  • The 10 classes to predict are:
    • c0: safe driving
    • c1: texting - right
    • c2: talking on the phone - right
    • c3: texting - left
    • c4: talking on the phone - left
    • c5: operating the radio
    • c6: drinking
    • c7: reaching behind
    • c8: hair and makeup
    • c9: talking to passenger
  • Loss - multi-class logarithmic loss
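
For reference, multi-class logarithmic loss is the mean negative log of the probability the model assigns to the true class. The sketch below is a plain-Python version; rescaling each row to sum to 1 and clipping probabilities to avoid log(0) follow common practice, and the exact clipping constant is an assumption.

```python
import math


def multiclass_log_loss(y_true, probabilities, eps=1e-15):
    """Mean negative log-probability of the true class.

    y_true: list of class indices (0-9 for c0-c9).
    probabilities: one list of class probabilities per sample.
    """
    total = 0.0
    for label, probs in zip(y_true, probabilities):
        p = probs[label] / sum(probs)      # rescale the row to sum to 1
        p = min(max(p, eps), 1 - eps)      # clip to avoid log(0)
        total += math.log(p)
    return -total / len(y_true)


# A model that is clueless (uniform over 10 classes) ...
uniform = [[0.1] * 10 for _ in range(3)]
loss_uniform = multiclass_log_loss([0, 6, 9], uniform)

# ... versus one that puts 0.91 on the true class c6.
confident = [[0.91 if k == 6 else 0.01 for k in range(10)]]
loss_confident = multiclass_log_loss([6], confident)
```

A uniform guess over 10 classes scores ln(10), about 2.30, which is a useful baseline when reading leaderboard values.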

State-Farm-Distracted-Driver-Detection

Kaggle hosted this challenge a few years ago; it focused on identifying distracted drivers using computer vision.
Details of the challenge can be found here:
https://www.kaggle.com/c/state-farm-distracted-driver-detection

Implementation Details

  • DL Model - CNN built from scratch (6 Conv layers, 5 Dropout layers, 3 Dense layers)
  • Framework - Keras (PyTorch version in progress)
  • CNN Model Visualization/Model Interpretability - Grad-CAM
  • Final Accuracy - Train acc: 99.06%, Val acc: 99.46%

GRAD-CAM implementation for a test image with label drinking

Grad-CAM is a technique for showing how a model classifies new instances: it creates a heat map that highlights the areas that contributed most to the prediction.
As seen in the image below, the model classifies the driver as distracted by drinking by highlighting the hand and glass.
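
The core Grad-CAM computation can be sketched in a few lines, assuming the feature maps of the last convolutional layer and the gradients of the class score with respect to them have already been extracted from the network (that extraction is framework-specific and not shown). The tiny 2 x 2 maps are made up.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps (each H x W, nested lists) into one heat map.

    gradients[k][i][j] is d(class score) / d(activations[k][i][j]).
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = gradient averaged over all spatial positions.
    weights = [sum(map(sum, g)) / (height * width) for g in gradients]
    # Weighted sum of feature maps, followed by ReLU.
    cam = [
        [max(0.0, sum(w * a[i][j] for w, a in zip(weights, activations)))
         for j in range(width)]
        for i in range(height)
    ]
    # Scale to [0, 1] so the map can be overlaid on the image.
    peak = max(max(row) for row in cam)
    if peak > 0:
        cam = [[v / peak for v in row] for row in cam]
    return cam


# Made-up example with two channels.
activations = [
    [[1.0, 0.0], [0.0, 0.0]],        # channel 0 fires top-left
    [[0.0, 0.0], [0.0, 2.0]],        # channel 1 fires bottom-right
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],        # channel 0 supports the class
    [[-1.0, -1.0], [-1.0, -1.0]],    # channel 1 argues against it
]
heatmap = grad_cam(activations, gradients)
```

The ReLU is what keeps only the regions that push the score for the chosen class up, which is why the map lights up the hand and glass rather than everything the network responds to.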

 

Multiple Disease Prediction using Machine Learning

Buy Now ₹1501

This web app was developed using the Python Flask web framework. The models used to predict the diseases were trained on large datasets. All the links for the datasets and the Python notebooks used for model creation are mentioned below in this readme. The web app can predict the following diseases:

  • Diabetes
  • Breast Cancer
  • Heart Disease
  • Kidney Disease
  • Liver Disease
  • Malaria
  • Pneumonia

Models with their Accuracy of Prediction

Disease | Type of Model | Accuracy
Diabetes | Machine Learning Model | 98.25%
Breast Cancer | Machine Learning Model | 98.25%
Heart Disease | Machine Learning Model | 85.25%
Kidney Disease | Machine Learning Model | 99%
Liver Disease | Machine Learning Model | 78%
Malaria | Deep Learning Model (CNN) | 96%
Pneumonia | Deep Learning Model (CNN) | 95%

 

Steps to run the WebApp in local Computer

Step-1: Download the files in the repository.
Step-2: Get into the downloaded folder, open command prompt in that directory and install all the dependencies using following command

pip install -r requirements.txt

Step-3: After successful installation of all the dependencies, run the following command

python app.py

Dataset Links

All the datasets were taken from Kaggle.

Crime Data Analysis Project in Machine Learning

Buy Now ₹1501

Crime analysis is one of the important applications of data mining. Data mining comprises many tasks and techniques, including classification, association, clustering, and prediction, each of which has its own importance and applications. It can help analysts identify crimes faster and make faster decisions.
The main objective of crime analysis is to extract meaningful information from large amounts of data and disseminate this information to officers and investigators in the field, to assist in their efforts to apprehend criminals and suppress criminal activity. In this project, K-means clustering is used for crime data analysis.
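
As an illustration of the clustering step, here is a small K-means (Lloyd's algorithm) in plain Python. The district features are hypothetical and the deterministic initialisation (first k points) is chosen only to keep the sketch reproducible; the project itself most likely relies on scikit-learn.

```python
def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on a list of equal-length numeric tuples.

    Initial centroids are simply the first k points (an assumption made
    for reproducibility; random or k-means++ seeding is more usual).
    """
    centroids = [tuple(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        assignments = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                new_centroids.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster in place
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, assignments


# Hypothetical features per district: (thefts per month, assaults per month).
districts = [(2, 1), (30, 25), (3, 2), (28, 27), (1, 1), (32, 26)]
centroids, labels = kmeans(districts, k=2)
```

On this toy data the two clusters separate low-crime from high-crime districts, which is the kind of grouping an analyst would then inspect by hand.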

Technologies Used

Web Technologies

HTML, CSS, JavaScript, Bootstrap, Django

Machine Learning Library In Python

NumPy, Pandas, SciPy
matplotlib
scikit-learn
seaborn

Database

SQLite

Movie Recommendation System Project Using Collaborative Filtering, Python Django, Machine Learning

Buy Now ₹2501

(Note: the project includes complete source code and database, plus documentation, synopsis, and report.)

Recommender systems are one of the most successful and widespread applications of machine learning technologies in business. You can find large-scale recommender systems in retail, video on demand, and music streaming.

A web-based user-item movie recommendation engine using collaborative filtering by a matrix factorization algorithm. The recommendation is based on the underlying concept that if two people both liked certain common movies, then the movies that one person has liked and the other has not yet watched can be recommended to the other person.
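
The underlying idea can be sketched with a simple memory-based (user-user) recommender: find how similar other users are, then score unseen movies by similarity-weighted ratings. This is an illustrative stand-in, not the project's matrix factorization code; the users, ratings, and the choice of an unnormalised weighted sum are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two users' rating dicts (unrated = 0)."""
    dot = sum(u[m] * v[m] for m in set(u) & set(v))
    norm_u = math.sqrt(sum(r * r for r in u.values()))
    norm_v = math.sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(ratings, user, top_n=2):
    """Rank movies the user has not rated by similarity-weighted ratings."""
    seen = ratings[user]
    scores = {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        similarity = cosine(seen, other_ratings)
        for movie, rating in other_ratings.items():
            if movie not in seen:
                scores[movie] = scores.get(movie, 0.0) + similarity * rating
    # Highest score first; ties broken alphabetically.
    return sorted(scores, key=lambda m: (-scores[m], m))[:top_n]


# Made-up ratings on a 1-5 scale.
ratings = {
    "alice": {"Inception": 5, "Interstellar": 5, "Titanic": 1},
    "bob": {"Inception": 5, "Interstellar": 4, "Tenet": 5, "Titanic": 1},
    "carol": {"Titanic": 5, "The Notebook": 5, "Inception": 1},
}
suggestions = recommend(ratings, "alice")
```

Bob's tastes match Alice's closely, so his unseen favourite ranks above Carol's, even though both gave their movie the same rating.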

A recommender system is a type of information filtering system that recommends movies to users according to their area of interest. Our recommender system provides personalized information by learning the user's interests from previous interactions with that user[2]. In pattern recognition, the k-nearest neighbours algorithm (k-NN) is a flexible method used for classification. In this case, the input consists of the k closest examples in the given space. If k = 1, then the object is simply assigned to the class of that single nearest neighbour.
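
A minimal k-NN classifier along these lines might look as follows. The two-dimensional features and labels are hypothetical, and squared Euclidean distance is an assumed choice of metric.

```python
from collections import Counter


def knn_classify(train, query, k=3):
    """Majority vote among the k training examples closest to `query`.

    train: list of (feature_tuple, label) pairs.
    """
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features: (share of action scenes, share of romance scenes).
train = [
    ((0.9, 0.1), "action"),
    ((0.8, 0.2), "action"),
    ((0.2, 0.9), "romance"),
    ((0.1, 0.8), "romance"),
    ((0.3, 0.7), "romance"),
]
label_k1 = knn_classify(train, (0.85, 0.15), k=1)
label_k3 = knn_classify(train, (0.25, 0.8), k=3)
```

With k = 1 the query simply takes the label of its single nearest neighbour, as described above; larger k smooths out noisy neighbours at the cost of blurring class boundaries.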

Algorithms Implemented 

  • Content based filtering
  • Collaborative Filtering
    • Memory based collaborative filtering
      • User-Item Filtering
      • Item-Item Filtering
    • Model based collaborative filtering
      • Singular Value Decomposition (SVD)
      • SVD++
  • Hybrid Model
    • Content Based + SVD

Technologies Used

Web Technologies

HTML, CSS, JavaScript, Bootstrap, Django

Machine Learning Library In Python3

NumPy, Pandas, SciPy

Database

SQLite

Requirements
python 3.6

pip3

virtualenv
