Online reviews play an important role in purchasing decisions in today's e-commerce. A large share of customers read product or store reviews before deciding what to buy, where to buy it, and whether to buy at all. Because writing fake or fraudulent reviews brings monetary gain, online review websites have seen a huge increase in deceptive opinion spam. An untruthful review is variously called a fake review, a fraudulent review, or opinion spam. Positive reviews of a target object can attract more customers and increase sales; negative reviews can lower demand and sales. Fake review detection has attracted considerable attention in recent years. Most review sites, however, still do not publicly filter fake reviews. Yelp is an exception: it has filtered reviews for the past few years. Yelp's algorithm, however, is a trade secret. In this work, we analyze Yelp's filtered reviews to infer what its filter might be doing. The results should be useful to other review hosting sites in their own filtering efforts. There are two main approaches to filtering: supervised and unsupervised learning. The features used also fall into two broad types: linguistic features and behavioral features. Using a supervised learning approach, we built a model that identifies fake reviews with almost 70 percent accuracy.
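To make the distinction between linguistic and behavioral characteristics (features) concrete, here is a minimal sketch in plain Python. The specific features, the function names `linguistic_features` and `behavioral_features`, and the input shapes are illustrative assumptions, not necessarily the ones used in this project; the notebooks in the research folder are the reference for what was actually done.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical word list used only for this illustration.
FIRST_PERSON = {"i", "me", "my", "we", "our", "us"}


def linguistic_features(text):
    """Features computed from the review text alone."""
    words = re.findall(r"[a-z']+", text.lower())
    n_words = len(words) or 1  # avoid division by zero on empty text
    return {
        "word_count": len(words),
        "exclamation_count": text.count("!"),
        "first_person_ratio": sum(w in FIRST_PERSON for w in words) / n_words,
    }


def behavioral_features(reviews):
    """Features computed from one reviewer's activity.

    `reviews` is assumed to be a non-empty list of (date, star_rating) pairs.
    """
    per_day = Counter(day for day, _ in reviews)
    ratings = [rating for _, rating in reviews]
    return {
        "max_reviews_per_day": max(per_day.values()),
        "positive_ratio": sum(r >= 4 for r in ratings) / len(ratings),
    }


# Made-up example inputs.
text_feats = linguistic_features("I loved it! My family loved it!")
activity_feats = behavioral_features(
    [(date(2020, 1, 1), 5), (date(2020, 1, 1), 5), (date(2020, 1, 2), 2)]
)
print(text_feats)
print(activity_feats)
```

Linguistic features need only the review text, while behavioral features need metadata about the reviewer, which a site may or may not expose.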
As the Internet continues to grow in size and importance, the quantity and impact of online reviews continue to increase. Reviews can influence people across a wide range of industries, but they are particularly important in e-commerce, where comments and reviews on products and services are often the most convenient, if not the only, way for a buyer to decide whether to buy them.
Refer to the Jupyter notebooks in the research folder for the preprocessing steps, model development, and algorithms used. Although we experimented with different models, we found Naive Bayes to be the most accurate, with an F1 score of 77%.
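For readers unfamiliar with the technique, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing, plus an F1 computation, written in plain Python. It is not the project's implementation (see the notebooks for that): the class `NaiveBayes`, the helpers `tokenize` and `f1_score`, and the four training reviews are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def f1_score(y_true, y_pred, positive="fake"):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy training data, invented for this example only.
train_texts = [
    "best place ever amazing amazing must visit",
    "absolutely perfect best service ever amazing",
    "the pasta was fine but the wait was long",
    "decent coffee, parking was a bit hard to find",
]
train_labels = ["fake", "fake", "real", "real"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("amazing best ever"))
print(model.predict("the wait was long"))
```

F1 is reported rather than accuracy alone because it balances precision and recall on the fake class, which matters when fake and real reviews are not equally frequent.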
Installing and running this app:
- Requirements: use pip install or conda install to install the following packages:
  - NumPy, pandas
  - Django 2.1
- Running the app:
  - Go to the folder containing manage.py and run the command: python manage.py runserver
  - Once the server starts, open a browser. The app runs on http://127.0.0.1:8000/
- fake_reviews.txt and real_reviews.txt contain some reviews that can be used to test the model.