The public puts its faith in Google reviews, but how deserving is the system of that faith? There have long been accusations that reviews and listings might not always be what they seem.
How prevalent is this problem, and what can be done about it? To answer those questions, I set out to sample local businesses and see what patterns emerge. What signals can we analyze to understand what a normal listing looks like? If we can characterize normal, we have a baseline for spotting the suspicious.
My sample comprises 13,591 reviews across 218 dental offices. I'll examine an array of factors, such as rating distributions, how many reviews a user typically has, the velocity of incoming reviews, and more. Hopefully, with enough signals to measure, a business faking its reviews will stand out like a sore thumb.
This post follows my own progress as I discover what does and doesn't work. If you'd like to skip to the more effective techniques, take a look toward the bottom of the post.
To begin with, how are ratings distributed on the 1 to 5 star spectrum? Taking all of the reviews I have available, I'll first generate a box plot to show medians and outliers for each star rating.
Let's take a look at how ratings are distributed for dental practices.
From this I can tell that spotting suspicious behavior from ratings alone will be difficult. Near-perfect 5 star ratings are common for dental practices, so this isn't a signal we can put much weight on while searching for fakes.
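The box plot above can be reproduced from its underlying numbers: for each practice, compute the share of its reviews at each star level, then summarize those shares across practices. A minimal sketch, assuming a `reviews` DataFrame with hypothetical `practice_id` and `stars` columns (the real data and column names may differ):

```python
import pandas as pd

# Toy stand-in for the scraped review data.
reviews = pd.DataFrame({
    "practice_id": ["a"] * 6 + ["b"] * 4,
    "stars":       [5, 5, 5, 4, 3, 1, 5, 5, 4, 2],
})

# Share of each star rating within each practice, as percentages.
dist = (
    reviews.groupby("practice_id")["stars"]
    .value_counts(normalize=True)
    .unstack(fill_value=0)
    .reindex(columns=[1, 2, 3, 4, 5], fill_value=0)
    * 100
)

# Median share per star level across practices -- the centers of the boxes.
medians = dist.median()
print(medians.round(1))
```

Feeding each column of `dist` to a box plot gives one box per star rating, with per-practice percentages as the data points.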
The idea behind review velocity is that we can examine the quantity of reviews over time and hopefully spot outliers. This can't be an exact measurement because Google doesn't share precise publication times for reviews. We can still come up with a rough sense of how quickly reviews are appearing, even if we have to rely on the vague time ranges Google does provide.
An important note here: For listings with hundreds of reviews, I was able to analyze up to 110 of the newest (this explains the vertical band of points at the end).
The Y axis, or velocity, shows the median of how many reviews a practice receives daily.
There are some pretty clear outliers here. Before looking deeper, I would guess that at least some of these are legitimate, simply because they're large practices receiving more reviews.
Some businesses may also be actively soliciting reviews, a practice that ranges from simply asking patients to leave one to running automated systems that push for them. I do think these outliers are worth looking into, however, and next we'll see if we can cross-compare them with other metrics.
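Since Google only exposes coarse relative timestamps ("a week ago", "3 months ago"), velocity has to be estimated from those. A rough sketch of the idea, using assumed day-equivalents and an illustrative `ages` input (not the actual scraped labels):

```python
import re

# Assumed rough day-equivalents for Google's relative time labels.
APPROX_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

def age_in_days(label: str) -> int:
    """Convert strings like 'a month ago' or '3 years ago' to rough days."""
    m = re.match(r"(an?|\d+)\s+(day|week|month|year)", label)
    if not m:
        raise ValueError(f"unrecognized label: {label!r}")
    count = 1 if m.group(1) in ("a", "an") else int(m.group(1))
    return count * APPROX_DAYS[m.group(2)]

def velocity(ages: list[str]) -> float:
    """Reviews per day over the span covered by the sampled reviews."""
    days = [age_in_days(a) for a in ages]
    return len(days) / max(days)

ages = ["2 days ago", "a week ago", "3 weeks ago", "2 months ago"]
print(round(velocity(ages), 3))  # 4 reviews over roughly 60 days
```

Because the labels are so coarse, this is only good for ranking practices against each other, not for precise rates.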
Review Word Count
The word count of each review is another factor to take into account. My hypothesis is that fake reviews will tend to be shorter. Because review velocity revealed some interesting outliers, I decided to add word count as an extra dimension to that graph. The color of each point represents the median word count of reviews for that dental practice.
Shading each point according to its word count reinforces how unusual some of these practices are. Inspecting a few of them, I can see that they have an outsized number of 5 star reviews with empty comments.
This sample is from a practice that has 625 reviews, with a review velocity of 0.61 new reviews per day, and a median word count of just 1 word. For comparison, the median review velocity is 0.03, and the median word count is 25 across all practices.
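The word-count metric itself is simple to compute. A sketch, again assuming hypothetical `practice_id` and `text` columns on the review data:

```python
import pandas as pd

# Toy stand-in: practice "b" looks like the outlier described above,
# with one-word and empty reviews.
reviews = pd.DataFrame({
    "practice_id": ["a", "a", "a", "b", "b", "b"],
    "text": [
        "Great dentist, friendly staff, highly recommend to anyone.",
        "Dr. Smith was thorough and the cleaning was painless.",
        "Good",
        "Great",
        "Nice",
        "",
    ],
})

# Whitespace-delimited word count; empty comments count as 0 words.
reviews["word_count"] = reviews["text"].str.split().str.len()
median_wc = reviews.groupby("practice_id")["word_count"].median()
print(median_wc)
```

The per-practice median is what drives the color of each point in the velocity scatter plot.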
The next metric is the percentage of a practice's reviewers who have written only one review anywhere on Google. The median percentage, per dental practice, of reviewers with a single review is 47.36%. Let's take a look at a box plot of this data to get a sense of what's normal and what isn't.
This plot shows that a dental practice where roughly 60% or more of reviewers are one-time reviewers is an outlier.
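The ~60% line falls out of the standard box-plot outlier rule (Tukey's upper fence, Q3 + 1.5 × IQR). A sketch of how that threshold would be derived, with `reviewer_total` (the reviewer's overall Google review count) as an assumed column and toy values:

```python
import pandas as pd

reviews = pd.DataFrame({
    "practice_id":    ["a", "a", "a", "b", "b", "c", "c", "c", "c"],
    "reviewer_total": [1, 4, 1, 1, 1, 2, 7, 1, 3],
})

# Per-practice share of reviewers whose only review is this one.
one_time_pct = (
    reviews.assign(one_time=reviews["reviewer_total"].eq(1))
    .groupby("practice_id")["one_time"]
    .mean()
    * 100
)

# Tukey upper fence: anything above this counts as a box-plot outlier.
q1, q3 = one_time_pct.quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)
print(one_time_pct.round(1), round(upper_fence, 2))
```

With the real 218-practice data, this fence is what lands near the 60% mark; the toy numbers here only illustrate the mechanics.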
If we tie this new metric back to the previous example, we can see a pattern of outlier behaviors emerging for that specific practice:
- Review velocity of 0.61 new reviews per day (0.03 is the global median)
- A median word count of 1 word (25 words is the global median)
- For 63.63% of its reviewers, this is their only review (47.36% is the global median)
The highest percentage example I've found has 75% of its reviews coming from one-time reviewers. This same practice also shows a review velocity of 0.91 and a median word count of a single word. The business also holds a perfect 5.0 rating with more than 362 reviews.
Here's an example from their review page.
My theory is that having an abnormal number of one-time reviewers, especially coupled with high review velocity and low word counts, indicates suspicious behavior on that listing.
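That theory can be expressed as a simple combined rule: flag a practice only when all three signals are extreme at once. A sketch where the thresholds echo the medians quoted above but are judgment calls, not fitted values, and the data is illustrative:

```python
import pandas as pd

# Per-practice summary metrics; "b" mirrors the suspicious example above.
practices = pd.DataFrame({
    "practice_id":  ["a", "b", "c"],
    "velocity":     [0.02, 0.61, 0.10],   # reviews per day
    "median_words": [30, 1, 22],
    "one_time_pct": [45.0, 63.6, 52.0],
})

practices["suspicious"] = (
    (practices["velocity"] > 0.30)        # ~10x the 0.03 global median
    & (practices["median_words"] <= 5)    # far below the 25-word median
    & (practices["one_time_pct"] >= 60)   # above the ~60% outlier line
)
print(practices[practices["suspicious"]]["practice_id"].tolist())  # ['b']
```

Requiring all three conditions keeps large-but-legitimate practices (high velocity, normal word counts) from being flagged.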
What about listings with higher than average review counts? I thought I might find paid reviewers with obviously suspicious review histories, but so far I haven't seen that. Then again, I haven't had time to delve into review histories across 13,000+ reviews, so this kind of fraud may exist and I just haven't come across it yet.
For a future post I'd like to expand on this with even more metrics. I think there are other interesting questions to pose, such as how many reviewers have a generic profile photo versus a custom photo? Do lazy review fakers re-use text across their fake reviews?
I'd also be interested in redoing this analysis for different categories, such as law firms, to see how the trends differ. Stay tuned!