Is there an Amazon.com API to retrieve product reviews?

Amazon Web-Services

Amazon Web-Services Problem Overview


Do any of the AWS APIs/Services provide access to the product reviews for items sold by Amazon? I'm interested in looking up reviews by (ASIN, user_id) tuple. I can see that the Product Advertising API returns a URL to a page (for embedding in an IFRAME) containing the URLs, but I am interested in a machine-readable format of the review data, if possible.

Amazon Web-Services Solutions


Solution 1 - Amazon Web-Services

Update 2:

Please see @jpillora's comment. It's probably the most relevant regarding Update 1.

> I just tried out the Product Advertising API (as of 2014-09-17), it seems that this API only returns a URL pointing to an iframe containing just the reviews. I guess you'd have to screen scrape - though I imagine that would break Amazon's TOS.

Update 1:

Maybe. I wrote the original answer below earlier. I don't have time to look into this right now because I'm no longer on a project concerned with Amazon reviews, but their webpage at Product Advertising API states "The Product Advertising API helps you advertise Amazon products using product search and look up capability, product information and features such as Customer Reviews..." as of 2011-12-08. So I hope someone looks into it and posts back here; feel free to edit this answer.

Original:

Nope.

Here is an intersting forum discussion about the fact including theories as to why: http://forums.digitalpoint.com/showthread.php?t=1932326

If I'm wrong, please post what you find. I'm interested in getting the reviews content, as well as allowing submitting reviews to Amazon, if possible.

You might want to check this link: http://reviewazon.com/. I just stumbled across it and haven't looked into it, but I'm surprised I don't see any mention on their site about the update concerning the drop of Reviews from the Amazon Products Advertising API posted at: https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html

Solution 2 - Amazon Web-Services

Here's my quick take on it - you easily can retrieve the reviews themselves with a bit more work:

countries=['com','co.uk','ca','de']
books=[
        '''http://www.amazon.%s/Glass-House-Climate-Millennium-ebook/dp/B005U3U69C''',
        '''http://www.amazon.%s/The-Japanese-Observer-ebook/dp/B0078FMYD6''',
        '''http://www.amazon.%s/Falling-Through-Water-ebook/dp/B009VJ1622''',
      ]
import urllib2;
for book in books:
    print '-'*40
    print book.split('%s/')[1]
    for country in countries:
        asin=book.split('/')[-1]; title=book.split('/')[3]
        url='''http://www.amazon.%s/product-reviews/%s'''%(country,asin)
        try: f = urllib2.urlopen(url)
        except: page=""
        page=f.read().lower(); print '%s=%s'%(country, page.count('member-review'))
print '-'*40

Solution 3 - Amazon Web-Services

According to Amazon Product Advertizing API License Agreement (https://affiliate-program.amazon.com/gp/advertising/api/detail/agreement.html) and specifically it's point 4.b.iii:

You will use Product Advertising Content only ... to send end users to and drive sales on the Amazon Site.

which means it's prohibited for you to show Amazon products reviews taken trough their API to sale products at your site. It's only allowed to redirect your site visitors to Amazon and get the affiliate commissions.

Solution 4 - Amazon Web-Services

I would use something like the answer of @mfs above. Unfortunately, his/her answer would only work for up to 10 reviews, since this is the maximum that can be displayed on one page.

You may consider the following code:

import requests

nreviews_re = {'com': re.compile('\d[\d,]+(?= customer review)'), 
               'co.uk':re.compile('\d[\d,]+(?= customer review)'),
               'de': re.compile('\d[\d\.]+(?= Kundenrezens\w\w)')}
no_reviews_re = {'com': re.compile('no customer reviews'), 
                 'co.uk':re.compile('no customer reviews'),
                 'de': re.compile('Noch keine Kundenrezensionen')}

def get_number_of_reviews(asin, country='com'):                                 
    url = 'http://www.amazon.{country}/product-reviews/{asin}'.format(country=country, asin=asin)
    html = requests.get(url).text
    try:
        return int(re.compile('\D').sub('',nreviews_re[country].search(html).group(0)))
    except:
        if no_reviews_re[country].search(html):
            return 0
        else:
            return None  # to distinguish from 0, and handle more cases if necessary

Running this with 1433524767 (which has significantly different number of reviews for the three countries of interest) I get:

>> print get_number_of_reviews('1433524767', 'com')
3185
>> print get_number_of_reviews('1433524767', 'co.uk')
378
>> print get_number_of_reviews('1433524767', 'de')
16

Hope it helps

Solution 5 - Amazon Web-Services

As said by others above, amazon has discontinued providing reviews in its api. Howevever, i found this nice tutorial to do the same with python. Here is the code he gives, works for me! He uses python 2.7

#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Written as part of https://www.scrapehero.com/how-to-scrape-amazon-product-reviews-using-python/		
from lxml import html  
import json
import requests
import json,re
from dateutil import parser as dateparser
from time import sleep

def ParseReviews(asin):
	#This script has only been tested with Amazon.com
	amazon_url  = 'http://www.amazon.com/dp/'+asin
	# Add some recent user agent to prevent amazon from blocking the request 
	# Find some chrome user agent strings  here https://udger.com/resources/ua-list/browser-detail?browser=Chrome
	headers = {'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.90 Safari/537.36'}
	page = requests.get(amazon_url,headers = headers).text

	parser = html.fromstring(page)
	XPATH_AGGREGATE = '//span[@id="acrCustomerReviewText"]'
	XPATH_REVIEW_SECTION = '//div[@id="revMHRL"]/div'
	XPATH_AGGREGATE_RATING = '//table[@id="histogramTable"]//tr'
	XPATH_PRODUCT_NAME = '//h1//span[@id="productTitle"]//text()'
	XPATH_PRODUCT_PRICE  = '//span[@id="priceblock_ourprice"]/text()'
	
	raw_product_price = parser.xpath(XPATH_PRODUCT_PRICE)
	product_price = ''.join(raw_product_price).replace(',','')

	raw_product_name = parser.xpath(XPATH_PRODUCT_NAME)
	product_name = ''.join(raw_product_name).strip()
	total_ratings  = parser.xpath(XPATH_AGGREGATE_RATING)
	reviews = parser.xpath(XPATH_REVIEW_SECTION)

	ratings_dict = {}
	reviews_list = []

	#grabing the rating  section in product page
	for ratings in total_ratings:
		extracted_rating = ratings.xpath('./td//a//text()')
		if extracted_rating:
			rating_key = extracted_rating[0] 
			raw_raing_value = extracted_rating[1]
			rating_value = raw_raing_value
			if rating_key:
				ratings_dict.update({rating_key:rating_value})

	#Parsing individual reviews
	for review in reviews:
		XPATH_RATING  ='./div//div//i//text()'
		XPATH_REVIEW_HEADER = './div//div//span[contains(@class,"text-bold")]//text()'
		XPATH_REVIEW_POSTED_DATE = './/a[contains(@href,"/profile/")]/parent::span/following-sibling::span/text()'
		XPATH_REVIEW_TEXT_1 = './/div//span[@class="MHRHead"]//text()'
		XPATH_REVIEW_TEXT_2 = './/div//span[@data-action="columnbalancing-showfullreview"]/@data-columnbalancing-showfullreview'
		XPATH_REVIEW_COMMENTS = './/a[contains(@class,"commentStripe")]/text()'
		XPATH_AUTHOR  = './/a[contains(@href,"/profile/")]/parent::span//text()'
		XPATH_REVIEW_TEXT_3  = './/div[contains(@id,"dpReviews")]/div/text()'
		raw_review_author = review.xpath(XPATH_AUTHOR)
		raw_review_rating = review.xpath(XPATH_RATING)
		raw_review_header = review.xpath(XPATH_REVIEW_HEADER)
		raw_review_posted_date = review.xpath(XPATH_REVIEW_POSTED_DATE)
		raw_review_text1 = review.xpath(XPATH_REVIEW_TEXT_1)
		raw_review_text2 = review.xpath(XPATH_REVIEW_TEXT_2)
		raw_review_text3 = review.xpath(XPATH_REVIEW_TEXT_3)

		author = ' '.join(' '.join(raw_review_author).split()).strip('By')

		#cleaning data
		review_rating = ''.join(raw_review_rating).replace('out of 5 stars','')
		review_header = ' '.join(' '.join(raw_review_header).split())
		review_posted_date = dateparser.parse(''.join(raw_review_posted_date)).strftime('%d %b %Y')
		review_text = ' '.join(' '.join(raw_review_text1).split())

		#grabbing hidden comments if present
		if raw_review_text2:
			json_loaded_review_data = json.loads(raw_review_text2[0])
			json_loaded_review_data_text = json_loaded_review_data['rest']
			cleaned_json_loaded_review_data_text = re.sub('<.*?>','',json_loaded_review_data_text)
			full_review_text = review_text+cleaned_json_loaded_review_data_text
		else:
			full_review_text = review_text
		if not raw_review_text1:
			full_review_text = ' '.join(' '.join(raw_review_text3).split())

		raw_review_comments = review.xpath(XPATH_REVIEW_COMMENTS)
		review_comments = ''.join(raw_review_comments)
		review_comments = re.sub('[A-Za-z]','',review_comments).strip()
		review_dict = {
							'review_comment_count':review_comments,
							'review_text':full_review_text,
							'review_posted_date':review_posted_date,
							'review_header':review_header,
							'review_rating':review_rating,
							'review_author':author

						}
		reviews_list.append(review_dict)

	data = {
				'ratings':ratings_dict,
				'reviews':reviews_list,
				'url':amazon_url,
				'price':product_price,
				'name':product_name
			}
	return data


def ReadAsin():
	#Add your own ASINs here 
	AsinList = ['B01ETPUQ6E','B017HW9DEW']
	extracted_data = []
	for asin in AsinList:
		print "Downloading and processing page http://www.amazon.com/dp/"+asin
		extracted_data.append(ParseReviews(asin))
		sleep(5)
	f=open('data.json','w')
	json.dump(extracted_data,f,indent=4)

if __name__ == '__main__':
	ReadAsin()

Here, is the link to his website reviews scraping with python 2.7

Solution 6 - Amazon Web-Services

Unfortunately you can only get an iframe URL with the reviews, the content itself is not accessible.

Source: http://docs.amazonwebservices.com/AWSECommerceService/2011-08-01/DG/CHAP_MotivatingCustomerstoBuy.html#GettingCustomerReviews

Solution 7 - Amazon Web-Services

Checkout RapidAPI: https://rapidapi.com/blog/amazon-product-reviews-api/ By using this API we can get Amazon product reviews.

Solution 8 - Amazon Web-Services

You can use Amazon Product Advertising API. It has a Response Group 'Reviews' which you can use with operation 'ItemLookup'. You need to know ASIN i.e. unique item id of the product.

Once you set all the parameters and execute the signed URL, you will receive an XML which contains a link to customer reviews under "IFrameURL" tag.

Use this URL and use pattern searching in html returned from this url to extract the reviews. For each review in the html, there will be a unique review id and under that

you can get all the data for that particular review.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestiondcrostaView Question on Stackoverflow
Solution 1 - Amazon Web-ServicesTyler CollierView Answer on Stackoverflow
Solution 2 - Amazon Web-ServicesmfsView Answer on Stackoverflow
Solution 3 - Amazon Web-Servicesuser2660784View Answer on Stackoverflow
Solution 4 - Amazon Web-ServicesthorwhalenView Answer on Stackoverflow
Solution 5 - Amazon Web-ServicesChandan PurohitView Answer on Stackoverflow
Solution 6 - Amazon Web-ServicesgphilipView Answer on Stackoverflow
Solution 7 - Amazon Web-ServicesAfzal MasoodView Answer on Stackoverflow
Solution 8 - Amazon Web-ServicesDataMiningEnthusiastView Answer on Stackoverflow