How to scrape Amazon search with PHP? - php

Any ideas? I am new to PHP and am having a lot of trouble with cURL and DOMDocument, so please write or show me an example. I was thinking of using DOMDocument, but I cannot figure out how to get Amazon to search for a user's input from my site and display certain parts of the results, such as price, category, etc.

There are several methods, using file_get_contents, the PHP Simple HTML DOM Parser (https://simplehtmldom.sourceforge.io/), and cURL, which I've had varying luck with, but eventually Amazon starts flagging my requests with robot checks. I originally used the API, but Amazon locked it down with minimum-traffic rules that my budding web service can't meet, which renders it useless for me.
There is currently no easy, effective way to pull Amazon data consistently, though I'm experimenting with randomizing user agents and using proxies.
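
For the DOMDocument part of the question, here is a minimal sketch of pulling fields out of HTML once you have it as a string. The markup in $sampleHtml and the class names "result", "title", and "price" are made up for illustration; they are not Amazon's real markup, which you would have to inspect yourself and which changes over time.

```php
<?php
// Hypothetical markup standing in for a fetched results page (NOT Amazon's real HTML).
$sampleHtml = <<<'HTML'
<html><body>
  <div class="result"><span class="title">USB Cable</span><span class="price">$4.99</span></div>
  <div class="result"><span class="title">HDMI Cable</span><span class="price">$7.50</span></div>
</body></html>
HTML;

/**
 * Extract title/price pairs from every <div class="result"> in the given HTML.
 * The class names are assumptions that match the sample markup above.
 */
function extractResults(string $html): array
{
    $doc = new DOMDocument();

    // Real pages are rarely valid HTML, so collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $results = [];

    foreach ($xpath->query('//div[@class="result"]') as $node) {
        // Query relative to the current result node.
        $title = $xpath->query('.//span[@class="title"]', $node)->item(0);
        $price = $xpath->query('.//span[@class="price"]', $node)->item(0);

        $results[] = [
            'title' => $title ? trim($title->textContent) : null,
            'price' => $price ? trim($price->textContent) : null,
        ];
    }

    return $results;
}

print_r(extractResults($sampleHtml));
```

In a real script the string would come from file_get_contents or cURL instead of $sampleHtml, and that is the step where the robot checks mentioned above tend to show up.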

Use the Product Advertising API instead of scraping. http://docs.aws.amazon.com/AWSECommerceService/latest/DG/ItemSearch.html

The Product Advertising API would actually be the best resource for this, although it gives you limited results, and I believe they may revoke your access if no affiliate transaction occurs within 180 days, so it does limit you to some extent depending on your use. I'm not 100% sure, but my understanding is that you may need a professional seller account or an affiliate membership.

Related

PHP Script that receives information from specified websites

I am trying to create a script that will gather information from an Amazon product listing based on an entered product ASIN (description, image, price, seller name). Functionality should be similar to this one: http://www.savings.com/pricejump
I tried to use DOM to retrieve the HTML elements, but I am concerned about an IP ban if I make too many requests in a short period of time. I plan to make several hundred requests per day with this script.
Can you please share some useful links on this subject? I really don't know in which direction to head.
Any help would be highly appreciated.
You could use their affiliate program, which is free and would allow enough requests to accomplish what you asked. Plus it provides a web services API, which is much cleaner and easier to use than screen scraping.
https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html

How to extract/fetch and show data ( from other ecommerce site ) to my website using PHP

I have seen many eCommerce portals that show lists of products from other, bigger eCommerce websites from across the world.
Fetching the data is not a big problem, I think, using file_get_contents or cURL in PHP. But the questions are:
Do they provide some API to allow others to fetch their data/product info?
Do we need to get their permission to fetch data from their sites?
Is there some elegant and specific way to fetch data to show on our site (instead of cURL and file_get_contents)?
Some websites provide an API to access their data. Some cost money, some may be free. In any case, yes, you need permission.
But you can always scrape their sites without permission.
Here are some general guidelines on the subject.
You should check to see if they have a robots.txt file denying permission to spider some areas of the site.
Although there are copyright issues with reproducing content, search engines publish excerpts of site content all the time. Therefore, to some extent, reproducing content is legally permissible.
APIs are sometimes available, but search engines scrape sites all the time without any sort of permission (except perhaps for the robots.txt files).
Respect the site owner's wishes concerning their bandwidth. Poorly written robot code can wastefully tie up server resources.
If you can get permission, all the better.
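The robots.txt guideline can be sketched roughly as follows. This is a deliberately simplified check, not a full parser: it only reads Disallow lines in groups that apply to "User-agent: *", treats each rule as a plain path prefix, and ignores Allow rules and wildcards. The function names and the sample file contents are made up for illustration.

```php
<?php
// Hypothetical robots.txt contents; a real script would download the site's own file.
$robotsTxt = <<<'TXT'
User-agent: *
Disallow: /search
Disallow: /cart # checkout pages

User-agent: examplebot
Disallow: /
TXT;

/**
 * Collect the Disallow path prefixes from groups that apply to all crawlers.
 * Simplified: no Allow rules, no wildcards, no agent-specific matching.
 */
function disallowedPrefixes(string $robotsTxt): array
{
    $prefixes = [];
    $appliesToUs = false;   // does the current group include "User-agent: *"?
    $inAgentLines = false;  // was the previous rule line a User-agent line?

    foreach (preg_split('/\R/', $robotsTxt) as $line) {
        $line = trim(preg_replace('/#.*$/', '', $line)); // strip comments
        if ($line === '' || strpos($line, ':') === false) {
            continue;
        }

        [$field, $value] = array_map('trim', explode(':', $line, 2));
        $field = strtolower($field);

        if ($field === 'user-agent') {
            $matches = ($value === '*');
            // Consecutive User-agent lines belong to the same group.
            $appliesToUs = $inAgentLines ? ($appliesToUs || $matches) : $matches;
            $inAgentLines = true;
        } else {
            $inAgentLines = false;
            if ($field === 'disallow' && $appliesToUs && $value !== '') {
                $prefixes[] = $value;
            }
        }
    }

    return $prefixes;
}

/** Return false when the path starts with any disallowed prefix. */
function isPathAllowed(string $path, array $disallowed): bool
{
    foreach ($disallowed as $prefix) {
        if (strncmp($path, $prefix, strlen($prefix)) === 0) {
            return false;
        }
    }
    return true;
}

$rules = disallowedPrefixes($robotsTxt);
var_dump(isPathAllowed('/search?q=php', $rules));
var_dump(isPathAllowed('/products/123', $rules));
```

For the bandwidth guideline, a fixed pause between requests (for example with sleep) is the simplest way to avoid tying up the other site's server.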
I use cURL and the DomDocument class. I don't know what else you would want in terms of elegance.
Write a crawler to get all the data you want from those websites.
Use the APIs if provided, but usually they cost a lot.
Create your own APIs using third-party software.

Collect data over a period

I want to collect Twitter data for a particular keyword for the last seven days without user authentication. The problem is that the result set for a single day was more than 3000 tweets. This quickly gets my app blocked due to rate limiting, and I need a workaround for this. In fact, I don't need the data, I just need the count for each day (probably this is not possible). Could you please advise me on how to get around this? I am using the search API, and I am open to using any API.
One more question: Is it possible to collect the public posts at regular intervals (all posts, without a query term)? If this is possible, then I can save them in my database and perform the search on that.
This sounds like a job for the streaming API. You can think of it as setting a keyword and opening a firehose where you will receive tweets containing your keyword until you close the firehose connection. The streaming API is designed for persistent connections, tracking a limited number of keywords. You log in with what is basically a default user.
The 140dev PHP framework is a great help in working with the Twitter streaming API in PHP.
Resources:
Twitter Streaming API Information -
https://dev.twitter.com/docs/streaming-apis
140 Twitter Streaming API Framework -
http://140dev.com/free-twitter-api-source-code-library/
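Once tweets are being stored, the per-day count the question asks for is a simple tally. Here is a minimal sketch, assuming each stored tweet has a creation timestamp in a format PHP's DateTimeImmutable can parse (ISO 8601 in the sample). The function name and the sample timestamps are made up for illustration and do not come from the Twitter API.

```php
<?php
/**
 * Count items per calendar day (UTC) from a list of timestamp strings.
 * Returns an array like ['YYYY-MM-DD' => count], sorted by date.
 */
function countPerDay(array $createdAtValues): array
{
    $counts = [];
    $utc = new DateTimeZone('UTC');

    foreach ($createdAtValues as $createdAt) {
        // Normalize to UTC so tweets with different offsets land on the same calendar.
        $day = (new DateTimeImmutable($createdAt))->setTimezone($utc)->format('Y-m-d');
        $counts[$day] = ($counts[$day] ?? 0) + 1;
    }

    ksort($counts);
    return $counts;
}

// Hypothetical timestamps standing in for rows saved from the stream.
$sampleTimestamps = [
    '2013-05-02T00:00:05+00:00',
    '2013-05-01T09:15:00+00:00',
    '2013-05-01T23:59:10+00:00',
    '2013-05-02T01:30:00+02:00', // 23:30 UTC on the previous day
];

print_r(countPerDay($sampleTimestamps));
```

If the tweets are saved in a database instead, the same tally can be done with a GROUP BY on the date column.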

Recommend a web service that handles location within a specific radius?

We have a client that wants a store locator on their website. I've been asked to find a webservice that will allow us to send a zipcode as a request and have it return locations within x radius. We found this, but it's maintained by a single person, and doesn't look like it gets updated or supported very well. We're looking for something commercial, ideally that updates their zipcode database at least once per quarter, and that has a well-documented API with PHP accessibility. I won't say price isn't an object, but right now we just want some ideas, and my google-fu has failed me.
I've already posted this over on the webmasters forum, but thought I'd cover my bases and post here too.
I've repurposed this outstanding script to conquer this same challenge. It's free, has been very reliable, and is relatively quick.
In my script, I have addresses stored in the DB. So rather than show a page to enter addresses, I simply pass them as a string and let the magic happen.
He says it in the app, but if you go this route, make sure you get your own Google Maps API key. It won't work with his!
If you want a slightly less technical approach, here's a MySQL query you could run on your locations (you'd have to add lat/long to your DB or set up a geocoding service) to give you the distance as the crow flies.
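The MySQL query itself is not reproduced here. As a rough sketch of the same as-the-crow-flies idea in PHP, the haversine formula gives the great-circle distance between two lat/long pairs. The function names, the 3958.8 mile Earth radius, and the store list are assumptions for illustration only.

```php
<?php
/** Great-circle distance in miles between two points given in decimal degrees. */
function haversineMiles(float $lat1, float $lon1, float $lat2, float $lon2): float
{
    $earthRadiusMiles = 3958.8; // mean Earth radius (approximate)

    $dLat = deg2rad($lat2 - $lat1);
    $dLon = deg2rad($lon2 - $lon1);

    $a = sin($dLat / 2) ** 2
        + cos(deg2rad($lat1)) * cos(deg2rad($lat2)) * sin($dLon / 2) ** 2;

    // min() guards against rounding pushing the value slightly above 1.
    return 2 * $earthRadiusMiles * asin(min(1.0, sqrt($a)));
}

/** Return the stores within $radiusMiles of the given point, nearest first. */
function storesWithinRadius(array $stores, float $lat, float $lon, float $radiusMiles): array
{
    $nearby = [];
    foreach ($stores as $store) {
        $distance = haversineMiles($lat, $lon, $store['lat'], $store['lon']);
        if ($distance <= $radiusMiles) {
            $nearby[] = $store + ['distance' => $distance];
        }
    }

    usort($nearby, fn(array $a, array $b): int => $a['distance'] <=> $b['distance']);
    return $nearby;
}

// Hypothetical store locations (city-center coordinates).
$stores = [
    ['name' => 'Philadelphia', 'lat' => 39.9526, 'lon' => -75.1652],
    ['name' => 'Los Angeles',  'lat' => 34.0522, 'lon' => -118.2437],
    ['name' => 'Newark',       'lat' => 40.7357, 'lon' => -74.1724],
];

// Stores within 100 miles of New York City (40.7128, -74.0060).
foreach (storesWithinRadius($stores, 40.7128, -74.0060, 100) as $store) {
    printf("%s: %.1f miles\n", $store['name'], $store['distance']);
}
```

You would still need a geocoding step to turn the visitor's zipcode into the lat/long that gets passed in.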
Google Maps has a geocoder as well and it geocodes to the specific address.
It's limited to x number of requests but that shouldn't be a big deal if your site is small and if you cache. You can get more requests if you pay.
It can be accessed via JavaScript or via PHP (and there are several prewritten PHP modules out there).
Link here:
http://code.google.com/apis/maps/documentation/javascript/v2/services.html
(I worked for a company that did upwards of 800,000 requests a day, so it's stable and fast :) )
PostcodeAnywhere has a Store Locator feature - I think it's pay per use, but I've used their other products before and they're very cheap.
http://www.postcodeanywhere.co.uk/store-locator-tool/

Using cURL for Amazon Associates reports... possible?

I've got a couple of affiliate sites and would like to bring together the earnings reports from several Amazon sites into one place, for easier viewing and analysis.
I get the impression that cURL can be used to get external webpage content, which I could then scrape to obtain the necessary info. However, I've hit a wall in trying to log in to the Associates reports using cURL.
Has anyone done this and do you have any advice?
I am working on an open-source project called PHP-OARA, which allows you to get your data from the different networks. It's part of the AffJet Project.
We have solved the problem with Amazon (it wasn't easy), and a PHP class is available to get the data from your Associates account. If you like it, you can even collaborate to make it better.
I hope it helps!
You can do this, but you'll need to make use of cookies with cURL: http://www.electrictoolbox.com/php-curl-cookies/ But I'd be willing to bet some cash that Amazon offers an API to get the data you want, although the last time I dealt with their web services it was a nightmare, probably because I was using the PHP SOAP extension and the Amazon SOAP API.
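As a minimal sketch of the cookie approach, the idea is to point cURL at a cookie file so that the session cookie set by the login response is sent again on later requests. The function names, the example.com URLs, and the form field names below are placeholders, not Amazon's real login flow, which will likely need extra hidden form fields and may present a captcha.

```php
<?php
/**
 * Create a cURL handle that stores and re-sends cookies via $cookieFile.
 * Requires the PHP cURL extension.
 */
function createSession(string $cookieFile)
{
    $ch = curl_init();
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow redirects after login
        CURLOPT_COOKIEJAR      => $cookieFile, // write cookies here when the handle closes
        CURLOPT_COOKIEFILE     => $cookieFile, // read cookies from here, enables the cookie engine
        CURLOPT_TIMEOUT        => 30,
    ]);
    return $ch;
}

/** Send a GET request, or a POST request when $postFields is given. */
function sessionRequest($ch, string $url, ?array $postFields = null): string
{
    curl_setopt($ch, CURLOPT_URL, $url);

    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    } else {
        curl_setopt($ch, CURLOPT_HTTPGET, true); // switch back to GET after a POST
    }

    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL error: ' . curl_error($ch));
    }
    return $body;
}

// Usage sketch with placeholder URLs and field names (left commented out on purpose):
// $ch = createSession(sys_get_temp_dir() . '/cookies.txt');
// sessionRequest($ch, 'https://example.com/login', ['email' => '...', 'password' => '...']);
// $reportHtml = sessionRequest($ch, 'https://example.com/reports');
// curl_close($ch);
```

Because the same handle is reused, the login cookies stay in memory for the second request, and the cookie file lets a later run pick the session up again.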
