How to scrape data from Ajax or Json - php

I want to scrape data from this URL.
I am able to fetch simple data from HTML tags using cURL, but I am not able to fetch the data that is loaded via Ajax or JSON; I am not sure which of the two it is.
In the screenshot below, I want to fetch the Appliance Models data,
which I think comes from JSON or Ajax.
Here is my script to get the data from the page:
$loginURL = "https://www.apwagner.com/appliance-part/wpl/wp661600";
//$file='source.html'; //create a html file to save source code
$ch = curl_init();
$timeout = 5;
curl_setopt($ch, CURLOPT_URL, $loginURL);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
$data = curl_exec($ch);
curl_close($ch);
Please provide some guidance on how to fetch this information.

Part of the page's data is loaded through an Ajax request (see this screenshot).
You need to request it with cURL as well, after your first cURL response is received:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"https://www.apwagner.com/Product/GetPartModel");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,
"partNumber=wp661600&make=wpl");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$server_output = curl_exec ($ch);
curl_close ($ch);
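If the `GetPartModel` endpoint answers with JSON (an assumption; the thread never confirms the response format), the string in `$server_output` can be decoded into a PHP array. A minimal sketch, where the helper name `decodeModels` and the sample payload are hypothetical:

```php
<?php
// Hypothetical helper: turns a JSON response body into a PHP array.
// Throws JsonException (PHP 7.3+) if the body is not valid JSON.
function decodeModels(string $body): array
{
    $decoded = json_decode($body, true, 512, JSON_THROW_ON_ERROR);
    // A JSON scalar such as "ok" decodes to a non-array; normalise it to an empty list.
    return is_array($decoded) ? $decoded : [];
}

// Made-up payload for illustration only; the real response may look different.
$sample = '[{"model":"ABC123","brand":"Example"},{"model":"XYZ789","brand":"Example"}]';

foreach (decodeModels($sample) as $row) {
    echo $row['model'], PHP_EOL;
}
```

In the real script, `$server_output` would take the place of `$sample`. If the endpoint returns an HTML fragment instead, an HTML parser would be needed rather than `json_decode`.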
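Or try to scrape the data using a Python script:
from selenium import webdriver
# Start Chrome through Selenium; the driver path is a placeholder to fill in.
driver = webdriver.Chrome('<path to your chrome driver>')
# Load the page in a real browser so that its Ajax requests run.
driver.get('https://www.apwagner.com/appliance-part/wpl/wp661600')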

Related

How to pass PHP_session_id in curl request

I am learning PHP and trying to automate a website login and then post some data to another page once logged in. I managed to log in to the website, but when I try to post data, I get a "Document moved" response.
Then I analysed the headers in Firebug and realised that there was a PHP_session_id, so when I tried to manually pass this PHP_session_id it worked.
So my question is, how can I automatically get the sessionid when I login and then subsequently pass this on to my second request?
This is what my code looks like:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"http://www.example.com/login-main.php");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,
"loginType=company&username=johndoe%40gmail.com&password=1234&company=test");
ob_start();
curl_exec ($ch);
ob_end_clean();
curl_close ($ch);
unset($ch);
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_HTTPHEADER, array(
'Host:www.example.com',
'Origin:http://www.example.com',
'Cookie:PHPSESSID=na6ivnsdfab206ktb453o2au07',
'Referer:http://www.example.com/bookings/'
));
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_URL,"http://www.example.com/bookings.php");
curl_setopt($ch, CURLOPT_POSTFIELDS,
"start_date=&end_date=&type=instructor&b_type=&id=41");
$buf2 = curl_exec ($ch);
curl_close ($ch);
echo "<PRE>".htmlentities($buf2);
?>
Add these options to every cURL request in your code:
curl_setopt($ch, CURLOPT_COOKIEJAR, $full_path_to_cookie_file);
curl_setopt($ch, CURLOPT_COOKIEFILE, $full_path_to_cookie_file);
For example:
$full_path_to_cookie_file = __DIR__.'/cookie.txt';
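The two cookie options above can be wrapped so that the login request and every later request share one cookie file. This is only a sketch under assumptions: `sessionCurlOptions` and `sessionRequest` are hypothetical helper names, and the curl extension is assumed to be loaded.

```php
<?php
// Hypothetical helper: builds the options shared by every request of one session.
function sessionCurlOptions(string $url, string $cookieFile, ?array $postFields = null): array
{
    $options = [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_COOKIEFILE     => $cookieFile, // cookies sent with the request are read from here
        CURLOPT_COOKIEJAR      => $cookieFile, // received cookies are saved here when the handle is released
    ];
    if ($postFields !== null) {
        $options[CURLOPT_POST]       = true;
        $options[CURLOPT_POSTFIELDS] = http_build_query($postFields);
    }
    return $options;
}

// Hypothetical helper: runs one request with the shared cookie file.
function sessionRequest(string $url, string $cookieFile, ?array $postFields = null)
{
    $ch = curl_init();
    curl_setopt_array($ch, sessionCurlOptions($url, $cookieFile, $postFields));
    $body = curl_exec($ch);
    curl_close($ch);
    unset($ch); // releasing the handle lets libcurl write the cookie jar
    return $body; // string on success, false on failure
}
```

Calling `sessionRequest()` first for `login-main.php` and then for `bookings.php` with the same cookie file should let the second request send the `PHPSESSID` cookie that the first one received, so the hard-coded `Cookie:` header is no longer needed.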

Sending POST data to HTTPS redirect PHP

I need to redirect the current user to an external page (on another domain) with some parameters that will be used on that page. To do that I'm using cURL; example code is below:
$postfields = array('key'=>'value');
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, 'https://domain.com/example.aspx');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postfields));
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); // On dev server only!
$result = curl_exec($ch);
curl_close ($ch);
But the user stays on the same page and I don't know if any data is transmitted.
What is the problem here? Is there another way to do that?
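A cURL request is sent by the server, so the visitor's browser never leaves the current page. One possible alternative, not taken from the original post, is to render a form that the browser itself posts to the external page. A minimal sketch; `renderAutoPostForm` is a hypothetical helper name:

```php
<?php
// Hypothetical helper: returns an HTML page whose form posts $fields to $action
// as soon as the browser loads it, so the browser itself lands on the target page.
function renderAutoPostForm(string $action, array $fields): string
{
    $inputs = '';
    foreach ($fields as $name => $value) {
        // Escape names and values so they cannot break out of the attributes.
        $inputs .= sprintf(
            '<input type="hidden" name="%s" value="%s">',
            htmlspecialchars((string) $name, ENT_QUOTES, 'UTF-8'),
            htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8')
        );
    }
    return '<!DOCTYPE html><html><body onload="document.forms[0].submit()">'
        . sprintf('<form method="post" action="%s">', htmlspecialchars($action, ENT_QUOTES, 'UTF-8'))
        . $inputs
        . '<noscript><button type="submit">Continue</button></noscript>'
        . '</form></body></html>';
}

echo renderAutoPostForm('https://domain.com/example.aspx', ['key' => 'value']);
```

The trade-off is that the posted values are visible to the user in the page source, so this approach does not suit secrets.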

curl into xml instead of webpage

I am using the following code to set the cURL URL and retrieve the response:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 15);
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$xml_response = curl_exec($ch);
curl_close($ch);
echo $xml_response;
My problem is that the API I am calling returns its data in XML format, so cURL is returning the data as one big string like this:
509f0e1f-b8d4-4cb8-b4f0-109751dca4eb0.0114370000000000TrueASIN0273702440SmallAll0273702440http://www.amazon.com/Accounting-Finance-Non-Specialists-Peter-....(ect)
How can I have cURL display the data in XML format?
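A likely cause, stated here as an assumption: the browser treats the echoed XML tags as unknown HTML elements and shows only their text content. Escaping the markup before echoing keeps the tags visible. A minimal sketch with a hypothetical helper name and a made-up sample response:

```php
<?php
// Hypothetical helper: escapes XML so a browser shows the tags instead of hiding them.
function xmlAsVisibleText(string $xml): string
{
    return '<pre>' . htmlspecialchars($xml, ENT_QUOTES, 'UTF-8') . '</pre>';
}

// Made-up sample standing in for $xml_response.
$xml_response = '<Item><ASIN>0273702440</ASIN><Size>Small</Size></Item>';
echo xmlAsVisibleText($xml_response);
```

Alternatively, sending `header('Content-Type: text/xml');` before the echo asks the browser to treat the response as XML rather than HTML.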

Manipulating curl-obtained data before outputting

I'm using the below code to pull the html text from a site to publish on my own.
How can I manipulate the data that cURL returns before I echo it?
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_POST, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
$returned = curl_exec($ch);
curl_close ($ch);
echo $returned;
You can manipulate the returned data:
$returned = curl_exec($ch);
curl_close($ch);
$returned holds the response as a string, so you can manipulate it before echoing it.
Did you mean something like this?
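As one illustration of working on `$returned` before output, the string can be loaded into `DOMDocument` and queried. This is a sketch, assuming the DOM extension is available; `extractHeadings` is a hypothetical helper and the markup is made up:

```php
<?php
// Hypothetical helper: returns the text of every <h1> in an HTML string.
function extractHeadings(string $html): array
{
    $previous = libxml_use_internal_errors(true); // silence warnings from sloppy real-world HTML
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $headings = [];
    foreach ($doc->getElementsByTagName('h1') as $node) {
        $headings[] = trim($node->textContent);
    }
    return $headings;
}

// Made-up markup standing in for $returned.
$returned = '<html><body><h1> Latest news </h1><p>Body text</p></body></html>';
echo implode(', ', extractHeadings($returned)), PHP_EOL;
```

For simple substitutions, plain string functions such as `str_replace()` on `$returned` would do the job without a parser.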

php google weather api query

I'm having trouble retrieving data from the Google API. When I run the code, it returns only a blank page and not a printout of the XML array. Here is the code:
$url="http://www.google.com/ig/api";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POST, true);
curl_setopt($ch, CURLOPT_POSTFIELDS, "?weather=london,england");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
echo "<pre>";
print_r($data);
I think the problem is that you are using the POST method rather than GET.
Try it like this:
$url="http://www.google.com/ig/api?weather=london,england";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
Hope it helps :)
EDIT: And yes, you have to do some extra parsing to get the data from the XML string
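The extra parsing could be done with SimpleXML. A minimal sketch, assuming the SimpleXML extension is available; `readDataAttribute` is a hypothetical helper, and the sample XML is made up, so the real reply may be structured differently:

```php
<?php
// Hypothetical helper: reads the `data` attribute of the first element matching $xpath.
function readDataAttribute(string $xml, string $xpath): ?string
{
    $root = simplexml_load_string($xml);
    if ($root === false) {
        return null; // not well-formed XML
    }
    $matches = $root->xpath($xpath);
    if (!$matches) {
        return null; // nothing matched
    }
    return (string) $matches[0]['data'];
}

// Made-up sample standing in for $data.
$data = '<reply><weather><condition data="Cloudy"/><temp_c data="12"/></weather></reply>';
echo readDataAttribute($data, '//condition'), PHP_EOL;
```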
