$url = "http://www.ksl.com/index.php?nid=231&sid=74268&cat=&search=&zip=&distance=&min_price=&max_price=&type=&category=&subcat=&sold=&city=&addisplay=%5BNOW-1HOURS+TO+NOW%5D&sort=1&userid=&markettype=sale&adsstate=&nocache=1&o_facetSelected=true&o_facetKey=ad+posted&o_facetVal=Last+Minute&viewSelect=list&viewNumResults=48&sort=1";
$html = file_get_contents($url);
This returns only some of the page content. I suspect the page I am trying to fetch uses jQuery to insert the listings, so the request completes before jQuery populates the page.
Any ideas on how to get the full contents of the search results?
Try this:
$url = 'http://www.ksl.com/index.php?nid=231&sid=74268&cat=&search=&zip=&distance=&min_price=&max_price=&type=&category=&subcat=&sold=&city=&addisplay=%5BNOW-1HOURS+TO+NOW%5D&sort=1&userid=&markettype=sale&adsstate=&nocache=1&o_facetSelected=true&o_facetKey=ad+posted&o_facetVal=Last+Minute&viewSelect=list&viewNumResults=48&sort=1';
/* gets the data from a URL */
function get_data($url) {
    $ch = curl_init();
    $timeout = 5;
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, $timeout);
    $data = curl_exec($ch);
    curl_close($ch);
    return $data;
}
$returned_content = get_data($url);
echo $returned_content;
No, they are not loading the listings via jQuery or AJAX. There are no XHR requests.
Viewing the page source shows the listings.
I just used this code to get the page:
header('Content-Type: text/plain; charset=utf-8');
$url = 'http://www.ksl.com/index.php?nid=231&sid=74268&cat=&search=&zip=&distance=&min_price=&max_price=&type=&category=&subcat=&sold=&city=&addisplay=%5BNOW-1HOURS+TO+NOW%5D&sort=1&userid=&markettype=sale&adsstate=&nocache=1&o_facetSelected=true&o_facetKey=ad+posted&o_facetVal=Last+Minute&viewSelect=list&viewNumResults=48&sort=1';
$data = file_get_contents($url);
echo $data;
The result (partial, edited):
Black Xbox 360 320gb Hard Drive Plus controllers and G...</a>
<span style="color: #555;">Magna, UT
| 1 Min
<div class="adDesc">
Selling because I got it used then only used it once, works fantastically and comes with lots of games plus 4 controllers and a keyboard for the contr
<a class="listlink" href="?nid=218&ad=34170943&cat=&lpid=&search=&ad_cid=7">more...</a>
<div class="priceBox">
<a href="?nid=218&ad=34170943&cat=&lpid=&search=&ad_cid=7">
<span >$450<span class="priceCents">00
<!-- <div class="adDays">1 Min -->
<!--<div class="adTime">Dec 31, 1969-->
<div class="detailBox">
Great PS4 Bundle</a>
<span style="color: #555;">Spanish Fork, UT
| 1 Min
Have here a great PS4 bundle with 6Games 1controller and 3months of PS Network. Still has the box. Text 801-633-1659
<span >$400<span class="priceCents">00
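Once the fetched HTML contains the listings, as in the output above, they can be pulled out with PHP's DOM extension instead of string matching. The following is only a minimal sketch: the function name extract_ad_descriptions is hypothetical, the adDesc class is taken from the output above, and the sample string stands in for the real page.

```php
<?php
// Hypothetical helper: returns the trimmed text of every <div class="adDesc">.
function extract_ad_descriptions(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    // Match "adDesc" as a whole class token, not as a substring.
    $query = '//div[contains(concat(" ", normalize-space(@class), " "), " adDesc ")]';

    $descriptions = [];
    foreach ($xpath->query($query) as $node) {
        // Collapse runs of whitespace left over from the markup.
        $descriptions[] = trim(preg_replace('/\s+/', ' ', $node->textContent));
    }
    return $descriptions;
}

// Usage with a small inline sample instead of a live request.
$sample = '<div class="adDesc"> Great PS4 bundle <a class="listlink" href="#">more...</a></div>'
        . '<div class="priceBox"><span>$400</span></div>';
print_r(extract_ad_descriptions($sample));
```

In real use, $sample would be replaced by the string returned from file_get_contents() or get_data().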
I'm unable to scrape data from a few websites using cURL.
I'm using cURL to scrape websites by URL. It works well for about 80% of the URLs I try, but some URLs don't seem "scrapeable". For example, when I try to scrape https://www.nextdoorhub.com/ and https://www.atknsn.com/, it doesn't work: the page stays blank and in the end no result is returned.
This is my code:
<center>
<br/>
<form method="post" name="scrap_form" id="scrap_form" action="scrape_data.php">
<b>Enter Website URL To Scrape Data:</b>
<input type="text" name="website_url" id="website_url">
<input type="submit" name="submit" value="Submit" >
</form>
</center>
<?php
error_reporting(E_ALL ^ E_NOTICE );
$website_url = $_POST['website_url'];
$result = scrapeWebsiteData($website_url);
function scrapeWebsiteData($website_url){
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $website_url);
curl_setopt($curl, CURLOPT_HEADER, 0);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_BINARYTRANSFER,1);
$result = curl_exec($curl);
curl_close($curl);
return $result;
}
$regextit = '/<div id="case_textlist">(.*?)<\/div>/s';
preg_match_all($regextit, $result, $list);
/* echo "<pre>";
print_r($list[1]); die; */
$regex = '/[\'" >\t^]([^\'" \n\r\t]+\.(jpe?g|bmp|gif|png))[\'" <\n\r\t]/i';
preg_match_all($regex, $result, $url_matches);
$count = count($url_matches[1]);
// set the local path of image
$local_path = 'C:\udeytech\htdocs\tests\images\\';
for($i=0; $i<$count; $i++)
{
preg_match_all('!.*?/!', $url_matches[1][$i], $matches);
$last_part = end($matches[0]);
////match image name last part of anything .jpg|jpeg|gif|png
preg_match("!$last_part(.*?.(jpg|jpeg|gif|png))!", $url_matches[1][$i], $matche);
$secons_part = $matche[0];
$info = pathinfo($secons_part);
$image_name = $info['basename'];
//save image url in a variable
$image_url = $url_matches[1][$i];
$image_path = scrapeWebsiteData($image_url);
$file_open = fopen($local_path.$image_name, 'w');
fwrite($file_open, $image_path);
fclose($file_open);
}
?>
Have you tried to load either of these sites in your browser and look at the responses?
nextdoorhub uses Angular, and atknsn looks to be heavy on jQuery. Long story short, these sites need to run JavaScript to render the full HTML you intend to scrape.
PHP + cURL alone won't cut it, because cURL only downloads the initial response and never executes scripts. Look at threads that discuss scraping Angular sites and they will point you in the right direction. (Hint: you need a tool that executes JavaScript, for example a headless browser driven from node.js.)
I just want to show the current gold rate from Gold Price India (http://www.goldpriceindia.com/) using PHP.
I fetch the data with file_get_contents(). It works on localhost but not on the server, and I need it to work on my hosted server too.
My Code:
$url1 = 'http://www.goldpriceindia.com/gold-price-kolkata.php';
$content1 = file_get_contents($url1);
$first_step1 = explode( '<div class="prc">' , $content1 );
$gold_rate1 = explode("</div>" , $first_step1[1] );
I am using PHP. I hope my question is clear; if not, I am ready to explain again.
Thank you.
Maybe your server has disabled URL file access (the allow_url_fopen setting). In that case you can try an alternative way to get the file contents.
Alternative method to get the file contents:
function url_get_contents ($Url) {
    if (!function_exists('curl_init')){
        die('CURL is not installed!');
    }
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $Url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $output = curl_exec($ch);
    curl_close($ch);
    return $output;
}
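To find out which of the two approaches the server actually supports, the allow_url_fopen setting can be read at runtime. The sketch below is only an illustration: available_fetch_method and extract_price are hypothetical names, and extract_price repeats the explode() logic from the question with a guard for the case where the prc marker is missing (for example because the fetch failed).

```php
<?php
// Hypothetical helper: reports which transport is usable on this server.
function available_fetch_method(): string
{
    // allow_url_fopen must be enabled for file_get_contents() to open URLs.
    if (filter_var(ini_get('allow_url_fopen'), FILTER_VALIDATE_BOOLEAN)) {
        return 'file_get_contents';
    }
    if (function_exists('curl_init')) {
        return 'curl';
    }
    return 'none';
}

// Hypothetical helper: text between <div class="prc"> and the next </div>,
// or null when the marker is not present in the fetched HTML.
function extract_price(string $html): ?string
{
    $parts = explode('<div class="prc">', $html, 2);
    if (count($parts) < 2) {
        return null;
    }
    return trim(explode('</div>', $parts[1], 2)[0]);
}

echo available_fetch_method(), "\n";
var_dump(extract_price('<p>no price block</p>'));
```

If available_fetch_method() reports 'curl', url_get_contents() above is the one to call; the returned HTML can then be passed to extract_price().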
100% working Code
preg_match('#Gold price today in India <b>\(Rs\/10gm\)</b> is <b>([0-9\.]+)#', file_get_contents('http://www.marketonmobile.com/gold_price_india.php'), $matches);
echo 'The price is: '.$matches[1];
I'm trying to show a facebook feed on my website, which is working. But I only managed to show the text of the post, not the image attached to it (or if possible, if there are multiple images, only the first one or biggest one).
I tried looking here for the correct name to get it using the API
This is my code now (which shows facebook posts in an owl carousel):
function fetchUrl($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 20);
// You may need to add the line below
// curl_setopt($ch,CURLOPT_SSL_VERIFYPEER,false);
$feedData = curl_exec($ch);
curl_close($ch);
return $feedData;
}
//App Info, needed for Auth
$app_id = "1230330267012270";
$app_secret = "secret";
//Retrieve auth token
$authToken = fetchUrl("https://graph.facebook.com/oauth/access_token?grant_type=client_credentials&client_id=1230330267012270&client_secret=secret");
$json_object = fetchUrl("https://graph.facebook.com/267007566742236/feed?{$authToken}");
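$feedarray = json_decode($json_object);
// Initialise the output buffer so that .= below does not act on an undefined variable
$facebookfeed = '';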
foreach ( $feedarray->data as $feed_data )
{
if($feed_data->message != ''){
$facebookfeed .= '
<div class="item">
<div class="product-item">
<img class="img-responsive" src="images/siteimages/imgfacebook.jpg" alt="">
<h4 class="product-title">'.$feed_data->name.'</h4>
<p class="product-desc">'.$feed_data->message.'</p>
<p>'.$feed_data->story.'</p>
<img src="'.$feed_data->picture.'">
<p>Lees meer</p>
</div><!-- Product item end -->
</div><!-- Item 1 end -->';
}
}
echo $facebookfeed;
Looking at the Facebook documentation I thought $feed_data->picture would work, but it returns nothing.
The Graph API v2.4 changelog explains why the field is missing:
To try to improve performance on mobile networks, Nodes and Edges in v2.4 requires that you explicitly request the field(s) you need for your GET requests. For example, GET /v2.4/me/feed no longer includes likes and comments by default, but GET /v2.4/me/feed?fields=comments,likes will return the data.
Source: https://developers.facebook.com/docs/apps/changelog#v2_4
So picture has to be requested explicitly through the fields parameter of the feed request.
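Applied to the feed request from the question, this means naming the wanted fields in a fields parameter. The sketch below is an assumption-laden illustration: build_feed_url is a hypothetical helper, 'TOKEN' is a placeholder, and the helper expects a bare access token rather than the raw response stored in $authToken.

```php
<?php
// Hypothetical helper: builds a feed URL that explicitly names the wanted fields.
function build_feed_url(string $pageId, string $accessToken, array $fields): string
{
    $query = http_build_query([
        'fields'       => implode(',', $fields),
        'access_token' => $accessToken,
    ], '', '&');
    return "https://graph.facebook.com/v2.4/{$pageId}/feed?{$query}";
}

// The field names are the ones the question's template reads from each post.
$feedUrl = build_feed_url('267007566742236', 'TOKEN', ['message', 'story', 'picture']);
echo $feedUrl, "\n";
```

The resulting URL can be passed to fetchUrl() in place of the original feed URL. Note that http_build_query() percent-encodes the commas in the field list.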
I am using the code below to show some photos from my Instagram account on my website. It fetches all the images of my account into the div. Is there a way to limit the fetched data to 10 images or so?
I can't figure out how to do that.
Thanks for your help!
<div id="instagramfeed">
<?php
// Supply a user id and an access token
$userid = "123xy";
$accessToken = "123xy ";
// Gets our data
function fetchData($url){
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_TIMEOUT, 20);
$result = curl_exec($ch);
curl_close($ch);
return $result;
}
// Pulls and parses data.
$result = fetchData("https://api.instagram.com/v1/users/{$userid}/media/recent/?access_token={$accessToken}");
$result = json_decode($result);
?>
<?php foreach ($result->data as $post): ?>
<!-- Renders images. #Options (thumbnail, low_resolution, standard_resolution) -->
<a class="group" rel="group1" href="<?= $post->images->standard_resolution->url ?>"><img src="<?= $post->images->thumbnail->url ?>"></a>
<?php endforeach ?> <br><br><br><br>
</div>
You could use the count parameter.
https://api.instagram.com/v1/users/{$userid}/media/recent/?access_token={$accessToken}&count=10
It was a problem in the Instagram Developer Console: max_id and min_id don't work there.
For anyone interested, I found a solution to this problem.
It doesn't work with: {$accessToken}&count=10
But it works with the token written out literally:
?access_token=123456789101112131415123123111&count=10
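One possible explanation, judging only from the placeholder in the question, is stray whitespace in the token string ("123xy " ends with a space), which would break the query string once &count=10 is appended. The sketch below is a guess at a more robust setup, not a confirmed fix: recent_media_url and limit_posts are hypothetical helpers, the token is trimmed before the URL is built, and the decoded result is additionally capped on the PHP side.

```php
<?php
// Hypothetical helper: builds the recent-media URL with a count limit.
function recent_media_url(string $userId, string $accessToken, int $count): string
{
    $query = http_build_query([
        'access_token' => trim($accessToken), // drop stray leading/trailing whitespace
        'count'        => $count,
    ], '', '&');
    return 'https://api.instagram.com/v1/users/' . rawurlencode($userId) . "/media/recent/?{$query}";
}

// Client-side safety net: keep at most $limit items regardless of what the API returned.
function limit_posts(array $posts, int $limit): array
{
    return array_slice($posts, 0, $limit);
}

echo recent_media_url('123xy', '123xy ', 10), "\n";
```

In the template from the question, the loop would then iterate over limit_posts($result->data, 10) instead of $result->data.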
I use the following PHP code to get a user's Twitter followers. I want to export this data to a CSV file and add a filter so that only followers who themselves have more than 100 followers are saved.
<script src="http://code.jquery.com/jquery-latest.js"></script>
<?php
$trends_url = "http://api.twitter.com/1/statuses/followers/pthiongo.json";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $trends_url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$curlout = curl_exec($ch);
curl_close($ch);
$response = json_decode($curlout, true);
foreach($response as $friends){
$thumb = $friends['profile_image_url'];
$url = $friends['screen_name'];
$name = $friends['name'];
echo $friends['screen_name'];
?>
<a title="<?php echo $name;?>" href="http://www.twitter.com/<?php echo $url;?>"><img class="photo-img" src="<?php echo $thumb; ?>" border="0" alt="" width="40" /></a>
<?php
}
?>
First, see http://blog.gabrieleromanato.com/2012/06/jquery-get-twitter-followers-count/ for how to get the followers count.
Then use a check such as if ($friends['followers_count'] > 100) in PHP to decide which followers to display or save.
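The filter and the CSV export can be combined with PHP's built-in fputcsv(). The following is a minimal sketch under a few assumptions: followers_to_csv is a hypothetical helper, each follower is an associative array as produced by json_decode($curlout, true), and the sample rows are made up purely for illustration.

```php
<?php
// Hypothetical helper: keeps followers above the threshold and returns CSV text.
function followers_to_csv(array $followers, int $minFollowers): string
{
    $fh = fopen('php://memory', 'w+');
    // Header row; delimiter, enclosure and escape are passed explicitly.
    fputcsv($fh, ['screen_name', 'name', 'followers_count'], ',', '"', '\\');
    foreach ($followers as $f) {
        // Skip entries without a count and those at or below the threshold.
        if (($f['followers_count'] ?? 0) > $minFollowers) {
            fputcsv($fh, [$f['screen_name'], $f['name'], $f['followers_count']], ',', '"', '\\');
        }
    }
    rewind($fh);
    $csv = stream_get_contents($fh);
    fclose($fh);
    return $csv;
}

// Made-up sample data standing in for the decoded API response.
$sampleFollowers = [
    ['screen_name' => 'alice', 'name' => 'Alice', 'followers_count' => 150],
    ['screen_name' => 'bob',   'name' => 'Bob',   'followers_count' => 100],
];
echo followers_to_csv($sampleFollowers, 100);
```

To save the result to disk, the returned string can be written with file_put_contents('followers.csv', $csv).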
Programmatically, you can request a list of up to 100 of your own followers with the Instagram API.
The sample request below is from SnippetLib:
https://api.instagram.com/v1/users/3/followed-by?access_token=ACCESS-TOKEN
This route (from nbyim.com) uses pagination to get around the rate limits:
from instagram.client import InstagramAPI
user_id=''
access_token = ''
client_secret = ''
api = InstagramAPI(access_token=access_token, client_secret=client_secret)
followers = []
# Get the followers list
for p in api.user_followed_by(user_id=user_id, as_generator=True, max_pages=None):
    followers.extend(p[0])
# Convert from an instagram.models.User list to a list of strings
followers = [str(u).replace('User: ', '') for u in followers]
print(len(followers), 'followers')
print(followers)
To automate the CSV process, you can use a service like Crowdbabble: https://www.crowdbabble.com/download-all-instagram-followers/