Curl is returning a string - php

I'm using curl to get my values from a site named PKNIC.
My code is:
function _isCurl() {
return function_exists('curl_version');
}
if (_isCurl()) {
//curl is enabled
$url = "https://pk6.pknic.net.pk/pk5/lookup.PK?name=cat.com.pk&jsonp=?";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec($ch);
curl_close($ch);
var_dump($output);
// Curl operations finished
} else {
echo "CURL is disabled";
}
Now when I run this program, it returns the whole page to me as a single string.
I need the registrant name, expiry date, creation date, and contacts. How do I get those? I have no idea how this works; it just gives me a single string whether I use var_dump, print_r, or anything else to view it. How do I get the records I want?

Use a DOM Crawler, like this one: http://symfony.com/doc/current/components/dom_crawler.html.
Then you can get the registrant name like this:
use Symfony\Component\DomCrawler\Crawler;
$crawler = new Crawler($htmlFromCurl);
$registrantName = $crawler->filter('.whitebox tr:nth-child(3) td:last-child')->text();
The filter() method needs the CssSelector component installed; it lets you
use jQuery-like selectors to traverse the document.
You can install both components without using the whole framework:
composer require symfony/dom-crawler symfony/css-selector
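If adding a Composer dependency is not an option, the same idea can be sketched with PHP's built-in DOMDocument and DOMXPath. This is only a minimal sketch: the function name extract_table_pairs and the two-column "label / value" table layout are assumptions about the lookup page, not something the page is known to guarantee.

```php
<?php
/**
 * Collect "label => value" pairs from table rows that have at least two cells.
 * Assumption: the lookup page lists its fields as two-column table rows.
 */
function extract_table_pairs(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them;
    // real-world HTML rarely validates.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $pairs = [];
    foreach ($xpath->query('//tr[count(td) >= 2]') as $row) {
        $cells = $xpath->query('td', $row);
        // First cell is the label (trailing colon removed), last cell is the value.
        $label = rtrim(trim($cells->item(0)->textContent), ':');
        $value = trim($cells->item($cells->length - 1)->textContent);
        if ($label !== '') {
            $pairs[$label] = $value;
        }
    }
    return $pairs;
}

// Usage, with $output being the string returned by curl_exec():
// $info = extract_table_pairs($output);
// echo $info['Registrant Name'];   // the label text is hypothetical
```

The result is an ordinary associative array, so each field can be picked out by its label once you have checked what labels the page actually uses.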

Related

Getting whole HTML element with PHP

I want to get the whole <article> element, which represents one listing (containing the image, the title, its link, and the description), but it doesn't work. Can someone help me please?
<?php
$url = 'http://www.polkmugshot.com/';
$content = file_get_contents($url);
$first_step = explode( '<article>' , $content );
$second_step = explode("</article>" , $first_step[3] );
echo $second_step[0];
?>
You should definitely be using curl for this type of request.
function curl_download($url){
// is cURL installed?
if (!function_exists('curl_init')){
die('cURL is not installed!');
}
$ch = curl_init();
// URL to download
curl_setopt($ch, CURLOPT_URL, $url);
// User agent
curl_setopt($ch, CURLOPT_USERAGENT, "Set your user agent here...");
// Include header in result? (0 = no, 1 = yes)
curl_setopt($ch, CURLOPT_HEADER, 0);
// Should cURL return or print out the data? (true = return, false = print)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Timeout in seconds
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
// Download the given URL, and return output
$output = curl_exec($ch);
// Close the cURL resource, and free system resources
curl_close($ch);
return $output;
}
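For best results, combine it with the Simple HTML DOM Parser and use it like this:
// curl_download() returns a string, so parse it into a DOM object first
$html = str_get_html(curl_download('http://www.polkmugshot.com/'));
// Find all images
foreach ($html->find('img') as $element)
echo $element->src . '<br>';
// Find all links
foreach ($html->find('a') as $element)
echo $element->href . '<br>';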
Good Luck!
I'm not sure I understand you correctly, but I guess you need a PHP DOM parser. I suggest this one (a great PHP library for parsing HTML).
You can also get the whole HTML code like this:
$url = 'http://www.polkmugshot.com/';
$html = file_get_html($url);
echo $html;
Probably a better way would be to parse the document and run some xpath queries over it afterwards, like so:
$url = 'http://www.polkmugshot.com/';
$xml = simplexml_load_file($url);
$articles = $xml->xpath("//article");
foreach ($articles as $article) {
// do sth. useful here
}
Read about SimpleXML here.
Extract the articles with DOMDocument. Working example:
<?php
$url = 'http://www.polkmugshot.com/';
$content = file_get_contents($url);
$domd = new DOMDocument();
// @ suppresses the warnings libxml emits for HTML5 tags such as <article>
@$domd->loadHTML($content);
foreach($domd->getElementsByTagName("article") as $article){
var_dump($domd->saveHTML($article));
}
And as pointed out by @Guns, you'd better use curl, for several reasons:
1: file_get_contents will fail if allow_url_fopen is not enabled in php.ini.
2: Until around PHP 5.5.0, file_get_contents kept reading until the server actually closed the connection, which for many servers can be many seconds after all content has been sent. curl reads only as many bytes as the Content-Length HTTP header announces, which makes for much faster transfers. (Luckily, this has since been fixed.)
3: curl supports gzip and deflate compressed transfers, which again makes for much faster transfers when the content is compressible (such as HTML), while file_get_contents always transfers uncompressed data.
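Since the question asks for the image, title, and link of each listing, here is a minimal sketch of pulling those parts out of every <article> with DOMXPath. The function name extract_listings is hypothetical, and it assumes that the first link and the first image inside an article are the ones of interest, which may not match the real page.

```php
<?php
/**
 * Return one array per <article> with its first link, first image and raw HTML.
 * Assumption: the first <a> and first <img> in each article describe the listing.
 */
function extract_listings(string $html): array
{
    $dom = new DOMDocument();
    // Keep libxml quiet about HTML5 tags it does not know.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $listings = [];
    foreach ($xpath->query('//article') as $article) {
        // item(0) is null when the article has no matching element.
        $link  = $xpath->query('.//a[@href]', $article)->item(0);
        $image = $xpath->query('.//img[@src]', $article)->item(0);
        $listings[] = [
            'title' => $link ? trim($link->textContent) : null,
            'url'   => $link ? $link->getAttribute('href') : null,
            'image' => $image ? $image->getAttribute('src') : null,
            'html'  => $dom->saveHTML($article),
        ];
    }
    return $listings;
}

// Usage, with $content fetched via curl or file_get_contents():
// foreach (extract_listings($content) as $listing) {
//     echo $listing['title'], ' -> ', $listing['url'], "\n";
// }
```

Unlike the explode() approach, this does not depend on the exact spelling of the opening tag, so it keeps working if the articles carry attributes such as a class.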

cURL using info from mySQL, then storing the cURL'ed info

I'm programming in PHP.
An article I have found useful so far was mainly about how to cURL a single site with a lot of information, but what I really need is how to cURL multiple sites with not much information each: a few lines, as a matter of fact!
Also, the article focuses mainly on storing the result in a txt file on an FTP server, but I have loaded around 900 addresses into MySQL and want to load them from there and enrich the table with the information behind the links, which I provide below.
We have some open public libraries with addresses and information about these and an API.
Link to the main site:
The function I would like to use: http://dawa.aws.dk/adresser/autocomplete?q=
SQL Structure:
Data example: http://i.imgur.com/jP1J26U.jpg
For example, this address: Dornen 2 6715 Esbjerg N (called AdrName in the database).
http://dawa.aws.dk/adresser/autocomplete?q=Dornen%202%206715%20Esbjerg%20N
This will give me the following output (which I want to store in the AdrID in the database):
[
{
"tekst": "Dornen 2, Tarp, 6715 Esbjerg N",
"adresse": {
"id": "0a3f50b8-d085-32b8-e044-0003ba298018",
"href": "http://dawa.aws.dk/adresser/0a3f50b8-d085-32b8-e044-0003ba298018",
"vejnavn": "Dornen",
"husnr": "2",
"etage": null,
"dør": null,
"supplerendebynavn": "Tarp",
"postnr": "6715",
"postnrnavn": "Esbjerg N"
}
}
]
How to store it all in a blob, as seen in the SQL structure?
If you want to make a cURL request in PHP, use this function:
function curl_download($Url){
// is cURL installed yet?
if (!function_exists('curl_init')){
die('Sorry cURL is not installed!');
}
// OK cool - then let's create a new cURL resource handle
$ch = curl_init();
// Now set some options (most are optional)
// Set URL to download
curl_setopt($ch, CURLOPT_URL, $Url);
// Set a referer
curl_setopt($ch, CURLOPT_REFERER, "http://www.example.org/yay.htm");
// User agent
curl_setopt($ch, CURLOPT_USERAGENT, "MozillaXYZ/1.0");
// Include header in result? (0 = no, 1 = yes)
curl_setopt($ch, CURLOPT_HEADER, 0);
// Should cURL return or print out the data? (true = return, false = print)
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Timeout in seconds
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
// Download the given URL, and return output
$output = curl_exec($ch);
// Close the cURL resource, and free system resources
curl_close($ch);
return $output;
}
And then you call it using
print curl_download('http://dawa.aws.dk/adresser/autocomplete?q=Melvej');
Or you can directly convert it to a JSON object:
$jsonString=curl_download('http://dawa.aws.dk/adresser/autocomplete?q=Melvej');
var_dump(json_decode($jsonString));
The data you download is JSON, so you can store it in a varchar column rather than a blob.
Also, the site with the API does not seem to care about the HTTP referrer, user agent, etc., so you can use file_get_contents in place of curl.
So simply get all the results from your db, iterate over them, making a call to the api, and update the appropriate row with the correct data:
//get all the rows from your database
$addresses = DB::exec('SELECT * FROM addresses'); //i dont know how you actually access your db, this is just an example
foreach($addresses as $address){
$searchTerm = $address['AdrName'];
$addressId = $address['Vid'];
//download the json
$apidata = file_get_contents('http://dawa.aws.dk/adresser/autocomplete?q=' . urlencode($searchTerm));
//save back to db
DB::exec('UPDATE addresses SET AdrID=? WHERE Vid=?', [$apidata, $addressId]); //column names assumed from the question; match the row by its id, not by the search term
//if you want to access the data, you can use json_decode:
$data = json_decode($apidata);
echo $data[0]->tekst; //outputs Dornen 2, Tarp, 6715 Esbjerg N
}
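If only the address id should end up in the AdrID column, rather than the whole JSON document, the response can be decoded first. The following is a minimal sketch under assumptions: the helper names extract_address_id and store_address_id are made up for illustration, and the table and column names (addresses, AdrID, Vid) are guesses based on the question.

```php
<?php
/**
 * Return the "id" of the first autocomplete hit, or null if there is none.
 */
function extract_address_id(string $json): ?string
{
    $data = json_decode($json, true);
    if (!is_array($data) || !isset($data[0]['adresse']['id'])) {
        return null; // invalid JSON or no match for the search term
    }
    return $data[0]['adresse']['id'];
}

/**
 * Store the id with a prepared statement.
 * Table and column names are assumptions, adjust them to your schema.
 */
function store_address_id(PDO $pdo, int $rowId, string $addressId): void
{
    $stmt = $pdo->prepare('UPDATE addresses SET AdrID = ? WHERE Vid = ?');
    $stmt->execute([$addressId, $rowId]);
}

// Usage inside the loop over your rows:
// $id = extract_address_id($apidata);
// if ($id !== null) {
//     store_address_id($pdo, (int) $address['Vid'], $id);
// }
```

Checking for null matters here because an address that the API cannot match comes back as an empty JSON array.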

How to call posts from PHP

I have a website that uses the WP Super Cache plugin. I need to recycle the cache once a day and then call 5 posts (URLs) so that WP Super Cache puts these posts into the cache again (caching is quite time-consuming, so I'd like to have them precached before users come so they don't have to wait).
On my hosting I can use a CRON but only for 1 call/hour. And I need to call 5 different URL's at once.
Is it possible to do that? Maybe create one HTML page with these 5 posts in iframe? Will something like that work?
Edit: Shell is not available, so I have to use PHP scripting.
The easiest way to do it in PHP is to use file_get_contents() (fopen() also works), if the HTTP stream wrapper is enabled on your server:
<?php
$postUrls = array(
'http://my.site.here/post1',
'http://my.site.here/post2',
'http://my.site.here/post3',
'http://my.site.here/post4',
'http://my.site.here/post5',
);
foreach ($postUrls as $url) {
// Get the post as a user would
$text = file_get_contents($url);
// Here you can check if the request was successful
// For example, use strpos() or regex to find a piece of text you expect
// to find in the post
// Replace 'copyright bla, bla, bla' with a piece of text you display
// in the footer of your site
if (strpos($text, 'copyright bla, bla, bla') === FALSE) {
echo('Retrieval of '.$url." failed.\n");
}
}
If file_get_contents() fails to open the URLs on your server (some hosting providers restrict this behaviour), you can try to use curl:
function curl_get_contents($url)
{
$ch = curl_init($url);
curl_setopt_array($ch, array(
CURLOPT_CONNECTTIMEOUT => 30, // timeout in seconds
CURLOPT_RETURNTRANSFER => TRUE, // tell curl to return the page content instead of just TRUE/FALSE
));
$text = curl_exec($ch);
curl_close($ch);
return $text;
}
Then use the function curl_get_contents() listed above instead of file_get_contents().
An example using PHP without building a cURL request.
Using PHP's shell exec, you can have an extremely light function like so :
$siteList = array("http://url1", "http://url2", "http://url3", "http://url4", "http://url5");
foreach ($siteList as $site) {
// Escape the URL so it cannot inject shell commands
$request = shell_exec('wget ' . escapeshellarg($site));
}
Of course, this is not the most concise answer and not always a good solution. Also, if you actually want anything from the response, you will have to work with it differently than with cURL, but it's a low-impact option.
Thanks to Arkascha's tip, I created a PHP page that I call from CRON. This page contains a simple function using cURL:
function cache_it($Url){
if (!function_exists('curl_init')){
die('No cURL, sorry!');
}
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $Url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_TIMEOUT, 50); //higher timeout needed for cache to load
curl_exec($ch); //dont need it as output, otherwise $output = curl_exec($ch);
curl_close($ch);
}
cache_it('http://www.mywebsite.com/url1');
cache_it('http://www.mywebsite.com/url2');
cache_it('http://www.mywebsite.com/url3');
cache_it('http://www.mywebsite.com/url4');
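Because the page runs unattended from CRON, it can help to know which URLs failed to load. Here is a minimal sketch that loops over the URLs and collects the failures. The names warm_cache and curl_fetch are hypothetical, and the fetcher is passed in as a callable so the loop can be tried out without making real requests.

```php
<?php
/**
 * Request every URL once and return the ones that could not be fetched.
 * $fetch is any callable that returns the body as a string, or false on failure.
 */
function warm_cache(array $urls, callable $fetch): array
{
    $failed = [];
    foreach ($urls as $url) {
        $body = $fetch($url);
        if ($body === false || $body === '') {
            $failed[] = $url;
        }
    }
    return $failed;
}

/**
 * A cURL-based fetcher; returns false on network errors and on HTTP status >= 400.
 */
function curl_fetch(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true,
        CURLOPT_FAILONERROR    => true,
        CURLOPT_TIMEOUT        => 50, // generous, since uncached pages are slow
    ]);
    $body = curl_exec($ch);
    curl_close($ch);
    return $body;
}

// Usage from the CRON page:
// $failed = warm_cache(['http://www.mywebsite.com/url1'], 'curl_fetch');
// foreach ($failed as $url) {
//     error_log("Cache warm-up failed for $url");
// }
```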

Scrape a statistic from YouTube using PHP

After struggling for 3 hours at trying to do this on my own, I have decided that it is either not possible or not possible for me to do on my own. My question is as follows:
How can I scrape the numbers in the attached image using PHP to echo them in a webpage?
Image URL: http://gyazo.com/6ee1784a87dcdfb8cdf37e753d82411c
Please help. I have tried almost everything, from using cURL, to using a regex, to trying an xPath. Nothing has worked the right way.
I only want the numbers by themselves in order for them to be isolated, assigned to a variable, and then echoed elsewhere on the page.
Update:
http://youtube.com/exonianetwork - The URL I am trying to scrape.
/html/body[@class='date-20121213 en_US ltr ytg-old-clearfix guide-feed-v2 site-left-aligned exp-new-site-width exp-watch7-comment-ui webkit webkit-537']/div[@id='body-container']/div[@id='page-container']/div[@id='page']/div[@id='content']/div[@id='branded-page-default-bg']/div[@id='branded-page-body-container']/div[@id='branded-page-body']/div[@class='channel-tab-content channel-layout-two-column selected blogger-template ']/div[@class='tab-content-body']/div[@class='secondary-pane']/div[@class='user-profile channel-module yt-uix-c3-module-container ']/div[@class='module-view profile-view-module']/ul[@class='section'][1]/li[@class='user-profile-item '][1]/span[@class='value']
The xPath I tried, which didn't work for some unknown reason. No exceptions or errors were thrown, and nothing was displayed.
Perhaps a simple XPath would be easier to manipulate and debug.
Here's a Short Self-Contained Correct Example (watch for the space at the end of the class name):
#!/usr/bin/env php
<?php
$url = "http://youtube.com/exonianetwork";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_AUTOREFERER, true);
curl_setopt($ch, CURLOPT_FAILONERROR, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_TIMEOUT, 10);
$html = curl_exec($ch);
if (!$html)
{
print "Failed to fetch page. Error handling goes here";
}
curl_close($ch);
$dom = new DOMDocument();
@$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$profile_items = $xpath->query("//li[@class='user-profile-item ']/span[@class='value']");
if ($profile_items->length === 0) {
print "No values found\n";
} else {
foreach ($profile_items as $profile_item) {
printf("%s\n", $profile_item->textContent);
}
}
?>
Execute:
% ./scrape.php
57
3,593
10,659,716
113,900
United Kingdom
If you are willing to try a regex again, this pattern should work:
!Network Videos:</span>\r\n +<span class=\"value\">([\d,]+).+Views:</span>\r\n +<span class=\"value\">([\d,]+).+Subscribers:</span>\r\n +<span class=\"value\">([\d,]+)!s
It captures the numbers with their embedded commas, which would then need to be stripped out. I'm not familiar with PHP, so cannot give you more complete code
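For the PHP side of the regex approach, here is a minimal sketch of capturing the numbers and stripping the embedded commas. It uses a simpler pattern than the one above, matching any span with class "value" that holds only digits and commas. The function name extract_counts is hypothetical, and the markup is assumed to look like the spans in that pattern.

```php
<?php
/**
 * Return every numeric <span class="value"> as an integer, in page order.
 * Non-numeric values (such as a country name) are skipped by the pattern.
 */
function extract_counts(string $html): array
{
    preg_match_all('!<span class="value">\s*([\d,]+)\s*</span>!', $html, $matches);
    $counts = [];
    foreach ($matches[1] as $raw) {
        // "10,659,716" becomes 10659716
        $counts[] = (int) str_replace(',', '', $raw);
    }
    return $counts;
}

// Usage, with $html fetched via cURL:
// list($videos, $views) = extract_counts($html);
// echo $views;
```

The values come back in document order, so which index holds which statistic depends on the page layout and should be checked against the real markup.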

is there another way to do this HTTP request in php?

function do_post_request($url, $data, $optional_headers = null) {
$request = new HttpRequest($url, HttpRequest::METH_POST);
$request->setBody($data);
$response = $request->send();
return $response->getBody();
}
This piece of code doesn't seem to be working, and seems to crash my script. I don't know if its because I don't have the php_http module, but is there an equivalent I can use?
For instance curl? I have tried curl, but I don't know much about it, and with curl I got a "bad request" returned from the server I was trying to connect to with a 400 status.
Anything would be good
Thanks
Tom
Edit:
function do_post_request($url, $data, $optional_headers = null) {
$request = new HttpRequest($url, HttpRequest::METH_POST);
$request->setBody($data);
$response = $request->send();
return $response->getBody();
}
echo "before";
$response = do_post_request($url, $data);
echo "After";
Doing that makes "before" appear on the page. But no "After".
After managing to turn error reporting on I get this:
Fatal error: Class 'HttpRequest' not found in /home/sites/ollysmithwineapp.com/public_html/mellowpages/geocode.php on line 25
So I need another way to do the HTTP Request.
Are you sure the HTTP extension is installed and configured correctly? From the manual:
Installation/Configuration
Installation
This PECL extension is not bundled with PHP.
Information for installing this PECL extension may be found in the manual chapter titled Installation of PECL extensions. Additional information such as new releases, downloads, source files, maintainer information, and a CHANGELOG, can be located here: http://pecl.php.net/package/pecl_http.
Alternatively, cURL may be the way to go:
RAW POST using cURL in PHP
PHP4: Send XML over HTTPS/POST via cURL?
Stolen from this question. You can insert $data directly where CURLOPT_POSTFIELDS is set in place of the query string.
<?php
//
// A very simple PHP example that sends a HTTP POST to a remote site
//
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,"http://www.mysite.com/tester.phtml");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS,
"postvar1=value1&postvar2=value2&postvar3=value3");
// receive server response ...
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$server_output = curl_exec ($ch);
curl_close ($ch);
// further processing ....
if ($server_output == "OK") { ... } else { ... }
?>
I also found a solution using stream_context_create(). It gives you more control over what you're sending in the POST.
Here's a blog post explaining how to do it. It lets you easily specify the exact headers and body.
http://wezfurlong.org/blog/2006/nov/http-post-from-php-without-curl/
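A minimal sketch of the stream_context_create() approach, which needs neither the HTTP extension nor cURL. The helper name build_post_options is hypothetical, and the sketch assumes form-encoded fields and that allow_url_fopen is enabled on the server.

```php
<?php
/**
 * Build the stream context options for a form-encoded POST.
 */
function build_post_options(array $fields, array $headers = []): array
{
    $headers[] = 'Content-Type: application/x-www-form-urlencoded';
    return [
        'http' => [
            'method'  => 'POST',
            'header'  => implode("\r\n", $headers),
            'content' => http_build_query($fields),
            'timeout' => 10, // seconds
        ],
    ];
}

/**
 * Send the POST and return the response body.
 * Throws if the request fails (file_get_contents() returns false).
 */
function do_post_request(string $url, array $fields, array $headers = []): string
{
    $context = stream_context_create(build_post_options($fields, $headers));
    $body = @file_get_contents($url, false, $context);
    if ($body === false) {
        throw new RuntimeException("POST to $url failed");
    }
    return $body;
}

// Usage:
// echo do_post_request('http://www.example.com', ['examplefield' => 'exampledata']);
```

Throwing on failure makes a broken request visible, instead of the script stopping silently as in the question.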
There is no HttpRequest::setBody() method. You should use setPostFields() instead (or addPostFields() to append to fields that are already set), passing an associative array:
function do_post_request($url, $data, $optional_headers = null) {
$request = new HttpRequest($url, HttpRequest::METH_POST);
$request->setPostFields($data);
$response = $request->send();
return $response->getBody();
}
$responseBody = do_post_request('http://www.example.com',array('examplefield'=>'exampledata'));
