Parsing xml to store in mysql using php

Parsing xml to store in mysql using php - php

<?php
header("Content-type: text/xml");
$xml = new SimpleXMLElement("<noresult>1</noresult>");
$fn = urlencode($_REQUEST['fn']);
$ln = urlencode($_REQUEST['ln']);
$co = $_REQUEST['co'];
if (empty($fn) || empty($ln)):
echo $xml->asXML();
exit();
endif;
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.linkedin.com/pub/dir/?first={$fn}&last={$ln}&search=Search");
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/20100101 Firefox/19.0");
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_ENCODING, 'gzip,deflate');
curl_setopt($ch, CURLOPT_TIMEOUT, 8);
$res = curl_exec($ch);
preg_match("/<div id=\"content\".*?<\/div>\s*<\/div>/ms", $res, $match);
if (!empty($match)):
$dom = new DOMDocument();
$dom->loadHTML($match[0]);
$ol = $dom->getElementsByTagName('ol');
$vcard = $dom->getElementsByTagName('li');
$co_match_node = false;
for ($i = 0; $i < $vcard->length; $i++):
if (!empty($co) && stripos($vcard->item($i)->nodeValue, $co) !== false) $co_match_node = $vcard->item($i);
endfor;
if (!empty($co_match_node)):
echo $dom->saveXML($co_match_node);
// my idea is to put code here to save in the database.
else:
echo (string)$dom->saveXML($ol->item(0));
endif;
else:
echo $xml->asXML();
endif;
curl_close($ch);
exit();
I'm trying to save XML into a MySQL database.
However, I don't know how to parse the $dom or how to segregate the "li".
There are 5 fields needed in the database:
span.given-name
span.family-name
span.location
span.industry
dd.current-content span
These fields are available in the XML.

Related

submitting aspx form with Php I have the view but not the data

I want to have the coupon data from planet win I have the view, but not the coupon data. I use all the params of asp forms but I can't have the coupon data please help. I think that I have a problem in the form data or the site work with web service the request header of the xhr rquest is
POST /Sport/default.aspx HTTP/1.1 Host: ww3.365planetwinall.net
Connection: keep-alive Content-Length: 10353 Cache-Control: no-cache
Origin: https://ww3.365planetwinall.net X-MicrosoftAjax: Delta=true
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36 Content-Type:
application/x-www-form-urlencoded; charset=UTF-8 Accept: / Referer:
https://ww3.365planetwinall.net/Sport/default.aspx Accept-Encoding:
gzip, deflate, br Accept-Language:
fr-FR,fr;q=0.8,en-US;q=0.6,en;q=0.4,ar;q=0.2 Cookie:
Comm100_CC_Identity_178373=-28931327; ISBets_CurrentOddsFormat=1;
ISBets_CurrentGMT=41; ASP.NET_SessionId=i2avbkrxv4pvls55sw4d1j45;
__utmt=1; __utma=1.1764843245.1455596018.1473978904.1474078088.172; __utmb=1.2.10.1474078088; __utmc=1; __utmz=1.1473331905.170.21.utmcsr=zalozi.com|utmccn=(referral)|utmcmd=referral|utmcct=/planetwin365;
comm100_session_178373=-35985514;
comm100_guid2_178373=5d22b4d2847a4e0d82cc3db3afeb5177;
ISBets_CurrentCulture=11; _ga=GA1.2.1764843245.1455596018;
_dc_gtm_UA-63917352-3=1; _ga=GA1.3.1764843245.1455596018; _dc_gtm_UA-63917352-10=1
<?php
$url = "https://ww3.365planetwinall.net/Sport/default.aspx";
$ckfile = tempnam("/tmp", "CURLCOOKIE");
$useragent = 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US)
AppleWebKit/533.2 (KHTML, like Gecko) Chrome/5.0.342.3 Safari/533.2';
//$username = "XXXXXXXXXX";
//$password = "XXXXXXXXXX";
$f = fopen('log.txt', 'w'); // file to write request header for debug
purpose
/**
Get __VIEWSTATE & __EVENTVALIDATION
*/
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
$html = curl_exec($ch);
curl_close($ch);
preg_match('~<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="(.*?)" />~', $html, $viewstate);
preg_match('~<input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="(.*?)" />~', $html, $eventValidation);
$viewstate = $viewstate[1];
$eventValidation = $eventValidation[1];
/**
Start Login process
*/
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Accept: application/json'));
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, false);
curl_setopt($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile);
curl_setopt($ch, CURLOPT_HEADER, FALSE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_REFERER, $url);
curl_setopt($ch, CURLOPT_VERBOSE, 1);
curl_setopt($ch, CURLOPT_STDERR, $f);
curl_setopt($ch, CURLOPT_USERAGENT, $useragent);
// Collecting all POST fields
$postfields = array();
$postfields['h$w$SM'] = 'h$w$PC$cCoupon$atlasCoupon|h$w$PC$cCoupon$lnkCaricaCouponCodiceAnonimo';
$postfields['h$w$cLogin$ctrlLogin$Username'] = "";
$postfields['h$w$cLogin$ctrlLogin$Password'] = '';
$postfields['h$w$PC$oddsSearch$txtSearch'] = '';
$postfields['h$w$PC$cSport$hidSportTime'] = '';
$postfields['h$w$PC$ctl02$txtVincita'] = "100";
$postfields['h$w$PC$ctl02$txtGiocata'] = "1";
$postfields['h$w$PC$CouponCheck1$txtCodiceCoupon'] = '';
$postfields['h$w$PC$ctl12$hidQuoteCoupon'] =
'4177834906§4189204249§4192948716§4191682218§4192727992§';
$postfields['h$w$PC$cCoupon$hidRiserva'] = "0";
$postfields['h$w$PC$cCoupon$hidAttesa'] = "0";
$postfields['h$w$PC$cCoupon$hidCouponAsincrono'] = "0";
$postfields['h$w$PC$cCoupon$hidIsTemporaryCoupon'] = '';
$postfields['h$w$PC$cCoupon$hidTipoCoupon'] = "4";
$postfields['h$w$PC$cCoupon$hidStatoCoupon'] = "0";
$postfields['h$w$PC$cCoupon$hidBonusNumScommesse'] = "1.1000";
$postfields['h$w$PC$cCoupon$hidQuotaTotaleDIMax'] = '';
$postfields['h$w$PC$cCoupon$hidQuotaTotaleDIMin'] = '';
$postfields['h$w$PC$cCoupon$hidQuotaTotale'] = '112,66';
$postfields['h$w$PC$cCoupon$hidIDQuote'] = '';
$postfields['h$w$PC$cCoupon$hidModificatoQuote'] = "1";
$postfields['h$w$PC$cCoupon$hidBonusQuotaMinimaAttivo'] = "0";
$postfields['h$w$PC$cCoupon$hidBonusRaggruppamentoMinimo'] = '0';
$postfields['h$w$PC$cCoupon$hidNumItemCoupon'] = '0';
$postfields['h$w$PC$cCoupon$hidPrintAsincronoDisabled'] = '0';
$postfields['h$w$PC$cCoupon$txtCouponCodiceAnonimo'] = 'TD426';
$postfields['h$w$PC$cCoupon$txtIDQuota'] = '';
$postfields['h$w$PC$cCoupon$txtSottoEventName'] = '';
$postfields['h$w$PC$cCoupon$txtQuota'] = '';
$postfields['h$w$PC$cCoupon$txtCodPubblicazione'] = '';
$postfields['h$w$PC$cCoupon$txtIDEvento'] = '';
$postfields['h$w$PC$cCoupon$txtEventName'] = '';
$postfields['h$w$PC$cCoupon$txtIDSottoEvento'] = '';
$postfields['h$w$PC$cCoupon$txtGiocabilita'] = '';
$postfields['h$w$PC$cCoupon$txtTipoQuota'] = '';
$postfields['h$w$PC$cCoupon$txtIDTipoEvento'] = '';
$postfields['h$w$PC$cCoupon$txtIDTipoQuota'] = '';
$postfields['h$w$PC$cCoupon$txtQB'] = '';
$postfields['h$w$PC$cCoupon$txtAddImporto'] = '';
$postfields['h$w$PC$cCoupon$txtIDCouponPrecompilato'] = '';
$postfields['h$w$PC$cCoupon$txtImportoCouponPrecompilato'] = '';
$postfields['__EVENTTARGET'] = "h$w$PC$cCoupon$btnFakeLoad";
$postfields['__EVENTARGUMENT'] = "";
$postfields['__ASYNCPOST'] = "true";
$postfields['__VIEWSTATEGENERATOR'] = "15C4A0A3";
$postfields['__VIEWSTATE'] = $viewstate;
$postfields['__EVENTVALIDATION'] = $eventValidation;
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields);
$ret = curl_exec($ch); // Get result after login page.
var_dump($ret) ;
echo 'Erreur Curl : ' . curl_error($ch);
?>

it wouldn't surprise me if your problem is simply,
you send the POST request using multipart/form-data,
a lot of servers don't parse that correctly, and
would expect application/x-www-form-urlencoded instead.
to fix that, replace curl_setopt($ch, CURLOPT_POSTFIELDS, $postfields); with
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postfields));
other notes:
do not
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, "POST");
simply do
curl_setopt($ch,CURLOPT_POST,true);
instead.
your UA string contains newlines. pretty sure that's not what you want, i don't know any browser that actually has a newline in the user-agent header.
for the sake of portability, fopen with 'wb'
it would probably be better to use DOMDocument to parse your html.
$viewstate=(#DOMDocument::loadHTML($html))->getElementById('__VIEWSTATE')->getAttribute("value");
$eventValidation=(#DOMDocument::loadHTML($html))->getElementById('__EVENTVALIDATION')->getAttribute("value");
(a lot of experts agree on that regular expressions are not fit for parsing html. see RegEx match open tags except XHTML self-contained tags for example)
setting CURLOPT_ENCODING to empty string would magically make your transfers faster.
if you don't need the cookies after the script has finished, you should probably do
$ckfileh=tmpfile();
$ckfile=stream_get_meta_data($ckfileh)['uri']; instead of tmpnam() as it will automatically clear the tmpfiles() at the end of script execution, wheras your tmpnam() approach leaves junk in /tmp unless you manually explicitly delete it on script completion.

function get_headers_from_curl_response($headerContent) {
$headers = [];
// Split the string on every "double" new line.
$arrRequests = explode("\r\n\r\n", $headerContent);
// Loop of response headers. The "count() -1" is to
//avoid an empty row for the extra line break before the body of the esponse.
for ($index = 0; $index < count($arrRequests) - 1; $index++) {
foreach (explode("\r\n", $arrRequests[$index]) as $i => $line) {
if ($i === 0) {
$headers[$index]['http_code'] = $line;
}
else {
list ($key, $value) = explode(': ', $line);
$headers[$index][$key] = $value;
}
}
}
return $headers;
}
function regexExtract($text, $regex, $regs, $nthValue) {
if (preg_match($regex, $text, $regs)) {
$result = $regs[$nthValue];
}
else {
$result = "";
}
return $result;
}
$regexViewstate = '/__VIEWSTATE\" value=\"(.*)\"/i';
$regexEventVal = '/__EVENTVALIDATION\" value=\"(.*)\"/i';
$ch = curl_init("https://m3.365planetwinall.net/Schedina.aspx");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 1);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'cookies.txt');
$response = curl_exec($ch);
curl_close($ch);
$viewstate = regexExtract($response, $regexViewstate, $regs, 1);
$eventval = regexExtract($response, $regexEventVal, $regs, 1);
$params = [
'__EVENTTARGET'=>'ctl00$w$ContentMain$ContentMain$Coupon1$lnkCaricaCouponCodiceAnonimo',
'__VIEWSTATEGENERATOR'=>'748FF232',
'__EVENTARGUMENT' => '',
'__VIEWSTATE' => $viewstate,
'__EVENTVALIDATION' => $eventval,
'ctl00$w$SM'=>'ctl00$w$ContentMain$ContentMain$Coupon1$atlasCoupon|ctl00$w$ContentMain$ContentMain$Coupon1$lnkCaricaCouponCodiceAnonimo',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidRiserva'=>'0',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidAttesa'=>'0',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidTipoCoupon'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidStatoCoupon'=>'0',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidBonusNumScommesse'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidQuotaTotaleDIMax'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidQuotaTotaleDIMin'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidQuotaTotale'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidIDQuote'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidModificatoQuote'=>'1',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidBonusQuotaMinimaAttivo'=>'0',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidBonusRaggruppamentoMinimo'=>'0',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidNumItemCoupon'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidIDCoupon'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$hidPrintAsincronoDisabled'=>'0',
'ctl00$w$ContentMain$ContentMain$Coupon1$txtCouponCodiceAnonimo'=>$code_coupon,
'ctl00$w$ContentMain$ContentMain$Coupon1$txtIDQuota'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$txtQB'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$txtAddImporto'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$txtIDCouponPrecompilato'=>'',
'ctl00$w$ContentMain$ContentMain$Coupon1$txtImportoCouponPrecompilato'=>'',
'__ASYNCPOST'=>'false'
];
$ch2 = curl_init("https://m3.365planetwinall.net/Schedina.aspx");
curl_setopt($ch2, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch2, CURLOPT_HEADER, 1);
curl_setopt($ch2, CURLOPT_POST, true);
curl_setopt($ch2, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch2, CURLOPT_POSTFIELDS, http_build_query($params));
curl_setopt($ch2, CURLOPT_COOKIE, 'cookies.txt');
curl_setopt($ch2, CURLOPT_COOKIEJAR, 'cookies2.txt');
$response2 = curl_exec($ch2);
curl_close($ch2);
foreach (get_headers_from_curl_response($response2) as $value) {
foreach ($value as $key => $value2) {
// echo $key.": ".$value2."<br />";
}
}

Get Paginated Links With php and simple html dom

I have this code to try and get the pagination links using php but the result is not quiet right. could any one help me.
what I get back is just a recurring instance of the first link.
<?php
include_once('simple_html_dom.php');
function dlPage($href) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $href);
curl_setopt($curl, CURLOPT_REFERER, $href);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4");
$str = curl_exec($curl);
curl_close($curl);
// Create a DOM object
$dom = new simple_html_dom();
// Load HTML from a string
$dom->load($str);
$Next_Link = array();
foreach($dom->find('a[title=Next]') as $element){
$Next_Link[] = $element->href;
}
print_r($Next_Link);
$next_page_url = $Next_Link[0];
if($next_page_url !='') {
echo '<br>' . $next_page_url;
$dom->clear();
unset($dom);
//load the next page from the pagination to collect the next link
dlPage($next_page_url);
}
}
$url = 'https://www.jumia.com.gh/phones/';
$data = dlPage($url);
//print_r($data)
?>
what i want to get is
mySiteUrl/?facet_is_mpg_child=0&viewType=gridView&page=2
mySiteUrl//?facet_is_mpg_child=0&viewType=gridView&page=3
.
.
.
to the last link in the pagination. Please help

Here it is. Look that I htmlspecialchars_decode the link. Cause the href in curl there shouldn't be an & like in xml. Should the return value of dlPage the last link in Pagination. I understood so.
<?php
include_once('simple_html_dom.php');
function dlPage($href, $already_loaded = array()) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $href);
curl_setopt($curl, CURLOPT_REFERER, $href);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4");
$htmlPage = curl_exec($curl);
curl_close($curl);
echo "Loading From URL:" . $href . "<br/>\n";
$already_loaded[$href] = true;
// Create a DOM object
$dom = file_get_html($href);
// Load HTML from a string
$dom->load($htmlPage);
$next_page_url = null;
$items = $dom->find('ul[class="osh-pagination"] li[class="item"] a[title="Next"]');
foreach ($items as $item) {
$link = htmlspecialchars_decode($item->href);
if (!isset($already_loaded[$link])) {
$next_page_url = $link;
break;
}
}
if ($next_page_url !== null) {
$dom->clear();
unset($dom);
//load the next page from the pagination to collect the next link
return dlPage($next_page_url, $already_loaded);
}
return $href;
}
$url = 'https://www.jumia.com.gh/phones/';
$data = dlPage($url);
echo "DATA:" . $data . "\n";
And the output is:
Loading From URL:https://www.jumia.com.gh/phones/<br/>
Loading From URL:https://www.jumia.com.gh/phones/?facet_is_mpg_child=0&viewType=gridView&page=2<br/>
Loading From URL:https://www.jumia.com.gh/phones/?facet_is_mpg_child=0&viewType=gridView&page=3<br/>
Loading From URL:https://www.jumia.com.gh/phones/?facet_is_mpg_child=0&viewType=gridView&page=4<br/>
Loading From URL:https://www.jumia.com.gh/phones/?facet_is_mpg_child=0&viewType=gridView&page=5<br/>
DATA:https://www.jumia.com.gh/phones/?facet_is_mpg_child=0&viewType=gridView&page=5

Curl with Simple HTML DOM using Link Pagination

I want to combine Curl and Simple HTML DOM.
Both are working fine separately.
I want to curl a site and then I want to look into the inner data using DOM
with pagination page numbers.
I am using this code.
<?php
include 'simple_html_dom.php';
function dlPage($href) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HEADER, false);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $href);
curl_setopt($curl, CURLOPT_REFERER, $href);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4");
$str = curl_exec($curl);
curl_close($curl);
// Create a DOM object
$dom = new simple_html_dom();
// Load HTML from a string
$dom->load($str);
return $dom;
}
$url = 'http://example.com/';
$data = dlPage($url);
// echo $data;
#######################################################
$startpage = 1;
$endpage = 3;
for ($p=$startpage;$p<=$endpage;$p++) {
$html = file_get_html('http://example.com/page/$p.html');
// connect to main page links
foreach ($html->find('div#link a') as $link) {
$linkHref = $link->href;
//loop through each link
$linkHtml = file_get_html($linkHref);
// parsing inner data
foreach($linkHtml->find('h1') as $title) {
echo $title;
}
foreach ($linkHtml->find('div#data') as $description) {
echo $description;
}
}
}
?>
How can I combine this to make it work as one single script?

Update page after each cURL Request

I have the following code:
$fiz = $_GET['file'];
$file = file_get_contents($fiz);
$trim = trim($file);
$tw = explode("\n", $trim);
$browser = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1468.0 Safari/537.36';
foreach($tw as $twi){
$url = 'https://twitter.com/users/username_available?username='.$twi;
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, '$browser');
curl_setopt($ch, CURLOPT_TIMEOUT, 8);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);
$json = json_decode($result, true);
if($json['valid'] == 1){
echo "Twitter ".$twi." is available! <br />";
$fh = fopen('available.txt', 'a') or die("can't open file");
fwrite($fh, $twi."\n");
} else {
echo "Twitter ".$twi." is taken! <br />";
}
}
And what it does is that it takes list that would look something like:
apple
dog
cat
and so on, and it checks it with Twitter to check if the name is taken or not.
What I want to know is that if it's in any way possible to make the request show up after each check in instead of showing up all at once?

You need to use Ajax calls, If you are familiar with JavaScript or Jquery you can easily do this.
Instead of checking all names at once , use an Ajax function to send one name at a time to the server side PHP code.
Say you send "Cat" first , the page is processed and returns the result using Ajax. Now you can display the result on page.
send "dog" get the response---> display it and so on.
A similar Question has been answered here Return AJAX result for each cURL Request
Hope this helps, I use Jquery here ...
JavaScript
<script>
var keyArray = ('cat','dog','mouse','rat');
function checkUserName(name, keyArray, position){
$("#result").load("namecheck.php", {uesername: name, function(){ // Results are displayed on 'result' element
fetchNext(keyArray, position);
});}
}
function fetchNext(keyArray, position){
position++; // get next name in the array
if(position < keyArray.lenght){ // not exceeding the aray count
checkUserName(keyArray[position], keyArray, position) // make ajax call to check user name
}
}
function startProcess(){
var keyArray = ('cat','dog','mouse','rat');
var position = 0; // get the first element from the array
fetchNext(keyArray, position);
}
</script>
HTML
<div id="result"></div>
<button onclick="startProcess()"> Start Process </button>
PHP
<?
$twi = $_GET['username'];
$browser = 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/28.0.1468.0 Safari/537.36';
$url = 'https://twitter.com/users/username_available?username='.$twi;
$ch = curl_init();
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, '$browser');
curl_setopt($ch, CURLOPT_TIMEOUT, 8);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
$result = curl_exec($ch);
$json = json_decode($result, true);
if($json['valid'] == 1){
echo "Twitter ".$twi." is available! <br />";
$fh = fopen('available.txt', 'a') or die("can't open file");
fwrite($fh, $twi."\n");
}else{
echo "Twitter ".$twi." is taken! <br />";
} ?>

How to parse feed with php

Using Wikiepdia API link to get some basic informations about some world known characters.
Example : (About Dave Longaberger)
This would show as following
Now my question
I'd like to parse the xml to get such basic informations between <extract></extract> to show it.
Here is my idea but failed (I/O warning : failed to load external entity)
<?PHP
$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Dave Longaberger&format=xml&exintro=1';
$xml = simplexml_load_file($url);
// get extract
$text=$xml->pages[0]->extract;
// show title
echo $text;
?>
Another idea but also failed (failed to open stream: HTTP request failed!)
<?PHP
function get_url_contents($url){
$crl = curl_init();
$timeout = 5;
curl_setopt ($crl, CURLOPT_URL,$url);
curl_setopt ($crl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt ($crl, CURLOPT_CONNECTTIMEOUT, $timeout);
$ret = curl_exec($crl);
curl_close($crl);
return $ret;
}
$url = "http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Dave Longaberger&format=xml&exintro=1";
$text = file_get_contents($url);
echo $text;
?>
so any idea how to do it. ~ Thanks
Update (after added urlencode or rawurlencode still not working)
$name = "Dave Longaberger";
$name = urlencode($name);
$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles='.$name.'&format=xml&exintro=1';
$text = file_get_contents($url);
Also not working
$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Dave Longaberger&format=xml&exintro=1';
$url = urlencode($url);
$text = file_get_contents($url);
nor
$url = 'http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles='.rawurlencode('Dave Longaberger').'&format=xml&exintro=1';
$text = file_get_contents($url);
Well so i really don't know looks like it is impossible by somehow.

Set the User Agent Header in your curl request, wikipedia replies with error 403 forbidden otherwise.
<?PHP
$url = "http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Dave+Longaberger&format=xml&exintro=1";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1");
$xml = curl_exec($ch);
curl_close($ch);
echo $xml;
?>
Alternatively:
ini_set("user_agent","Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1");
$url = "http://en.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Dave+Longaberger&format=xml&exintro=1";
$xml = simplexml_load_file($url);
$extracts = $xml->xpath("/api/query/pages/page/extract");
var_dump($extracts);

Look at the note in this php man page
http://php.net/manual/en/function.file-get-contents.php
If you're opening a URI with special characters, such as spaces, you need to encode the URI with urlencode().

We Keep Coding

PHP, A popular general-purpose scripting language that is especially suited to web development.

Parsing xml to store in mysql using php - php

Related

submitting aspx form with Php I have the view but not the data

Get Paginated Links With php and simple html dom

Curl with Simple HTML DOM using Link Pagination

Update page after each cURL Request

How to parse feed with php

Categories

Resources