php curl, link label modifying through proxy website, not fully working - php

Here is the code
<?php
$url='http://isrc.ulster.ac.uk';
$var = fread_url($url);// function calling to get the page from curl
$i=0;
$linklabel = array();
$linklabelmod = array();
$link = array();
$dom = new DOMDocument();
@$dom->loadHTML($var);
$xpath = new DOMXPath($dom);
foreach($xpath->query('//a') as $element) {
$linklabel[] = $element->textContent;
$link[] = $element->getAttribute("href");
$i=$i+1;
}
for($k=0;$k<$i;$k++) {
$linklabelmod[$k] = str_replace($linklabel[$k], $linklabel[$k]."[$k]", $linklabel[$k]);
$var = preg_replace( "/\\Q$linklabel[$k]\\E/", $linklabelmod[$k], $var, 1 );//modifying link labels
}
print $var;
function fread_url($url){
if(function_exists("curl_init")){
$ch = curl_init();
$user_agent = "Mozilla/4.0 (compatible; MSIE 5.01; "."Windows NT 5.0)";
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt( $ch, CURLOPT_HTTPGET, 1 );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION , 1 );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION , 1 );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt ($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
$html = curl_exec($ch);
//print $html;//will printing the web page .
curl_close($ch);
}
else{
$hfile = fopen($url,"r");
if($hfile){
while(!feof($hfile)){
$html.=fgets($hfile,1024);
}
}
}
return $html;
}
?>
Not all link labels are changing. I want each link label to be modified by appending a unique number. Please run the code to see the error. Thanks in advance.

What about checking whether a match was found before attempting to replace it, using preg_match?
It is not my intention to derail your question by asking this, but how would one reply to someone else's comment? I only see 'add comment' on my own comments. Thank you.
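A possible reason for the missed labels: preg_replace with a limit of 1 replaces the first occurrence of the label text anywhere in the page, which may be inside a title, an attribute, or a label that was already changed. textContent also decodes entities, so the text may not match the raw HTML at all. Here is a minimal sketch that edits the anchors in the DOM and serializes the result instead. The function name number_link_labels is made up for illustration, and the page is assumed to have been fetched already (for example with fread_url).

```php
<?php
// Append "[k]" to the label of the k-th link and return the modified HTML.
// Hypothetical helper; not part of the original code.
function number_link_labels(string $html): string
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $k = 0;
    foreach ($xpath->query('//a') as $anchor) {
        // Add the number as a new text node at the end of the anchor.
        $anchor->appendChild($dom->createTextNode('[' . $k . ']'));
        $k++;
    }
    return $dom->saveHTML();
}
```

Usage would be something like print number_link_labels(fread_url($url)); every link gets its own number, even when two links share the same label.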

Related

Bypass age check on Steam website (with PHP curl)

I know, there is already a thread about this ... see How to pass the steam age check using curl? ... but I'm a new user and can't comment in an existing thread and the answer marked as solution there doesn't work anymore.
I had my own code that worked fine in the past (around 2017), but doesn't work anymore as well.
Here is my code that worked in the past:
function curl_redirect_exec2($ch, &$redirects, $curlopt_header = false) {
$ckfile = tempnam(sys_get_temp_dir(), "CURLCOOKIE");
curl_setopt($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5");
curl_setopt($ch, CURLOPT_POST, 1);
curl_setopt($ch, CURLOPT_POSTFIELDS, 'snr=1_agecheck _agecheck__age-gate&ageDay=1&ageMonth=May&ageYear=1990');
curl_setopt($ch, CURLOPT_FRESH_CONNECT, true);
curl_setopt($ch, CURLOPT_UNRESTRICTED_AUTH, true);
curl_setopt($ch, CURLOPT_COOKIEFILE, $ckfile);
//new start
curl_setopt($ch, CURLOPT_COOKIE, 'mature_content=1; path=/app/'.$gameid.';');
//new end
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
$http_code = curl_getinfo($ch, CURLINFO_HTTP_CODE);
if($http_code == 301 || $http_code == 302) {
list($header) = explode("\r\n\r\n", $data, 2);
$matches = array();
preg_match('/(Location:|URI:)(.*?)\n/', $header, $matches);
$url = trim(array_pop($matches));
$url_parsed = parse_url($url);
if(isset($url_parsed)) {
curl_setopt($ch, CURLOPT_URL, $url);
$redirects++;
return curl_redirect_exec2($ch, $redirects);
}
}
if($curlopt_header) {
return $data;
} else {
list(,$body) = explode("\r\n\r\n", $data, 2);
return $body;
}
}
And here is the code sample from the thread linked above that also seemed to work in the past (but doesn't anymore):
<?php
$url = "http://store.steampowered.com/app/312660/";
// $file = __DIR__ . DIRECTORY_SEPARATOR . "cookie.txt";
// $postData = array(
// 'ageDay' => '31',
// 'ageMonth' => 'July',
// 'ageYear' => '1993'
// );
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_POST,true);
curl_setopt($ch,CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13");
curl_setopt($ch,CURLOPT_POSTFIELDS,$postData);
// curl_setopt($ch,CURLOPT_COOKIESESSION, true);
// curl_setopt($ch,CURLOPT_COOKIEJAR,$file);
// curl_setopt($ch,CURLOPT_COOKIEFILE,$file);
$strCookie = 'mature_content=' . 1 . '; path=/';
curl_setopt( $ch, CURLOPT_COOKIE, $strCookie );
curl_setopt($ch,CURLOPT_FOLLOWLOCATION,true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$data = curl_exec($ch);
curl_close($ch);
echo $data;
?>
What I tested so far:
You can use the game "RUST" as an example: https://store.steampowered.com/app/252490/
Page redirects to age check: https://store.steampowered.com/agecheck/app/252490/
I saw that the cookie set uses other names now ("wants_mature_content" instead of "mature_content" in the JavaScript), but even after changing the PHP to use the new name, it doesn't work.
JavaScript code from Steam page:
function HideAgeGate( )
{
var bHideAll = false;
console.log(bHideAll);
var strCookiePath = bHideAll ? '/' : "\/app\/252490";
V_SetCookie( 'wants_mature_content', 1, 365, strCookiePath );
document.location = "https:\/\/store.steampowered.com\/app\/252490\/Rust\/?snr=";
}
Edit: I also found the function "V_SetCookie" ... in https://store.akamai.steamstatic.com/public/shared/javascript/shared_global.js ... that is called by the code above:
function V_SetCookie( strCookieName, strValue, expiryInDays, path )
{
if ( !path )
path = '/';
var strDate = '';
if( typeof expiryInDays != 'undefined' && expiryInDays )
{
var dateExpires = new Date();
dateExpires.setTime( dateExpires.getTime() + 1000 * 60 * 60 * 24 * expiryInDays );
strDate = '; expires=' + dateExpires.toGMTString();
}
document.cookie = strCookieName + '=' + strValue + strDate + ';path=' + path;
}
Can somebody help please? :-) Thanks!
This is working for me
<?php
$url = 'https://store.steampowered.com/bundle/5699/Grand_Theft_Auto_V_Premium_Edition/';
$ch = curl_init();
curl_setopt($ch, CURLOPT_ENCODING, '');
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, ['Cookie: birthtime=470682001;lastagecheckage=1-0-1985;']);
$html = curl_exec($ch);
curl_close($ch);
var_dump($html);
Ok, got it working.
There are actually 3 cookies: "wants_mature_content", "lastagecheckage" and "birthtime"
Open a page with age check in e.g. Chrome, click on "View Page" and then look for the 3 cookies in chrome (and their content). Set all 3 cookies with PHP's curl and it's working ;-)
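For anyone who wants it spelled out, here is a rough sketch of sending all three cookies in one request. The helper names are made up, and the cookie values are assumptions: wants_mature_content=1 comes from the JavaScript above, while the birthtime and lastagecheckage values are copied from the other answer. Check the values your own browser stores before relying on them.

```php
<?php
// Build a Cookie header value from name => value pairs (hypothetical helper).
function build_cookie_header(array $cookies): string
{
    $pairs = array();
    foreach ($cookies as $name => $value) {
        $pairs[] = $name . '=' . $value;
    }
    return implode('; ', $pairs);
}

// Fetch a store page while sending the three age-check cookies.
function fetch_with_age_cookies(string $url)
{
    $cookies = array(
        'wants_mature_content' => '1',
        'lastagecheckage'      => '1-0-1985',  // assumed value, taken from the other answer
        'birthtime'            => '470682001', // assumed value, taken from the other answer
    );
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_ENCODING, '');
    curl_setopt($ch, CURLOPT_COOKIE, build_cookie_header($cookies));
    $html = curl_exec($ch);
    curl_close($ch);
    return $html; // page HTML, or false on failure
}
```

Call it with the app URL, e.g. fetch_with_age_cookies('https://store.steampowered.com/app/252490/').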

Login to site using php curl [duplicate]

I have some problem with PHP Curl and cookies authentication.
I have a file Connector.php which authenticates users on another server and returns the cookie of the current user.
The Problem is that I want to authenticate thousands of users with curl but it authenticates and saves COOKIES only for one user at a time.
The code for connector.php is this:
<?php
if(!count($_REQUEST)) {
die("No Access!");
}
//Core Url For Services
define ('ServiceCore', 'http://example.com/core/');
//Which Internal Service Should Be Called
$path = $_GET['service'];
//Service To Be Queried
$url = ServiceCore.$path;
//Open the Curl session
$session = curl_init($url);
// If it's a GET, put the GET data in the body
if ($_GET['service']) {
//Iterate Over GET Vars
$postvars = '';
foreach($_GET as $key=>$val) {
if($key!='service') {
$postvars.="$key=$val&";
}
}
curl_setopt ($session, CURLOPT_POST, true);
curl_setopt ($session, CURLOPT_POSTFIELDS, $postvars);
}
//Create And Save Cookies
$tmpfname = dirname(__FILE__).'/cookie.txt';
curl_setopt($session, CURLOPT_COOKIEJAR, $tmpfname);
curl_setopt($session, CURLOPT_COOKIEFILE, $tmpfname);
curl_setopt($session, CURLOPT_HEADER, false);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);
curl_setopt($session, CURLOPT_FOLLOWLOCATION, true);
// EXECUTE
$json = curl_exec($session);
echo $json;
curl_close($session);
?>
Here is the process of authentication:
User enters username and password: Connector.php?service=logon&user_name=user32&user_pass=123
Connector.php?service=logosessionInfo returns info about the user based on the cookies saved earlier with logon service.
The problem is that this code saves the cookie in one file for each user and can't handle multiple user authentications.
You can specify the cookie file with a curl opt. You could use a unique file for each user.
curl_setopt( $curl_handle, CURLOPT_COOKIESESSION, true );
curl_setopt( $curl_handle, CURLOPT_COOKIEJAR, $uniqueFilename );
curl_setopt( $curl_handle, CURLOPT_COOKIEFILE, $uniqueFilename );
The best way to handle it would be to stick your request logic into a curl function and just pass the unique file name in as a parameter.
function fetch( $url, $z=null ) {
$ch = curl_init();
$useragent = isset($z['useragent']) ? $z['useragent'] : 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2';
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_POST, isset($z['post']) );
if( isset($z['post']) ) curl_setopt( $ch, CURLOPT_POSTFIELDS, $z['post'] );
if( isset($z['refer']) ) curl_setopt( $ch, CURLOPT_REFERER, $z['refer'] );
curl_setopt( $ch, CURLOPT_USERAGENT, $useragent );
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, ( isset($z['timeout']) ? $z['timeout'] : 5 ) );
if( isset($z['cookiefile']) ) { // only enable cookie storage when a file was passed in
curl_setopt( $ch, CURLOPT_COOKIEJAR, $z['cookiefile'] );
curl_setopt( $ch, CURLOPT_COOKIEFILE, $z['cookiefile'] );
}
$result = curl_exec( $ch );
curl_close( $ch );
return $result;
}
I use this for quick grabs. It takes the url and an array of options.
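To get one cookie file per user, you need some way to turn the user name into a file name. A small sketch, assuming the user name arrives as a request parameter as in the question; cookie_file_for_user is a made-up helper, not part of cURL.

```php
<?php
// Map a user name to its own cookie jar path (hypothetical helper).
function cookie_file_for_user(string $userName, string $dir): string
{
    // Hash the name so that arbitrary input cannot produce an unsafe file name.
    return rtrim($dir, '/') . '/cookie_' . sha1($userName) . '.txt';
}
```

With the fetch function above, the call would look like fetch($url, array('cookiefile' => cookie_file_for_user('user32', sys_get_temp_dir()))). The same user always maps to the same file, so a later request can reuse the session saved at logon.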
In working with a similar problem I created the following function after combining a lot of resources I ran into on the web, and adding my own cookie handling. Hopefully this is useful to someone else.
function get_web_page( $url, $cookiesIn = '' ){
$options = array(
CURLOPT_RETURNTRANSFER => true, // return web page
CURLOPT_HEADER => true, //return headers in addition to content
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_ENCODING => "", // handle all encodings
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
CURLOPT_TIMEOUT => 120, // timeout on response
CURLOPT_MAXREDIRS => 10, // stop after 10 redirects
CURLINFO_HEADER_OUT => true,
CURLOPT_SSL_VERIFYPEER => true, // Validate SSL Certificates
CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1,
CURLOPT_COOKIE => $cookiesIn
);
$ch = curl_init( $url );
curl_setopt_array( $ch, $options );
$rough_content = curl_exec( $ch );
$err = curl_errno( $ch );
$errmsg = curl_error( $ch );
$header = curl_getinfo( $ch );
curl_close( $ch );
$header_content = substr($rough_content, 0, $header['header_size']);
$body_content = trim(substr($rough_content, $header['header_size'])); // body is everything after the headers
$pattern = "#Set-Cookie:\\s+(?<cookie>[^=]+=[^;]+)#mi"; // header names are case-insensitive
preg_match_all($pattern, $header_content, $matches);
$cookiesOut = implode("; ", $matches['cookie']);
$header['errno'] = $err;
$header['errmsg'] = $errmsg;
$header['headers'] = $header_content;
$header['content'] = $body_content;
$header['cookies'] = $cookiesOut;
return $header;
}
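The cookie extraction step can be tried on its own, without making a request. This is a simplified sketch of the same idea rather than the exact code above: extract_cookies is a made-up name, and it keeps only the name=value part of each Set-Cookie header.

```php
<?php
// Collect the name=value part of every Set-Cookie header in a raw header block.
// Hypothetical helper for illustration.
function extract_cookies(string $headerBlock): string
{
    // "m" anchors ^ at each line start, "i" matches the header name case-insensitively.
    preg_match_all('/^Set-Cookie:\s*([^=;\s]+=[^;\r\n]*)/mi', $headerBlock, $matches);
    return implode('; ', $matches[1]);
}
```

The returned string can be passed straight back in as $cookiesIn on the next call, which avoids cookie files entirely.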
First, create a temporary cookie file using the tempnam() function:
$ckfile = tempnam ("/tmp", "CURLCOOKIE");
Then run a cURL request that saves the cookies to that temporary file:
$ch = curl_init ("http://uri.com/");
curl_setopt ($ch, CURLOPT_COOKIEJAR, $ckfile);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec ($ch);
Or visit a page using the cookie stored in the temporary file:
$ch = curl_init ("http://somedomain.com/cookiepage.php");
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ckfile);
curl_setopt ($ch, CURLOPT_RETURNTRANSFER, true);
$output = curl_exec ($ch);
This line tells cURL to read the cookies for the request from the temporary file:
curl_setopt ($ch, CURLOPT_COOKIEFILE, $ckfile);
Here you can find some useful info about cURL & cookies http://docstore.mik.ua/orelly/webprog/pcook/ch11_04.htm .
You can also use this well done method https://github.com/alixaxel/phunction/blob/master/phunction/Net.php#L89 like a function:
function CURL($url, $data = null, $method = 'GET', $cookie = null, $options = null, $retries = 3)
{
$result = false;
// curl_init() returns a CurlHandle object on PHP 8+, so is_resource() would always be false there
if ((extension_loaded('curl') === true) && (($curl = curl_init()) !== false))
{
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_FAILONERROR, true);
curl_setopt($curl, CURLOPT_AUTOREFERER, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
curl_setopt($curl, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, false);
if (preg_match('~^(?:DELETE|GET|HEAD|OPTIONS|POST|PUT)$~i', $method) > 0)
{
if (preg_match('~^(?:HEAD|OPTIONS)$~i', $method) > 0)
{
curl_setopt_array($curl, array(CURLOPT_HEADER => true, CURLOPT_NOBODY => true));
}
else if (preg_match('~^(?:POST|PUT)$~i', $method) > 0)
{
if (is_array($data) === true)
{
foreach (preg_grep('~^@~', $data) as $key => $value)
{
$data[$key] = sprintf('@%s', rtrim(str_replace('\\', '/', realpath(ltrim($value, '@'))), '/') . (is_dir(ltrim($value, '@')) ? '/' : ''));
}
if (count($data) != count($data, COUNT_RECURSIVE))
{
$data = http_build_query($data, '', '&');
}
}
curl_setopt($curl, CURLOPT_POSTFIELDS, $data);
}
curl_setopt($curl, CURLOPT_CUSTOMREQUEST, strtoupper($method));
if (isset($cookie) === true)
{
curl_setopt_array($curl, array_fill_keys(array(CURLOPT_COOKIEJAR, CURLOPT_COOKIEFILE), strval($cookie)));
}
if ((intval(ini_get('safe_mode')) == 0) && (ini_set('open_basedir', null) !== false))
{
curl_setopt_array($curl, array(CURLOPT_MAXREDIRS => 5, CURLOPT_FOLLOWLOCATION => true));
}
if (is_array($options) === true)
{
curl_setopt_array($curl, $options);
}
for ($i = 1; $i <= $retries; ++$i)
{
$result = curl_exec($curl);
if (($i == $retries) || ($result !== false))
{
break;
}
usleep(pow(2, $i - 2) * 1000000);
}
}
curl_close($curl);
}
return $result;
}
And pass this as $cookie parameter:
$cookie_jar = tempnam('/tmp','cookie');
You can define different cookies for every user with CURLOPT_COOKIEFILE and CURLOPT_COOKIEJAR. Create a different file for every user so that each one has its own cookie-based session on the remote server.
The solutions described above, even with unique cookie file names, can cause a lot of problems at scale.
We had to serve a large number of authentications with this approach, and our server went down because of the high volume of file reads and writes.
Our solution was to use an Apache reverse proxy and drop the cURL requests altogether.
Details on how to set up a proxy in Apache can be found here:
https://httpd.apache.org/docs/2.4/howto/reverse_proxy.html

file_get_contents not retrieving page contents

OK, before saying this is a duplicate just read a bit....
I have been trying to echo contents of URL that has allow_url_fopen disabled for HOURS now, I have tried every solution posted on stack overflow. EXAMPLE:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
$result = curl_exec($ch);
curl_close($ch);
Doesn't WORK
function curl_get_contents($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
Doesn't WORK
$url = "http://www.google.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$data = curl_exec($ch);
curl_close($ch);
echo $data;
Doesn't WORK
fopen("cookies.txt", "w");
$url="http://adfoc.us/1575051";
$ch = curl_init();
$header=array('GET /1575051 HTTP/1.1',
'Host: adfoc.us',
'Accept:text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language:en-US,en;q=0.8',
'Cache-Control:max-age=0',
'Connection:keep-alive',
'Host:adfoc.us',
'User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.116 Safari/537.36',
);
curl_setopt($ch,CURLOPT_URL,$url);
curl_setopt($ch,CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch,CURLOPT_CONNECTTIMEOUT,0);
curl_setopt( $ch, CURLOPT_COOKIESESSION, true );
curl_setopt($ch,CURLOPT_COOKIEFILE,'cookies.txt');
curl_setopt($ch,CURLOPT_COOKIEJAR,'cookies.txt');
curl_setopt($ch,CURLOPT_HTTPHEADER,$header);
$result=curl_exec($ch);
curl_close($ch);
Doesn't WORK
// create the Gateway object
$gateway = new Gateway();
// set our url
$gateway->init($url);
// get the raw response, ignore errors
$response = $gateway->exec();
Doesn't WORK
$file = "http://www.example.com/my_page.php";
if (function_exists('curl_version'))
{
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $file);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$content = curl_exec($curl);
curl_close($curl);
}
else if (file_get_contents(__FILE__) && ini_get('allow_url_fopen'))
{
$content = file_get_contents($file);
}
else
{
echo 'You have neither cUrl installed nor allow_url_fopen activated. Please setup one of those!';
}
This doesn't work.
The page I am trying to use file_get_contents on is not on my website. I am trying to use file_get_contents so i can make a simple API for the site owner by reading a page and checking if a certain word is present on the page.
But yeah if anyone has any suggestions PLEASE post below :)
You can first check whether the site is available, for example with this sample code:
Code taken from here:
<?php
$cURL = curl_init('http://www.technofusions.com/');
curl_setopt ( $cURL , CURLOPT_RETURNTRANSFER , true );
// Follow any kind of redirection that are in the URL
curl_setopt ( $cURL , CURLOPT_FOLLOWLOCATION , true );
$result = curl_exec ( $cURL );
// Getting HTTP response code
$answer = curl_getinfo ( $cURL , CURLINFO_HTTP_CODE );
curl_close ( $cURL );
if ( $answer == 404 ) {
echo 'The site was not found (ERROR 404)!';
} else {
echo 'It looks like everything is working fine ...';
}
?>
For a full answer you can go to this tutorial: Curl IN PHP
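Since "doesn't work" gives nothing to go on, it may help to make cURL report what actually happened. A rough sketch, with made-up function names: it returns the body together with the cURL error message and the HTTP status, so you can tell a connection failure from a 403 or a 404.

```php
<?php
// Turn an HTTP status code into a rough category (hypothetical helper).
function classify_status(int $code): string
{
    if ($code === 0) return 'no response (connection or DNS failure)';
    if ($code >= 200 && $code < 300) return 'success';
    if ($code >= 300 && $code < 400) return 'redirect';
    if ($code >= 400 && $code < 500) return 'client error';
    if ($code >= 500) return 'server error';
    return 'informational';
}

// Fetch a URL and return the body plus diagnostic details.
function fetch_with_diagnostics(string $url): array
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    $body = curl_exec($ch);
    $result = array(
        'body'  => $body,
        // curl_exec() returns false on failure; curl_error() says why.
        'error' => $body === false ? curl_error($ch) : '',
        'code'  => curl_getinfo($ch, CURLINFO_HTTP_CODE),
    );
    $result['status'] = classify_status($result['code']);
    curl_close($ch);
    return $result;
}
```

Dumping the result with var_dump(fetch_with_diagnostics($url)) should show whether the request fails before it reaches the server or the server answers with an error.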

Select HTML content using PHP

I want to get the paragraphs under this tag:
I tried to:
<?php
$doc = new DOMDocument();
$doc->loadHTMLFile("https://sabq.org/xMQjz2");
$elements = $doc->getElementsByTagName('p');
if (!is_null($elements)) {
foreach ($elements as $element) {
$nodes = $element->childNodes;
foreach ($nodes as $node) {
echo $node->textContent. "\n";
}
}
}
?>
And I got the paragraphs I wanted along with unwanted ones, and they were duplicated.
EDIT:
I changed the URL, hope it works
The link you provided throws an error when I access it with DOMDocument, so I found a function that fetches the page contents using curl instead.
I then used preg_match with a regular expression to extract the specific element you were looking for.
Here's the code:
<?php
//opened url
$content = get_fcontent("https://sabq.org/%D8%B4%D8%A7%D9%87%D8%AF-%D8%A3%D9%84%D9%81-%D8%B5%D9%81%D8%AD%D8%A9-%D8%AA%D8%B1%D9%88%D9%8A-%D9%82%D8%B5%D8%B5-%D8%A7%D9%84%D8%AD%D8%B1%D9%85%D9%8A%D9%86-%D9%85%D9%86%D8%B0-%D8%A7%D9%86%D8%B7%D9%84%D8%A7%D9%82-%D8%A7%D9%84%D8%B9%D9%87%D8%AF-%D8%A7%D9%84%D8%B3%D8%B9%D9%88%D8%AF%D9%8A");
//extract specific html tag and its innerHTML
preg_match('/<p .*? ng\-bind\-html\=\"getContent\(material\.content\)\" .*?>.*?<\/p>/m', $content[0], $matches);
//display the wanted element
echo $matches[0];
//getting contents using curl because threw error: failed to open stream
function get_fcontent( $url, $javascript_loop = 0, $timeout = 5 ) {
$url = str_replace( "&amp;", "&", urldecode(trim($url)) );
$cookie = tempnam ("/tmp", "CURLCOOKIE");
$ch = curl_init();
curl_setopt( $ch, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1" );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt( $ch, CURLOPT_COOKIEJAR, $cookie );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION, true );
curl_setopt( $ch, CURLOPT_ENCODING, "" );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, true );
curl_setopt( $ch, CURLOPT_AUTOREFERER, true );
curl_setopt( $ch, CURLOPT_SSL_VERIFYPEER, false ); # required for https urls
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_TIMEOUT, $timeout );
curl_setopt( $ch, CURLOPT_MAXREDIRS, 10 );
$content = curl_exec( $ch );
$response = curl_getinfo( $ch );
curl_close ( $ch );
if ($response['http_code'] == 301 || $response['http_code'] == 302) {
ini_set("user_agent", "Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.7.3) Gecko/20041001 Firefox/0.10.1");
if ( $headers = get_headers($response['url']) ) {
foreach( $headers as $value ) {
if ( substr( strtolower($value), 0, 9 ) == "location:" )
return get_fcontent( trim( substr( $value, 9, strlen($value) ) ) );
}
}
}
if ( ( preg_match("/>[[:space:]]+window\.location\.replace\('(.*)'\)/i", $content, $value) || preg_match("/>[[:space:]]+window\.location\=\"(.*)\"/i", $content, $value) ) && $javascript_loop < 5) {
return get_fcontent( $value[1], $javascript_loop+1 );
} else {
return array( $content, $response );
}
}
?>
For testing, I created a local file called test.html:
<!DOCTYPE html>
<html>
<head>
<title></title>
</head>
<body>
<p>This should not be showing.</p>
<p ng-bind-html="getContent(material.content)" id="dev-content" class="details-text">This is a test.</p>
</body>
</html>
I used the local url http://localhost/example/test.html instead of the link you provided for testing purposes.
And from the local file I created for testing, I got the following result:
<p ng-bind-html="getContent(material.content)" id="dev-content" class="details-text">This is a test.</p>
Here's the result that I got from the original url:
<p ng-bind-html="getContent(material.content)" id="dev-content" class="details-text"></p>
I hope this helps!
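As an alternative to the regex, the same element can be selected with DOMXPath by its id, which does not depend on the order or spacing of the attributes. This is only a sketch: select_content_paragraphs is a made-up name, and it assumes the HTML has already been fetched (for example with get_fcontent). On the original URL the element comes back empty, as shown above, presumably because the page fills it in with JavaScript, so neither approach will see the article text there.

```php
<?php
// Return the text of every <p id="dev-content"> in the given HTML.
// Hypothetical helper for illustration.
function select_content_paragraphs(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $xpath = new DOMXPath($dom);
    $texts = array();
    foreach ($xpath->query('//p[@id="dev-content"]') as $p) {
        $texts[] = trim($p->textContent);
    }
    return $texts;
}
```

Run against the test.html from above, this should return only the "This is a test." paragraph.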

Error on line 14, php curl dom

<?php
$url='http://edition.cnn.com/?fbid=4OofUbASN5k';
$var = fread_url($url);// function calling to get the page from curl
$search = array('#<script[^>]*?>.*?</script>#si'); // Strip out javascript
$var = preg_replace($search, "\n", html_entity_decode($var)); // Strip out javascript
$linklabel = array();
$link = array();
$dom = new DOMDocument($var);
@$dom->loadHTML($var);
$xpath = new DOMXPath($dom);// Grab the DOM nodes
foreach($xpath->find('a') as $element) {
array_push($linklabel, $element->innerText);
print $linklabel;
array_push($link, $element->href);
print $link.'<br>';
}
function fread_url($url) {
if(function_exists("curl_init")) {
$ch = curl_init();
$user_agent = "Mozilla/4.0 (compatible; MSIE 5.01; ".
"Windows NT 5.0)";
$ch = curl_init();
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt( $ch, CURLOPT_HTTPGET, 1 );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION , 1 );
curl_setopt( $ch, CURLOPT_FOLLOWLOCATION , 1 );
curl_setopt( $ch, CURLOPT_URL, $url );
curl_setopt ($ch, CURLOPT_COOKIEJAR, 'cookie.txt');
$html = curl_exec($ch);
//print $html;//printing the web page.
curl_close($ch);
}
else {
$hfile = fopen($url,"r");
if($hfile) {
while(!feof($hfile)) {
$html.=fgets($hfile,1024);
}
}
}
return $html;
}
I need to separate links and link labels into two separate arrays. I followed several forums and put together this code, but I am getting an error. I don't know about the find function used in the code.
Several problems, mainly calls to nonexistent functions and references to nonexistent properties. Corrected version:
<?php
$var = <<<EOD
<html>
sdfd
</html>
EOD;
$linklabel = array();
$link = array();
$dom = new DOMDocument();
@$dom->loadHTML($var);
$xpath = new DOMXPath($dom);
foreach($xpath->query('//a') as $element) {
$linklabel[] = $element->textContent;
$link[] = $element->getAttribute("href");
}
var_dump($linklabel);
var_dump($link);
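Here is the same idea wrapped in a function, as a minimal sketch with sample HTML that actually contains links. The function name extract_links and the sample markup in the usage note are made up for illustration.

```php
<?php
// Split the links of an HTML document into two parallel arrays:
// 'labels' holds the link texts, 'links' holds the href values.
function extract_links(string $html): array
{
    $dom = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();

    $labels = array();
    $links = array();
    $xpath = new DOMXPath($dom);
    foreach ($xpath->query('//a') as $element) {
        $labels[] = $element->textContent;
        $links[] = $element->getAttribute('href');
    }
    return array('labels' => $labels, 'links' => $links);
}
```

For example, extract_links('<a href="/b">Second</a>') gives 'Second' under labels and '/b' under links, at the same index in both arrays.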
