Find a URL in the content of a site fetched by its URL? - php

I want to search the content of a site, fetched by its URL, for my own URL (for example: http://www.mydomain.com/). If my URL is present, the result should be TRUE; otherwise it should be FALSE.
If the URL appears only in one of the following forms, the result should be FALSE:
- http://www.mydomain.com/blog?12
- www.mydomain.com/news/maste.php
- http://www.mydomain.com/mkds/skas/aksa.html
- www.mydomain.com/
- www.mydomain.com
I want to accept (find) only:
http://www.mydomain.com/ OR http://www.mydomain.com
I tried this:
$url = 'http://www.usersite.com';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL,$url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$contents = curl_exec($ch);
curl_close($ch);
$link="/http:\/\/mydomain.com/";
if(preg_match("/". preg_quote($link,"/"). "/m", $contents) && strstr($contents,"http://www.mydomain.com")){
echo 'TRUE';
} else{
echo 'FALSE';
}
But it doesn't work the way I want. How can I fix it?

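You should not be using preg_quote on your link, as it is already a complete regex. Pass the entire regex /http:\/\/www\.mydomain\.com/m to preg_match instead (note the www and the escaped dots, so that it agrees with the string checked by stripos).
$link="/http:\/\/www\.mydomain\.com/m";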
if(preg_match($link, $contents) && false!== stripos($contents,"http://www.mydomain.com")){
echo 'TRUE';
} else{
echo 'FALSE';
}
I've also replaced strstr with stripos and added a strict comparison (false !==), because stripos returns 0 for a match at the very start of the string, which a loose boolean check would treat as false.
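Neither snippet above rejects longer URLs such as http://www.mydomain.com/blog?12, because both only test for a substring. Here is a minimal sketch of one way to accept only the bare URL, assuming the URL counts as "bare" when it is not followed by another URL character. The function name containsBareUrl and the character set in the lookahead are illustrative assumptions, not part of the original answer:

```php
<?php
// Hypothetical helper: true only if $contents contains $url with nothing
// (or just a trailing slash) after the host name.
function containsBareUrl(string $contents, string $url): bool
{
    // Normalise so that both "http://host" and "http://host/" are accepted.
    $base = rtrim($url, '/');

    // preg_quote() escapes the literal URL; the negative lookahead rejects
    // matches that continue with a path, query, fragment or longer host.
    $pattern = '~' . preg_quote($base, '~') . '/?(?![\w/?#.-])~i';

    return preg_match($pattern, $contents) === 1;
}

$html = '<a href="http://www.mydomain.com/">home</a>';
echo containsBareUrl($html, 'http://www.mydomain.com') ? 'TRUE' : 'FALSE';
```

Because "." is in the lookahead set, a URL that ends a sentence ("... http://www.mydomain.com.") is rejected too; drop it from the set if that matters for your pages.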

Related

Read a remote file in php

I want to show contents of a remote file (a file on another server) on my website.
I used the following code. The readfile() function works fine for a file on the current server:
<?php
echo readfile("editor.php");
But when I tried to get a remote file
<?php
echo readfile("http://example.com/php_editor.php");
it showed the following error:
301 moved
The document has moved here 224
I am getting this error for remote files only; local files are shown with no problem.
Is there any way to fix this?
Thanks!
Option 1 - Curl
Use cURL and set the CURLOPT_FOLLOWLOCATION option to true:
<?php
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://example.com");
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
// Execute the request once and reuse the result
$result = curl_exec($ch);
if($result === FALSE) {
echo "Error: " . curl_error($ch);
} else {
echo $result;
}
curl_close($ch);
?>
Option 2 - file_get_contents
According to the PHP documentation, file_get_contents() follows up to 20 redirects by default, so you could use that function instead. On failure, file_get_contents() returns FALSE; otherwise it returns the entire file as a string.
<?php
$string = file_get_contents("http://www.example.com");
if($string === FALSE) {
echo "Could not read the file.";
} else {
echo $string;
}
?>
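If you want the redirect behaviour to be explicit rather than relying on the defaults, you can pass an HTTP stream context. This is only a sketch: the helper name fetchUrl and the 5 second timeout are my own choices, and it assumes allow_url_fopen is enabled on your server.

```php
<?php
// Hypothetical helper: returns the body as a string, or null on failure.
function fetchUrl(string $url, int $maxRedirects = 20, float $timeout = 5.0): ?string
{
    // HTTP context options: follow Location headers, cap the number of
    // redirects and give up after $timeout seconds.
    $context = stream_context_create([
        'http' => [
            'follow_location' => 1,
            'max_redirects'   => $maxRedirects,
            'timeout'         => $timeout,
        ],
    ]);

    // @ suppresses the warning; the failure is reported through the null return.
    $body = @file_get_contents($url, false, $context);

    return $body === false ? null : $body;
}
```

Calling fetchUrl("http://www.example.com") would then give you either the page body or null, which you can test with === null before echoing.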

How to base64 encode an image from the facebook api

I am attempting to convert an image URL provided by the Facebook API into base64 format with cURL.
The API provides a URL such as:
https://fbcdn-sphotos-g-a.akamaihd.net/hphotos-ak-xfp1/v/t1.0-9/p180x540/72099_736078480783_68792122_n.jpg?oh=f3698c5eed12c1f2503b147d221f39d1&oe=54C5BA4E&__gda__=1418090980_c7af12de6b0dd8abe752f801c1d61e0d
The issue is that the URL only works with the oh, oe and __gda__ parameters included in the query string; there is no direct image URL. Removing the parameters sends you to a Facebook error page.
With the parameterized URL, my curl_exec call is not getting correct image data. Is there a way to get the base64 data from Facebook, or is there something I can do to get the plain image URL from the parameterized one?
This is what my script looks like:
header('Access-Control-Allow-Origin: *');
$url = $_GET['url'];
try {
$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_CONNECTTIMEOUT, 3);
$result = curl_exec($c);
curl_close ($c);
if(false===$result) {
echo 'fail';
} else {
$base64 = "data:image/jpeg;charset=UTF-8;base64,".base64_encode($result);
echo $base64;
}
} catch ( \ErrorException $e ) {
echo 'fail';
}
To address your specific problem: your script is likely failing because the required oh, oe and __gda__ parameters are split off during the GET request and therefore are not included in $_GET['url'].
Make sure you pass the image URL as a URL-encoded string so that its & characters aren't handled as parameter delimiters. Then just decode the string before passing it on to cURL.
...
$url = urldecode($_GET['url']);
...
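On the calling side, the image URL has to be encoded before it goes into the query string. Here is a small sketch, assuming the script lives at the made-up address http://localhost/encode.php and using a shortened placeholder image URL. http_build_query() does the encoding, and parse_str() only simulates what PHP does when it fills $_GET. Keep in mind that PHP already decodes the values in $_GET once, so the extra urldecode() only matters if the value was encoded twice.

```php
<?php
// Placeholder image URL with the same kind of parameters as in the question.
$imageUrl = 'https://example.com/photo.jpg?oh=abc&oe=54C5BA4E&__gda__=1418090980_xyz';

// Encode the whole image URL as the value of the single "url" parameter.
// The script address is an assumption for illustration.
$proxyUrl = 'http://localhost/encode.php?' . http_build_query(['url' => $imageUrl]);

// Simulate the receiving side: split the query string the way PHP does for $_GET.
parse_str((string) parse_url($proxyUrl, PHP_URL_QUERY), $params);

// The parameters survive the round trip intact.
var_dump($params['url'] === $imageUrl);
```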
For anyone curious, you can still load any Facebook image from any one of their legacy CDNs without needing the new parameters:
https://scontent-a-iad.xx.fbcdn.net/hphotos-frc3/
https://scontent-b-iad.xx.fbcdn.net/hphotos-frc3/
https://scontent-c-iad.xx.fbcdn.net/hphotos-frc3/
Just append the original image filename to the URL et voila.
Disclaimer: I have no idea how long this little trick will work for so don't use it on anything important in production.
Maybe this won't help much, but it seems that the original picture (ending with _o) needs neither the __gda__ nor the oe and oh parameters.
To get the original profile picture you can do:
var username_or_id = "name.lastname" //Example
get_url ("http://graph.facebook.com/$username_or_id/picture?width=9999")
hth
I had a similar problem. My solution:
$url = urldecode($url);
return base64_encode(file_get_contents($url));
Where the URL is to Graph API:
https://graph.facebook.com/$user_id/picture?width=160
(You probably also want to check that file_get_contents() actually returned something.)
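A sketch of that check, assuming the bytes come from file_get_contents() or curl_exec(). The helper name toDataUri and the default MIME type are my own assumptions, and "abc" merely stands in for real image bytes:

```php
<?php
// Hypothetical helper: builds a data URI, or returns null when the
// download failed (false) or came back empty.
function toDataUri($bytes, string $mime = 'image/jpeg'): ?string
{
    if ($bytes === false || $bytes === '') {
        return null;
    }

    return 'data:' . $mime . ';base64,' . base64_encode($bytes);
}

// Placeholder bytes instead of a real download.
echo toDataUri('abc'), "\n"; // data:image/jpeg;base64,YWJj
```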
You just need to set CURLOPT_SSL_VERIFYPEER to false, as the URL from Facebook is https and not http. Alternatively, you could request the URL without SSL by replacing https with http.
Try the code below:
$url = 'https://fbcdn-sphotos-g-a.akamaihd.net/hphotos-ak-xfp1/v/t1.0-9/p180x540/72099_736078480783_68792122_n.jpg?oh=f3698c5eed12c1f2503b147d221f39d1&oe=54C5BA4E&__gda__=1418090980_c7af12de6b0dd8abe752f801c1d61e0d';
try {
$c = curl_init($url);
curl_setopt($c, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($c, CURLOPT_CONNECTTIMEOUT, 3);
/***********************************************/
// disable SSL peer verification (CURLOPT_SSL_VERIFYPEER)
curl_setopt($c, CURLOPT_SSL_VERIFYPEER, false);
/***********************************************/
$result = curl_exec($c);
curl_close ($c);
if(false===$result) {
echo 'fail';
} else {
$base64 = '<img alt="Embedded Image" src="data:image/jpeg;charset=UTF-8;base64,'.base64_encode($result).'"/>';
echo $base64;
}
}
catch ( \ErrorException $e ) {
echo 'fail';
}

How to make a curl call to a remote URL that contains a space

This question is a continuation of my previous question.
<?php
$remoteFile = 'http://cdn/bucket/my textfile.txt';
$ch = curl_init($remoteFile);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true); //not necessary unless the file redirects (like the PHP example we're using here)
$data = curl_exec($ch);
print_r($data);
curl_close($ch);
if ($data === false) {
echo 'cURL failed';
exit;
}
$contentLength = 'unknown';
$status = 'unknown';
if (preg_match('/^HTTP\/1\.[01] (\d\d\d)/', $data, $matches)) {
$status = (int)$matches[1];
}
if (preg_match('/Content-Length: (\d+)/', $data, $matches)) {
$contentLength = (int)$matches[1];
}
echo 'HTTP Status: ' . $status . "\n";
echo 'Content-Length: ' . $contentLength;
?>
I am using the code above to get the file size on the server side from a CDN URL, but when the CDN URL contains a space, it returns the error below:
page not found 09/18/2014 - 16:54 http://cdn/bucket/my textfile.txt
Can I make a curl call to a remote URL that contains a space?
To give a little more info on this:
I have an interface where users save files to the CDN (users
can give whatever title they want, so it may contain spaces), and all the
information is saved in the back-end DB. I have another interface where I
retrieve the saved information and show it on my page along with the file
size, which I get using the code above.
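You have to encode the parts of your URL that contain spaces. Note that urlencode() encodes a space as + and, applied to a whole URL, also encodes the : and / separators, so apply it to individual components only:
echo urlencode('my textfile.txt'); // my+textfile.txt
Ref: urlencode
For a path segment, rawurlencode() is the better fit because it encodes a space as %20:
echo '<a href="http://example.com/department_list_script/',
rawurlencode('sales and marketing/Miami'), '">';
Ref: rawurlencode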
Yes, you need to URL-encode (URI-encode) it.
In an encoded URL, spaces are encoded as %20, so your URL would be http://cdn/bucket/my%20textfile.txt, and you could just use this URL directly.
Or, as this is PHP, you could use the rawurlencode function on the file name (urlencode would turn the space into + and, applied to the whole URL, would also encode the : and / characters).
ref: http://php.net/manual/en/function.rawurlencode.php
e.g.
$remoteFile = 'http://cdn/bucket/' . rawurlencode('my textfile.txt');
or
$ch = curl_init(str_replace(' ', '%20', $remoteFile));
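Since the titles come from users, you may prefer to encode every path segment of the stored URL rather than only the spaces. Below is a rough sketch of that idea. The function name encodeUrlPath is made up, and it assumes the stored URLs are absolute (scheme and host present) and that any query string is already encoded.

```php
<?php
// Hypothetical helper: percent-encodes each path segment of an absolute URL
// while leaving the scheme, host, port and query untouched.
function encodeUrlPath(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // not an absolute URL; leave it alone
    }

    // Decode first so that already-encoded segments are not encoded twice.
    $segments = array_map(
        function ($segment) {
            return rawurlencode(rawurldecode($segment));
        },
        explode('/', $parts['path'] ?? '')
    );

    $result = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    $result .= implode('/', $segments);
    if (isset($parts['query'])) {
        $result .= '?' . $parts['query'];
    }

    return $result;
}

echo encodeUrlPath('http://cdn/bucket/my textfile.txt'), "\n";
```

The result can be passed straight to curl_init() in the script from the question.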

Replace one URL with another without regex

I'm trying to replace some URLs in a database (wordpress) with another, but it's tricky because a lot of the URLs are redirects. I'm trying to either replace the URL with the redirected URL, or with a URL of my choosing, based on the result. I can get the matching done without any problems, but I can't replace it. I've tried str_replace, but it doesn't seem to replace the URLs. When I try preg_replace, it will give "Warning: preg_replace(): Delimiter must not be alphanumeric or backslash". Can anyone point me in the right way to do this?
if(preg_match($url_regex,$row['post_content'])){
preg_match_all($url_regex,$row['post_content'],$matches);
foreach($matches[0] as $match){
echo "{$row['ID']} \t{$row['post_date']} \t{$row['post_title']}\t{$row['guid']}";
$newUrl = NULL;
if(stripos($url_regex,'domain1') !== false || stripos($url_regex,'domain2') !== false || stripos($url_regex,'domain3') !== false){
$match = str_replace('&amp;','&',$match);
$ch = curl_init();
curl_setopt($ch,CURLOPT_URL,$match);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,1);
curl_setopt($ch, CURLOPT_HEADER, true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$html = curl_exec($ch);
$newUrl = curl_getinfo($ch, CURLINFO_EFFECTIVE_URL);
if(stripos($newUrl,'domain4') !== false)
$newUrl = NULL;
}
else
if($newUrl == NULL)
{ $newUrl = 'http://www.mysite.com/';
}
echo "\t$match\t$newUrl";
$content = str_replace($match,$newUrl,$row['post_content']);
echo "\t (" . strlen($content).")";
echo "\n";
}
}
This is how you would do it with Perl-compatible regular expressions (preg_replace). Note that every pattern needs delimiters, and the dots are escaped so they match literally.
$baseUrlMappings = array('/www\.yoursite\.com/i' => 'www.mysite.com',
'/www\.yoursite2\.com/i' => 'www.mysite2.com',);
echo preg_replace(array_keys($baseUrlMappings), array_values($baseUrlMappings), 'http://www.yoursite.com/foo/bar?id=123');
echo preg_replace(array_keys($baseUrlMappings), array_values($baseUrlMappings), 'http://www.yoursite2.com/foo/bar?id=123');
http://codepad.viper-7.com/2ne7u6
Please read the manual! You should be able to figure this out.
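Since the title asks for a way without regex: once you have the exact old and new URLs, a plain string replacement is enough, and it avoids the delimiter warning altogether. Here is a sketch with strtr(), where the mapping and the function name replaceUrls are made-up examples:

```php
<?php
// Hypothetical mapping of old URL prefixes to their replacements.
$urlMappings = [
    'http://www.yoursite.com'  => 'http://www.mysite.com',
    'http://www.yoursite2.com' => 'http://www.mysite2.com',
];

// strtr() with an array replaces all keys in a single pass, tries the
// longest key first and never re-scans text it has already replaced.
function replaceUrls(string $content, array $map): string
{
    return strtr($content, $map);
}

$content = 'see http://www.yoursite.com/foo and http://www.yoursite2.com/bar';
echo replaceUrls($content, $urlMappings), "\n";
```

In a loop like the one in the question, the result would have to be carried over from one iteration to the next (and written back to the database); otherwise only the last replacement survives.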

Validate a YouTube URL and check that the video exists

I am new to php.
I want to check whether a YouTube URL is valid and whether the video exists.
Any suggestion would be appreciated.
Here's a solution I wrote using YouTube's oEmbed endpoint.
The first function simply checks whether the video exists on YouTube's server. It assumes that the video does not exist ONLY if a 404 error is returned. 401 (unauthorized) means the video exists, but there are some access restrictions (for example, embedding may be disabled).
Use the second function if you want to check that the video exists AND is embeddable.
<?php
function isValidYoutubeURL($url) {
// Let's check the host first (parse_url returns null when there is no host)
$host = parse_url($url, PHP_URL_HOST);
if (!in_array($host, array('youtube.com', 'www.youtube.com'))) {
return false;
}
$ch = curl_init();
// Use an explicit scheme so the request is not answered with a redirect
$oembedURL = 'https://www.youtube.com/oembed?url=' . urlencode($url).'&format=json';
curl_setopt($ch, CURLOPT_URL, $oembedURL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
// Silent CURL execution
$output = curl_exec($ch);
unset($output);
$info = curl_getinfo($ch);
curl_close($ch);
if ($info['http_code'] !== 404)
return true;
else
return false;
}
function isEmbeddableYoutubeURL($url) {
// Let's check the host first (parse_url returns null when there is no host)
$host = parse_url($url, PHP_URL_HOST);
if (!in_array($host, array('youtube.com', 'www.youtube.com'))) {
return false;
}
$ch = curl_init();
// Use an explicit scheme so the request is not answered with a redirect
$oembedURL = 'https://www.youtube.com/oembed?url=' . urlencode($url).'&format=json';
curl_setopt($ch, CURLOPT_URL, $oembedURL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);
$data = json_decode($output);
if (!$data) return false; // Either 404 or 401 (Unauthorized)
if (!$data->{'html'}) return false; // Embeddable video MUST have 'html' provided
return true;
}
$url = 'http://www.youtube.com/watch?v=QH2-TGUlwu4';
echo isValidYoutubeURL($url) ? 'Valid, ': 'Not Valid, ';
echo isEmbeddableYoutubeURL($url) ? 'Embeddable ': 'Not Embeddable ';
?>
You never read the preg_match docs, did you?
- You need a delimiter. / is the most common, but since you are dealing with a URL, # is easier as it avoids some escaping.
- You need to escape characters with a special meaning in regex, such as ? or .
- The matches are not returned (preg_match returns the number of matches, or false if it failed), so to get the matched string you need the third parameter of preg_match:
preg_match('#https?://(?:www\.)?youtube\.com/watch\?v=([^&]+?)#', $videoUrl, $matches);
As @ThiefMaster said, but I'd like to add something.
The question also asked how to determine whether a video exists.
Do a curl request and then call curl_getinfo(...) to check the HTTP status code.
When it is 200, the video exists; otherwise it doesn't.
For how that works, read here: curl_getinfo
You need to change the regex in the answer above a little, otherwise you only capture the very first character of the video ID (the lazy +? quantifier stops after one character).
Try this:
<?php
$videoUrl = 'http://www.youtube.com/watch?v=cKO6GrbdXfU&feature=g-logo';
preg_match('%https?://(?:www\.)?youtube\.com/watch\?v=([^&]+)%', $videoUrl, $matches);
var_dump($matches);
//array(2) {
// [0]=>
// string(42) "http://www.youtube.com/watch?v=cKO6GrbdXfU"
// [1]=>
// string(11) "cKO6GrbdXfU"
//}
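If you would rather not maintain a regex at all, the same ID can be pulled out with parse_url() and parse_str(). This is only a sketch: the function name extractYoutubeVideoId is made up, and it assumes the plain /watch?v=... URL form from the examples above (no youtu.be short links or /embed/ paths).

```php
<?php
// Hypothetical helper: returns the video ID from a youtube.com watch URL,
// or null when the URL does not have the expected shape.
function extractYoutubeVideoId(string $url): ?string
{
    $host = parse_url($url, PHP_URL_HOST);
    if (!in_array($host, ['youtube.com', 'www.youtube.com'], true)) {
        return null;
    }
    if (parse_url($url, PHP_URL_PATH) !== '/watch') {
        return null;
    }

    // Split the query string into an array, just like $_GET.
    parse_str((string) parse_url($url, PHP_URL_QUERY), $query);
    $id = $query['v'] ?? null;

    return (is_string($id) && $id !== '') ? $id : null;
}

var_dump(extractYoutubeVideoId('http://www.youtube.com/watch?v=cKO6GrbdXfU&feature=g-logo'));
// string(11) "cKO6GrbdXfU"
```

This only tells you the URL is well formed; whether the video exists still needs the HTTP status check described above.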
