file_get_contents(): why does the comparison not work? - php

<?
$ip = '95.79.1.36'; //russian ip for test
$str = 'http://ipgeobase.ru:7020/geo?ip='.$ip;
$content = file_get_contents($str);
preg_match_all('#<country>(.*)(</country>)#Usi', $content, $matches);
$country = $matches[0][0];
preg_match_all('#<city>(.*)(</city>)#Usi', $content, $matches);
$city = $matches[0][0];
if($country == 'RU'){
echo 'City: '.$city.'';
}else{
echo 'Country: '.$country.'';
}
?>
The problem is that the comparison $country == 'RU' does not work. My question is: why?
Thanks )))

You probably shouldn't be parsing HTML/XHTML/XML with Regex. See: RegEx match open tags except XHTML self-contained tags
I recommend using PHP's SimpleXML parser. The following worked for me:
<?php
$ip = '95.79.1.36'; //russian ip for test
$str = 'http://ipgeobase.ru:7020/geo?ip='.$ip;
$results = simplexml_load_file($str);
$country = (string) $results->ip->country; // cast the SimpleXMLElement to a plain string
$city = (string) $results->ip->city;
if($country == 'RU'){
echo 'City: '.$city.'';
}else{
echo 'Country: '.$country.'';
}
?>
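For reference, the comparison in the question fails because $matches[0] holds the entire match, tags included, while the text between the tags is in the first capture group, $matches[1]. A minimal sketch of the difference, assuming the service returns XML shaped like the hypothetical sample string below (the sample is made up for illustration and is not fetched from the server):

```php
<?php
// Hypothetical sample of the service's XML response (structure assumed, not fetched).
$content = '<ip-answer><ip value="95.79.1.36"><country>RU</country>'
         . '<city>Nizhny Novgorod</city></ip></ip-answer>';

preg_match('#<country>(.*?)</country>#si', $content, $matches);

$whole   = $matches[0]; // entire match, tags included
$country = $matches[1]; // first capture group, text between the tags

var_dump($whole == 'RU');   // bool(false)
var_dump($country == 'RU'); // bool(true)
```

The SimpleXML version above avoids the issue entirely, since it never deals with the tags as text.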

Your server probably has allow_url_fopen disabled (a php.ini directive). In any case, the technology you are looking for in this particular case is cURL: https://php.net/curl.
I'd be delighted to explain more about your code specifically, and even provide cURL code samples, once you have edited your question properly, with more information and the attempts resulting from your own efforts.

Related

PHP regex to exactly obtain a string I want

I have code for embedding a link in an iframe.
$post_contetn = explode('htt',$content);
$content_with_link = $post_contetn[0];
$link = 'htt'.$post_contetn[1];
But the problem is that, if I write
http://www.espn.com was great
then "was great" becomes part of $link.
How can I change (perhaps use regex) to only include the actual url?
======
If I incorporate siam's answer, should it be
$regex = '/https?:\/\/.*?(?=\s)/';
$post_contetn = preg_match($regex, $content, $linkarray);
$content_with_link = $post_contetn[0];
$link = $linkarray[0]
echo $content_with_link;
I then edited to
preg_match($regex, $content, $post_contetn);
$content_with_link = $post_contetn[0];
$link = $post_contetn[0]
echo $content_with_link;
But the error still occurs at the echo line.
Try using the following regex :
(?:https?:\/\/\S+)?\S+\.\S+\.?\S+
see demo / explanation
PHP
<?php
$content = 'http://www.espn.com was great';
$regex = '/(?:https?:\/\/\S+)?\S+\.\S+\.?\S+/';
preg_match($regex, $content, $post_contetn);
$link = $post_contetn[0];
echo $link;
?>
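A simpler variant may be enough if every link starts with http:// or https:// and ends at the first whitespace character. This is only a sketch under that assumption; the variable names $link and $rest are illustrative and not from the question:

```php
<?php
$content = 'http://www.espn.com was great';

// Match from the scheme up to (but not including) the first whitespace.
if (preg_match('~https?://\S+~', $content, $m)) {
    $link = $m[0];
    // Whatever remains after removing the link is the surrounding text.
    $rest = trim(str_replace($link, '', $content));
} else {
    $link = '';
    $rest = $content;
}

echo $link, "\n"; // http://www.espn.com
echo $rest, "\n"; // was great
```

Note that preg_match returns 1, 0, or false, not the match itself, so the result has to be read from the third argument.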

How to get only center domain name from url

I have many thousands of URLs from which I only want to get the name of the domain, for example
http://google.com
<?php
$url = 'http://google.com';
$host = parse_url($url);
echo '<pre>';
print_r($host['host']);
echo '</pre>';
// Output: google.com
?>
But I only want to get google from http://google.com, not google.com.
Please help, thanks.
Not particularly elegant, but something like this gets just the domain name...
$url = 'http://dev.subdomain.google.com';
$host = parse_url($url, PHP_URL_HOST);
$pieces = explode('.', $host);
$popped = array_pop($pieces); // remove the TLD from the stack
// Heuristic: a two-letter TLD preceded by a generic label (e.g. .co.uk) is likely a multi-part extension, so pop that label too
if (strlen($popped) == 2 && count($pieces) > 1 && in_array(end($pieces), array('co', 'com', 'org', 'net', 'ac', 'gov'))) array_pop($pieces);
$domain = array_pop($pieces);
echo $domain; // prints 'google'
$url = 'http://google.com';
$host = parse_url($url, PHP_URL_HOST); // ask for the host string rather than the whole array
$host = strstr($host, '.com', true); // everything before '.com'
See php.net/strstr for more detailed information. Of course there are other and probably better ways to do it.
Try below code
<?php
$full_url = parse_url('http://facebook.com');
$url = $full_url['host'];
$url_array = explode('.',$url);
echo $url_array[0];
?>
Maybe you can fix it with a regex:
$host = preg_replace('#^https?://|\.(?:com|co\.uk|fr|de|org|net)$#', '', $host);
preg_replace : preg_replace manual (php.net)
test your regex : Debuggex

How to filter URLs that contain white space with preg match?

I parse through a text that contains several links. Some of them contain white spaces but have a file ending. My current pattern is:
preg_match_all('#\bhttps?://[^\s()<>]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#', $links, $match);
This works the same way:
preg_match_all('/(http|https|ftp|ftps)\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,3}(\/\S*)?/', $links, $match);
I don't know much about regex patterns and haven't found a good tutorial that explains the meaning of all possible patterns with examples.
How could I filter a URL like this:
http://my-url.com/my doc.doc or even http://my-url.com/my doc with more white spaces.doc
The \s in those preg_match_all calls stands for whitespace. But how could I check whether there is a file ending after one or more white spaces?
Is it possible?
Why not just make use of PHP's filter functions?
<?php
$url = "http://my-url.com/my doc.doc";
if(!filter_var($url, FILTER_VALIDATE_URL))
{
echo "URL is not valid";
}
else
{
echo "URL is valid";
}
OUTPUT :
URL is not valid
This might be what you are looking for; it uses urlencode:
$file = "my doc with more white spaces.doc";
echo "http://my-url.com/" . urlencode($file);
which produces:
http://my-url.com/my+doc+with+more+white+spaces.doc
or with rawurlencode
produces:
http://my-url.com/my%20doc%20with%20more%20white%20spaces.doc
EDIT: Something like the following might help to parse your urls with parse_url
DEMO
$url = 'http://my-url.com/my doc with more white spaces.doc';
$purl = parse_url($url);
$rurl = "";
if(isset($purl['scheme'])){
$rurl .= $purl['scheme'] . "://";
}
if(isset($purl['host'], $purl['path'])){
$rurl .= $purl['host'] . rawurlencode($purl['path']);
}
if($rurl === ""){
$rurl = $url; // parsing failed or the URL is invalid, so fall back to the original
}
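For paths with subdirectories, encode each segment separately so that the slashes stay unencoded, and then append $purl['path'] without calling rawurlencode on it again: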
$purl['path'] = implode('/', array_map(function($value){return rawurlencode($value);}, explode('/', $purl['path'])));
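The pieces of this answer can be combined into one helper. The following is a minimal sketch, assuming only the scheme, host, and path matter (port, query string, and fragment are dropped for brevity); the function name encode_url_path is hypothetical:

```php
<?php
// Hypothetical helper: percent-encodes each path segment of a URL,
// leaving the slashes between segments untouched.
function encode_url_path(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // could not parse: return the input unchanged
    }
    $path = isset($parts['path']) ? $parts['path'] : '';
    // Encode segment by segment so '/' is not turned into %2F.
    $segments = array_map('rawurlencode', explode('/', $path));
    return $parts['scheme'] . '://' . $parts['host'] . implode('/', $segments);
}

echo encode_url_path('http://my-url.com/my doc with more white spaces.doc'), "\n";
echo encode_url_path('http://my-url.com/docs/my doc.doc'), "\n";
```

Encoding the whole path in a single rawurlencode call would also encode the leading slash, which is why the segments are handled one at a time.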
I don't know much about PHP, but this regex
(http|ftp)(s)?://([\w-]+\.)+[\w-]+(/[\w ./?%&=-]*)?
will match URLs even when they contain spaces.
I think this regex will do.
use this regex
preg_match_all("/^(?si)(?>\s*)(((?>https?:\/\/(?>www\.)?)?(?=[\.-a-z0-9]{2,253}(?>$|\/|\?|\s))[a-z0-9][a-z0-9-]{1,62}(?>\.[a-z0-9][a-z0-9-]{1,62})+)(?>(?>\/|\?).*)?)?(?>\s*)$/", $input_lines, $output_array);
Demo
Alright, after doing this really helpful tutorial I finally know how the regex syntax works. After finishing it I experimented a bit on this site.
It was pretty easy once I figured out that all hyperlinks in my parsed document were between quotation marks, so I just had to change the regex to:
preg_match_all('#\bhttps?://[^()<>"]+#', $links, $match);
so that each match runs up to the closing " and the search then continues with the next match that begins with http.
But that's not the full solution yet. The user Class was right: without applying rawurlencode to the file names it won't work.
So the next step was this:
function endsWith($haystack, $needle)
{
return $needle === "" || substr($haystack, -strlen($needle)) === $needle;
}
if(endsWith($textlink, ".doc") || endsWith($textlink, ".docx") || endsWith($textlink, ".pdf") || endsWith($textlink, ".jpg") || endsWith($textlink, ".jpeg") || endsWith($textlink, ".png")){
$file = substr( $textlink, strrpos( $textlink, '/' )+1 );
$rest_url=substr($textlink, 0, strrpos($textlink, '/' )+1 );
$textlink=$rest_url.rawurlencode($file);
}
That separates the file names from the URLs and rawurlencodes them so that the output links are correct.
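The two steps of this answer can be sketched end to end as follows. The input string is a hypothetical example, and pathinfo is swapped in for the endsWith helper to read the extension; everything else follows the approach described above:

```php
<?php
// Hypothetical input: links are wrapped in double quotes, as the answer describes.
$links = '<a href="http://my-url.com/files/my doc.doc">doc</a> '
       . '<a href="http://my-url.com/page">page</a>';

// Step 1: collect everything from http(s):// up to the closing quote.
preg_match_all('#\bhttps?://[^()<>"]+#', $links, $match);

// Step 2: encode only the file name of links that end in a known extension.
$extensions = array('doc', 'docx', 'pdf', 'jpg', 'jpeg', 'png');
$fixed = array();
foreach ($match[0] as $textlink) {
    $ext = strtolower(pathinfo($textlink, PATHINFO_EXTENSION));
    if (in_array($ext, $extensions, true)) {
        $pos = strrpos($textlink, '/') + 1; // position just after the last slash
        $textlink = substr($textlink, 0, $pos) . rawurlencode(substr($textlink, $pos));
    }
    $fixed[] = $textlink;
}

print_r($fixed);
```

Links without one of the listed extensions pass through unchanged.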
I think this should work:
$url = '...';
$url_new = '';
$array = explode(' ',$url);
foreach($array as $name => $val){
if ($val!=' '){
$url_new = $url_new.$val;
}
}

Error using preg_replace to change a YouTube URL?

I have a sample code:
<?php
$url = 'http://www.youtube.com/watch?v=KTRPVo0d90w';
$pattern = '/http:\/\/www\.youtube\.com\/watch\?(.*?)v=([a-zA-Z0-9_\-]+)(\S*)/i';
$replace = $pattern.'&w=550';
$string = preg_replace($pattern, $replace, $url);
?>
How can I get the result http://www.youtube.com/watch?v=KTRPVo0d90w&w=550 ?
You can just append using the . operator:
<?php
$url = 'http://www.youtube.com/watch?v=KTRPVo0d90w';
$string = $url.'&w=550';
?>
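If preg_replace is to be kept, the issue in the question's code is that the replacement string was built from $pattern itself. A minimal sketch of one way around that, using the ${0} backreference (the whole matched text) in the replacement:

```php
<?php
$url = 'http://www.youtube.com/watch?v=KTRPVo0d90w';
$pattern = '/http:\/\/www\.youtube\.com\/watch\?(.*?)v=([a-zA-Z0-9_\-]+)(\S*)/i';

// ${0} stands for the entire match, so the original URL is kept
// and '&w=550' is appended to it.
$string = preg_replace($pattern, '${0}&w=550', $url);

echo $string; // http://www.youtube.com/watch?v=KTRPVo0d90w&w=550
```

Single quotes matter here: in a double-quoted string PHP itself would try to interpolate ${0} before preg_replace sees it.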
Use preg_match instead:
<?php
$url = 'http://www.youtube.com/watch?v=KTRPVo0d90w&s=222';
$pattern = '/v=[^&]+/i';
preg_match($pattern, $url, $match);
echo 'http://www.youtube.com/watch?'.$match[0].'&w=550';
?>
Like below?
$url = 'http://www.youtube.com/watch?v=KTRPVo0d90w';
$bit = '&w=550';
echo "{$url}{$bit}"; // the "${url}" interpolation form is deprecated as of PHP 8.2
Don't get me wrong, I'm not looking to gain any points here, but just thought I would add to this question and include a few options. I love toying with ideas like this every once in a while.
Using jh314's idea of concatenating the strings, I thought this could be adapted for future use, so that the video's YouTube ID can be swapped out should the occasion ever present itself.
For instance, via a $number variable.
<?php
$url = 'http://www.youtube.com/watch?v=';
$number = 'KTRPVo0d90w';
$string = $url.$number.'&w=550';
// Output to screen
echo $string;
echo "<br>";
// Link to video
echo '<a href="' . htmlspecialchars($string) . '">Click for the video</a>';
?>
The same could easily be done for the video's width.

PHP RegEx for "Website Name"

Duplicate: PHP validation/regex for URL
My goal is to create a PHP regex for a website name. The regex is for a lead-gathering form and should accept any legitimate website name syntax that someone might enter. After an exhaustive search, I'm surprised that I can't find one out there.
Here are the regex matches that I'm looking for:
somewebsite.com
http://somewebsite.com
http://www.somewebsite.com
AND, it should also match:
any of the above with a trailing slash, such as: somewebsite.com/
subdomains
No RegEx necessary.
$subject = 'example.com';
$part = (stripos($subject, 'http://') === FALSE) ? 'http://' : '' ;
var_dump(filter_var($part.$subject, FILTER_VALIDATE_URL));
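The same idea can be wrapped in a small function that also leaves https:// inputs alone. This is a sketch only; the function name is_valid_website is hypothetical, and FILTER_VALIDATE_URL is fairly lenient (it checks URL syntax, not whether the domain exists):

```php
<?php
// Hypothetical helper: accepts "somewebsite.com", "http://somewebsite.com",
// "https://www.somewebsite.com/" and similar inputs.
function is_valid_website(string $input): bool
{
    $input = trim($input);
    // Prepend a scheme only when none is present, so https:// is preserved.
    if (!preg_match('~^https?://~i', $input)) {
        $input = 'http://' . $input;
    }
    return filter_var($input, FILTER_VALIDATE_URL) !== false;
}

var_dump(is_valid_website('somewebsite.com'));              // bool(true)
var_dump(is_valid_website('https://www.somewebsite.com/')); // bool(true)
var_dump(is_valid_website('some website.com'));             // bool(false)
```

Inputs containing spaces are rejected, which is usually what a lead form wants.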
You might need to tweak it:
<?php
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-F\d]{2,2})+(:([\d\w]|%[a-fA-F\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-F\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-F\d]{2,2})*)?$/';
$url1 = "http://www.somewebsite.com";
$url2 = "https://www.somewebsite.com";
$url3 = "https://somewebsite.com";
$url4 = "www.somewebsite.com";
$url5 = "somewebsite.com";
function valURL($pattern, $url) {
$return = false;
if(preg_match($pattern, $url)) {
$return = true;
}
if($return == true) {
echo "Match URL: <font color='green'>" . $url . "</font><br /><br />";
} else {
echo "Try Again: <font color='red'>URL: " . $url . "</font><br /><br />";
}
}
valURL($pattern, $url1);
valURL($pattern, $url2);
valURL($pattern, $url3);
valURL($pattern, $url4);
valURL($pattern, $url5);
?>
I decided to benchmark the answers here to prove that regular expressions are not the answer for such simple tasks. Andy Leekman's code is a whole 30% to 60% quicker than the other answers. He did have a bug, but I fixed that with a line of code. You can view my results below.
Here's the code on which the tests ran.
http://pastie.org/476900
Benchmark results (image): http://img254.imageshack.us/img254/7821/capturevzh.png
PS: If anyone else uses a regular expression to validate a URL, I might go mad ;)
/^([a-z0-9]([-a-z0-9]*[a-z0-9])?\\.)+((a[cdefgilmnoqrstuwxz]|aero|arpa)|(b[abdefghijmnorstvwyz]|biz)|(c[acdfghiklmnorsuvxyz]|cat|com|coop)|d[ejkmoz]|(e[ceghrstu]|edu)|f[ijkmor]|(g[abdefghilmnpqrstuwy]|gov)|h[kmnrtu]|(i[delmnoqrst]|info|int)|(j[emop]|jobs)|k[eghimnprwyz]|l[abcikrstuvy]|(m[acdghklmnopqrstuvwxyz]|mil|mobi|museum)|(n[acefgilopruz]|name|net)|(om|org)|(p[aefghklmnrstwy]|pro)|qa|r[eouw]|s[abcdeghijklmnortvyz]|(t[cdfghjklmnoprtvwz]|travel)|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw])$/i
http://www.shauninman.com/archive/2006/05/08/validating_domain_names
Courtesy of google. It is VERY complex though, so someone else might have a simpler one.
EDIT: Try Andy's answer first. If you can find an alternative to a regex, 9 times out of 10 the alternative is much better.
^(https?://)?(([0-9a-z_!'().&=$%-]: )?[0-9a-z_!'().&=$%-]#)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-z_!'()-]\.)([0-9a-z][0-9a-z-]{0,61})?[0-9a-z]\.[a-z]{2,6})(:[0-9]{1,4})?((/?)|(/[0-9a-z_!*'().;?:#&=$,%#-])/?)$
