Having trouble with preg_match pinterest username url - php

Please help. The statement I am using to match a Pinterest username URL is:
$url = "http://pinterest.com/username";
preg_match("|^http(s)?://pinterest.com/(.*)?$|i", $url);
but preg_match returns 0.

You are missing the third parameter of the preg_match function.
$url = "http://pinterest.com/username";
preg_match("|^http(s)?://pinterest.com/(.*)?$|i", $url, $match);
print_r($match);
results in
Array
(
[0] => http://pinterest.com/username
[1] =>
[2] => username
)
Or in an if statement:
$url = "http://pinterest.com/username";
if (preg_match("|^http(s)?://pinterest.com/(.*)?$|i", $url, $match)) {
// true
}

<?php
$url = "http://pinterest.com/username";
if(preg_match("|^http(s)?://pinterest.com/(.*)?$|i", $url)){
echo "true";
}
else{
echo "false";
}
?>
output:
true
What else do you want?

No one has mentioned that the dot needs to be escaped.
So more correct code would be something like this:
$url = "https://pinterest.com/username";
preg_match("|(?:https?://)?(?:www\.)?pinterest\.com/([^/?#]+)|i", $url, $match);
$match[1] will contain the username. I don't know Pinterest's rules for usernames, so I just match everything between the slashes.
It will work with links like:
https://pinterest.com/username/
https://www.pinterest.com/username
pinterest.com/username
and others.
Don't use this regular expression for validation.
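As a rough sketch of the approach above, the pattern can be wrapped in a small function. The function name pinterestUsername is hypothetical, and the rule "a username is everything up to the next /, ? or #" is an assumption, not Pinterest's documented format.

```php
<?php
// Hypothetical helper: returns the first path segment of a Pinterest profile URL,
// or null when the input does not look like a pinterest.com link.
function pinterestUsername(string $url): ?string
{
    // Scheme and "www." are optional. The capture stops at "/", "?" or "#"
    // (an assumption about what can follow a username).
    $pattern = '~^(?:https?://)?(?:www\.)?pinterest\.com/([^/?#]+)~i';

    return preg_match($pattern, $url, $match) === 1 ? $match[1] : null;
}

var_dump(pinterestUsername('https://www.pinterest.com/username/')); // string "username"
var_dump(pinterestUsername('pinterest.com/username'));              // string "username"
var_dump(pinterestUsername('https://example.com/username'));        // NULL
```

Anchoring the pattern with ^ keeps it from matching a pinterest.com link buried inside some other URL, which may or may not be what you want.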

Related

php regex preg_match on a variable containing a url

I'm trying to run a regex on a url to extract all the segments after the host. I can't get it working when the host segment is in a variable and i'm not sure how to get it working
// this works
if(preg_match("/^http\:\/\/myhost(\/[a-z0-9A-Z-_\/.]*)$/", $url, $matches)) {
return $matches[1];
}
// this doesn't work
$siteUrl = "http://myhost";
if(preg_match("/^$siteUrl(\/[a-z0-9A-Z-_\/.]*)$/", $url, $matches)) {
return $matches[1];
}
// this doesn't work
$siteUrl = preg_quote("http://myhost");
if(preg_match("/^$siteUrl(\/[a-z0-9A-Z-_\/.]*)$/", $url, $matches)) {
return $matches[1];
}
In PHP, there is a function called parse_url that does something similar to what you are trying to achieve with your code:
<?php
$url = 'http://username:password@hostname/path?arg=value#anchor';
print_r(parse_url($url));
echo parse_url($url, PHP_URL_PATH);
?>
OUTPUT :
Array
(
[scheme] => http
[host] => hostname
[user] => username
[pass] => password
[path] => /path
[query] => arg=value
[fragment] => anchor
)
/path
You forgot to escape the / in your variable declaration. One quick fix is to change your regex delimiter from / to #. Try:
$siteUrl = "http://myhost";
if(preg_match("#^$siteUrl(\/[a-z0-9A-Z-_\/.]*)$#", $url, $matches)) { //note the hashtags!
return $matches[1]; // the pattern has only one capture group
}
Or without changing the regex delimiter:
$siteUrl = "http:\/\/myhost"; //note how we escaped the slashes
if(preg_match("/^$siteUrl(\/[a-z0-9A-Z-_\/.]*)$/", $url, $matches)) {
return $matches[1];
}
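Another option, sketched here under the assumption that the host prefix comes from a variable, is to pass the delimiter as the second argument to preg_quote() so that the slashes are escaped for you. The helper name pathAfterHost is made up for illustration.

```php
<?php
// Hypothetical helper: returns the path that follows a known site prefix,
// or null when $url does not start with that prefix.
function pathAfterHost(string $siteUrl, string $url): ?string
{
    // The second argument makes preg_quote() escape the "/" delimiter as well.
    $quoted  = preg_quote($siteUrl, '/');
    $pattern = '/^' . $quoted . '(\/[a-zA-Z0-9_\/.-]*)$/';

    // The pattern has a single capture group, so the path is in $matches[1].
    return preg_match($pattern, $url, $matches) === 1 ? $matches[1] : null;
}

var_dump(pathAfterHost('http://myhost', 'http://myhost/some/path.html'));
var_dump(pathAfterHost('http://myhost', 'http://otherhost/some/path.html'));
```

The first call yields "/some/path.html" and the second yields NULL. Without the delimiter argument, preg_quote() leaves the slashes alone, which is why the third attempt in the question fails.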

PHP preg_match, Finding a package name from Android URL address

I need to get Android package name from the URL address.
Here is what I have done.
$url = 'https://play.google.com/store/apps/details?id=com.gamevil.projectn.global&feature=featured-apps#?t=W251bGwsMSwyLDIwMywiY29tLmdhbWV2aWwucHJvamVjdG4uZ2xvYmFsIl0.';
preg_match("~id=(\d+)~", $url, $matches);
$package_name = $matches[1];
echo $package_name;
Package name should be "com.gamevil.projectn.global"
However, my code is not working.
Is there something that I miss?
You can do this with the parse_url function:
<?php
$url = 'https://play.google.com/store/apps/details?id=com.gamevil.projectn.global&feature=featured-apps#?t=W251bGwsMSwyLDIwMywiY29tLmdhbWV2aWwucHJvamVjdG4uZ2xvYmFsIl0.';
$arr = parse_url($url);
$new = explode("&", $arr['query']);
$new1 = explode("=", $new[0]);
echo $new1[1];
Output:
com.gamevil.projectn.global
Maybe this can help you:
$url = 'https://play.google.com/store/apps/details?id=com.gamevil.projectn.global&feature=featured-apps#?t=W251bGwsMSwyLDIwMywiY29tLmdhbWV2aWwucHJvamVjdG4uZ2xvYmFsIl0.';
preg_match("/id=(.*?)&/", $url, $matches);
$package_name = $matches[1];
echo $package_name;
preg_match will now find everything between id= and &.
But a better solution is to use parse_url to parse the url and this function will return the components of the url.
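A minimal sketch of that parse_url approach, combined with parse_str so the id parameter is found regardless of its position in the query string. The function name playStorePackageName is hypothetical.

```php
<?php
// Hypothetical helper: reads the "id" query parameter from a Play Store URL.
function playStorePackageName(string $url): ?string
{
    // parse_url() returns null when the URL has no query string.
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    // parse_str() decodes the query string into an associative array.
    parse_str($query, $params);

    return isset($params['id']) && is_string($params['id']) ? $params['id'] : null;
}

$url = 'https://play.google.com/store/apps/details?id=com.gamevil.projectn.global&feature=featured-apps#?t=W251bGwsMSwyLDIwMywiY29tLmdhbWV2aWwucHJvamVjdG4uZ2xvYmFsIl0.';
echo playStorePackageName($url), "\n"; // com.gamevil.projectn.global
```

Unlike the /id=(.*?)&/ pattern, this also works when id is the last parameter and no & follows it.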

php regex to get string inside href tag

I need a regex that will give me the string inside an href tag and inside the quotes also.
For example, I need to extract theurltoget.com from the following:
<a href="theurltoget.com">URL</a>
Additionally, I only want the base url part. I.e. from http://www.mydomain.com/page.html i only want http://www.mydomain.com/
Don't use regex for this. You can use XPath and built-in PHP functions to get what you want:
$xml = simplexml_load_string($myHtml);
$list = $xml->xpath("//@href");
$preparedUrls = array();
foreach($list as $item) {
$item = parse_url($item);
$preparedUrls[] = $item['scheme'] . '://' . $item['host'] . '/';
}
print_r($preparedUrls);
$html = '<a href="http://www.mydomain.com/page.html">URL</a>';
$url = preg_match('/<a href="(.+)">/', $html, $match);
$info = parse_url($match[1]);
echo $info['scheme'].'://'.$info['host']; // http://www.mydomain.com
this expression will handle 3 options:
no quotes
double quotes
single quotes
'/href=["\']?([^"\'>]+)["\']?/'
Use the answer by @Alec if you're only looking for the base URL part (the second part of the question by @David)!
$html = '<a href="http://www.mydomain.com/page.html" class="myclass" rel="myrel">URL</a>';
$url = preg_match('/<a href="(.+)">/', $html, $match);
$info = parse_url($match[1]);
This will give you:
$info
Array
(
[scheme] => http
[host] => www.mydomain.com
[path] => /page.html" class="myclass" rel="myrel
)
So you can use $href = $info["scheme"] . "://" . $info["host"]
Which gives you:
// http://www.mydomain.com
When you are looking for the entire URL inside the href, you should use another regex, for instance the one provided by @user2520237.
$html = '<a href="http://www.mydomain.com/page.html" class="myclass" rel="myrel">URL</a>';
$url = preg_match('/href=["\']?([^"\'>]+)["\']?/', $html, $match);
$info = parse_url($match[1]);
this will give you:
$info
Array
(
[scheme] => http
[host] => www.mydomain.com
[path] => /page.html
)
Now you can use $href = $info["scheme"] . "://" . $info["host"] . $info["path"];
Which gives you:
// http://www.mydomain.com/page.html
http://www.the-art-of-web.com/php/parse-links/
Let's start with the simplest case - a well formatted link with no extra attributes:
/<a href=\"([^\"]*)\">(.*)<\/a>/iU
For all href values replacement:
function replaceHref($html, $replaceStr)
{
$match = array();
preg_match_all('/<a [^>]*href="([^"]+)"/', $html, $match);
// $match[1] holds the captured href values; replace each distinct one once
foreach (array_unique($match[1]) as $href)
{
$html = str_replace($href, $replaceStr.urlencode($href), $html);
}
return $html;
}
$replaceStr = "http://affilate.domain.com?cam=1&url=";
$replaceHtml = replaceHref($html, $replaceStr);
echo $replaceHtml;
This will handle the case where there are no quotes around the URL.
/<a [^>]*href="?([^">]+)"?>/
But seriously, do not parse HTML with regex. Use DOM or a proper parsing library.
~href="(https?://[^/"]*)~
I think you should be able to handle the rest.
Because lookbehind and lookahead are cool
/(?<=href=\").+(?=\")/
It will match only what you want, without quotation marks
Array (
[0] => theurltoget.com )
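Several answers above recommend a DOM parser over regex. A minimal sketch of that route follows, assuming the DOM extension is enabled (it is in most PHP builds). The function name baseUrlsFromHtml is hypothetical.

```php
<?php
// Hypothetical helper: returns "scheme://host/" for every <a href> in an HTML fragment.
function baseUrlsFromHtml(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them for imperfect markup.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $bases = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $parts = parse_url($anchor->getAttribute('href'));
        // Relative links have no scheme or host, so they are skipped.
        if (isset($parts['scheme'], $parts['host'])) {
            $bases[] = $parts['scheme'] . '://' . $parts['host'] . '/';
        }
    }

    return $bases;
}

print_r(baseUrlsFromHtml('<a href="http://www.mydomain.com/page.html" class="myclass" rel="myrel">URL</a>'));
```

Quoting style and extra attributes such as class and rel do not matter here, because the parser hands back the attribute value directly.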

How to check if a given value is a valid URL

I need a function to check whether a given value is a URL.
I have code:
<?php
$string = get_from_db();
list($name, $url) = explode(": ", $string);
if (is_url($url)) {
$link = array('name' => $name, 'link' => $url);
} else {
$text = $string;
}
// Make some things
?>
If you're running PHP 5 (and you should be!), just use filter_var():
function is_url($url)
{
return filter_var($url, FILTER_VALIDATE_URL) !== false;
}
Addendum: as the PHP manual entry for parse_url() (and @Liutas in his comment) points out:
This function is not meant to validate the given URL, it only breaks it up into the above listed parts. Partial URLs are also accepted, parse_url() tries its best to parse them correctly.
For example, parse_url() considers a query string as part of a URL. However, a query string is not entirely a URL. The following line of code:
var_dump(parse_url('foo=bar&baz=what'));
Outputs this:
array(1) {
["path"]=>
string(16) "foo=bar&baz=what"
}
Use parse_url and check for false:
<?php
$url = 'http://username:password@hostname/path?arg=value#anchor';
print_r(parse_url($url));
echo parse_url($url, PHP_URL_PATH);
?>
The above example will output:
Array
(
[scheme] => http
[host] => hostname
[user] => username
[pass] => password
[path] => /path
[query] => arg=value
[fragment] => anchor
)
/path
You can check whether parse_url() returns false.
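Since parse_url() accepts partial URLs, a stricter check can combine filter_var() with a look at the scheme. This is only a sketch: the name isHttpUrl and the choice to accept only http and https are assumptions, not something the question requires.

```php
<?php
// Hypothetical helper: true only for syntactically valid http(s) URLs.
function isHttpUrl(string $value): bool
{
    // FILTER_VALIDATE_URL rejects strings such as "foo=bar&baz=what",
    // which parse_url() would happily treat as a path.
    if (filter_var($value, FILTER_VALIDATE_URL) === false) {
        return false;
    }

    // Restrict the accepted schemes (an assumption made for this sketch).
    $scheme = strtolower((string) parse_url($value, PHP_URL_SCHEME));

    return in_array($scheme, ['http', 'https'], true);
}

var_dump(isHttpUrl('http://example.com/path?arg=value')); // bool(true)
var_dump(isHttpUrl('foo=bar&baz=what'));                  // bool(false)
var_dump(isHttpUrl('ftp://example.com/file'));            // bool(false)
```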

Get keyword from a (search engine) referrer url using PHP

I am trying to get the search keyword from a referrer url. Currently, I am using the following code for Google urls. But sometimes it is not working...
$query_get = "(q|p)";
$referrer = "http://www.google.com/search?hl=en&q=learn+php+2&client=firefox";
preg_match('/[?&]'.$query_get.'=(.*?)[&]/',$referrer,$search_keyword);
Is there another/clean/working way to do this?
Thank you,
Prasad
If you're using PHP5 take a look at http://php.net/parse_url and http://php.net/parse_str
Example:
// The referrer
$referrer = 'http://www.google.com/search?hl=en&q=learn+php+2&client=firefox';
// Extract the query string from the URL
$parsed = parse_url( $referrer, PHP_URL_QUERY );
// Parse the query string into an array
parse_str( $parsed, $query );
// Output the result
echo $query['q'];
Different search engines use different query string parameters. After trying Wiliam's method, I figured out my own (Yahoo uses 'p', but sometimes 'q'):
$referrer = "http://search.yahoo.com/search?p=www.stack+overflow%2Ccom&ei=utf-8&fr=slv8-msgr&xargs=0&pstart=1&b=61&xa=nSFc5KjbV2gQCZejYJqWdQ--,1259335755";
$referrer_query = parse_url($referrer, PHP_URL_QUERY);
$q = "[qp]"; // Yahoo uses both parameters; I use a switch() to pick the pattern for each search engine
preg_match('/(?:^|&)'.$q.'=(.*?)(?:&|$)/',$referrer_query,$keyword);
$keyword = urldecode($keyword[1]);
echo $keyword; //Outputs "www.stack overflow,com"
Thank you,
Prasad
To supplement the other answers, note that the query string parameter that contains the search terms varies by search provider. This snippet of PHP shows the correct parameter to use:
$search_engines = array(
'q' => 'alltheweb|aol|ask|ask|bing|google',
'p' => 'yahoo',
'wd' => 'baidu',
'text' => 'yandex'
);
Source: http://betterwp.net/wordpress-tips/get-search-keywords-from-referrer/
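One way the map above might be used is to match the referrer's host against each engine list and then read the corresponding parameter. The sketch below is an illustration only: the function name searchKeyword is made up, and the host-matching rule (engine name as a full label of the host) is an assumption.

```php
<?php
// Map from the answer above: query parameter => engines that use it.
$searchEngines = [
    'q'    => 'alltheweb|aol|ask|bing|google',
    'p'    => 'yahoo',
    'wd'   => 'baidu',
    'text' => 'yandex',
];

// Hypothetical helper: returns the decoded search terms, or null if none are found.
function searchKeyword(string $referrer, array $searchEngines): ?string
{
    $host  = parse_url($referrer, PHP_URL_HOST);
    $query = parse_url($referrer, PHP_URL_QUERY);
    if (!is_string($host) || !is_string($query)) {
        return null;
    }

    // parse_str() URL-decodes the values, so no urldecode() call is needed.
    parse_str($query, $params);

    foreach ($searchEngines as $param => $engines) {
        // Match the engine name as a whole host label, e.g. "www.google.com".
        $isEngine = preg_match('~(?:^|\.)(?:' . $engines . ')\.~i', $host) === 1;
        if ($isEngine && isset($params[$param]) && is_string($params[$param])) {
            return $params[$param];
        }
    }

    return null;
}

echo searchKeyword('http://www.google.com/search?hl=en&q=learn+php+2&client=firefox', $searchEngines), "\n";
```

For the Google referrer from the question this prints "learn php 2". Note that it follows the map strictly, so a Yahoo referrer that uses q instead of p would not be picked up.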
<?php
class GET_HOST_KEYWORD
{
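public function get_host_and_keyword($_url) {
$chunk_url = parse_url($_url);
$_data = array();
$_data["host"] = isset($chunk_url['host']) ? $chunk_url['host'] : '';
// parse_str() needs a result array; the one-argument form was removed in PHP 8
$params = array();
parse_str(isset($chunk_url['query']) ? $chunk_url['query'] : '', $params);
$_data["keyword"] = !empty($params['p']) ? $params['p'] : (!empty($params['q']) ? $params['q'] : '');
return $_data;
}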
}
// Sample Example
$obj = new GET_HOST_KEYWORD();
print_r($obj->get_host_and_keyword('http://www.google.co.in/search?sourceid=chrome&ie=UTF-&q=hire php php programmer'));
// sample output
//Array
//(
// [host] => www.google.co.in
// [keyword] => hire php php programmer
//)
// $search_engines = array(
// 'q' => 'alltheweb|aol|ask|ask|bing|google',
// 'p' => 'yahoo',
// 'wd' => 'baidu',
// 'text' => 'yandex'
//);
?>
$query = parse_url($request, PHP_URL_QUERY);
This one should work for Google, Bing and, sometimes, Yahoo Search:
if( isset($_SERVER['HTTP_REFERER']) && $_SERVER['HTTP_REFERER']) {
$query = getSeQuery($_SERVER['HTTP_REFERER']);
echo $query;
} else {
echo "I think they spelled REFERER wrong? Anyways, your browser says you don't have one.";
}
function getSeQuery($url = false) {
$segments = parse_url($url);
$keywords = null;
if($query = isset($segments['query']) ? $segments['query'] : (isset($segments['fragment']) ? $segments['fragment'] : null)) {
parse_str($query, $segments);
$keywords = isset($segments['q']) ? $segments['q'] : (isset($segments['p']) ? $segments['p'] : null);
}
return $keywords;
}
I believe Google and Yahoo have changed their behaviour so that search keywords and other parameters are no longer included in the referrer URL, which means they cannot be retrieved via HTTP_REFERER.
Please let me know if the recommendations above still return the search keywords.
This is what I now receive as the referrer on my website:
from google: https://www.google.co.in/
from yahoo: https://in.yahoo.com/
Ref: https://webmasters.googleblog.com/2012/03/upcoming-changes-in-googles-http.html
