Remove the a-zA-Z section from a regular expression - php

This is my regex string:
'(?!('.$exceptions.')((\W+)|$))[a-zA-Z\-_]+/?$'
$exceptions is a variable that contains a string like this:
word1|word2|word3|word4|wordN
I just want to remove the a-zA-Z section, i.e. delete the rule that checks for English characters, because the input can contain Unicode.
A sample:
$exception ='word1|word3|word3|word4' ;
$myword = 'a-unicode-statement-like-سلام' ;
If $myword is compared against the regex, it will not match because of سلام,
which is not in the a-zA-Z range. I just want to remove this limitation (a-zA-Z).

Try adding something to match everything else, instead of your a-zA-Z rule.
'(?!('.$exceptions.')(.*))'
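If the only goal is to lift the ASCII-only restriction, another option is to keep the rest of the pattern and swap a-zA-Z for a Unicode letter class. The sketch below is an assumption about how that could look, not a drop-in replacement: matchesSlug() is a hypothetical helper, the ^ anchor and the ~ delimiters are added here, and the exception words are assumed to be plain words without regex metacharacters.

```php
<?php
// Hypothetical helper: $exceptions is the pipe-separated list from the question.
function matchesSlug(string $word, string $exceptions): bool
{
    // \p{L} matches a letter in any script; the u modifier makes PCRE treat
    // both the pattern and the subject as UTF-8.
    // The negative lookahead rejects a word that starts with one of the
    // exceptions followed by a non-word character or the end of the string.
    $pattern = '~^(?!(?:' . $exceptions . ')(?:\W+|$))[\p{L}_-]+/?$~u';

    return preg_match($pattern, $word) === 1;
}

var_dump(matchesSlug('a-unicode-statement-like-سلام', 'word1|word2|word3|word4'));
```

With the sample $myword from the question this should report a match, because the Arabic letters fall under \p{L}.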
EDIT:
After reading your comment below, a better solution may be the one proposed for this question: wordpress: how to check if the slug contains a specific word?
You can then check using something like this:
$url = $_SERVER["REQUEST_URI"];
$isException = strpos($url, 'word1');
if ($isException !== false)
{
//url contains word in exceptions!
}

From what I understand, I think you're looking for this:
$exceptions = ["word1","word2","word3"];
// or $exceptions = explode("|",$exceptions) to work with what you have already
if( in_array($string,$exceptions)) {
// word is in exceptions
}

Create a function to find a specific word in the title

I have the following title formation on my website:
It's no use going back to yesterday, because at that time I was... Lewis Carroll
It always has the form: The phrase… (author).
I want to delete everything after the ellipsis (…), leaving only the sentence as the title. I thought of writing a PHP function that splits the title into an array of words, so that I could work on each part, find the only pattern the title has (the ellipsis) and then delete everything after it. But when I do that, one element of the array contains the following:
was...
Position 8 of the array holds the last word together with the ellipsis, and I don't know how to find a pattern that lets me delete the author from the title, since my only pattern was the ellipsis. Any ideas?
<?php
$a = get_the_title(155571);
$search = '... ';
if(preg_match("/{$search}/i", $a)) {
echo 'true';
}
?>
I tried with the code above and found the ellipsis, but I needed to bring it into an array to delete the part I need. I tried something like this:
<?php
define('WP_USE_THEMES', false);
require('./wp-blog-header.php');
global $wpdb;
$title_array = explode(' ', get_the_title(155571));
$search = '... ';
if (array_key_exists("/{$search}/i",$title_array)) {
echo "true";
}
?>
I started doing it this way, but it doesn't work, any ideas?
Thanks,
If you use a regex, you need to escape the search string, as preg_quote() would do, because a dot has a special meaning in a pattern.
But in your simple case I would not use a regex, and would just search for the three dots from the end of the string.
Note: if the ellipsis is added by the browser rather than stored in the title, there is no way to detect it in PHP.
$title = 'The phrase... (author).';
echo getPlainTitle($title);
function getPlainTitle(string $title) {
$rpos = strrpos($title, '...');
return ($rpos === false) ? $title : substr($title, 0, $rpos);
}
will output
The phrase
First of all, since you're working with regular expressions, you need to remember that . has a special meaning there: it means "any character". So /... / just means "any three characters followed by a space", which isn't what you want. To match a literal . you need to escape it as \.
Secondly, rather than searching or splitting, you could achieve what you want by replacing part of the string. For instance, you could find everything after the ellipsis, and replace it with an empty string. To do that you want a pattern of "dot dot dot followed by anything", where "anything" is spelled .*, so \.\.\..*
$title = preg_replace('/\.\.\..*/', '', $title);
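The question shows the ellipsis both as three dots (...) and as the single character …, so a variant that accepts either form may be useful. This is a minimal sketch under that assumption; stripAfterEllipsis() is a hypothetical helper name, and it does not handle HTML entities.

```php
<?php
// Hypothetical helper: removes the ellipsis and everything after it.
function stripAfterEllipsis(string $title): string
{
    // \.{3}     three literal dots
    // \x{2026}  the single-character ellipsis (needs the u modifier)
    // .*        everything after it; the s modifier lets . cross newlines
    $result = preg_replace('/(?:\.{3}|\x{2026}).*/su', '', $title);

    // preg_replace() returns null on error (e.g. invalid UTF-8): keep the input
    return $result === null ? $title : rtrim($result);
}

echo stripAfterEllipsis("It's no use going back to yesterday, because at that time I was... Lewis Carroll"), "\n";
echo stripAfterEllipsis("The phrase\u{2026} (author)."), "\n";
```

A title without any ellipsis is returned unchanged.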

Erasing C comments with preg_replace

I need to erase all comments in $string which contains data from some C file.
The thing I need to replace looks like this:
something before that shouldnt be replaced
/*
* some text in between with / or * on many lines
*/
something after that shouldnt be replaced
and the result should look like this:
something before that shouldnt be replaced
something after that shouldnt be replaced
I have tried many regular expressions, but none of them works the way I need.
Here are the latest ones:
$string = preg_replace("/\/\*(.*?)\*\//u", "", $string);
and
$string = preg_replace("/\/\*[^\*\/]*\*\//u", "", $string);
Note: the text is in UTF-8, the string can contain multibyte characters.
You would also want to add the s modifier to tell the regex that .* should include newlines. I always think of s to mean "treat the input text as a single line"
So something like this should work:
$string = preg_replace("/\\/\\*(.*?)\\*\\//us", "", $string);
Example: http://codepad.viper-7.com/XVo9Tp
Edit: Added extra escape slashes to the regex as Brandin suggested because he is right.
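To make the effect of the s modifier concrete, here is a small comparison on input shaped like the sample in the question. The helper names are hypothetical, ~ is used as the delimiter so that / needs no escaping, and the optional \n? that also removes the line break after the comment is an addition of this sketch.

```php
<?php
// Without s: . does not match newlines, so a multi-line comment is never matched.
function stripWithoutS(string $code): string
{
    return preg_replace('~/\*.*?\*/~u', '', $code) ?? $code;
}

// With s: . also matches newlines, so .*? can run to the closing */.
function stripWithS(string $code): string
{
    return preg_replace('~/\*.*?\*/\n?~us', '', $code) ?? $code;
}

$code = "before\n/*\n * some text with / or *\n */\nafter";

var_dump(stripWithoutS($code) === $code); // bool(true): nothing was removed
echo stripWithS($code), "\n";             // prints "before" and "after" on two lines
```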
I don't think a regexp fits well here. What about writing a very small parser to remove the comments? I haven't written PHP in a long time and I haven't tested this, so take it as a simple algorithm that shows the idea:
function stripBlockComments(string $source): string
{
    $buf = '';               // holds the source code without comments
    $len = strlen($source);
    $pos = 0;
    while ($pos < $len) {
        // Start of a block comment: jump to just past the closing */
        if ($source[$pos] === '/' && $pos + 1 < $len && $source[$pos + 1] === '*') {
            $end = strpos($source, '*/', $pos + 2);
            // Unterminated comment: drop the rest of the input
            $pos = ($end === false) ? $len : $end + 2;
            continue;
        }
        $buf .= $source[$pos++];
    }
    return $buf;
}
where:
$source is the C source code
$buf is the output string, which grows as needed
It is very hard to do this perfectly without ending up writing a full C parser.
Consider the following, for example:
// Not using /*-style comment here.
// This line has an odd number of " characters.
while (1) {
printf("Wheee!
(*\/*)
\\// - I'm an ant!
");
/* This is a multiline comment with a // in, and
// an odd number of " characters. */
}
So, from the above, we can see that our problems include:
multiline comment sequences should be ignored within double quotes, unless those double quotes are themselves part of a comment.
single-line comment sequences can be contained in double-quoted strings, and in multiline strings.
Here's one possibility that addresses some of those issues, but it is far from perfect.
// Remove "-strings, //-comments and /*block-comments*/, then restore "-strings.
// Based on regex by mauke of Efnet's #regex.
$file = preg_replace('{("[^"]*")|//[^\n]*|(/\*.*?\*/)}s', '\1', $file);
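The same "match strings first, then put them back" idea can be pushed a little further. The sketch below is an assumption-laden variant, not a C parser: stripCComments() is a hypothetical name, and compared with the pattern above it additionally tolerates escaped quotes inside string literals and skips single-quoted character literals. Trigraphs, line continuations and preprocessor directives are still not handled.

```php
<?php
// Hypothetical helper: removes // and /* */ comments but keeps literals intact.
function stripCComments(string $code): string
{
    // Group 1: a "string" or 'char' literal, allowing backslash escapes.
    // Otherwise: a //-comment up to the end of the line, or a /* */ block.
    $pattern = '~("(?:\\\\.|[^"\\\\])*"|\'(?:\\\\.|[^\'\\\\])*\')|//[^\n]*|/\*.*?\*/~s';

    // Literals are replaced by themselves ($1); comments leave $1 empty.
    return preg_replace($pattern, '$1', $code) ?? $code;
}

echo stripCComments("int a; /* c1 */ char *s = \"/* not a comment */\"; // tail\nint b;"), "\n";
```

Because the literal alternative is tried first at each position, comment markers inside a string are consumed as part of the string and survive the replacement.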
Try this:
$string = preg_replace("#\/\*\n?(.*)\*\/\n?#ms", "", $string);
Use # as the regexp delimiter, and replace the u modifier with the right ones: m (PCRE_MULTILINE) and s (PCRE_DOTALL).
Reference: http://php.net/manual/en/reference.pcre.pattern.modifiers.php
It is important to note that my regexp does not handle more than one "comment block": because (.*) is greedy, it removes everything from the first /* to the last */. Using "dot match all" is generally not a good idea.

Regular expression to find a value and return it in PHP

I've been trying to figure this out for 2 hours now with no success. It's a bit complicated for me, I guess.
I am trying to parse a script file in PHP and return some values to the user. The lines I want look like this:
_value = object runFunction blah blah blah
Basically what i want is (in an algorithm):
IF case-insensitive runFunction is found in the line (because it might be runfunction)
AND the line starts with _ (underscore) (or if possible before the = there is a value that starts with _ to be sure of the result)
THEN return that underscore value before the = to me.
The format is like this 99.9% of the time, but in a few cases it can look like this:
_value = _object runFunction blah blah blah
(in case the _ after the = messes things up).
Any help here :) ?
Thanks
try something like:
$str = 'YOUR FILE CONTENTS HERE';
$match = preg_match_all('/(_[a-zA-Z0-9_]+) ?= ?[a-zA-Z0-9_]+ runFunction/s',$str,$matches);
var_dump($matches);
You'll probably need to add the multiline flag (m), and the i flag if runFunction may also be written as runfunction.
How about
if (preg_match('/^_([^=]+?)(?=\s*=).*runfunction/im', $subject, $regs)) {
$result = $regs[1];
} else {
$result = "";
}
You can drop the initial "start" anchor if your underscore might not be at the beginning of the line.
Here is the regex by itself. The result is in capturing group 1 (the name without its leading underscore):
^_([^=]+?)(?=\s*=).*runfunction
The regex:
- looks for the beginning of a line
- matches the first underscore
- captures everything that is not an '=' into capturing group 1
- provided it is followed by 0 or more whitespace characters and an equals sign
- then matches everything up to runfunction
The case-insensitive and multiline options need to be set.
If the first underscore does not need to be at the beginning of the line, eliminate the anchor.
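Since a script file usually has many lines, collecting every match at once may be more convenient than a single preg_match(). The following is a minimal sketch of the algorithm from the question; findRunFunctionTargets() is a hypothetical name, and the pattern assumes the name before the = consists of word characters only.

```php
<?php
// Hypothetical helper: returns every _name that is assigned a runFunction call.
function findRunFunctionTargets(string $script): array
{
    // ^[ \t]*       start of a line (m modifier), optional indentation
    // (_\w+)        the underscore-prefixed name, captured with its underscore
    // [ \t]*=       the assignment
    // [^\r\n]*?     anything else on the same line
    // \brunFunction\b  matched case-insensitively (i modifier)
    preg_match_all('/^[ \t]*(_\w+)[ \t]*=[^\r\n]*?\brunFunction\b/im', $script, $matches);

    return $matches[1];
}

$script = "_value = object runFunction blah\n"
        . "other = x runFunction y\n"
        . "_second = _object RUNFUNCTION z\n"
        . "_third = nothing here";

print_r(findRunFunctionTargets($script));
```

For the sample script only _value and _second are returned: the second line does not start with an underscore, and the last line has no runFunction.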

PHP get specific string from url before and after unknown characters

I know it may sound as a common question but I have difficulty understanding this process.
So I have this string:
http://domain.com/campaign/tgadv?redirect
And I need to get only the word "tgadv". But I don't know that the word is "tgadv", it could be whatever.
Also the url itself may change and become:
http://domain.com/campaign/tgadv
or
http://domain.com/campaign/tgadv/
So what I need is to create a function that will get whatever word is after campaign and before any other particular character. That's the logic..
The only certain thing is that the word will come after the word campaign/ and that any other character that will be after the word we are searching is a special one ( i.e. / or ? )
I tried understanding preg_match but really cannot get any good result from it..
Any help would be highly appreciated!
I would not use a regex for that. I would use parse_url and basename:
$bits = parse_url('http://domain.com/campaign/tgadv?redirect');
$filename = basename($bits['path']);
echo $filename;
However, if you want a regex solution, use something like this:
$pattern = '~(.*)/(.*)(\?.*)~';
preg_match($pattern, 'http://domain.com/campaign/tgadv?redirect', $matches);
$filename = $matches[2];
echo $filename;
Actually, preg_match sounds like the perfect solution to this problem. I assume you are having problems with the regex?
Try something like this:
<?php
$url = "http://domain.com/campaign/tgadv/";
$pattern = "#campaign/([^/\?]+)#";
preg_match($pattern, $url, $matches);
// $matches[1] will contain tgadv.
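$path = "http://domain.com/campaign/tgadv?redirect";
$url_parts = parse_url($path);
// strrchr() returns the last "/" and everything after it ("/tgadv"), so trim the slash
$tgadv = ltrim(strrchr($url_parts['path'], '/'), '/');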
You don't really need a regex to accomplish this. You can do it using stripos() and substr().
For example:
$str = '....Your string...';
$offset = stripos($str, 'campaign/');
if ( $offset === false ){
// error: 'campaign/' wasn't found
}
$offset += strlen('campaign/');
$newStr = substr($str, $offset);
At this point $newStr will have all the text after 'campaign/'.
You then just need to use a similar process to find the special character position and use substr() to strip the string you want out.
You can also just use the good old string functions in this case, no need to involve regexps.
First find the string /campaign/, then take the substring with everything after it (tgadv/asd/whatever/?redirect), then find the next / or ? after the start of the string, and everything in between will be what you need (tgadv).
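The string-function approach described above can be sketched as follows. campaignWord() is a hypothetical name, and treating # as a terminator alongside / and ? is an assumption added here.

```php
<?php
// Hypothetical helper: returns the path segment right after "campaign/", or null.
function campaignWord(string $url): ?string
{
    $marker = 'campaign/';
    $offset = stripos($url, $marker);
    if ($offset === false) {
        return null; // 'campaign/' does not occur in the URL
    }

    // Everything after 'campaign/'
    $rest = (string) substr($url, $offset + strlen($marker));

    // strcspn() gives the length of the leading run that contains none of / ? #
    $word = substr($rest, 0, strcspn($rest, '/?#'));

    return $word === '' ? null : $word;
}

echo campaignWord('http://domain.com/campaign/tgadv?redirect'), "\n";
echo campaignWord('http://domain.com/campaign/tgadv/'), "\n";
```

Both calls print tgadv; a URL without campaign/ yields null.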

Regex to match Youtube URL's

I am trying to validate a Youtube URL using regex:
preg_match('~http://youtube.com/watch\?v=[a-zA-Z0-9-]+~', $videoLink)
It kind of works, but it can match URLs that are malformed. For example, this will match ok:
http://www.youtube.com/watch?v=Zu4WXiPRek
But so will this:
http://www.youtube.com/watch?v=Zu4WX£&P!ek
And this won't:
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
I think it's because of the + operator. It's matching what seems to be the first character after v=, when it needs to try and match everything behind v= with [a-zA-Z0-9-]. Any help is appreciated, thanks.
To provide an alternative that is larger and much less elegant than a regex, but works with PHP's native URL parsing functions so it might be a bit more reliable in the long run:
$url = "http://www.youtube.com/watch?v=Zu4WXiPRek";
$query_string = parse_url($url, PHP_URL_QUERY); // v=Zu4WXiPRek
$query_string_parsed = array();
parse_str($query_string, $query_string_parsed); // an array with all GET params
echo($query_string_parsed["v"]); // Will output Zu4WXiPRek that you can then
// validate for [a-zA-Z0-9] using a regex
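Putting the parsing and the final validation step together might look like the sketch below. youtubeIdFromUrl() is a hypothetical name, and the allowed character set (letters, digits, hyphen and underscore) is an assumption; adjust it to whatever you consider a valid ID.

```php
<?php
// Hypothetical helper: returns the value of the v parameter if it looks valid.
function youtubeIdFromUrl(string $url): ?string
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null; // no query string, or the URL could not be parsed
    }

    parse_str($query, $params); // fills $params with all GET parameters
    $id = $params['v'] ?? null;

    // \A and \z anchor the pattern to the whole value, so a single
    // disallowed character anywhere makes the check fail.
    if (is_string($id) && preg_match('/\A[a-zA-Z0-9_-]+\z/', $id) === 1) {
        return $id;
    }

    return null;
}

var_dump(youtubeIdFromUrl('http://www.youtube.com/watch?v=Zu4WXiPRek'));
var_dump(youtubeIdFromUrl('http://www.youtube.com/watch?v=!Zu4WXP4ek'));
```

Because the query string is parsed first, the v parameter is found even when it is not the first one.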
The problem is that you are not requiring any particular number of characters in the v= part of the URL. So, for instance, checking
http://www.youtube.com/watch?v=Zu4WX£&P!ek
will match
http://www.youtube.com/watch?v=Zu4WX
and therefore return true. You need to either specify the number of characters you need in the v= part:
preg_match('~http://youtube.com/watch\?v=[a-zA-Z0-9-]{10}~', $videoLink)
or specify that the group [a-zA-Z0-9-] must be the last part of the string:
preg_match('~http://youtube.com/watch\?v=[a-zA-Z0-9-]+$~', $videoLink)
Your other example
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
does not match, because the + sign requires that at least one character must match [a-zA-Z0-9-].
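The difference the end anchor makes can be checked directly. In this sketch the helper names are hypothetical, and the patterns add www. and escape the dots, which the pattern in the question does not do.

```php
<?php
// Unanchored at the end: succeeds as soon as one valid character follows v=.
function matchesLoose(string $url): bool
{
    return preg_match('~^http://www\.youtube\.com/watch\?v=[a-zA-Z0-9-]+~', $url) === 1;
}

// Anchored with $: every character after v= must be in the allowed set.
function matchesStrict(string $url): bool
{
    return preg_match('~^http://www\.youtube\.com/watch\?v=[a-zA-Z0-9-]+$~', $url) === 1;
}

$malformed = 'http://www.youtube.com/watch?v=Zu4WX£&P!ek';

var_dump(matchesLoose($malformed));  // bool(true): the prefix ...v=Zu4WX is enough
var_dump(matchesStrict($malformed)); // bool(false): £ is not allowed before the end
```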
Short answer:
preg_match('%(http://www.youtube.com/watch\?v=(?:[a-zA-Z0-9-])+)(?:[&"\'\s])%', $videoLink)
There are a few assumptions made here, so let me explain:
I added a capturing group ( ... ) around the entire http://www.youtube.com/watch?v=blah part of the link, so that we can say "I want to get the whole validated link up to and including the ?v=movieHash"
I added the non-capturing group (?: ... ) around your character set [a-zA-Z0-9-] and left the + sign outside of that. This will allow us to match all allowable characters up to a certain point.
Most importantly, you need to tell it how you expect your link to terminate. I'm taking a guess for you with (?:[&"\'\s])
- Will it be in HTML format (e.g. an anchor tag)? If so, the link in href will obviously end with a " or '.
- Or maybe there's more to the query string, so there would be an & after the value of v.
- Maybe there's a space or line break after the end of the link (\s).
The important piece is that you can get much more accurate results if you know what's surrounding what you are searching for, as is the case with many regular expressions.
This non-capturing group (in which I'm making assumptions for you) will take a stab at finding and ignoring all the extra junk after what you care about (the ?v=awesomeMovieHash).
Results:
http://www.youtube.com/watch?v=Zu4WXiPRek
- Group 1 contains http://www.youtube.com/watch?v=Zu4WXiPRek
http://www.youtube.com/watch?v=Zu4WX&a=b
- Group 1 contains http://www.youtube.com/watch?v=Zu4WX
http://www.youtube.com/watch?v=!Zu4WX£&P4ek
- No match
a href="http://www.youtube.com/watch?v=Zu4WX&size=large"
- Group 1 contains http://www.youtube.com/watch?v=Zu4WX
http://www.youtube.com/watch?v=Zu4WX£&P!ek
- No match
The "v=..." blob is not guaranteed to be the first parameter in the query part of the URL. I'd recommend using PHP's parse_url() function to break the URL into its component parts. You can also reassemble a pristine URL if someone began the string with "https://" or simply used "youtube.com" instead of "www.youtube.com", etc.
function get_youtube_vidid ($url) {
$vidid = false;
$valid_schemes = array ('http', 'https');
$valid_hosts = array ('www.youtube.com', 'youtube.com');
$valid_paths = array ('/watch');
$bits = parse_url ($url);
if (! is_array ($bits)) {
return false;
}
if (! (array_key_exists ('scheme', $bits)
and array_key_exists ('host', $bits)
and array_key_exists ('path', $bits)
and array_key_exists ('query', $bits))) {
return false;
}
if (! in_array ($bits['scheme'], $valid_schemes)) {
return false;
}
if (! in_array ($bits['host'], $valid_hosts)) {
return false;
}
if (! in_array ($bits['path'], $valid_paths)) {
return false;
}
$querypairs = explode ('&', $bits['query']);
if (count ($querypairs) < 1) {
return false;
}
foreach ($querypairs as $querypair) {
list ($key, $value) = explode ('=', $querypair);
if ($key == 'v') {
if (preg_match ('/^[a-zA-Z0-9\-_]+$/', $value)) {
# Set the return value
$vidid = $value;
}
}
}
return $vidid;
}
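The following regex is meant to match the common forms of YouTube link (youtube.com/watch, youtu.be, embed/ and v/). @ is used as the delimiter because # appears inside the character class, and the hyphen is placed last in the class so that it is read as a literal:
$pattern='@(((http(s)?://(www\.)?)|(www\.)|\s)(youtu\.be|youtube\.com)/(embed/|v/|watch(\?v=|\?.+&v=|/))?([a-zA-Z0-9._/~#&=;%+?!-]+))@si';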
