Matching a regex - php

My regex skills are poor. For my JS/CSS to show the correct menu, I add a "selected" class to the li so that the JS knows which item is the current one (so it can display the rest of the drop-down).
My problem is matching the correct URI string in CodeIgniter:
<li <? if((strstr($this->uri->uri_string(),"rfid_finder/finder")) || (strstr($this->uri->uri_string(),"finder"))) {?>class="selected"<?}?> rel="home"> <?= anchor('finder','HOME')?></li>
I was using this method initially but my routes are now a bit more complex to allow for searching and pagination.
I need a regex that would match all of the following routes:
$route['finder/(:any)/(:any)/(:num)'] = "rfid_finder/finder/$1/$2/$3";
$route['finder/(:any)/(:any)'] = "rfid_finder/finder/$1/$2";
$route['finder/(:any)'] = "rfid_finder/finder/$1";
$route['finder'] = 'rfid_finder/finder';
but not match when a user visits:
rfid_finder/search_form
so that the first menu item is not given the selected class there.
Update
I want the first code snippet to match the routes above but not the rfid_finder/search_form route; I have a second line of code that matches rfid_finder/search_form. My problem is that strstr($this->uri->uri_string(), "finder") matches all of my routes, including rfid_finder/search_form.

Hmm, let me know if this works for you. It seems to work in my tests.
preg_match('|^(rfid_finder/)?finder/?([a-z0-9]+?)?(/[a-z0-9]+?)?(/\d+?)?$|i', $string)
Here's my testing example:
$route = array();
$route['finder/(:any)/(:any)/(:num)'] = "rfid_finder/finder/$1/$2/$3";
$route['finder/(:any)/(:any)'] = "rfid_finder/finder/$1/$2";
$route['finder/(:any)'] = "rfid_finder/finder/$1";
$route['finder'] = 'rfid_finder/finder';
$route['finder/2313'] = 'rfid_finder/finder';
$route['finder/asd/dsda'] = 'rfid_finder/finder';
$route['rfid_finder/finder/dd/122'] = 'rfid_finder/finder';
$route['rfid_finder/finder/dd/122/qwewe'] = 'rfid_finder/finder';
$route['rfid_finder/finder/dd/asdsad/333'] = 'rfid_finder/finder';
foreach ($route as $string=>$match) {
if ( preg_match('|^(rfid_finder/)?finder/?([a-z0-9]+?)?(/[a-z0-9]+?)?(/\d+?)?$|i', $string) ) {
echo $string.' - yes';
} else {
echo $string.' - no';
}
echo '<br />';
}
exit();
It checks that the string starts with either finder or rfid_finder/finder (I wasn't sure which you preferred). If you want it to match only strings that start with rfid_finder, remove the parentheses around rfid_finder/ and the ? that follows them, just before finder.
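As a rough sketch of how the pattern above could be wrapped for use in the menu template: the helper name is_finder_route() is hypothetical, and trimming slashes from the URI is an assumption about what uri_string() may return, not something stated in the question.

```php
<?php
// Hypothetical helper (the name is not from the original post).
function is_finder_route(string $uri): bool
{
    // Assumption: strip stray leading/trailing slashes before matching.
    $uri = trim($uri, '/');

    // Same pattern as in the answer above, with "|" as the delimiter.
    return preg_match(
        '|^(rfid_finder/)?finder/?([a-z0-9]+?)?(/[a-z0-9]+?)?(/\d+?)?$|i',
        $uri
    ) === 1;
}

$samples = ['finder', 'finder/abc/def/3', 'rfid_finder/finder', 'rfid_finder/search_form'];
foreach ($samples as $uri) {
    echo $uri, ' - ', is_finder_route($uri) ? 'yes' : 'no', PHP_EOL;
}
```

In the view, the condition inside the li would then become is_finder_route($this->uri->uri_string()) instead of the two strstr() calls.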

Related

PHP Regex for IMDB/TMDB Urls

I'm writing code that compares links from IMDb and TMDb.
The code checks the link against IMDb and then handles it as a TMDb link if that is what was entered.
The links look like:
https://www.imdb.com/title/tt0848228
https://www.themoviedb.org/movie/24428
I want to ask whether these regexes are correct for movie links.
For example:
$imdb_url = 'https://www.imdb.com/title/tt0848228';
if (strpos($imdb_url, 'themoviedb.org') == true) {
preg_match_all('/\\d*-/', $imdb_url, $tmdb_id);
$tmdb_id = $tmdb_id[0];
$tmdb_id = str_replace('-', '', $tmdb_id);
$tmdb_id = $tmdb_id[0];
$request_url = amy_movie_provider_build_query_url('tmdb', $tmdb_id, $api_key);
$the_data = wp_remote_get($request_url, array(
'timeout' => $timeout,
));
if (!is_wp_error($the_data) && !empty($the_data)) {
$movie_data = json_decode($the_data['body'], true);
$result = amy_movie_add_tmdb_movie_data($movie_data);
echo $result;
exit;
} else {
$result = esc_html__('Provider TMDB being error!', 'amy-movie-extend');
echo $result;
exit;
}
exit;
}
And the else branch for an IMDb link:
else if (strpos($imdb_url, 'www.imdb.com') == true) {
preg_match_all('/tt\\d{7}/', $imdb_url, $imdb_id);
$imdb_id = $imdb_id[0];
$imdb_id = $imdb_id[0];
}
I think it's not working because something is wrong with the /movie prefix missing from the link, but I tried changing that and I still get a 404 error.

Why not combine the domain part with the rest of the URI? And why omit the subdomain in one check but make it mandatory in the other?
$sURI= 'whatever';
if( preg_match( '#imdb\\.com/title/tt(\\d{7})#i', $sURI, $aMatch ) ) {
echo 'IMDb, movie #'. $aMatch[1];
} elseif( preg_match( '#themoviedb\\.org/movie/(\\d+)($|-)#i', $sURI, $aMatch ) ) {
echo 'TMDb, movie #'. $aMatch[1];
} else {
echo 'Unrecognized';
}
This way it doesn't matter whether the IMDb URI comes with www. or not. Since the movie IDs have a fixed length, we don't need to expect or care about a following slash. Your mistake was expecting a slash without any need.
The same goes for TMDb, where the ID either ends the URI right away (and we want all digits up to the end, not just the first) or is followed by a dash. The i modifier is there for URIs with distorted letter case, for whatever reason. Your mistake was to expect a dash and to make the digits entirely optional (when at least one is needed, as in https://www.themoviedb.org/movie/9).
Side note: using \\d in a PHP string for a regular expression is the correct way, because the string context is handled first: there, a literal backslash has to be escaped with another backslash. Only after that does the regular expression engine see the pattern. A single \d also works, but only because PHP leaves unrecognized escape sequences in a string untouched.
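The two checks above could be folded into one small function, sketched here. The name parse_movie_url() and the returned array shape are hypothetical; the patterns are the ones from this answer, with the dot in themoviedb.org escaped.

```php
<?php
// Hypothetical helper; the name and the return shape are illustrative only.
function parse_movie_url(string $url): ?array
{
    // IMDb: "tt" followed by seven digits, with or without "www.".
    if (preg_match('#imdb\\.com/title/tt(\\d{7})#i', $url, $m)) {
        return ['provider' => 'imdb', 'id' => 'tt' . $m[1]];
    }

    // TMDb: one or more digits, then end of string or a dash (slug).
    if (preg_match('#themoviedb\\.org/movie/(\\d+)($|-)#i', $url, $m)) {
        return ['provider' => 'tmdb', 'id' => $m[1]];
    }

    return null; // unrecognized link
}

var_dump(parse_movie_url('https://www.imdb.com/title/tt0848228'));
var_dump(parse_movie_url('https://www.themoviedb.org/movie/24428'));
var_dump(parse_movie_url('https://example.com/movie/1'));
```

Returning null for anything unrecognized lets the caller report a bad link before any API request is made.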

Additional elements to URLS?

I'm not sure what the terminology is, but basically I have a site that uses the "tag-it" system. Currently you can click on a tag and it takes the user to
topics.php?tags=example
My question is: what sort of scripting or coding would be required to add additional tags to the link?
topics.php?tags=example&tags=example2
or
topics.php?tags=example+example2
Here is how my site currently links to tags:
header("Location: topics.php?tags={$t}");
or
<?php echo strtolower($fetch_name->tags);?>
Thanks for any hints or tips.

You cannot really pass tags twice as a plain GET parameter, although you can pass it as an array:
topics.php?tags[]=example&tags[]=example2
Assuming this is what you want, try
$string = "topics.php?";
foreach($tags as $t)
{
$string .= "tags[]=$t&";
}
$string = substr($string, 0, -1);
We iterate through the array, appending each value to $string. The last line removes the extra & that is left after the last iteration.
There is also another option that looks a bit dirtier but might be better depending on your needs:
$string = "topics.php?tags[]=" . implode("&tags[]=", $tags);
Note: just make sure the tags array is not empty.
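A possible variation, not part of the answer above: PHP's built-in http_build_query() can build the same kind of array parameter and URL-encodes each value along the way. The sample tag values below are made up for illustration.

```php
<?php
// Assumption: $tags is a plain list of tag strings (sample values only).
$tags = ['example', 'sci fi', 'rock&roll'];

// http_build_query() encodes every value and joins the pairs with "&".
$query = http_build_query(['tags' => $tags]);
$url   = 'topics.php?' . $query;
echo $url, PHP_EOL;

// Round trip: parse_str() decodes the query the way PHP fills $_GET.
parse_str($query, $params);
print_r($params['tags']);
```

Because the values are encoded, tags that contain spaces or an ampersand survive the round trip, which the manual string concatenation does not guarantee.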

topics.php?tags=example&tags=example2
will break on the back end, because PHP keeps only the last value of a repeated parameter;
you have to put the data into one variable:
topics.php?tags=example+example2
looks good; on the back end you can split it on the + sign:
// topics.php
<?php
...
// PHP decodes the + to a space; urlencode() turns it back into +
$tags = urlencode($_GET['tags'] ?? '');
$tags_arr = explode('+', $tags); // array of all tags
$current_tags = ""; // make this accessible in the view
if($tags){
$current_tags = $tags ."+";
}
// show your data
?>
Edit:
You can create the front-end tag links like this:
<a href="topics.php?tags=<?php echo $current_tags ;?>horror">
horror
</a>

K2 Joomla Wrong Urls in item comments

K2 is turning plain text into URLs in item comments.
1. I created an item using the Joomla admin panel and, as a guest, entered a comment with the following text:
"node.js is a power full js engine. Enven.though this is not a valid url it has been rendered as valid.url anything with xxx.xxx are parsed as urls and even like sub domain syntax iam.not.valid i.e mail.yahoo.com how funny this is"
In the above comment, node.js, Enven.though, valid.url, xxx.xxx, iam.not.valid and mail.yahoo.com are all rendered as links, but only mail.yahoo.com is a valid URL.
K2 does this with the following snippet in $JHOME/components/com_k2/views/item/view.html.php (lines 159-178):
$comments = $model->getItemComments($item->id, $limitstart, $limit, $commentsPublished);
$pattern = "#\b(https?://)?(([0-9a-zA-Z_!~*'().&=+$%-]+:)?[0-9a-zA-Z_!~*'().&=+$%-]+\#)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-zA-Z_!~*'()-]+\.)*([0-9a-zA-Z][0-9a-zA-Z-]{0,61})?[0-9a-zA-Z]\.[a-zA-Z]{2,6})(:[0-9]{1,4})?((/[0-9a-zA-Z_!~*'().;?:\#&=+$,%#-]+)*/?)#";
for ($i = 0; $i < sizeof($comments); $i++) {
$comments[$i]->commentText = nl2br($comments[$i]->commentText);
$comments[$i]->commentText = preg_replace($pattern, '<a target="_blank" rel="nofollow" href="\0">\0</a>', $comments[$i]->commentText);
$comments[$i]->userImage = K2HelperUtilities::getAvatar($comments[$i]->userID, $comments[$i]->commentEmail, $params->get('commenterImgWidth'));
if ($comments[$i]->userID>0) {
$comments[$i]->userLink = K2HelperRoute::getUserRoute($comments[$i]->userID);
}
else {
$comments[$i]->userLink = $comments[$i]->commentURL;
}
if($reportSpammerFlag && $comments[$i]->userID>0) {
$comments[$i]->reportUserLink = JRoute::_('index.php?option=com_k2&view=comments&task=reportSpammer&id='.$comments[$i]->userID.'&format=raw');
}
else {
$comments[$i]->reportUserLink = false;
}
}
Can somebody help fix the above regular expression? Thanks.

You are going to have this problem any time a user types.in a period with no spaces around it. You could add some logic to test for valid TLDs, but even that would not be perfect, because plenty of real TLDs, like .it, would still fool the logic.
If you want to try your hand at fixing the regular expression, the pattern that determines whether a string is a URL is here:
$pattern = "#\b(https?://)?(([0-9a-zA-Z_!~*'().&=+$%-]+:)?[0-9a-zA-Z_!~*'().&=+$%-]+\#)?(([0-9]{1,3}\.){3}[0-9]{1,3}|([0-9a-zA-Z_!~*'()-]+\.)*([0-9a-zA-Z][0-9a-zA-Z-]{0,61})?[0-9a-zA-Z]\.[a-zA-Z]{2,6})(:[0-9]{1,4})?((/[0-9a-zA-Z_!~*'().;?:\#&=+$,%#-]+)*/?)#";
Personally, I would just disable links in comments altogether by removing or commenting out this line:
$comments[$i]->commentText = preg_replace($pattern, '<a target="_blank" rel="nofollow" href="\0">\0</a>', $comments[$i]->commentText);
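A middle ground between the two options, offered only as a sketch and not as K2's own code: swap in a much stricter pattern that links only text with an explicit http:// or https:// scheme. The helper name linkify_explicit_urls() is hypothetical. The trade-off is that a bare mail.yahoo.com would no longer be linked either.

```php
<?php
// Hypothetical helper, not part of K2.
function linkify_explicit_urls(string $text): string
{
    // Match only text that starts with http:// or https://,
    // up to the next whitespace character or "<".
    $pattern = '#\bhttps?://[^\s<]+#i';

    // \0 is the whole match, as in the original K2 replacement string.
    return preg_replace(
        $pattern,
        '<a target="_blank" rel="nofollow" href="\0">\0</a>',
        $text
    );
}

echo linkify_explicit_urls('node.js is fine, see http://mail.yahoo.com for mail');
```

Note that punctuation typed directly after a URL would be swallowed into the link by this simple pattern.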

PHP router without regular expressions

I have been working on a fancy router/dispatcher class for weeks, trying to decide how I wanted it. I got it perfect IMO, except that performance is not what I want from it. It uses a route map array = /forums/viewthread/:id/:page => 'forums/viewthread/(?\d+)' and loops through the map with regexes to find a match. I am trying to get something faster for a high-traffic site; here is a start...
$uri = "forum/viewforum/id-522/page-3";
$parts = explode("/", $uri);
$controller = $parts['0'];
$method = $parts['1'];
if($parts['2'] != ''){
$idNumber = $parts['2'];
}
if($parts['3'] != ''){
$pageNumber = $parts['3'];
}
Where I need help: sometimes neither an id nor a page is present, sometimes one or the other, and sometimes both, so obviously my code above does not cover that. It assumes array item 2 is always the id and item 3 is always the page. Could someone show me a practical way of matching the page and id to variables only if they exist in the URI, without using regular expressions?
You can see what I have so far in my regular-expression version in this question: Is this a good way to match URI to class/method in PHP for MVC

This seems more extendable:
$parts = explode("/", $uri);
$parts_count=count($parts);
//set default values
$page_info=array('id'=>0,'page'=>0);
for($i=2;$i<$parts_count;$i++) {
if(strpos($parts[$i],'-')!==FALSE) {
list($info_type,$info_val)=explode('-',$parts[$i],2);
if(isset($page_info[$info_type])) {
$page_info[$info_type]=(int)$info_val;
}
}
}
Then just use the $page_info values. You can easily add other values and more levels of '/' this way.
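The same idea can be packaged as a function that also returns the controller and method, sketched below. The name parse_route() and the returned array layout are hypothetical, and the null-coalescing and short destructuring syntax assume PHP 7.1 or later.

```php
<?php
// Hypothetical helper; name and array layout are illustrative only.
function parse_route(string $uri): array
{
    $parts = explode('/', trim($uri, '/'));

    // Defaults cover URIs where id and/or page are missing.
    $route = [
        'controller' => $parts[0] ?? '',
        'method'     => $parts[1] ?? '',
        'id'         => 0,
        'page'       => 0,
    ];

    // Every segment after controller/method is expected as "key-value".
    foreach (array_slice($parts, 2) as $segment) {
        if (strpos($segment, '-') === false) {
            continue; // not a key-value segment
        }
        [$key, $value] = explode('-', $segment, 2);
        if ($key === 'id' || $key === 'page') {
            $route[$key] = (int) $value;
        }
    }

    return $route;
}

print_r(parse_route('forum/viewforum/id-522/page-3'));
print_r(parse_route('forum/viewforum/page-3'));
```

Because the segments are matched by their prefix rather than by position, id and page can appear in either order or be left out entirely.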

if ( ! empty($parts['2']))
{
if (strpos($parts['2'], 'id-') !== FALSE)
{
$idNumber = str_replace('id-', '', $parts['2']);
}
elseif (strpos($parts['2'], 'page-') !== FALSE)
{
$pageNumber = str_replace('page-', '', $parts['2']);
}
}
And do the same for $parts['3'].

Need a regex to add css class to first and last list item

UPDATE:
Thank you all for your input. Some additional information.
It's really just a small chunk of markup (20 lines) I'm working with, and I had aimed to leverage a regex to do the work.
I also do have the ability to hack up the script (an ecommerce one) to insert the classes as the navigation is built. I wanted to limit the number of hacks I have in place to keep things easier on myself when I go to update to the latest version of the software.
With that said, I'm pretty aware of my situation and the various options available to me. The first part of my regex works as expected. I posted really more or less to see if someone would say, "hey dummy, this is easy just change this....."
After coming close with a few of my efforts, it's more of the principle at this point. To just know (and learn) a solution exists for this problem. I also hate being beaten by a piece of code.
ORIGINAL:
I'm trying to leverage regular expressions to add a CSS class to the first and last list items within an ordered list. I've tried a bunch of different ways but can't produce the results I'm looking for.
I've got a regular expression for the first list item but can't seem to figure out a correct one for the last. Here is what I'm working with:
$patterns = array('/<ul+([^<]*)<li/m', '/<([^<]*)(?<=<li)(.*)<\/ul>/s');
$replace = array('<ul$1<li class="first"','<li class="last"$2$3</ul>');
$navigation = preg_replace($patterns, $replace, $navigation);
Any help would be greatly appreciated.

Jamie Zawinski would have something to say about this...
Do you have a proper HTML parser? I don't know if there's anything like hpricot available for PHP, but that's the right way to deal with it. You could at least employ hpricot to do the first cleanup for you.
If you're actually generating the HTML -- do it there. It looks like you want to generate some navigation and have a .first and .last kind of thing on it. Take a step back and try that.

+1 to generating the right html as the best option.
But a completely different approach, which may or may not be acceptable to you: you could use javascript.
This uses jQuery to make it easy ...
$(document).ready(
function() {
$('#id-of-ul > li:first-child').addClass('first');
$('#id-of-ul > li:last-child').addClass('last');
}
);
As I say, it may or may not be of any use in this case, but I think it's a valid solution to the problem in some cases.
PS: You say ordered list, then give ul in your example. ol = ordered list, ul = unordered list.

You wrote:
$patterns = array('/<ul+([^<]*)<li/m','/<([^<]*)(?<=<li)(.*)<\/ul>/s');
First pattern:
ul+ => you search something like ullll...
The m modifier is useless here, since you don't use ^ nor $.
Second pattern:
Using .* along with s is "dangerous", because you might select the whole document up to the last /ul of the page...
And well, I would just drop s modifier and use: (<li\s)(.*?</li>\s*</ul>) with replace: '$1class="last" $2'
In view of above remarks, I would write the first expression: <ul.*?>\s*<li
Although I am tired of seeing the Jamie Zawinski quote each time there is a regex question, Dustin is right in pointing you to an HTML parser (or just generating the right HTML from the start!): regexes and HTML don't mix well, because HTML syntax is complex, and unless you are working on well-known, machine-generated output with a very predictable result, something is likely to break in some cases.

I don't know if anyone cares any longer, but I have a solution that works in my simple test case (and I believe it should work in the general case).
First, let me point out two things: while PhiLho is right that the s modifier is "dangerous", since dots may match everything up to the end of the document, this may very well be what you want. It only becomes a problem with pages that are not well formed. Be careful with any such regex on large, manually written pages.
Second, php has a special meaning of backslashes, even in single quotes. Most regexen will perform well either way, but you should always double-escape them, just in case.
Now, here's my code:
<?php
$navigation='<ul>
<li>Coffee</li>
<li>Tea</li>
<li>Milk</li>
<li>Beer</li>
<li>Water</li>
</ul>';
$patterns = array('/<ul.*?>\\s*<li/',
'/<li((.(?<!<li))*?<\\/ul>)/s');
$replace = array('$0 class="first"',
'<li class="last"$1');
$navigation = preg_replace($patterns, $replace, $navigation);
echo $navigation;
?>
This will output
<ul>
<li class="first">Coffee</li>
<li>Tea</li>
<li>Milk</li>
<li>Beer</li>
<li class="last">Water</li>
</ul>
This assumes no line feeds inside the opening <ul...> tag. If there are any, use the s modifier on the first expression too.
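The magic happens in (.(?<!<li))*?. This matches any character (the dot), as long as the text consumed up to and including that character does not end in <li (the negative lookbehind), repeated any number of times (the *) in a non-greedy fashion (the ?). The match therefore cannot run across a later <li, so it has to start at the last list item before </ul>.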
Of course, the whole thing would have to be expanded if there is a chance the list items already have the class attribute set. Also, if there is only one list item, it will match twice, giving it two such attributes. At least for xhtml, this would break validation.

You could load the navigation in a SimpleXML object and work with that. This prevents you from breaking your markup with some crazy regex :)
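A minimal sketch of that idea, assuming the navigation markup is well-formed XML with a single ul root and that the SimpleXML and DOM extensions are available. The function names add_first_last_classes() and append_class() are hypothetical.

```php
<?php
// Hypothetical helper: append a class without dropping existing ones.
function append_class(SimpleXMLElement $element, string $class): void
{
    $existing = trim((string) $element['class']);
    $element['class'] = $existing === '' ? $class : $existing . ' ' . $class;
}

// Hypothetical helper: expects well-formed markup with <ul> as the root.
function add_first_last_classes(string $markup): string
{
    $list = simplexml_load_string($markup);
    if ($list === false || count($list->li) === 0) {
        return $markup; // not parseable or no items: leave untouched
    }

    $items = $list->li;
    append_class($items[0], 'first');
    append_class($items[count($items) - 1], 'last');

    // Serialize only the <ul> node, without an XML declaration.
    $node = dom_import_simplexml($list);
    return $node->ownerDocument->saveXML($node);
}

echo add_first_last_classes("<ul>\n<li>Coffee</li>\n<li>Tea</li>\n<li>Water</li>\n</ul>");
```

A list with a single item ends up with one class attribute containing both names, which avoids the duplicate-attribute problem mentioned for the regex version.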

As a preface .. this is waaay over-complicating things in most use-cases. Please see other answers for more sanity :)
Here is a little PHP class I wrote to solve a similar problem. It adds 'first', 'last' and any other classes you want. It will handle li's with no "class" attribute as well as those that already have some class(es).
<?php
/**
* Modify list items in pre-rendered html.
*
* Usage Example:
* $replaced_text = ListAlter::addClasses($original_html, array('cool', 'awesome'));
*/
class ListAlter {
private $classes = array();
private $classes_found = FALSE;
private $count = 0;
private $total = 0;
// No public instances.
private function __construct() {}
/**
* Adds 'first', 'last', and any extra classes you want.
*/
static function addClasses($html, $extra_classes = array()) {
$instance = new self();
$instance->classes = $extra_classes;
$total = preg_match_all('~<li([^>]*?)>~', $html, $matches);
$instance->total = $total ? $total : 0;
return preg_replace_callback('~<li([^>]*?)>~', array($instance, 'processListItem'), $html);
}
private function processListItem($matches) {
$this->count++;
$this->classes_found = FALSE;
$processed = preg_replace_callback('~(\w+)="(.*?)"~', array($this, 'appendClasses'), $matches[0]);
if (!$this->classes_found) {
$classes = $this->classes;
if ($this->count == 1) {
$classes[] = 'first';
}
if ($this->count == $this->total) {
$classes[] = 'last';
}
if (!empty($classes)) {
$processed = rtrim($matches[0], '>') . ' class="' . implode(' ', $classes) . '">';
}
}
return $processed;
}
private function appendClasses($matches) {
array_shift($matches);
list($name, $value) = $matches;
if ($name == 'class') {
$value = array_filter(explode(' ', $value));
$value = array_merge($value, $this->classes);
if ($this->count == 1) {
$value[] = 'first';
}
if ($this->count == $this->total) {
$value[] = 'last';
}
$value = implode(' ', $value);
$this->classes_found = TRUE;
}
return sprintf('%s="%s"', $name, $value);
}
}
