WordPress rewrite rule for custom PHP page

I have created a PHP page, which I include in WordPress through an include plugin.
When I visit domain.com/kb, it shows the included PHP file, which works fine.
I want to make the URL pretty, so I tried adding this to the top of the PHP file:
add_rewrite_rule('^kb/([^/]*)/([^/]*)/?','index.php?page_id='.get_the_ID().'&category=$matches[1]&sequence=$matches[2]','top');
But when I visit domain.com/kb/123, it just removes the 123 and leaves domain.com/kb/
Ultimately, I want to be able to visit domain.com/kb/123/456 and read "123" and "456" as separate variables.

The regex you currently have doesn't match kb/123, because it requires a second / after the first segment.
I'm going to ignore that and go for your end goal.
Going by this answer,
I've come up with the following solution:
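function wpd_foo_rewrite_rule() {
    // Note: get_the_ID() is normally empty this early (on init), so a fixed
    // page ID may be needed here instead.
    add_rewrite_rule(
        '^kb/(.*?)/([^/]*)',
        'index.php?page_id='.get_the_ID().'&category=$matches[1]&sequence=$matches[2]',
        'top'
    );
}
add_action( 'init', 'wpd_foo_rewrite_rule' );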
Let's go over the regex.
In (.*?), the ? after the * means "lazy": match as few characters as possible until you hit the following character (/). The result is then saved in a capture group.
So the first capture group takes everything between kb/ and the first / after it. Attention: that / must exist, otherwise there is no match.
The second capture group takes what follows that /.
Written as (.*), it would take everything after the slash, so any further "/" parameters would all end up in capture group 2; this assumes you won't have other "/" parameters or query parameters.
If you want to use your "any character except /" class ([^/]*) instead, as in the code above, that's fine too; it stops at the next /.
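To check how the two rules behave, here is a minimal sketch in plain PHP, outside WordPress. It assumes WordPress tests a rule roughly as preg_match("#^$rule#", $path) against the request path without the domain or surrounding slashes; the helper name kb_match is made up for this illustration.

```php
<?php
// Hypothetical helper for illustration; not part of WordPress.
// Assumption: a rewrite rule is tested roughly like this against the
// request path (no domain, no leading or trailing slash).
function kb_match(string $rule, string $path): ?array
{
    return preg_match('#^' . $rule . '#', $path, $m) === 1 ? $m : null;
}

$original = '^kb/([^/]*)/([^/]*)/?';  // rule from the question
$answer   = '^kb/(.*?)/([^/]*)';      // rule from the answer

var_dump(kb_match($original, 'kb/123'));      // no match: the 2nd "/" is mandatory
var_dump(kb_match($answer, 'kb/123'));        // no match for the same reason
print_r(kb_match($answer, 'kb/123/456'));     // groups 1 and 2 hold 123 and 456
print_r(kb_match($answer, 'kb/123/456/789')); // still 123 and 456; "/789" is ignored
```

Neither rule matches a single segment such as kb/123, so a separate rule (or an optional second segment) would be needed if that URL should work as well.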
As a rule: never trust user input, always sanitize it. Be on the lookout for ways to trick your plugin into loading files it shouldn't, inserting malicious data into storage (DB etc.), or querying data it shouldn't.
Btw, the screenshots were taken from regexr.

Related

Custom php Hide programmer's comments from public view

When we create an HTML page, comments like
<!-- Comment 1 -->
or, inside PHP,
// Comment2
are visible by right-clicking the page and choosing Show code.
How can I prevent that?
Hiding the comments inside is the answer.. Thanks everyone.
HTML comments will show in the HTML page, but as long as you keep your PHP comments inside the <?php tag, they won't be shown to the user.
Leaving HTML comments in is normal: if you check amazon.com's code you will see all the HTML comments, but none from PHP or whatever server language they use. So don't worry about HTML comments; just don't put sensitive things like your admin password or revealing database schema details in them.
If you still want to remove all the comments, even the HTML ones (VS Code):
Easy way:
* Open extensions (ctrl-shift-x).
* Type in remove comments in the search box.
* Install the top pick and read the instructions.
Hard way:
* Search and replace (ctrl-h).
* Toggle regex on (alt-r).
* Learn some regular expressions! https://docs.rs/regex/0.2.5/regex/#syntax
A simple //.* will match all single line comments (and more ;D). #.* could be used to match python comments. And /\*[\s\S\n]*\*/ matches block comments. And you can combine them as well: //.*|/\*[\s\S\n]*\*/ (| in regex means "or", . means any character, * means "0 or more" and indicates how many characters to match, therefore .* means all characters until the end of the line (or until the next matching rule))
Of course there are caveats: URLs (https://...) contain double slashes and will match that first rule, and god knows where there are # characters in code that will match that python rule. So some reading/adjusting has to be done!
Once you start fiddling with your regexes it can take a lifetime to get them perfect, so be careful and go the easy route if you are short on time. But knowing some simple regex by heart will do you good, since regular expressions are usable almost everywhere.
From https://stackoverflow.com/a/50575194/17239314
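The caveats mentioned in the quote can be reproduced with PHP's PCRE functions. This is only a sketch on a made-up two-line sample; ~ is used as the delimiter so the slashes need no escaping, and the lazy variant [\s\S]*? is an addition that is not part of the quoted answer.

```php
<?php
// Made-up sample: one URL, one trailing line comment, two block comments.
$source = <<<'SRC'
$url = "https://example.com"; // trailing note
/* first */ $a = 1; /* second */
SRC;

// "//.*" also catches the double slash inside the URL.
preg_match_all('~//.*~', $source, $single);
print_r($single[0]);   // one match, starting at //example.com

// The greedy block pattern runs from the first /* to the LAST */.
preg_match('~/\*[\s\S]*\*/~', $source, $greedy);
echo $greedy[0], "\n"; // swallows "$a = 1;" as well

// A lazy quantifier stops at the first closing */ instead.
preg_match_all('~/\*[\s\S]*?\*/~', $source, $lazy);
print_r($lazy[0]);     // "/* first */" and "/* second */"
```

So a plain regex replace can eat code that sits inside a string or between two block comments, which is why the answer recommends reviewing matches before replacing.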

RegExp to match a segment of a URL

I'm trying to use RegExp to match a segment of a URL.
The URL in question is this:
http://www.example.com/news/region/north-america/
As I need this regex for the WordPress URL Rewrite API, the subject will only be the path section of the URL:
news/region/north-america
In the above example I need to be able to extract the north-america portion of the path, however when pagination is used the path becomes something like this:
news/region/north-america/page/2
Where I still only need to extract the north-america portion.
The RegExp I've come up with is as follows:
^news/region/(.*?)/(.*?)?/?(.*?)?$
However, this does not match news/region/north-america, only news/region/north-america/page/2
From what I can tell I need to make the trailing slash after north-america optional, but adding /? doesn't seem to work.
Try this:
preg_match('/news\/region\/(.*?)\//',"http://www.example.com/news/region/north-america/page/2",$matches);
$matches[1] will then give you the output "north-america".
You should match using this regex:
^news/region/([^/]+)
This matches news/region/north-america, with north-america in the first capture group, even when the URI becomes /news/region/north-america/page/2
georg's suggested rule works like a charm:
^news/region/(.*?)(?:/(.*?)/(.*?))?$
For those interested in the application of this regex, I used it in the WP Rewrite API to grab the custom taxonomy and page number (if present) and assign the relevant matches to the WP rewrite:
$newRules['news/region/(.*?)(?:/(.*?)/(.*?))?$']='index.php?region=$matches[1]&forcetemplate=news&paged=$matches[3]';
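As a quick check of that rule outside WordPress, the sketch below runs it against both forms of the path. The function name region_and_page is hypothetical, and # is used as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical helper: returns the region and the page number ('' when
// the path carries no pagination), or null when the rule does not match.
function region_and_page(string $path): ?array
{
    $rule = '#^news/region/(.*?)(?:/(.*?)/(.*?))?$#';
    if (preg_match($rule, $path, $m) !== 1) {
        return null;
    }
    // Group 3 is simply absent from $m when the optional part did not match.
    return ['region' => $m[1], 'paged' => $m[3] ?? ''];
}

print_r(region_and_page('news/region/north-america'));
print_r(region_and_page('news/region/north-america/page/2'));
```

Both calls report north-america as the region; only the second one carries a page number.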

RegEx to find a PHP RegEx string

I want to match a PHP regex string.
From what I know, they are always in this format (correct me if I am wrong):
/               One opening forward slash
the expression  Any regular expression
/               One closing forward slash
[imsxe]         Any number of the modifiers, NOT REPEATING
My expression for this was:
^/.+/[imsxe]{0,5}$
Written as a PHP string, (with the open/close forward slash and escaped inner forward slashes) it is this:
$regex = '/^\/.+\/[imsxe]{0,5}$/';
which is:
^             From the beginning
/             Literal forward slash
.+            Any character, one or more
/             Literal forward slash
[imsxe]{0,5}  Any of the chars i,m,s,x,e, 0-5 times (only 5 to choose from)
$             Until the end
This works, however it allows repeating modifiers, i.e:
This: ^/.+/[imsxe]{0,5}$
Allows this: '/blah/ii'
Allows this: '/blah/eee'
Allows this: '/blah/eise'
etc...
When it should not.
I personally use RegexPal to test, because it's free and simple.
If (in order to help me) you would like to do the same, click the link above (or visit http://regexpal.com) and paste my expression in the top text box
^/.+/[imsxe]{0,5}$
Then paste my tests in the bottom textbox
/^[0-9]+$/i
/^[0-9]+$/m
/^[0-9]+$/s
/^[0-9]+$/x
/^[0-9]+$/e
/^[0-9]+$/ii
/^[0-9]+$/mm
/^[0-9]+$/ss
/^[0-9]+$/xx
/^[0-9]+$/ee
/^[0-9]+$/iei
/^[0-9]+$/mim
/^[0-9]+$/sis
/^[0-9]+$/xix
/^[0-9]+$/eie
Ensure you click the second checkbox at the top where it says '^$ match at line breaks (m)' to enable the multi-line testing.
Thanks for the help
Edit
After reading comments about Regex often having different delimiters i.e
/[0-9]+/ == #[0-9]+#
This is not a problem and can be factored in to my regex solution.
All I really need to know is how to prevent duplicate characters!
Edit
This bit isn't so important but it provides context
The need for such a feature is simple...
I'm using jQuery UI MultiSelect Widget written by Eric Hynds.
Simple demo found here
Now, in my application, I'm extending the plugin so that certain options pop up a little menu on the right when hovered. The menu that pops up can be ANY HTML element.
I wanted multiple options to be able to show the same element. So my API works like this:
$('#select_element_id')
    // Erics MultiSelect API
    .multiselect({
        // MultiSelect options
    })
    // My API
    .multiselect_side_pane({
        menus: [
            {
                // This means, when an option with value 'MENU_1' is hovered,
                // the element '#my_menu_1' will be shown. This makes attaching
                // menus to options REALLY SIMPLE
                menu_element: $('#my_menu_1'),
                target: ['MENU_1']
            },
            // However, lets say we have option value 'USER_ID_132', I need the
            // target name to be dynamic. What better way to be dynamic than regex?
            {
                menu_element: $('#user_details_box'),
                targets: ['USER_FORM', '/^USER_ID_[0-9]+$/'],
                onOpen: function(target)
                {
                    // here the TARGET can be interrogated, and the correct
                    // user info can be displayed
                    // Target will be 'USER_FORM' or 'USER_ID_3' or 'USER_ID_234'
                    // so if it is USER_FORM I can clear the form ready for input,
                    // and if the target starts with 'USER_ID_', I can parse out
                    // the user id, and display the correct user info!
                }
            }
        ]
    });
So as you can see, the whole reason I need to know whether a string is a regex is so that, in the widget code, I can decide whether to treat the TARGET as a string (i.e. 'USER_FORM') or as an expression (i.e. '/^USER_ID_[0-9]+$/' for 'USER_ID_234').
Unfortunately, the regexp string can be "anything". The forward slashes you talk about can be many other characters, e.g. a hash (#) will also work.
Secondly, matching up to 5 characters without repeating any of them could probably be done with lookahead / lookbehind etc., but that creates such a complex regexp that it's faster to post-process it.
It is possibly faster to search the code for the regular expression functions (preg_match, preg_replace etc.) to deduce where regular expressions are used.
$var = '#placeholder#';
Is a valid regular expression in PHP, but doesn't have to be one, where:
const ESCAPECHAR = '#';
$var = 'text';
$regexp = ESCAPECHAR . $var . ESCAPECHAR;
Is also valid, but might not be seen as such.
In order to prevent duplicates in the modifier section, I'd do:
^/.+/(?:(?=[^i]*i[^i]*)?(?=[^m]*m[^m]*)?(?=[^s]*s[^s]*)?(?=[^x]*x[^x]*)?(?=[^e]*e[^e]*)?)?$
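Note that every lookahead in the pattern above is optional and zero-width, so as written it appears to accept only strings that end right at the closing slash. One alternative, offered here as a sketch rather than as the answerer's method, is a negative lookahead with a backreference that rejects any modifier tail containing the same character twice. The function name looksLikeRegexString is hypothetical, and the sketch only covers the / delimiter and the five modifiers listed in the question.

```php
<?php
// Hypothetical helper: true when $candidate looks like /pattern/ followed
// by non-repeating modifiers drawn from i, m, s, x, e.
function looksLikeRegexString(string $candidate): bool
{
    // (?!.*(.).*\1) fails as soon as any character occurs twice in the
    // modifier tail that follows the closing slash.
    return preg_match('~^/.+/(?!.*(.).*\1)[imsxe]*$~', $candidate) === 1;
}

$tests = ['/^[0-9]+$/i', '/^[0-9]+$/im', '/^[0-9]+$/ii', '/^[0-9]+$/iei', 'USER_FORM'];
foreach ($tests as $test) {
    echo str_pad($test, 16), looksLikeRegexString($test) ? 'regex' : 'plain string', "\n";
}
```

The same check could also be done by post-processing, as suggested above: capture the modifier tail and compare its length with the number of distinct characters in it.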

PHP Regex URL parsing issues preg_replace

I have a custom markup parsing function that has been working very well for many years. I recently discovered a bug that I hadn't noticed before, and I haven't been able to fix it. If anyone can help me with this, that'd be awesome. I have a custom-built forum and text-based MMORPG, and every input is sanitized and parsed for BBCode-like markup. It'll also parse out URLs and turn them into legit links that go to an exit page with a disclaimer that you're leaving the site. The issue I'm having is that when a user posts multiple URLs in a text box (let's say \n delimited), it'll only convert every other URL into a link. Here's the parser for URLs:
$markup = preg_replace("/(^|[^=\"\/])\b((\w+:\/\/|www\.)[^\s<]+)" . "((\W+|\b)([\s<]|$))/ei", '"$1".shortURL("$2")."$4"', $markup);
As you can see, it calls a PHP function, but that's not the issue here. The entire text block is passed into this preg_replace at once, rather than line by line or by any other means.
If there's a simpler way of writing this preg_replace, please let me know
If you can figure out why this is only parsing every other URL, that's my ultimate goal here
Example INPUT:
http://skylnk.co/tRRTnb
http://skylnk.co/hkIJBT
http://skylnk.co/vUMGQo
http://skylnk.co/USOLfW
http://skylnk.co/BPlaJl
http://skylnk.co/tqcPbL
http://skylnk.co/jJTjRs
http://skylnk.co/itmhJs
http://skylnk.co/llUBAR
http://skylnk.co/XDJZxD
Example OUTPUT:
http://skylnk.co/tRRTnb
<br>http://skylnk.co/hkIJBT
<br>http://skylnk.co/vUMGQo
<br>http://skylnk.co/USOLfW
<br>http://skylnk.co/BPlaJl
<br>http://skylnk.co/tqcPbL
<br>http://skylnk.co/jJTjRs
<br>http://skylnk.co/itmhJs
<br>http://skylnk.co/llUBAR
<br>http://skylnk.co/XDJZxD
<br>
The e flag in preg_replace is deprecated (and was removed in PHP 7). You can use preg_replace_callback to get the same functionality.
The i flag does little here: \w already matches both upper and lower case and there is no backreference in your pattern, so it only affects the literal www.
I set the m flag, which makes ^ and $ match at the beginning and end of each line, rather than only at the beginning and end of the entire string. This should fix your weird problem of matching every other line.
I also made some of the groups non-capturing, (?:pattern), since the bigger capturing groups have already captured the text.
The code below is not tested. I only tested the regex in a regex tester.
$markup = preg_replace_callback(
    "/(^|[^=\"\/])\b((?:\w+:\/\/|www\.)[^\s<]+)((?:\W+|\b)(?:[\s<]|$))/m",
    function ($m) {
        // $m[1] = leading character, $m[2] = the URL, $m[3] = trailing delimiter
        return "$m[1]".shortURL($m[2])."$m[3]";
    },
    $markup
);
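The reason only every other URL is linked seems to be that each match also consumes the newline after the URL. Without the m flag, the next URL then has neither a start-of-string nor an unconsumed character in front of it, so it is skipped. The sketch below counts replacements with and without m on made-up input; shortURL() here is only a stand-in for the asker's function, and countLinkedUrls is a hypothetical helper.

```php
<?php
// Hypothetical stand-in for the asker's shortURL() helper.
function shortURL(string $url): string
{
    return '[' . $url . ']';
}

// Counts how many URLs the pattern replaces with the given flags.
function countLinkedUrls(string $markup, string $flags): int
{
    $pattern = '/(^|[^="\/])\b((?:\w+:\/\/|www\.)[^\s<]+)((?:\W+|\b)(?:[\s<]|$))/' . $flags;
    preg_replace_callback(
        $pattern,
        function (array $m): string {
            return $m[1] . shortURL($m[2]) . $m[3];
        },
        $markup,
        -1,
        $count
    );
    return $count;
}

// Made-up input: four newline-delimited URLs.
$markup = "http://a.example/1\nhttp://a.example/2\nhttp://a.example/3\nhttp://a.example/4";

echo countLinkedUrls($markup, ''), "\n";  // 2: every other URL is skipped
echo countLinkedUrls($markup, 'm'), "\n"; // 4: ^ now matches after each newline
```

Another option, not used in the answer, would be to turn the trailing delimiter into a lookahead so that it is not consumed at all.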

php & .htaccess clean url problem

I want to use clean URLs for my site, but I have a big problem!
I have URLs like:
index.php?lang=en&mod=product&section=category
index.php?lang=en&mod=product&caption=fetch&id=45
index.php?lang=pe&mod=blog&section=category&id=560
index.php?lang=pe&mod=blog&section=category&id=564
index.php?lang=pe&mod=blog&section=category&id=567
index.php?lang=pe&mod=blog&section=category&id=571
index.php?lang=pe&mod=blog&id=556
index.php?lang=pe&mod=page&id=537
index.php?lang=pe&mod=blog&id=558&o_t=cDate_ASC
index.php?lang=pe&mod=product&caption=fetch&id=7804
As you can see, my problem is that the order of my variables differs from URL to URL, and my 3rd or 4th variable is not stable: sometimes it's id and sometimes it's caption.
I want my URL template to be like en/product/category, but when I set it up in .htaccess, it's not clear whether the third level is "id" or "caption"!
Should I put all the variables in my URL, like this?
index.php?lang=en&mod=product&section=category
|
|
|
V
index.php?lang=en&mod=product&section=category&caption=&id=&o_t=&v_t=&offset=
EDIT :
I use Smarty as my template engine, so I should change the link addresses in my templates to match my clean URLs (e.g. en/product/category/324). My problem is that when I set a link to en/product/34 or en/product/category/23, my .htaccess rewrite rules cannot tell whether the 3rd part is an id or a category.
In this case:
RewriteRule ^/(en|pe)/(product|blog|page)/(category)/([0-9]{1,})/$ index.php?lang=$1&mod=$2&section=$3&id=$4
The 3rd variable is category, and .htaccess defines the 3rd part as category, but as you can see, sometimes the URL has no category and has an id in its place!
This is my big problem.
You'd need to make a few rewrite rules I think.
E.G.
RewriteRule ^/(en|pe)/(product|blog|page)/([0-9]{1,})/$ index.php?lang=$1&mod=$2&id=$3
This would map /en/page/22/ to index.php?lang=en&mod=page&id=22 (as long as the ID is at least 1 digit long).
RewriteRule ^/(en|pe)/(product|blog|page)/(category)/([0-9]{1,})/$ index.php?lang=$1&mod=$2&section=$3&id=$4
This would map /en/blog/category/22/ to index.php?lang=en&mod=blog&section=category&id=22
You may need to fiddle with the ^/ at the start depending on whether you have a RewriteBase set or not; in a .htaccess file the path is matched without its leading slash.
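Because the third segment is either digits or the literal word category, the two rules can never both match the same URL. The sketch below mimics them with preg_match so the behaviour can be tried without a web server; resolveCleanUrl is a hypothetical helper, and the leading slash is kept exactly as in the rules above.

```php
<?php
// Hypothetical helper: mimics the two RewriteRules above in PHP and
// returns the query parameters the matching rule would produce.
function resolveCleanUrl(string $path): ?array
{
    $rules = [
        '#^/(en|pe)/(product|blog|page)/([0-9]{1,})/$#'            => ['lang', 'mod', 'id'],
        '#^/(en|pe)/(product|blog|page)/(category)/([0-9]{1,})/$#' => ['lang', 'mod', 'section', 'id'],
    ];
    foreach ($rules as $pattern => $names) {
        if (preg_match($pattern, $path, $m) === 1) {
            // Pair each capture group with its parameter name.
            return array_combine($names, array_slice($m, 1));
        }
    }
    return null; // no rule applies
}

print_r(resolveCleanUrl('/en/page/22/'));          // lang, mod, id
print_r(resolveCleanUrl('/en/blog/category/22/')); // lang, mod, section, id
var_dump(resolveCleanUrl('/en/blog/other/'));      // NULL
```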
EDIT:
Explanation:
^ indicates the starting position of the URL from the base, i.e. site.com/(whatever here is in the URL).
(en|pe) means that the first value in that particular rule can be EITHER en OR pe. Adding more is easy: (en|pe|ru|jp) etc. The same goes for the product/blog/page part. I included (category) just in case you had other 'section' types that were not 'category'.
[0-9] is any numeric character 0 to 9. {1,} means 1 or more characters in length. If you want between 2 and 4, use {2,4}, for example. Exactly 3 characters? {3}. It's useful when targeting specific things.
$ means the end of the matched string. If you intend to have nothing after the id except a / (you could even remove that /), then use that example as is. If you intend to have the title of a blog post afterward, you can use (.*)$, which means anything can come after the page id, e.g. /en/blog/category/22/oheyoheyoheyohey would be the same as /en/blog/category/22/abcjhrefgwgrjurgh.
If you pass the title as a parameter, &title=this is the title, just do the same thing as I did in the example for ID, except use [a-zA-Z0-9-+_.] to include alphanumeric characters, +, -, _ and .
$1 refers to the first parenthesized group in the first argument of the rule, $2 to the second, and so on. E.g. $1 refers to (en|pe), so lang can be either en or pe.
If you want the rule to apply to multiple pages, and not just index.php, make it:
RewriteRule ^/([a-zA-Z]+)/(en|pe)/(product|blog|page)/([0-9]{1,})/$ $1.php?lang=$2&mod=$3&id=$4
So in that case, site.com/blah/en/product/22/ would map to site.com/blah.php?lang=en&mod=product&id=22
Why should you do so? I see no problems with your URLs; a normal user does not mind, and those who inspect your site can read it all without problems...
You have to agree on some ordering of parameters, and use multiple rewrite rules.
For the order lang > mod > section > caption > id > o_t > v_t > offset you want to have something like this:
RewriteRule ^/(\w+)$ index.php?lang=$1
RewriteRule ^/(\w+)/(\w+)$ index.php?lang=$1&mod=$2
RewriteRule ^/(\w+)/(\w+)/(\w+)$ index.php?lang=$1&mod=$2&section=$3
RewriteRule ^/(\w+)/(\w+)/(\w+)/(\w+)$ index.php?lang=$1&mod=$2&section=$3&caption=$4
RewriteRule ^/(\w+)/(\w+)/(\w+)/(\w+)/(\d+)$ index.php?lang=$1&mod=$2&section=$3&caption=$4&id=$5
...
and so on
Above, I assume lang, mod, section and caption are made up of word characters (\w matches letters, digits and the underscore, but no other special chars), and the id is made of digits.
The real file index.php does not care about the order of variables in the query string, or whether a value is missing, because it receives name-value pairs: the right association is ensured by the presence or absence of the full pair.
To be clear, for index.php it is the same whether the query string is "?lang=en&mod=blog" or "?mod=blog&lang=en", or only "?lang=en", because the script reads the variables from the $_GET associative array, independently of their order or presence in the array.
What is important is that you plan a fixed order of variables in the new virtual URLs to rewrite, because they contain only the variable values, while the variable name is taken from the position inside the virtual URL. That is (note this is pseudo-code):
yourdomain/val1/val2/val3/...etc
to be rewritten as:
index.php?var1=val1&var2=val2&var3=val3&...etc
So in the new URLs you are going to plan, there cannot be missing values.
You can solve this problem by assigning fake values to the missing variables that your script will not accept as valid.
For example, if the mod variable is missing, you can put in that position a string that the script will not consider valid, so that it is handled as if it were empty.
If you have an array of allowed values, you can just add a check with if (in_array()) instead of (or in addition to) if (empty()).
When you build the links to other pages, you can just add this check for a missing value:
if (empty(val3)) val3 = 'fake_value';
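A minimal sketch of that idea in PHP follows. The allowed values are assumed from the URLs in the question, and the names ALLOWED_MODS, PLACEHOLDER, segmentOrPlaceholder and modOrNull are hypothetical.

```php
<?php
// Assumption: these are the only valid values for "mod" (taken from the
// URLs in the question).
const ALLOWED_MODS = ['product', 'blog', 'page'];
// Fake value written into a URL position that has no real value.
const PLACEHOLDER = 'fake_value';

// Used when building links: never leave a position empty.
// An explicit comparison is used because empty('0') is also true in PHP.
function segmentOrPlaceholder(?string $value): string
{
    return ($value === null || $value === '') ? PLACEHOLDER : $value;
}

// Used in index.php: anything outside the whitelist, including the
// placeholder, is treated as if the variable were missing.
function modOrNull(string $segment): ?string
{
    return in_array($segment, ALLOWED_MODS, true) ? $segment : null;
}

echo 'en/' . segmentOrPlaceholder(null) . '/' . segmentOrPlaceholder('category'), "\n";
var_dump(modOrNull('blog'));       // string(4) "blog"
var_dump(modOrNull(PLACEHOLDER));  // NULL
```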
