Capitalize the encoded part of encoded url in PHP

I have URL strings like this:
$url = 'http://mysite.com/sub1/sub2/%d8%f01';
I'd like to capitalize only the encoded parts of the URL (the %** substrings). In the example that part is '%d8%f01', so the final URL would be:
http://mysite.com/sub1/sub2/%D8%F01
I could probably use preg_replace(), but I can't come up with a correct regex.
Any clues? Thanks!

You could use preg_replace_callback to convert matched %** substrings to upper case:
$url = 'http://example.com/sub1/sub2/%d8%f01';
echo preg_replace_callback('/(%..)/', function ($m) { return strtoupper($m[1]); }, $url);
Output:
http://example.com/sub1/sub2/%D8%F01
Note this will also work if not all of the URL is encoded, for example:
$url = 'http://example.com/sub1/sub2/%cf%81abcd%ce%b5';
echo preg_replace_callback('/(%..)/', function ($m) { return strtoupper($m[1]); }, $url);
Output:
http://example.com/sub1/sub2/%CF%81abcd%CE%B5
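The '/(%..)/' pattern above uppercases any two characters that follow a %, whether or not they are hex digits. Here is a minimal sketch of a stricter variant, assuming only well-formed escapes (a % followed by exactly two hex digits) should be touched. The function name uppercase_percent_escapes is made up for illustration.

```php
<?php
// Hypothetical helper name; it is not part of the original answer.
function uppercase_percent_escapes($url)
{
    // %[0-9a-fA-F]{2} matches exactly one percent-encoded byte.
    return preg_replace_callback(
        '/%[0-9a-fA-F]{2}/',
        function ($m) {
            return strtoupper($m[0]);
        },
        $url
    );
}

echo uppercase_percent_escapes('http://example.com/sub1/sub2/%d8%f01'), "\n";
// A stray '%' that is not followed by two hex digits is left alone.
echo uppercase_percent_escapes('http://example.com/100%ok/%cf%81'), "\n";
```

With this pattern the literal "%ok" in the second URL stays lowercase, whereas '/(%..)/' would turn it into "%OK".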
Update
It is also possible to solve this with a straight preg_replace, although the patterns and replacements are quite repetitive as you have to consider all possible hex digits in each position after the %:
$url = 'http://example.com/sub1/sub2/%cf%81abcd%ce%5b';
echo preg_replace(array('/%a/', '/%b/', '/%c/', '/%d/', '/%e/', '/%f/',
'/%(.)a/', '/%(.)b/', '/%(.)c/', '/%(.)d/', '/%(.)e/', '/%(.)f/'),
array('%A', '%B', '%C', '%D', '%E', '%F',
'%$1A', '%$1B', '%$1C', '%$1D', '%$1E', '%$1F'),
$url);
Output:
http://example.com/sub1/sub2/%CF%81abcd%CE%5B
Update 2
Inspired by @Martin, I did some performance testing. The preg_replace_callback solution typically ran about 25% faster than the preg_replace one (0.0156 seconds vs 0.0220 seconds for 10000 iterations).
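A comparison like this could be reproduced with a small timing harness. The sketch below is an assumption about how such a test might be set up, not the code that produced the figures above; time_it is a hypothetical helper, and the timings will differ between machines and PHP versions.

```php
<?php
// Hypothetical harness: runs $fn $iterations times and returns elapsed seconds.
function time_it(callable $fn, $iterations = 10000)
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$url = 'http://example.com/sub1/sub2/%cf%81abcd%ce%5b';

// Variant 1: a single pattern with a callback.
$callbackTime = time_it(function () use ($url) {
    return preg_replace_callback('/(%..)/', function ($m) {
        return strtoupper($m[1]);
    }, $url);
});

// Variant 2: twelve patterns with plain string replacements.
$patterns = array('/%a/', '/%b/', '/%c/', '/%d/', '/%e/', '/%f/',
    '/%(.)a/', '/%(.)b/', '/%(.)c/', '/%(.)d/', '/%(.)e/', '/%(.)f/');
$replacements = array('%A', '%B', '%C', '%D', '%E', '%F',
    '%$1A', '%$1B', '%$1C', '%$1D', '%$1E', '%$1F');
$plainTime = time_it(function () use ($url, $patterns, $replacements) {
    return preg_replace($patterns, $replacements, $url);
});

printf("preg_replace_callback: %.4f s\npreg_replace: %.4f s\n", $callbackTime, $plainTime);
```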

I am not at all knowledgeable about PHP, but here's an example (albeit a bodged-together one which could probably be refactored) of how to do it without regex. There's no need to use regex for something like this.
$str = "http://example.com/sub1/sub2/%d8%f01";
$expl = explode('%', $str);
foreach ($expl as $i => $val) {
    // The first piece is the text before any '%', so leave it alone.
    if ($i > 0) {
        // Uppercase only the two hex digits that follow the '%'.
        $expl[$i] = '%' . strtoupper(substr($val, 0, 2)) . substr($val, 2);
    }
}
echo implode('', $expl);

Related

PHP: preg_replace() to get "parent" component of NameSpace

How can I use the preg_replace() replace function to only return the parent "component" of a PHP NameSpace?
Basically:
Input: \Base\Ent\User; Desired Output: Ent
I've been doing this using substr() but I want to convert it to regex.
Note: Can this be done without preg_match_all()?
Right now, I also have code that returns all of the parent components:
$s = '\\Base\\Ent\\User';
print preg_replace('~\\\\[^\\\\]*$~', '', $s);
//=> \Base\Ent
But I only want to return Ent.
Thank you!

As Rocket Hazmat says, explode is almost certainly going to be better here than a regex. I would be surprised if it's actually slower than a regex.
But, since you asked, here's a regex solution:
$path = '\Base\Ent\User';
$search = preg_match('~([^\\\\]+)\\\\[^\\\\]+$~', $path, $matches);
if ($search) {
    $parent = $matches[1];
} else {
    $parent = ''; // handles the case where the path is just, e.g., "User"
}
echo $parent; // echoes Ent
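For comparison, the explode approach mentioned at the top of this answer might look like the sketch below. The function name parent_component is hypothetical, and returning an empty string when there is no parent is an assumption chosen to mirror the regex version.

```php
<?php
// Hypothetical helper: returns the second-to-last namespace component,
// or '' when the path has fewer than two components.
function parent_component($path)
{
    // Drop leading/trailing backslashes so explode() yields no empty pieces at the ends.
    $parts = explode('\\', trim($path, '\\'));
    $n = count($parts);
    return $n >= 2 ? $parts[$n - 2] : '';
}

echo parent_component('\Base\Ent\User'), "\n"; // Ent
```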

I think preg_match might be a better choice for this.
$s = '\\Base\\Ent\\User';
$m = [];
if (preg_match('/([^\\\\]*)\\\\[^\\\\]*$/', $s, $m)) {
    print $m[1]; // Ent
}
If you read the regular expression backwards, from the $, it matches a run of characters that aren't backslashes, then a backslash, then another run of characters that aren't backslashes. That last run is the captured group, saved for later in $m[1].

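How about
$path = '\Base\Ent\User';
// Cut off the last component, then take what follows the last remaining backslash.
$section = substr(strrchr(substr($path, 0, strrpos($path, "\\")), "\\"), 1);
Or, when the path has exactly three components as in the example:
$path = '\Base\Ent\User';
// Skip past the second backslash, then keep everything before the next one.
$section = strstr(substr($path, strpos($path, "\\", 1) + 1), "\\", true);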

Preg_replace adding second parameter to php function

I'm still totally lost when it comes to the preg_replace function, so I would be very happy if someone could help me with this one.
I have a string which can contain a function call like Published("today"), and I need to convert it with a regular expression to Published("today", 1).
I basically need to add a second parameter to the function call via a regular expression.
I can't use str_replace because the first parameter can be (has to be) alphanumeric text.

$string = 'Published("today")';
$foo = preg_replace('/Published\("(\w+)"\)/', 'Published("$1", 1)', $string);

preg_replace_callback should do the job, I reckon.
<?php
$string = 'Published("today"); Published("yesterday"); Published("5 days ago");';
$callback = function ($match) {
    return sprintf('%s, 1', $match[0]);
};
$string = preg_replace_callback(
    '~(?<=Published\()"[^"]+"(?=\))~',
    $callback,
    $string
);
echo $string;
/*
Published("today", 1); Published("yesterday", 1); Published("5 days ago", 1);
*/
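The same output can probably be reached without a callback or lookarounds. This is a minimal sketch, assuming the first argument never contains a double quote; it captures the argument with [^"]+ so that values with spaces such as "5 days ago" are also handled.

```php
<?php
$string = 'Published("today"); Published("yesterday"); Published("5 days ago");';

// Capture everything between the quotes and re-emit it with the extra argument.
$result = preg_replace('/Published\("([^"]+)"\)/', 'Published("$1", 1)', $string);

echo $result, "\n";
```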

PHP Spintax Processor

I've been using the recursive SpinTax processor as seen here, and it works just fine for smaller strings. However, it begins to run out of memory when the string goes beyond 20 KB, and it's becoming a problem.
If I have a string like this:
{Hello|Howdy|Hola} to you, {Mr.|Mrs.|Ms.} {Smith|Williams|Austin}!
and I want to have random combinations of the words put together, and not use the technique as seen in the link above (recursing through the string until there are no more words in curly-braces), how should I do it?
I was thinking about something like this:
$array = explode(' ', $string);
foreach ($array as $k => $v) {
    if ($v[0] == '{') {
        $n_array = explode('|', $v);
        $array[$k] = str_replace(array('{', '}'), '', $n_array[array_rand($n_array)]);
    }
}
echo implode(' ', $array);
But it falls apart when there are spaces in-between the options for the spintax. RegEx seems to be the solution here, but I have no idea how to implement it and have much more efficient performance.
Thanks!

You could write a function that uses a callback to pick one of the possible variants for each block at random:
// Pass in the string for which you'd like a random output
function random($str) {
    // Replaces every { this | or that } block with one of its options
    return preg_replace_callback("/{(.*?)}/", function ($match) {
        // Splits a 'foo|bar' string into an array
        $words = explode("|", $match[1]);
        // Picks a random array entry and returns it
        return $words[array_rand($words)];
    }, $str); // $str is the input string supplied by the caller
}
echo random("{Hello|Howdy|Hola} to you, {Mr.|Mrs.|Ms.} {Smith|Williams|Austin}!"), "\n";
echo random("{This|That} is so {awesome|crazy|stupid}!"), "\n";
echo random("{StackOverflow|StackExchange} solves all of my {problems|issues}."), "\n";

You can use preg_replace_callback() to specify a replacement function.
$str = "{Hello|Howdy|Hola} to you, {Mr.|Mrs.|Ms.} {Smith|Williams|Austin}!";
$replacement = function ($matches) {
    $array = explode("|", $matches[1]);
    return $array[array_rand($array)];
};
$str = preg_replace_callback("/\{([^}]+)\}/", $replacement, $str);
var_dump($str);
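The callback approach above handles a single level of braces. If nested blocks such as {a|{b|c}} are needed, one possible extension is to resolve the innermost blocks repeatedly until none remain. The sketch below assumes braces are balanced and never appear as literal text; spin_nested is a hypothetical name.

```php
<?php
// Hypothetical helper: resolves nested spintax from the innermost braces outward.
function spin_nested($text)
{
    do {
        // [^{}]* cannot span a brace, so only innermost blocks match on each pass.
        // The callback runs once per match, so identical blocks are picked independently.
        $text = preg_replace_callback(
            '/\{([^{}]*)\}/',
            function ($m) {
                $options = explode('|', $m[1]);
                return $options[array_rand($options)];
            },
            $text,
            -1,
            $count
        );
    } while ($count > 0);

    return $text;
}

echo spin_nested('{Hello|{Howdy|Hola} there} to you, {Mr.|Mrs.|Ms.} {Smith|Williams|Austin}!'), "\n";
```

Each pass works on the whole string at once instead of recursing per block, so the number of passes depends on the nesting depth and not on the number of blocks.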

Why does this PHP spintax code repeat identical iterations?

http://ronaldarichardson.com/2011/09/23/recursive-php-spintax-class-3-0/
I like this script, but it isn't perfect. If you use this test input case:
{This is my {spintax|spuntext} formatted string, my {spintax|spuntext} formatted string, my {spintax|spuntext} formatted string example.}
You can see that the result ALWAYS contains 3 repetitions of either "spintax" or "spuntext". It never contains 1 "spintax" and 2 "spuntext", for example.
Example:
This is my spuntext formatted string, my spuntext formatted string, my spuntext formatted string example.
To be truly random it needs to generate a random iteration for each spintax {|} block and not repeat the same selection for identical blocks, like {spintax|spuntext}.
If you look at comment #7 on that page, fransberns is onto something, however when using his modified code in a live environment, the script would repeatedly run in an infinite loop and eat up all the server memory. So there must be a bug there, but I'm not sure what it is.
Any ideas? Or does anyone know of a robust PHP spintax script that allows for nested spintax and is truly random?

Please check this gist; it works, and it is far simpler than the original code.

The Spintax class replaces all instances of {spintax|spuntext} with the same randomly chosen option because of this line in the class:
$str = str_replace($match[0], $new_str, $str);
The str_replace function replaces every instance of the substring in the search string. To replace only the first instance, progressing serially as you wanted, we need to use preg_replace with a limit argument of 1. However, when I looked over your link to the Spintax class and the reference to comment #7, I noticed an error in the suggested augmentation to the Spintax class.
fransberns suggested replacing:
$str = str_replace($match[0], $new_str, $str);
with this:
//one match at a time
$match_0 = str_replace("|", "\|", $match[0]);
$match_0 = str_replace("{", "\{", $match_0);
$match_0 = str_replace("}", "\}", $match_0);
$reg_exp = "/".$match_0."/";
$str = preg_replace($reg_exp, $new_str, $str, 1);
The problem with fransberns' suggestion is that his code did not properly construct the regular expression for the preg_replace function. His error came from not properly escaping the \ character. His replacement code should have looked like this:
//one match at a time
$match_0 = str_replace("|", "\\|", $match[0]);
$match_0 = str_replace("{", "\\{", $match_0);
$match_0 = str_replace("}", "\\}", $match_0);
$reg_exp = "/".$match_0."/";
$str = preg_replace($reg_exp, $new_str, $str, 1);
Consider replacing the original class with this augmented version, which uses my correction of fransberns' suggested replacement:
class Spintax {
    function spin($str, $test = false)
    {
        if (!$test) {
            do {
                $str = $this->regex($str);
            } while ($this->complete($str));
            return $str;
        } else {
            do {
                echo "<b>PROCESS: </b>";
                var_dump($str = $this->regex($str));
                echo "<br><br>";
            } while ($this->complete($str));
            return false;
        }
    }
    function regex($str)
    {
        preg_match("/{[^{}]+?}/", $str, $match);
        // Now spin the first captured string
        $attack = explode("|", $match[0]);
        $new_str = preg_replace("/[{}]/", "", $attack[rand(0, (count($attack) - 1))]);
        // $str = str_replace($match[0], $new_str, $str); // this line was replaced
        $match_0 = str_replace("|", "\\|", $match[0]);
        $match_0 = str_replace("{", "\\{", $match_0);
        $match_0 = str_replace("}", "\\}", $match_0);
        $reg_exp = "/" . $match_0 . "/";
        $str = preg_replace($reg_exp, $new_str, $str, 1);
        return $str;
    }
    function complete($str)
    {
        $complete = preg_match("/{[^{}]+?}/", $str, $match);
        return $complete;
    }
}
When I tried using fransberns' suggested replacement "as is", because of the improper escaping of the \ character, I got an infinite loop. I assume that this is where your memory problem came from. After correcting fransberns' suggested replacement with the correct escaping of the \ character I did not enter an infinite loop.
Try the class above with the corrected augmentation and see if it works on your server (I can't see a reason why it shouldn't).
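One caveat that may be worth checking: the hand-built pattern escapes only |, { and }, so any other regex metacharacter inside a block (for example ?, ., ( or /) can stop the pattern from matching its own source text. That leaves the string unchanged on every pass, which is another way to end up in an endless loop. Also, as far as PHP string syntax goes, "\|" and "\\|" in double quotes both produce a backslash followed by a pipe. A sketch that sidesteps regex construction entirely is to replace the first literal occurrence with strpos and substr_replace. The helper name replace_first is hypothetical and not part of the Spintax class.

```php
<?php
// Hypothetical helper, not part of the original Spintax class.
function replace_first($search, $replace, $subject)
{
    $pos = strpos($subject, $search);
    if ($pos === false) {
        return $subject; // nothing to replace
    }
    // Swap out only the first occurrence; no regex syntax is interpreted.
    return substr_replace($subject, $replace, $pos, strlen($search));
}

// The '?' would be a quantifier in a regex; here it is plain text.
$str = '{a?|b} and {a?|b}';
echo replace_first('{a?|b}', 'b', $str), "\n"; // b and {a?|b}
```

Under that assumption, the last lines of regex() could shrink to a single call such as $str = replace_first($match[0], $new_str, $str);.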

PHP preg_match between text and the first occurrence of -

I'm trying to grab the 12345 out of the following URL using preg_match.
$url = "http://www.somesite.com/directory/12345-this-is-the-rest-of-the-url.html";
$beg = "http://www.somesite.com/directory/";
$close = "\-";
preg_match("($beg(.*)$close)", $url, $matches);
I have tried multiple combinations of . * ? \b
Does anyone know how to extract 12345 out of the URL with preg_match?

Two things: you need preg_quote, and you also need delimiters. Using your construction method:
$url = "http://www.somesite.com/directory/12345-this-is-the-rest-of-the-url.html";
$beg = preg_quote("http://www.somesite.com/directory/", '/');
$close = preg_quote("-", '/');
preg_match("/($beg(.*?)$close)/", $url, $matches);
But I would write the pattern slightly differently:
preg_match('/directory\/(\d+)-/i', $url, $match);
It matches only from the directory part onward, is far more readable, and ensures that you get only digits back (no other characters).
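To make it concrete where the ID ends up in each case, here is a short sketch running both patterns on the URL from the question. Note that the constructed pattern has an outer group, so the ID sits in the second capture there.

```php
<?php
$url = "http://www.somesite.com/directory/12345-this-is-the-rest-of-the-url.html";

// Constructed pattern: group 1 spans prefix through dash, group 2 is the ID.
$beg = preg_quote("http://www.somesite.com/directory/", '/');
$close = preg_quote("-", '/');
if (preg_match("/($beg(.*?)$close)/", $url, $matches)) {
    echo $matches[2], "\n"; // 12345
}

// Shorter pattern: the only group is the ID.
if (preg_match('/directory\/(\d+)-/i', $url, $match)) {
    echo $match[1], "\n"; // 12345
}
```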

This doesn't use preg_match, but it achieves the same thing and is likely to execute faster:
$url = "http://www.somesite.com/directory/12345-this-is-the-rest-of-the-url.html";
$url_segments = explode("/", $url);
$last_segment = array_pop($url_segments);
list($id) = explode("-", $last_segment);
echo $id; // Prints 12345

Too slow, I am ^^.
Well, if you are not stuck on preg_match, here is a fast and readable alternative:
$num = (int)substr($url, strlen($beg));
(Looking at your code, I guessed that the value you are looking for is a numeric ID, as is typical for URLs like that, and will not be "12abc" or anything else.)
