reorder / rewrap bbcodes - php

I'm trying to reorder the BBCodes but I failed
so
[̶b̶]̶[̶i̶]̶[̶u̶]̶f̶o̶o̶[̶/̶b̶]̶[̶/̶u̶]̶[̶/̶i̶]̶ ̶-̶ ̶w̶r̶o̶n̶g̶ ̶o̶r̶d̶e̶r̶ ̶ ̶
I̶ ̶w̶a̶n̶t̶ ̶i̶t̶ ̶t̶o̶ ̶b̶e̶:̶ ̶
̶[̶b̶]̶[̶i̶]̶[̶u̶]̶f̶o̶o̶[̶/̶u̶]̶[̶/̶i̶]̶[̶/̶b̶]̶ ̶-̶ ̶r̶i̶g̶h̶t̶ ̶o̶r̶d̶e̶r̶
PIC:
I tried with
<?php
$string = '[b][i][u]foo[/b][/u][/i]';
$search = array('/\[b](.+?)\[\/b]/is', '/\[i](.+?)\[\/i]/is', '/\[u](.+?)\[\/u]/is');
$replace = array('[b]$1[/b]', '[i]$1[/i]', '[u]$1[/u]');
echo preg_replace($search, $replace, $string);
?>
OUTPUT: [b][i][u]foo[/b][/u][/i]
any suggestions ? thanks!

phew, spent awhile thinking of the logic to do this. (feel free to put it in a function)
this only works for the scenario given. Like other users have commented it's impossible. You shouldn't be doing this. Or even on server side. I'd use a client side parser just to throw a syntax error.
supports [b]a[i]b[u]foo[/b]baa[/u]too[/i]
and bbcode with custom values [url=test][i][u]foo[/url][/u][/i]
Will break with
[b] bold [/b][u] underline[/u]
And [b] bold [u][/b] underline[/u]
//input string to be reorganized
$string = '[url=test][i][u]foo[/url][/u][/i]';
echo $string . "<br />";
//search for all opentags (including ones with values
$tagsearch = "/\[([A-Za-z]+)[A-Za-z=._%?&:\/-]*\]/";
preg_match_all($tagsearch, $string, $tags);
//search for all close tags to store them for later
$closetagsearch = "/(\[\/([A-Za-z]+)\])/is";
preg_match_all($closetagsearch, $string, $closetags);
//flip the open tags for reverse parsing (index one is just letters)
$tags[1] = array_reverse($tags[1]);
//create temp var to store new ordered string
$temp = "";
//this is the last known position in the original string after a match
$last = 0;
//iterate through each char of the input string
for ($i = 0, $len = strlen($string); $i < $len; $i++) {
//if we run out of tags to replace/find stop looping
if (empty($tags[1]) || empty($closetags[1]))
continue;
//this is the part of the string that has no matches
$good = substr($string, $last, $i - $last);
//next closing tag to search for
$next = $closetags[1][0];
//how many chars ahead to compare against
$scope = substr($string, $i, strlen($next));
//if we have a match
if ($scope === "$next") {
//add to the temp variable with a modified
//version of an open tag letter to become a close tag
$temp .= $good . substr_replace("[" . $tags[1][0] . "]", "/", 1, 0);
//remove the first key/value in both arrays
array_shift($tags[1]);
array_shift($closetags[1]);
//update the last known unmatched char
$last += strlen($good . $scope);
}
}
echo $temp;
Please also note: it might be the users intention to nest the tags out of order :X

Related

Get range letter and number with php

how can I get range for bottom string in php?
M0000001:M0000100
I want result
M0000001
M0000002
M0000003
..
..
..
M0000100
this is what i do
<?php
$string = "M0000001:M0000100";
$explode = explode(":",$string );
$text_one = $explode[0];
$text_two = $explode[1];
$range = range($text_one,$text_two);
print_r($range);
?>
So can anyone help me with this?
This is one of many ways you could do this and this is a little verbose but hopefully it shows you some "steps" to take.
It doesn't check for the 1st number being bigger than the 2nd.
It doesn't check your Range strings start with a "M".
It doesn't have all of the required comments.
Those are things for you to consider and work out...
<?php
$string = "M00000045:M000099";
echo generate_range_from_string($string);
function generate_range_from_string($string) {
// First explode the two strings
$explode = explode(":", $string);
$text_one = $explode[0];
$text_two = $explode[1];
// Remove the Leading Alpha character
$range_one = str_replace('M', '', $text_one);
$range_two = str_replace('M', '', $text_two);
$padding_length = strlen($range_one);
// Build the output string
$output = '';
for ( $index = (int) $range_one; $index <= (int) $range_two; $index ++ ) {
$output .= 'M' . str_pad($index, $padding_length, '0', STR_PAD_LEFT) . '<br>';
}
return $output;
}
The output lists a String in the format you have specified in the question. So this is based solely upon that.
This could undergo a few more revisions to make it more function like, as I'm sure some folks will pick out!

PHP get words from a word using pspell_check

I had a PHP string which contains English words. I want to extract all the possible words from the string, not by explode() by space as I have only a word. I mean extraction of words from a word.
Example: With the word "stackoverflow", I need to extract stack, over, flow, overflow all of them.
I am using pspell_check() for spell checking. I am currently getting the following combination.
--> sta
--> stac
--> stack
and so on.
So I found the only the words matching stack but I want to find the following words. Notice that I don't want the final word as I've already.
--> stack
--> over
--> flow
My Code:
$myword = "stackoverflow";
$word_length = strlen($myword);
$myword_prediction = $myword[0].$myword[1];
//(initial condition as words detection starts after 3rd index)
for ($i=2; $i<$word_length; $i++) {
$myword_prediction .= $myword[$i];
if (pspell_check(pspell_new("en"), $myword_prediction))
{
$array[] = $myword_prediction;
}
}
var_dump($array);
How about if you have an outer loop like this. The first time through you start at the first character of $myword. The second time through you start at the second character, and so on.
$myword = "stackoverflow";
$word_length = strlen($myword);
$startLetter = 0;
while($startLetter < $word_length-2 ){
$myword_prediction = $myword[$startLetter] . $myword[$startLetter +1];
for ($i=$startLetter; $i<$word_length; $i++) {
$myword_prediction .= $myword[$i];
if (pspell_check(pspell_new("en"), $myword_prediction)) {
$array[] = $myword_prediction;
}
}
$startLetter ++;
}
Well, you would need to get all substrings, and check each one:
function get_all_substrings($input){
$subs = array();
$length = strlen($input);
for($i=0; $i<$length; $i++){
for($j=$i; $j<$length; $j++){
$subs[] = substr($input, $i, $j);
}
}
return array_unique($subs);
}
$substrings = get_all_substrings("stackoverflow");
$pspell_link = pspell_new("en");
$words = array_filter($substrings, function($word) use ($pspell_link) {
return pspell_check($pspell_link, $word);
});
var_dump($words);

PHP Explode Show Seperator

So I wrote the following code to show the words after the fourth full stop / period in a sentence.
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
$minText = explode(".", $text);
for($i = $limit; $i < count($minText); $i++){
echo $minText[$i];
}
The algorithm is working and it is showing me the rest of the sentence after the fourth "." full stop / period.... My problem is that the output is not showing the full stops in the sentence therefore it is showing me just text without the proper punctuation "." .... Can someone please help me out on how to fix the code to display also the full stops / periods ??
Thanks a lot
you could try this...
for($i = $limit; $i < count($minText); $i++){
echo $minText[$i].".";
}
notice the added period at the end of the echo command // .".";
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
$minText = explode(".", $text);
for($i = $limit; $i < count($minText); $i++){
echo $minText[$i].".";
}
Instead of splitting the input string and then iterating over it, you can find the nth position of the separator (.) in the string by using strpos() function by changing the offset parameter.
Then, it is just the matter of printing the sub-string from the position we just determined.
<?php
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
$pos = 0;
//find the position of 4th occurrence of dot
for($i = 0; $i < $limit; $i++) {
$pos = strpos($text, '.', $pos) + 1;
}
print substr($text, $pos);
If desired output is "seperated.with.full.stops.", then you can use:
<?php
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
$minText = explode(".", $text);
$minText = array_slice($minText, $limit);
echo implode('.', $minText) . '.';
If you want to break it up on the periods between words, but keep the one at the end as actual punctuation, you may want to use preg_replace() to convert the periods to another character and then explode it.
$text = "this.is.the.message.seperated.with.full.stops.";
$limit = 4;
//replace periods if they are follwed by a alphanumeric character
$toSplit = preg_replace('/\.(?=\w)/', '#', $text);
$minText = explode("#", $toSplit);
for($i = $limit; $i < count($minText); $i++){
echo $minText[$i] . "<br/>";
}
Which Yields
seperated
with
full
stops.
Of course, if you just simply want to print all the full stops, then add them in after you echo the term.
echo $minText[$i] . ".";

PHP: Shift characters of string by 5 spaces? So that A becomes F, B becomes G etc

How can I shift characters of string in PHP by 5 spaces?
So say:
A becomes F
B becomes G
Z becomes E
same with symbols:
!##$%^&*()_+
so ! becomes ^
% becomes )
and so on.
Anyway to do this?
The other answers use the ASCII table (which is good), but I've got the impression that's not what you're looking for. This one takes advantage of PHP's ability to access string characters as if the string itself is an array, allowing you to have your own order of characters.
First, you define your dictionary:
// for simplicity, we'll only use upper-case letters in the example
$dictionary = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
Then you go through your input string's characters and replace each of them with it's $position + 5 in the dictionary:
$input_string = 'STRING';
$output_string = '';
$dictionary_length = strlen($dictionary);
for ($i = 0, $length = strlen($input_string); $i < $length; $i++)
{
$position = strpos($dictionary, $input_string[$i]) + 5;
// if the searched character is at the end of $dictionary,
// re-start counting positions from 0
if ($position > $dictionary_length)
{
$position = $position - $dictionary_length;
}
$output_string .= $dictionary[$position];
}
$output_string will now contain your desired result.
Of course, if a character from $input_string does not exist in $dictionary, it will always end up as the 5th dictionary character, but it's up to you to define a proper dictionary and work around edge cases.
Iterate over characters and, get ascii value of each character and get char value of the ascii code shifted by 5:
function str_shift_chars_by_5_spaces($a) {
for( $i = 0; $i < strlen($a); $i++ ) {
$b .= chr(ord($a[$i])+5);};
}
return $b;
}
echo str_shift_chars_by_5_spaces("abc");
Prints "fgh"
Iterate over string, character at a time
Get character its ASCII value
Increase by 5
Add to new string
Something like this should work:
<?php
$newString = '';
foreach (str_split('test') as $character) {
$newString .= chr(ord($character) + 5);
}
echo $newString;
Note that there is more than one way to iterate over a string.
PHP has a function for this; it's called strtr():
$shifted = strtr( $string,
"ABCDEFGHIJKLMNOPQRSTUVWXYZ",
"FGHIJKLMNOPQRSTUVWXYZABCDE" );
Of course, you can do lowercase letters and numbers and even symbols at the same time:
$shifted = strtr( $string,
"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789!##$%^&*()_+",
"FGHIJKLMNOPQRSTUVWXYZABCDEfghijklmnopqrstuvwxyzabcde5678901234^&*()_+!##$%" );
To reverse the transformation, just swap the last two arguments to strtr().
If you need to change the shift distance dynamically, you can build the translation strings at runtime:
$shift = 5;
$from = $to = "";
$sequences = array( "ABCDEFGHIJKLMNOPQRSTUVWXYZ", "abcdefghijklmnopqrstuvwxyz",
"0123456789", "!##$%^&*()_+" );
foreach ( $sequences as $seq ) {
$d = $shift % strlen( $seq ); // wrap around if $shift > length of $seq
$from .= $seq;
$to .= substr($seq, $d) . substr($seq, 0, $d);
}
$shifted = strtr( $string, $from, $to );

PHP: trim word OR part of it from begining/end of string

I need to trim words from begining and end of string. Problem is, sometimes the words can be abbreviated ie. only first three letters (followed by dot).
I tried hard to find suitable regular expression. Basicaly I need to chatch three or more initial characters up to length of replacement, but I cannot find regular expression, that will match variable length and will keep order of characters.
For example, if I need to trim 'insurance' from sentence 'insur. companies are rich', then pattern \^[insurance]{3,9}\ comes to my mind, but this pattern will also catch words like 'sensace', because order of characters (and their occurance) inside [] is not important for regexp.
Also, at end of string, I need remove serial-numbers, that are abbreviated from beginig - say 'XK-25F14' is sometimes presented as '25F14'. So I decided to go purely with character by character comparison.
Therefore I end with following php function
function trimWords($s, $dirt, $case_insensitive = false, $reverse = true)
{
$pos = 0;
$func = $case_insensitive ? 'strncasecmp' : 'strncmp';
// Get number of initial characters, that match in both strings
while ($func($s, $dirt, $pos + 1) === 0)
$pos++;
// If more than 2 initial characters match, then remove the match
if ($pos > 2)
$s = substr($s, $pos);
// Reverse $s and $dirt so it will trim from the end of string
$s = strrev($s);
if ($reverse)
return trimWords($s, strrev($dirt), $case_insensitive, false);
// After second run return back-reversed string
return trim($s, ' .-');
}
I'm happy with this function, but it has one drawback. It trims only one occurence of word. How to make it trim more occurances, i.e. remove both 'insurance ' from 'Insurance insur. companies'.
And I'm also curious, it realy does not exists such regular expression, that will match variable length and will respect order of characters in pattern?
Final solution
Thanks to mrhobo I have ended with function based on regular expression. This function can be easily improved and shall also be the most efficient for this task.
I have modified my previous function and it is two times quicker than regexp, but it can remove only one word per single run, so to be able to remove word from begin and end, it has to runs itself twice and performance is same as regexp and to remove more than one occurance of word, it has to runs itself multiple times, which will then be more and more slower.
The final function goes like this.
function trimWords($string, $word, $case_insensitive = false, $min_abbrv = 3)
{
$exc = substr($word, $min_abbrv);
$pat = null;
$i = strlen($exc);
while ($i--)
$pat = '(?>'.preg_quote($exc[$i], '#').$pat.')?';
$pat = substr($word, 0, $min_abbrv).$pat;
$pat = '#(?<begin>^)?(?:\W*\b'.$pat.'\b\W*)+(?(begin)|$)#';
if ($case_insensitive)
$pat .= 'i';
return preg_replace($pat, '', $string);
}
NOTE: with this function, it does not matter, if abbreviation ends with dot or not, it wipes out any shorter form of word and also removes all nonword characters around the word.
EDIT: I just tried create replace pattern like insu(r|ra|ran|ranc|rance) and function with atomic groups is faster by ~30% and with longer words it could be possibly even more efficient.
Matching a word and all possible abbreviations from the nth letter isn't quite an easy task in regex.
Here is how I would do it for the word insurance from the 4th letter:
insu(?>r(?>a(?>n(?>c(?>(?<last>e))?)?)?)?)?(?(last)|\.)
http://regex101.com/r/aL2gV4
It works by using atomic groups to force the regex engine as far as possible forward past the last 'rance' letters using the nested pattern (?>a(?>b)?)?. If the last letter letter is matched we're not dealing with an abbreviation thus no dot is required, otherwise the dot is required. This is coded by (?(last)|\.).
To trim, I would create a function to build the above regex for an abbreviation. Then you can write a while loop that replaces each of the abbreviation regexes with empty space until there are no more matches.
Non regex version
Here is my non regex version that removes multiple words and abbreviated words from a string:
function trimWords($str, $word, $min_abbrv, $case_insensitive = false) {
$len = 0;
$word_len = strlen($word);
$strlen = strlen($str);
$cmp = $case_insensitive ? strncasecmp : strncmp;
for ($i = 0; $i < $strlen; $i++) {
if ($cmp($str[$i], $word[$len], $i) == 0) {
$len++;
} else if ($len > 0) {
if ($len == $word_len || ($len >= $min_abbrv && ($dot = $str[$i] == '.'))) {
$i -= $len;
$len += $dot;
$str = substr($str, 0, $i) . substr($str, $i+$len);
$strlen = strlen($str);
$dot = 0;
}
$len = 0;
}
}
return $str;
}
Example:
$string = 'ins. <- "ins." / insu. insuranc. insurance / insurance. <- "."';
echo trimWords($string, 'insurance', 4);
Output is:
ins. <- "ins." / / . <- "."
I wrote function that constructs regular expression pattern according to mrhobo and also simple test and benchmarked it against my function with pure PHP string comparison.
Here is the code:
$string = 'Insur. companies are nasty rich';
$dirt = 'insurance';
$cycles = 500000;
$start = microtime(true);
$i = $cycles;
while ($i) {
$i--;
regexpStyle($string, $dirt, true);
}
$stop = microtime(true);
$i = $cycles;
while ($i) {
$i--;
trimWords($string, $dirt, true);
}
$end = microtime(true);
$res1 = $stop - $start;
$res2 = $end - $stop;
$winner = $res1 < $res2 ? '<<<' : '>>>';
echo 'regexp: '.$res1.' '.$winner.' string operations: '.$res2;
function trimWords($s, $dirt, $case_insensitive = false, $reverse = true)
{
$pos = 0;
$func = $case_insensitive ? 'strncasecmp' : 'strncmp';
// Get number of initial characters, that match in both strings
while ($func($s, $dirt, $pos + 1) === 0)
$pos++;
// If more than 2 initial characters match, then remove the match
if ($pos > 2)
$s = substr($s, $pos);
// After second run return back-reversed string
return trim($s, ' .-');
}
function regexpStyle($s, $dirt, $case_insensitive, $min_abbrev = 3)
{
$ss = substr($dirt, $min_abbrev);
$arr = str_split($ss);
$patt = '(?>(?<last>'.array_pop($arr).'))?';
$i = count($arr);
while ($i)
$patt = '(?>'.$arr[--$i].$patt.')?';
$patt = '#^'.substr($dirt, 0, $min_abbrev).$patt.'(?(last)|\.)#';
$patt .= $case_insensitive ? 'i' : null;
return trim(preg_replace($patt, '', $s));
}
and the winner is... moment of silence... it is...
a draw
regexp: 8.5169589519501 >>> string operations: 8.0951890945435
but I have strong feeling that regexp approach could be better utilized.

Categories