How to make explode() include the exploded character - php

$text = "This is /n my text /n wow";
$quotes = explode('/n',$text);
This would split the string into "This is" "my text" "wow",
but I want it to keep the "/n" separator instead of cutting it off.
The output should look like this:
"This is /n" "my text /n" "wow"

Explode your string into an array and then append the separator onto each element of the resulting array.
$sep = "/n";
$text = "This is /n my text /n wow";
$quotes = explode($sep,$text);
$quotes = array_map(function($val) use ($sep) {
return $val . $sep;
}, $quotes);
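// Remove the separator that was appended to the last element.
// (rtrim() would treat $sep as a list of characters, not as a string.)
$last_key = count($quotes)-1;
$quotes[$last_key] = substr($quotes[$last_key], 0, -strlen($sep));
(You might need to trim($val) as well.)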

If you have only one possible separator then you can simply append it to the tokens that explode returned. However, if you're asking this question, e.g. because you have multiple possible separators and need to know which one separated two tokens, then preg_split might work for you. E.g. for separators ',' and ';':
$matches = preg_split('/(,|;)/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
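A fuller sketch of the preg_split() idea, assuming a made-up input string: with PREG_SPLIT_DELIM_CAPTURE the captured separators are returned alongside the tokens, so each token can be re-joined with the separator that followed it. The variable names $parts and $pieces are only illustrative.

```php
<?php
$text = 'a,b;c'; // sample input, chosen for illustration

// Tokens land at even indexes, captured separators at odd indexes.
$parts = preg_split('/([,;])/', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

$pieces = [];
for ($i = 0, $n = count($parts); $i < $n; $i += 2) {
    // Re-attach the separator that followed this token, if there was one.
    $pieces[] = $parts[$i] . ($parts[$i + 1] ?? '');
}

print_r($pieces);
```

The last token has no separator after it, so it is left unchanged.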

Have you looked into using the preg_split() function? Per the documentation:
preg_split — Split string by a regular expression
Using this function, apply a positive lookbehind so that the pattern matches a space only when it is preceded by the string /n.
$quotes= preg_split("/(?<=\/n) /", $text);
You can check that this does what you want by calling print_r($quotes); after the above statement. The output of print_r will look similar to the following:
Array ( [0] => This is /n [1] => my text /n [2] => wow )
You may need to use trim() on the values to clear off leading and trailing whitespace but overall it seems to do what you're asking.
DEMO:
If you want to test this functionality out, try copying the following code block and pasting it into the CodeSpace window on http://phpfiddle.org.
<?php
$text = "This is /n my text /n wow";
$values = preg_split("/(?<=\/n) /", $text);
print_r($values);
?>
Select the Run - F9 option to see the output. My apologies for the copy and paste demo example. I couldn't figure out how to create a dedicated URL like some of the other fiddle programs.

Related

How can I get the string between HTML comment start and end markers with a regular expression?

I want to get the text between the start and end HTML comments, like
<!--Q1-->
\nフレンチブルドックと遊んでるとき\n
<!--Q1END-->\n
<!--Q2-->
\n表参道、新宿、銀座\n
<!--Q2END-->\n
<!--Q3-->
\nヒューマンドラマ全般が好きです。<BR>\n<BR>\n好きなアーティスト サザンオールスターズ\n
<!--Q3END-->
I want to get it as an array like this:
$data = [
1 => 'フレンチブルドックと遊んでるとき',
2 => '表参道、新宿、銀座',
3 => 'ヒューマンドラマ全般が好きです。<BR>\n<BR>\n好きなアーティスト サザンオールスターズ'
];
So how can I find the text between the HTML comments?
Thanks in advance.
Here's a regex that would get you what you want for the above string:
/<!--Q(\d)-->\n\\n(.*)\\n\n<!--Q\1END-->/gs
(Note: This removes the literal '\n' before and after each of the strings you want since this is what you have above, but if the strings don't have this, it won't match either.)
To put that into PHP remember you have to double escape the literal backslashes. Unfortunately it's quite ugly to keep track of all the newlines and literal '\n' strings (at least to me).
preg_match_all('/<!--Q(\d)-->\n\\\\n(.*)\\\\n\n<!--Q\1END-->/s', $text, $matches);
print_r($matches[2]);
Or if you want something more readable, you can remove the literal '\n' strings from the input text, match everything between the HTML comments and then trim it:
// Remove all literal '\n' strings from the text
$text = preg_replace('#\\\\n#', '', $text);
// Match desired strings
preg_match_all('/<!--Q(\d)-->(.*)<!--Q\1END-->/s', $text, $matches);
// Trim all desired strings
$output = array_map('trim', $matches[2]);
To get literally what you want, lookarounds are a good option:
(?<=<!--([A-Z]\d)-->)[\s\S]*?(?=<!--\1END-->)
Caveat: this works only as long as your comment keys (e.g. Q1) stay within A0-Z9. You cannot simply use [A-Z]\d+ instead, since PCRE, the regex engine PHP uses, does not support unbounded quantifiers inside lookbehinds.
Otherwise, I recommend using a capture group like this:
<!--([A-Z]\d+)-->([\s\S]*?)<!--\1END-->
Use it in your code like this:
$re = '/<!--([A-Z]\d+)-->([\s\S]*?)<!--\1END-->/s';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
To get rid of the newlines, just use trim(). There are several ways to apply it, e.g. a foreach, array_map, etc.
$result = [];
foreach ($matches as $match){
$result[] = trim($match[2]);
}
var_dump($result);
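The question asks for an array keyed by the question number (1 => ..., 2 => ...), while the snippets above produce a plain list. A minimal sketch of one way to get the keys, assuming numeric keys after "Q" and a made-up sample input without the literal '\n' text: capture the number and pair it with the trimmed content using array_combine().

```php
<?php
// Sample input, invented for illustration.
$str = "<!--Q1-->\nfirst answer\n<!--Q1END-->\n<!--Q2-->\nsecond answer\n<!--Q2END-->";

// Group 1 is the question number, group 2 the text between the comments.
preg_match_all('/<!--Q(\d+)-->(.*?)<!--Q\1END-->/s', $str, $matches);

// Keys come from the captured numbers, values from the trimmed text.
$data = array_combine(
    array_map('intval', $matches[1]),
    array_map('trim', $matches[2])
);

print_r($data);
```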

preg replace would ignore non-letter characters when detecting words

I have an array of words and a string, and I want to add a hashtag to every word in the string that has a match in the array. I use this loop to find and replace the words:
foreach($testArray as $tag){
$str = preg_replace("~\b".$tag."~i","#\$0",$str);
}
Problem: let's say I have the words "is" and "isolated" in my array. I will get ##isolated in the output, because the word "isolated" is matched once for "is" and once for "isolated". The pattern ignores the fact that "#isolated" no longer starts with "is"; it starts with "#".
Here is an example, but it is only an example: I don't want to solve just this case but every other possibility as well:
$str = "this is isolated is an example of this and that";
$testArray = array('is','isolated','somethingElse');
Output will be:
this #is ##isolated #is an example of this and that
You may build a regex with an alternation group enclosed with word boundaries on both ends and replace all the matches in one pass:
$str = "this is isolated is an example of this and that";
$testArray = array('is','isolated','somethingElse');
echo preg_replace('~\b(?:' . implode('|', $testArray) . ')\b~i', '#$0', $str);
// => this #is #isolated #is an example of this and that
The regex will look like
~\b(?:is|isolated|somethingElse)\b~i
If you want to make your original approach work, you can add a negative lookbehind after \b: "~\b(?<!#)".$tag."~i","#\$0". The lookbehind rejects any match that is preceded by #.
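One caveat with building the alternation by concatenation: the words go into the pattern verbatim, so a word containing a regex metacharacter (such as a dot) would change the meaning of the pattern. A minimal sketch, assuming a sample word list that includes "node.js", escapes each word with preg_quote() first:

```php
<?php
// Sample data, invented for illustration.
$str = 'this is isolated and node.js is not nodexjs';
$testArray = ['is', 'isolated', 'node.js'];

// Escape regex metacharacters (and the ~ delimiter) in every word.
$quoted = array_map(function ($tag) {
    return preg_quote($tag, '~');
}, $testArray);

$pattern = '~\b(?:' . implode('|', $quoted) . ')\b~i';
$result = preg_replace($pattern, '#$0', $str);

echo $result;
```

Without the escaping, the dot in "node.js" would match any character, so "nodexjs" would be tagged as well.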
Another way is to split the string at word boundaries and build an associative array from your original array of words (to avoid calling in_array):
$str = "this is isolated is an example of this and that";
$testArray = array('is','isolated','somethingElse');
$hash = array_flip(array_map('strtolower', $testArray));
$parts = preg_split('~\b~', $str);
for ($i=1; $i<count($parts); $i+=2) {
$low = strtolower($parts[$i]);
if (isset($hash[$low])) $parts[$i-1] .= '#';
}
$result = implode('', $parts);
echo $result;
This way, your string is processed only once, whatever the number of words in your array.

How to find which string occurs first in a text among multiple strings?

I have text like this: "wow! It's Amazing.". I need to split this text on either "!" or "." and show the first element of the resulting array (for example $text[0]).
$str="wow! it's, a nice product.";
$text= preg_split('/[!.]+/', $str);
Here $text[0] holds only the value "wow". But I want to know which delimiter occurs first in the text ("!" or "."), so that I can append it to $text[0] and display "wow!".
I want to use this preg_split in Smarty templates.
<p>{assign var="desc" value='/[!.]+/'|preg_split:'wow! it's, a nice product.'}
{$desc[0]}.</p>
The above code displays "wow". As far as I have been able to find, there is no preg_match in Smarty; otherwise I would use that.
Any help would be appreciated. Thanks in advance.
Instead of preg_split you should use preg_match:
$str="wow! it's, a nice product.";
if ( preg_match('/^[^!.]+[!.]/', $str, $m) )
$s = $m[0]; //=> wow!
If you must use preg_split only then you can do:
$arr = preg_split('/([^!.]+[!.])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
$s = $arr[0]; //=> wow!
Try this:
/(.+[!.])(.+)/
It will split the string in two:
$1 => wow!
$2 => it's, a nice product.
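Note that the greedy .+ in that pattern only stops at the first delimiter because the sample has exactly two of them, one at the very end. With more sentences it would run up to a later delimiter. A minimal sketch with a lazy quantifier, assuming a made-up three-sentence input:

```php
<?php
$str = "wow! it's nice. really."; // sample input, chosen for illustration

$first = '';
$rest = '';
// .+? expands only until the first "!" or "." is reached.
if (preg_match('/^(.+?[!.])(.*)$/s', $str, $m)) {
    $first = $m[1];        // text up to and including the first delimiter
    $rest  = ltrim($m[2]); // remainder, without the leading space
}

echo $first, "\n", $rest, "\n";
```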

Only last element of array being used when replacing text

I am trying to replace some "common" words from a large block of text, however it's only using the last word from the array, please can you see where I'm going wrong?
Thanks
$glue = strtolower ($glue);//make all lower case
//remove common words
$Maffwordlist = array('the','to','for');
foreach($Maffwordlist as $Maffword)
$filtered = preg_replace("/\s". $Maffword ."\s/", " ", $glue);
The extract above only removes 'for' from the text, 'the' and 'to' are still included.
Any help appreciated.
The problem is that the subject of your preg_replace() is always $glue, which itself never changes. Before iterating your list of words, you need to assign the starting contents of $glue into $filtered since that is what you are acting on in order to accumulate all the values into it.
// $filtered is the string you'll be modifying...
$filtered = strtolower ($glue);//make all lower case
$Maffwordlist = array('the','to','for');
foreach($Maffwordlist as $Maffword) {
$filtered = preg_replace("/\s". $Maffword ."\s/", " ", $filtered);
}
But we can do better.
A regular expression can be constructed to handle all the replacements without a loop, using an (a|b|c) grouping.
// Stick the words together with pipes
$pattern = implode("|", $Maffwordlist);
// And surround with regex delimiters and ()
// so the whole regex looks like /\s(the|to|for)\s/
$pattern = '/\s(' . $pattern . ')\s/';
// And do the operation in one go:
$filtered = preg_replace($pattern, " ", $filtered);
Note that you may wish to use \b word boundaries instead of requiring whitespace (\s) on both sides. That way, you would get proper replacements in a sentence like "You should not end a sentence with for." where one of your list words appears but is not surrounded by whitespace.
Finally then, you'll end up with multiple consecutive spaces in some places where replacements have taken place. You can collapse those into single spaces with something like the following.
// Replace multiple spaces with a single space
$filtered = preg_replace('/\s+/', ' ', $filtered);
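Putting the pieces together, here is a minimal end-to-end sketch that uses \b word boundaries and preg_quote() on each word. The sample sentence is invented, and the final trim() is an extra step not mentioned above, added to drop a space left at the end of the string.

```php
<?php
$glue = 'You should not go to the shop for the bread, what for'; // sample text
$Maffwordlist = ['the', 'to', 'for'];

// Escape each word, then build /\b(?:the|to|for)\b/.
$quoted = array_map(function ($word) {
    return preg_quote($word, '/');
}, $Maffwordlist);
$pattern = '/\b(?:' . implode('|', $quoted) . ')\b/';

// Remove the words, then collapse the leftover runs of whitespace.
$filtered = preg_replace($pattern, '', strtolower($glue));
$filtered = trim(preg_replace('/\s+/', ' ', $filtered));

echo $filtered;
```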

Php regexp for escaping characters

I have a string that the user may split manually using commas.
For example, the string value1,value2,value3 should result in the array:
["value1", "value2", "value3"]
Now what if the user wishes to include a comma inside a value? I would like to solve that problem by letting the user escape a comma using two commas or a backslash. For example, the string
"Hi, Stackoverflow" would be written as "Hi,, Stackoverflow" or "Hi\, Stackoverflow".
I find it difficult to evaluate such a string, however. I have attempted splitting with preg_split, but there is no way to tell whether a lookbehind or lookahead run of characters has an even or odd length. Furthermore, the backslashes and double commas used for escaping must be removed as well, which probably requires an additional replace step.
$text = 'Hello, World \,asdas, 123';
$data = preg_split('/(?<=[^\\\]),/',$text);
print_r($data);
Result
Array ( [0] => Hello [1] => World \,asdas [2] => 123 )
For this I would run preg_replace_callback, which allows you to count the escape characters used and decide what to do with them. If it turns out that a comma is not escaped, replace it with some non-printable character that the user should not use in their input, and then explode by this character:
<?php
$str = "One,Two\\, Two\\\\,Three";
$delimiter = chr(0x0B); // vertical tab, hope you do not expect it in the input?
$escaped = preg_replace_callback('/(\\\\)*,?/', function($m) use($delimiter){
if(!isset($m[1]) || strlen($m[0])%2) {
return str_replace(',',$delimiter,preg_replace('/\\\\{2}/','\\',$m[0]));
} else {
return str_replace('\\,',',', preg_replace('/\\\\{2}/','\\',$m[0]));
}
}, $str);
$array = explode($delimiter, $escaped);
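If reserving a placeholder character is not acceptable, a small character scanner is another option. The sketch below is not the approach from the answer above: it assumes only the backslash form of escaping (not the double comma), and split_escaped() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: splits on commas, treating a backslash as an
// escape for the next character (so "\," is a literal comma).
function split_escaped(string $input): array
{
    $fields = [];
    $current = '';
    $length = strlen($input);
    for ($i = 0; $i < $length; $i++) {
        $char = $input[$i];
        if ($char === '\\' && $i + 1 < $length) {
            // Escaped character: copy it literally and drop the backslash.
            $current .= $input[++$i];
        } elseif ($char === ',') {
            // Unescaped comma ends the current field.
            $fields[] = $current;
            $current = '';
        } else {
            $current .= $char;
        }
    }
    $fields[] = $current;
    return $fields;
}

print_r(split_escaped('One,Two\\, Two\\\\,Three'));
```

Because each escape is consumed together with the character it protects, there is no need to count whether a run of backslashes is even or odd.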
