I am looking to convert a string with a special HTML tag and parse it accordingly. Below I will show what the original string is followed by what I want the parsed string to be. If someone can direct me towards a proper coding method to make this possible that would be fantastic.
Original String:
$string = '<string 1="Jacob" 2="ice cream">{1} likes to have a lot of {2}.</string>';
Parsed String:
$parsed_string = 'Jacob likes to have a lot of ice cream.';
EDIT:
I forgot to add that the $string variable may contain multiple <string> tags, each with multiple options. For example, the $string variable could be the following:
$string = '<string 1="hot dog">I like to have {1}</string> on <string 1="beach" 2="sun">the {1} with the blazing hot {2} staring down at me.</string>';
I need a solution that can parse the code example above.
EDIT 2:
Here is some sample code I wrote; it is incomplete and has a few bugs. If there is more than one option, e.g. 1='blah' 2='blahblah', it will not parse the second option.
$string = '<phrase 1="Jacob" 2="cool">{1} is {2}</phrase> when <phrase 1="John" 2="Chris">{1} and {2} are around.</phrase>';
preg_match_all('/<phrase ([0-9])="(.*?)">(.*?)<\/phrase>/', $string, $matches);
print $matches[1][0] . '<br />';
print $matches[2][0] . '<br />';
print $matches[3][0] . '<br />';
print '<hr />';
$string = $matches[3][0];
print str_replace('{' . $matches[1][0] . '}', $matches[2][0], $string);
print '<hr />';
print '<pre>';
print_r($matches);
print '</pre>';
As $string is not valid XML (for example, it uses numbers as attribute names), you may try:
$string = '<string 1="Jacob" 2="ice cream">{1} likes to have a lot of {2}.</string>';
$parsed_string = strip_tags($string);
for ($i = 1; $i <= 2; $i++) {
    if (preg_match('/' . $i . '="([^"]+)"/', $string, $match)) {
        $parsed_string = str_replace('{' . $i . '}', $match[1], $parsed_string);
    }
}
echo $parsed_string;
UPDATE
Your EDIT switched from having one <string> tag to having multiple <string> tags in the variable now. This one should work for multiples:
$string2 = '<string 1="hot dog">I like to have {1}</string> on <string 1="beach" 2="sun">the {1} with the blazing hot {2} staring down at me.</string>';
$parsed_string2 = '';
$a = explode('</string>', $string2);
foreach ($a as $s) {
    $parsed_elm = strip_tags($s);
    for ($i = 1; $i <= 2; $i++) {
        if (preg_match('/' . $i . '="([^"]+)"/', $s, $match)) {
            $parsed_elm = str_replace('{' . $i . '}', $match[1], $parsed_elm);
        }
    }
    $parsed_string2 .= $parsed_elm;
}
echo $parsed_string2;
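The loops above hard-code the placeholder numbers 1 and 2. As a minimal sketch of a more general variant, the following reads whatever attributes each tag carries and substitutes them in one pass. The function name expandStringTags is hypothetical, and the sketch assumes the tags are never nested and that attribute values never contain a double quote; it is a regex shortcut, not a real markup parser.

```php
<?php
// Hypothetical helper: expands every <string ...>...</string> block in $input.
function expandStringTags(string $input): string
{
    return preg_replace_callback(
        '/<string\b([^>]*)>(.*?)<\/string>/s',
        function (array $m): string {
            // Collect every name="value" pair from the opening tag.
            preg_match_all('/(\w+)="([^"]*)"/', $m[1], $pairs, PREG_SET_ORDER);
            $map = [];
            foreach ($pairs as $pair) {
                $map['{' . $pair[1] . '}'] = $pair[2];
            }
            // strtr() swaps all placeholders in a single pass over the text.
            return strtr($m[2], $map);
        },
        $input
    );
}

$string = '<string 1="hot dog">I like to have {1}</string> on '
        . '<string 1="beach" 2="sun">the {1} with the blazing hot {2} staring down at me.</string>';
echo expandStringTags($string);
```

Text outside the tags (here " on ") is left untouched because only the matched tag blocks are replaced.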
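<?php
$rows = array();
// Single quotes, so the double quotes inside the markup need no escaping.
$xml = '
<string 1="Jacob" 2="ice cream">{1} likes to have a lot of {2}.</string>
<string 1="John" 2="cream">{1} likes to have a lot of {2}.</string>
';
$parser = xml_parser_create();
// Note: attribute names that start with a digit are not valid XML,
// so the parser is likely to reject this input as it stands.
xml_parse_into_struct($parser, trim($xml), $xml_values);
foreach ($xml_values as $row) {
    if (!isset($row['value'], $row['attributes'])) {
        continue;
    }
    $finalRow = $row['value'];
    foreach ($row['attributes'] as $att => $attval) {
        // str_replace() takes (search, replace, subject).
        $finalRow = str_replace("{" . $att . "}", $attval, $finalRow);
    }
    $rows[] = $finalRow;
}
?>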
Here's a version that doesn't use regex; this seemed more straightforward. I don't know how the XML parser would cope with attributes that start with a number, though.
I have a search string $str (something like "test"), a wrap string $wrap (something like "|"), and a text string $text (something like "This is a test Text").
$str occurs at least once in $text. What I want is a function that wraps each occurrence of $str with the markup defined in $wrap and outputs the modified text, even if $str occurs more than once in $text.
It should not output the whole text, though, only the 1-2 words before $str, the 1-2 words after it, and "..." (only where $str is not the first or last word). It should also be case-insensitive.
Example:
$str = "Text"
$wrap = "<span>|</span>"
$text = "This is a really long Text where the word Text appears about 3 times Text"
Output would be:
"...long <span>Text</span> where...word <span>Text</span> appears...times <span>Text</span>"
My code (obviously it doesn't work):
$tempar = preg_split("/$str/i", $text);
if (count($tempar) <= 2) {
$result = "... ".substr($tempar[0], -7).$wrap.substr($tempar[1], 7)." ...";
} else {
$amount = substr_count($text, $str);
for ($i = 0; $i < $amount; $i++) {
$result = $result.".. ".substr($tempar[$i], -7).$wrap.substr($tempar[$i+1], 0, 7)." ..";
}
}
If you have a tip or a solution, don't hesitate to let me know.
I have taken your approach and made it more flexible. If $str or $wrap changes you could have escaping issues within the regex pattern so I have used preg_quote.
Note that I added $placeholder to make it clearer, but you can use $placeholder = "|" if you don't like [placeholder].
function wrapInString($str, $text, $element = 'span') {
    $placeholder = "[placeholder]"; // The string that will be replaced by $str
    $wrap = "<{$element}>{$placeholder}</{$element}>"; // Dynamic string that can handle more than just span
    $strExp = preg_quote($str, '/');
    $matches = [];
    $matchCount = preg_match_all("/(\w+\s+)?(\w+\s+)?({$strExp})(\s+\w+)?(\s+\w+)?/i", $text, $matches);
    $response = '';
    for ($i = 0; $i < $matchCount; $i++) {
        if (strlen($matches[1][$i])) {
            $response .= '...';
        }
        if (strlen($matches[2][$i])) {
            $response .= $matches[2][$i];
        }
        $response .= str_replace($placeholder, $matches[3][$i], $wrap);
        if (strlen($matches[4][$i])) {
            $response .= $matches[4][$i];
        }
        if (strlen($matches[5][$i]) && $i == $matchCount - 1) {
            $response .= '...';
        }
    }
    return $response;
}
$text = "text This is a really long Text where the word Text appears about 3 times Text";
var_dump(wrapInString("text", $text));
string(107) "<span>text</span> This...long <span>Text</span> where...<span>Text</span> appears...times <span>Text</span>"
To make the replacement case insensitive you can use the i regex option.
If I understand your question correctly, you just need a little bit of explode and implode magic:
$text = "This is a really long Text where the word Text appears about 3 times Text";
$arr = explode("Text", $text);
print_r(implode('&lt;span&gt;Text&lt;/span&gt;', $arr));
If you specifically need to render the span tags as HTML, just write it that way:
$arr = explode("Text", $text);
print_r(implode('<span>Text</span>', $arr));
Use the pattern below to get your word and the 1-2 words before and after it:
/((\w+\s+){1,2}|^)text((\s+\w+){1,2}|$)/i
In PHP code it can be:
$str = "Text";
$wrap = "<span>|</span>";
$text = "This is a really long Text where the word Text appears about 3 times Text";
$temp = str_replace('|', $str, $wrap); // <span>Text</span>
// find the pattern and the 1-2 words before and after it
// (to make it case-sensitive, delete 'i' from the pattern)
if(preg_match_all('/((\w+\s+){1,2}|^)text((\s+\w+){1,2}|$)/i', $text, $match)) {
$res = array_map(function($x) use($str, $temp) { return '... '.str_replace($str, $temp, $x) . ' ...';}, $match[0]);
echo implode(' ', $res);
}
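The snippet above wraps the word with a case-sensitive str_replace even though the match itself is case-insensitive. A minimal sketch that keeps each occurrence in the case it was matched in is shown below. The function name excerptAround is hypothetical, and the sketch assumes $needle is a plain word, because the \b anchors only make sense next to word characters.

```php
<?php
// Hypothetical helper, not taken from the answers above.
function excerptAround(string $needle, string $text, string $wrap = '<span>|</span>'): string
{
    $quoted = preg_quote($needle, '/');
    // Up to two words of context on either side of each whole-word occurrence.
    $pattern = '/(?:\w+\s+){0,2}\b' . $quoted . '\b(?:\s+\w+){0,2}/i';
    if (!preg_match_all($pattern, $text, $found)) {
        return '';
    }
    $parts = [];
    foreach ($found[0] as $snippet) {
        // Wrap the text exactly as it was matched, so "Text" keeps its capital T.
        $parts[] = preg_replace_callback(
            '/\b' . $quoted . '\b/i',
            function (array $m) use ($wrap): string {
                return str_replace('|', $m[0], $wrap);
            },
            $snippet
        );
    }
    return '... ' . implode(' ... ', $parts) . ' ...';
}

$text = 'This is a really long Text where the word Text appears about 3 times Text';
echo excerptAround('text', $text);
```

Like the snippet it follows, this always adds the leading and trailing "..." and does not check whether the match sits at the very start or end of the text.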
Using the following code:
$text = "أطلقت غوغل النسخة المخصصة للأجهزة الذكية العاملة بنظام أندرويد من الإصدار “25″ لمتصفحها الشهير كروم.ولم تحدث غوغل تطبيق كروم للأجهزة العاملة بأندرويد منذ شهر تشرين الثاني العام الماضي، وهو المتصفح الذي يستخدمه نسبة 2.02% من أصحاب الأجهزة الذكية حسب دراسة سابقة. ";
$tags = "غوغل, غوغل النسخة, كروم";
$tags = explode(",", $tags);
foreach($tags as $k=>$v) {
$text = preg_replace("/\b{$v}\b/u","$0",$text, 1);
}
echo $text;
Will give the following result:
I love PHP">love PHP</a>, but I am facing a problem
Note that my text is in Arabic.
The way to do it is in a single pass. The idea is to build a pattern with an alternation of all the tags. For this to work, you must first sort the tags so that a longer tag comes before any tag it starts with, because the regex engine stops at the first alternative that succeeds (otherwise 'love' will always match even when it is followed by 'php', and 'love php' will never be matched).
To limit the replacement to the first occurrence of each tag, you can remove the tag from the array once it has been found, and test inside the replacement callback whether it is still present in the array:
$text = 'I love PHP, I love love but I am facing a problem';
$tagsCSV = 'love, love php, facing';
$tags = explode(', ', $tagsCSV);
rsort($tags);
$tags = array_map('preg_quote', $tags);
$pattern = '/\b(?:' . implode('|', $tags) . ')\b/iu';
$text = preg_replace_callback($pattern, function ($m) use (&$tags) {
$mLC = mb_strtolower($m[0], 'UTF-8');
if (false === $key = array_search($mLC, $tags))
return $m[0];
unset($tags[$key]);
return '<a href="index.php?s=news&tag=' . rawurlencode($mLC)
. '">' . $m[0] . '</a>';
}, $text);
Note: when you build a URL you must encode special characters. This is why I use preg_replace_callback instead of preg_replace: it lets me call rawurlencode.
If you have to deal with a UTF-8 encoded string, you need to add the u modifier to the pattern and replace strtolower with mb_strtolower.
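The point about the order of the alternatives can be checked in isolation. The following small sketch uses a made-up sentence and square brackets instead of links; it only illustrates how the alternation behaves and is not part of the solution itself.

```php
<?php
$text = 'I love PHP';
// Shorter alternative first: the engine accepts "love" and never tries "love php".
$shortFirst = preg_replace('/\b(?:love|love php)\b/i', '[$0]', $text);
// Longer alternative first: the two-word tag gets its chance before "love".
$longFirst = preg_replace('/\b(?:love php|love)\b/i', '[$0]', $text);
echo $shortFirst, "\n"; // I [love] PHP
echo $longFirst, "\n";  // I [love PHP]
```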
The preg_split way:
$tags = explode(', ', $tagsCSV);
rsort($tags);
$tags = array_map('preg_quote', $tags);
$pattern = '/\b(' . implode('|', $tags) . ')\b/iu';
$items = preg_split($pattern, $text, -1, PREG_SPLIT_DELIM_CAPTURE);
$itemsLength = count($items);
$i = 1;
while ($i<$itemsLength && count($tags)) {
if (false !== $key = array_search(mb_strtolower($items[$i], 'UTF-8'), $tags)) {
$items[$i] = '<a href="index.php?s=news&tag=' . rawurlencode($tags[$key])
. '">' . $items[$i] . '</a>';
unset($tags[$key]);
}
$i+=2;
}
$result = implode('', $items);
Instead of calling preg_replace multiple times, call it a single time with a regexp that matches any of the tags:
$tags = explode(",", $tags);
$tags_re = '/\b(' . implode('|', $tags) . ')\b/u';
$text = preg_replace($tags_re, '$0', $text, 1);
This turns the list of tags into the regexp /\b(love|love php|facing)\b/u. x|y in a regexp means to match either x or y.
I have Markdown text that I have to convert without using library functions, so I used preg_replace for this. It works fine for some cases, but I am stuck on headings. For example,
Heading
=======
should be converted to <h1>Heading</h1> and also
##Sub heading should be converted to <h2>Sub heading</h2>
###Sub heading should be converted to <h3>Sub heading</h3>
I have tried
$text = preg_replace('/##(.+?)\n/s', '<h2>$1</h2>', $text);
The above code works, but I need to count the hash symbols and assign the heading tag based on that count.
Can anyone help me, please?
Try using preg_replace_callback.
Something like this -
$regex = '/(#+)(.+?)\n/s';
$line = "##Sub heading\n ###sub-sub heading\n";
$line = preg_replace_callback(
$regex,
function($matches){
$h_num = strlen($matches[1]);
return "<h$h_num>".$matches[2]."</h$h_num>";
},
$line
);
echo $line;
The output would be something like this -
<h2>Sub heading</h2> <h3>sub-sub heading</h3>
EDIT
For the combined problem of using = for headings and # for sub-headings, the regex gets a bit more complicated, but the principle remains the same using preg_replace_callback.
Try this -
$regex = '/(?:(#+)(.+?)\n)|(?:(.+?)\n\s*=+\s*\n)/';
$line = "Heading\n=======\n##Sub heading\n ###sub-sub heading\n";
$line = preg_replace_callback(
$regex,
function($matches){
//var_dump($matches);
if($matches[1] == ""){
return "<h1>".$matches[3]."</h1>";
}else{
$h_num = strlen($matches[1]);
return "<h$h_num>".$matches[2]."</h$h_num>";
}
},
$line
);
echo $line;
The output is -
<h1>Heading</h1><h2>Sub heading</h2> <h3>sub-sub heading</h3>
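As a variation on the same idea, here is a minimal sketch that anchors each pattern to the start of a line with the m modifier, so a '#' in the middle of a sentence is not treated as a heading and the line breaks between headings are kept. The function name convertHeadings is hypothetical, and the sketch assumes Unix line endings and only handles the heading forms mentioned in the question.

```php
<?php
// Hypothetical helper: converts "=" underlined and "#" prefixed headings.
function convertHeadings(string $markdown): string
{
    // A text line followed by a line made only of '=' characters becomes <h1>.
    $markdown = preg_replace('/^(.+)\n=+[ \t]*$/m', '<h1>$1</h1>', $markdown);

    // One to six leading '#' characters select the heading level.
    return preg_replace_callback(
        '/^[ \t]*(#{1,6})[ \t]*(.+?)[ \t]*$/m',
        function (array $m): string {
            $level = strlen($m[1]);
            return "<h{$level}>{$m[2]}</h{$level}>";
        },
        $markdown
    );
}

echo convertHeadings("Heading\n=======\n##Sub heading\n###Sub sub heading\n");
```

Limiting the hash run to six is a design choice here, since HTML has no heading level above h6.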
Do a preg_match like this:
$string = "#####asdsadsad";
preg_match("/^#+/", $string, $matches);
var_dump(strlen($matches[0])); // int(5)
And based on the number of leading hashes you can do whatever you want.
Or, use the preg_replace_callback function.
$input = "#This is my text";
$pattern = '/^(#+)(.+)/';
$mytext = preg_replace_callback($pattern, 'parseHashes', $input);
var_dump($mytext);
function parseHashes($input) {
    // $input[1] holds the run of leading hashes, $input[2] the heading text.
    $cnt = strlen($input[1]);
    if ($cnt <= 6) {
        return '<h' . $cnt . ' class="if you want class here">' . $input[2] . '</h' . $cnt . '>';
    } else {
        // This is not a valid h tag, so leave the matched text unchanged.
        // (Returning false here would replace the match with an empty string.)
        return $input[0];
    }
}
I'm having trouble finding a correct regex to achieve what I want.
I have a sentence like this:
Hi, my name is Stan, you are welcome, hello.
and I would like to transform it like this:
[hi|hello|welcome], my name is [stan|jack] you are [hi|hello|welcome] [hi|hello|welcome].
Right now my regex is only half working: some words are not replaced, and the replacements that do happen delete some neighbouring characters.
Here is my test code:
<?php
$test = 'Hi, my name is Stan, you are welcome, hello.';
$words = array(
array('hi', 'hello', 'welcome'),
array('stan', 'jack'),
);
$result = $test;
foreach ($words as $group) {
if (count($group) > 0) {
$replacement = '[' . implode('|', $group) . ']';
foreach ($group as $word) {
$result = preg_replace('#([^\[])' . $word . '([^\]])#i', $replacement, $result);
}
}
}
echo $test . '<br />' . $result;
Any help will be appreciated
The regex you are using is overcomplicated. You simply need to use a regex substitution using regular brackets ():
<?php
$test = 'Hi, my name is Stan, you are welcome, hello.';
$words = array(
array('hi', 'hello', 'welcome'),
array('stan', 'jack'),
);
$result = $test;
foreach ($words as $group) {
if (count($group) > 0) {
$imploded = implode('|', $group);
$replacement = "[$imploded]";
$search = "($imploded)";
$result = preg_replace("/$search/i", $replacement, $result);
}
}
echo $test . '<br />' . $result;
Your regular expression:
'#([^\[])' . $word . '([^\]])#i'
matches one character before and after $word as well. And as they do, they replace it. So your replacement string needs to reference these parts, too:
'$1' . $replacement . '$2'
preg_replace supports arrays as parameters, so there is no need to iterate with a loop.
$test = 'Hi, my name is Stan, you are welcome, hello.';
$s = array("/(hi|hello|welcome)/i", "/(stan|jack)/i");
$r = array("[hi|hello|welcome]", "[stan|jack]");
echo preg_replace($s, $r, $test);
or dynamically
$test = 'Hi, my name is Stan, you are welcome, hello.';
$s = array("hi|hello|welcome", "stan|jack");
// create_function() was removed in PHP 8, so closures are used instead.
$r = array_map(function ($a) { return "[$a]"; }, $s);
$s = array_map(function ($a) { return "/($a)/i"; }, $s);
echo preg_replace($s, $r, $test);
//[hi|hello|welcome], my name is [stan|jack], you are [hi|hello|welcome], [hi|hello|welcome].
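The patterns above match "hi" anywhere, including inside a longer word such as "this". A minimal sketch that adds word boundaries, escapes the words with preg_quote, and does everything in one pass is shown below. The function name replaceWithGroups is hypothetical, and the sketch assumes ASCII words (it uses strtolower rather than mb_strtolower).

```php
<?php
// Hypothetical helper: replaces each listed word with its bracketed group.
function replaceWithGroups(string $text, array $groups): string
{
    // Map every lower-cased word to the label of the group it belongs to.
    $labels = [];
    foreach ($groups as $group) {
        $label = '[' . implode('|', $group) . ']';
        foreach ($group as $word) {
            $labels[strtolower($word)] = $label;
        }
    }
    if ($labels === []) {
        return $text;
    }

    $alternatives = [];
    foreach (array_keys($labels) as $word) {
        $alternatives[] = preg_quote((string) $word, '/');
    }
    // Longer alternatives first, so a word cannot shadow a longer phrase.
    usort($alternatives, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    $pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/i';
    return preg_replace_callback($pattern, function (array $m) use ($labels): string {
        return $labels[strtolower($m[0])];
    }, $text);
}

$words = array(
    array('hi', 'hello', 'welcome'),
    array('stan', 'jack'),
);
echo replaceWithGroups('Hi, my name is Stan, you are welcome, hello.', $words);
```

Because the text is scanned only once, the inserted "[hi|hello|welcome]" labels are never scanned again by a later pattern.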
I am trying to do something similar to hangman where when you guess a letter, it replaces an underscore with what the letter is. I have come up with a way, but it seems very inefficient and I am wondering if there is a better way. Here is what I have -
<?php
$word = 'ball';
$lettersGuessed = array('b','a');
echo str_replace( $lettersGuessed , '_' , $word ); // __ll
echo '<br>';
$wordArray = str_split ( $word );
$finalWord = ''; // initialise before appending to avoid an undefined-variable notice
foreach ( $wordArray as $letterCheck )
{
    if ( in_array( $letterCheck, $lettersGuessed ) )
    {
        $finalWord .= $letterCheck;
    } else {
        $finalWord .= '_';
    }
}
echo $finalWord; // ba__
?>
str_replace does the opposite of what I want. I want what the value of $finalWord is without having to go through a loop to get the result I desire.
If I am following you right you want to do the opposite of the first line:
echo str_replace( $lettersGuessed , '_' , $word ); // __ll
Why not create an array of all letters with range('a', 'z') and then use array_diff() against $lettersGuessed, which will give you an array of unguessed letters? It would certainly save a few lines of code. For example:
$all_letters = range('a', 'z');
$unguessed = array_diff ($all_letters, $lettersGuessed);
echo str_replace( $unguessed , '_' , $word ); // ba__
It's an array, and foreach is what you're supposed to be using. It's lightning fast anyway; I think you are obsessing over something that's not even a problem.
You want to use an array because you can easily tell which indexes in the array contain the letter, which directly correlates to which places in the string the _ should become a letter.
Your foreach loop is a fine way to do it. It won't be slow because your words will never be huge.
You can also create a regex pattern with the guessed letters to replace everything except those letters. Like this:
$word = 'ball';
$lettersGuessed = array('b','a');
$pattern = '/[^' . implode('', $lettersGuessed) . ']/'; // results in '/[^ba]/'
$maskedWord = preg_replace($pattern, '_', $word);
echo $maskedWord;
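One caveat with the pattern above: if no letters have been guessed yet, the character class is empty and the pattern is not valid, and a guessed character such as "-" or "]" would change the meaning of the class. A minimal sketch that guards against both cases follows; the function name maskWord is hypothetical, and the sketch assumes single-byte letters.

```php
<?php
// Hypothetical helper: hides every letter of $word that has not been guessed.
function maskWord(string $word, array $lettersGuessed): string
{
    if ($lettersGuessed === []) {
        // Nothing guessed yet: an empty character class would not compile.
        return str_repeat('_', strlen($word));
    }
    // preg_quote() escapes characters such as "-", "^" and "]" for the class.
    $class = preg_quote(implode('', $lettersGuessed), '/');
    return preg_replace('/[^' . $class . ']/', '_', $word);
}

echo maskWord('ball', array('b', 'a')), "\n"; // ba__
echo maskWord('ball', array()), "\n";         // ____
```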
Another way would be to access the string as an array, e.g.
$word = 'ball';
$length = strlen($word);
$mask = str_pad('', $length, '_');
$guessed = 'l';
for($i = 0; $i < $length; $i++) {
if($word[$i] === $guessed) {
$mask[$i] = $guessed;
}
}
echo $mask; // __ll