I always use preg_match and it always works fine,
but today I was trying to get the content between two HTML-like tags, e.g. <code: 1>DATA</code>,
and I ran into a problem, which my code explains:
function findThis($data){
preg_match_all("/\<code: (.*?)\>(.*?)\<\/code\>/i",$data,$conditions);
return $conditions;
}
// plain text
// working fine
$data1='Some text...Some.. Te<code: 1>This is a php code</code>';
//A text with a new lines
// Not working..
$data2='some text..
some.. te
<code: 1>
This is a php code
..
</code>
';
print_r(findThis($data1));
// OUTPUT
// [0][0] => <code: 1>This is a php code</code>
// [1][0] => 1
// [2][0] => This is a php code
print_r(findThis($data2));
//Outputs nothing!
This is because the . metacharacter in a PCRE pattern matches any character except a newline by default, so input that contains newlines between the tags will not match. Add the "s" modifier (PCRE_DOTALL) to the end of your pattern, which makes . match everything, including newlines:
/\<code: (.*?)\>(.*?)\<\/code\>/is
See here: http://www.php.net/manual/en/regexp.reference.internal-options.php
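To see the effect of the modifier, here is a minimal sketch that reuses findThis() from the question with "s" added. The sample string is the multi-line one from the question, written with explicit \n escapes.

```php
<?php
// Same pattern as in the question, with the "s" (DOTALL) modifier added
// so that "." also matches newline characters.
function findThis(string $data): array
{
    preg_match_all('/<code: (.*?)>(.*?)<\/code>/is', $data, $conditions);
    return $conditions;
}

$data2 = "some text..\nsome.. te\n<code: 1>\nThis is a php code\n..\n</code>\n";

$result = findThis($data2);
print_r($result);
// $result[1][0] holds the tag argument ("1");
// $result[2][0] holds the body, including its surrounding newlines.
```

If the surrounding newlines are not wanted, calling trim() on the captured body is probably the simplest follow-up.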
Sending an API request, I get a JSON string as the answer, which seems to include a hidden character, shown as a midpoint [·]. In my Atom editor the character is not visible, but deleting at that position produces no visible change, which indicates that the hidden character is what got removed.
The consequence is that decoding the JSON string into a PHP array returns NULL.
Question:
What is the most straightforward way to remove the hidden character?
Should I search for the character and simply cut that character out of the string?
I understand that potentially the best would be to find the root-cause of why the midpoint got there, but I cannot find the root-cause.
Investigation and outcomes:
Comparing [$body1] and [$body2] in https://www.diffchecker.com/ shows:
[$body1] ·'{"columns":"test"}'
[$body2] '{"columns":"test"}'
This test shows that I do in fact have a hidden character.
It might not work in your environment to test since the hidden character probably is removed by copy/paste.
$body1 = '{"columns":"test"}'; // Hidden character.
$body2 = '{"columns":"test"}'; // Removed hidden character.
$body3 = '{"columns":"test"}'; // Same as body2.
var_dump(json_decode($body2, true));
if ($body1 == $body2) {
echo 'Content the same';
} else {
echo 'Content differs';
}
Result:
Content differs
Checking string length of the body strings.
echo strlen($body1) . "\n";
echo strlen($body2) . "\n";
echo strlen($body3) . "\n";
Result:
21
18
18
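A 3-byte difference is what a UTF-8 byte order mark (bytes EF BB BF) would produce, although other invisible characters, such as the zero-width space U+200B, are also 3 bytes long in UTF-8. The following is a minimal sketch, assuming the extra bytes sit at the start of the string: it dumps the leading bytes with bin2hex() and strips a BOM if one is present. The helper name stripUtf8Bom is made up for illustration, and the BOM in $body1 is simulated.

```php
<?php
// Hypothetical helper: removes a leading UTF-8 byte order mark, if any.
function stripUtf8Bom(string $s): string
{
    $bom = "\xEF\xBB\xBF";
    return strncmp($s, $bom, 3) === 0 ? substr($s, 3) : $s;
}

// Simulated response body with a BOM in front (assumption: the hidden
// character in the real response is a BOM).
$body1 = "\xEF\xBB\xBF" . '{"columns":"test"}';

// Show the first four bytes as hex to identify the hidden character.
echo bin2hex(substr($body1, 0, 4)), "\n";

$clean = stripUtf8Bom($body1);
$decoded = json_decode($clean, true);
var_dump(strlen($body1), strlen($clean), $decoded);
```

If the hex dump shows something other than efbbbf, the same approach still applies: identify the bytes first, then remove exactly that sequence rather than guessing.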
I have text with paragraphs, spaces and justified alignment coming from the database, like this:
Hello.
My name is John.
Thank you
But when I use PHPWord with TemplateProcessor to write it into a Word document, it generates everything without paragraphs.
One solution I found for this was to do this:
$text=preg_replace('/\v+|\\\r\\\n/','<w:p/>',$TextWhitoutParagraphs);
This does produce the paragraphs, but only the first paragraph is justified.
How do I do this correctly with paragraphs and all text justified?
You can use cloneBlock. I used it with justified text and it worked perfectly. In your template use:
${paragraph}
${text}
${/paragraph}
And then explode your string by "\n":
$textData = explode("\n", $text);
$replacements = [];
foreach($textData as $text) {
$replacements[] = ['text' => $text];
}
$templateProcessor->cloneBlock('paragraph', count($replacements), true, false, $replacements);
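If the text from the database uses Windows line endings, explode("\n", ...) leaves a trailing "\r" on each line. Here is a sketch, assuming that may be the case, that splits on any line break with \R and skips empty lines before building the $replacements array. The sample text is made up, and dropping blank lines is a choice that may not suit every document.

```php
<?php
// Sample text with mixed line endings (made up for illustration).
$text = "Hello.\r\nMy name is John.\n\nThank you";

// \R matches \r\n, \n or \r; PREG_SPLIT_NO_EMPTY drops blank lines.
$lines = preg_split('/\R/', $text, -1, PREG_SPLIT_NO_EMPTY);

$replacements = [];
foreach ($lines as $line) {
    $replacements[] = ['text' => $line];
}

print_r($replacements);
// $replacements can then be passed to cloneBlock() as in the answer above.
```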
The TemplateProcessor can only be used with single line strings (see docs: "Only single-line values can be replaced." http://phpword.readthedocs.io/en/latest/templates-processing.html)
What you could try is replacing your new lines with '' (closing the opened paragraph and starting a new one), but that would just be my guess right now. It always helps to check the resulting Word-XML for syntax-errors.
I'm trying to convert text ($icon) to a smiley image ($image). I used to do it with str_replace(), but that seems to perform the replacements sequentially, and as such it also replaces items in previously converted results (for example in the <img> tag).
I am now using the following code:
foreach($smiliearray as $image => $icon){
$pattern[]="/(?<!\S)" . preg_quote($icon, '/') . "(?!\S)/u";
$replacement[]=" <img src='$image' border='0' alt=''> ";
}
$text = preg_replace($pattern,$replacement,$text);
This code works, but only if the smiley code is surrounded by whitespace. So if someone types ":);)", it won't catch it as two separate smileys, but ":) ;)" works.
How can I fix it so that a run of smileys not separated by spaces is also converted?
Note that there can be unlimited kinds of smiley codes and smiley images. I do not know beforehand which ones, because other people can submit codes and smileys, so it is not just ":)" and ";)", but can also be "rofl", ":eh", ":-{", etc.
I can partially fix it by adding a \W non-word to the end of the 2nd capturegroup: (?!\S\W), and further by adding a 2nd $pattern and $replacement with a \W to the first capturegroup. But I don't think that is the way it should be done, and it only partially solves it.
"I used to do it with str_replace(), but that seems to perform the replace sequentially and as such it also replaces items in previously converted results..."
A good and true reason to use strtr(). You don't even need Regular Expressions:
<?php
// I assume your original array looks like this
$origSmileys = [
"/1.png" => ':)',
"/2.png" => ':(',
"/3.png" => ':P',
"/4.png" => '>:('
];
// sample input string
$str = " I'm :) but :(>:(:( now :P";
// iterating over smileys to add html tag
$newSmileys = array_map(function($value) {
return "<img src='$value' border='0' alt=''>";
}, array_flip($origSmileys));
// replace
echo strtr($str, $newSmileys);
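The difference between the two functions shows up in a small sketch with made-up replacement pairs: str_replace() applies each pair in turn to the already-modified string, while strtr() with an array scans the input once and does not re-scan replaced text.

```php
<?php
// Made-up pairs where one replacement's output contains another search key.
$pairs = ['a' => 'b', 'b' => 'c'];

// str_replace() handles the pairs one after another, so the "b"
// produced by the first pair is replaced again by the second.
$sequential = str_replace(array_keys($pairs), array_values($pairs), 'ab');

// strtr() replaces in a single pass and tries the longest key first.
$singlePass = strtr('ab', $pairs);

var_dump($sequential, $singlePass);
```

The longest-key-first behaviour is also why '>:(' in the smiley list is handled before ':(' without any extra ordering work.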
I am running an RST-to-PHP conversion and am using preg_match.
This is the RST I am trying to identify:
An example of the **Horizon Mapping** dialog box is shown below. A
summary of the main features is given below.
.. figure:: horizon_mapping_dialog_horizons_tab.png
**Horizon Mapping** dialog box, *Horizons* tab
Some of the input values to the **Horizon Mapping** job can be changed
during a Workflow using the internal programming language, IPL. For
details, refer to the *IPL User Guide*.
and I am using this regex:
$match = preg_match("/.. figure:: (.*?)(\n{2}[ ]{3}.*\n)/s", $text, &$result);
However, it returns false.
Here is a link to the expression working on regex101:
http://regex101.com/r/oB3fW7
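Are you sure that the line break is \n? If in doubt, use \R:
$match = preg_match("/.. figure:: (.*?)(\R{2}[ ]{3}.*\R)/s", $text, $result);
\R matches \r\n, \n, or \r. Also note that $result is passed without &: preg_match() already takes it by reference, and call-time pass-by-reference is a fatal error as of PHP 5.4.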
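A minimal sketch of the difference, assuming the input uses Windows line endings (the sample text and file name are made up): the original \n{2} never finds two consecutive \n characters inside \r\n\r\n, whereas \R treats each \r\n as one line break.

```php
<?php
// Made-up sample using Windows (\r\n) line endings.
$text = ".. figure:: horizon.png\r\n\r\n   **Horizon Mapping** dialog box\r\n";

// Original pattern: \n{2} needs two consecutive \n, which never occur
// in \r\n\r\n, so this does not match.
$withN = preg_match('/.. figure:: (.*?)(\n{2}[ ]{3}.*\n)/s', $text, $r1);

// \R treats \r\n as a single line break, so this matches.
$withR = preg_match('/.. figure:: (.*?)(\R{2}[ ]{3}.*\R)/s', $text, $r2);

var_dump($withN, $withR, $r2[1]);
```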
My instinct would be to do some troubleshooting around the s flag as well as the $result variable passed by reference. To achieve the same without any interference from dots and the return variable, can you please try this regex:
..[ ]figure::[ ]([^\r\n]*)(?:\n|\r\n){2}[ ]{3}[^\r\n]*\R
In code, please try exactly like this:
$regex = "~..[ ]figure::[ ]([^\r\n]*)(?:\n|\r\n){2}[ ]{3}[^\r\n]*\R~";
if (preg_match($regex, $text, $m)) echo "Success! <br />";
Finally:
If this does not work, you might have an unusual Unicode line break that PHP is not catching. To debug, iterate over the string and print each character together with its code:
foreach (str_split($text) as $c) {
echo $c . " value = " . _uniord($c) . "<br />";
}
Here _uniord() stands for a user-defined helper that returns the code of a character; it is not a PHP built-in. Note that str_split() splits by byte, so a multibyte character shows up as several entries.
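As a self-contained variant of the same debugging idea, here is a sketch that needs no extra helper beyond the one it defines: it rewrites every byte outside printable ASCII as \xHH so that line breaks and other invisible bytes become visible. The function name visibleBytes is made up for illustration.

```php
<?php
// Hypothetical helper: returns the string with every byte outside the
// printable ASCII range written as \xHH, so line breaks become visible.
function visibleBytes(string $s): string
{
    $out = '';
    foreach (str_split($s) as $byte) {
        $code = ord($byte);
        $out .= ($code >= 0x20 && $code <= 0x7E)
            ? $byte
            : sprintf('\x%02X', $code);
    }
    return $out;
}

$dump = visibleBytes("figure.png\r\n\r\n   caption");
echo $dump, "\n";
```

Running this on the real $text should show whether the blank line is \x0A\x0A, \x0D\x0A\x0D\x0A, or something else entirely.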
I've been producing a letter compilation system (to save people time after a questionnaire has been filled in), and near the end of the project we've found a bug. Long story short, it would take many hours to fix without this regular expression, which is why I'm asking for your help!
We have some text that contains the following...
"<k>This is the start of the paragraph
This is some more of the paragraph.
And some more";
I basically need a regular expression that can search for the opening tag, "<k>", and also the first new line it comes across "\r\n"? and then insert the contents into a variable I can then use (with the <k> removed but the new line codes, "\r\n", left in place).
I'm using PHP and the text (like the example above) is stored in MySQL.
Please help!
I promise I'll learn these properly after I've fixed this bug! :)
If you are using PHP 5.3 or later, you can make use of closures like this:
$text = "<k>This is the start of the paragraph
This is some more of the paragraph.
And some more";
$matches = array();
$text = preg_replace_callback('/<k>(.*)$/m', function($match) use (&$matches){
$matches[] = $match[1];
}, $text);
var_dump($text,$matches);
Output is:
string '
This is some more of the paragraph.
And some more' (length=52)
array
0 => string 'This is the start of the paragraph' (length=34)
I'm assuming there could be multiple <k> tags in the text, and therefore, I put all the text that follows the tag into an array called matches.
As a further example... with the following input:
$text = "<k>This is the start of the paragraph
Line 2 doesn't have a tag...
This is some more <k>of the paragraph.
Line 4 doesn't have a tag...
<k>And some more";
The output is:
string '
Line 2 doesn't have a tag...
This is some more
Line 4 doesn't have a tag...
' (length=78)
array
0 => string 'This is the start of the paragraph' (length=34)
1 => string 'of the paragraph.' (length=17)
2 => string 'And some more' (length=13)
/^<k>(\w*\s+)$/
would probably work.
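Note that \w*\s+ matches only a single run of word characters followed by whitespace, so that pattern would stop at the first space in a sentence. Here is a sketch of an alternative, assuming the goal is to capture the rest of the first line after <k> together with its line break. The sample text is the one from the question, written with explicit \r\n escapes.

```php
<?php
$text = "<k>This is the start of the paragraph\r\nThis is some more of the paragraph.\r\nAnd some more";

// [^\r\n]* takes everything up to the end of the line;
// \R? then picks up the line break itself (\r\n, \n or \r), if any.
if (preg_match('/<k>([^\r\n]*\R?)/', $text, $m)) {
    $firstLine = $m[1]; // text after <k>, line break kept
} else {
    $firstLine = null;
}

var_dump($firstLine);
```

For text with several <k> tags, preg_match_all() with the same pattern would collect one entry per tag in $m[1].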