Why would whitespace cause a fatal error in PHP? - php

I have an associative array that, for all intents and purposes, appears to have absolutely no reason to throw even a warning. The entirety of the source code is as follows:
<?php
$entries = array(
array('date' => '2012-08-12', 'change' => '-19'),
array('date' => '2012-08-13', 'change' => '-21'), 
array('date' => '2012-08-14', 'change' => '-19'),
array('date' => '2012-08-15', 'change' => '-17'),
);
foreach ($entries as $entry) {
print $entry['date'] . ': ' . $entry['change'] . '<br/>';
}
To me everything looks fine, however when I go to view the output in my browser, I get the following error message:
Parse error: syntax error, unexpected T_ARRAY, expecting ')' in /Applications/MAMP/htdocs/wtf2.php on line 5
I looked a little closer and then discovered that on line 4, there was what appeared to be a trailing space or two (which I didn't even think twice about, at first). However when I copied the whitespace, pasted it into a new document like so (line 2):
<?php
$whitespace = '  ';
print rawurlencode($whitespace);
... and then viewed the output in my browser, this is what I saw:
%C2%A0%20
I would ask, "How did that get there in the first place?" but I don't really think that's a feasible question to answer. So my actual question is this: how does whitespace like that differ from any other whitespace (especially to the point where it causes a fatal error when run through the PHP interpreter)? And is there a way to prevent this from happening in the future?
PS: I'm running PHP version 5.3.20 (via MAMP Pro on a Mac).
PPS: To clarify, WHEN THE WHITESPACE ITSELF IS DELETED, THE CODE RUNS FINE.

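What you have there is a UTF-8 encoded non-breaking space (%C2%A0, aka &nbsp;), which breaks PHP's parser. PHP accepts bytes in the range 0x7F-0xFF inside identifiers, so the parser reads those two bytes as the start of a name rather than as whitespace. It's a known problem with PHP that is marked as "won't fix" because people "might use this character as part of an identifier." See Request #38766 Nonbreaking whitespace breaks parsing.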

I just banged my head for about an hour, going totally insane about this.
Notepad++ doesn't show ANY difference between a regular space and an nbsp.
If you wonder how this character actually got into your code, this is how I got mine:
AltGr + Space
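Since editors may not show the difference, one way to hunt for the character is to scan the source text for the byte pair C2 A0. The following is only a sketch: the function name findNbspLines and the sample string are made up for illustration, and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: return the 1-based numbers of the lines that
// contain a UTF-8 encoded non-breaking space (bytes C2 A0).
function findNbspLines(string $source): array
{
    $hits = [];
    foreach (explode("\n", $source) as $index => $line) {
        if (strpos($line, "\xC2\xA0") !== false) {
            $hits[] = $index + 1;
        }
    }
    return $hits;
}

// Sample source with a non-breaking space after "array(" on its second line.
$sample = "<?php\n\$entries = array(\xC2\xA0 \narray('date' => '2012-08-12'),\n);\n";
echo implode(', ', findNbspLines($sample)), "\n"; // prints 2
```

In practice the string would come from file_get_contents() on the suspect file.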

Related

PHP regex not working, despite successfully matching in regex101

I'm having the weirdest issue. I've tried referencing other similar answers here, but none seem to fix my issue.
I have the following regex in PHP
/if\s+(?:(.*?)\s*==\s*(?:UrlStatus|DeadURL)|in_array\s*\((?:UrlStatus|DeadURL),\s*(.*?)\s*\))\s*then\s+local\s+arch_text\s+=\s+cfg.messages\['archived'\];(?:(?:\n|.)*?if\s+(?:(.*?)\s*==\s*(?:UrlStatus|DeadURL)|in_array\s*\((?:UrlStatus|DeadURL),\s*(.*?)\s*\))\s*then\s+Archived = sepc \.\.)?/im
It's a messy regex, I know; it's supposed to parse code from various versions of a module from different locations. It works perfectly in regex101, but preg_match returns false, indicating an error occurred. The regex you see is pulled straight from a var_dump. Also pulled from the var_dump is the string being tested. I have included the excerpt that is supposed to match it below.
if is_set(ArchiveURL) then
if not is_set(ArchiveDate) then
ArchiveDate = seterror('archive_missing_date');
end
if "no" == DeadURL then
local arch_text = cfg.messages['archived'];
if sepc ~= "." then arch_text = arch_text:lower() end
Archived = sepc .. " " .. substitute(
In the full block of text it takes 81,095 steps to match.
Could it have something to do with that?
Getting a read from preg_last_error(), it returned 6, which maps to the constant PREG_JIT_STACKLIMIT_ERROR.
PHP 7 uses a JIT compiler for preg_match with a small stack size limit. Disabling it allows preg_match to do its job.
This can be done in the php.ini file, or on the fly in the script by using ini_set( 'pcre.jit', false );
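As a rough sketch of the approach described above: turn the JIT off before any pattern is compiled, and treat a false return from preg_match as an error to inspect rather than as "no match". The helper name matchOrFail and the shortened pattern are assumptions for illustration, not the original code.

```php
<?php
// Disable the PCRE JIT before any pattern is compiled, as suggested above.
ini_set('pcre.jit', '0');

// Hypothetical helper: run preg_match and turn a false return into a
// readable exception instead of silently treating it as "no match".
function matchOrFail(string $pattern, string $subject): array
{
    $matches = [];
    $result = preg_match($pattern, $subject, $matches);
    if ($result === false) {
        $code = preg_last_error();
        $name = $code === PREG_JIT_STACKLIMIT_ERROR
            ? 'PREG_JIT_STACKLIMIT_ERROR'
            : 'error code ' . $code;
        throw new RuntimeException("preg_match failed: $name");
    }
    return $matches;
}

// A much shorter pattern than the one in the question, for illustration only.
$matches = matchOrFail('/if\s+"(\w+)"\s*==\s*DeadURL/', 'if "no" == DeadURL then');
echo $matches[1], "\n"; // prints no
```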

Parsing XML tags with hyphens not working as expected

Thanks in part to SO I was able to figure out how I can access XML tags with hyphens (<some-tag>). All the examples I have seen do it something like this.
$content = $xml->{'document-content'};
But for me that doesn't work, and this does
$content = $xml->{document-content};
that is without the quotes (how I figured that out I don't recall, a mistake maybe). If I use the quotes I get this error
Notice: Trying to get property of non-object in /html/my/dir/myfile.php on line 26
So one would think I should just use it without the quotes. Sure, that is until I get to the reason I am parsing the XML. The XML is from an ODT file and will ultimately be used as a template to generate PDFs. While developing anything I always use "E_ALL" error reporting. With it set I get these 2 errors when I use it without quotes.
Notice: Use of undefined constant document - assumed 'document' in /html/my/dir/myfile.php on line 24
Notice: Use of undefined constant content - assumed 'content' in /html/my/dir/myfile.php on line 24
But, it does parse the rest of the document just fine. Problem is that I need to create a PDF and if it outputs that "Notice" error prior to the PDF generator running the "header" does not get set properly and no PDF is created. Now one might suggest I turn off error reporting, but then if the PDF isn't working I can't see those errors.
In truth I am at a loss as to why it works at all without quotes. Everything I know about PHP syntax says that without quotes it would be a constant (as the error points out) that must be defined somewhere prior to it. As such the entire parser should fail at that point, but it doesn't; in fact the opposite is true, the parser works.
Mostly I just need to know how to get rid of those 2 notice errors without disabling error reporting. And I would be very interested in why it works without the quotes, as how it is working seems to drastically deviate from all the norms of programming.
Just in case it's needed, here is all the code leading up to "$content"
$zip = new ZipArchive;
if ($zip->open('../docs/myfile.odt') === true)
{
$xmlstring = $zip->getFromName('content.xml');
$zip->close();
}
// remove all namespaces and swaps out tab and space tags
$replace = array('office:', 'style:', 'draw:', 'fo:', 'text:', 'svg:', '<tab/>', '<s/>');
$value = array('', '', '', '', '', '', "\t", ' ');
$xmlstring = str_replace($replace, $value, $xmlstring);
$xmlstring = preg_replace_callback('/<s c="(.+?)"\/>/s', 'ReplaceSpaces', $xmlstring);
$xml = simplexml_load_string($xmlstring);
$content = $xml->{document-content};
I should kick myself. I have parsed a lot of XML files and RSS feeds over the years, and the answer is very simple. You do not need to reference the main element of an XML document when using SimpleXML.
<document-content>
<tag1>
</tag1>
<tag2>
<anothertag>
</anothertag>
</tag2>
</document-content>
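So $xml->{document-content}->tag2 (or $xml->{0}->tag2) is exactly the same as $xml->tag2. Without the quotes, PHP evaluates document-content as one undefined constant minus another, which comes out as 0; that is where the two notices and $xml->{0} come from.
I am guessing that, with this being my first go-round with ODT files and having to deal with some of their hassles, I overlooked the obvious.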
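A small sketch of that behaviour, using a made-up XML string rather than the real content.xml: the object returned by simplexml_load_string() already stands for the root element, and the quoted-brace syntax is only needed for a hyphenated child.

```php
<?php
// Made-up sample modelled on the structure shown above.
$xmlstring = '<document-content><tag1/><tag2><anothertag>hello</anothertag></tag2></document-content>';
$xml = simplexml_load_string($xmlstring);

// $xml already represents <document-content>, so children are read directly.
echo $xml->getName(), "\n";                  // prints document-content
echo (string) $xml->tag2->anothertag, "\n";  // prints hello

// Quoted braces are only needed when a child element name has a hyphen.
$nested = simplexml_load_string('<root><some-tag>value</some-tag></root>');
echo (string) $nested->{'some-tag'}, "\n";   // prints value
```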

quercus php and RegexpException: Delimiter A in regexp 'Array' must not be backslash or alphanumeric

I'm new here. I know the error in the title has already been discussed here, but I didn't find any answer to my problem.
I'm trying to make phpbb3 work on my server with tomcat6 using quercus for php.
Everything is ok except the bbcode.php module, which gives me an error (in title) on line 112, that is:
$message = preg_replace($preg['search'], $preg['replace'], $message);
I asked for help in the phpbb3 forum but they told me the issue is from quercus.
I still haven't found an answer on the quercus mailing list.
I'd like to know how I can replace that line with another that does the same job.
Thanks in advance.
edit:
Maybe I found where the problem starts:
'preg' => array(
'#\[quote(?:="(.*?)")?:$uid\]((?!\[quote(?:=".*?")?:$uid\]).)?#ise' => "\$this->bbcode_second_pass_quote('\$1', '\$2')"
The point is that this code works perfectly in most cases; maybe it is Quercus that needs a different syntax.
You can find the full bbcode.php here: http://ftp.phpbb-fr.com/public/cdd/phpbb3/3.0.10/nav.html?includes/bbcode.php.source.html
A regular expression must be delimited. Generally the delimiter is the slash, but PHP also accepts any other character that is not alphanumeric, a backslash, or whitespace.
$preg["search"] is not delimited and is likely just a bare regex. It needs to be: /regex/, #regex#, or |regex|, etc.
The following code throws the error:
echo preg_replace(array('1', '2'), array('one', 'two'), '1 2');
Should be:
echo preg_replace(array('/1/', '/2/'), array('one', 'two'), '1 2');
Welcome to StackOverflow.

Issue with lsl to php script

PHP part:
$php = $_POST['php'];
//$php = "print \"hello world\";";
if ($php != null){
if (strlen($php) < 400){
echo $php;
eval($php);
//eval("print \"hello world\";");
}else die ("Evaluated Expression Exceeds Maximum Length");
}
LSL part:
string php = "print \"hello world\";";
Now I added the commented out bits into PHP to show that it works. And then when the LSL script sends to PHP it returns:
print \"hello world\"; -- this line is from, 'echo $php;'
<b>Parse error</b>: syntax error, unexpected '"', expecting identifier
(T_STRING) in <b>xxxxxx.php(141) : eval()'d code</b> on line <b>1</b><br />
-- this is the error.
And it is something to do with the way the two scripts are sending data. I thought maybe it had something to do with $php = $_POST['php']; so I changed it to $php = $_POST[php]; with no change to the result. I then tried changing print \"hello world\"; to print 'hello world'; and it then just returns the error: T_ENCAPSED_AND_WHITESPACE.
I did not supply the full source here. Only the section that was having an issue. It is supplied in a example state. The output is the same as the actual error result, that is being seen in the source. Usage of eval is required for the lsl script and php. In that the code is dynamically being reconfigured by both and sent to one another. Essentially giving the two the ability to code into one another. This is for a game in Second Life.
So if anyone knows of an actual way to pass the required data to and from the scripts. I could use some advice. Or a smack in the head if I missed something simple.
With the kind poke from mario about turning off magic_quotes, I found what the data was doing in the source. I ended up researching and using the following: eval(stripslashes($php)); which completely solves the issue.
It had nothing to do with escape data. Didn't think so as echo reported that fine. And it was indeed a slap me in the head error too.
stripslashes — Un-quotes a quoted string
I will mark this as the best answer, with credit to mario. I wish he had posted his as an answer rather than a comment, so I could have voted for it.
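To illustrate what was happening to the data, here is a small sketch. It assumes the server's magic quotes setting treats incoming POST values the way addslashes() does, so addslashes() is used here to simulate the received value; no real request is involved.

```php
<?php
// What the LSL script intends to send.
$sent = 'print "hello world";';

// Simulated magic quotes: every double quote gains a leading backslash.
$received = addslashes($sent);
echo $received, "\n"; // prints print \"hello world\";

// stripslashes() undoes the escaping, restoring the original code string.
$restored = stripslashes($received);
echo $restored, "\n"; // prints print "hello world";
```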

setcookie() always fails "headers already sent" even following the rules on php.net

The relevant piece of code is below. According to php.net I have to make sure there is no output, not even any whitespace. There isn't any. The php tag is the very first tag in the document, with no whitespace preceding it. What am I doing wrong?
<?php
// main.php
// 6:48 PM 8/6/2010
include('config.php');
// Does myid cookie exist?
if ( !isset( $_COOKIE['myid'] ) )
{
// Generate myid
$myid = substr(md5(date( 'Ymdhis' ) . str_replace( '.', '', $_SERVER['REMOTE_ADDR'] ) ), 0, 10);
// set the cookie
setcookie( 'myid', $myid, time() + 31536000 );
There needs to be no output at all before calling setcookie, even from other scripts.
If config.php had whitespace before the opening <?php tag (or after the closing ?>), for example, that would cause the problem.
Hunt down where header is called or some text outside of PHP tags (even whitespace) exists. That's what's happening, no doubt.
Some text editors don't show all symbols. Please try opening your main.php and config.php in different editors and check for anything before <?php in both files. I can't speak for Windows, but on Linux you can use vim or vi. Also, the full error message tells you exactly where the output started.
You failed to read this error message, or at least to paste it here. It has a DETAILED explanation of what you are doing wrong.
Most likely it would say output started at config.php:XXX. Pretty clear.
If it says output started at main.php:0, this is most likely a BOM character, and you have to re-save your file without it, using your editor's Save dialog or another editor.
Always read the entire error message. It not only says that something went wrong, but often explains exactly what happened.
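The checks suggested in these answers can also be done from PHP itself. This is only a sketch: the helper names startsWithBom and hasLeadingOutput are invented here, the sample strings stand in for the real file contents, and PHP 7 or later is assumed for the type declarations.

```php
<?php
// Hypothetical helper: true when the contents start with a UTF-8 byte order mark.
function startsWithBom(string $contents): bool
{
    return strncmp($contents, "\xEF\xBB\xBF", 3) === 0;
}

// Hypothetical helper: true when anything at all precedes the opening PHP tag.
function hasLeadingOutput(string $contents): bool
{
    return strncmp($contents, '<?php', 5) !== 0;
}

// Sample strings stand in for file_get_contents('config.php') and similar reads.
$clean = "<?php\ninclude('config.php');\n";
$withBom = "\xEF\xBB\xBF" . $clean;

var_dump(startsWithBom($clean));      // bool(false)
var_dump(startsWithBom($withBom));    // bool(true)
var_dump(hasLeadingOutput($withBom)); // bool(true)

// Before calling setcookie(), headers_sent() can report where output began.
if (headers_sent($file, $line)) {
    error_log("Output started at $file:$line");
}
```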
