finding wrong characters in javascript - php

I am writing a small PHP script that I call to get all the JavaScript for my page, and this leads to strange errors on the client side. The script uses a configuration XML file and some XSL stylesheets to generate a large JavaScript string. Sometimes I get an 'unterminated String literal' error; sometimes an error is raised that says 'An attempt was made to use an object that is not, or is no longer, usable', just after the JavaScript executes a document.write operation.
Are there any resources or tutorials that cover the traps you can run into when copying a bunch of JavaScript files into one string or file?
greetings philipp
EDIT::
the following error:
is thrown in a web page that is delivered with content-type 'application/xhtml+xml'. The generated JavaScript looks like this:
source code generated
The script itself runs until the first document.write command is triggered.

It sounds like the strings are not escaped completely or correctly. If you take care of
escaping all backslashes to \\ (do this first)
escaping all apostrophes to \'
escaping all quotation marks to \"
escaping all line feeds to \n
you shouldn't have any errors.
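The escaping rules above can be sketched as follows. This is a minimal illustration in JavaScript, not the poster's generator (which is PHP/XSLT); the helper name escapeForJsLiteral is hypothetical. It also escapes backslashes and carriage returns, which would otherwise break the literal in the same way.

```javascript
// Hypothetical helper: escape text so it can sit inside a quoted JavaScript string literal.
function escapeForJsLiteral(text) {
  return text
    .replace(/\\/g, "\\\\") // backslashes first, so the escapes added below are not doubled
    .replace(/'/g, "\\'")   // apostrophes
    .replace(/"/g, '\\"')   // quotation marks
    .replace(/\r/g, "\\r")  // carriage returns
    .replace(/\n/g, "\\n"); // line feeds
}

const raw = 'He said "it\'s done"\nnext line';
const literal = "'" + escapeForJsLiteral(raw) + "'";
console.log(literal);
```

When the generated value is plain data, JSON.stringify(raw) is a built-in way to produce a double-quoted literal without hand-written replacements.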

Related

Load string into html from file? Preferably not using javascript

I have the following code in index.html:
<div class="button">
<a href="ridiculously long string" title="title">Title</a>
</div>
I'd like to save "ridiculously long string" in a text file, referenced by index.html. Is this possible?
I tried replacing the string as follows, using PHP's file_get_contents(), but it doesn't work:
<div class="button">
<a href=<?php echo file_get_contents('text.txt'); ?> title="title">Title</a>
</div>
Error symptoms: the button on my page now reads title="title">Title, and clicking it takes me to a 404: The requested URL /~user/html_root/< was not found on this server. index.html and text.txt are both in the html_root directory.
Here's how one of the shorter text.txt files reads:
?autoplay=0&trail=0&grid=1&colors=1&zoom=1&s=%5B{%228%22:%5B60,61,98,103,109,115%5D},{%229%22:%5B60,61,77,78,97,99,102,104,108,110,114,116%5D},{%2210%22:%5B76,79,98,103,105,109,111,115,117%5D},{%2211%22:%5B76,79,104,110,112,116,118%5D},{%2212%22:%5B60,61,63,64,77,78,111,117%5D},{%2213%22:%5B60,61,63,64%5D},{%2219%22:%5B76,77,79,97,98,102,103,108,109,114,115%5D},{%2220%22:%5B76,78,79,97,99,102,104,108,110,114,116%5D},{%2221%22:%5B98,103,105,109,111,115,117%5D},{%2222%22:%5B104,110,112,116,118%5D},{%2223%22:%5B61,111,117%5D},{%2224%22:%5B60,62,76,77%5D},{%2225%22:%5B60,62,75,78%5D},{%2226%22:%5B61,76,79%5D},{%2227%22:%5B77,78,96,97,102,103,109,110,115,116%5D},{%2228%22:%5B96,98,102,104,109,111,115,117%5D},{%2229%22:%5B61,65,97,98,103,105,110,112,116,118%5D},{%2230%22:%5B60,62,64,66,104,105,111,113,117,119%5D},{%2231%22:%5B60,62,64,66,75,76,112,113,118,120%5D},{%2232%22:%5B61,65,75,78,119,120%5D},{%2233%22:%5B77,78%5D},{%2237%22:%5B78,79%5D},{%2238%22:%5B77,79%5D},{%2239%22:%5B77%5D},{%2240%22:%5B60,61,63,64,75,77%5D},{%2241%22:%5B61,63,75,76%5D},{%2242%22:%5B61,63%5D},{%2243%22:%5B60,61,63,64,114%5D},{%2244%22:%5B78,79,84,85,92,93,95,113,115%5D},{%2245%22:%5B79,84,86,92,93,95,96,97,104,112,115%5D},{%2246%22:%5B78,86,98,103,105,111,113,114%5D},{%2247%22:%5B75,77,86,87,92,93,95,96,97,102,105,110,112%5D},{%2248%22:%5B75,76,93,95,103,104,109,112%5D},{%2249%22:%5B93,95,110,111%5D},{%2250%22:%5B94%5D}%5D
I thought changing text.txt to a more benign URL might help debugging. I changed text.txt to https://www.google.com/ and get the same 404.
I could implement a JavaScript solution. There's already JS on this web page, but it's controlled by a colleague and I'd prefer to try a standalone solution first. Many thanks to anyone who can help!
Anytime you want to inject arbitrary data into HTML, you need to wrap it with htmlspecialchars() so that any reserved characters are escaped. Additionally, you actually need to surround attribute values with quotes or you're going to be generating invalid HTML.
<a href="<?php echo htmlspecialchars(file_get_contents('text.txt')); ?>" title="title">Title</a>
Really though, "ridiculously long string" is questionable anyway. I assume you're using some huge data URI? If so, consider not doing that, as there are limits you'll run into and it's not efficient to base64-encode things.

Getting tinymce contents stops when a special character is encountered

I can't figure out why my PHP processing script stops when it encounters a special character in a TinyMCE textarea.
For example, if I type foo and submit, it works fine, but if I type foo<<<, it stops after foo when I submit.
The editor creates the HTML entities and sends them through Ajax.
I get the content with
var c = tinyMCE.get('content').getContent();
and sending the content
ajax.send("action=edit_content&c="+c+"&id="+id);
and I can see in firebug that the string is being passed
action=edit_content&c=<p>foo &lt;&lt;&lt;</p>&id=8
and the php is really nothing special at all, just set that post to a var
is it maybe because of the & in the &lt; ? maybe it thinks that is actually another post parameter?
I am still getting my feet wet when it comes to ajax. If I am correct on my assumption, how do I fix that?
You have the right idea. The ampersand is breaking the URL string.
In order to fix breaking characters, you have to escape the string.
Try this:
ajax.send("action=edit_content&c="+encodeURIComponent(c)+"&id="+id);
You shouldn't have to decode anything on the PHP side, because PHP decodes request parameters automatically. If you ever do need to, you can decode a string with urldecode:
<?php echo urldecode($_GET['c']); ?>
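As a minimal sketch of the idea, the request body can be built by percent-encoding every key and value with encodeURIComponent, which also encodes characters such as + and ; that the older escape() handles less reliably. The helper name buildFormBody and the sample values are illustrative assumptions, not part of the original code.

```javascript
// Hypothetical helper: build an application/x-www-form-urlencoded body from key/value pairs.
function buildFormBody(params) {
  return Object.keys(params)
    .map((key) => encodeURIComponent(key) + "=" + encodeURIComponent(params[key]))
    .join("&");
}

// Sample editor content containing HTML entities, as TinyMCE would produce.
const content = "<p>foo &lt;&lt;&lt;</p>";
const body = buildFormBody({ action: "edit_content", c: content, id: 8 });
console.log(body);
```

Because the ampersands inside the content are encoded as %26, the only literal & characters left in the body are the separators between the three fields.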

preventing php injection in forms (if a user submits php code into a form)

In a contact form, if someone enters the following into the textbox:
<?php echo 'hi'; ?>
I see that the server will not execute it because of an error. What I would like it to do instead is somehow escape it into plain text and display it correctly. I have seen other sites being able to do this. I originally thought this could be solved by the addslashes() function, but that doesn't seem to work.
Thanks,
Phil
No. Use htmlspecialchars instead. Don't use addslashes.
To be more specific, addslashes bluntly escapes all instances of ', " and \ and NUL. It was meant to prevent SQL injection, but it has no real use in proper security measures.
What you want is to prevent the browser from interpreting tags as markup (which is entirely different from preventing SQL injection). For instance, if I want to talk about <script> elements, SO shouldn't simply send that string literally, because that would start an actual script (which can lead to cross-site scripting). Instead, some characters, especially < and >, need to be encoded as HTML entities so they're shown as angle brackets. The same is true for &, which would otherwise be interpreted as the start of an HTML entity.
In your case, output after htmlspecialchars would look like:
&lt;?php echo 'hi'; ?&gt;
Use htmlspecialchars before outputting anything provided by the user. But in this case, also make sure that you do not execute anything the user inputs. Do not use eval, include or require. If you save the user data to a file, use readfile or file_get_contents+htmlspecialchars instead of include/require. If you're using eval, change it into echo and so on.
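To make the encoding step concrete, here is a rough JavaScript analogue of the replacements htmlspecialchars performs. The function escapeHtml is a hypothetical helper written for illustration, not PHP's implementation. It escapes single quotes as well, which PHP does only when ENT_QUOTES is in effect (the default since PHP 8.1).

```javascript
// Hypothetical JavaScript analogue of the character replacements done by htmlspecialchars.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")  // ampersand first, so the entities added below are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#039;");
}

const submitted = "<?php echo 'hi'; ?>";
const safe = escapeHtml(submitted);
console.log(safe);
```

The browser renders the escaped string as the visible text the user typed, and nothing in it is treated as a tag.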

Showing plain PHP code in a HTML page

And I'm talking (especially) forums here - [PHP]code here[/PHP] - style. Some forums escape double quotes or other "dangerous characters" and others don't.
What is the best method? What are you guys using?
Can it be done without the fear of code injection?
Edit: Who said anything about reinventing the wheel?
When PHP echoes or prints text, it never executes it. That only happens with eval. This means that if you did this:
echo '<?php ... ?>';
it would carry through to the page output and not be parsed or executed.
This means that all you need to do is escape the usual characters (<, >, &, etc.) and you should generally be safe.
Don't reinvent the wheel. I see BBCode in your question. Grab a markdown library and use it instead. SO uses this: http://daringfireball.net/projects/markdown/
There is no fear of PHP code injection (unless you are doing something unusual like eval'ing HTML templates), but there is always a fear of JS code injection, often called XSS. All the danger comes from possible JS code.
Thus, PHP code shown on an HTML page needs no special treatment. Just treat it like any other data. The < and > brackets are usually escaped, for obvious reasons.
Don't reinvent the wheel. PHP has its highlight_string function for this.
If you see escaped quotes on some page, that's most likely because their script escaped them twice (for example magic_quotes did it once, then mysql_query() again). When data sanitisation is done properly, you should not see escape characters in output.

Fastcgi 500 error on preg_match_all in PHP

I'm trying to set up some exotic PHP code (I'm not an expert), and I get a FastCGI Error 500 on a PHP line containing 'preg_match_all'.
When I comment out the line, the page is returned with a 200 (but not how it was meant to be).
The code is parsing PHP, HTML and JavaScript content loaded from the database and is composing them to return the finished page.
By placing some error_log entries around the code, I determined that the line with preg_match_all is the cause of the 500. However, the line is hit multiple times while the page loads, and on other occasions it does not cause an error.
Here's exactly what it looks like:
preg_match_all ("/(<([\w]+)[^>]*>)((?:.|\n)*)(<\/\\2>)/",
$part['data'], $tags, PREG_PATTERN_ORDER|PREG_OFFSET_CAPTURE);
The subject string is a piece of text that looks like:
<script> ... some javascript functions ... </script>
Edit: This is code that is up and running correctly elsewhere, so this very well could be a PHP setting or environment difference. I'm using PHP 5.2.13 on IIS6 with FastCGI.
Edit: Nothing is mentioned in the log files. At least not in the ones I checked:
IIS Logs
Event Logs
PHP Log
Edit: jab11 has pointed out the problem, but there's no solution yet:
Any thoughts or direction would be welcome.
Any chance that $part['data'] might be extremely big?
I used to get 500 error on preg_match_all when I used it on strings bigger than 100 KB.
This is a wonderful example why it's a bad idea to process HTML with regular expressions. I'm willing to bet you're running into a Stack Overflow because the HTML source string is containing some unclosed tags, making the regex try all sorts of permutations in its futile attempt to find a closing tag (</\2>). In an HTML file of 32 KB, it's easy to throw your regex off the trolley. Perhaps the stack is a different size on a different server so it works on one but not the other.
A quick test:
I applied the regex to the source code of this page (after having removed the closing </html> tag). RegexBuddy promptly went catatonic for about a minute before then matching the <head> and <body> tags (successfully). Debugging the regex from <html> on showed that it took the regex engine 970257 steps to find out that it couldn't match.
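As a hedged sketch only, and in JavaScript rather than PHP: one common way to reduce this kind of backtracking is to replace the (?:.|\n)* group with a single character class and a lazy quantifier. The pattern and the helper findTags below are my own assumptions, not the original code. Note that the lazy quantifier stops at the first matching closing tag, whereas the original greedy pattern runs to the last one, so results can differ on nested tags of the same name.

```javascript
// Hypothetical helper: collect simple <tag>...</tag> pairs without the (?:.|\n)* alternation.
function findTags(html) {
  // [\s\S] matches any character including newlines; *? keeps the body as short as possible.
  const pattern = /<(\w+)[^>]*>([\s\S]*?)<\/\1>/g;
  const results = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    results.push({ tag: match[1], body: match[2] });
  }
  return results;
}

const found = findTags("<script>var a = 1;\nvar b = 2;</script>");
const unclosed = findTags("<div>abc");
console.log(found, unclosed);
```

This does not make regular expressions a sound way to parse HTML; for anything beyond small, well-formed fragments a real parser is the safer choice.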
