I have the following code in index.html:
<div class="button">
Title
</div>
I'd like to save "ridiculously long string" in a text file, referenced by index.html. Is this possible?
I tried replacing the string like so the following, but it doesn't work: php reference: file_get_contents()
<div class="button">
Title
</div>
Errors symptoms: the button on my page now reads title="title">Title and clicking it takes me to a 404: The requested URL /~user/html_root/< was not found on this server.. index.html and text.txt are in the html_root directory.
Here's how one of the shorter text.txts read:
?autoplay=0&trail=0&grid=1&colors=1&zoom=1&s=%5B{%228%22:%5B60,61,98,103,109,115%5D},{%229%22:%5B60,61,77,78,97,99,102,104,108,110,114,116%5D},{%2210%22:%5B76,79,98,103,105,109,111,115,117%5D},{%2211%22:%5B76,79,104,110,112,116,118%5D},{%2212%22:%5B60,61,63,64,77,78,111,117%5D},{%2213%22:%5B60,61,63,64%5D},{%2219%22:%5B76,77,79,97,98,102,103,108,109,114,115%5D},{%2220%22:%5B76,78,79,97,99,102,104,108,110,114,116%5D},{%2221%22:%5B98,103,105,109,111,115,117%5D},{%2222%22:%5B104,110,112,116,118%5D},{%2223%22:%5B61,111,117%5D},{%2224%22:%5B60,62,76,77%5D},{%2225%22:%5B60,62,75,78%5D},{%2226%22:%5B61,76,79%5D},{%2227%22:%5B77,78,96,97,102,103,109,110,115,116%5D},{%2228%22:%5B96,98,102,104,109,111,115,117%5D},{%2229%22:%5B61,65,97,98,103,105,110,112,116,118%5D},{%2230%22:%5B60,62,64,66,104,105,111,113,117,119%5D},{%2231%22:%5B60,62,64,66,75,76,112,113,118,120%5D},{%2232%22:%5B61,65,75,78,119,120%5D},{%2233%22:%5B77,78%5D},{%2237%22:%5B78,79%5D},{%2238%22:%5B77,79%5D},{%2239%22:%5B77%5D},{%2240%22:%5B60,61,63,64,75,77%5D},{%2241%22:%5B61,63,75,76%5D},{%2242%22:%5B61,63%5D},{%2243%22:%5B60,61,63,64,114%5D},{%2244%22:%5B78,79,84,85,92,93,95,113,115%5D},{%2245%22:%5B79,84,86,92,93,95,96,97,104,112,115%5D},{%2246%22:%5B78,86,98,103,105,111,113,114%5D},{%2247%22:%5B75,77,86,87,92,93,95,96,97,102,105,110,112%5D},{%2248%22:%5B75,76,93,95,103,104,109,112%5D},{%2249%22:%5B93,95,110,111%5D},{%2250%22:%5B94%5D}%5D
I thought changing text.txt to a more benign URL might help debugging. I changed text.txt to https://www.google.com/ and get the same 404.
I could implement a javascript solution. There's already js on this webpage. But it's controlled by a colleague and I'd prefer to try a stand alone solution first. Many thanks to anyone who can help!
Anytime you want to inject arbitrary data into HTML, you need to wrap it with htmlspecialchars() so that any reserved characters are escaped. Additionally, you actually need to surround attribute values with quotes or you're going to be generating invalid HTML.
Title
Really though, "ridiculously long string" is questionable anyway. I assume you're using some huge data URI? If so, consider not doing that, as there are limits you'll run into and it's not efficient to base64-encode things.
In a contact form, if someone enters the following into the textbox:
<?php echo 'hi'; ?>
I see that the server will not execute it because of an error. What I would like it to do is instead, somehow escape it into plain text and display it correctly. I have seen other sites been able to do this. I originally thought this could be solved by the addslashes() function, but that doesn't seem to work.
Thanks,
Phil
No. Use htmlspecialchars instead. Don't use addslashes.
To be more specific, addslashes bluntly escapes all instances of ', " and \ and NUL. It was meant to prevent SQL injection, but it has no real use in proper security measures.
What you want is preventing the browser to interpret tags as is (and that's entirely different from preventing SQL injections). For instance, if I want to talk about <script> elements, SO shouldn't simply send that string literally, causing to start an actual script (that can lead to Cross-site scripting), but some characters, especially < and >, need to be encoded as HTML entities so they're shown as angle brackets (the same is true for &, that otherwise would be interpreted as the start of an HTML entity).
In your case, output after htmlspecialchars would look like:
<?php echo 'hi'; ?>
Use htmlspecialchars before outputing anything provided by the user. But in this case, also make sure that you do not execute anything the user inputs. Do not use eval, include or require. If you save the user data to a file, use readfile or file_get_contents+htmlspecialchars instead of include/require. If you're using eval, change it into echo and so on.
And I'm talking (especially) forums here - [PHP]code here[/PHP] - style. Some forums escape double quotes or other "dangerous characters" and others don't.
What is the best method? What are you guys using?
Can it be done without the fear of code injection?
Edit: Who said anything about reinventing the wheel?
When PHP echo or print text, it never executes it. That only happens with eval. This means that if you did this:
echo '<?php ... ?>';
it would carry through to the page output and not be parsed or executed.
This means that all you need to do is escape the usual characters (<, >, &, etc.) and you should generally be safe.
Don't reinvent the wheel. I see BBCode in your question. Grab a markdown library and use it instead. SO uses this: http://daringfireball.net/projects/markdown/
There is no fear of PHP code injection (unless you are doing some unusual things like eval'ing HTML templates) but always a fear of JS code injection, often called XSS. And all danger coming only from possible JS code.
Thus, there is no special treatment for the PHP code, shown on a HTML page. Just treat it as any other data. < > brackets usually being escaped, for obvious reason.
Don't reinvent the wheel. PHP has it's highlight_string function for this
If you see escaped quotes on some page, that's most likely because their script escaped them twice (for example magic_quotes did it once, then mysql_query() again). When data sanitisation is done properly, you should not see escape characters in output.
I'm trying to set up some exotic PHP code (I'm not an expert), and I get a FastCGI Error 500 on a PHP line containing 'preg_match_all'.
When I comment out the line, the page is returned with a 200 (but not how it was meant to be).
The code is parsing PHP, HTML and JavaScript content loaded from the database and is composing them to return the finished page.
Now, by placing around some error_log entries I could determine that the line with the preg_match_all is the cause of the 500. However the line is hit multiple times during the loading of the page and on other occasions, the line does not cause an error.
Here's how it looks like exactly:
preg_match_all ("/(<([\w]+)[^>]*>)((?:.|\n)*)(<\/\\2>)/",
$part['data'], $tags, PREG_PATTERN_ORDER|PREG_OFFSET_CAPTURE);
The subject string is a piece of text that looks like:
<script> ... some javascript functions ... </script>
Edit: This is code that is up and running correctly elsewhere, so this very well could be a PHP setting or environment difference. I'm using PHP 5.2.13 on IIS6 with FastCGI.
Edit: Nothing is mentioned in the log files. At least not in the ones I checked:
IIS Logs
Event Logs
PHP Log
Edit: jab11 has pointed out the problem, but there's no solution yet:
Any thoughts or direction would be welcome.
Any chance that $part['data'] might be extremely big?
I used to get 500 error on preg_match_all when I used it on strings bigger than 100 KB.
This is a wonderful example why it's a bad idea to process HTML with regular expressions. I'm willing to bet you're running into a Stack Overflow because the HTML source string is containing some unclosed tags, making the regex try all sorts of permutations in its futile attempt to find a closing tag (</\2>). In an HTML file of 32 KB, it's easy to throw your regex off the trolley. Perhaps the stack is a different size on a different server so it works on one but not the other.
A quick test:
I applied the regex to the source code of this page (after having removed the closing </html> tag). RegexBuddy promptly went catatonic for about a minute before then matching the <head> and <body> tags (successfully). Debugging the regex from <html> on showed that it took the regex engine 970257 steps to find out that it couldn't match.