PHP adding empty text node before including html file - php

I'm using PHP as a templating engine, and I've noticed that when I include a view file, an empty text node is added before the content of that view.
For example, I have an HTML file I want to include that has the following content:
<p>Some text</p>
Then I include that file like this:
<div><?php require_once('file/path.htm'); ?></div>
(Notice that I've removed any spaces between the div and the PHP tag.) After PHP includes the file, it adds an empty text node (which I'll mark as "") that adds space before the p tag, so I get something like this:
Some previous content...
<div>
"" //empty text node
<p>Some text</p>
</div>
This is quite problematic since it ruins content composition. Is there any solution to this?

FSou1 has it right: it's the charset. It can also be solved by saving the file as UTF-8 without BOM:
Open your PHP include file in Notepad++ (download here: http://notepad-plus-plus.org/)
Select Encoding --> Encode in UTF-8 without BOM
The empty nodes disappear. Hope that helps someone. This was driving me crazy.
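If re-saving every file by hand is not practical, the same fix can be sketched in PHP. This is only a minimal sketch: the function names strip_utf8_bom() and remove_bom_from_file() are made up for illustration, and the code assumes PHP 7 or later and that the BOM in question is the three-byte UTF-8 one (EF BB BF).

```php
<?php
// Hypothetical helper: drop a leading UTF-8 BOM (bytes EF BB BF) from a string.
function strip_utf8_bom(string $contents): string
{
    $bom = "\xEF\xBB\xBF";
    if (strncmp($contents, $bom, 3) === 0) {
        return substr($contents, 3);
    }
    return $contents;
}

// Hypothetical helper: rewrite a file in place, but only if it starts with a BOM.
// Returns true when the file was changed, false otherwise.
function remove_bom_from_file(string $path): bool
{
    $contents = file_get_contents($path);
    if ($contents === false) {
        return false; // file could not be read
    }
    $clean = strip_utf8_bom($contents);
    if ($clean === $contents) {
        return false; // no BOM present, nothing to do
    }
    return file_put_contents($path, $clean) !== false;
}
```

Calling remove_bom_from_file('file/path.htm') once on the included view should have the same effect as the Notepad++ steps above. Back up the file first if in doubt.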

I just had the same problem and was lucky enough to find the answer: it's the charset. It may seem strange, but when you save your file as UTF-8, you get an empty text node in your markup. When the file is saved as cp1251, the problem goes away.

This was my second issue caused by a BOM (both took over an hour of debugging, Googling and hair-pulling).
I just found this small (Windows-only) drag-and-drop program that checks files for a BOM and can remove it:
File BOM Detector by Brynt Younce
Softpedia.com/get/System/File-Management/File-BOM-Detector.shtml
Small, easy and simple. There also seems to be a PHP solution that works on all platforms, but I have not tested it.
Take a look if interested:
Github.com/emrahgunduz/BomCleaner

Related

body in a php file

I created a custom index.php for a WordPress theme by simply renaming the .html file to .php. Everything seems to work fine, except that extra characters are printed when I load the page.
These characters are printed at the start of the body area in the browser: " --> "
I am confused about where these characters come from. I can create a .php file with complete HTML contents, right? Or do I need to make some modification?
<!--this is a HTML comment line -->
If you forgot to delete the closing --> characters after deleting the first part of a comment like this, that might be what you are seeing. We cannot know without seeing your code.
To answer the last question: yes, you can mix PHP and plain HTML. Whenever you are writing PHP, your code must be within
<?php ... CODE HERE ... ?>
Inline php however is not a good programming pattern in my opinion.

Using strip_tags() and preg_replace() to display text entered in a WYSIWYG/TinyMCE Text Editor

Good morning,
Here's the problem:
I have some text being entered via a text editor (WYSIWYG/TinyMCE) and displayed elsewhere as a posting. The problem is that the text loses its formatting when displayed as a posting. After digging through the code, I discovered that this was being done with a strip_tags() + echo preg_replace() combo. I'm still new to PHP, but I was able to figure out:
strip_tags() was taking out the formatting (b/c that's how it rolls)
I could add <strong> and <em> to get the bold and italicized text to display
the underlined and strikethrough text are CSS styles and adding the code (as it is saved on the db table) to the strip_tags() list did NOT solve the problem
My question is: can I modify the existing code to solve this, or should I use something else (htmlentities() perhaps)?
EDIT: I tried htmlentities and it failed.
EDIT: I added just the tag and the problem is 50% solved. My text is underlined, but it sits lower than the non-underlined text that comes after it. It's as if the underlined text is being treated as subscript or something.
code snippet:
<div class="display_text_area">
<?php $text = strip_tags(str_ireplace("</p>", "</p><br/>",
$text_detail->description),
'<font><ul><li><br/><strong><em><span style="text-decoration: underline;">'); ?>
<?php echo preg_replace('/(<br[^>]*>\s*){2,}/', '<br/>', $text); ?>
</div>
I'm leaving the tag here to show that (a) I tried it, and (b) it didn't work. So (c) I know it needs to be removed or modified.
Many thanks in advance.
The point is that TinyMCE returns nominally valid rich HTML that doesn't need stripping or escaping before being used in an HTML page. However, you can't assume that the TinyMCE editor is actually running on the client: you might be exploited by someone who simply posts a response directly that contains an XSS attack.
IIRC, TinyMCE returns XHTML by default. You need to ensure that any returned HTML is correct using a library such as HTML Purifier.
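As a side note on why the attempt in the question did not work: the second argument of strip_tags() is a list of bare tag names, so an entry that carries attributes (or a self-closing form like <br/>) does not match anything. A minimal sketch follows, with keep_basic_formatting() as a made-up name and the tag list chosen only for illustration. It is not a sanitizer, since strip_tags() leaves the attributes of allowed tags untouched, which is why a library such as HTML Purifier is still needed for untrusted input.

```php
<?php
// Hypothetical helper: keep a small set of formatting tags, strip everything else.
// The allow-list names tags only; attributes cannot be listed here.
function keep_basic_formatting(string $html): string
{
    $allowed = '<ul><li><br><strong><em><span>';
    return strip_tags($html, $allowed);
}

$input = '<p>Plain <span style="text-decoration: underline;">underlined</span> text</p>';

// The <p> wrapper is removed; the <span> and its style attribute survive.
echo keep_basic_formatting($input);
```

Because attributes on allowed tags pass through unchanged, something like an onclick attribute on a <span> would also survive, so this only addresses the lost formatting, not the XSS concern raised above.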

Print text with original spacing

I have a problem with the following:
I want to make a page that gets a file (I upload it), reads it and outputs it in an html file.
I am uploading the file and saving the contents in a MySQL DB just fine, but when I show it again, I don't have any <br />'s there (maybe because the file has \t, \n or something similar).
How can I make it show it like it was originally written. (In the DB I see it with the fine spacing).
You probably want nl2br(). It will transform all line breaks to <br>s
You can either wrap the text in <pre></pre> tags to display it as is or, better yet, use the nl2br() function to insert HTML line breaks (<br />) before every newline (\r\n, \n or \r).
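A minimal sketch of the nl2br() approach, assuming the stored text is plain text that should not be interpreted as HTML. The function name render_with_line_breaks() is made up for illustration, and the escaping step is an addition that the answers above do not mention.

```php
<?php
// Hypothetical helper: escape plain text for HTML, then mark its line breaks.
function render_with_line_breaks(string $text): string
{
    // Escape first so that characters like < and & in the file are shown literally,
    // then insert <br /> before each newline.
    return nl2br(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
}

echo render_with_line_breaks("first line\nsecond line");
```

Note that nl2br() only deals with line breaks. Runs of spaces and tabs still collapse in the browser unless the output is wrapped in <pre> or styled with the CSS white-space property.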
Are you sure the problem isn't just in the HTML? Multiple whitespaces convert to one in web browsers. In modern browsers, you can use the CSS white-space property to prevent that.
body { white-space: pre; }
Alternatively, you could wrap that section of HTML in a <pre> element, or you could hardcode extra spaces into
When you store the file data in the database, encode it using htmlentities(), and when you display it, decode it using html_entity_decode().

php include causes unwanted newline

(PHP 5.3.6)
I have a php file which contains simply this text - there are no php tags, no trailing newline or extraneous whitespace anywhere:
<div style="border:1px solid green">abc</div>
Now including this from another php file as follows (again, with no extraneous whitespace anywhere):
<div style="border:1px solid red"><?php include "abc.php" ?></div>
<br />
<div style="border:1px solid red"><div style="border:1px solid green">abc</div></div>
I get the result below.
Note that the second method just uses the included content directly. These should both look like the lower one, but as you can see, the include causes some weird kind of newline to be inserted before the content of the included file. I say 'weird' because when I check the output source (via Chrome's View Source) there is nothing visible there:
When this section of the page is shown in Chrome's element inspector, there seems to be something there but what exactly it is I can't tell:
It appears to be simply an empty string, but why an empty string would cause newlines and why it would be there in the first place are a mystery. Adding a semicolon to the end of the include statement makes no difference. It occurred to me that it might be a null byte or a 13 (CR) but that should still not cause an HTML line break.
Does anyone know how I can get rid of this unwanted newline?
Check the encoding of the included abc.php - does it have a Byte-Order Mark (BOM)? If so, remove it (good code editors allow you to change the file encoding in the Save dialog), that could be the culprit.
Relying on the Chrome inspector to check the raw output is not a good idea, as the tree is formatted. Using View Source is slightly better.
It's most probably not related to include() itself. The first step is to open the included file with a hex editor and check whether it really contains nothing but the expected markup. As Jens Roland pointed out, it can contain a BOM, for example, which is hidden by most text editors.
You can also generate a raw abc.php file with this code and test your code against it:
file_put_contents('abc.php', 'abc');
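If no hex editor is at hand, the same check can be approximated from PHP by dumping the first bytes of the file as hex. This is only a sketch: leading_bytes_hex() and has_utf8_bom() are made-up names, PHP 7 or later is assumed, and only the UTF-8 BOM (EF BB BF) is tested for.

```php
<?php
// Hypothetical helper: return the first $count bytes of a file as a lowercase hex string.
function leading_bytes_hex(string $path, int $count = 3): string
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    $bytes = fread($handle, $count);
    fclose($handle);
    return bin2hex($bytes === false ? '' : $bytes);
}

// Hypothetical helper: true if the file starts with the UTF-8 byte order mark.
function has_utf8_bom(string $path): bool
{
    return leading_bytes_hex($path, 3) === 'efbbbf';
}
```

For a file created by the file_put_contents() call above, leading_bytes_hex('abc.php') gives 616263 (the bytes of "abc"), whereas a file saved as UTF-8 with BOM starts with efbbbf.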
I had the same problem, and it turned out to be caused by the editor. Open the included file in Notepad or another text editor and save it again; you should see the change.

New line formatting when using HTML file as Word file?

I'm writing a PHP application for a client who needs an HTML page I've already created to be "exported" as a Word file. Simply put, this is how it's done:
if (isset($_GET["word"])) {
header("Content-type: application/vnd.ms-word");
header("Content-Disposition: attachment;Filename=some_file.doc");
}
This, of course, will be called if a "word" flag is located in the page querystring, e.g.:
whateverpage.php?somequery=string&someother=test&word
Anyway, despite how complex this HTML page is, it transfers pretty well to a nicely formatted Word file just by changing the content type. The only problem I'm having is that line breaks (HTML <br> tags) aren't rendered properly. E.g., in my HTML, if I have something that looks like
Aug
01
with a BR between the lines, it always ends up showing
Aug 01
in the generated Word file.
I've done some Googling and lots of tests with various other things but nothing seems to format properly with a simple new line.
Does anyone know how to properly format a new line character in a Word file that's being created from an HTML file?
Any help is greatly appreciated.
Edit:
I've tried wrapping the said line in a P tag, ala:
<p>Aug<br>01</p>
Without luck. I've also tried making a basic document in Word, saving it as an HTML file and looking at the generated (i.e., sloppy) Word HTML source. There is some CSS in there that I thought might give me a clue, but I tried everything and nothing seemed to work properly. Word seems to add an 'MsoNormal' class to wrapped paragraphs. I tried adding this, but it just removes any font formatting I had and doesn't help. Here is the CSS Word creates itself:
p.MsoNormal, li.MsoNormal, div.MsoNormal
{mso-style-unhide:no;
mso-style-qformat:yes;
mso-style-parent:"";
margin-top:0cm;
margin-right:0cm;
margin-bottom:10.0pt;
margin-left:0cm;
line-height:115%;
mso-pagination:widow-orphan;
font-size:11.0pt;
font-family:"Calibri","sans-serif";
mso-ascii-font-family:Calibri;
mso-ascii-theme-font:minor-latin;
mso-fareast-font-family:Calibri;
mso-fareast-theme-font:minor-latin;
mso-hansi-font-family:Calibri;
mso-hansi-theme-font:minor-latin;
mso-bidi-font-family:"Times New Roman";
mso-bidi-theme-font:minor-bidi;
mso-fareast-language:EN-US;}
I had this same problem, I was tagging my line breaks like so:
<br/>
When I changed it to just
<br>
Then my line breaks started working.
Your problem is probably due to the fact that when you switch the content type to a Word document, the browser doesn't render it as HTML. My guess is that you need to add a newline to the Word document if you want a line break.
How to insert this line break? I'm not sure, but you could always try:
echo "Aug\r\n01";
Where \r\n are the newline characters.
How about, if you want to maintain a line-break, just echo "<p>Aug</p><p>01</p>"; it ain't pretty, but it should effect the line break you're looking for.
