What is the benefit of \n and PHP_EOL in PHP? - php

I'm trying to output a newline character in PHP that gets viewed in a web browser. I can only manage it by using the <br /> tag.
When I use \n, nothing occurs, so what is the benefit of using \n? Also what is the benefit of PHP_EOL? When I concatenate it to a string, just a space is printed not a newline.

A web browser interprets the output of a PHP program as HTML, so \n and \r\n will not appear to do anything, just like inserting a newline in an HTML file. On the other hand, <br /> makes a new line in the interpreted HTML (hence "line BReak"). Therefore, <br /> will make new lines, whereas \r\n will not do anything.

The PHP_EOL constant holds the correct line ending for the platform PHP is running on: \r\n on Windows and \n on Linux and macOS (\r was the convention on classic Mac OS, before OS X). <br /> or <br>, by contrast, is the HTML markup for a line break. If you're new to HTML and PHP, it's better to get a grasp of HTML first, then worry about PHP. Or start reading some source code, and run other people's source code to see how they have done it. It will make your code better just by copying their style. (Most of the time.)
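To see which line ending PHP_EOL holds on your own machine, something like the following sketch can help. The function name describeEol is made up for this example.

```php
<?php
// Map a line ending to a readable name; anything unexpected is shown as hex.
function describeEol(string $eol): string
{
    $names = ["\n" => 'LF (\n)', "\r\n" => 'CRLF (\r\n)'];
    return $names[$eol] ?? 'other: 0x' . bin2hex($eol);
}

echo 'PHP_EOL on this platform is ', describeEol(PHP_EOL), "\n";
```

Run it from the command line rather than in a browser, so the output is not interpreted as HTML.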

When you are using PHP to make a web app, there are a few layers involved:
Your PHP code, which outputs some data to
a web server, which transmits the data over the network to
a web browser, which parses the data and displays it on the screen.
Note that in the above, it is just data that is being passed along. In your case, that data is HTML, but it could just as easily be plain text or even a PNG formatted image. (This is one reason why you send a Content-Type: header, to specify the format of your data.)
Because it is so often used for HTML, PHP has a lot of HTML-specific features, but that's not the only format it can output. So, while a newline character is not always useful for HTML, it is useful:
if you want to format the HTML you are generating, not for the web browser, but for another person to be able to read;
if you want to generate plain text or another format where newline characters do matter.
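As a rough illustration of the two cases above, here is a sketch that renders the same lines once as plain text and once as HTML. The function names renderAsText and renderAsHtml are invented for this example.

```php
<?php
// Join lines for a plain-text response: here the newlines are significant.
function renderAsText(array $lines): string
{
    return implode("\n", $lines) . "\n";
}

// Join the same lines for an HTML response: <br /> makes the visible break,
// and the "\n" after it only keeps the generated source readable for people.
function renderAsHtml(array $lines): string
{
    $escaped = array_map('htmlspecialchars', $lines);
    return implode("<br />\n", $escaped) . "\n";
}

$lines = ['first line', 'second & last line'];
echo renderAsText($lines);
echo renderAsHtml($lines);
```

In a real script the plain-text variant would normally be sent with a Content-Type: text/plain header.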

PHP_EOL is useful when you're writing data to a file, for example a log file. It will create line breaks specific to your platform.
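A minimal sketch of that log file case, assuming a helper called appendLogLine (a made-up name) and a temporary file as the log location:

```php
<?php
// Append one message per line, ending it with the platform's line break.
function appendLogLine(string $path, string $message): void
{
    file_put_contents($path, $message . PHP_EOL, FILE_APPEND | LOCK_EX);
}

// Hypothetical log location, chosen only for this demonstration.
$logFile = tempnam(sys_get_temp_dir(), 'log');
appendLogLine($logFile, 'first entry');
appendLogLine($logFile, 'second entry');

echo file_get_contents($logFile);
```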

Related

Character Encoding/decoding becomes a mess

In a web app I have a <div id="xxx" contentEditable=true > for editing. The result of encodeURIComponent(xxx.innerHTML) is sent via an Ajax POST request to a server, where a PHP script creates a simple txt file from it, which the user can then download to store locally or print. It works perfectly so far, but the character encoding is a mess. All special characters like the German Ä are interpreted wrongly, in this case as ä
I googled for some days and studied PHP functions like iconv(). I know how to set a browser's character encoding and how to set a text editor to the corresponding encoding. But nothing helps; it's still a mess, or it gets even weirder.
So my question is: where in this encoding/decoding round trip from the browser to the server and back to the browser do I have to do what, to ensure that an Ä stays an Ä?
I am answering my own question, because it turned out to be a different problem from the one stated above. The contenteditable is actually part of a section of HTML code. On the server side, I need to extract the contenteditable text with PHP, which I do via a DOMDocument like this:
$doc = new DOMDocument();
$doc->loadHTML($_POST["data"]);
then I access the elements and their textual content as usual.
Finally I save the text with
file_put_contents($txtFile, $plainText, LOCK_EX);
The saved text was then a mess, as described above. It turns out that you need to tell the DOMDocument which character set loadHTML() should use to interpret its input. In this case UTF-8.
First I did it as recommended in PHP this way :
$doc = new DOMDocument('1.0', 'UTF-8');
But that doesn't help (I wonder why). Then I found this answer on SO. And the final solution is this:
$doc->loadHTML('<?xml encoding="UTF-8">' . $_POST["data"]);
Though it works, it is a trick. The question remains how to do it the right way. If somebody has the definitive answer, it is very welcome.
You need to make sure that the content is encoded consistently throughout its roundtrip from user input to server-side storage and back to the browser again.
I would recommend using UTF-8. Check that your HTML document (which includes the contenteditable zone) is UTF-8 encoded, and that the XMLHttpRequest/Ajax request does not specify a different encoding when it sends the content to the server.
Check that your server-side application encodes the text file as UTF-8 also. And check that the HTTP response headers declare the file's encoding as UTF-8 when the file is requested and downloaded in the browser.
Somewhere along this path, the encoding differs, and that is what is causing the error. iconv converts between different encodings, which should not be necessary if everything is consistent.
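To illustrate how one mismatched step produces this kind of garbage, here is a small sketch using iconv. It assumes the iconv extension is available and deliberately does the wrong thing: it treats bytes that are already UTF-8 as ISO-8859-1.

```php
<?php
// "ä" as UTF-8 bytes, written with escapes so this file's own encoding does not matter.
$utf8 = "\xC3\xA4";

// Wrong step: treat the UTF-8 bytes as ISO-8859-1 and convert them "to UTF-8" again.
$garbled = iconv('ISO-8859-1', 'UTF-8', $utf8);

echo bin2hex($utf8), "\n";    // c3a4
echo bin2hex($garbled), "\n"; // c383c2a4, which a UTF-8 viewer displays as "Ã¤"
```

The same two-character result appears without any conversion call when a program simply displays UTF-8 bytes as if they were ISO-8859-1, which is why checking each hop for a consistent encoding matters more than adding conversions.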
Good luck!

Converting from DOMNodeList to string in PHP extra characters

I have converted results from a web scrape from DOMNodeLists to strings:
$node = $the_sentence->item(0);
$the_sentence = "{$node->nodeName} - {$node->nodeValue}";
However, now when I print out the result it includes whatever tag the text had in the page, as well as the &nbsp; character:
Before:
"This is the sentence"
Now:
"h2 - This is the Âsentence Â"
Any ideas how I can get rid of these characters? Thanks for any help.
This looks like a character set problem.
Have a look at the source page and see what character set it is encoded in. This might be in a Content-Type HTTP header, or it might be in a <meta> tag at the start of the document. Then, when you handle the data, make sure that everything you do handles it in the same format.
You probably want to store the data in UTF-8. Thus, if you capture in another format, in general it is a good idea to convert it from that charset to UTF-8; this will mean you can capture from a wide range of sources and store it in the same database. Look at iconv in the PHP manual if you wish to learn more about charset conversion.
Are you printing the output to console or a browser? If the former, note that some consoles (old versions of Windows in particular) do not handle UTF-8 well at all. If you are echoing to a browser, make sure your character set is set to "UTF-8" in your own HTML.
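One possible reading of the output above is that each Â comes from a non-breaking space, which in UTF-8 is the byte pair C2 A0 and shows up as "Â" plus a space when displayed as ISO-8859-1. Under that assumption, a sketch like this replaces the non-breaking spaces before printing. The function name cleanNodeText is made up for the example.

```php
<?php
// Replace UTF-8 non-breaking spaces (U+00A0, bytes C2 A0) with plain spaces,
// then trim the ends. Assumes $text is UTF-8, which is what DOM node values use.
function cleanNodeText(string $text): string
{
    return trim(str_replace("\xC2\xA0", ' ', $text));
}

$raw = "This is the\xC2\xA0sentence\xC2\xA0";
echo cleanNodeText($raw), "\n"; // This is the sentence
```

To drop the tag name as well, print only $node->nodeValue instead of combining it with $node->nodeName.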

php and newlines: what I need to know?

I have some questions about \r\n:
newlines are browser dependent? (not how they are displayed in a browser, but how <textarea> sends them to php via http request)
newlines are system dependent? (where php runs)
will php apply some implicit conversion?
will mysql apply some implicit conversion?
Thanks in advance!
newlines are browser dependent?
No. Use <br> to get a newline in a browser
newlines are system dependent? (where php runs)
yes : \n on OSX, \n on Unix/Linux, \r\n on Windows
will php apply some implicit conversion?
no
will mysql apply some implicit conversion?
no
Generally, for a browser \r and \n are whitespace characters, like ' ' (space) or \t (tab). Inside some tags (script, pre etc.) they are treated as line break symbols. In this case the browser will understand any of the common line break sequences (\r, \r\n, \n).
When data comes from textarea, line breaks will always be represented as \r\n.
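If you would rather work with a single convention on the server, a small normalizing step can be applied to the submitted value. This is only a sketch; normalizeNewlines and the field name in the comment are invented for the example.

```php
<?php
// Convert CRLF and lone CR to LF so later code deals with one convention.
function normalizeNewlines(string $text): string
{
    // Order matters: "\r\n" is replaced first so it does not become two LFs.
    return str_replace(["\r\n", "\r"], "\n", $text);
}

// Stand-in for a submitted textarea value such as $_POST['comment'] (hypothetical field name).
$submitted = "line one\r\nline two\r\n";
var_dump(normalizeNewlines($submitted) === "line one\nline two\n"); // bool(true)
```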
Line breaks in PHP files don't depend on the system they're running on. They depend on the settings of the editor used to create the PHP files. When you copy a PHP file to another system, its line break format will not change.
For example, look at this code:
print_r("
" === "\r\n");
Its result will depend on settings of the editor used for creating this file. It doesn't depend on current system.
But if you're reading other files stored on your system (text files, for example), these files will most probably use the system's usual line break format.
No, PHP and MySQL don't apply implicit conversions.
The system-independent way is to use the PHP_EOL constant.
Newlines are not browser dependent. Outside an element styled with CSS white-space: pre, you must call the nl2br() PHP function to convert newlines to <br> tags.
You may be interested in nl2br. It takes newline characters like the ones you described and inserts an HTML line break (<br />) before each of them.
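A short sketch of nl2br in use. Escaping with htmlspecialchars first is an addition of mine for text that comes from users, and textToHtml is a made-up name.

```php
<?php
// Escape the text first, then add <br /> before each newline for display in HTML.
function textToHtml(string $text): string
{
    return nl2br(htmlspecialchars($text));
}

echo textToHtml("1 < 2\nsecond line"), "\n";
// 1 &lt; 2<br />
// second line
```

Note that nl2br keeps the original newline after the <br /> it inserts, so the HTML source stays readable.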
A big gotcha for me was that in single quoted strings 'like\nthis' escape sequences (like \n) will not be interpreted. You have to use double quotes "like\nthis" to get an actual newline.
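The difference is easy to check by counting characters:

```php
<?php
$single = 'like\nthis'; // backslash and "n" stay two literal characters
$double = "like\nthis"; // \n becomes one line feed character

echo strlen($single), "\n"; // 10
echo strlen($double), "\n"; // 9
```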
<br> is browser independent, \n should be too.
Don't know about \r
MySQL won't convert it

browsers do not read new line in a php file

I have a password-protected text file. To protect it, I used a password protector script (which works great), but it required me to rename the text file to .php on my server. This went fine; however, when I open this text file in any browser on Windows, I no longer see any new lines (I used to see them).
I tried writing -"\n", "\r", "\r\n". I think it has to do with the browser thinking its a .php file i guess.
This is because the server is sending a different MIME type. It is now sending text/html (the default type returned by PHP) rather than text/plain.
Your browser is then expecting HTML. Line breaks are just like any other white space in HTML, so they are essentially meaningless for what you are trying to do.
You can use this to fix it:
header('Content-Type: text/plain');
Be sure to put that at the top of your code, or at least before you output anything.
This causes the server to send the MIME type you are expecting.
By default the output of PHP scripts is rendered as HTML, which means that whitespace is folded. If you want to change this back to text then you need to set the Content-Type header to "text/plain", either in the web server or via the header() function.
That's because the browser would see the content as html and in html a newline is just a whitespace
I am not sure if I understood your question properly, but in two cases there is a solution:
-You output the text: In this case, you have to use
<br>
-You want to write it with new lines in the file: Use the PHP-Constant
PHP_EOL
which means End-of-Line. This always inserts the correct line break for the platform.
You need to use <br>, as html is being rendered in the browser.
Browsers use HTML to format text (contained in HTML, of course)
use <br> or <p>
Also considering your file is a .php it's a normal behaviour that your webserver will send it as text/html

How to ignore PHP break lines with POEdit parser?

I am currently translating my PHP application using gettext with POEdit. Since I respect the print margin in my source code, I was used to writing strings like that:
print $this->translate("A long string of text
that needs to follow the print margin and since
php outputs whitespaces for every break line I do
my sites renders correctly.");
However, in POEdit, as expected, the linebreaks are not escaped to whitespaces.
A long string of text\n
that needs to follow the print margin and since\n
php outputs whitespaces for every break line I do\n
my websites render correctly.\n
I know one approach would be to close the strings when changing lines in the source code like that:
print $this->translate("A long string of text " .
"that needs to follow the print margin and since " .
"php outputs whitespaces for every break line I do " .
"my sites renders correctly. ");
But that approach does not scale for me when the text needs to change and the print margin still has to be respected, unless NetBeans (the IDE I use) can do that for me automatically, just like Eclipse does for Java.
So in conclusion, is there a way to tell the POEdit parser to escape linebreaks as whitespaces in the preferences?
I know that the strings are still translatable even though the linebreaks are not escaped. I'm asking so that my translator (sometimes even the customer/user) will not be confused into thinking he needs to duplicate the linebreaks while translating in POEdit.
You have to make sure that you're using the right line breaks in your script and your app
LF: Line Feed, U+000A
FF: Form Feed, U+000C
CR: Carriage Return, U+000D
CR+LF: CR (U+000D) followed by LF (U+000A)
NEL: Next Line, U+0085
LS: Line Separator, U+2028
PS: Paragraph Separator, U+2029
On Windows systems (and MS-DOS) the line break is CR+LF, on "Unix-like" systems it is LF, and on 8-bit Commodore machines it is CR.
You have to make sure that the source location contains the same type of feeds to your edit location.
Your server handles its line breaks differently from the host that the editor is running on. Double-check this and develop some means of automatically replacing the Unicode characters depending on your OS.
Since you say that you are "translating my PHP application using gettext with POEdit", I would create a script that goes through all your files via shell/DOS/PHP and automatically converts the line break characters to the type used by the system you're running on.
So if you're working on Windows, you would search for all characters that are U+000A and replace them with U+000D U+000A.
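The conversion step of such a script could look roughly like this in PHP. It is only a sketch of the string handling, with a made-up function name, and leaves out walking the directory tree and writing the files back.

```php
<?php
// Rewrite every line ending in $text (CRLF, lone CR, or LF) to $target.
// CRLF is listed first in the pattern so it is matched as one unit.
function convertLineEndings(string $text, string $target = PHP_EOL): string
{
    return preg_replace('/\r\n|\r|\n/', $target, $text);
}

$mixed = "one\ntwo\r\nthree\rfour";
var_dump(convertLineEndings($mixed, "\r\n") === "one\r\ntwo\r\nthree\r\nfour"); // bool(true)
```

Matching all three forms, rather than only U+000A, avoids turning an existing CR+LF into CR+CR+LF when a file is converted twice.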
