PHP character encoding issue - php

When I grab the title from my Word Press posts in code and pass them around as email, the punctuation gets a bit mangled.
For example "TAMAGOTCHI P’S LOVE & MELODY SET" comes out as "TAMAGOTCHI P’S LOVE & MELODY SET".
Any ideas how I prevent this?
Let me know if you need to see the specific code I'm currently using. (I'm not really sure if this is a WordPress issue, or a PHP issue.
EDIT
What happens is that this title is passed to a form via the query string. Then when the form is submitted, I take the string from the form field and email it.
So I guess I need to decode the html either before I pass it into the form field, or else before I email it.
EDIT 2
Weird, so I looked closer at the code and I'm already doing a urldecode before I pass the value into the form field
jQuery('#product_name').val("<?php echo urldecode(strip_tags($_GET['pname'])); ?>
Is there some default encoding happening when you serialize (for ajax formhandler)
var dataString = $(this).serialize();
EDIT 3
OK turns out the code is more complex. Title is also passed to some kind of wordpress session before it's hits the form. I'll figure it out where exactly I need to put urldecode. Thanks!

This is one WordPress "feature" I could do without.
Here's one down-n-dirty method to get the fancy quotes (or other entities) replaced:
$title = get_the_title( get_the_ID() );
$title = str_replace( '&#8217', "'", $title );
echo $title;
We could integrate deeper, by hooking into the_title, if you want this same de-entities functionality throughout the site. This code block would belong in your theme's functions.php file.
function reform_title($title, $id) {
$title = str_replace( '&#8217', "'", $title );
return $title;
}
add_filter('the_title', 'reform_title', 10, 2);

Im not really sure about wordpress, but the issue itself its that the text its coming out as URLENCODE instead of a UTF-8 or other encode.
You have two options
When you receive the text you never turn it back to normal encoding (Which is weird as usually is de-encoded by php when you access the $_GET or $_POST variables)
You are parsing the message with the urlencode() function.

Related

htmlspecialchars and ampersand in forms?

I have a little static function so that I can easily build html valid urls on my local website, it is below;
public static function url($path = false) {
// Build return url with special html characters escaped
return 'http://127.0.0.1/' . htmlspecialchars($path);
}
I have two urls one inside an anchor and another is inside a form action, they are below;
Root::url('test?category=' . $category . '&index=' . $index) // Href
Root::url('test?category=' . $_GET['category'] . '&index=' . $_GET['index']) // Form
GET === $, you can see inside my static function that I use htmlspecialchars to escape special html characters from my url.
The anchor one returns a valid link and works as expected. The form one however returns the following, as in when I click on the form submit, my url in my browser is as follows.
http://127.0.0.1/test?category=innate&index=0
Why is this? My website breaks because it is dependant on the GET parameters being valid.
Thanks for your time, hope this made sense.
EDIT
I insert the return value of the function call straight into my form action,
<form
action="<?= Root::url('test?category=' . $_GET['category'] . '&index=' . $_GET['index']); ?>"
method="post">
EDIT
The form html is as follows;
<form action="http://example.com/test?category=innate&index=0" method="post">
The anchor html is as follows
<a href="http://example.com/test?category=innate&index=0">
Could it be something to do with the server sending a POST request even though I have GET parameters?
EDIT #3
Ok so it has something to do with my function or what I am passing in, I hard typed in the url in the form submit and it worked, no problems, which means it can only be what my function is returning.
I myself cannot see what I may be!
ANSWER
After the form was being submitted, I was redirecting to the same page using header to counter form resubmission. The string for the header was being generated by Root::url().
Two hours this took me to figure out, but boy does it feel good!
Normally you wouldn't add a query string to a POST URL. It's not forbidden, though, it may only be somewhat confusing, especially if you use $_REQUEST (which you don't, it seems).
I don't know why your browser shows an uninterpreted &, it should interpret it.
Your problems are likely due to one of these:
a bad browser - try another one
bad content of the form input fields
other
This is quite logic.
I assume your url() method looks like this:
url($string){
echo htmlspecialchars($string);
}
Let's have a look at the $string you are passing:
'test?category=' . $_GET['category'] . '&index=' . $_GET['index'];
As I see in your output, replacing the values, the final string before htmlspecialchars() occur would be:
'test?category=innate&index=0' and after it: test?category=innate&index=0
What happened here? you first concatenated the string, and then htmlspecialchars()'ed the & used to separate the parameters. And to not break the url, you don't want to convert THAT '&'.
Also to sanitize the url you shouldn't use htmlspecialchars() because most html entities would convert to somthing like & + somename + ; for example the Euro symbol would convert to € and you don't want the actual & symbol in your url, the browsers will interpret it as you have another new parameter awaiting.
You should use urlencode(), which will convert your & into: %26 , also, the function's name is self-explanatory, it's encoding a string to use on a URL.
Still, you want the & to separate the parameters, but not in the $GET values. What should we do? to urlencode the values before concatenating the string. I would suggest a method like this one:
function url($page, $get){
$parameters = array();
foreach($get as $k => $v) $parameters[] = urlencode($k)."=".urlencode($v);
//We are concatenating with ? and & the urlencoded() values in the next line:
echo urlencode($page).'?'.implode('&', $parameters);
}
url('test', $_GET); // outputs: test?category=innate&index=0
This would get rid of the special chars from a form's field names and values.
I noticed you will use 2 fixed parameters, category and index, so the method could be like this:
function url($page, $get){
$page = urlencode($page);
$category = urlencode($get['category']);
$index = urlencode($get['index']);
echo "$page?category=$category&index=$index";
}
Hope this is what you needed

Dealing with plus signs showing up when using json_encode() in PHP

I am using PHP to save the values of a form as JSON into a cookie like so:
// set cookie with search values so we can use jQuery to repopulate the form
setcookie('jobSearchValues', json_encode($form_state['values']), 0, '/');
This works great and then on the JavaScript side I can use this to get at the values:
var jobSearchValues = JSON.parse($.cookie("jobSearchValues"));
$("#keywords").val(jobSearchValues.keywords);
Again this works great, but the problem is that when a value for one of the fields in the form has a space in it, the space gets replaced with a "+". So when the form gets repopulated the text field displays like this for example "hi+mom". Is there a better way to go about this? By the way, $form_state['values'] is a PHP array. There are 4 fields in the form that I am setting as JSON into the cookie.
Use setrawcookie( '<name>', rawurlencode( json_encode( $value ) ), ... ) and then manually url-decode & json-parse on the client side (with JSON.parse(decodeURIComponent(cookie)))
This is weird. json_encode is not supposed to replace spaces with +..
setcookie is probably urlencoding it.
You will have to urldecode it in javascript before using it.
Try this:
(taken from phpjs)
function urldecode(str) {
return decodeURIComponent((str+'').replace(/\+/g, '%20'));
}
and then
var jobSearchValues = JSON.parse($.cookie("jobSearchValues"));
$("#keywords").val(urldecode(jobSearchValues.keywords));

Post text from a text area

I have a form where a user types paragraphs into a text area and then it takes them to another page after they submit. How can I pass whatever they typed to the page after they submit? The text area might have linebreaks and if I use a query string to pass the data, it gives me an error. This is my current code to pass the field:
<?php
if(isset($_POST['form']))
{
$title = $_POST['title'];
$body = $_POST['body'];
header("SubmitForm.php?title=$title&body=$body");
?>
<html>
...html form...
It doesn't work when the text area has line breaks in it.
I would suggest installing a wysiwyg editor to make this easier for you, but i assume that would add some time for the learning curve.
The simplest tips I can give you is to set a CSS attribute for your textarea: white-space:pre so that when it gets submitted, all line breaks get sent as well.
On your server side, you would need to use the nl2br() function, so that when it gets saved on your DB or wherever you store them, all line breaks are converted to HTML breaks.
For your additional reference, I had a similar question like this last year.
You really shouldn't be putting anything that long in a query string in the first place. Look into using sessions to store data across pages instead.
(This is assuming I understood the question right)
urlencode the data in order to pass it via query string.

How save JavaScript and HTML in option without it being auto-escaped?

And there I thought I knew Wordpress well. It now seems that update_option() auto-escapes code. If I want to save some Javascript or HTML code in an option, this behavior renders the code unusable.
I refuse to do a str_replace on the returned value to filter out every backslash. There has to be a better way.
Here's the PHP for the text box to enter some code:
$option = unserialize(get_option('option'));
<textarea name="option[box]"><?php echo $option['box']; ?></textarea>
This is what happens after submitting the form (in essence):
update_option('option', serialize($_POST));
Any ideas?
Edit: I now got it to work by using PHP's stripslashes() where the script has to be rendered, and htmlentities(stripslashes()) in the text box to display the stored code. While this does the job, I'd still like to know if there is a better solution.
It now seems that update_option() auto-escapes code.
It only sanitizes the value for database entry. You'll find the real troublemaker is around line 750 in wp-settings.php, and the WP function add_magic_quotes().
Yep, you read that right, add magic quotes!
For some reason, WordPress decided to enforce magic quotes, so you'll always need to stripslashes on GET and POST when writing plugins and the like.
That's true #TheDeadMedic stripslashes must be used like;
echo stripslashes(get_option( 'option' ));

PHP form auto escaping posted data?

I have an HTML form POSTing to a PHP page.
I can read in the data using the $_POST variable on the PHP.
However, all the data seems to be escaped.
So, for example
a comma (,) = %2C
a colon (:) = %3a
a slash (/) = %2
so things like a simple URL of such as http://example.com get POSTed as http%3A%2F%2Fexample.com
Any ideas as to what is happening?
Actually you want urldecode. %xx is an URL encoding, not a html encoding. The real question is why are you getting these codes. PHP usually decodes the URL for you as it parses the request into the $_GET and $_REQUEST variables. POSTed forms should not be urlencoded. Can you show us some of the code generating the form? Maybe your form is being encoded on the way out for some reason.
See the warning on this page: http://us2.php.net/manual/en/function.urldecode.php
Here is a simple PHP loop to decode all POST vars
foreach($_POST as $key=>$value) {
$_POST[$key] = urldecode($value);
}
You can then access them as per normal, but properly decoded. I, however, would use a different array to store them, as I don't like to pollute the super globals (I believe they should always have the exact data in them as by PHP).
This shouldn't be happening, and though you can fix it by manually urldecode()ing, you will probably be hiding a basic bug elsewhere that might come round to bite you later.
Although when you POST a form using the default content-type ‘application/x-www-form-encoded’, the values inside it are URL-encoded (%xx), PHP undoes that for you when it makes values available in the $_POST[] array.
If you are still getting unwanted %xx sequences afterwards, there must be another layer of manual URL-encoding going on that shouldn't be there. You need to find where that is. If it's a hidden field, maybe the page that generates it is accidentally encoding it using urlencode() instead of htmlspecialchars(), or something? Putting some example code online might help us find out.

Categories