php convert foreign language chars to "string" - php

UPDATE: Please ignore this question. It appears that md5() is not
returning a result because I pass the URL through filter_var($url,
FILTER_SANITIZE_URL), and it looks like FILTER_SANITIZE_URL doesn't work
for foreign characters.
I have a problem where I want to get a hash from URLs, e.g.
https://ko.wikipedia.org/wiki/추간판_탈출증
The URL is provided by the user in a form, so I assume it's already UTF-8 since my website is UTF-8.
However, the above cannot be used with md5() as it returns an empty result. Which PHP function should I use to convert it to something like the URL below, so that md5() can be used?
https://ko.wikipedia.org/wiki/%EC%B6%94%EA%B0%84%ED%8C%90_%ED%83%88%EC%B6%9C%EC%A6%9D
I tried iconv, htmlspecialchars and htmlentities, and I cannot seem to find the right function to convert the string.

You can use md5() directly to hash the whole URL, as below:
echo md5('https://ko.wikipedia.org/wiki/추간판_탈출증');
Which gives the output:
26eb333445f4e154f8ecb76e7c2ac858
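UPDATED:
As per w3schools on FILTER_SANITIZE_URL:
The FILTER_SANITIZE_URL filter removes all illegal URL characters from
a string.
This filter allows all letters, digits and
$-_.+!*'(),{}|\^~[]`"><#%;/?:@&=
Non-ASCII characters such as the Korean letters in your URL are not in this set, so the filter strips them.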

The function you are looking for is rawurlencode(). However, you will have to extract the part of the URL you want to encode; otherwise the whole URL, including its delimiters, will be encoded.
$encoded = rawurlencode('추간판_탈출증');
// value of $encoded is now '%EC%B6%94%EA%B0%84%ED%8C%90_%ED%83%88%EC%B6%9C%EC%A6%9D'
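If the whole URL has to stay intact, one option is to percent-encode only the bytes outside printable ASCII instead of extracting the path by hand. The sketch below is an assumption about how that could look, not part of the answer above; encode_non_ascii() is a hypothetical helper name, and the source file is assumed to be saved as UTF-8.

```php
<?php
// Hypothetical helper: percent-encode every byte outside printable ASCII,
// leaving the URL's existing delimiters (:, /, ?, &, =, %) untouched.
function encode_non_ascii(string $url): string
{
    // No /u modifier, so the pattern matches single bytes of the UTF-8 string.
    return preg_replace_callback(
        '/[^\x21-\x7E]/',
        fn(array $m): string => rawurlencode($m[0]),
        $url
    );
}

$url = 'https://ko.wikipedia.org/wiki/추간판_탈출증';
$encoded = encode_non_ascii($url);
echo $encoded, PHP_EOL;

// The encoded form contains only characters FILTER_SANITIZE_URL allows,
// so sanitizing no longer strips anything before hashing.
$sanitized = filter_var($encoded, FILTER_SANITIZE_URL);
echo md5($sanitized), PHP_EOL;
```

Because only non-ASCII bytes are touched, a URL that is already percent-encoded passes through unchanged, so the helper can be applied to user input more than once without double encoding.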

Use the urlencode() function for the conversion.
The PHP manual lists the available URL encoding functions.

Related

php simple encode decode function using only (0-9 A-Z a-z)

I'm setting up a PHP email tracking system that uses url parameters to track link click throughs. Something like:
www.example.com?trackToken=10
I'm looking for a simple PHP encode / decode function I can put in place that will take a number (in this case 10) and convert it strictly to numbers and letters, something like:
www.example.com?trackToken=7aj8nG93nDpw9M9Nk1
I have found several variations of encrypt / decrypt functions using mcrypt. However, the encrypted output always ends up containing strange characters. These strange characters make it hard for my email messages to be sent/delivered.
Does anyone know of a good encrypt function that only outputs numbers 0-9 and letters a-z or A-Z? Additionally, I'm looking for a decrypt function to complement the encrypt function so I can actually use it.
I'm not looking for something super secure here. Just a way to mask the actual tracking token so the user can't change it on their own.
Base64 should be fine in any modern system - and any system handling email in PHP fits the definition of "modern". There is absolutely no reason I can think of to limit to just alphanumerics. The only catch is that as a URL parameter you don't want to have a '+' or '/' in the string. There is base64url to solve this problem but that doesn't have a standard PHP function. You can easily replicate that by using base64_encode() and str_replace() and to decode str_replace followed by base64_decode():
$coded = str_replace('+','-',str_replace('/','_',base64_encode($original)));
$original = base64_decode(str_replace('_','/',str_replace('-','+',$coded)));
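The same substitution can be wrapped in a pair of small functions. This is a minimal sketch; base64url_encode() and base64url_decode() are hypothetical names (PHP has no built-ins by those names), and dropping the '=' padding is an extra step not in the answer above.

```php
<?php
// Hypothetical helpers: base64url is plain Base64 with '+' and '/'
// swapped for '-' and '_', here with the '=' padding dropped as well.
function base64url_encode(string $data): string
{
    return rtrim(strtr(base64_encode($data), '+/', '-_'), '=');
}

function base64url_decode(string $token): string
{
    // base64_decode() tolerates missing padding in its default non-strict mode.
    return base64_decode(strtr($token, '-_', '+/'));
}

$token = base64url_encode('10');
echo $token, PHP_EOL;                     // token to put in ?trackToken=
echo base64url_decode($token), PHP_EOL;   // original value
```

Keep in mind that this only obscures the value: anyone who recognizes Base64 can decode and alter it.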

PHP parse_str and special characters

I'm using parse_str to get a raw value from a URL (which is obviously entered by the user), and I'm wondering if there's anything I should do to make it safe before I use it (i.e. convert special characters like '<').
I noticed that the function does remove some characters, but I couldn't find the specifics anywhere.
Thanks.
You can use htmlentities() and then the parse_str() or parse_url() function.
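A different arrangement from the one suggested above, offered only as a sketch: parse first, keep the raw value, and escape at the point where it is written into HTML. The query string here is a made-up sample, not taken from the question.

```php
<?php
// Sample query string standing in for user input (an assumption for illustration).
$query = 'name=%3Cb%3EBob%3C%2Fb%3E&page=2';

// parse_str() URL-decodes each value, so $params['name'] holds raw markup.
parse_str($query, $params);

// Escape only when the value is written into an HTML page.
$safeName = htmlspecialchars($params['name'], ENT_QUOTES, 'UTF-8');
echo $safeName, PHP_EOL;
```

Escaping before parsing would turn the '&' separators into '&amp;', which changes how the query string is split.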

Pass text with special characters as get parameter in php

I want to pass any text as a GET parameter to a PHP script. For now I just append the text this way:
action.php?text=Hello+my+name+is+bob
This URL is composed by JavaScript, and I make an AJAX request with it.
In action.php I do
$encoded = array_map('rawurlencode', $_GET);
But this does not work for special chars like ÖÄüä.
Any idea how to solve this?
urlencode(string) will return the given string with special characters converted into %XX format.
http://us3.php.net/manual/en/function.urlencode.php
I know you can send special characters fine without encoding through $_POST, which is another alternative.
Try urlencode().
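As a rough illustration of where the encoding belongs, the sketch below builds the URL in PHP and then imitates what PHP does when it fills $_GET. The sample text is an assumption, and the file is assumed to be saved as UTF-8.

```php
<?php
$text = 'Hello my name is Öbob';   // sample input, assumed UTF-8

// Encode when building the URL: each UTF-8 byte becomes %XX, spaces become %20.
$url = 'action.php?text=' . rawurlencode($text);
echo $url, PHP_EOL;

// PHP decodes query values automatically when filling $_GET;
// parse_str() applies the same decoding and is used here to imitate that.
parse_str(parse_url($url, PHP_URL_QUERY), $get);
echo $get['text'], PHP_EOL;
```

The value arrives already decoded, so applying rawurlencode() to $_GET inside action.php encodes it a second time rather than fixing it. When the URL is composed in JavaScript, the encoding has to happen there instead.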

get values in url

I need to get the values in the URL exactly as they are,
e.g.:
http://www.example.com/index?url=1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==
but the variable url gives me the value "1 LY2ePh1pjX4tjZ4 GS393Y2pjd16Cbq63T3tbfzMzd16CarA=="
even though I expected "1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA=="
Can anyone help me with this, or explain the reason?
You need to encode certain characters if you want to send them in a URL. For further reference, I suggest you read this page. It seems that the URL you are getting isn't being encoded properly. If the URL is coming from your site, then I would suggest you encode it properly.
In PHP, there is a function called urlencode, which may help you with this task.
A short explanation
URLs can only be sent over the Internet using the ASCII character set. If you want to send characters outside this set, you need to encode them. URL encoding replaces unsafe characters with % followed by two hexadecimal digits for each byte of the character.
The client sending the request apparently isn't URL encoding the value correctly. You can re-encode it after it has been decoded, like this:
urlencode($_GET["url"])
It converts %2B to a space.
The parameter you sent is wrong; it should have been encoded like so:
<?php
echo '<a href="http://www.example.com/index?url=', urlencode('1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA=='), '">';
?>
I have added the encoding correctly now. It converts == correctly, and the + sign is encoded to %2B correctly, but in the decode process it is converted to a space.
Since it seems that you have a Base-64 value there, you can use the URL-safe alphabet for Base-64, which uses - and _ instead of + and / respectively:
$base64 = "1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==";
// plain Base-64 to URL safe Base-64
$base64_safe = strtr($base64, '+/', '-_');
// URL safe Base-64 to plain Base-64
$base64 = strtr($base64_safe, '-_', '+/');
And if you know the length of the data, you can also omit the = padding:
rtrim($base64, '=')
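The behaviour discussed in this thread can be reproduced without a web server. In this sketch, parse_str() stands in for the decoding PHP applies when it fills $_GET; the variable names are illustrative.

```php
<?php
$value = '1+LY2ePh1pjX4tjZ4+GS393Y2pjd16Cbq63T3tbfzMzd16CarA==';

// Unencoded: '+' in a query string means a space, so the value is altered.
parse_str('url=' . $value, $raw);

// Encoded: urlencode() turns '+' into %2B, which decodes back to '+'.
parse_str('url=' . urlencode($value), $encoded);

echo $raw['url'], PHP_EOL;
echo $encoded['url'], PHP_EOL;
```

If the decoded value still shows spaces after the link has been encoded this way, the value is probably being decoded a second time somewhere, for example by an extra urldecode() call.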

How to parse unicode format (e.g. \u201c, \u2014) using PHP

I am pulling data from the Facebook graph which has characters encoded like so: \u2014 and \u2014
Is there a function to convert those characters into HTML? i.e \u2014 -> —
If you have some further reading on these character codes, or suggested reading about unicode in general, I would appreciate it. This is so confusing to me. I don't know what to call these codes... I guess unicode, but unicode seems to mean a whole lot of things.
That's not entirely true, bobince.
How do you handle JSON containing Spanish accents?
There are two problems.
I call FB.api(url, function(response)
... var s=JSON.stringify(response);
and pass it to a PHP script via $.post.
First, I get a truncated string; I need escape(JSON.stringify(response)).
Then I get a full JSON-encoded string with Spanish accents.
As a test, I placed it in a text file, loaded it with file_get_contents(), applied json_decode() and got nothing.
You first need utf8_encode().
Then you get the object you were waiting for.
After a full day of testing and googling without any result on decoding unicode properly, I found your post.
So many thanks to you.
Someone asked me to solve the problem of Arabic text in the Facebook JSON archive. Maybe this code helps someone who is trying to read Arabic text from Facebook (or Instagram) JSON:
$str = '\u00d8\u00ae\u00d9\u0084\u00d8\u00b5';
function decode_encoded_utf8($string) {
    // Replace each \uXXXX escape with the UTF-8 encoding of that code unit.
    return preg_replace_callback('#\\\\u([0-9a-f]{4})#ism', function ($matches) {
        return mb_convert_encoding(pack("H*", $matches[1]), "UTF-8", "UCS-2BE");
    }, $string);
}
// The archive stores each UTF-8 byte as a separate \u00XX escape;
// converting back to ISO-8859-1 recovers the original UTF-8 bytes.
echo iconv("UTF-8", "ISO-8859-1//TRANSLIT", decode_encoded_utf8($str));
Facebook Graph API returns JSON objects. Use json_decode() to read them into PHP and you do not have to worry about handling string literal escapes like \uNNNN. Don't try to decode JSON/JavaScript string literals by yourself, or extract chosen properties using regex.
Having read the string value, you'll have a UTF-8-encoded string. If your target HTML is also UTF-8-encoded, you don't need to replace — (U+2014) with any entity reference. Just use htmlspecialchars() on the string when outputting it, so that any < or & characters in the string are properly encoded.
If you do for some reason need to produce ASCII-safe HTML, use htmlentities() with the charset arg set to 'utf-8'.
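The approach in this answer can be sketched as follows. The JSON document and its "message" property are made up for illustration; they are not an actual Graph API response.

```php
<?php
// Made-up JSON standing in for an API response. Inside single quotes,
// \u2014 reaches json_decode() as a literal JSON escape sequence.
$json = '{"message":"A \u2014 B <b>"}';

$data = json_decode($json, true);
$message = $data['message'];   // UTF-8 string; the escape is already decoded

// UTF-8 page: escape only the HTML-special characters.
$forUtf8Page = htmlspecialchars($message, ENT_QUOTES, 'UTF-8');

// ASCII-safe output: htmlentities() also converts U+2014 to a named entity.
$asciiSafe = htmlentities($message, ENT_QUOTES, 'UTF-8');

echo $forUtf8Page, PHP_EOL;
echo $asciiSafe, PHP_EOL;
```

No regular expression or manual handling of \uNNNN is needed; json_decode() does all of the unescaping.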
