I have a directory with PDF files that I need to create an index for. It is a PHP page with a list of links:
filename
The filenames can be complicated:
LVD 2-1133 - Ändring av dumpningslina (1984-11-20).pdf
What is the correct way to link to this file on a Linux/Apache server?
Is there a PHP function to do this conversion?
You can use rawurlencode() to encode a string according to RFC 3986.
This function replaces every character except letters, digits, and -_.~ with a percent sign followed by two hex digits.
The difference from urlencode() is that urlencode() encodes spaces as plus signs, while rawurlencode() encodes them as %20.
For a file path you'll probably want rawurlencode().
This technique is called Percent or URL encoding. See Wikipedia for more details.
The urlencode() function will convert spaces into plus signs (+), so it won't work. rawurlencode() does the trick. Thanks.
Be sure to convert each part of the path separately, otherwise path/file will be converted into path%2Ffile (which was what I missed).
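A minimal sketch of the per-segment encoding described in the comment above. The helper name encodePath() and the pdf/ directory are assumptions for illustration, not part of the original question; the filename is assumed to be stored as UTF-8.

```php
<?php
// Hypothetical helper: percent-encode each path segment with rawurlencode(),
// leaving the "/" separators untouched.
function encodePath(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

// "\u{00C4}" is the letter Ä, written as an escape so the example does not
// depend on how this source file is saved.
$name = "LVD 2-1133 - \u{00C4}ndring av dumpningslina (1984-11-20).pdf";
$href = encodePath('pdf/' . $name);

// Escape both the href and the link text for HTML output.
echo '<a href="' . htmlspecialchars($href) . '">' . htmlspecialchars($name) . "</a>\n";
```

Encoding the whole string in one call would turn the slash into %2F, which is why the path is split first.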
URL encoding. I think it's urlencode() in PHP.
urlencode() should probably do what you want.
Edit: urlencode() works fine on Swedish characters.
<?php
echo urlencode("åäö");
?>
converts to (when the source file is saved as ISO-8859-1; saved as UTF-8 it would give %C3%A5%C3%A4%C3%B6):
%E5%E4%F6
rawurlencode will encode "exotic" characters in a URL.
I use file_get_contents() to download a JSON. There're some Chinese characters in the URL, I tried to print the URL out, it's OK. But when I ran the program, the URL I put in the function became error code. How do I know? The URL serves JSON that is backed by a MySQL request, and in the MySQL console I saw that the URL had become garbled. I tried lots of ways to convert the URL string to UTF-8, GB2312, etc., but none of them worked. I wish I could get help here, thanks.
It's very difficult to understand your question. I think I understood the first part of it:
I use file_get_contents() to download a JSON. There're some Chinese
characters in the URL, I tried to print the URL out, it's OK. But when
I ran the program, the URL I put in the function became error code.
You are trying to access a URL containing Chinese characters using file_get_contents().
The answer to this is:
You need to encode the part of the URL containing Chinese characters using urlencode() or rawurlencode().
The main difference between urlencode() and rawurlencode() is that urlencode() converts spaces to +, while rawurlencode() converts them to %20.
urlencode() is meant for query parameters, for example ?q=my+search+key; in every other case, use rawurlencode().
Example:
$test = 'http://www.example.com/'.rawurlencode('以怎么下载').'.html';
print_r($test);
// $html = file_get_contents($test);
// output:
http://www.example.com/%E4%BB%A5%E6%80%8E%E4%B9%88%E4%B8%8B%E8%BD%BD.html
I hope it solves your problem.
In PHP, what function can I use to convert the text 'pétition' to 'p%E9tition'?
I have tried utf8_encode and utf8_decode with no success.
%E9 is a URL-encoded (percent-encoded) character. You can achieve this with urlencode($string).
If you want HTML escaping, you can either use htmlentities($string) (more encoding) or htmlspecialchars($string) (less encoding).
http://php.net/manual/en/function.urlencode.php
http://php.net/manual/en/function.htmlentities.php
http://php.net/manual/en/function.htmlspecialchars.php
When dealing with UTF-8 strings, you will need to convert the string to ISO-8859-1 (e.g. with utf8_decode) before encoding it with urlencode to get this single-byte form in the query part of a URL.
print_r( urlencode(utf8_decode('pétition')) );
// p%E9tition
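Since utf8_decode() is deprecated as of PHP 8.2, here is a hedged sketch of the same conversion using iconv() instead. It assumes the iconv extension is available and that the input is UTF-8; the variable names are only illustrative.

```php
<?php
// "\u{00E9}" is the letter é as UTF-8 bytes (0xC3 0xA9), written as an
// escape so the result does not depend on the encoding of this file.
$utf8 = "p\u{00E9}tition";

// Encoding the UTF-8 bytes directly yields two escapes for the one letter.
echo urlencode($utf8), "\n";   // p%C3%A9tition

// Converting to ISO-8859-1 first yields the single-byte form asked for.
$latin1 = iconv('UTF-8', 'ISO-8859-1', $utf8);
echo urlencode($latin1), "\n"; // p%E9tition
```

Whether the server on the other end expects %E9 or %C3%A9 depends on the character set it uses to decode the URL.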
You can try to have a look at htmlentities.
This link can help
How can I convert spaces in string into %20?
Here is my attempt:
$str = "What happens here?";
echo urlencode($str);
The output is "What+happens+here%3F", so the spaces are not represented as %20.
What am I doing wrong?
Use the rawurlencode function instead.
The plus sign is the historic encoding for a space character in URL parameters, as documented in the help for the urlencode() function.
That same page contains the answer you need - use rawurlencode() instead to get RFC 3986 compatible encoding.
I believe that, if you need to use the %20 variant, you could perhaps use rawurlencode().
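For comparison, a short sketch of the two functions on the string from the question, plus http_build_query() with PHP_QUERY_RFC3986 for the case where a whole query string is being built. The parameter name q is an assumption for illustration.

```php
<?php
$str = "What happens here?";

// urlencode() uses the form-style encoding: space becomes "+".
echo urlencode($str), "\n";    // What+happens+here%3F

// rawurlencode() follows RFC 3986: space becomes "%20".
echo rawurlencode($str), "\n"; // What%20happens%20here%3F

// http_build_query() can be told to use the RFC 3986 style as well.
$query = http_build_query(['q' => $str], '', '&', PHP_QUERY_RFC3986);
echo $query, "\n";             // q=What%20happens%20here%3F
```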
I have a PHP script which generates an XML file and prints it on screen.
Amongst other variables, it prints an image link.
The problem is that if this image link has an '&' character in it, I get an XML error because it isn't encoded properly.
So I solved it by replacing the & sign with &amp;.
At least I thought it was solved; now the link to the image is for example like this:
www.domain.com/phones &amp; equipment/img1.jpg
which causes a 404 file not found.
The real path is
www.domain.com/phones & equipment/img1.jpg
So how can I solve this then?
I would prefer not to change the folder names, I simply didn't know this when I created the folders.
Thanks
If it's a URL, you might want to URL encode it instead:
www.domain.com/phones%20%26%20equipment/img1.jpg
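The encoded URL above could be produced along these lines. This is a sketch under assumptions: the helper name encodeUrlPath() and the <image> element are hypothetical, and the host is taken from the question as-is.

```php
<?php
// Hypothetical helper: percent-encode each path segment, keeping "/" as is.
function encodeUrlPath(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

$url = 'www.domain.com/' . encodeUrlPath('phones & equipment/img1.jpg');

// After percent-encoding no raw "&" is left, but escaping for XML anyway
// keeps the output well-formed for values that still contain markup characters.
$xml = '<image>' . htmlspecialchars($url, ENT_XML1 | ENT_QUOTES, 'UTF-8') . '</image>';

echo $xml, "\n"; // <image>www.domain.com/phones%20%26%20equipment/img1.jpg</image>
```

The two steps are separate concerns: percent-encoding makes the string a valid URL, and XML escaping makes it safe to embed in the document.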
Try:
www.domain.com/phones+%26+equipment/img1.jpg
You should URL-encode the link. For a path, PHP's rawurlencode() function encodes spaces as %20, whereas urlencode() encodes them as +. The code for '&' is "%26".
Additionally, if you check the IETF RFCs on URLs/URIs, you will see that the space character is not valid and shouldn't appear unencoded in a request URI. A URL like www.domain.com/phones-and-equipment/img1.jpg would also be more beneficial from an SEO standpoint.
ADDENDUM: For example, check page 2, section "Unsafe" of RFC 1738 and see why non-printable and non-US-ASCII characters are not safe.
Use html_entity_decode to convert &amp; back into an &.
see http://php.net/manual/en/function.html-entity-decode.php
The correct representation of the ampersand in an XML file is &amp;. If that isn't working, the problem is with the code that is reading the XML file and dereferencing the URI.
I am attempting to open a page with window.open and it's not working. The path shown is like xyz/a%20b%20c%20.pdf, but it is supposed to be xyz/abc.pdf. If I remove the % and 20 manually, it works, how can I remove these characters using PHP?
Use urldecode:
(PHP 4, PHP 5)
urldecode — Decodes URL-encoded string
Description
string urldecode ( string $str )
Decodes any %## encoding in the given string. Plus symbols ('+') are decoded to a space character.
Example
echo urldecode('xyz/a%20b%20c%20.pdf');
This is known as URL encoding. You need to decode the string. If you are using jQuery, you should check out the URL Encode plugin.
You need to urldecode (as stated above).
However, you say that you can remove the %20 and it will work. I would say you need them; they decode to spaces. Check it out using this online URL decoder:
http://www.convertstring.com/EncodeDecode/UrlDecode
it decodes to:
xyz/a b c .pdf
not
xyz/abc.pdf
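A small sketch contrasting the two outcomes discussed above: decoding the escapes versus stripping them. The variable names are illustrative, and the stripping variant only makes sense if the file on disk really has no spaces in its name.

```php
<?php
$encoded = 'xyz/a%20b%20c%20.pdf';

// Decoding turns each %20 back into a space.
$decoded = urldecode($encoded);
var_dump($decoded);  // string(14) "xyz/a b c .pdf"

// Removing the escapes outright gives the name the asker expected.
$stripped = str_replace('%20', '', $encoded);
var_dump($stripped); // string(11) "xyz/abc.pdf"

// rawurldecode() differs from urldecode() only in leaving "+" untouched.
var_dump(urldecode('a+b%20c'));    // string(5) "a b c"
var_dump(rawurldecode('a+b%20c')); // string(5) "a+b c"
```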