How to convert a URL into a hyperlink with PHP urlencode()? - php

I am trying to convert plain URL text into a valid link.
The problem is that the link may contain both English (A-Z/a-z) and Hebrew (אבגדהוזחטיכךלמםנןסעפףצץקרשת) letters.
Using PHP's urlencode() function I was able to get the correct encoding for the Hebrew part, but I cannot find the right way to convert the result into a link.
My code so far (it does not work with Hebrew letters):
$replyText = preg_replace('~(https?://([-\w\.]+[-\w])+(:\d+)?(/([\w/_\.#-]*(\?\S+)?[^\.\s])?)?)~', '<a href="$1">$1</a>', $replyText);
An example for a URL I need to convert into a link:
google.co.il%2F%D7%A9%D7%9C%D7%95%D7%9D_Hello.html
Will become:
<a href="google.co.il%2F%D7%A9%D7%9C%D7%95%D7%9D_Hello.html">google.co.il%2F%D7%A9%D7%9C%D7%95%D7%9D_Hello.html</a>

Despite what I believe you posted as the desired output, if this were my task I would put the urlencoded value in the href attribute of the <a> tag and use the human-readable text as the link text.
Code:
$replyText = "google.co.il%2F%D7%A9%D7%9C%D7%95%D7%9D_Hello.html";
echo '<a href="', $replyText, '">', urldecode($replyText), '</a>';
Source Code Output:
<a href="google.co.il%2F%D7%A9%D7%9C%D7%95%D7%9D_Hello.html">google.co.il/שלום_Hello.html</a>
Effective Output:
google.co.il/שלום_Hello.html
Notice that when you hover over the link, your browser's status bar will show the unencoded URL anyway.
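The approach above can be sketched as a small function. This is a minimal sketch, not the answerer's exact code: the helper name encodedUrlToLink is hypothetical, and the htmlspecialchars() calls are an added precaution so the values are safe inside HTML.

```php
<?php
// Hypothetical helper (not from the answer): build an <a> tag whose href keeps
// the percent-encoded form while the visible text is decoded for readers.
function encodedUrlToLink(string $encoded): string
{
    $text = urldecode($encoded);

    // Escape both values for HTML; this step is an assumption added here.
    return '<a href="' . htmlspecialchars($encoded, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($text, ENT_QUOTES, 'UTF-8') . '</a>';
}

echo encodedUrlToLink('google.co.il%2F%D7%A9%D7%9C%D7%95%D7%9D_Hello.html'), "\n";
```

Keeping the encoded form in href and the decoded form as link text means the markup stays ASCII-only where it matters while readers still see the Hebrew characters.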

You just need to replace %2F with /, so your link will be: google.co.il/%D7%A9%D7%9C%D7%95%D7%9D_Hello.html
<a href="google.co.il/%D7%A9%D7%9C%D7%95%D7%9D_Hello.html">link</a>
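A rough sketch of that replacement follows. The helper name encodedTextToHref and the default http scheme are assumptions made for illustration; the scheme is prepended because a browser treats an href without one as a path relative to the current page.

```php
<?php
// Hypothetical helper: turn an over-encoded URL text into a usable href.
function encodedTextToHref(string $encoded, string $scheme = 'http'): string
{
    // Restore the path separators that urlencode() turned into %2F.
    $href = str_ireplace('%2F', '/', $encoded);

    // Assumption: when no scheme is present, prepend one so the link is absolute.
    if (!preg_match('~^[a-z][a-z0-9+.-]*://~i', $href)) {
        $href = $scheme . '://' . $href;
    }

    return $href;
}

echo encodedTextToHref('google.co.il%2F%D7%A9%D7%9C%D7%95%D7%9D_Hello.html'), "\n";
```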

Related

utf-8 url parsing in php:file_get_content and browser

I want to get the content of a URL using file_get_contents($url). When I copy the URL from the browser address bar, it looks like this:
$url="http://www.mashadhome.com/fa-estate-39855-tags-%D9%81%D8%B1%D9%88%D8%B4-%D8%A2%D9%BE%D8%A7%D8%B1%D8%AA%D9%85%D8%A7%D9%86-%D8%A8%D9%84%D9%88%D8%A7%D8%B1%20%D8%B5%DB%8C%D8%A7%D8%AF%20%D8%B4%DB%8C%D8%B1%D8%A7%D8%B2%DB%8C";
but when I get the URL automatically using
$homepage1 = file_get_contents($url_value);
$doc1 = new DOMDocument;
$doc1->preserveWhiteSpace = false;
@$doc1->loadHTML($homepage1); // @ suppresses warnings from malformed HTML
$xpath1 = new DOMXpath($doc1);
$nodes1 = $xpath1->query("//html/body/section/div/div/section/figure/a");
foreach ($nodes1 as $node1) {
    $href = $node1->getAttribute('href');
}
it is something like this:
$href="http://www.mashadhome.com/fa-estate-39855-tags-فروش-آپارتمان-بلوار صیاد شیرازی";
I use code like the above to get the content of this link, but file_get_contents($href) doesn't work for the second URL, even though the second address works fine when I paste it into the browser.
So the question is: why doesn't the second address work, and how can I convert the second address into the form of the first?
A URL can contain only a restricted set of characters: ASCII letters, digits, and a few punctuation marks such as the hyphen. To access such a URL, it must be encoded into the format your server accepts, as in your first example. Have a look at the urlencode() function.
Of course, you should apply urlencode() only to the parts that are not URL special characters (such as : and /). In this instance, you would use urlencode() on the fa-estate-39855-tags-فروش-آپارتمان-بلوار صیاد شیرازی part only.
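As an illustration of encoding only the path part, here is a minimal sketch. The helper name encodePath is hypothetical. It uses rawurlencode() rather than urlencode() because the working URL from the browser encodes spaces as %20, whereas urlencode() would produce +.

```php
<?php
// Hypothetical helper: percent-encode every path segment, keeping "/" separators.
function encodePath(string $path): string
{
    return implode('/', array_map('rawurlencode', explode('/', $path)));
}

$base = 'http://www.mashadhome.com/';
$path = 'fa-estate-39855-tags-فروش-آپارتمان-بلوار صیاد شیرازی';

// Scheme and host are left untouched; only the path is encoded.
$url = $base . encodePath($path);
echo $url, "\n";

// $homepage = file_get_contents($url); // network call, left commented out
```

Encoding segment by segment keeps any / inside the path as a separator instead of turning it into %2F.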

sanitize string for use in href with PHP GET

I am trying to add a user-defined string to information passed to a third party via href. So I have something that will look like
Link Text
USERSTRING is known when the page loads so it could be put in the href by php when the page loads, or I can dynamically add it with javascript.
What I don't know is what I need to do to escape any special characters so that the link works and can be read on the other end - USERSTRING could be something really annoying like: [He said, "90% isn't good enough?"] The data is only used in an auto-generated file name so it doesn't need to be preserved 100%, but I'm trying to avoid gratuitous ugliness.
The urlencode() function provides exactly what you are looking for, ie:
Link Text
You need to urlencode it. If the variant of urlencode you end up using doesn't encode '&', '#', '"', and angle brackets as it should then you'll need to HTML encode it too.
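A minimal sketch of combining both encodings is shown below. The base URL https://example.com/receive, the parameter name name, and the helper buildLink are placeholders invented for this example, since the real third-party URL is not given in the question.

```php
<?php
// Hypothetical helper: URL-encode the user string for the query, then
// HTML-escape the whole URL before placing it in the href attribute.
function buildLink(string $baseUrl, string $userString, string $linkText): string
{
    $url = $baseUrl . '?name=' . urlencode($userString);

    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">'
        . htmlspecialchars($linkText, ENT_QUOTES, 'UTF-8') . '</a>';
}

$userString = '[He said, "90% isn\'t good enough?"]';
echo buildLink('https://example.com/receive', $userString, 'Link Text'), "\n";
```

On the receiving end, PHP's $_GET already decodes the value, so the third party would see the original string.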

Formatting text with regex

I have two functions to format the text of my notices.
1. Converts [white-text][/white-text] into <font color=white></font>
$string = preg_replace("/\[white-text\](\S+?)\[\/white-text\]/si","<font color=white>\\1</font>", $string);
2. Converts [url][/url] into <a href></a>
$string = preg_replace("/\[url\](\S+?)\[\/url\]/si","<a href=\"\\1\">\\1</a>", $string);
Problems:
WHITE-TEXT - It only changes the color if the phrase has only ONE word.
URL - It works fine, but I would like to be able to write anything in the readable part of the URL.
Give the URL code the form [url=href]description[/url]; you can then use this simple regular expression and replacement:
"/\[url=([^\]]*)\](.+?)\[\/url\]/si"
"<a href=\"\\1\">\\2</a>"

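Both formatting rules can be sketched together as follows. The function name formatNotice is hypothetical, and two things here go beyond the answer: the white-text pattern uses (.+?) instead of (\S+?), because \S cannot match the space between words, and the input is passed through htmlspecialchars() first as a precaution.

```php
<?php
// Hypothetical helper combining the two notice-formatting rules.
function formatNotice(string $text): string
{
    // Assumption: escape raw HTML first so only the generated tags are markup.
    $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

    // (.+?) matches spaces too, so phrases with several words are coloured.
    $text = preg_replace(
        '/\[white-text\](.+?)\[\/white-text\]/si',
        '<font color=white>$1</font>',
        $text
    );

    // [url=href]description[/url] becomes a link with free-form text.
    return preg_replace(
        '/\[url=([^\]]*)\](.+?)\[\/url\]/si',
        '<a href="$1">$2</a>',
        $text
    );
}

echo formatNotice('[white-text]two words[/white-text] [url=http://example.com/a]my site[/url]'), "\n";
```
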
about use regex to convert url to link

I need to convert the URLs in an article to the 3g domain.
For example, I need to convert
here is the link:http://www.mydomain.com/index thanks
to
here is the link:<a href='http://3g.mydomain.com/index' target='_self'>http://3g.mydomain.com/index</a> thanks
Don't convert other domains, just mydomain. Here is the code:
$c = "/([^'\"=])?http:\/\/([^ ]+?)(mydomain)\.com([A-Za-z0-9&%\?=\/\-\._#]*)/";
$b=preg_replace($c, "$1<a href='http://3g.$3.com$4' target='_self'>http://3g.$3.com$4</a>",$b);
It works very well, but if the text is like this:
a link
it returns the wrong result, like this:
a link
but I need this result:
a link
What should I do?
You should do the following:
Strip target attributes from existing hyperlinks
Rewrite hyperlinks in href attributes
Rewrite any other hyperlinks
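// Groups when used alone: $1 = subdomain prefix, $2 = mydomain, $3 = path
$plain = "http://([^ ]+?)(mydomain)\.com(/?[^'\"\s]*(?=['\"\s]))";
$plain_replace = "http://3g.$2.com$3";
// Inside $in_href the leading quote group shifts the numbering by one
$in_href = "href=(['\"])" . $plain . "(['\"])";
$in_href_replace = "href='http://3g.$3.com$4' target='_self'";
$strip_target = "target=['\"][^'\"]*['\"]";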
...
So:
Replace $strip_target with ""
Replace $in_href with $in_href_replace
Replace $plain with $plain_replace
(The regexes are tested to work in C#, you might have to adjust the \ escaping to suit the php regex rules.)
Get rid of the first ? in your regular expression. That allows for the absence of a preceding character.
Or, perhaps more to your intention, if you want to allow URLs at the beginning, you can replace:
([^'\"=])?
with:
(^|[^'\"=])
...which will allow a link if at the very beginning, or if not preceded by a quote, etc., but not otherwise.
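The second suggestion can be sketched as a runnable snippet. The function name linkify3g is hypothetical, the pattern is the asker's with only the first group changed, and ~ is used as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical helper: rewrite bare mydomain URLs to 3g links, skipping
// URLs that directly follow a quote or "=" (i.e. existing attribute values).
function linkify3g(string $text): string
{
    $pattern = '~(^|[^\'"=])http://([^ ]+?)(mydomain)\.com([A-Za-z0-9&%?=/\-._#]*)~';
    $replacement = "$1<a href='http://3g.$3.com$4' target='_self'>http://3g.$3.com$4</a>";

    return preg_replace($pattern, $replacement, $text);
}

echo linkify3g('here is the link:http://www.mydomain.com/index thanks'), "\n";
echo linkify3g("<a href='http://www.mydomain.com/index'>a link</a>"), "\n";
```

The first call wraps the bare URL, while the second leaves the existing anchor alone because its URL is preceded by a quote. A URL that appears as the visible text of an existing anchor would still be rewritten, so this sketch does not cover every case.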

Get youtube video feed by tags with space in tags

To get the feed of a certain uploader with a certain tag I use the following api url:
http://gdata.youtube.com/feeds/api/users/UPLOADER/uploads/-/TAG
If I wanted to search, e.g., for the feed with the tags foo and bar, I would use the following:
http://gdata.youtube.com/feeds/api/users/UPLOADER/uploads/-/foo/bar
BUT since YouTube lets you specify tags containing spaces, e.g. "foo bar", I want to search for exactly this tag. When I use the first URL in combination with urlencode in PHP, it doesn't return anything.
In the browser, the URL changes to .../uploads/-/foo%20bar, but that also gives no results.
When I use uploads/-/foo/bar, the problem is that it returns videos having the tags 'foo' and 'bar' (wrong), or only 'foo bar' (right).
I also tried to replace the space with /, +, and -. Using the keywords.cat scheme in the URL will also return the same results.
Is there anything I missed, or is it generally not possible?
It seems there is a bug in the YouTube API which has not been fixed yet. See http://groups.google.com/group/youtube-api-gdata/browse_thread/thread/dc195bd6ad6a1fa4/2d9cf0e15ce7de50
As mentioned, there is a YouTube bug, but you can use %2B (+) instead of a space to work around it.
So using your example:
http://gdata.youtube.com/feeds/api/users/UPLOADER/uploads/-/foo%2Bbar
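A small sketch of how the two URL variants could be built in PHP follows. The function name tagFeedUrl and its $plusWorkaround flag are hypothetical; the sketch only shows the string encoding and makes no claim about what the API returns.

```php
<?php
// Hypothetical helper: build the uploader/tag feed URL from the question.
function tagFeedUrl(string $uploader, string $tag, bool $plusWorkaround = false): string
{
    if ($plusWorkaround) {
        // Replace spaces with "+" first, so rawurlencode() emits %2B.
        $tag = str_replace(' ', '+', $tag);
    }

    return 'http://gdata.youtube.com/feeds/api/users/' . rawurlencode($uploader)
        . '/uploads/-/' . rawurlencode($tag);
}

echo tagFeedUrl('UPLOADER', 'foo bar'), "\n";       // tag encoded as foo%20bar
echo tagFeedUrl('UPLOADER', 'foo bar', true), "\n"; // tag encoded as foo%2Bbar
```

Note that urlencode() would turn the space into a literal +, which is a different URL from either of the two above.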
I would have added this as a comment, but I lack the rep.
http://gdata.youtube.com/feeds/api/users/UPLOADER/uploads/-/{http%3A%2F%2Fgdata.youtube.com%2Fschemas%2F2007%2Fkeywords.cat}foo+bar
Try it with the keywords.cat specified before your tags.
I ran into the same problem and was able to get it to work by using the
https://gdata.youtube.com/feeds/api/videos?q= format.
i.e.
https://gdata.youtube.com/feeds/api/videos?q=+Team%20Fortress%202
And you can check this by changing the + to a -:
https://gdata.youtube.com/feeds/api/videos?q=-Team%20Fortress%202
The downside to this is that the q parameter will query all metadata of the video, so adjust accordingly.
hth
Greg
