file_get_contents empty spaces removal for Google image API - php

I ran into a problem while developing a dictionary that shows a Google image alongside each meaning. It works fine for single words.
The API shows this warning when more than one word is submitted in the URL from PHP:
Warning: file_get_contents(https://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=Pakistani Flag)
In the example above, the API finds a picture for Pakistani without trouble, but adding Flag triggers the warning shown above.

$encoded = urlencode('Pakistani Flag');

We can resolve the problem by replacing the spaces with %20 in PHP. For example, if your words are stored in a variable $word:
$word = "Pakistani Flag";
convert the spaces with
$word_con = str_replace(" ", "%20", $word);
Finally we have
https://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=$word_con
which works perfectly!
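Replacing spaces by hand only covers spaces; characters such as & or # in the search words would still break the URL. A minimal sketch of a more general approach, assuming the same endpoint as above (the function name build_image_search_url is hypothetical), lets http_build_query do the encoding:

```php
<?php
// Hypothetical helper: builds the image-search URL used in the question.
function build_image_search_url(string $word): string
{
    $base = 'https://ajax.googleapis.com/ajax/services/search/images';
    // PHP_QUERY_RFC3986 encodes spaces as %20 (the default mode uses +).
    $query = http_build_query(['v' => '1.0', 'q' => $word], '', '&', PHP_QUERY_RFC3986);
    return $base . '?' . $query;
}

echo build_image_search_url('Pakistani Flag'), "\n";
```

The resulting string can be passed to file_get_contents() directly, since every reserved character in $word has been percent-encoded.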

Related

Preserve white space in URL with PHP

I am stuck big time on this problem and Google has been of no help so far. I am trying to find a way to preserve white space in a URL, with little to no luck.
I have a form that needs to gather POST data, mail it, append the POST data to the URL as comma-separated values, and then redirect the user to a page where they download a product.
Once the user presses download, that page reads the data in the URL and applies it to a billing invoice (the program is billed on time usage).
A simplified example:
$addressOne = $_POST['addressOne'];
$newURL = "http://subdomain.domain.com/connectnow=on?" . ", Address1=" . $addressOne;
if ($mailSent) { // $mailSent stands for the result of the mail step
    header("Location: $newURL");
    exit;
}
There are a lot more values obviously, but the address is one of the areas where I am having this issue.
I have tried doing something like:
$newURL = str_replace(" ", "&nbsp;", $newURL);
That worked as far as preserving the whitespace in the URL visually, but when the program that gets downloaded reads the URL it replaces the &nbsp; with %C2%.
I have also tried:
$newURL = str_replace(" ", " \40", $newURL);
That made the spaces in the URL convert back to %20.
Any guidance would be appreciated.
URL:
www.site.com/my spaces preserved/
urlencode()
www.site.com%2Fmy+spaces+preserved%2F
urldecode()
www.site.com/my spaces preserved/
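The round trip above can be sketched for the redirect case as follows. This is only an illustration: the base URL path, the sample address, and both function names are assumptions, and it presumes the reader of the URL decodes the query string the way PHP's parse_str does.

```php
<?php
// Hypothetical helper: appends form fields to a URL as an encoded query string.
function build_download_url(string $base, array $fields): string
{
    // http_build_query applies urlencode-style encoding (space becomes +).
    return $base . '?' . http_build_query($fields);
}

// Hypothetical helper: reads the fields back out of such a URL.
function read_fields(string $url): array
{
    parse_str((string) parse_url($url, PHP_URL_QUERY), $fields);
    return $fields;
}

// Sample value standing in for $_POST['addressOne'].
$url = build_download_url('http://subdomain.domain.com/download', ['Address1' => '12 Main Street, Apt 4']);
echo $url, "\n";
echo read_fields($url)['Address1'], "\n";
```

The spaces never appear literally in the URL, but they are restored exactly once the receiving side decodes the value.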

PHP code to present query result present as html, and add hyperlink to text that is a url

I have a cell in a database, $Data. It could contain any data, and some of that data might contain a URL.
Using MySQL, I need PHP to return the text from the cell, hyperlinking anything that is a URL. I found the preg_replace function elsewhere on Stack Overflow, but it's not working.
I'm trying to find code that will extract $Data and then present it as text with hyperlinked URLs.
I found this:
preg_replace('/\b(https?:\/\/(.+?))\b/', '\1', $text);
but a) it's not working and b) I need a statement to extract $Data first.
Live regex says it works: http://www.phpliveregex.com/p/aqB (click on preg_replace on the right).
$data = \preg_replace('/\b(https?:\/\/.+)\b/i', '\1', $data);
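Note that a replacement of '\1' on its own puts back exactly what was matched, so the text comes out unchanged; to get a hyperlink the replacement has to wrap the match in an anchor tag. A minimal sketch, assuming the cell content has already been fetched into a string (the function name linkify and the sample text are hypothetical):

```php
<?php
// Hypothetical helper: escapes text for HTML output and wraps URLs in links.
function linkify(string $text): string
{
    // Escape first so any markup stored in the cell is shown as plain text.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // $0 is the whole match: a URL running up to the next whitespace or "<".
    return preg_replace('~\bhttps?://[^\s<]+~i', '<a href="$0">$0</a>', $escaped);
}

echo linkify('See http://example.com/page for details'), "\n";
```

This simple pattern also swallows punctuation that directly follows a URL (for example a closing full stop), so it may need tightening for real data.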

Removing the "\ufeff" from the end of object -> content in Google+ API json result

The result from the Google+ API has \ufeff appended to the end of every "content" result (I don't really know why?)
What is the best way to remove this Unicode character from the JSON result? It is producing a '?' in some of the output I am displaying.
Example:
https://developers.google.com/+/api/latest/activities/get#try-it
enter activity id
z12pvrsoaxqlw5imi22sdd35jwvkglj5204
and click Execute, result will be:
{
.....
"object": {
......
"content": "CONTENT OF GOOGLE PLUS POST HERE \ufeff",
......
Example PHP code which shows a '?' where the '\ufeff' is:
<?php
$data = json_decode($result_from_google_plus_api, true);
echo $data['object']['content'];
// outputs "CONTENT OF GOOGLE PLUS POST HERE ?"
echo trim($data['object']['content']);
// outputs "CONTENT OF GOOGLE PLUS POST HERE ?"
Or am I going about this the wrong way? Should I be fixing the '?' issue rather than trying to remove the '\ufeff'?
In your case, you could use this regexp:
$str = preg_replace('/\x{feff}$/u', '', $str);
That way you can exactly match that code point value and have it removed.
In my experience there are a lot more whitespace-like characters you want to remove. This works well for me:
# I like to call this unicodeTrim()
$str = preg_replace(
'/
^
[\pZ\p{Cc}\x{feff}]+
|
[\pZ\p{Cc}\x{feff}]+$
/ux',
'',
$str
);
I found http://www.regular-expressions.info/unicode.html a pretty good resource about the fine details:
\pZ - match any kind of whitespace or invisible separator
\p{Cc} - match control characters
\x{feff} - match BOM
I've seen regexes suggested that match \pC instead of \p{Cc}; however, this is dangerous because \pC includes any code point to which no character has been assigned. I've had actual data (certain emojis or other stuff) removed because of this.
But YMMV; I can't stress this enough.
With respect to all the answers:
I tested most of them, but finally found a solution here: GitHub
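PHP's trim() only strips a handful of ASCII characters by default, which is why it leaves the U+FEFF in place. The pattern above can be wrapped in a function, sketched here with a made-up JSON string standing in for the API response (the name unicode_trim is hypothetical):

```php
<?php
// Hypothetical helper wrapping the pattern from the answer above.
function unicode_trim(string $str): string
{
    $trimmed = preg_replace('/^[\pZ\p{Cc}\x{feff}]+|[\pZ\p{Cc}\x{feff}]+$/u', '', $str);
    // preg_replace returns null if $str is not valid UTF-8; fall back to the input.
    return $trimmed ?? $str;
}

// Made-up JSON standing in for the Google+ API result.
$data = json_decode('{"object":{"content":"CONTENT OF GOOGLE PLUS POST HERE \ufeff"}}', true);
$content = unicode_trim($data['object']['content']);
echo $content, "\n";
```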
$field = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $field);
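One caveat with that byte-range pattern: without the u modifier it works on raw bytes, so it removes every non-ASCII byte, not just the U+FEFF. A small sketch with a made-up sample string shows the difference from removing the single code point:

```php
<?php
// Made-up sample: "café" followed by U+FEFF.
$field = "caf\u{E9}\u{FEFF}";

// Byte-range pattern from the answer above (no /u, so it matches raw bytes).
$byteStripped = preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $field);

// Targeted alternative: remove only the U+FEFF code point.
$bomStripped = preg_replace('/\x{FEFF}/u', '', $field);

var_dump($byteStripped === 'caf');       // the accented letter is lost too
var_dump($bomStripped === "caf\u{E9}");  // the accented letter survives
```

If the content can contain accented letters, non-Latin scripts, or emoji, the targeted form is the safer choice.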

Removing emoticons from Instagram captions in PHP

Does anyone know how to strip the emoticons from Instagram captions? I subscribe to their API to feed certain photo types and display them on my website. I tried the command below in PHP to strip out emoticons, with no success.
$newcaption = preg_replace('/[^a-zA-Z0-9 \.]/s', '', $caption);
Does anyone know how to get rid of Instagram emoticons using PHP? I have no idea what their ASCII character set is.
Instagram does not have native emoticons. My suggestion is to create an array with all known emoticons (emoji, Android, iOS, etc.) and then search for and remove them from the $caption.
I don't see another way to do it.
Update:
I have written a regex that could help you without creating the array, but it can lead to some mistakes. Try it and watch the results:
$pattern = '/((?=:)|(?=;))(.*)(?=\s{1,})/i';
$string = preg_replace($pattern, '', $string);
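If the emoticons in the captions are Unicode emoji rather than text smileys like :) or ;), another option is to match them by code point range. The sketch below is an assumption-laden illustration, not a complete solution: the ranges cover only some common emoji blocks, and the function name strip_emoji and the sample caption are hypothetical.

```php
<?php
// Hypothetical helper: removes characters from a few common emoji blocks.
function strip_emoji(string $caption): string
{
    // U+1F000-U+1FAFF: pictographs and emoticons; U+2600-U+27BF: misc symbols
    // and dingbats; U+FE0F and U+200D: variation selector and zero-width joiner.
    $pattern = '/[\x{1F000}-\x{1FAFF}\x{2600}-\x{27BF}\x{FE0F}\x{200D}]/u';
    // preg_replace returns null on invalid UTF-8; fall back to the input.
    $stripped = preg_replace($pattern, '', $caption) ?? $caption;
    // Collapse the gaps left behind by removed characters.
    return trim(preg_replace('/\s+/', ' ', $stripped));
}

echo strip_emoji("Sunset \u{1F305} at the beach \u{2764}\u{FE0F}"), "\n";
```

Unlike a whitelist of a-z and 0-9, this keeps punctuation and non-English letters in the caption.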

Need PHP Regex help

I've been working on this simple script all day trying to figure it out. I'm new to regex so please keep that in mind. On top of that, I've tried just about anything and everything I could to get this to work.
I'm trying to (to learn, please don't point me to the API) download a TSV file from Yahoo Site Explorer via either cURL or file_get_contents (both work, just messing with different things) and then using regex to get only the URL column to appear. I realize I might have more luck with other functions, but I can't find anything dealing with TSV and now it's become a challenge. I've literally spent the entire day trying to get this correct.
So a URL would be:
https://siteexplorer.search.yahoo.com/search?p=www.google.com&bwm=i&bwmo=&bwmf=s
And my regex currently looks like this (I know it's horrible...it's probably the millionth attempt):
preg_match_all('((http(s?)://?(([^/]+(\/.+))))^[\t]$)', $dl, $matches);
My issue right now is that there are four columns: TITLE, URL, SIZE, FORMAT. I'm able to strip out everything from the first column (TITLE) and the last (FORMAT) column, but I cannot seem to strip out the SIZE column and get rid of the last slash in case the sites linking in don't have that last slash.
Another thing - I've actually accomplished getting JUST the URL to appear, but they all had ending slashes which leave out links from, say, Twitter.
Any help would be greatly appreciated!
I don't know much about PHP, but this regex works in Python (it should be the same in PHP):
".+?\t(.+?)\t.*"
Just match it and get the content of group 1. FWIW, code in Python:
import re
import fileinput
urlre = re.compile(".+?\t(.+?)\t.*")
for line in fileinput.input():
    m = urlre.match(line)
    if m:
        print(m.group(1))
Personally, I'd split the lines by tab. For example:
$stuff = file_get_contents($url);
// split the whole file by newlines, to get an array of lines
$lines = explode("\n", $stuff);
$urls = [];
// loop through the lines
foreach ($lines as $line) {
    // split by tab
    $parts = explode("\t", $line);
    // skip blank or malformed lines that have no URL column
    if (!isset($parts[1])) {
        continue;
    }
    // put the URLs in a list
    $urls[] = $parts[1];
    // or keep track of them by title
    // $urls[$parts[0]] = $parts[1];
    // or whatever...
}
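The same splitting idea can be sketched as a function that also skips the header row and drops a trailing slash, which the question mentions. The function name and the sample rows are made up for illustration and are not real Site Explorer output:

```php
<?php
// Hypothetical helper: returns the second (URL) column of tab-separated text.
function extract_url_column(string $tsv): array
{
    $urls = [];
    // \R matches any line ending; empty lines are dropped.
    foreach (preg_split('/\R/', $tsv, -1, PREG_SPLIT_NO_EMPTY) as $line) {
        $fields = explode("\t", $line);
        // Keep only rows whose second field looks like a URL (skips the header).
        if (isset($fields[1]) && preg_match('~^https?://~i', $fields[1])) {
            $urls[] = rtrim($fields[1], '/');
        }
    }
    return $urls;
}

// Made-up sample data in the TITLE, URL, SIZE, FORMAT layout.
$sample = "TITLE\tURL\tSIZE\tFORMAT\n"
        . "Google\thttp://www.google.com/\t1024\ttext/html\n"
        . "Twitter\thttp://twitter.com/google\t2048\ttext/html\n";
print_r(extract_url_column($sample));
```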
Just use parse_url or parse_str instead. Always look for an alternative to regular expressions first, since they can be comparatively slow.
