Beginner here, people. Could anybody suggest any kind of solution? I have some user-inputted text.
First of all, I check whether the text has any URLs:
$post = preg_replace('/https?:\/\/[\w\-\.!~?&+\*\'"(),\/]+/','<a class="post_link"
href="$0">$0</a>',$post);
After that I need to retrieve that URL and pass it as a variable ($url) to this function:
$short=make_bitly_url('$url','o_6sgltp5sq4as','R_f5212f1asdads1cee780eed00d2f1bd2fd794f','xml');
And finally, echo both the URL and the user's text. Thanks in advance for ideas and critiques.
I've tried something like this:
$post = preg_replace('/https?:\/\/[\w\-\.!~?&+\*\'"(),\/]+/e',$url,$post){
$shorten = make_bitly_url($url,'o_6sgltpmm5sq4','R_f5212f11cee780ekked00d2f1bd2fd794f','json');
return '<a class="post_link" href="$shorten">$shorten</a>';
};
But even to me it looks like nonsense.
Bitly does have an API available for use. You should check out the API documentation.
Here's how to use the bit.ly API from PHP:
/* make a URL small */
function make_bitly_url($url, $login, $appkey, $format = 'xml', $version = '2.0.1')
{
    // create the request URL
    $bitly = 'http://api.bit.ly/shorten?version='.$version.'&longUrl='.urlencode($url).'&login='.$login.'&apiKey='.$appkey.'&format='.$format;
    // get the URL (could also use cURL here)
    $response = file_get_contents($bitly);
    // parse depending on the desired format
    if (strtolower($format) == 'json')
    {
        $json = @json_decode($response, true);
        return $json['results'][$url]['shortUrl'];
    }
    else // xml
    {
        $xml = simplexml_load_string($response);
        return 'http://bit.ly/'.$xml->results->nodeKeyVal->hash;
    }
}
/* usage */
$short = make_bitly_url('http://davidwalsh.name','davidwalshblog','R_96acc320c5c423e4f5192e006ff24980','json');
echo 'The short URL is: '.$short;
// returns: http://bit.ly/11Owun
Source: David Walsh article
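To tie this back to your original question: instead of the /e modifier (which is deprecated), you can run the shortener inside preg_replace_callback. A minimal sketch, assuming the make_bitly_url() function above; 'your_login' and 'your_api_key' are placeholders for your own bit.ly credentials:

$post = preg_replace_callback(
    '/https?:\/\/[\w\-\.!~?&+\*\'"(),\/]+/',
    function ($matches) {
        // $matches[0] is the full URL found in the user's text
        $short = make_bitly_url($matches[0], 'your_login', 'your_api_key', 'json');
        return '<a class="post_link" href="' . $short . '">' . $short . '</a>';
    },
    $post
);
echo $post;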
HOWEVER, if you wanted to create your own URL shortening system (similar to bit.ly -- and surprisingly easy to do), here is an 8-part tutorial from PHPacademy on how to do that:
Difficulty level: beginner / intermediate
Each video is approx ten minutes.
Part 1
Part 2
Part 3
Part 4
Part 5
Part 6
Part 7
Part 8
I own a webshop and one of my suppliers is kind enough to give me a CSV file with product model numbers, price and title but they can't give me database dumps including their product descriptions. I am allowed to scrape the product descriptions though - the question is how?
All URLs include the model number like "title-of-product-MN-504-1.htm"
The descriptions are inside a <div> tag like "<div id="description"> Bla bla bla <other tag>bla bla </other tag> bla bla </div>"
Let's say I have all the model numbers in a CSV file or MySQL table - how can I save the descriptions associated with the model number in the URL (the model number is also located within another div tag, if that's easier)?
To sum up: the input will be model numbers from a CSV or MySQL table, and the output should be a MySQL table (or CSV) with the model numbers and the description from the div tag on the individual pages.
I'm considering the following tools but I'm unsure how to connect them to do what I want: wget, cURL and PHP Simple HTML DOM Parser
You could use http://phpcrawl.cuab.de/ and this particular property: http://phpcrawl.cuab.de//classreferences/index.html, then to find the description: Extract string between html tags in php
As for your requirement of finding the model number in URLs found on the crawled page, you could use the following property: http://phpcrawl.cuab.de/classreferences/index.html
Given the CSV file you got from them and their site to index, I'd do the following:
First, build up a list of all the model numbers you need to get the descriptions of. Then:
1. Crawl their frontpage to start the process: gather URLs and add them to the visit list.
2. Visit every URL in your list that matches a model number, get the description, and remove that model from the list; gather URLs and add them to the visit list.
3. Go back to step 2 and repeat until there are no more models on your list.
As for how to get the URLs with the modelnumber in them: http://php.net/manual/en/function.strpos.php
Something like this; I leave the implementation up to you:
foreach ($list_of_urls as $url) {
    foreach ($list_of_modelnumbers as $model) {
        if (strpos($url, $model) !== false) {
            $list_of_urls_to_crawl[] = $url;
            /* you can also remove the $model here, but I already wrote it in a foreach loop */
            break;
        }
    }
}
Then you can clear the $list_of_urls and append the new ones from the crawler results :)
foreach ($list_of_urls_to_crawl as $url) {
    // set up $crawler for $url, let it run, get your description, etc.
    foreach ($crawler->links_found as $found_url) {
        $list_of_urls[] = $found_url;
    }
}
And place it in a grand while($still_need_descriptions) loop.
Alternatively, if you don't like http://phpcrawl.cuab.de/, you could use PHP-Spider.
It would be as simple as writing a custom URL discoverer based on the CSV
and then parsing the crawled pages with XPath queries. See the example on https://mvdbos.github.io/php-spider/. The only thing you would need to change is the Discoverer class that is added to the Spider. Assuming you know how the URLs are built, it could look like this:
class CsvModelNumberDiscoverer implements Discoverer
{
    protected $modelNumbersAndTitles = array();

    public function __construct(array $modelNumbersAndTitles)
    {
        $this->modelNumbersAndTitles = $modelNumbersAndTitles;
    }

    public function discover(Spider $spider, Resource $document)
    {
        $urls = array();
        foreach ($this->modelNumbersAndTitles as $number => $title) {
            $urls[] = 'http://example.com/' . $title . '-MN-' . $number . '.htm';
        }
        return $urls;
    }
}
The code where you run the spider would look like this:
$spider = new Spider('http://www.example.com');
$spider->addDiscoverer(new CsvModelNumberDiscoverer($modelNumbersAndTitles));
$result = $spider->crawl();
Finally, you could get the descriptions from the results like this:
foreach ($result['queued'] as $resource) {
    $modelNo = $resource->getCrawler()->filterXpath("//div[@id='modelNo']")->text();
    $description = $resource->getCrawler()->filterXpath("//div[@id='description']")->text();
}
If you don't know how the URLs are built, you would have to spider the whole site (as in AmazingDreams' answer) and use the discoverer to match URLs against the list of model numbers. It takes more time though.
Full disclosure: I wrote PHP-Spider.
You can first get the HTML code using:
$homepage = file_get_contents('http://www.example.com/title-of-product-MN-504-1.htm');
Then use that HTML with the PHP DOM parser to get the value of the exact elements you need.
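For example, a minimal sketch with PHP's built-in DOMDocument/DOMXPath, assuming the description really sits in a <div id="description"> (the URL is just the example from the question):

$homepage = file_get_contents('http://www.example.com/title-of-product-MN-504-1.htm');

$dom = new DOMDocument();
libxml_use_internal_errors(true); // real-world HTML is rarely valid; silence the warnings
$dom->loadHTML($homepage);
libxml_clear_errors();

$xpath = new DOMXPath($dom);
$node = $xpath->query("//div[@id='description']")->item(0);
$description = ($node !== null) ? trim($node->textContent) : '';

echo $description;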
I'm trying to perform a simple call using eBay's search API. When I make a call I get no response, and the problem is with the actual call itself.
$endpoint = 'http://open.api.ebay.com/shopping?';
$responseEncoding = 'XML';
$version = '631'; // API version number
$appID = 'asdf3e6e3';
$itemType = "AllItemTypes";
$itemSort = "EndTime";
//find items advanced
$apicalla = "$endpoint"
."callname=FindItemsAdvanced"
."&version=$version"
."&siteid=0"
."&appid=$appID"
."&MaxEntries=10"
."&ItemSort=EndTime"
."&ItemType=AllItemTypes"
."&IncludeSelector=SearchDetails"
."&responseencoding=$responseEncoding";
$resp = simplexml_load_file($apicalla);
This call is equivalent to:
http://open.api.ebay.com/shopping?callname=FindItemsAdvanced&version=631&siteid=0&appid=asdf3e6e3&MaxEntries=10&ItemSort=EndTime&ItemType=AllItemTypes&IncludeSelector=SearchDetails&responseencoding=XML
My question is what am I missing to make this simple search call?
It looks like you're trying to use eBay's Shopping API, specifically the FindItemsAdvanced call, which I believe was deprecated quite some time ago and may no longer be functional (I no longer see it in the call reference). What you want to do is use findItemsAdvanced from eBay's Finding API.
First, you'll need to change your API endpoint & query string parameters a bit (see the aforementioned findItemsAdvanced call reference for the specifics), but I believe it'll look more like this (I haven't touched my findItemsAdvanced calls in at least 6 months, so I haven't tested this):
$endpoint = 'http://svcs.ebay.com/services/search/FindingService/v1?';
$responseEncoding = 'XML';
$version = '1.8.0'; // API version number (they're actually up to 1.11.0 at this point)
$appID = 'asdf3e6e3';
$itemSort = "EndTimeSoonest";
//find items advanced
$apicalla = "$endpoint"
    ."OPERATION-NAME=findItemsAdvanced"
    ."&SERVICE-VERSION=$version"
    ."&GLOBAL-ID=EBAY-US"
    ."&SECURITY-APPNAME=$appID"
    //."&MaxEntries=10" // look for an equivalent for this (maybe paginationInput.entriesPerPage?)
    ."&sortOrder=EndTimeSoonest"
    //."&ItemType=AllItemTypes" // not needed AFAICT, otherwise look at itemFilterType
    ."&descriptionSearch=true"
    ."&RESPONSE-DATA-FORMAT=$responseEncoding";
$resp = simplexml_load_file($apicalla);
In addition to this, to use findItemsAdvanced, you must specify what you're searching for either by category (categoryId) or by keywords (keywords), hence the "Please specify a query!" error message.
So, you also need to add something like the following (assuming keywords):
$keywords = "something";
$apicalla .= "&keywords=" . urlencode($keywords);
Giving you the following:
$endpoint = 'http://svcs.ebay.com/services/search/FindingService/v1?';
$responseEncoding = 'XML';
$version = '1.8.0'; // API version number (they're actually up to 1.11.0 at this point)
$appID = 'asdf3e6e3';
$itemSort = "EndTimeSoonest";
$keywords = "something"; // make sure this is a valid keyword or keywords
//find items advanced
$apicalla = "$endpoint"
    ."OPERATION-NAME=findItemsAdvanced"
    ."&SERVICE-VERSION=$version"
    ."&GLOBAL-ID=EBAY-US"
    ."&SECURITY-APPNAME=$appID"
    //."&MaxEntries=10" // look for an equivalent for this (maybe paginationInput.entriesPerPage?)
    ."&sortOrder=$itemSort"
    //."&ItemType=AllItemTypes" // not needed AFAICT, otherwise look at itemFilterType
    ."&descriptionSearch=true"
    ."&RESPONSE-DATA-FORMAT=$responseEncoding"
    ."&keywords=" . urlencode($keywords);
$resp = simplexml_load_file($apicalla);
One final note: If you want to load further details of specific items that you find in your results, you'll still want to use the Shopping API (specifically the GetSingleItem & GetMultipleItems calls). So, you may ultimately use a mix of the Shopping & Finding APIs.
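As a rough, untested sketch of what such a Shopping API call could look like (GetSingleItem; $itemID is a placeholder item ID, $appID is your application ID as above):

$itemID = '110043671232'; // placeholder item ID
$shoppingVersion = '631'; // Shopping API version, as in the original question
$getItemUrl = 'http://open.api.ebay.com/shopping?'
    . 'callname=GetSingleItem'
    . '&version=' . $shoppingVersion
    . '&siteid=0'
    . '&appid=' . $appID
    . '&ItemID=' . $itemID
    . '&responseencoding=XML';
$itemResp = simplexml_load_file($getItemUrl);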
It should be something like:
<?php
$url = 'http://svcs.ebay.com/services/search/FindingService/v1?OPERATION-NAME=findItemsAdvanced&SERVICE-VERSION=1.11.0&SECURITY-APPNAME=YOUR_APP_ID&RESPONSE-DATA-FORMAT=XML&REST-PAYLOAD&paginationInput.entriesPerPage=2&keywords=ipod&siteid=203&GLOBAL-ID=EBAY-IN';
$xml = file_get_contents( $url );
$xml = simplexml_load_string( $xml );
?>
Log in to your eBay developer account and click on this link: Test your calls with the API Test Tool
Hope this helps.
I wrote this function in PHP to get a page's title. I know it might look a bit messy, but that's because I'm a beginner in PHP. I have used preg_match("/<title>(.+)<\/title>/i", $returned_content, $m) inside the if before and it didn't work as I expected.
function get_page_title($url) {
    $returned_content = get_url_contents($url);
    $returned_content = str_replace("\n", "", $returned_content);
    $returned_content = str_replace("\r", "", $returned_content);
    $lower_rc = strtolower($returned_content);
    $pos1 = strpos($lower_rc, "<title>") + strlen("<title>");
    $pos2 = strpos($lower_rc, "</title>");
    if ($pos2 > $pos1)
        return substr($returned_content, $pos1, $pos2 - $pos1);
    else
        return $url;
}
This is what I get when I try to get the titles of the following pages using the function above:
http://www.google.com -> "302 Moved"
http://www.facebook.com -> "http://www.facebook.com"
http://www.revistabula.com/posts/listas/100-links-para-clicar-antes-de-morrer -> "http://www.revistabula.com/posts/listas/100-links-para-clicar-antes-de-morrer"
(When I add a / to the end of the link, I can get the title successfully: "100 links para clicar antes de morrer | Revista Bula")
My questions are:
- I know Google is redirecting to my country's mirror when I try to access google.com, but how can I get the title of the page it redirects to?
- What is wrong in my function that makes it get the title of some pages, but not of others?
HTTP clients should follow redirects. That 302 status code means that the content you tried to get isn't at that location, and the client should follow the Location: header to figure out where it is.
You have two problems here. The first is not following redirects. If you use cURL, you can get it to follow redirects by setting this:
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);
See this question for a full solution:
Make curl follow redirects?
The second problem is that you are parsing HTML with RegEx. Don't do that. See this question for better alternatives:
How do you parse and process HTML/XML in PHP?
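Putting both fixes together, a rough, untested sketch of the function using cURL (to follow redirects) and DOMDocument (instead of a regex):

function get_page_title($url) {
    // fetch the page, following redirects (fixes the "302 Moved" case)
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
    $html = curl_exec($ch);
    curl_close($ch);
    if ($html === false) {
        return $url;
    }

    // read <title> via the DOM instead of a regex
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    $titles = $dom->getElementsByTagName('title');
    return ($titles->length > 0) ? trim($titles->item(0)->textContent) : $url;
}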
Why not try something like this? It works very well.
function get_page_title($url)
{
    $source = file_get_contents($url);
    $results = preg_match("/<title>(.*)<\/title>/", $source, $title_matches);
    if (!$results)
        return null;
    // get the first match; this is the title
    $title = $title_matches[1];
    return $title;
}
I have a web application where I'm having the user log into youtube to authorize my web application to access their account. However, I need their youtube user name to actually list their uploaded videos, and I want to ensure that the youtube username they provide matches up with the account they used to authorize with. In other words, I only want the user to be able to share videos that they've uploaded, and not someone else's. Is there a way to do this?
I had done this before with C# .Net with the following code:
YouTubeRequestSettings yt_settings = new YouTubeRequestSettings(<devID name>, devKey, auth);
YouTubeRequest yt_request = new YouTubeRequest(yt_settings);
Uri uri = new Uri("http://gdata.youtube.com/feeds/api/users/default/uploads/?start-index=1&max-results=1");
Feed<Video> videoFeed = yt_request.Get<Video>(uri);
string uploader = "";
if (videoFeed.Entries.Count() > 0)
uploader = videoFeed.Entries.ElementAt(0).Uploader;
But when I try something similar with PHP I get what appears to be a standard feed:
$yt = new Zend_Gdata_YouTube($http_client, <devID name>, null, $_yt_dev_key);
$feed_url = urlencode("http://gdata.youtube.com/feeds/api/users/default/uploads/?start-index=1&max-results=1");
$videoFeed = $yt->getVideoFeed();
if (count($videoFeed) > 0)
{
    $videoEntry = $videoFeed[0];
    echo var_dump($videoEntry);
}
Anyone have any idea if I'm doing something wrong?
------------------ UPDATE ----------------------
I have the solution to getting the YouTube video feed for the user in the authenticated session: $videoFeed = $yt->getuserFeed("default");
Though, I'm still looking at how to get the uploader name from this so that I can perform further video listings directly from javascript (like I had done with my old C#/Asp .Net web app).
----------- A RATHER ROUGH SOLUTION ------------
Well, this isn't exactly an elegant solution, but it's what I have working.
The following will extract the youtube username from a VideoEntry object...
$videoFeed = $yt->getuserUploads("default");
if (count($videoFeed) > 0)
{
    $videoEntry = $videoFeed[0];
    $v_dump = var_export($videoEntry, true);
    $check_for = "http://gdata.youtube.com/feeds/api/users/";
    $pos = strpos($v_dump, $check_for);
    $start_pos = $pos + strlen($check_for);
    $user_name = "";
    for ($i = $start_pos; $i < strlen($v_dump); $i++)
    {
        if (($v_dump[$i] == '/') or ($v_dump[$i] == '?'))
            break;
        $user_name .= $v_dump[$i];
    }
    echo $user_name;
}
Basically, I'm parsing through the entire video entry variable string representation for the first generic part of the feed url, then getting the next token in the url which is the username. In other words, I'm locating this url in the var dump and parsing the username out:
http://gdata.youtube.com/feeds/api/users/<username>/uploads
If anyone can find a better and cleaner way to do it, that'd be awesome. Parsing a large string like this seems to be a very dirty way of doing this.
See http://code.google.com/apis/youtube/2.0/developers_guide_php.html#Enabling_user_interaction
It says,
Note: Using the string default instead of a username retrieves the profile of the currently authenticated user.
So you can get all the user data about the currently authenticated user using this.
Hope it helps!
Try this:
$user = $yt->getUserProfile('default');
$username = $user->getUserName();
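If you then want to check it against the name the user typed in, a small sketch (the 'youtube_username' form field is just an illustrative name, not part of any API):

$user = $yt->getUserProfile('default');
$username = $user->getUserName();

// 'youtube_username' is a hypothetical form field holding the name the user typed in
$provided_username = trim($_POST['youtube_username']);
if (strcasecmp($username, $provided_username) !== 0) {
    die('The YouTube username you entered does not match the authorized account.');
}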
I need to find the number of indexed pages in google for a specific domain name, how do we do that through a PHP script?
So,
foreach ($allresponseresults as $responseresult)
{
    $result[] = array(
        'url' => $responseresult['url'],
        'title' => $responseresult['title'],
        'abstract' => $responseresult['content'],
    );
}
what do I add for the estimated number of results, and how do I do that?
I know it is (estimatedResultCount), but how do I add that? I call the title, for example, this way: $result['title']. So how do I get the number, and how do I print it?
Thank you :)
I think it would be nicer to Google to use their RESTful Search API. See this URL for an example call:
http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site:stackoverflow.com&filter=0
(You're interested in the estimatedResultCount value)
In PHP you can use file_get_contents to get the data and json_decode to parse it.
You can find documentation here:
http://code.google.com/apis/ajaxsearch/documentation/#fonje
Example
Warning: The following code does not have any kind of error checking on the response!
function getGoogleCount($domain) {
    $content = file_get_contents('http://ajax.googleapis.com/ajax/services/' .
        'search/web?v=1.0&filter=0&q=site:' . urlencode($domain));
    $data = json_decode($content);
    return intval($data->responseData->cursor->estimatedResultCount);
}
echo getGoogleCount('stackoverflow.com');
You'd load http://www.google.com/search?q=domaingoeshere.com with cURL and then parse the file looking for the <p id="resultStats"> bit.
You'd have the resulting HTML stored in a variable $html and then say something like:
$arr = explode('<p id="resultStats">', $html);
$bottom = $arr[1];
$middle = explode('</p>', $bottom);
Please note that this is untested and a very rough example. You'd be better off parsing the html with a dedicated parser or matching the line with regular expressions.
The Google AJAX API's estimatedResultCount value doesn't give the right number.
And trying to parse the HTML results is not a good approach, because Google blocks you after several searches.
Count the number of results for site:yourdomainhere.com - stackoverflow.com has about 830k
// This will give you the count you see on the search results web page;
// this code gets the HTML content with file_get_contents.
header('Content-Type: text/plain');
$url = "https://www.google.com/search?q=your url";
$html = file_get_contents($url);
if (FALSE === $html) {
    throw new Exception(sprintf('Failed to open HTTP URL "%s".', $url));
}
$arr = explode('<div class="sd" id="resultStats">', $html);
$bottom = $arr[1];
$middle = explode('</div>', $bottom);
echo $middle[0];
Output:
About 8,130 results
Case 2: you can also use the Google API, but its count is different:
https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=ursitename&callback=processResults
https://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=site:google.com
cursor":{"resultCount":"111,000,000","
"estimatedResultCount":"111000000",