I am trying to get all links on a given page whose URL contains a specific path, using phpQuery. I am using phpQuery's PHP syntax.
include_once 'phpQuery.php';
$url = 'http://www.phonearena.com/phones/manufacturer/';
$doc = phpQuery::newDocumentFile($url);
$urls = $doc['a'];
foreach ($urls as $url) {
echo pq($url)->attr('href') . '<br>';
}
The code above works, but it shows all the links.
I want to show only those containing "/phones/manufacturer/".
I tried this but it shows nothing:
include_once 'phpQuery.php';
$url = 'http://www.phonearena.com/phones/manufacturer/';
$doc = phpQuery::newDocumentFile($url);
$urls = $doc['a'];
foreach ($urls as $url) {
echo pq($url)->attr('href:contains("/phones/manufacturer/")') . '<br>';
}
Use the code below to get all URLs from that page:
$doc = new DOMDocument();
@$doc->loadHTML(file_get_contents('http://www.phonearena.com/phones/manufacturer/'));
$ahreftags = $doc->getElementsByTagName('a');
foreach ($ahreftags as $tag) {
echo "<br/>";
echo $tag->getAttribute('href');
echo "<br/>";
}
exit;
Try this. It uses the attribute-contains selector, which is covered in a short Italian guide and in the jQuery documentation:
include_once 'phpQuery.php';
$url = 'http://www.phonearena.com/phones/manufacturer/';
$doc = phpQuery::newDocumentFile($url);
$urls = $doc['a[href*="/phones/manufacturer/"]'];
foreach ($urls as $url) {
echo pq($url)->attr('href') . '<br>';
}
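For comparison, the same filter can be sketched without phpQuery, using PHP's built-in DOMDocument and DOMXPath. This is only an illustration under stated assumptions: the `$html` sample markup and the helper name `hrefsContaining` are made up here, and the XPath `contains()` function stands in for the `[href*="..."]` selector.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<html><body>'
      . '<a href="/phones/manufacturer/apple">Apple</a>'
      . '<a href="/news/latest">News</a>'
      . '<a href="/phones/manufacturer/nokia">Nokia</a>'
      . '</body></html>';

// Return the href of every <a> whose href contains $needle.
// Assumes $needle holds no double quote, since it is placed
// inside a double-quoted XPath string literal.
function hrefsContaining(string $html, string $needle): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // mute warnings from imperfect HTML
    $xpath = new DOMXPath($dom);

    $hrefs = [];
    // contains() plays the role of the CSS [href*="..."] selector
    foreach ($xpath->query('//a[contains(@href, "' . $needle . '")]') as $a) {
        $hrefs[] = $a->getAttribute('href');
    }
    return $hrefs;
}

foreach (hrefsContaining($html, '/phones/manufacturer/') as $href) {
    echo $href . '<br>';
}
```

With the sample markup, only the two manufacturer links are printed and the news link is skipped.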
Related
I'm trying to write a script that lists all the image URLs found at a specific URL. I used foreach in order to scan several pages, but I think it's not working well.
This is my code:
<?php
include('simple_html_dom.php');
$array = array($page, $page2);
$page = "https://www.dllo.dev";
$page2 = "https://www.dllo2.dev";
$html = new simple_html_dom();
$html->load_file($array);
$images = array();
foreach($html->find('img') as $element) {
$images[] = $element->src;
}
reset($images);
echo "URL $array:<br /><br />";
foreach ($images as $out) {
$url = "$base$out";
echo "$url, ";
}
It's partially working, but only with the first URL ($page)... Any idea?
:D
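One likely cause, judging only from the code shown: `$array` is built before `$page` and `$page2` are assigned, and the whole array is handed to a single `load_file()` call. A minimal sketch of the usual fix is to loop over the pages and parse each one separately. The version below uses the built-in DOMDocument rather than simple_html_dom so it can run on its own, and the inline HTML strings are hypothetical stand-ins for the fetched pages.

```php
<?php
// Collect the src attribute of every <img> in one HTML string.
function imageSources(string $html): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // mute warnings from imperfect HTML
    $sources = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        $sources[] = $img->getAttribute('src');
    }
    return $sources;
}

// Hypothetical page contents keyed by base URL; in a real script each
// value would come from file_get_contents($base).
$pages = [
    'https://www.dllo.dev'  => '<html><body><img src="/a.png"><img src="/b.png"></body></html>',
    'https://www.dllo2.dev' => '<html><body><img src="/c.png"></body></html>',
];

$images = [];
foreach ($pages as $base => $html) {
    // Parse each page in its own pass, then prefix its base URL
    foreach (imageSources($html) as $src) {
        $images[] = $base . $src;
    }
}
echo implode(', ', $images), PHP_EOL;
```

The point of the design is that every page gets its own parse, so images from the second URL are no longer dropped.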
Let's say we load the source code of this question and want to find the URL that sits next to "childUrl"
(or open this page's source and search for "childUrl").
<?php
$sites_html = file_get_contents("https://stackoverflow.com/questions/46272862/how-to-find-urls-under-double-quote");
$html = new DOMDocument();
@$html->loadHTML($sites_html);
foreach() {
# now i want here to echo the link alongside "childUrl"
}
?>
Try this
<?php
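// Named extractUrls because extract() is a PHP built-in and cannot be redeclared.
function extractUrls($url) {
    $sites_html = file_get_contents($url);
    $html = new DOMDocument();
    @$html->loadHTML($sites_html); // mute warnings from malformed HTML
    // Walk every anchor and print the hrefs that contain the wanted string
    foreach ($html->getElementsByTagName('a') as $row) {
        $href = $row->getAttribute('href');
        if (strpos($href, "wanted_url") !== false) {
            echo $href;
        }
    }
}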
?>
You can use a regex.
Try this code:
$matches = [[],[]];
preg_match_all('/\"wanted_url\": \"([^\"]*?)\"/', $sites_html, $matches);
foreach($matches[1] as $match) {
echo $match;
}
This prints every URL that appears as the value of a "wanted_url" key.
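As a worked example of the regex approach, here is a small sketch run against a made-up source string. The helper name `urlsForKey` and the sample text are assumptions for illustration, and the pattern is a slightly looser variant that tolerates optional whitespace around the colon.

```php
<?php
// Return every quoted value that follows "<key>": in the given source text.
function urlsForKey(string $source, string $key): array
{
    // preg_quote() keeps special characters in the key from being read as regex syntax
    $pattern = '/"' . preg_quote($key, '/') . '"\s*:\s*"([^"]*)"/';
    preg_match_all($pattern, $source, $matches);
    return $matches[1];
}

// Hypothetical fragment of page source containing two childUrl entries.
$sample = 'a {"childUrl": "https://example.com/a"} b {"childUrl":"https://example.com/b"}';

foreach (urlsForKey($sample, 'childUrl') as $match) {
    echo $match, PHP_EOL;
}
```

If the page embeds JSON with escaped slashes (`\/`), the captured values keep those escapes, so a further unescaping step may be needed.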
I want to find a specific URL in an HTML page and extract parts of it.
The URL is on this page:
http://site1.com/games/arcade/139173-angry-birds-friends-1-7-0.html
and looks like this:
http://download.site2.org/?server=2&apkid=com.rovio.angrybirdsfriends&ver=1.7.0
I want 3 parts of it:
2
com.rovio.angrybirdsfriends
1.7.0
My code:
$html = file_get_contents("http://site1.com/games/name/139173-angry-birds-friends-1-7-0.html");
preg_match("/download(.*)/", $html, $results);
echo $results[0];
Is this what you are looking for?
$url = 'http://download.site2.org/?server=2&apkid=com.rovio.angrybirdsfriends&ver=1.7.0';
$query = parse_url($url, PHP_URL_QUERY);
parse_str($query, $params);
echo $params['server'], PHP_EOL;
echo $params['apkid'], PHP_EOL;
echo $params['ver'], PHP_EOL;
Output:
2
com.rovio.angrybirdsfriends
1.7.0
Update
// Read HTML
$html = file_get_contents(
'http://getandroidapp.org/games/arcade/'
. '139173-angry-birds-friends-1-7-0.html'
);
// Turn HTML into a DOM document
$dom = new DOMDocument();
@$dom->loadHTML($html); // Mute warnings
// Find anchor ...
foreach ($dom->getElementsByTagName('a') as $link) {
$href = $link->getAttribute('href');
// ... having a query part that starts with 'server='
if (preg_match('#\?server=#', $href)) {
$url = $href;
// Parse query string from href
$query = parse_url($url, PHP_URL_QUERY);
parse_str($query, $params);
// Display values
echo $params['server'], PHP_EOL;
echo $params['apkid'], PHP_EOL;
echo $params['ver'], PHP_EOL;
// One is enough
break;
}
}
Output:
2
com.rovio.angrybirdsfriends
1.7.0
It is not completely foolproof, but it may be good enough in your case.
I found this code to check for links on a URL.
<?php
$url = "http://example.com";
$input = @file_get_contents($url);
$dom = new DOMDocument();
$dom->strictErrorChecking = false;
@$dom->loadHTML($input);
$links = $dom->getElementsByTagName('a');
foreach($links as $link) {
if ($link->hasAttribute('href')) {
$href = $link->getAttribute('href');
if (stripos($href, 'shows') !== false) {
echo "<p>http://example.com" . $href . "</p>\n";
}
}
}
?>
It works well: it shows all the links that contain 'shows'.
For example, the script above finds 3 links, so I get:
<p>http://example.com/shows/Link1</p>
<p>http://example.com/shows/Link2</p>
<p>http://example.com/shows/Link3</p>
What I'm now trying to do is check the URLs I just fetched for further links that contain 'shows'.
To be honest, I'm a PHP noob, so I don't know where to start :(
Regards,
Bart
Something like:
function checklinks($url){
$input = @file_get_contents($url);
$dom = new DOMDocument();
$dom->strictErrorChecking = false;
@$dom->loadHTML($input);
$links = $dom->getElementsByTagName('a');
foreach($links as $link) {
if ($link->hasAttribute('href')) {
$href = $link->getAttribute('href');
if (stripos($href, 'shows') !== false) {
echo "<p>" . $url . "/" . $href . "</p>\n";
checklinks($url . "/" . $href);
}
}
}
}
$url = "http://example.com";
checklinks($url);
Make it recursive - call the function again in the function itself.
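A recursive crawl like this can loop forever when pages link back to each other, so a visited list and a depth limit are worth adding. The following is a minimal sketch under stated assumptions: `crawlLinks`, the `$fetch` callable, and the `$site` array are hypothetical names introduced so the example runs without network access, and hrefs are assumed to be root-relative (such as /shows/Link1).

```php
<?php
// Recursively visit links whose href contains $needle, starting at $url.
// $fetch is a hypothetical callable (URL => HTML string or false);
// in a real script it could wrap file_get_contents().
function crawlLinks(string $url, callable $fetch, string $needle,
                    array &$seen, int $depth = 2): void
{
    if ($depth < 0 || isset($seen[$url])) {
        return; // stop at the depth limit or on an already-visited page
    }
    $seen[$url] = true;

    $input = $fetch($url);
    if ($input === false || $input === '') {
        return; // nothing to parse
    }
    $dom = new DOMDocument();
    @$dom->loadHTML($input); // mute warnings from imperfect HTML

    foreach ($dom->getElementsByTagName('a') as $link) {
        $href = $link->getAttribute('href');
        if (stripos($href, $needle) === false) {
            continue;
        }
        // Assumption: hrefs are root-relative, so prefix the site root
        crawlLinks('http://example.com' . $href, $fetch, $needle, $seen, $depth - 1);
    }
}

// Hypothetical in-memory site whose pages link back to each other.
$site = [
    'http://example.com'             => '<a href="/shows/Link1">1</a><a href="/about">x</a>',
    'http://example.com/shows/Link1' => '<a href="/shows/Link2">2</a><a href="/shows/Link1">self</a>',
    'http://example.com/shows/Link2' => '<a href="/shows/Link1">back</a>',
];
$fetch = function (string $url) use ($site) {
    return $site[$url] ?? false;
};

$seen = [];
crawlLinks('http://example.com', $fetch, 'shows', $seen);
echo implode("\n", array_keys($seen)), PHP_EOL;
```

Each page is fetched at most once, even though Link1 and Link2 point at each other.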
I'm trying to dynamically add a link to the beginning of all the links in an RSS feed.
So far I have this which looks to me like it should work. What am I missing here?
<?php
$id = $_GET['id'];
$url = $_GET['url'];
$xml = new DOMDocument();
$xml->load("$url");
foreach($xml->getElementsByTagName('a') as $link) {
$link->setAttribute('href', 'http://$id.refsite/url/' . $link->getAttribute('href'));
}
echo $xml->saveXML();
?>
Edit: this section doesn't appear to be doing anything:
foreach($xml->getElementsByTagName('a') as $link) {
$link->setAttribute('href', 'http://$id.refsite/url/' . $link->getAttribute('href'));
}
Try calling removeAttribute first and then setAttribute on the href, like this:
$get_url = $link->getAttribute('href');
$newURL= "http://$id.refsite/url/".$get_url;
//remove and set href attribute
$link->removeAttribute('href');
$link->setAttribute("href", $newURL);
Just answered my own question.
This is what I was trying to do
<?php
$id = $_GET['id'];
$url = $_GET['url'];
$page = file_get_contents("$url");
$pagefixed = str_replace("http://","http://$id.refsite/url/","$page");
echo $pagefixed;
?>
sometimes you just have a moment, lol
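For anyone who wants to keep the DOM approach: a possible reason the original loop did nothing is that an RSS 2.0 feed carries its URLs as the text of `<link>` elements, not as `<a href>` attributes, so `getElementsByTagName('a')` may simply find no nodes. Note also that `'http://$id.refsite/url/'` in single quotes is not interpolated. A minimal sketch, assuming an RSS 2.0 feed; the helper name `prefixRssLinks`, the sample feed, and the `$id` value are made up for illustration.

```php
<?php
// Prefix the text of every <link> element in an RSS document.
function prefixRssLinks(string $rss, string $prefix): string
{
    $xml = new DOMDocument();
    $xml->loadXML($rss);

    foreach ($xml->getElementsByTagName('link') as $link) {
        // Build a fresh text node so characters such as & are escaped safely
        $text = $xml->createTextNode($prefix . $link->textContent);
        while ($link->firstChild) {
            $link->removeChild($link->firstChild);
        }
        $link->appendChild($text);
    }
    return $xml->saveXML();
}

$id = 'abc'; // hypothetical value; the original reads it from $_GET['id']

// Hypothetical minimal feed standing in for the one loaded from $url.
$rss = '<rss version="2.0"><channel>'
     . '<title>Demo</title><link>http://example.com/</link>'
     . '<item><title>One</title><link>http://example.com/one</link></item>'
     . '</channel></rss>';

// Double quotes are needed so that $id is interpolated into the prefix
$output = prefixRssLinks($rss, "http://$id.refsite/url/");
echo $output;
```

Unlike the `str_replace` approach, this only touches `<link>` elements and leaves any other "http://" text in the feed alone.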