I'm using cURL with Simple HTML DOM to scrape a website, and to fix relative links I insert a base tag like this:
foreach($html->find('head') as $f) {
$f->innertext = "<base href='$url'>" . $f->innertext;
}
Here $url is the website I'm scraping. The problem is that the links are still output like this:
link
While I need the full url in the link like so:
link
How can I achieve this?
Append the URL to your base URL each time you set it.
$base_url = "http://www.somewebsite.com/";
foreach($html->find('head') as $f) {
$f->innertext = "<base href='$base_url$url'>" . $f->innertext;
}
Try to get the base URL like this:
<?php
$baseURL = "http://" . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'];
?>
Then prepend $baseURL to your href.
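If the goal is for every href in the output to be a full URL, rather than relying on the browser to apply a base tag, one option is to rewrite each link while walking the DOM. Below is a minimal sketch, assuming a hypothetical helper named absolutize_url() (it is not part of Simple HTML DOM) and that $url is the page that was fetched. It covers the common cases only: it does not collapse ../ segments and does not treat query-only or fragment-only links specially.

```php
<?php
// Hypothetical helper: turn a possibly relative link into an absolute URL.
function absolutize_url(string $base, string $link): string
{
    // Already absolute (has a scheme such as http: or mailto:): leave untouched.
    if (preg_match('#^[a-z][a-z0-9+.\-]*:#i', $link)) {
        return $link;
    }

    $parts  = parse_url($base);
    $scheme = $parts['scheme'] ?? 'http';
    $host   = $parts['host'] ?? '';
    $port   = isset($parts['port']) ? ':' . $parts['port'] : '';
    $origin = $scheme . '://' . $host . $port;

    // Protocol-relative link, e.g. //cdn.example.com/x.js
    if (strpos($link, '//') === 0) {
        return $scheme . ':' . $link;
    }
    // Root-relative link, e.g. /about.html
    if (strpos($link, '/') === 0) {
        return $origin . $link;
    }
    // Document-relative link: resolve against the directory of the base path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $link;
}

// Usage with Simple HTML DOM (assumes $html and $url from the question):
// foreach ($html->find('a') as $a) {
//     $a->href = absolutize_url($url, $a->href);
// }
```

The same helper could be applied to img src attributes in the same loop style.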
I am making a redirect page in WordPress. The PHP script should send the visitor back to the site's home page, but I can't get the home page URL.
My PHP file is at xampp\htdocs\wordpress\return.php.
Here is my code:
$url = "$_SERVER[HTTP_HOST]";
header('Refresh: 3;url=' . $url);
echo get_option('home');
echo $url;
The $url is localhost:8080/wordpress/return.php.
I want to go from localhost:8080/wordpress/return.php to localhost:8080/wordpress.
How can I get the URL localhost:8080/wordpress?
Thanks
WordPress has a built-in function for that: wp_redirect (see the documentation).
require_once( dirname(__FILE__) . '/wp-load.php' ); // first you need to load WordPress libraries if you are in an external file.
wp_redirect( home_url() );
exit; // don't forget to exit after a redirection
From what I understand, you're trying to redirect your page from localhost:8080/wordpress/return.php to localhost:8080/wordpress/ using -
$url = "$_SERVER[HTTP_HOST]";
header('Refresh: 3;url=' . $url);
What you need to do is change your $url variable to the location where you want to redirect, which is -
$url = "http://localhost:8080/wordpress/";
header('Refresh: 3; url =' . $url);
Hope that's what you were looking for.
EDIT -
If you don't want to hard code the URL, you can try the following -
$url = "/wordpress";
header("Refresh: 3; url = http://" . $_SERVER['HTTP_HOST'] . $url);
From my understanding of your question, you want to go back one level from the current page. Is that right?
If so, you can accomplish that by doing some string manipulation as follows:
<?php
// Given that your current url is in the '$url' var
$url = 'localhost:8080/wordpress/return.php';
// Find the position of the last forward slash
$pos = strrpos($url, '/');
// Get a substring of $url starting at position 0 to $pos
// (if you want to include the slash, add 1 to the position)
$new_url = substr($url, 0, $pos + 1);
// Then you can have the redirection code using the $new_url variable
...
Please let me know if I misunderstood.
Hope it helps. Cheers.
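For completeness, here is a minimal sketch that builds the "one level up" URL from the request itself rather than from a hard-coded string. The function name parent_directory_url() is hypothetical, and the sketch assumes the request URI starts with a slash (as $_SERVER['REQUEST_URI'] normally does) and that the site is served over plain http unless told otherwise.

```php
<?php
// Hypothetical helper: URL of the directory that contains the current script.
function parent_directory_url(string $host, string $requestUri, string $scheme = 'http'): string
{
    // Drop any query string.
    $path = explode('?', $requestUri, 2)[0];
    // Keep everything up to and including the last forward slash.
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $scheme . '://' . $host . $dir;
}

// Usage inside return.php:
// $target = parent_directory_url($_SERVER['HTTP_HOST'], $_SERVER['REQUEST_URI']);
// header('Refresh: 3; url=' . $target);
```

Note that HTTP_HOST already includes the port when it is not the default one, so it does not need to be added separately.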
I'm scraping some HTML from a website using PHP Simple HTML DOM, and it includes several images. However, the image paths are relative, so they do not point to the original website. Below is an example of one of the images. Is it possible to dynamically change the URLs so that they point to the website, for instance:
http://www.url.com/bilder/flags_long/United States.gif
HTML example:
<img src="/bilder/flags_long/United States.gif" align="absmiddle" title="United States" alt="United States" border="0">
Sample code:
include('simple_html_dom.php');
$sum_gosu = file_get_html("http://www.gosugamers.net/counterstrike/news/30995-starladder-is-back-with-the-thirteenth-edition-of-starseries");
$gosu_full = $sum_gosu->find("//div[@class='content light']/div[@class='text clearfix']/div", 0);
How about concatenating the actual URL you fetched the document from with the relative image paths? Just to give an idea (this is not tested, and you should definitely check whether the image src attribute is relative or already absolute in some cases):
<?php
$url = 'http://www.url.com/';
$html = file_get_html($url);
$images = array();
foreach($html->find('img') as $img) {
// Option 1: Fill your images array (in case you only need the images)
$images[] = rtrim($url, '/') . '/' . ltrim($img->src, '/');
// Option 2: Update $img->src inside your $html document
$img->src = rtrim($url, '/') . '/' . ltrim($img->src, '/');
}
?>
UPDATE: According to your sample code, my example could look as follows:
<?php
include('simple_html_dom.php');
$sum_gosu_url = "http://www.gosugamers.net/counterstrike/news/30995-starladder-is-back-with-the-thirteenth-edition-of-starseries";
$sum_gosu = file_get_html($sum_gosu_url);
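$gosu_full = $sum_gosu->find("//div[@class='content light']/div[@class='text clearfix']/div", 0);
// The src values are root-relative (/bilder/...), so prefix them with the site root, not the full page URL
$site_root = parse_url($sum_gosu_url, PHP_URL_SCHEME) . '://' . parse_url($sum_gosu_url, PHP_URL_HOST);
foreach($gosu_full->find('img') as $img) {
$img->src = $site_root . $img->src;
}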
?>
After that the img src attributes inside your $gosu_full document should be fixed and resolvable (downloadable by a client). Hope that helps and that I'm actually understanding your problem :)
$url="http://www.url.com";
$Chtml = file_get_html($url);
$imgurl=$Chtml->find("img",0)->src;
echo $url.$imgurl;
Hello, I'm currently using PHP to generate a menu for a self-built CMS.
I'm making a dynamic link with: $url = $_SERVER['HTTP_HOST'].$_SERVER['REQUEST_URI']."/";
Then I'm appending $row_menu['page_link'] from the database. At first it works perfectly.
For example:
$row_menu['page_link'] = 'page2';
$url . $row_menu['page_link'];
This returns, for example: http://example.com/page2
But when I click again, it adds page2 again, like: http://example.com/page2/page2
How do I prevent this?
Thanks in advance!
On the first request $_SERVER['REQUEST_URI'] is just /, but after the user clicks the link it becomes /page2. Your code appends the page link to that, which is why page2 shows up twice.
Instead you can use HTTP_REFERER like
$url = $_SERVER['HTTP_REFERER'].$row_menu['page_link'];
This assumes $_SERVER['HTTP_REFERER'] resolves to http://example.com. Alternatively, you can try:
$protocol = 'http';
$url = $protocol .'://'. $_SERVER['HTTP_HOST'] .'/'. $row_menu['page_link'];
REQUEST_URI will give you whatever comes after example.com, so leave that out altogether.
$url = $_SERVER['HTTP_HOST'] . "/" . $row_menu['page_link'];
You can find a full list of the $_SERVER references here.
Try this:
$requested_uri = $_SERVER['REQUEST_URI'];
$host = $_SERVER['HTTP_HOST'];
$uri_segments = explode('/',$requested_uri);
$row_menu['page_link'] = 'page2';
if($row_menu['page_link'] == $uri_segments[sizeof($uri_segments)-1]) {
array_pop($uri_segments);
}
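// REQUEST_URI starts with a slash, so trim it to avoid a double slash in the result
$uri = trim(implode('/',$uri_segments), '/');
$url = 'http://'.$host.'/'.($uri === '' ? '' : $uri.'/').$row_menu['page_link'];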
echo $url;
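The host-only approach from the answers above can be wrapped in a small function so that the menu link comes out the same no matter which page is currently being viewed. This is a minimal sketch: menu_url() is a hypothetical name, and it assumes all pages live directly under the site root.

```php
<?php
// Hypothetical helper: build an absolute menu link from the host and a page slug.
function menu_url(string $host, string $pageLink, bool $https = false): string
{
    $scheme = $https ? 'https' : 'http';
    // ltrim() guards against slugs stored with a leading slash.
    return $scheme . '://' . $host . '/' . ltrim($pageLink, '/');
}

// Usage in the menu loop (REQUEST_URI is deliberately not used):
// $https = !empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off';
// echo menu_url($_SERVER['HTTP_HOST'], $row_menu['page_link'], $https);
```

Because the current path never enters the result, clicking the link repeatedly cannot stack page2 onto itself.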
I am trying to pass a URL parameter as follows:
myscript.php?video=some.site.com/vids/v1.wmv
I am using the following in my PHP file
// get url parameter for video selected
$jAp = JFactory::getApplication();
$jInput = $jAp->input;
$video = $jInput->get('video');
header("Location: http://".$video);
but when I echo out $video I get the following
some.site.comvidsv1.wmv
There is no http:// in what is echoed out.
How can I do this?
Try this:
header("Location: http://".$video);
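With Joomla, JInput's get() takes a filter as its third argument, such as STRING, HTML and so on. The default filter is cmd, which strips characters like / and :, which is why the slashes disappear.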
Try the following:
$video = $jInput->get('video', '', 'RAW' );
Then you can use the following:
header("Location: " . $video);
For more information on JInput, have a read of the following:
http://docs.joomla.org/Retrieving_request_data_using_JInput
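Once the parameter arrives unfiltered, it may or may not already carry a scheme. Below is a minimal sketch of a check that adds http:// only when it is missing. The name with_scheme() is hypothetical and not part of Joomla. Keep in mind that redirecting to a user-supplied URL lets anyone craft links that bounce through your site, so in practice the host would usually be checked against a list of allowed hosts first.

```php
<?php
// Hypothetical helper: make sure a redirect target starts with http:// or https://.
function with_scheme(string $target, string $defaultScheme = 'http'): string
{
    // Leave the value alone when it already has an http or https scheme.
    if (preg_match('#^https?://#i', $target)) {
        return $target;
    }
    // Otherwise prepend the default scheme, dropping any stray leading slashes.
    return $defaultScheme . '://' . ltrim($target, '/');
}

// Usage (assumes $video was read with the RAW filter as shown above):
// header('Location: ' . with_scheme($video));
// exit;
```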
I'm trying to change a value in a string that's holding my current URL. I'm trying to get something like
http://myurl.com/test/begin.php?req=&srclang=english&destlang=english&service=MyMemory
to look like
http://myurl.com/test/end.php?req=&srclang=english&destlang=english&service=MyMemory
replacing begin.php for end.php.
I need the end.php to be stored in a variable so it can change, but begin.php can be a static string.
I tried this, but it didn't work:
$endURL = 'end.php';
$beginURL = 'begin.php';
$newURL = str_ireplace($beginURL,$endURL,$url);
EDIT:
Also, if I wanted to replace
http://myurl.com/begin.php?req=&srclang=english&destlang=english&service=MyMemory
with
http://newsite.com/end.php?req=&srclang=english&destlang=english&service=MyMemory
then how would I go about doing that?
Assuming that you want to replace the script filename in the URL, you can use something like this:
<?php
$endURL = 'end.php';
$url ="http://myurl.com/test/begin.php?req=&srclang=english&destlang=english&service=MyMemory";
$pattern = '/(.+)\/([^?\/]+)\?(.+)/';
$replacement = '${1}/'.$endURL.'?${3}';
$newURL = preg_replace($pattern , $replacement, $url);
echo "url : $url <br>";
echo "newURL : $newURL <br>";
?>
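As an alternative to the regular expression, the URL can be taken apart with parse_url() and reassembled, which also covers the case from the question's edit where the host changes too. This is a minimal sketch: swap_script() is a hypothetical name, and it assumes the URL has a scheme and host and carries no port, credentials or fragment.

```php
<?php
// Hypothetical helper: replace the script filename (and optionally the host) of a URL.
function swap_script(string $url, string $newScript, ?string $newHost = null): string
{
    $p     = parse_url($url);
    $host  = $newHost ?? $p['host'];
    $path  = $p['path'] ?? '/';
    // Directory part of the path, including the trailing slash.
    $dir   = substr($path, 0, strrpos($path, '/') + 1);
    // Keep the original query string untouched.
    $query = isset($p['query']) ? '?' . $p['query'] : '';
    return $p['scheme'] . '://' . $host . $dir . $newScript . $query;
}

$url = 'http://myurl.com/test/begin.php?req=&srclang=english&destlang=english&service=MyMemory';
$endURL = 'end.php';
echo swap_script($url, $endURL), "\n";
echo swap_script('http://myurl.com/begin.php?req=&srclang=english', $endURL, 'newsite.com'), "\n";
```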
How do you want them to get to end.php from begin.php? It seems like you could just do a form submit to end.php and pass in the variables via POST or GET.
The only way to change what page (end.php, begin.php) a user is on is to link them to another page from that page, this requires a page refresh.
I recently wrote a PHP file for this. It ended up looking like this:
$vars = $_SERVER["QUERY_STRING"];
$filename = $_SERVER["PHP_SELF"];
$filename = substr($filename, 4);
// for me substr removed 'abc/' in the beginning of the string, you can of course adjust this variable, this is the "end.php"-variable for you.
if (strlen($vars) > 0) $vars = '?' . $vars;
$resultURL = "http://somewhere.com" . $filename . $vars;