I am using the simple_html_dom script to gather certain information from external pages.
The script worked well until today. I tried to track down where the error comes from, and it seems the file_get_html function no longer works, but only for certain URLs.
The basic code I am using is:
<?php include_once('../simple_html_dom.php');
echo file_get_html('http://www.hltv.org/match/2295100-')->plaintext; ?>
When I execute this, I get a blank page on my OVH shared server. It seems the request returns a 503 error even though the page actually exists. I am able to extract the content of the page from other servers (like AWS), however. What troubles me is that it has been working for 4 months without any issue.
I made sure MAX_FILE_SIZE in simple_html_dom was increased, but that did not solve the problem.
Any ideas on how to solve this?
Thanks!
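Try this:
<?php
include_once('../simple_html_dom.php');
// file_get_html() returns false when the request fails, so check before use
$result = file_get_html('http://www.hltv.org/match/2295100-');
if ($result === false) {
    echo 'Could not load the page';
} else {
    echo $result->plaintext;
}
?>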
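To see why the request fails rather than just getting a blank page, it can help to read the HTTP status yourself. The following is a minimal sketch, not part of simple_html_dom: the helper names status_from_headers and fetch_with_status and the user-agent string are made up for illustration. It assumes allow_url_fopen is enabled on the server.

```php
<?php
// Hypothetical helper: extract the final HTTP status code from the header
// lines PHP collects in $http_response_header after an HTTP request.
function status_from_headers(array $headers): ?int
{
    $status = null;
    foreach ($headers as $line) {
        // With redirects there are several status lines; keep the last one.
        if (preg_match('#^HTTP/\S+\s+(\d{3})#', $line, $m)) {
            $status = (int) $m[1];
        }
    }
    return $status;
}

// Hypothetical helper: fetch a URL and return [status, body] so that a
// 4xx/5xx response does not turn into a silent failure.
function fetch_with_status(string $url): array
{
    $context = stream_context_create([
        'http' => [
            'ignore_errors' => true, // still return the body on a 503
            'timeout'       => 10,
            'user_agent'    => 'Mozilla/5.0 (compatible; example-fetcher)', // placeholder value
        ],
    ]);
    $body = @file_get_contents($url, false, $context);
    // The http wrapper fills $http_response_header in the local scope.
    $headers = $http_response_header ?? [];
    return [status_from_headers($headers), $body === false ? null : $body];
}
```

Calling fetch_with_status('http://www.hltv.org/match/2295100-') and printing the status shows whether the server really answers 503. If the status is 200, the body can be passed to str_get_html() from simple_html_dom. A 503 from one host but not from another may mean the remote site is rejecting requests by IP address or user agent, but that is only a guess to verify.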
This is a really weird situation that I can't explain.
I use simple HTML DOM and am trying to get the full code of this page:
http://ronilocks.com/
The thing is, I'm getting only part of what's actually on the page.
For instance: look at the page source code and see all the script tags that are in the plugins folder. There are quite a few.
When I check the string I get back from simple HTML DOM, none of them are there. Only wp-rocket.
(I used a clean file_get_html() and a file_get_contents() too and got the same result)
Any thoughts?
Thanks!
Edit: Is it possible that wp-rocket (installed on the page being scraped) knows that the page is being scraped and shows something different?
include 'simple_html_dom.php';
$html = file_get_html('http://ronilocks.com/');
echo count($html->find('a'));
// 425
I get 425. This looks right to me.
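A quick way to compare what a script receives with what the browser shows is to list the script sources in the fetched string. This is a rough diagnostic sketch only; script_sources and sources_containing are made-up helper names, and the regex is a shortcut rather than a real HTML parser.

```php
<?php
// Hypothetical diagnostic helper: list the src attribute of every <script>
// tag in a raw HTML string. A regex is a rough check, not a real parser.
function script_sources(string $html): array
{
    preg_match_all('#<script\b[^>]*\bsrc\s*=\s*["\']([^"\']+)["\']#i', $html, $matches);
    return $matches[1];
}

// Keep only the sources whose URL contains the given fragment.
function sources_containing(array $sources, string $fragment): array
{
    return array_values(array_filter(
        $sources,
        function ($src) use ($fragment) {
            return strpos($src, $fragment) !== false;
        }
    ));
}
```

For example, sources_containing(script_sources(file_get_contents('http://ronilocks.com/')), '/plugins/') lists the plugin scripts present in the fetched copy. If that list is empty while the browser's view source shows them, the server is returning different markup to the two clients. One possible explanation is a caching plugin serving a combined or cached version, but that would need to be confirmed on the site itself.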
I'm trying a very simple PHP script that fetches JSON data through an API call to an online HTTPS link, using MAMP.
However, if I use the following code I get blank results:
<?php
$cnmkt = "https://api.coinmarketcap.com/v1/ticker/?limit=50";
$json = file_get_contents($cnmkt);
$fgc = json_decode($json,true);
echo $fgc[1]['percent_change_7d'];
?>
But if I copy and paste the content of the HTTPS link into a local test.json file and point the $cnmkt variable at test.json instead of the HTTPS link, the exact same script works properly.
I know I'm missing something very obvious. If someone could help me, that would be very much appreciated. Thanks.
Stefano
The script is working fine. I get the expected result of 4.63.
Disable your AV/firewall and check again.
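Since the same script works with a local file, the blank output most likely means file_get_contents() returned false for the HTTPS URL, and json_decode() then had nothing to decode. The sketch below makes that failure visible instead of silent. The helper names field_from_json and https_wrapper_available are made up for illustration.

```php
<?php
// Hypothetical helper: decode a JSON list and return one field,
// raising an error instead of silently echoing nothing.
function field_from_json($json, int $index, string $key)
{
    if ($json === false) {
        // file_get_contents() returns false when the request itself failed
        throw new RuntimeException('Request failed: no data received');
    }
    $data = json_decode($json, true);
    if (!is_array($data)) {
        throw new RuntimeException('Invalid JSON: ' . json_last_error_msg());
    }
    if (!isset($data[$index][$key])) {
        throw new RuntimeException("Missing field $key at index $index");
    }
    return $data[$index][$key];
}

// True when this PHP build can open https:// URLs with file_get_contents().
function https_wrapper_available(): bool
{
    return in_array('https', stream_get_wrappers(), true);
}
```

With this in place, echo field_from_json(file_get_contents($cnmkt), 1, 'percent_change_7d'); either prints the value or throws an exception that says which step failed. If https_wrapper_available() returns false, the local PHP build lacks HTTPS support (typically the OpenSSL extension), which would explain why only the remote URL fails.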
I don't know how to explain my need, nor which keywords to use to find a solution on Google, so I'll give a URL to make it clearer:
check an IP (click on: Check your current IP address)
Using this website as an example, I'd like to get some information after all the processes have finished.
I tried with file_get_contents and with the cURL functions, but I did not find a way to do it; I always get the original source code.
Any ideas?
EDIT:
<body onLoad="setTimeout('get_my_blacklist()', 60000)">
...
...
<?php
echo '<iframe id="my_iframe" src="http://multirbl.valli.org/lookup/'.$ip.'.html"></iframe>';
?>
...
...
<script>
function get_my_blacklist()
{
// function to get the content after some seconds
}
</script>
Here is the new code I tried, thanks to @Ludovic for his iframe idea.
I'm still working on it; I'll tell you whether or not it solves my issue.
Edit2: However I try, I can't find a way to get the content of my frame window. And even if I succeeded, I don't know how I could update my database if I do it with jQuery/JavaScript.
First, the page is constructed by a server-side script such as PHP; at this step you have all the requested IPs. The page is then modified by a jQuery script that seems to query each IP.
The second step is an asynchronous script, so you can't know when the page has actually finished being built.
I had a big PHP script written out to scrape images from this site: "http://www.mcso.us/paid/", but when it didn't work I butchered my code to simply echo the whole page.
I found that the table with the image links I want doesn't show up. I believe it's because the remote site uses ASP to generate the table. Is there a way around this? Am I wrong? Please help.
<?php
include("simple_html_dom.php");
set_time_limit(0);
$baseURL = "http://www.mcso.us/paid/";
$html = file_get_html($baseURL);
echo $html;
?>
There's no obvious reason why their use of ASP would cause this. Have you tried navigating the page with JavaScript turned off? It's more likely that the tables are generated through JS.
Note that the search results are retrieved through AJAX (page http://www.mcso.us/paid/default.aspx) by making a POST request. You can use cURL (http://php.net/manual/en/book.curl.php) for this. In Chrome, right-click, choose Inspect Element, open the Network tab, and run a search; you will see all the details there (POST variables, etc.).
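The POST request described above can be sketched as follows. This version uses PHP stream contexts with file_get_contents() rather than cURL, simply to keep the example dependency-free. The helper names and the field names in the usage note are hypothetical; the real POST variables have to be copied from the browser's Network tab.

```php
<?php
// Hypothetical helper: build stream-context options for a form-style POST.
function post_context_options(array $fields): array
{
    return [
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            'content' => http_build_query($fields), // e.g. a=1&b=2
            'timeout' => 10,
        ],
    ];
}

// Send the POST and return the response body, or false on failure.
function post_form(string $url, array $fields)
{
    $context = stream_context_create(post_context_options($fields));
    return file_get_contents($url, false, $context);
}
```

A call would look like post_form('http://www.mcso.us/paid/default.aspx', ['lastName' => 'smith']), where lastName is a placeholder for whatever field the site actually sends. The returned HTML can then be passed to str_get_html(). ASP.NET WebForms pages commonly also expect hidden fields such as __VIEWSTATE; whether this page does is something to check in the recorded request.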
I'm creating a form where the user should be able to enter any text (used to change articles on the site). HTML, JavaScript, or literally anything else can be typed in and posted, and so far everything has worked. But today I suddenly got this strange error.
When I try to save text with HTML to a MySQL database like this:
google
nothing goes wrong, but when I try it like this:
<img src="http://www.google.com/" />
The page does not load (forbidden error) and the database does not contain any of the text it should contain (the HTML).
Instead the page shows the following error:
Forbidden
You do not have permission to access this document.
The same problem occurs when I try to post the following data:
src="http:
Why do I get a forbidden error when the post contains that specific piece of text? What's going on here?
Code I'm using:
if($_SERVER['REQUEST_METHOD']=="POST" && !empty($_POST['save'])){
$text = mysql_real_escape_string($_POST['textarea']);
$title = mysql_real_escape_string($_POST['title']);
$query = "INSERT INTO articles (text, title) VALUES ('".$text."','".$title."')";
When I remove the MySQL query I still get the error so it has nothing to do with the database. PHP safe mode is on, could that make a difference?
How can this be fixed?
Edit: I tried the complete application on my XAMPP server and it did not show the error, but on my hosting server I use the script in a password-protected directory. Could that be the problem? Anyway, I'm going to contact my hosting company.
It sounds a bit like mod_security, switched on and in its most aggressive mode, and it thinks you're trying to hack the site. The reason I say it only sounds a bit like that is because no-one should normally configure it to check POST data because that causes far too many false positives. But check the error log(s) as it will probably be listed there if it's that. If so you'll need to turn it off in the hosting settings or nag your host to do it.
Also try a bare minimum script: <?php var_dump($GLOBALS); ?> to see if the data reaches PHP at all.
try:
if($_POST && !empty($_POST['save'])){
$text = mysql_real_escape_string(htmlentities($_POST['textarea']));
$title = mysql_real_escape_string(htmlentities($_POST['title']));
Store this in your database, without the http:// prefix:
<a href=www.google.com>Google</a>
Then, when reading it back from the database, add the prefix:
echo "http://".$row_TabelName['RowName'];
It should solve your issue.
Alternatively, you can use the following:
base64_encode() before inserting, then base64_decode() after reading.
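The base64 round trip can be sketched as below. The helper names encode_for_storage and decode_from_storage are made up for illustration. Note one assumption: if the Forbidden error comes from a request filter such as mod_security, the text has to be encoded before the request is sent (for example with btoa() in the browser), so this PHP sketch only covers the server-side encode and decode steps.

```php
<?php
// Encode text so that markup such as src="http: never appears literally.
function encode_for_storage(string $text): string
{
    return base64_encode($text);
}

// Decode stored text; returns null if the value is not valid base64.
function decode_from_storage(string $stored): ?string
{
    $decoded = base64_decode($stored, true); // strict mode rejects bad input
    return $decoded === false ? null : $decoded;
}
```

On the receiving page, decode_from_storage($_POST['textarea']) would restore the original HTML. The trade-off is that base64 makes the stored text about a third larger and unsearchable in the database.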