Simplehtmldom - limit content size for get_html? - php

I'm using simplehtmldom to get the title of some links, and I'm wondering whether I can limit the size of the downloaded content. Instead of downloading the whole page, I'd like to fetch just the first 20 lines or so, enough to get the title.
Right now I'm using this:
$html = file_get_html($row['current_url']);
$e = $html->find('title', 0);
$title = $e->innertext;
echo $e->innertext . '<br><br>';
thanks

Unless I've missed something, that's not the way file_get_html works. It's going to retrieve the contents of the page.
In other words, it would have to read the entire page in order to find what it's looking for in the next part.
Now, if you were to use:
$section = file_get_contents('http://www.the-URL.com/', false, null, 0, 444);
This reads only the first 444 bytes of the response (the fifth argument is a byte count, not a line count).
That is enough as long as the <title> element always falls within that range, which holds when every page you fetch has a head section of roughly the same size.
Then use:
$html = str_get_html($section);
And from there use find as before:
$html->find('title', 0);
EDIT:
include('simple_html_dom.php');
$the_url = 'http://www.the-URL.com/';
// Read the first 444 bytes of the page
$section = file_get_contents($the_url, false, null, 0, 444);
$html = str_get_html($section);
if (!$html || !($e = $html->find('title', 0))) {
    // Title not found yet: read the first 888 bytes instead. The fifth argument is a
    // length, not an end position, and offsets are unreliable on remote URLs, so
    // start from 0 again rather than from byte 444.
    $section = file_get_contents($the_url, false, null, 0, 888);
    $html = str_get_html($section);
    $e = $html ? $html->find('title', 0) : null;
}
// $e is null when no complete <title> was found in the bytes read
$title = $e ? $e->innertext : '';
echo $title . '<br><br>';
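If the size of the head section is not known in advance, another option is to read the response in small chunks and stop as soon as the closing </title> tag has arrived. The following is only a sketch of that idea: read_title_from_stream, its chunk size and its byte cap are hypothetical names and values, and it uses a regular expression instead of simplehtmldom to pull out the title.

```php
<?php
// Hypothetical helper (not part of simplehtmldom): read an open stream in
// small chunks and stop once a complete <title> element has been seen.
function read_title_from_stream($handle, int $chunkSize = 512, int $maxBytes = 16384): ?string
{
    $buffer = '';
    while (!feof($handle) && strlen($buffer) < $maxBytes) {
        $chunk = fread($handle, $chunkSize);
        if ($chunk === false || $chunk === '') {
            break; // nothing more to read
        }
        $buffer .= $chunk;
        // Match on the accumulated buffer so a title split across chunks is still found.
        if (preg_match('~<title[^>]*>(.*?)</title>~is', $buffer, $match)) {
            return trim(html_entity_decode($match[1], ENT_QUOTES | ENT_HTML5, 'UTF-8'));
        }
    }
    return null; // no complete title within $maxBytes
}

// Demo on an in-memory stream standing in for a downloaded page.
$page = '<!DOCTYPE html><html><head><title>Example &amp; Co</title></head><body>'
      . str_repeat('x', 5000) . '</body></html>';
$handle = fopen('php://memory', 'r+');
fwrite($handle, $page);
rewind($handle);
$title = read_title_from_stream($handle);
$bytesRead = ftell($handle);
fclose($handle);
echo $title, "\n";
```

To use it on a real URL, open the page with fopen($row['current_url'], 'r') (this needs allow_url_fopen to be enabled), pass the handle to the function and close it afterwards.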

Related

Printing text file contents into columns

I have this text file:
https://drive.google.com/file/d/0B_1cAszh75fYSjNPZFRPb0trOFE/view?usp=sharing
I can print it using the following code:
$file = fopen("gl20160630.txt","r");
while(! feof($file))
{
echo fgets($file). "<br />";
}
fclose($file);
But it looks like this:
I want the contents of this text file to be separated into four columns -Line, Description, Legacy GL Code and Closing Balance. If any one of these columns is empty it should remain empty. I just want to print those lines that start with ====>
Could you please help me find a way to print the text file like the way I want?
It's actually pretty simple, since every column in your file has a fixed number of characters.
All you need to do is take each line that starts with '====>{line}' and read each column with substr at its fixed position in the line.
Here is an example using your file :
$file = fopen("gl20160630.txt", "r");
// fgets() returns false at end of file, which ends the loop cleanly
while (($fullLine = fgets($file)) !== false) {
    // The four characters after '====>' hold the line number
    $line = substr($fullLine, 5, 4);
    if (is_numeric($line)) {
        $liability = trim(substr($fullLine, 10, 30));
        $legacy = trim(substr($fullLine, 40, 39));
        $balance = trim(substr($fullLine, 79, 15));
        // Print only rows where all three columns are filled
        if ($liability !== '' && $legacy !== '' && $balance !== '') {
            echo $line." ".$liability." ".$legacy." ".$balance."\n";
        }
    }
}
fclose($file);
You can see that all I do is:
check that the characters in the 'Line' column are numeric
then read all the other columns
'clean' them by stripping the surrounding spaces with trim
after that, check that all the columns are filled
and finally display them
I hope this helps, have a nice day ;)
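The question asked for empty columns to stay empty rather than for such rows to be skipped. A variant along those lines could look like the sketch below. It reuses the column offsets from the answer above, which are assumed to match the file, and parse_gl_line is a hypothetical helper name.

```php
<?php
// Hypothetical helper: split one fixed-width line into its four columns.
// Returns null for lines that do not start with '====>'.
function parse_gl_line(string $fullLine): ?array
{
    if (strncmp($fullLine, '====>', 5) !== 0) {
        return null;
    }
    // Offsets and widths are taken from the answer above (assumed layout).
    return [
        'line'        => trim((string) substr($fullLine, 5, 4)),
        'description' => trim((string) substr($fullLine, 10, 30)),
        'legacy'      => trim((string) substr($fullLine, 40, 39)),
        'balance'     => trim((string) substr($fullLine, 79, 15)),
    ];
}

// Demo with a made-up line whose Legacy GL Code column is empty.
$sample = sprintf('====>%4s %-30s%-39s%15s', '12', 'Cash at bank', '', '100.00');
$row = parse_gl_line($sample);
echo implode(' | ', $row), "\n";
```

Inside the fgets loop, each non-null result can be printed as one table row, with an empty cell wherever a column is blank.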

Linebreak on text to create paragraphs php pdo

Is there a way to insert line breaks after certain sentences in the text fetched with PDO, so that each result is split into two or more paragraphs? I'm trying to achieve something like this pic
While my output was this one
I tried the answers in a similar topic, like this one: Multiple paragraphs
using code similar to
$content = $row['content'];
$breakpoint = round($content.length / 2); // half of the string length
$first = substr($content, 0, $breakpoint);
$second = substr($content, $breakpoint);
But it gives me an "undefined constant length" error Y.Y
This is my code block that outputs the two result columns, "details" and "more_details":
<?php if(isset($_GET['page'])):?>
<?php foreach($courses as $row):?>
<h1><?php echo $row['Fullname'];?></h1>
<?php echo $slug;?>
<hr><p>
<?php echo $row['details'];?>
</p>
<br>
<p>
<?php echo $row['more_details'];?>
</p>
<?php endforeach;?>
I found an answer thanks to forbs, but I'm open to another method if there is one.
Answer:
$content = $row['content'];
$breakpoint = round(strlen($content)/ 2); // half of the string length
$first = substr($content, 0, $breakpoint);
$second = substr($content, $breakpoint);
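Well, here's the reason for the "undefined constant length" error.
$content.length is JavaScript syntax, and you are in PHP. In PHP the dot is the concatenation operator, so length is read as a constant that doesn't exist. Use strlen instead:
$breakpoint = round(strlen($content)/ 2);
That should fix it, though the result will look strange when the break lands in the middle of a word.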
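To avoid breaking in the middle of a word, the split point can be moved to the nearest space at or after the midpoint. This is a minimal sketch of that idea; split_at_word_boundary is a hypothetical helper name, and it only looks for plain spaces.

```php
<?php
// Hypothetical helper: split text into two parts at the first space at or
// after the midpoint, so that no word is cut in half.
function split_at_word_boundary(string $content): array
{
    $mid = intdiv(strlen($content), 2);
    $pos = strpos($content, ' ', $mid);
    if ($pos === false) {
        // No space after the midpoint: fall back to the last space before it.
        $pos = strrpos($content, ' ');
    }
    if ($pos === false) {
        return [$content, '']; // a single word, nothing to split
    }
    return [substr($content, 0, $pos), ltrim(substr($content, $pos))];
}

[$first, $second] = split_at_word_boundary('one two three four');
echo '<p>' . $first . '</p><p>' . $second . '</p>';
```

strlen counts bytes rather than characters, but because the cut is always made at an ASCII space it does not split a multi-byte UTF-8 character.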

PHP: Get specific content of a website

I want to get specific content of a website into an array.
I have approx. 20 sites to fetch the content from and output in whatever way I like. Only the port changes (it isn't always 27015; it may be 27016 and so on).
This is just one: SOURCE-URL of Content
For now, I use this PHP code to fetch the game icon "cs.png", but the icon name varies in length, so a fixed length isn't the best way, is it? :-/
$srvip = '148.251.78.214';
$srvlist = array('27015');
foreach ($srvlist as $srvport) {
$source = file_get_contents('http://www.gametracker.com/server_info/'.$srvip.':'.$srvport.'/');
$content = array(
"icon" => substr($source, strpos($source, 'game_icons64')+13, 6),
);
echo $content[icon];
}
Thanks for helping, it has been a while since I last worked with PHP :P
You just need to look for the first " that comes after the game_icons64 and read up to there.
$srvip = '148.251.78.214';
$srvlist = array('27015');
foreach ($srvlist as $srvport) {
    $source = file_get_contents('http://www.gametracker.com/server_info/'.$srvip.':'.$srvport.'/');
    // find where game_icons64 starts; strpos returns false if it is missing
    $marker = strpos($source, 'game_icons64');
    if ($marker === false) {
        continue;
    }
    // skip the 12 characters of game_icons64 plus the slash after it
    $first_occurance = $marker + 13;
    // find the first occurrence of " after game_icons64, where the src attribute of the img ends
    $second_occurance = strpos($source, '"', $first_occurance);
    $content = array(
        // take the substring from just after game_icons64/ up to, but not including, the closing "
        "icon" => substr($source, $first_occurance, $second_occurance-$first_occurance),
    );
    echo $content['icon'];
}
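Also, you had an error because you wrote $content[icon] and not $content['icon']. Without the quotes, PHP treats icon as a constant, which raises a notice or warning in older versions and an "Undefined constant" error in PHP 8.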
Edit to match the second request involving multiple strings
$srvip = '148.251.78.214';
$srvlist = array('27015');
$content_strings = array( );
// the first 2 items are the string you are looking for in your first occurrence and how many chars to skip from that position
// the third is what should be the first char after the string you are looking for, so the first char that will not be copied
// the last item is how you want your array / program to register the string you are reading
$content_strings[] = array('game_icons64', 13, '"', 'icon');
// to add more items to your search, just copy paste the line above and change whatever you need from it
foreach ($srvlist as $srvport) {
$source = file_get_contents('http://www.gametracker.com/server_info/'.$srvip.':'.$srvport.'/');
$content = array();
    foreach ($content_strings as $k => $v) {
        $marker = strpos($source, $v[0]);
        if ($marker === false) {
            continue; // search string not found on this page
        }
        $first_occurance = $marker + $v[1];
        $second_occurance = strpos($source, $v[2], $first_occurance);
        $content[$v[3]] = substr($source, $first_occurance, $second_occurance-$first_occurance);
    }
print_r($content);
}
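The marker-and-terminator search can also be wrapped in a small function, which makes it easy to try out on a sample string without fetching a page. In this sketch extract_between is a hypothetical helper name and the sample markup is made up for illustration; the real page may differ.

```php
<?php
// Hypothetical helper: return the text between $marker and the next
// $terminator, or null when either of them is missing.
function extract_between(string $source, string $marker, string $terminator): ?string
{
    $start = strpos($source, $marker);
    if ($start === false) {
        return null;
    }
    $start += strlen($marker); // move past the marker itself
    $end = strpos($source, $terminator, $start);
    if ($end === false) {
        return null;
    }
    return substr($source, $start, $end - $start);
}

// Made-up markup, assumed to resemble the img tag on the real page.
$sample = '<img src="/images/game_icons64/cs.png" alt="Counter-Strike">';
$icon = extract_between($sample, 'game_icons64/', '"');
echo $icon, "\n";
```

Using strlen($marker) instead of a hard-coded 13 keeps the skip length correct when the search string changes.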

Get date from earthtool - PHP & XML parsing

I found this web service which provides the date time of a timezone. http://www.earthtools.org/timezone-1.1/24.0167/89.8667
I want to call it & get the values like isotime with php.
So I tried
$contents = simplexml_load_file("http://www.earthtools.org/timezone-1.1/24.0167/89.8667");
$xml = new DOMDocument();
$xml->loadXML( $contents );
AND also with
file_get_contents
With file_get_contents I get only a string of values, not the XML format. Something like this
1.0 24.0167 89.8667 6 F 20 Feb 2014 13:50:12 2014-02-20 13:50:12 +0600 2014-02-20 07:50:12 Unknown
Nothing worked. Can anyone please help me get the isotime or the other values from that link using PHP?
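This works. The string of values you saw is most likely the XML itself with its tags hidden by the browser, so load the string with simplexml_load_string and read the nodes you need: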
$url = 'http://www.earthtools.org/timezone-1.1/24.0167/89.8667';
$nodes = array('localtime', 'isotime', 'utctime');
$cont = file_get_contents($url);
$node_values = array();
if ($cont && ($xml = simplexml_load_string($cont))) {
foreach ($nodes as $node) {
if ($xml->$node) $node_values[$node] = (string)$xml->$node;
}
}
print_r($node_values);
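The same parsing can be tried without calling the service by feeding simplexml_load_string a literal string. The XML below is an assumed sample, reconstructed from the values shown in the question, so its element names other than localtime, isotime and utctime are guesses at the real response.

```php
<?php
// Assumed sample response; only localtime, isotime and utctime are
// confirmed by the answer above, the other element names are guesses.
$sample = <<<XML
<timezone>
  <version>1.0</version>
  <location>
    <latitude>24.0167</latitude>
    <longitude>89.8667</longitude>
  </location>
  <offset>6</offset>
  <localtime>20 Feb 2014 13:50:12</localtime>
  <isotime>2014-02-20 13:50:12 +0600</isotime>
  <utctime>2014-02-20 07:50:12</utctime>
</timezone>
XML;

$values = [];
$xml = simplexml_load_string($sample);
if ($xml !== false) {
    foreach (['localtime', 'isotime', 'utctime'] as $node) {
        if (isset($xml->$node)) {
            // Cast to string, otherwise the value stays a SimpleXMLElement.
            $values[$node] = (string) $xml->$node;
        }
    }
    // Nested elements are reached by chaining property access.
    $values['latitude'] = (string) $xml->location->latitude;
}
print_r($values);
```

Casting each node to string matters when the values are stored or compared later, since SimpleXML returns objects rather than plain strings.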

How to get correctly content and avoid breaking html tags using strip_tags with substr?

In my page I have some post previews from RSS feeds. Every post preview shows about 300 characters. When a user clicks on expanding button, then the #post-preview is replaced with the #post. The #post shows the rest of the post.
Everything fine with this but the format of the #post is not good, not readable. So I thought of allowing <br><b><p> tags, it will make it ok to be read. Because I don't want the user to be distracted, I want the tags to be allowed after the 300 chars.
With the following method, it is possible to break some tags where the $start ends and $rest starts. This means no good readable output.
$start = strip_tags(substr($entry->description, 0, 300));
$rest = strip_tags(substr($entry->description, 300), '<b><p><br>');
$start . $rest;
My question is how can I keep $start and $rest the same (no tags) until the 300 char, and after that $rest will show the formatted post? Are there any other ways of doing this?
Here is an example of a RSS feed structure (from view page source).
<item><guid isPermaLink="false"></guid><pubDate></pubDate><atom:updated></atom:updated><category domain=""></category><title></title><description></description><link></link><author></author></item>
I am looking for a way that does not kill performance.
Something like:
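$start = substr($entry->description, 0, 300);
// A tag is cut in half when the last "<" has no ">" after it
$open = strrpos($start, "<");
$close = strrpos($start, ">");
if ($open !== false && ($close === false || $close < $open)) {
    $start = strip_tags(substr($start, 0, $open));
    $rest = substr($entry->description, $open);
}
else {
    $start = strip_tags($start);
    $rest = substr($entry->description, 300);
}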
OK, it's just a concept. It takes the first 300 chars and checks for a broken tag. If a tag is broken, it cuts before it and takes $rest from that point. If not, it just strips the tags and takes the rest. There is at least one problem:
you never know the length of $start (after strip_tags there could be nothing left). A loop with length checking would handle that, but at a cost in efficiency.
EDIT
OK, got it:
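$start = "";
$chars = 400;
$total = strlen($rss);
// Grow the raw slice until its tag-free text reaches 300 characters,
// and stop once the whole text has been consumed so the loop always ends
do {
    $start = strip_tags(substr($rss, 0, $chars));
    $chars += 50;
} while (strlen($start) < 300 && $chars - 50 < $total);
// Locate the last 50 visible characters in the raw text; $rest begins right after them
$tail = substr($start, -50);
$pos = strpos($rss, $tail);
$rest = ($pos === false) ? '' : substr($rss, $pos + strlen($tail));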
OK, it's a little nasty and there are some cases where it fails (probably with repeated text :D). Tested on Ideone.
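A different way to find the cut point is to walk through the raw text once, counting only the characters that sit outside tags, and to cut where that count reaches the limit. This is a rough sketch, not a full HTML parser: split_at_visible_length is a hypothetical helper name, it counts bytes rather than characters, and it assumes every "<" opens a tag and every ">" closes one.

```php
<?php
// Hypothetical helper: cut $html after $limit visible (non-tag) bytes.
// Returns [tag-free start, raw rest], with the cut never inside a tag.
function split_at_visible_length(string $html, int $limit): array
{
    $visible = 0;
    $inTag = false;
    $length = strlen($html);
    $cut = $length; // default: the whole text fits within the limit
    for ($i = 0; $i < $length; $i++) {
        $char = $html[$i];
        if ($char === '<') {
            $inTag = true;
        } elseif ($char === '>') {
            $inTag = false;
        } elseif (!$inTag && ++$visible >= $limit) {
            $cut = $i + 1;
            break;
        }
    }
    return [strip_tags(substr($html, 0, $cut)), substr($html, $cut)];
}

$description = '<p>Hello <b>world</b></p><p>More text</p>';
[$start, $rest] = split_at_visible_length($description, 5);
$rest = strip_tags($rest, '<b><p><br>');
echo $start . $rest;
```

Because the text is scanned only once, the cost stays linear in the length of the description, and $start always ends up with exactly the visible characters before the cut.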
