Simple HTML DOM parsing isn't working - php

I am trying to learn HTML DOM parsing on my localhost, but I hit an error right at the beginning.
1) I have made a new project
2) I have downloaded files from simplehtmldom.sourceforge.net
3) I have put this code inside my HTML project
<!DOCTYPE html>
<!--
To change this license header, choose License Headers in Project Properties.
To change this template file, choose Tools | Templates
and open the template in the editor.
-->
<html>
<head>
<meta charset="UTF-8">
<title></title>
</head>
<body>
<?php
// put your code here
$html = file_get_html('http://www.google.com/');
?>
</body>
</html>
4) I get this error, when I run the script:
Fatal error: Uncaught Error: Call to undefined function file_get_html() in C:\xampp\htdocs\PhpProject1\index.php:15 Stack trace: #0 {main} thrown in C:\xampp\htdocs\PhpProject1\index.php on line 15
Am I misunderstanding something here?

I think the PHP Simple HTML DOM Parser library may no longer support the latest PHP version, so you could use cURL to fetch the page instead.
<?php
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, "http://www.google.co.in");
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
$result = curl_exec($curl);
curl_close($curl);
print $result;
?>
Try this..!

file_get_html is not a built-in PHP function.
To use it, you need to download and include PHP Simple HTML DOM Parser.
Then you can use it like this:
include_once('simple_html_dom.php');
$html = file_get_html('http://www.google.co.in');
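A minimal sketch of the same idea with a guard, so a wrong path produces a readable message instead of the fatal "undefined function" error. The path in $libraryPath is an assumption; adjust it to wherever simple_html_dom.php was actually unpacked.

```php
<?php
// Assumed location: the library file sits next to this script.
$libraryPath = __DIR__ . '/simple_html_dom.php';

// Only include the file if it is really there.
if (is_file($libraryPath)) {
    include_once $libraryPath;
}

if (function_exists('file_get_html')) {
    // The library was loaded, so file_get_html() is now defined.
    $html = file_get_html('http://www.google.com/');
    if ($html) {
        echo 'Links found: ', count($html->find('a')), "\n";
    } else {
        echo "The page could not be fetched or parsed.\n";
    }
} else {
    echo "simple_html_dom.php was not found at $libraryPath\n";
}
```

If the "not found" message appears, the include path is the problem, not the PHP version.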

You need to include the DOM parser library file in your page using PHP's include_once.
<html>
<head>
<meta charset="UTF-8">
<title></title>
</head>
<body>
<?php
include_once('simple_html_dom.php'); // adjust to your file location
// put your code here
$html = file_get_html('http://www.google.com/');
?>
</body>
</html>

Related

If you can't use html inside of php, then why can you call an open ended html script within php tags?

If we cannot use html code in php due to the php engine's inability to parse html code, then why can we include an open ended (no closing html tag) html script within php tags?
I've tried replacing the include call with the bare contents of the included file, but this triggers an unexpected end of file error (which makes sense, since php isn't able to parse the html).
register.php:
<?php
$page_title = 'Register';
// the following script echoes $page_title as title, links to stylesheet, and opens body
include ('includes/header.html');
header.html:
<!DOCTYPE HTML>
<html lang = "en">
<head>
<meta charset="UTF-8">
<title> <?php echo $page_title ; ?> </title>
<link rel = "stylesheet" href="includes/style.css">
</head>
<body>
<header> <h1>Page Header</h1></header>
I expected consistency between:
a) include('includes/header.html')
and
b) simply inserting the header.html code.
Error message from b) was standard for when you insert html code within php:
Parse error: syntax error, unexpected '<', expecting end of file in C:\Abyss Web Server\htdocs\register.php on line 6
include is not just some stupid preprocessor macro. It will not simply paste the contents of one file into another. It is a language construct that "moves into" the other file and processes it as if it were a file of its own, while maintaining the context of the parent file.
When a file is included, parsing drops out of PHP mode and into HTML
mode at the beginning of the target file, and resumes again at the
end. For this reason, any code inside the target file which should be
executed as PHP code must be enclosed within valid PHP start and end
tags. - https://www.php.net/manual/en/function.include.php
This is also a reason, why you can catch parse errors in the outer file, if the inner file cannot be parsed properly.
Another important nuance is this:
It is possible to execute a return statement inside an included file in order to terminate processing in that file and return to the script which called it. Also, it's possible to return values from included files. You can take the value of the include call as you would for a normal function.
This, however, does not apply to function definitions, which will be processed regardless of any return statements.
The closing PHP tag is only needed if you want to exit PHP mode into HTML in the same file. PHP will exit the PHP mode automatically at the end of every file, even the included ones.
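A small sketch of the behaviour described above, assuming a throwaway fragment file written to the temp directory (the file and its contents are made up for illustration). The included file starts in HTML mode, can read $page_title from the including script, and can hand a value back with return.

```php
<?php
$page_title = 'Register';

// Hypothetical included file: plain HTML first, then a PHP block that returns a value.
$fragment = tempnam(sys_get_temp_dir(), 'hdr');
file_put_contents(
    $fragment,
    '<title><?php echo $page_title; ?></title>' . "\n" . '<?php return 42;'
);

// Capture whatever the included file outputs while it runs.
ob_start();
$returned = include $fragment;   // the value of include is the file's return value
$output = ob_get_clean();
unlink($fragment);

echo $output;                    // the HTML part, with $page_title filled in
echo "include returned: $returned\n";
```

The first line of the fragment is never parsed as PHP, which is why the same HTML pasted directly between PHP tags fails with a parse error.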
You need to "turn off" PHP when you want to simply output HTML; otherwise PHP will try to treat it as PHP code and fail, as you've seen. For the situation you describe, the simplest answer is to leave PHP mode while the former contents of header.html are output.
<?php
$page_title = 'Register';
?>
<!DOCTYPE HTML>
<html lang = "en">
<head>
<meta charset="UTF-8">
<title> <?php echo $page_title ; ?> </title>
<link rel = "stylesheet" href="includes/style.css">
</head>
<body>
<header> <h1>Page Header</h1></header>
<?php # Turn PHP back on for whatever else is in register.php

php web page not loading

I am not an expert, but I am no noob at PHP, yet for whatever reason I am stumped as to why my document will not load. Here is my code.
<?php include 'header.php'; ?>
<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Home</title>
</head>
<body>
<p>Hello everyone</p>
</body>
</html>
When I pull out the PHP portion the HTML loads fine. Here is the code in my header.php file.
<?php
<href="index.php">Home</a>
?>
I have tried this on two different hosts, both of which are hosting other PHP websites and still getting issues. I have also validated it with W3Schools and another online PHP validator. Both didn't find any errors. Any help would be greatly appreciated.
Enable errors to see errors, this way:
ini_set('error_reporting', E_ALL);
ini_set('display_errors', true);
This code is a PHP error:
<?php
<href="index.php">Home</a>
?>
Try changing it to:
<?php
echo '<a href="index.php">Home</a>';
?>
This:
<?php
<href="index.php">Home</a>
?>
Is not valid PHP. This would, however, work:
<a href="index.php">Home</a>
Inside the PHP tags you can only use PHP, not HTML. Also, <href> is not an HTML tag.
See the question "How to get useful error messages in PHP?" to find out how to enable error messages in PHP.

Process PHP and print to string

I've got a PHP page (content.php), containing plain HTML and content delivered by PHP variables
<html>
<head>
</head>
<body>
<h1><?php echo $title; ?></h1>
</body>
</html>
Then there is another PHP page, where I need the contents of content.php in a String, but already processed by the PHP parser. I already tried the file_get_contents() function, but this gives the raw output (PHP not processed). What I'm looking for, is this:
$contents = somefunction('contents.php'), where $contents ends up holding:
<html>
<head>
</head>
<body>
<h1>Title</h1>
</body>
</html>
Any help much appreciated!
Try:
ob_start();
include ('contents.php');
// ob_get_clean() returns the buffer and also ends it, so no ob_end_clean() is needed
$html = ob_get_clean();
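The buffering approach above can be wrapped in a small helper. This is only a sketch: render_to_string() is a made-up name, and the template here is a temporary file standing in for contents.php.

```php
<?php
/**
 * Hypothetical helper: run a PHP template and return its output as a string.
 * $vars become local variables inside the template.
 */
function render_to_string(string $path, array $vars = []): string
{
    extract($vars, EXTR_SKIP);   // EXTR_SKIP keeps $path and $vars from being overwritten
    ob_start();
    try {
        include $path;
        return (string) ob_get_clean();
    } catch (Throwable $e) {
        ob_end_clean();          // discard partial output before rethrowing
        throw $e;
    }
}

// Stand-in for contents.php.
$template = tempnam(sys_get_temp_dir(), 'tpl');
file_put_contents($template, '<h1><?php echo $title; ?></h1>');

$contents = render_to_string($template, ['title' => 'Title']);
unlink($template);

echo $contents, "\n";
```

Passing the variables in explicitly avoids relying on whatever happens to be defined in the calling scope.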
Have you tried include? Alternatively, if you need to get it from "outside" (like if you loaded it in your browser) use file_get_contents with the full http://example.com/filename.php URL.
Either I am badly missing something here, or else, instead of using a function, why not just assign the contents directly to a string in your second page, like this:
$my_String = "<html>
<head>
</head>
<body>
<h1>". $title ."</h1>
</body>
</html>";

Is there a standards-compliant way to start a PHP session and echo a JavaScript script in one include() statement?

I have two scripts that I call with two PHP include() calls. The first starts a session / sets cookies, the second loads one of two JavaScript scripts. To keep things valid, I've been using the two calls but I'd like to just combine them into one.
Current setup (simplified):
<? include "session.php" ?>
<!DOCTYPE HTML>
<html>
<head>
<? include "scripts.php" ?>
...
What I'd like:
<? include "session_and_scripts.php" ?>
<!DOCTYPE HTML>
<html>
<head>
...
But it's invalid markup. Now if it really doesn't matter, I'd like to do it this way. If there are serious repercussions, then I'm thinking of just echoing a DOCTYPE in the included PHP file, which I'd rather not do.
So which is better: echo the DOCTYPE, use include() twice, or use include() once and have invalid markup?
EDIT - The whole script (session and javascript) should ideally be fully implementable with one line of code (e.g. the one include())
Use ob_start at first to avoid problems with session_start
<?php ob_start();?>
<!DOCTYPE HTML>
<html>
<head>
<?php include "session_and_scripts.php"; ?>
A way that uses only one file and no additional instructions:
<?php include "session_and_scripts.php" ?>
<!-- more head-stuff-->
</head>
<body>
<!--more content-->
session_and_scripts.php should do the following:
<?php
//do the session stuff
?>
<!DOCTYPE HTML>
<html>
<head>
<script type="text/javascript">
//some javascript
</script>
(But I wouldn't say it's a good approach)
But it's invalid markup. Now if it really doesn't matter, I'd like to do it this way. If there are serious repercussions, then I'm thinking of just echoing a DOCTYPE in the included PHP file, which I'd rather not do.
Assuming that you do not want to have a valid markup, there is no problem, the only restriction is that session_start is called before any kind of "echo"...
Assuming you want a valid markup using only one include and without echoing the DOCTYPE from the included file, you can save the script text into a php variable and echo it in the main page after the inclusion
//main page
<? include "session_and_scripts.php" ?>
<!DOCTYPE HTML>
<html>
<head>
<?php echo $script;?>
// session_and_scripts.php
<?php
session_start();
$script = '<blablabla>';

How to change HTML title in PHP without breaking the XHTML markup validation?

Here is the structure of the web site:
PHP index file
//my class for analyzing the PHP query
$parameter = new LoadParameters();
//what this does is it accesses the database
//and according to the query, figures out what should be
//loaded on the page
//some of the things it sets are:
// $parameter->design - PHP file which contains the design
// HTML code of the page
// $parameter->content - Different PHP file which should be loaded
// inside the design file
$parameter->mysqlGetInfo($_SERVER['QUERY_STRING']);
//load the design file
include($parameter->design);
PHP design file
Just the generic structure. Obviously it has a lot more design elements.
<html>
<head>
...
</head>
<body>
<?php
//this loads the content into the design page
include($parameter->content);
?>
</body>
</html>
Question
So here is the problem I experience. The $parameter->content file is a dynamic PHP file, meaning the content also changes according to the query.
For instance, if I have image pages with queries like ?img=1 and ?img=2, my LoadParameters class will only look at the img part of the query and will know that the content of the page should be image.php. image.php, however, will look at the query again and figure out exactly which image to load.
This causes issues for me because I want to have a different <title></title> for different images. So my solution was just to set the <title></title> element in the content page. This works but it breaks the XHTML markup validation at W3C because it makes the structure of the site to be the following:
<html>
<head>
...
</head>
<body>
...
<title>sometitle</title>
...
</body>
</html>
And having <title></title> within <body></body> is not allowed.
So how can I change the title without breaking the XHTML markup validation?
Note: I can't use javascript because then Search engines would not be able to see the title of the page. I need to do it directly in PHP.
Thanks in advance.
Why not do a second include to output the title in the proper place?
<html>
<head>
<?php
include($parameter->title);
?>
...
</head>
<body>
<?php
//this loads the content into the design page
include($parameter->content);
?>
</body>
</html>
Can't you just change the PHP code so that you can do something like:
<html>
<head>
<title><? print($parameter->title); ?></title>
</head>
<body>
<?php
//this loads the content into the design page
include($parameter->content);
?>
</body>
</html>
I'd move all of the <head> code into a 'common function' called something like html_head($title) and then have it put the title where it belongs.
Then simply call that function from within the pages and it's fixed.
Don't forget to include the <body> tag in that function, otherwise it won't work!
Elaborating ;)
<?php function html_head($title) { ?>
<html>
<head>
<title><?= $title ?></title>
<!-- Put whatever you want... here! -->
</head>
<body>
<?php } ?>
Then in $parameter->content, call html_head("Title")
It would be easier if $parameter->content could be included without displaying its HTML code, but instead have a $parameter->display (or similar) function that displays the HTML code. That way, you can include the PHP code at the beginning of the file and not worry about being unable to access the title.
<?php
require_once($parameter->content);
?>
<!DOCTYPE html>
<html lang="en" dir="ltr">
<head>
<meta charset="UTF-8" />
<title><?php echo $parameter->title; ?></title>
</head>
<body>
<?php
echo $parameter->display;
?>
</body>
</html>
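A rough sketch of that idea using output buffering, so the content file can still be written as ordinary HTML. The content file here is a temporary stand-in for image.php, and $parameter is a plain stdClass rather than the LoadParameters class from the question.

```php
<?php
// Stand-in for the question's $parameter object.
$parameter = new stdClass();

// Hypothetical content file: sets the title, then outputs its HTML.
$contentFile = tempnam(sys_get_temp_dir(), 'img');
file_put_contents(
    $contentFile,
    '<?php $parameter->title = "Image 1"; ?><p>Image goes here</p>'
);

// Run the content file first and hold back its output.
ob_start();
include $contentFile;
$parameter->display = ob_get_clean();
unlink($contentFile);

// The title is now known before any of the page is sent.
$page = "<html>\n<head>\n<title>" . htmlspecialchars($parameter->title)
      . "</title>\n</head>\n<body>\n" . $parameter->display . "\n</body>\n</html>\n";

echo $page;
```

Because nothing is echoed until the content file has finished, the title lands inside the head and the markup stays valid.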
This is how I solved the issue.
I changed the PHP design to something like:
//get the content PHP file
//inside the file I set the following variables
//which are used below:
//$parameter->title - the string which contains the title
//$parameter->html - the string which contains the HTML content
include($parameter->content);
//string which will contain the html code of the whole page
$html = <<<EndHere
<html>
<head>
<title>
EndHere;
//add title
$html .= $parameter->title;
$html .= <<<EndHere
</title>
</head>
<body>
EndHere;
//add the content of the page
$html .= $parameter->html;
$html .= <<<EndHere
</body>
</html>
EndHere;
//output the html
echo $html;
And here is the basic structure of the content PHP file. Since the only page that can possibly include the file is my design page, I can reference $parameter in it.
//set the title
$parameter->title = "sometitle";
//set the content HTML
$parameter->html = "some HTML here";
It's not a very clean solution but it works fine.
