I want to fetch the latest movie I checked on the IcheckMovies site and display it on my website. I don't know how. I've read about file_get_contents() and then getting an element, but the specific element I want is rather deep in the DOM structure. It's in a div in a div in a list in a ...
So, this is the link I want to get my content from: http://www.icheckmovies.com/profiles/robinwatchesmovies and I want to get the first title of the movie in the list.
Thanks so much in advance!
EDIT:
So using the file_get_contents() method
<?php
$html = file_get_contents('http://www.icheckmovies.com/profiles/robinwatchesmovies/');
echo $html;
?>
I got this HTML output. Now I just need to get 'Smashed', that is, the text of the link inside the h3 inside the first list item of the list. This is where I don't know how to get it.
...
<div class="span-7">
<h2>Checks</h2>
<ol class="itemList">
<li class="listItem listItemSmall listItemMovie movie">
<div class="listImage listImageCover">
<a class="dvdCoverSmall" title="View detailed information on Smashed (2012)" href="/movies/smashed/"></a>
<div class="coverImage" style="background: url(/var/covers/small/10/1097928.jpg);"></div>
</div>
<h3>
<a title="View detailed information on Smashed (2012)" href="/movies/smashed/">Smashed</a>
</h3>
<span class="info">6 days ago</span>
</li>
<li class="listItem listItemSmall listItemMovie movie">
<li class="listItem listItemSmall listItemMovie movie">
</ol>
<span>
</div>
...
There are some libraries which could help you!
One I've used for the same purpose, a long time ago, is this: http://simplehtmldom.sourceforge.net/
I hope it helps you!
Follow these steps to achieve this:
STEP1:-
First get the contents using file_get_contents in a php file
ex: getcontent.php
<?php
echo file_get_contents("http://www.icheckmovies.com/movies/checked/?user=robinwatchesmovies");
?>
STEP2:-
Call the above script with an AJAX request and add the response to a hidden element in the HTML.
ex:
$('#hidden_div').html(response);
html:-
<html>
<body>
<div id='hidden_div' style='visibility:hidden'>
</div>
</body>
</html>
STEP3:-
Now extract whatever element you want from the hidden div, for example by its id.
What you are asking for is called web scraping. I did this a few months back, and the process goes like this:
Make an HTTP request to the site from which you need the content (check the PHP class for it)
Use a DOM parsing library to handle the downloaded page (it will be HTML); Simple HTML DOM would be a good choice
Extract your required information
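The parse and extract steps above can be sketched as follows. This is a minimal sketch, not the answerer's code: it swaps in PHP's built-in DOMDocument and DOMXPath for Simple HTML DOM, uses a trimmed copy of the markup from the question instead of a live request, and the function name firstCheckedTitle is hypothetical. It assumes the page keeps the itemList structure shown in the question.

```php
<?php
// Trimmed sample of the markup from the question. In practice $html would
// come from file_get_contents() on the profile URL.
$html = <<<'HTML'
<div class="span-7">
<h2>Checks</h2>
<ol class="itemList">
<li class="listItem listItemSmall listItemMovie movie">
<h3>
<a title="View detailed information on Smashed (2012)" href="/movies/smashed/">Smashed</a>
</h3>
<span class="info">6 days ago</span>
</li>
</ol>
</div>
HTML;

// Hypothetical helper: returns the text of the first movie link, or null.
function firstCheckedTitle(string $html): ?string
{
    // Real-world HTML is rarely valid; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // First <li> of the list whose class contains "itemList", then its h3 link.
    $xpath = new DOMXPath($doc);
    $links = $xpath->query('//ol[contains(@class, "itemList")]/li[1]/h3/a');
    if ($links === false || $links->length === 0) {
        return null;
    }
    return trim($links->item(0)->textContent);
}

$title = firstCheckedTitle($html);
echo $title, PHP_EOL;
```

Returning null when nothing matches lets the calling page show a fallback if the site changes its markup.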
Here are some tutorials for you:
HTML Parsing and Screen Scraping with the Simple HTML DOM Library
Beginning web page scraping with php
SO Posts:
HTML Scraping in Php
And best of all, Google is your friend: just search for "PHP scraping".
I have referred to the solution given in this answer but failed to make it work. I have an index.php page which says:
<a href="page.php#anchor-name">
However, this link always directs to page.php and not to the #anchor-name div on that page.
Is this because I'm trying it on localhost? Why does this work with HTML but not with PHP? What's the solution to make this work with PHP?
The relevant section of the code:
<a class="metro-box blue-gradient normal" href="Services.php#Ecommerce">
<div class="metro-box-icon"><i class="icon-user"></i></div>
<div class="metro-box-title"><h5>Cloud Utilities</h5></div>
</a>
I want the hyperlink to land on this div on Services.php:
<div class="three-fourth last-in-row" id="#ECommerce">
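You are dealing with named anchors.
With these, you can link to a particular section of a page.
E.g.
Say I have the following div:
<div id="first">Bla Bla</div>
If this div is far below the header, you can link to it using a hash:
<a href="#first">Go to First</a>
When you click the link, the browser scrolls to the corresponding div.
Note that the id attribute itself must not contain the hash, and the fragment in the link must match the id exactly, including case: href="Services.php#Ecommerce" will not match id="#ECommerce".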
I'm trying to get the content of the div with id result_box in Google Translate using simple_html_dom, but it returns nothing. I tried to return another div's content and it worked perfectly, so it looks like the problem is just with that div.
result_box is the div in which the translation appears.
Here is my code:
$googleSearch="the contente i want to translate";
$googlePage="https://translate.google.com/#en/ar/$googleSearch";
$Chtml = file_get_html($googlePage);
$gt = $Chtml->find('#result_box ',0)->plaintext;
echo '<br>'.$gt.'<br>';
What could be the cause, and how can I solve it?
If this can't be done with simple html dom, are there any alternative ways to do it?
Note that I don't want to use the Google Translate API.
The result_box element does not have any plain text:
<div style="zoom:1" dir="ltr">
<div id="tts_button">
<span id="result_box" class="short_text" lang="en">
<span class="hps">spring</span>
</span>
</div>
I don't know what $Chtml->find('#result_box',0) is doing, but you probably need to iterate to the nested <span> and get the text from there.
EDIT:
Furthermore, I do not know whether the trailing blank after the id prevents simple_html_dom from finding anything:
$Chtml->find('#result_box ',0) vs. $Chtml->find('#result_box',0)
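To illustrate reading the text from the nested span, here is a minimal sketch. It uses PHP's built-in DOMDocument and DOMXPath rather than simple_html_dom, and it runs against a copy of the markup shown above. It only helps under the assumption that the HTML you fetched really contains this markup.

```php
<?php
// Copy of the markup quoted in the answer, used as a stand-in for the
// fetched page (assumption: the downloaded HTML contains these elements).
$html = '<div id="tts_button">'
      . '<span id="result_box" class="short_text" lang="en">' . "\n"
      . '<span class="hps">spring</span>' . "\n"
      . '</span></div>';

$previous = libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($doc);

// Select the element by id; note there is no trailing blank in the id.
$box = $xpath->query('//*[@id="result_box"]')->item(0);

// textContent concatenates the text of all descendants, so the nested
// span's text is included; trim() drops the surrounding line breaks.
$text = $box !== null ? trim($box->textContent) : '';

// Alternatively, walk the nested spans one by one.
$words = [];
if ($box !== null) {
    foreach ($xpath->query('.//span', $box) as $span) {
        $words[] = trim($span->textContent);
    }
}

echo $text, PHP_EOL;
```

If $box comes back null, the element is not in the HTML that was downloaded, and no selector change will help.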
<base href="http://www.w3schools.com/" target="_blank">
How can I apply the above base URL to one particular div section using PHP?
For example, if I have two div sections:
Links in this section should use the base URL above
<div>
help
</div>
Links in this section should use the website's own URL
<div>
help
</div>
I'm using "simple html dom" to fetch some content from another website and paste it into one div section, and that content has relative links. These should be converted to direct links, e.g. "/images/image.png" to "www.example.com/images/image.png".
<div class='normalUrl'>
Help
</div>
<div class='otherBaseUrl'>
Help
</div>
This is a hardcoded way just to show how the end result can be achieved. I'm sure you want to make this more generic to fit your goal.
There is no way to do this directly with HTML. You can do it with a server-side language like PHP, as follows.
abc
abc
or
You can also do it with JavaScript (client side).
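For the server-side route, one possible sketch is below. It uses PHP's built-in DOMDocument instead of simple html dom, and the function names toAbsolute and rebaseLinks as well as the sample href and src values are hypothetical. It only handles root-relative and simple document-relative links, not "../" segments.

```php
<?php
// Hypothetical helper: resolve one link against a base URL.
function toAbsolute(string $href, string $base): string
{
    // Leave absolute URLs, protocol-relative URLs and pure fragments alone.
    if ($href === '' || preg_match('~^([a-z][a-z0-9+.-]*:|//|#)~i', $href)) {
        return $href;
    }
    $parts  = parse_url($base);
    $origin = $parts['scheme'] . '://' . $parts['host'];
    if ($href[0] === '/') {
        return $origin . $href;                      // root-relative link
    }
    // Document-relative link: resolve against the directory of the base path.
    $path = $parts['path'] ?? '/';
    $dir  = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $href;
}

// Hypothetical helper: rewrite href/src only inside divs of the given class.
function rebaseLinks(string $html, string $class, string $base): string
{
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $query = sprintf('//div[@class="%s"]//*[@href or @src]', $class);
    foreach ($xpath->query($query) as $node) {
        foreach (['href', 'src'] as $attr) {
            if ($node->hasAttribute($attr)) {
                $node->setAttribute($attr, toAbsolute($node->getAttribute($attr), $base));
            }
        }
    }
    return $doc->saveHTML();
}

// The link targets below are made up for illustration.
$html = '<div class="normalUrl"><a href="/help.html">Help</a></div>'
      . '<div class="otherBaseUrl"><a href="/help.html">Help</a>'
      . '<img src="/images/image.png"></div>';

$out = rebaseLinks($html, 'otherBaseUrl', 'http://www.w3schools.com/');
echo $out;
```

Only the links inside the otherBaseUrl div are rewritten; the normalUrl div keeps its relative links, so they continue to resolve against the page's own address.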
I'm trying to get some attributes for HTML tags in a web page.
<html>
<head>
<title>test page</title>
</head>
<body>
<div id="header" class="clearit" role="banner">
<div id="headerWrapper">
<ul id="primaryNav" role="navigation">
<li id="musicNav" class="navItem">
Music
</li>
<li id="listenNav" class="navItem">
Radio
</li>
<li id="eventsNav" class="navItem">
Events
</li>
<li id="chartsNav" class="navItem">
Charts
</li>
<li id="communityNav" class="navItem">
Community
</li>
<li id="originalsNav" class="navItem">
Originals
</li>
</ul>
</div>
</div>
</body>
</html>
For example, I need the actual height and width of #headerWrapper to compare them with those of #musicNav in my PHP script. Since PHP is server-side, I can't get these attributes directly, so I'm thinking of appending JavaScript code that calculates them and stores them in a JSON file, like in this code:
<script type="text/javascript">
document.ready(function() {
var JSONObject= {
"tagname":"headerWrapper",
"height":$("#headerWrapper").height(),
"width":$("#headerWrapper").width()
},
{
"tagname":"musicNav",
"height":$("#musicNav").height(),
"width":$("#musicNav").width()
}
});
});
</script>
Then I'd like to read it in the php file that contains my algorithm to extract visual features from web pages. So I need to render the web page with appended Javascript using some browser. I'm using exec to send the new file to Firefox, like this:
exec('"C:\Program Files (x86)\Mozilla Firefox\firefox.exe" "http://localhost/Autoextractor/test.html" 2> errors.txt');
Firefox shows up in Task Manager but does not display, the page is not rendered, and my appended JavaScript code is not executed.
safe_mode is off and disabled_functions has been removed from php.ini, and when executing
exec("whoami");
the result is my user (note: my user is in the Administrators group). I also tried wscript, with no result.
Does anyone have any idea why it's not working, or has another solution to get the dimensions of HTML tags?
Simply running a browser won't allow you to read any data back from it, so forget about using system.
You can use Selenium Webdriver to control a browser with PHP, run JavaScript, then return the result.
When you write your real JavaScript, you will need to fix the syntax errors that appear in the example you included in the question.
Keep in mind that the size of elements on screen will depend on factors such as installed fonts, chosen font size, browser, window size, etc. You can get a result for a browser running on your system, but you can't depend on it to be a universal result.
"has another solution to get the dimensions of HTML tags?"
Is something wrong with Firebug/Inspect? It will give you the rendered offsets with a few simple operations.
Run your code in a console if you want to do it programmatically, though you'll still need Firebug/Inspect to find the right selectors (which really rules out doing any of this automatically). Trying to log it all... well, it sounds like you're trying to keep a database... perhaps you should set one up.
This may be a question that needs more context to get a useful response.
Hi, I was wondering what PHP code I can use that would allow my page to contain more than one image that changes. I would like the PHP code to output the HTML tag. Can anyone help me with this? Thanks!
You need to output all the images to the browser from PHP, then use JavaScript to change the pictures. PHP is for server-side scripts; the change you are talking about requires client-side processing, i.e. JavaScript. Search for "jQuery image changer" for some ideas.
The kind of mechanism you are referring to is often called a slideshow or a carousel, and it is done with javascript. Google "jQuery slideshow" or "jQuery carousel" and you'll find plenty of those.
If you need something simple, I suggest the tiny jQuery plugin Tiny Carousel.
Scroll to the How To section to learn how to set it up.
Then, to change the images with php, you will need to do something like this:
<div id="slider-code">
<a class="buttons prev" href="#">left</a>
<div class="viewport">
<ul class="overview">
<?php foreach(glob('slideshow/*.jpg') as $imagePath): ?>
<li><img src="<?php echo htmlspecialchars($imagePath); ?>" /></li>
<?php endforeach; ?>
</ul>
</div>
<a class="buttons next" href="#">right</a>
</div>