I am creating a website for an RSS reader, and I want to add a new feature to it.
For this, I need to test whether a webpage (URL) contains a video or not.
Can anyone tell me how to do this? I just want to check whether a given URL or webpage contains a video.
I believe the only way to do this is to scan the page's source code through PHP and look for likely embedded video markup. Those patterns can vary, though, so you may need to look for several of them. Some video sharing sites embed their players via iframes, and that will be harder to deal with.
There are plenty of examples on the web on how to parse a site's generated source code, but be careful of violating their terms of use.
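A minimal sketch of that approach, assuming cURL is available; the patterns below (the HTML5 video tag, old Flash embeds, and YouTube/Vimeo iframe URLs) are only examples and by no means exhaustive:

<?php
// Rough sketch: fetch a page and test for common video embed patterns.
// The patterns below are illustrative, not exhaustive.
function pageContainsVideo($url)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $html = curl_exec($ch);
    curl_close($ch);

    if ($html === false) {
        return false; // request failed
    }

    $patterns = array(
        '/<video[\s>]/i',                  // HTML5 video tag
        '/<embed[^>]+\.(?:swf|flv)/i',     // old Flash embeds
        '/<iframe[^>]+(?:youtube\.com|youtube-nocookie\.com|player\.vimeo\.com)/i',
    );

    foreach ($patterns as $pattern) {
        if (preg_match($pattern, $html)) {
            return true;
        }
    }
    return false;
}

var_dump(pageContainsVideo('http://example.com/')); // presumably bool(false)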
I'm building a web app where users can build custom web pages that pull content from other web pages. I know of a few options for doing this, but I'm not sure which is best, or whether there are better solutions out there. Right now, I could:
Use iframes, which will (sort of) accomplish what I want, but will force the client to download and render all the web content, which seems slow. I've also heard a lot of people say iframes are passé and should not be used.
Use a library like wkhtmltopdf, which will render the HTML on the server side and generate a PDF image of it. This would work nicely, but the result is just an image, so text won't be selectable, links won't be clickable, etc. Also, I've heard that you can get into legal trouble for hosting other people's web content on your site without permission.
Use something like phpquery to literally scrape content off of other sites. This option could have the same legal issues as the above option.
Has anyone done anything like this, or does anyone have any thoughts?
The cleanest solution would be to send off an HTTP request server side, then render the HTML into your page as you require. This will also require changing all the URLs of content and links to be absolute,
e.g.:
<img src="/images/banner.png">
will work on the remote server, but once inside your page, the image will not exist. The most workable solution would be to limit the functionality to images and links, then do a find/replace with a regex to match relative URLs and prepend the source address to them; a sketch follows below.
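A rough sketch of that find/replace approach; the base URL is a placeholder and the pattern is deliberately naive (it skips mailto: links and fragments, but other schemes would still slip through):

<?php
// Rough sketch: fetch a remote page and rewrite relative src/href
// attributes to absolute URLs. The base URL below is a placeholder.
$base = 'http://example.com';
$html = file_get_contents($base . '/');

// Prepend the base URL to src/href values that don't already start
// with a scheme, a protocol-relative "//", "mailto:", or a fragment.
$html = preg_replace(
    '~(src|href)=(["\'])(?!https?:|//|mailto:|#)/?~i',
    '$1=$2' . $base . '/',
    $html
);

echo $html;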
You will, however, run into legal issues if you are re-serving other people's content from your server, even just HTML.
Using an iframe would be the quick-and-dirty solution and probably has the fewest legal ramifications, as the browser sends a normal request to the site for the content.
I'd recommend DocRaptor for generating PDF files from HTML. It works in a similar fashion to wkhtmltopdf, but produces fully functional PDF files.
Here's a link to its homepage:
http://docraptor.com/
And a link to its API documentation:
http://docraptor.com/documentation
We download images to our computers when we open webpages. For example: if a webpage has an image (image.jpg), our computer downloads it while we are browsing that page.
Some webpages use AJAX methods. For example: you don't see an image in the page's source code, yet your computer downloads one, because if you click a link on that page, AJAX will show that image...
Let me show an example:
<div id="ajax_will_load_image_here"></div>
Okay, how can PHP cURL see (or download) that image? cURL can't see the image when I try to use the preg_match function, even though the image actually is there. I want to download that image using PHP cURL. Any advice?
If I understand the question correctly, there is no convenient way of doing that.
Your crawler/spider would have to parse the website and evaluate the JavaScript.
There are libraries for that, but support is very limited.
There are, however, methods where an actual browser is used to evaluate the page (without displaying it, but with proper environment variables set, like resolution etc.).
The generated source, including JavaScript DOM modifications, is then available.
This is, for example, how the Google search previews are generated.
But if you require user interaction, it gets pretty specific and complicated.
I am sorry to disappoint you, but using cURL and preg_match the old-school way, as we did when JavaScript was not yet so common, won't work.
However, for most legitimate use cases that old approach is more than sufficient, since websites today are more and more designed to work without JavaScript, especially where content for crawling is concerned. That is a must in search engine optimization, and which website doesn't want that?
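One way to put the "actual browser" idea above into practice, as a rough sketch: drive a headless browser such as PhantomJS from PHP. This assumes the phantomjs binary is installed; the temp file path, target URL, and image regex are all hypothetical:

<?php
// Rough sketch: use PhantomJS (a headless browser, must be installed)
// to render the page, then hand the JavaScript-modified DOM to PHP.
$script = <<<'JS'
var page = require('webpage').create();
var url  = require('system').args[1];
page.open(url, function (status) {
    console.log(status === 'success' ? page.content : '');
    phantom.exit();
});
JS;
file_put_contents('/tmp/dump_dom.js', $script);

$url  = 'http://example.com/ajax-page'; // hypothetical target
$html = shell_exec('phantomjs /tmp/dump_dom.js ' . escapeshellarg($url));

// The AJAX-inserted images are now in the markup, so preg_match works:
preg_match_all('~<img[^>]+src=["\']([^"\']+)["\']~i', $html, $matches);
print_r($matches[1]);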
Well, what I'm actually trying to do is figure out how BEEMP3.COM works.
Because of the site's speed, I doubt they scrape other sites/sources on the spot.
They probably use some sort of database (PostgreSQL or MySQL) to store the "results" and then just query the search terms.
My question is: how do you guys think they crawl/spider or otherwise actually get the MP3 files/content?
They must have some algorithm to spider the internet, OR they use Google's "index of" mp3 trick to find hosts with raw MP3 files.
Any comments, tips, or ideas are appreciated :)
QueryPath is a great tool for building a web spider.
I'm guessing they find MP3s using a combination approach: they have a list of "seed sites" (gathered from Google, Usenet, or inserted manually) that they use as starting points for the search, and then they set spiders running against them.
You need to write a script that will (see the sketch after this list):
Take a webpage as a starting point
Fetch the webpage data (use cURL)
Use a regular expression to extract (a) any links to other pages and (b) any links to MP3 files
Place any MP3 links into a database
Add the list of links to other webpages to a queue for processing through the above method
You'll also need to re-check your MP3 links regularly to prune any dead ones.
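A rough sketch of those steps in PHP. The seed URL, database credentials, and table/column names are hypothetical, and there is no politeness (robots.txt, rate limiting) or real deduplication here:

<?php
// Rough sketch of the crawl loop described above.
$queue = array('http://example.com/seed-page.html'); // hypothetical seed site
$seen  = array();
$pdo   = new PDO('mysql:host=localhost;dbname=spider', 'user', 'pass');

while ($url = array_shift($queue)) {
    if (isset($seen[$url])) {
        continue; // already visited
    }
    $seen[$url] = true;

    // Fetch the webpage data with cURL
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
    $html = curl_exec($ch);
    curl_close($ch);
    if ($html === false) {
        continue;
    }

    // Extract every absolute link on the page
    preg_match_all('~href=["\'](https?://[^"\']+)["\']~i', $html, $m);

    foreach ($m[1] as $link) {
        if (preg_match('~\.mp3($|\?)~i', $link)) {
            // Place MP3 links into the database
            $stmt = $pdo->prepare('INSERT INTO mp3_links (url) VALUES (?)');
            $stmt->execute(array($link));
        } else {
            // Queue other pages for processing through the same method
            $queue[] = $link;
        }
    }
}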
Alternatively, you can crawl existing MP3 spiders like beemp3.com, extract all the direct download links, and save them to your database. You only need two things:
I. Simple HTML DOM (the PHP parser library).
II. An application that takes the extracted links into your database.
Check what I did at http://kenyaforums.com/bongomp3_external_link_search_engine_at_kenyaforums_com.php
Feel free to keep asking if anything is unclear.
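A minimal sketch of that idea, assuming the Simple HTML DOM library file (simple_html_dom.php) is in your include path; the target URL is a placeholder:

<?php
// Rough sketch: pull direct .mp3 links off a page with Simple HTML DOM.
require_once 'simple_html_dom.php';

$page = file_get_html('http://example.com/search?q=some+song'); // placeholder

foreach ($page->find('a') as $anchor) {
    if (preg_match('~\.mp3($|\?)~i', $anchor->href)) {
        // Hand the link to whatever writes into your database
        echo $anchor->href, "\n";
    }
}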
I am wondering if there is a script similar to the Facebook status update thing.
What I mean is: when, for example, I paste a YouTube/other video site/image link, it automatically detects the contents of the page and associates an embed code with it (if it's a video).
So I'm wondering if there is a ready-made script with a large database of websites that can associate video site URLs with embed codes.
I could actually do something like that myself, but the problem is that I want to support a lot of websites, like Facebook does.
Please help me find a solution.
Thanks.
Take a look at the Embedly API: http://api.embed.ly/
It gets the embed code for a lot of the popular video sites out there, and also for some image sites. I highly recommend it;
you can try it out on their site.
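As a rough sketch, the API speaks oEmbed: you pass it a URL plus your API key and get JSON back, including a ready-made embed code in the html field. The key and the target URL below are placeholders:

<?php
// Rough sketch: ask Embedly's oEmbed endpoint for a URL's embed code.
$key = 'YOUR_EMBEDLY_KEY'; // placeholder
$url = 'http://www.youtube.com/watch?v=dQw4w9WgXcQ';

$endpoint = 'http://api.embed.ly/1/oembed?' . http_build_query(array(
    'key' => $key,
    'url' => $url,
));

$response = json_decode(file_get_contents($endpoint), true);

if (isset($response['html'])) {
    echo $response['html']; // ready-to-use embed markup
}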
You really wouldn't have to scan that large a database. For videos, you could keep track of maybe the top three or so uploading sources (YouTube, Vimeo, Metacafe...) and their embed codes.
As for images and links, those are pretty easy to detect and don't require any special embed code. By pretty easy, I mean very simple: just use a simple regular expression to search for a link in the post.
If it's a picture, you can easily tell by looking at the file extension of the link (jpg, png, gif, etc.). If so, do whatever is proper to embed any old image. If it's just an ordinary old link (it doesn't match any of your video sites and doesn't end in an image file extension), just use the link itself.
The only marginally tricky part would be getting the unique embed codes for the video sites, but perhaps some external library/API could do that small part for you (another answerer has provided a proper API/pre-built library for this). Images and links, however, are mostly pretty simple; a sketch follows below.
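A rough sketch of that classification logic. The YouTube iframe markup is just an illustration of keeping one template per supported site, and the example URL is a placeholder:

<?php
// Rough sketch: classify a pasted link as image, YouTube video, or plain link.
function embedFor($url)
{
    // Image? Check the file extension.
    if (preg_match('~\.(jpe?g|png|gif)($|\?)~i', $url)) {
        return '<img src="' . htmlspecialchars($url) . '" alt="">';
    }

    // Known video site? Keep one embed template per supported site.
    if (preg_match('~youtube\.com/watch\?v=([\w-]+)~', $url, $m)) {
        return '<iframe width="480" height="360" src="http://www.youtube.com/embed/'
             . $m[1] . '" frameborder="0" allowfullscreen></iframe>';
    }

    // Ordinary link: just link it.
    return '<a href="' . htmlspecialchars($url) . '">' . htmlspecialchars($url) . '</a>';
}

echo embedFor('http://www.youtube.com/watch?v=dQw4w9WgXcQ');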
EDIT It seems I misread your problem, and that you are only looking for pre-built libraries with video embed codes. In that case, the other answer is exactly what you want.
I am developing an online video streaming website in PHP.
I need two pieces of functionality:
Add a title/text at the bottom of the video dynamically.
Add background music to the video dynamically.
Is this possible with PHP or any available open source library?
Can anyone guide me or provide links to this type of library?
Thanks.
Editing video with PHP is an extremely bad idea. The idea very closely approximates impossible: at best you would need to decode the video, which would be brutally slow in PHP.
If I had to tackle this problem, I would try to add the title and background music in the player, not to the video file itself. If you're streaming the video, it is likely that you're using Flash or some other client-side player. You would need to write the player (or perhaps modify an existing one; there are several available) to add another layer over top of the movie for the title, plus an audio track.
Slightly more hare-brained, but still easier than rewriting video in PHP, would be to layer a transparent image generated in PHP over top of the player using CSS and JavaScript, and to embed the audio in the page. This paragraph contains a terrible idea.
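For what it's worth, the transparent overlay image itself is easy to produce with GD; a minimal sketch (the script name and query parameter are hypothetical, and the terrible-idea caveat above still applies):

<?php
// Rough sketch: a script (e.g. title.php) that emits a transparent PNG
// containing the video title, to be positioned over the player with CSS.
$width  = 480;
$height = 40;
$title  = isset($_GET['title']) ? $_GET['title'] : 'Untitled';

$img = imagecreatetruecolor($width, $height);
imagealphablending($img, false);
imagesavealpha($img, true);
$transparent = imagecolorallocatealpha($img, 0, 0, 0, 127);
imagefill($img, 0, 0, $transparent); // fully transparent background
imagealphablending($img, true);      // draw the text normally

$white = imagecolorallocate($img, 255, 255, 255);
imagestring($img, 5, 10, 12, $title, $white);

header('Content-Type: image/png');
imagepng($img);
imagedestroy($img);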
PHP itself doesn't provide anything video-oriented, but there is a CMS, namely Kaltura, that is designed and implemented for doing these kinds of things. Search for it, download it, and play with it.
You can make your own player in Flash and add a default watermark. You don't need PHP for that.