I'm looking for a way to get a video URL from a video website.
Here is a sample video: XXXX
And here is the dynamic link to the video: XXX
Because the link is dynamic, I can't find the URL, and I have no clue how the URLs get generated in PHP. I searched for PHP functions but had no luck.
You would have to parse (and probably run) the actual SWF file that gets embedded and is responsible for playing the video, and see what requests it makes to get that actual content URL.
Doing this in pure PHP is a highly complex enterprise (if possible at all) and not for the faint of heart. I'm not aware of a ready-made solution that does this.
Though this doesn't answer the technical side of your question, if you're parsing videos from someone else's website, you might be best off contacting the site/content owners and asking how you can properly link to their content.
You could use this link to get the plain player SWF:
http://www.collegehumor.com/moogaloop/moogaloop.swf?clip_id=1946215&autostart=true
where clip_id holds the same ID as seen in the URL. I guess this is the best compromise, as I doubt CollegeHumor wants you to get the plain movie file, which may also lead to copyright issues.
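A minimal sketch of building that player link in PHP. The function names and the page URL shape (`/video/<id>/...`) are assumptions for illustration; only the moogaloop.swf URL and its `clip_id` and `autostart` parameters come from the link above.

```php
<?php
// Build the player SWF URL for a given clip id.
function moogaloopUrl(int $clipId, bool $autostart = true): string
{
    $query = http_build_query([
        'clip_id'   => $clipId,
        'autostart' => $autostart ? 'true' : 'false',
    ]);
    return 'http://www.collegehumor.com/moogaloop/moogaloop.swf?' . $query;
}

// Take the first run of 5+ digits in the URL path and treat it as the clip id.
// ASSUMPTION: the id appears as a number in the path of the video page URL.
function clipIdFromUrl(string $pageUrl): ?int
{
    $path = (string) parse_url($pageUrl, PHP_URL_PATH);
    return preg_match('~(\d{5,})~', $path, $m) ? (int) $m[1] : null;
}
```

Calling `moogaloopUrl(clipIdFromUrl($pageUrl))` then gives you something you can embed, once you have checked that `clipIdFromUrl` did not return null.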
This is my first scraper https://scraperwiki.com/scrapers/my_first_scraper_1/
I managed to scrape google.com but not this page.
http://subeta.net/pet_extra.php?act=read&petid=1014561
Any reason why?
I have followed the documentation from here.
https://scraperwiki.com/docs/php/php_intro_tutorial/
And there is no reason why the code should not work.
It looks like your scraper searches for one specific element. Elements differ depending on the site you are scraping, so if the page doesn't contain the element you are looking for, you get nothing back. I would also look into writing your own scraping/spidering tool with curl. Not only will you learn a lot, you will also find out a lot about how to scrape sites.
As a side note, you might want to abide by the robots.txt file of the website you are scraping, or ask permission before scraping; scraping without either is considered impolite.
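To illustrate the point about elements: here is a rough sketch using PHP's built-in DOMDocument and DOMXPath rather than the ScraperWiki helpers. The HTML snippet and the `pet-name` class are made up for the example. The same function returns a match on one selector and an empty array on another, which is what "no return" looks like in practice.

```php
<?php
// Return the trimmed text of every node matching an XPath query,
// or an empty array when nothing matches.
function extractText(string $html, string $xpathQuery): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so keep parser warnings quiet.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpathQuery);
    if ($nodes === false) {
        return []; // the XPath expression itself was malformed
    }

    $texts = [];
    foreach ($nodes as $node) {
        $texts[] = trim($node->textContent);
    }
    return $texts;
}

// Hypothetical page fragment, not the real markup of any site.
$html = '<html><body><div class="pet-name">Fluffy</div></body></html>';

$hit  = extractText($html, '//div[@class="pet-name"]'); // selector exists on this page
$miss = extractText($html, '//div[@class="missing"]');  // selector does not exist
```

If `$miss` is what you get on the real page, open the page source and check which elements are actually there before changing anything else.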
So, I would like to download YouTube videos using a PHP script. I have googled a lot and found several solutions, but one of them used the http://youtube.com/get_video?data URL, which has not worked for a long time. I have found a Greasemonkey script that works fine, but I don't have a clue how it could work with PHP.
I have read that I have to do something with the info returned by, for example, this link:
http://www.youtube.com/get_video_info?video_id=g1SADcP5g1o
The question is what would be the best approach for this?
I would try to get some curl requests going on any of these resources and try to automate it that way.
I have it written in C++, not PHP. It's not trivial, but it's not very complicated either. The get_video_info output is URL-encoded. Decode it and look for the stream_map set of streams. You'll notice a pattern in it; that's your starting point. It contains the resolutions and download locations, plus extras.
I wouldn't paste the PHP code here even if I had it :) They tend to change it...
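With that caveat in mind (the format can change at any time), the decoding step might look roughly like this in PHP. This is only a sketch: it assumes the decoded response has a key whose name contains `stream_map`, that its value is a comma-separated list, and that each entry is itself a query string. The sample body is fabricated to show the shape and is not real YouTube output.

```php
<?php
// Decode a get_video_info-style body and return one array per stream.
// ASSUMPTION: some key containing "stream_map" holds a comma-separated
// list of URL-encoded query strings.
function parseStreamMap(string $body): array
{
    parse_str($body, $info); // first level of URL decoding

    foreach ($info as $key => $value) {
        if (!is_string($value) || strpos((string) $key, 'stream_map') === false) {
            continue;
        }
        $streams = [];
        foreach (explode(',', $value) as $entry) {
            parse_str($entry, $stream); // second level: one stream description
            $streams[] = $stream;
        }
        return $streams;
    }
    return []; // no stream map found
}

// Fabricated sample input with two streams.
$inner = 'itag=22&quality=hd720&url=' . rawurlencode('http://example.com/v.mp4?id=1')
       . ',itag=18&quality=medium&url=' . rawurlencode('http://example.com/v2.mp4?id=1');
$body = 'status=ok&some_stream_map=' . rawurlencode($inner);

$streams = parseStreamMap($body);
```

Fetching the body itself is a plain curl or file_get_contents request against the get_video_info URL from the question.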
I'm fairly new to PHP because I'm an Android programmer, but I need to convert a PDF file to HTML. I don't want to use any external APIs because they are way too pricey. Instead, I would like to use http://www.convertpdftohtml.com to convert my PDF to HTML. However, that site does not have an API and only works manually. According to Tomer W., it is possible to simulate a POST action for the website and do the conversion automatically: https://stackoverflow.com/questions/9592926/online-pdf-to-html-conversion-api
Now I'm wondering how I would be able to do this. I don't have a lot of knowledge about PHP, but I know people who might help me get it working if I have some kind of pseudocode.
This may soon be unnecessary. Mozilla have incorporated pdf.js into Firefox 19, so the browser can render PDFs itself without a plugin.
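If you still need the conversion on the server, simulating the form POST could look roughly like the sketch below. The endpoint and the form field name are placeholders: you would have to read them from the upload form's HTML on the site (and check that the site permits automated use). It relies on PHP's http stream wrapper, so allow_url_fopen must be enabled.

```php
<?php
// Build a multipart/form-data body containing a single PDF file field.
function buildMultipartBody(string $field, string $fileName, string $contents, string $boundary): string
{
    return "--{$boundary}\r\n"
         . "Content-Disposition: form-data; name=\"{$field}\"; filename=\"{$fileName}\"\r\n"
         . "Content-Type: application/pdf\r\n\r\n"
         . $contents . "\r\n"
         . "--{$boundary}--\r\n";
}

// POST the file to $endpoint and return the response body, or false on failure.
// $endpoint and $field are hypothetical values taken from the site's upload form.
function postPdf(string $endpoint, string $field, string $pdfPath)
{
    $contents = file_get_contents($pdfPath);
    if ($contents === false) {
        return false; // could not read the local file
    }

    $boundary = 'pdfupload' . bin2hex(random_bytes(8));
    $context  = stream_context_create(['http' => [
        'method'  => 'POST',
        'header'  => "Content-Type: multipart/form-data; boundary={$boundary}\r\n",
        'content' => buildMultipartBody($field, basename($pdfPath), $contents, $boundary),
    ]]);

    return file_get_contents($endpoint, false, $context);
}
```

The response is whatever the site sends back after a manual upload, so you will probably have to parse it for the link to the converted file.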
I'm allowing users to embed content from YouTube, Vimeo, Scribd, Flickr, SlideShare, etc., so I'm letting them paste the embed code into a textbox.
I'm having a hard time figuring out how to:
(a) validate that it is indeed correctly formed embed code, and
(b) make sure it is not malicious code that the user is trying to get my system to display.
This is a PHP website.
I've used HTML Purifier in the past. There are some others, but this one worked best for me. You can whitelist all allowed code constructs and make the HTML standards-compliant. It's a good first line of defense against XSS attacks.
The library is quite big and can slow down your code if you don't install it correctly, so read the install docs carefully.
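For the narrower case of iframe-style embeds, the whitelist idea can also be sketched without the library. To be clear, this is not HTML Purifier and not a replacement for it, just an illustration of the principle using PHP's DOM extension. The function name and host list are made up. Instead of echoing what the user pasted, it extracts the iframe src, checks the host, and lets you render your own iframe tag from the result.

```php
<?php
// Return the iframe src if $embedCode contains exactly one iframe, no script
// tags, and an http(s) src on an allowed host; otherwise return null.
function allowedEmbedSrc(string $embedCode, array $allowedHosts): ?string
{
    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<div>' . $embedCode . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath   = new DOMXPath($doc);
    $iframes = $xpath->query('//iframe');
    if ($iframes->length !== 1 || $xpath->query('//script')->length > 0) {
        return null;
    }

    $src    = $iframes->item(0)->getAttribute('src');
    $scheme = parse_url($src, PHP_URL_SCHEME);
    $host   = parse_url($src, PHP_URL_HOST);
    if (!in_array($scheme, ['http', 'https'], true) || !is_string($host)) {
        return null; // rejects javascript:, data:, relative and malformed URLs
    }

    return in_array(strtolower($host), $allowedHosts, true) ? $src : null;
}

// Hypothetical whitelist; extend it with the hosts you actually support.
$hosts = ['www.youtube.com', 'player.vimeo.com'];
$ok    = allowedEmbedSrc('<iframe src="https://www.youtube.com/embed/g1SADcP5g1o"></iframe>', $hosts);
$bad   = allowedEmbedSrc('<iframe src="https://evil.example/x"></iframe>', $hosts);
```

When you output the result, escape it with htmlspecialchars() inside an iframe tag you write yourself, so nothing else from the pasted markup reaches the page.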
We will be implementing a system where we ask the user to specify the direct URL and we go and subsequently fetch appropriate data from that page.
I have a list of many web page URLs, and all of them contain videos. Some of these videos are embedded via simple HTML tags, and I can extract those tags with regular expressions.
The problem is that the majority of the pages use JavaScript to embed these elements, and since they come from different websites, they don't follow any specific pattern.
The only thing I can do now is to make my "PHP execute the JavaScript", and I'm stuck on this task.
I want this extraction to be done by a PHP script. I've tried jParser and jTokenizer, but I can't get them to work in this case.
Any help would be appreciated.
Thanks.
The only thing I can do now is to make my "PHP execute the JavaScript", and I'm stuck on this task.
Oh goodness, don't do that. The effort you'll spend (and the security problems you'll open up) will be a far bigger time sink than writing specific identifier code for each site that, I'm blindly assuming, you're scraping.
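A rough idea of what such per-site identifier code could look like: one regular expression per site, run against the raw page source. Because it matches text, it also finds IDs that sit inside inline JavaScript, without executing anything. The two patterns below are illustrative guesses at common URL shapes, not a complete or authoritative list.

```php
<?php
// One pattern per supported site; capture group 1 is the video id.
// ASSUMPTION: these URL shapes are examples only, add one entry per site you scrape.
const VIDEO_ID_PATTERNS = [
    'youtube.com' => '~youtube\.com/(?:embed/|watch\?v=)([\w-]{11})~',
    'vimeo.com'   => '~vimeo\.com/(?:video/)?(\d+)~',
];

// Return the unique video ids found in the page source, grouped by site.
function findVideoIds(string $pageSource): array
{
    $found = [];
    foreach (VIDEO_ID_PATTERNS as $site => $pattern) {
        if (preg_match_all($pattern, $pageSource, $m)) {
            $found[$site] = array_values(array_unique($m[1]));
        }
    }
    return $found;
}

// Made-up page source: one id inside a script, one in a plain link.
$source = '<script>var player = "http://www.youtube.com/embed/g1SADcP5g1o";</script>'
        . '<a href="http://vimeo.com/12345">clip</a>';
$ids = findVideoIds($source);
```

Sites that build the URL at runtime from several variables will still need their own handling, but that is one small function per site rather than a JavaScript engine.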