Download a Facebook Video (.mp4) via PHP - php

So for example I have this URL:
http://video.ak.fbcdn.net/hvideo-ak-prn2/v/1032822_578813298845318_1606611618_n.mp4?oh=c3c6a02985213f7c47386f4653792ca6&oe=5200506F&__gda__=1375798216_02752679a44bc4b3c514bee21e000959
How can I download the video source file via PHP?
Note that downloading the URL will not give me the video source!
// does not work:
file_put_contents('video.mp4', 'http://video.ak.fbcdn.net/hvideo-ak-prn2/v/1032822_578813298845318_1606611618_n.mp4?oh=c3c6a02985213f7c47386f4653792ca6&oe=5200506F&__gda__=1375798216_02752679a44bc4b3c514bee21e000959');
// this does not download the video source but instead gets me a file that links to the video hosted on Facebook.

file_put_contents('derp.mp4', file_get_contents('http://video.ak.fbcdn.net/hvideo-ak-prn2/v/1032822_578813298845318_1606611618_n.mp4?oh=c3c6a02985213f7c47386f4653792ca6&oe=5200506F&__gda__=1375798216_02752679a44bc4b3c514bee21e000959'));
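The first call in the question stores the URL text itself in video.mp4, because file_put_contents() writes whatever string it is given. Reading the URL with file_get_contents() fixes that, but it holds the whole video in memory. A minimal sketch of a streaming variant follows; downloadToFile() and its parameter names are made up for illustration, and for http:// sources it assumes allow_url_fopen is enabled.

```php
<?php
// Copy a source stream (URL or local path) to a file in chunks,
// so the whole video never has to fit in memory at once.
// downloadToFile() is a hypothetical helper, not part of the original answer.
function downloadToFile(string $source, string $destination): int
{
    $in = fopen($source, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open source: $source");
    }

    $out = fopen($destination, 'wb');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open destination: $destination");
    }

    // stream_copy_to_stream() returns the number of bytes copied, or false on failure.
    $bytes = stream_copy_to_stream($in, $out);
    fclose($in);
    fclose($out);

    if ($bytes === false) {
        throw new RuntimeException('Copy failed');
    }
    return $bytes;
}
```

Usage would look like downloadToFile($videoUrl, 'video.mp4'), where $videoUrl is the direct .mp4 URL.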

This code is smarter: you only have to provide the video link. To get the link, right-click the video and choose Show Video link, or copy it directly from the browser's URL bar.
Then paste that URL in place of __PASTE_FACEBBOOK_VIDEO_LINK_HERE__ in the code below:
<?php
$options = array('http' => array('user_agent' => 'custom user agent string'));
$context = stream_context_create($options);
$response = file_get_contents('__PASTE_FACEBBOOK_VIDEO_LINK_HERE__', false, $context);
preg_match_all('#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#', strip_tags($response), $match);
$searchword = 'video';
$matches = array_filter($match[0], function($var) use ($searchword) { return preg_match("/\b$searchword\b/i", $var); });
$filename = rand().".mp4";
file_put_contents($filename, fopen(reset($matches), 'r'));
The resulting file gets a random numeric name, e.g. 24424353.mp4
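The extraction step above (strip the markup, collect every URL, keep the ones containing the word "video") can be isolated into a small function that is easy to try on a saved page. This is only a sketch under assumptions: firstVideoUrl() is a hypothetical name, and the URL pattern is a simplified one rather than the exact regex from the answer.

```php
<?php
// Return the first URL in the page text that contains the search word
// as a whole word, or null when there is none.
// Note: strip_tags() removes tags together with their attributes, so URLs
// that only appear inside attributes such as href are not considered.
function firstVideoUrl(string $html, string $searchword = 'video'): ?string
{
    // Simplified URL pattern: scheme, then everything up to whitespace, quotes or angle brackets.
    preg_match_all('#https?://[^\s"\'<>]+#', strip_tags($html), $found);

    $wordPattern = '/\b' . preg_quote($searchword, '/') . '\b/i';
    foreach ($found[0] as $url) {
        if (preg_match($wordPattern, $url)) {
            return $url;
        }
    }
    return null;
}
```

Returning null when nothing matches lets the caller stop early instead of calling fopen() on an empty value.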

There is a simple way of doing this. Create one function that extracts the HD link, one that extracts the SD link, and one that fetches the page with cURL:
function hdLink($curl_content)
{
$regex = '/hd_src:"([^"]+)"/';
if (preg_match($regex, $curl_content, $match)) {
return $match[1];
} else {
return;
}
}
function sdLink($curl_content)
{
$regex = '/sd_src_no_ratelimit:"([^"]+)"/';
if (preg_match($regex, $curl_content, $match1)) {
return $match1[1];
} else {
return;
}
}
function url_get_contents($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.10240');
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
$data = curl_exec($ch);
curl_close($ch);
return $data;
}
Then, in the page that handles your HTML form, pass the submitted Facebook video URL to the url_get_contents() function:
<?php
require_once("functions.php");
if (!empty($_POST["url"]) ) {
$data = url_get_contents($_POST["url"]);
$hdlink = hdLink($data);
$sdlink = sdLink($data);
if (!empty($sdlink) && !empty($hdlink) ) {?>
<a target="_blank" download data-href="<?php echo $hdlink; ?>" href="<?php echo $hdlink; ?>" class="btn btn-block btn-lg btn-success">Download Video</a>
<?php }
}
?>
Reference: How to develop your own Facebook Video Downloader in 3 Steps on answerbox.net
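The template above only prints a link when both the HD and the SD source are found. A minimal sketch of a fallback, assuming the page still embeds the sources as hd_src:"..." and sd_src_no_ratelimit:"..." as the regexes in this answer expect; extractSource() and bestVideoLink() are hypothetical helper names.

```php
<?php
// Find key:"value" in the page source and return the value, or null if absent.
function extractSource(string $content, string $key): ?string
{
    $regex = '/' . preg_quote($key, '/') . ':"([^"]+)"/';
    return preg_match($regex, $content, $match) ? $match[1] : null;
}

// Prefer the HD source and fall back to the SD source when HD is missing.
function bestVideoLink(string $content): ?string
{
    return extractSource($content, 'hd_src')
        ?? extractSource($content, 'sd_src_no_ratelimit');
}
```

When echoing the result into an href attribute, wrapping it in htmlspecialchars() keeps quotes or angle brackets in the scraped value from breaking the markup.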

Related

how to get video id from tiktok video url

I am using two functions to get the video URL for playback.
1. The first extracts the TikTok video with the watermark:
public function getDetails()
{
$url = $this->url;
$resp = $this->getContent($url);
$check = explode("\"contentUrl\":\"", $resp);
if (count($check) > 1) {
$video = explode("\"", $check[1])[0];
$videoWithoutWaterMark = $this->WithoutWatermark($url);
$thumb = explode("\"", explode("\"thumbnailUrl\":[\"", $resp)[1])[0];
$username = explode("/", explode("@", explode("\"", explode("\"url\":\"", $resp)[1])[0])[1])[0];
$result = [
'video'=>$video,
'withoutWaterMark'=>$videoWithoutWaterMark,
'user'=>$username,
'thumb'=>$thumb,
'error'=>false,
'message'=>false
];
}
else
{
$result = [
'video'=>false,
'withoutWaterMark'=>false,
'user'=>false,
'thumb'=>false,
'error'=>true,
'message'=>"Please double check your url and try again."
];
}
return $result;
}
private function cUrl($url)
{
$user_agent = 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_USERAGENT, $user_agent);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$result = curl_exec($ch);
curl_close($ch);
return $result;
}
2. The other function, which builds the video URL without the watermark, is:
private function WithoutWatermark($url)
{
// video id, for example 6795008547961752326
$dd = explode("video/",$url);
$url = "https://api2.musical.ly/aweme/v1/playwm/?video_id=".$dd[1];
return $url;
}
Please help me find the TikTok video id, or any other way to create a download link for the video without the watermark. How can I find the video id, so that I can use it to build a download link like https://api2.musical.ly/aweme/v1/playwm/?video_id=v09044b90000bpfdj5q91d8vtcnie6o0 ?
Your WithoutWatermark function doesn't work.
If you have a URL like tiktok.com/@user/video/123456, you can fetch it with cURL:
$data = cUrl($url);
You'll get a page from TikTok, from which you can extract the video URL with a regex:
https://v16.muscdn.com/123etc
Then request that URL with cURL as well. The response is binary data, and with a regex you can find something like vid:yourvideoid inside it.
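The two lookups described above can be sketched as small regex helpers. Both function names are hypothetical, and the vid: marker is taken from this answer as-is; whether TikTok still embeds it in the video bytes is an assumption.

```php
<?php
// Pull the numeric id out of a share URL such as
// https://www.tiktok.com/@user/video/123456?lang=en
// Unlike explode("video/", $url)[1], this ignores any trailing query string.
function tiktokNumericId(string $url): ?string
{
    return preg_match('#/video/(\d+)#', $url, $m) ? $m[1] : null;
}

// Search the downloaded video bytes for the "vid:<id>" marker
// mentioned in the answer and return the id part.
function findVid(string $bytes): ?string
{
    return preg_match('/vid:([0-9a-zA-Z]+)/', $bytes, $m) ? $m[1] : null;
}
```

The value returned by findVid() is the kind of id (v0904...) that the playwm URL in the question expects, not the numeric id from the page URL.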

PHP - Simple Html Dom load multiple pages speed

I finally got my script to work, but the search (run via AJAX) takes a long time. Given a keyword, it searches the page and captures the titles, URLs, and thumbnails of all the videos. The problem is that the tags are only available on each video's own page, so I have to visit every video to capture them. The only way I could think of was to add a second loop inside the loop that processes the found videos, that is:
For each video found -> capture title, thumbnail, URL -> go to the captured URL and capture its tags.
The code I use is basically the following. Is there any way to speed up the searches, either by optimizing the code or by taking a different approach?
My parse function:
<?php
function dlPage($href) {
$curl = curl_init();
curl_setopt($curl, CURLOPT_SSL_VERIFYPEER, FALSE);
curl_setopt($curl, CURLOPT_HTTPHEADER, array("Accept-Language: en-US")); // request headers go in CURLOPT_HTTPHEADER; CURLOPT_HEADER is a boolean
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_URL, $href);
curl_setopt($curl, CURLOPT_REFERER, $href);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.4 (KHTML, like Gecko) Chrome/5.0.375.125 Safari/533.4");
$str = curl_exec($curl);
curl_close($curl);
// Create a DOM object
$dom = new simple_html_dom();
// Load HTML from a string
$dom->load($str);
return $dom;
}
?>
My script:
$buscartag = str_replace(' ', '+', $_POST['buscartag']);
$urlparse = "https://example.com/?k=".$buscartag;
$paginas = rand(0, 50);
$html = dlPage($urlparse."&p=".$paginas);
$counter = 0;
foreach($html->find('div.video-box') as $videos) {
if ($videos) {
$titulo = $videos->find('div.video-box>p[!class]>a[!class]',0)->attr['title'];
$pathvideo = str_replace('_', '', $videos->attr['id']);
$link = "https://www.example.com/".$pathvideo."/";
$thumb = $videos->find('div.thumb',0)->innertext;
// Second loop: fetch each video page to collect its tags (this is the slow part)
$gettags2 = array();
$html_tags = file_get_html($link);
foreach ($html_tags->find('a.nu') as $gettags){
$gettags2[] = $gettags->innertext;
}
if (!empty($titulo) && !empty($link) && !empty($pathvideo) && !empty($thumb)){
$counter++;
//here will echo all variables
}
}
}
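Most of the time in the script above goes into waiting for one video page after another. One common way to reduce that is to request all the per-video pages concurrently. The following is a minimal sketch using PHP's curl_multi_* functions; fetchAll() and its parameters are hypothetical names, and how much it helps depends on the remote site.

```php
<?php
// Download several URLs concurrently and return their bodies,
// keyed the same way as the input array.
// fetchAll() is a hypothetical helper, not part of the original script.
function fetchAll(array $urls, int $timeout = 15): array
{
    $multi = curl_multi_init();
    $handles = [];

    foreach ($urls as $key => $url) {
        $ch = curl_init($url);
        curl_setopt_array($ch, [
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_TIMEOUT        => $timeout,
        ]);
        curl_multi_add_handle($multi, $ch);
        $handles[$key] = $ch;
    }

    // Drive all transfers until none is still running.
    do {
        $status = curl_multi_exec($multi, $running);
        if ($running > 0) {
            // Wait for activity on any connection instead of busy-looping.
            curl_multi_select($multi, 1.0);
        }
    } while ($running > 0 && $status === CURLM_OK);

    $bodies = [];
    foreach ($handles as $key => $ch) {
        $bodies[$key] = curl_multi_getcontent($ch);
        curl_multi_remove_handle($multi, $ch);
    }
    curl_multi_close($multi);

    return $bodies;
}
```

The idea would be to collect every $link in the first loop, call fetchAll() once, and then parse each returned body for the a.nu tags, for example with simple_html_dom's str_get_html().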

Manipulate dom with php to scrape data

I am currently trying to manipulate the DOM through PHP to extract the view count from a Facebook video page. The code below worked until recently, but now it no longer finds the node that contains the view count. This information is inside a div with the id fbPhotoPageMediaInfo. What would be the best way to manipulate the DOM through PHP to get the views of a Facebook video page?
private function _callCurl($url)
{
$ch = curl_init();
curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, false);
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (Linux; Android 5.0.1; SAMSUNG-SGH-I337 Build/LRX22C; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/42.0.2311.138 Mobile Safari/537.36');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 20);
curl_setopt($ch, CURLOPT_URL, $url);
$response = curl_exec($ch);
$http = curl_getinfo($ch, CURLINFO_HTTP_CODE);
curl_close($ch);
return array(
$http,
$response,
);
}
function test()
{
$url = "https://www.facebook.com/TaylorSwift/videos/10153665021155369/";
$request = $this->_callCurl($url);
if ($request[0] == 200) {
$dom = new DOMDocument();
@$dom->loadHTML($request[1]);
$elm = $dom->getElementById('fbPhotoPageMediaInfo');
if (isset($elm->nodeValue)) {
$views = preg_replace('/[^0-9]/', '', $elm->nodeValue);
} else {
$views = null;
}
} else {
echo "Error!";
}
return isset($views) ? $views : null;
}
Here is what I've determined...
If you var_dump() on $request you can see that it's giving you a 302 code (redirect) rather than a 200 (ok).
Changing CURLOPT_FOLLOWLOCATION to true or commenting it out entirely makes the error go away, but now we're getting a different page from the one expected.
I ran the following to see where I was being redirected to:
$htm = file_get_contents("https://www.facebook.com/TaylorSwift/videos/10153665021155369/");
var_dump($htm);
This gave me a page saying I was using an outdated browser, and needed to update it. So apparently Facebook doesn't like the User Agent.
I updated it as follows:
curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/44.0.2');
That appears to solve the problem.
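Once the request returns the real page, the node lookup from the question can be run on the fetched HTML. Here is a minimal sketch, assuming the view count still sits in an element with the id fbPhotoPageMediaInfo (Facebook's markup may have changed since). extractViews() is a hypothetical helper, and libxml_use_internal_errors() is used here to keep malformed-HTML warnings quiet.

```php
<?php
// Return the digits found in the element with the given id, or null
// when the element is missing or contains no digits.
// $id must not contain double quotes, since it is placed inside an XPath string.
function extractViews(string $html, string $id = 'fbPhotoPageMediaInfo'): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $loaded = $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query('//*[@id="' . $id . '"]');
    if ($nodes === false || $nodes->length === 0) {
        return null;
    }

    $digits = preg_replace('/[^0-9]/', '', $nodes->item(0)->textContent);
    return $digits === '' ? null : $digits;
}
```

A null result then means either the page layout changed or the request was served a different page, which is worth checking by dumping the response first.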
Personally, I prefer to use Simplehtmldom.
Facebook, like other high-traffic sites, updates its source to help prevent scraping, so you may have to adjust your node search in the future.
<?php
$ua = "Mozilla/5.0 (Windows NT 5.0) AppleWebKit/5321 (KHTML, like Gecko) Chrome/13.0.872.0 Safari/5321"; // must be a valid User Agent
ini_set('user_agent', $ua);
require_once('simplehtmldom/simple_html_dom.php'); // http://simplehtmldom.sourceforge.net/
Function Scrape_FB_Views($url) {
IF (!filter_var($url, FILTER_VALIDATE_URL) === false) {
// Create DOM from URL
$html = file_get_html($url);
IF ($html) {
IF (($html->find('span[class=fcg]', 3))) { // 4th instance of span with fcg class
$text = trim($html->find('span[class=fcg]', 3)->plaintext); // get content of span as plain text
$result = preg_replace('/[^0-9]/', '', $text); // replace all non-numeric characters
}ELSE{
$result = "Node is no longer valid.";
}
}ELSE{
$result = "Could not get HTML.";
}
}ELSE{
$result = "URL is invalid.";
}
return $result;
}
$url = "https://www.facebook.com/TaylorSwift/videos/10153665021155369/";
echo("<p>".Scrape_FB_Views($url)."</p>");
?>

copy particular div from Flipkart.com web scraping using Curl and Php

I want to copy the data in a particular div from a Flipkart product page and display it.
<table cellspacing="0" class="specTable">
///// contains /////
</table>
The number of tables varies: some pages have 10 tables with this class and some have more. How can I get the values from all of them?
I also want to get a specific specsValue. Is that possible?
<td class="specsKey">Brand</td><td class="specsValue">Apple</td>
Web page address: http://www.flipkart.com/apple-iphone-6/p/itme8ra5z7yx5c9j?pid=MOBEYHZ2JHVFHFBG
Sample code
$url = "http://dl.flipkart.com/dl/apple-iphone-6/p/itme8ra5z7yx5c9j?pid=MOBEYHZ2JHVFHFBG";
$response = getPriceFromFlipkart($url);
echo json_encode($response);
/* Returns the response in JSON format */
function getPriceFromFlipkart($url) {
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 10.10; labnol;) ctrlq.org");
curl_setopt($curl, CURLOPT_FAILONERROR, true);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$html = curl_exec($curl);
curl_close($curl);
$regex = '/<meta itemprop="price" content="([^"]*)"/';
preg_match($regex, $html, $price);
$regex = '/<h1[^>]*>([^<]*)<\/h1>/';
preg_match($regex, $html, $title);
$regex = '/data-src="([^"]*)"/i';
preg_match($regex, $html, $image);
if ($price && $title && $image) {
$response = array("price" => $price[1], "title" => $title[1], "image" => $image[1]);
} else {
$response = array("status" => "404", "error" => "We could not find the product details on Flipkart $url");
}
return $response;
}
?>
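For the specification tables themselves, a DOM query can collect every specsKey/specsValue pair regardless of how many specTable tables a page has. This is a minimal sketch using PHP's built-in DOMDocument and DOMXPath, assuming the markup looks like the fragment in the question; parseSpecs() and classTest() are hypothetical helper names.

```php
<?php
// Build an XPath predicate that matches one class name inside a class attribute.
function classTest(string $class): string
{
    return 'contains(concat(" ", normalize-space(@class), " "), " ' . $class . ' ")';
}

// Return all key/value pairs from every table with class "specTable".
function parseSpecs(string $html): array
{
    if ($html === '') {
        return [];
    }

    $previous = libxml_use_internal_errors(true); // real pages are rarely valid HTML
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $specs = [];

    $keyCells = $xpath->query('//table[' . classTest('specTable') . ']//td[' . classTest('specsKey') . ']');
    foreach ($keyCells as $keyCell) {
        // The value is the next specsValue cell in the same row.
        $valueCell = $xpath
            ->query('following-sibling::td[' . classTest('specsValue') . '][1]', $keyCell)
            ->item(0);
        if ($valueCell !== null) {
            $specs[trim($keyCell->textContent)] = trim($valueCell->textContent);
        }
    }
    return $specs;
}
```

A specific value is then a plain array lookup, e.g. parseSpecs($html)['Brand'] ?? null.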
Flipkart has since changed its interface, and you can now fetch the product price and other details through the Flipkart API.
I'm currently using their API as well.
However, I also want to fetch the product details with the cURL code below. If anyone is doing this without problems, please share what else I need to add to fetch the product page content. While debugging with getinfo(), it returns 301 Moved Permanently with status code 0.
$curl_handle=curl_init();
curl_setopt($curl_handle,CURLOPT_URL,<flipkart_url>);
curl_setopt($curl_handle,CURLOPT_CONNECTTIMEOUT,100);
curl_setopt($curl_handle,CURLOPT_RETURNTRANSFER,1);
curl_setopt($curl_handle, CURLOPT_REFERER, 'http://www.flipkart.com/');
curl_setopt($curl_handle, CURLOPT_USERAGENT, 'Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 GTB5');
$str = curl_exec($curl_handle);
$html = new simple_html_dom();
$html->load($str);

Google No-captcha and PHP

I am trying to implement the new version of captcha on my website.
What i did so far:
Inside the FORM:
echo '<div class="g-recaptcha" data-sitekey="XXXXXXXXXXXXXXXXXXXXXXXXXXXX"></div>';
Inside PHP:
$recaptcha = $_POST['g-recaptcha-response'];
if(!empty($recaptcha))
{
$google_url = "https://www.google.com/recaptcha/api/siteverify";
$secret = 'YYYYYYYYYYYYYYYYYYYYYYYYYYY';
$ip = $_SERVER['REMOTE_ADDR'];
$url = $google_url."?secret=".$secret."&response=".$recaptcha."&remoteip=".$ip;
$res = getCurlData($url);
$res = json_decode($res, true);
if($res['success'] == 'false')
{
$captcha_error = "Please re-enter your reCAPTCHA.";
}
}
The getCurlData function:
function getCurlData($url)
{
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($curl, CURLOPT_TIMEOUT, 10);
curl_setopt($curl, CURLOPT_USERAGENT, "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.16) Gecko/20110319 Firefox/3.6.16");
$curlData = curl_exec($curl);
curl_close($curl);
return $curlData;
}
What I want to achieve is to detect whether the No CAPTCHA box was checked, and to show the user an error if it was not.
So far I only show an error when the response from Google is "We are not sure if you are human, please proceed to our second level of verification" [if($res['success'] == 'false')].
PS: most of the code is written by Srinivas Tamada. You can find it here.
Thanks in advance.
The response is a JSON object:
{
"success": true|false,
"error-codes": [...] // optional
}
https://developers.google.com/recaptcha/docs/verify
If you parse that JSON you will get something like this:
object(stdClass)[1]
public 'success' => boolean false
public 'error-codes' =>
array (size=1)
0 => string 'missing-input-response' (length=22)
So if the response contains the error code 'missing-input-response', you can tell that the user didn't check the box.
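One detail worth noting: after json_decode(), success is a boolean, so comparing it with the string 'false' does not do what the question intends (a non-empty string converts to true, so the comparison only holds when success is true). A minimal sketch of interpreting the decoded response follows; recaptchaStatus() and its return labels are hypothetical.

```php
<?php
// Classify a siteverify JSON response.
// Returns 'ok', 'unchecked', 'failed' or 'invalid-response'.
function recaptchaStatus(string $json): string
{
    $res = json_decode($json, true);
    if (!is_array($res) || !array_key_exists('success', $res)) {
        return 'invalid-response';
    }

    // Compare against the boolean, not the string 'false'.
    if ($res['success'] === true) {
        return 'ok';
    }

    $codes = $res['error-codes'] ?? [];
    if (is_array($codes) && in_array('missing-input-response', $codes, true)) {
        return 'unchecked'; // no response token was sent, i.e. the box was not checked
    }
    return 'failed';
}
```

In the question's code the whole verification is skipped when $_POST['g-recaptcha-response'] is empty, so that branch would also need an else that sets the error message.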
I implemented No CAPTCHA without cURL in a small library I wrote recently, so you can check it out if you want more details:
https://github.com/zoran-petrovic-87/ZorAuth
http://zoran87.blogspot.com/2014/12/zorauth-10b-complete-flexible-no.html
