I have a set of PDF documents that I need to convert to Excel. When I tried to convert them directly, the conversion failed. I then tried converting them to HTML instead: all the data is converted, but the whole document comes out as a flat list of P tags. I have attached a sample of the HTML output. Please suggest some ideas for converting PDF to Excel in PHP.
Here is a sample of the HTML output:
<p class="ft00" style="margin: 0; padding: 0; font-size: 11px; font-family: Times; color: #000000; position: absolute; top: 863px; left: 683px; white-space: nowrap;"><b>10</b></p>
<p class="ft01" style="margin: 0; padding: 0; font-size: 9px; font-family: Times; color: #000000; -moz-transform: matrix( 0, -1, 0.9999875, 0, 0, 0); -webkit-transform: matrix( 0, -1, 0.9999875, 0, 0, 0); -o-transform: matrix( 0, -1, 0.9999875, 0, 0, 0); -ms-transform: matrix( 0, -1, 0.9999875, 0, 0, 0); -moz-transform-origin: left 75%; -webkit-transform-origin: left 75%; -o-transform-origin: left 75%; -ms-transform-origin: left 75%; position: absolute; top: 207px; left: 63px; white-space: nowrap;"><b>THE</b></p>
Library: PdfToHtml
$pdf = new \TonchikTm\PdfToHtml\Pdf($pdf_file , [
'pdftohtml_path' => $bindir,
'pdfinfo_path' => $binexe,
'outputDir' => getcwd().'/excelfile/'.uniqid() // output dir
]);
$newpage = '';
foreach ($pdf->getHtml()->getAllPages() as $page) {
$newpage .= $page . '<br/>';
}
I've never had a use case to convert PDF to Excel, but what other libraries did you try?
PDFTables looks promising:
https://pdftables.com/pdf-to-excel-api#php
https://github.com/pdftables/php-pdftables-api
If it needs to be done with the PDF to HTML library you're using, I don't know of an easy/clean way to do this.
Based on the question, does each page return a string containing all the paragraph elements?
If it returns an array of elements, you should be able to loop over each paragraph and use strip_tags() to get the individual values.
If it's a string, look into using Simple HTML DOM to parse the HTML from the string so that you can work with the elements more easily.
$html = str_get_html('<div id="hello">Hello</div><div id="world">World</div>');
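Since every paragraph in the sample output carries absolute top and left offsets, one possible approach is to rebuild rows from those coordinates. The following is only a minimal sketch and not part of PdfToHtml: the function name paragraphsToRows is hypothetical, it uses PHP's built-in DOMDocument rather than Simple HTML DOM, and it assumes that cells belonging to the same table row share exactly the same top value, which may not hold for every PDF.

```php
<?php
// Hypothetical helper: groups absolutely positioned <p> elements into rows
// by their "top" offset and orders the cells of each row by "left".
function paragraphsToRows(string $html): array
{
    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $grid = [];
    foreach ($doc->getElementsByTagName('p') as $p) {
        $style = $p->getAttribute('style');
        // The lookbehind avoids matching properties such as "padding-top".
        if (!preg_match('/(?<![\w-])top:\s*(\d+)px/', $style, $top)
            || !preg_match('/(?<![\w-])left:\s*(\d+)px/', $style, $left)) {
            continue; // skip paragraphs without a position
        }
        $grid[(int) $top[1]][(int) $left[1]] = trim($p->textContent);
    }

    ksort($grid); // rows from top to bottom
    $rows = [];
    foreach ($grid as $cells) {
        ksort($cells); // cells from left to right
        $rows[] = array_values($cells);
    }
    return $rows;
}
```

Each returned row could then be written with fputcsv() to a CSV file that Excel can open, or handed to a spreadsheet library of your choice.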
First, about my environment:
I'm working in Scriptcase v9;
I'm trying to create a PDF report with the TCPDF library.
I'm trying to apply some CSS to HTML built in a PHP fetch function, but with no success:
Variant A: I stored one big CSS block at the beginning of the file and referenced its IDs from the fetch function;
$_style = "
<style>
html, body {
margin:0;
padding:0
}
table {}
th {
border-bottom: 0.3px solid #D2D2D2;
height: auto;
padding-top: 30px;
padding-bottom: 20px;
text-align: left;
color: #2B2B2B;
}
#summary_tr {padding:0px 0px;}
#summary_tc1 {
width: 5%;
height: auto;
}
#summary_tc2 {
width: 5%;
height: auto;
border-left: 0.5px solid #D2D2D2;
}
#summary_tc3 {
width: 90%;
height: auto;
}
</style>";
function fetch_mresumeright($resright_id)
{
$server = "blabla";
$user = "blabla";
$pass = "blabla";
$db = "blabla";
$output = '';
$con = mysqli_connect($server,$user,$pass,$db) or die("Some error occurred during connection " . mysqli_error($con));
// Write query
$strSQL = "SELECT
summary
FROM
_mresume
WHERE id = ".$resright_id;
// Execute the query.
$query = mysqli_query($con, $strSQL);
if ($query->num_rows > 0)
{
while($row = mysqli_fetch_array($query))
{
$output .= '
<tr id = "summary_tr">
<td id = "summary_td_1"></td>
<td id = "summary_td_2"></td>
<td id = "summary_td_3">'.$row["summary"].'</td>
</tr>';
}
return $output;
}
//Close the connection
mysqli_close($con);
}
$_mresumeright = fetch_mresumeright($var_id);
// create new PDF document
$pdf = new TCPDF(PDF_PAGE_ORIENTATION, PDF_UNIT, PDF_PAGE_FORMAT, true, 'UTF-8', false);
// set document information
$pdf->SetCreator(PDF_CREATOR);
// set default header data
$pdf->SetHeaderData(PDF_HEADER_LOGO, PDF_HEADER_LOGO_WIDTH, PDF_HEADER_TITLE.' 007', PDF_HEADER_STRING);
// set header and footer fonts
$pdf->setHeaderFont(Array(PDF_FONT_NAME_MAIN, '', PDF_FONT_SIZE_MAIN));
$pdf->setFooterFont(Array(PDF_FONT_NAME_DATA, '', PDF_FONT_SIZE_DATA));
// set default monospaced font
$pdf->SetDefaultMonospacedFont(PDF_FONT_MONOSPACED);
// set margins
$pdf->SetMargins(PDF_MARGIN_LEFT, PDF_MARGIN_TOP, PDF_MARGIN_RIGHT);
$pdf->SetHeaderMargin(PDF_MARGIN_HEADER);
$pdf->SetFooterMargin(PDF_MARGIN_FOOTER);
// set auto page breaks
$pdf->SetAutoPageBreak(TRUE, PDF_MARGIN_BOTTOM);
// set image scale factor
$pdf->setImageScale(PDF_IMAGE_SCALE_RATIO);
// set some language-dependent strings (optional)
if (@file_exists(dirname(__FILE__).'/lang/eng.php')) {
require_once(dirname(__FILE__).'/lang/eng.php');
$pdf->setLanguageArray($l);
}
$right_column = <<<EOF
//=======right=========
//===css
$_style
//===html
<table>
$_mresumeright
</table>
EOF;
// write the second column - for A3
$pdf->writeHTMLCell(206, '', '82', '', $right_column, 0, 1, 0, true, '', true);
// reset pointer to the last page
$pdf->lastPage();
Variant B: I declared the style for each ID as a variable and used it in my fetch function.
$summary_tr = "
padding:0px 0px;";
$summary_td_1 = "
width: 5%;
height: auto;";
$summary_td_2 = "
width: 5%;
border-left: 0.5px solid #D2D2D2;
height: auto;";
$summary_td_3 = "
width: 90%;
height: auto;";
$_style = "
<style>
html, body {
margin:0;
padding:0
}
table {}
th {
border-bottom: 0.3px solid #D2D2D2;
height: auto;
padding-top: 30px;
padding-bottom: 20px;
text-align: left;
color: #2B2B2B;
}
</style>";
function fetch_mresumeright($resright_id)
{
$server = "blabla";
$user = "blabla";
$pass = "blabla";
$db = "blabla";
$output = '';
$con = mysqli_connect($server,$user,$pass,$db) or die("Some error occurred during connection " . mysqli_error($con));
// Write query
$strSQL = "SELECT
summary
FROM
_mresume
WHERE id = ".$resright_id;
// Execute the query.
$query = mysqli_query($con, $strSQL);
if ($query->num_rows > 0)
{
while($row = mysqli_fetch_array($query))
{
$output .= '
<tr style = "'.$summary_tr.'">
<td style = "'.$summary_td_1.'"></td>
<td style = "'.$summary_td_2.'"></td>
<td style = "'.$summary_td_3.'">'.$row["summary"].'</td>
</tr>';
}
return $output;
}
//Close the connection
mysqli_close($con);
}
$_mresumeright = fetch_mresumeright($var_id);
// create new PDF document
$pdf = new TCPDF(PDF_PAGE_ORIENTATION, PDF_UNIT, PDF_PAGE_FORMAT, true, 'UTF-8', false);
// set document information
$pdf->SetCreator(PDF_CREATOR);
// set default header data
$pdf->SetHeaderData(PDF_HEADER_LOGO, PDF_HEADER_LOGO_WIDTH, PDF_HEADER_TITLE.' 007', PDF_HEADER_STRING);
// set header and footer fonts
$pdf->setHeaderFont(Array(PDF_FONT_NAME_MAIN, '', PDF_FONT_SIZE_MAIN));
$pdf->setFooterFont(Array(PDF_FONT_NAME_DATA, '', PDF_FONT_SIZE_DATA));
// set default monospaced font
$pdf->SetDefaultMonospacedFont(PDF_FONT_MONOSPACED);
// set margins
$pdf->SetMargins(PDF_MARGIN_LEFT, PDF_MARGIN_TOP, PDF_MARGIN_RIGHT);
$pdf->SetHeaderMargin(PDF_MARGIN_HEADER);
$pdf->SetFooterMargin(PDF_MARGIN_FOOTER);
// set auto page breaks
$pdf->SetAutoPageBreak(TRUE, PDF_MARGIN_BOTTOM);
// set image scale factor
$pdf->setImageScale(PDF_IMAGE_SCALE_RATIO);
// set some language-dependent strings (optional)
if (@file_exists(dirname(__FILE__).'/lang/eng.php')) {
require_once(dirname(__FILE__).'/lang/eng.php');
$pdf->setLanguageArray($l);
}
$right_column = <<<EOF
//=======right=========
//===css
$_style
//===html
<table>
$_mresumeright
</table>
EOF;
// write the second column - for A3
$pdf->writeHTMLCell(206, '', '82', '', $right_column, 0, 1, 0, true, '', true);
// reset pointer to the last page
$pdf->lastPage();
Neither variant has worked for me.
Any idea where I went wrong?
Unfortunately, my experience in programming is quite limited.
Thank you for your time and help,
Dan
Below is a simple example based on your code. It should help you understand how to use HTML/CSS with TCPDF and how to insert content into the HTML. One issue I noticed is that your example never requires the TCPDF library, though that may have been done elsewhere. You also may not have called $pdf->AddPage();. The example should run from inside the TCPDF examples directory.
<?php
$styles = <<<EOF
<style>
.section {
color: navy;
}
.row {
background-color: green;
}
.cell-one {
background-color: blue;
color: white;
}
</style>
EOF;
function fetchContent() {
$row = '<tr class="row">
<td class="cell-one">Cell One</td>
<td>Cell Two</td>
</tr>';
return str_repeat($row,20);
}
require_once('../tcpdf.php');
$pdf = new TCPDF(PDF_PAGE_ORIENTATION, PDF_UNIT, PDF_PAGE_FORMAT, true, 'UTF-8', false);
$pdf->AddPage();
$tableRows = fetchContent();
$html = <<<EOF
$styles
<div class="section">
<h1>Lorem Ipsum</h1>
Lorem ipsum dolor sit amet, consectetur adipiscing elit.
</div>
<table>
$tableRows
</table>
EOF;
$pdf->writeHTML($html, true, false, true, false, '');
$pdf->Output('example', 'I');
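Two details in the question's code are worth noting. In Variant A the stylesheet defines #summary_tc1, #summary_tc2 and #summary_tc3, while the rows use id="summary_td_1" and so on, so those selectors never match. In Variant B the $summary_* variables are defined outside the function, and PHP does not make outer variables visible inside a function unless they are passed in or declared global. The sketch below shows one way the row-building part could look with class names, as in the example above. The function name buildSummaryRows and the class names are hypothetical, and the database query is left out so the snippet stays self-contained.

```php
<?php
// Hypothetical helper: turns a list of summary strings into table rows.
// Classes are used instead of ids because ids must be unique per document,
// while this markup is repeated for every row.
function buildSummaryRows(array $summaries): string
{
    $output = '';
    foreach ($summaries as $summary) {
        $output .= '<tr class="summary-row">'
            . '<td class="summary-spacer"></td>'
            . '<td class="summary-divider"></td>'
            // Escape the text so characters like & or < do not break the HTML.
            . '<td class="summary-text">' . htmlspecialchars($summary) . '</td>'
            . '</tr>';
    }
    return $output;
}
```

The returned string can be placed between the table tags of the heredoc, with matching .summary-row, .summary-divider and .summary-text rules in the style block. TCPDF supports only a subset of CSS, so individual properties may still need testing.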
I'm using PHP and the HTML2PDF library to generate PDF files. What I'm trying to do is generate a PDF whose page size (width/height) matches the size of the HTML content. How can I achieve this?
My html content is:
<page format="432x240" orientation="L" backcolor="#FFFFFF" style="font: arial;">
<div class="image">
<span class="firstname">$fname</span>
<span class="lastname">$lname</span>
</div>
The css for image class is:
position: relative;width: 100%; /* for IE 6 */ background-image: url(../img/test.png);height: 240px; width: 432px;top: 50%;
And my PHP code is:
$html2pdf = new HTML2PDF('P', 'A4', 'en', true, 'UTF-8', 0);
$html2pdf->pdf->SetDisplayMode('fullpage');
$contentTpl = $this->renderPartial('template_01', array('fname' => $firstname, 'lname' => $lastname), true);
$html2pdf->writeHTML(utf8_encode($contentTpl));
Here is the solution for this problem:
$html2pdf = new HTML2PDF('P', array($width_in_mm,$height_in_mm), 'en', true, 'UTF-8', array(0, 0, 0, 0));
Width and height should be in mm. If you're using inches, convert them to mm.
Formula:
$width_in_mm = $width_in_inches * 25.4;
$height_in_mm = $height_in_inches * 25.4;
Don't round it off. Use the exact conversion even if it has a decimal point.
Hope this answer will solve your problem.
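The conversion above can be wrapped in small helpers. This is only a sketch: the function names are hypothetical, and the pixel conversion assumes a resolution of 96 pixels per inch, which is an assumption and not something stated in the question. Adjust the $dpi argument if the 432x240 size is defined at a different resolution.

```php
<?php
const MM_PER_INCH = 25.4;

// Converts inches to millimetres without rounding.
function inchesToMm(float $inches): float
{
    return $inches * MM_PER_INCH;
}

// Converts pixels to millimetres.
// Assumption: $dpi pixels per inch (96 by default, not taken from the question).
function pixelsToMm(float $pixels, float $dpi = 96.0): float
{
    return $pixels / $dpi * MM_PER_INCH;
}

// Page format for the 432x240 content from the question, as [width, height] in mm.
$format = [pixelsToMm(432), pixelsToMm(240)];
```

The $format array can then be passed as the second constructor argument in place of array($width_in_mm, $height_in_mm).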
I am trying to use cURL and PHP to scrape proxies off a webpage. However, when I use cURL, all I get in $content is the CSS. The page uses WordPress, so it loads content dynamically, but I haven't found anything that helps me download the dynamic content. When I use wget on Linux, the page downloads fine.
<?php
//$source1 = file_get_contents('http://www.new-fresh-proxies.blogspot.com/');
$source1 = get_data("http://www.new-fresh-proxies.blogspot.com/");
$array = array();
$source1 = preg_grep('/\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}\b/', $array);
//download webpage
function get_data($url) {
$options = array(
CURLOPT_RETURNTRANSFER => 1, // return web page
CURLOPT_HEADER => true, // include response headers in the output
CURLOPT_FOLLOWLOCATION => true, // follow redirects
CURLOPT_ENCODING => "", // handle all encodings
CURLOPT_USERAGENT => "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.13) Gecko/20080311 Firefox/2.0.0.13", // who am i
CURLOPT_AUTOREFERER => true, // set referer on redirect
CURLOPT_CONNECTTIMEOUT => 120, // timeout on connect
CURLOPT_TIMEOUT => 120, // timeout on response
CURLOPT_MAXREDIRS => 50, // stop after 50 redirects
);
$ch = curl_init( $url );
curl_setopt_array( $ch, $options );
$content = curl_exec( $ch );
$err = curl_errno( $ch );
$errmsg = curl_error( $ch );
$header = curl_getinfo( $ch );
curl_close( $ch );
$header['errno'] = $err;
$header['errmsg'] = $errmsg;
$header['content'] = $content;
return $header;
}
My output:
(
string:203221) HTTP/1.1 200 OK
Content-Type: text/html; charset=UTF-8
Expires: Wed, 06 Feb 2013 22:09:23 GMT
Date: Wed, 06 Feb 2013 22:09:23 GMT
Cache-Control: private, max-age=0
Last-Modified: Wed, 06 Feb 2013 20:39:30 GMT
ETag: "c6675d47-80ec-48ee-9c0f-613c9419f172"
Content-Encoding: gzip
X-Content-Type-Options: nosniff
X-XSS-Protection: 1; mode=block
Content-Length: 47132
Server: GSE
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html dir='ltr' xmlns='http://www.w3.org/1999/xhtml' xmlns:b='http://www.google.com/2005/gml/b' xmlns:data='http://www.google.com/2005/gml/data' xmlns:expr='http://www.google.com/2005/gml/expr'>
<head>
<meta content='text/html; charset=UTF-8' http-equiv='Content-Type'/>
<script type="text/javascript">(function() { var a=window,b="jstiming",d="tick";var e=function(c){this.t={};this.tick=function(c,p,h){h=void 0!=h?h:(new Date).getTime();this.t[c]=[h,p]};this[d]("start",null,c)},f=new e;a.jstiming={Timer:e,load:f};if(a.performance&&a.performance.timing){var g=a.performance.timing,j=a[b].load,k=g.navigationStart,l=g.responseStart;0<k&&l>=k&&(j[d]("_wtsrt",void 0,k),j[d]("wtsrt_","_wtsrt",l),j[d]("tbsd_","wtsrt_"))}
try{var m=null;a.chrome&&a.chrome.csi&&(m=Math.floor(a.chrome.csi().pageT),j&&0<k&&(j[d]("_tbnd",void 0,a.chrome.csi().startE),j[d]("tbnd_","_tbnd",k)));null==m&&a.gtbExternal&&(m=a.gtbExternal.pageT());null==m&&a.external&&(m=a.external.pageT,j&&0<k&&(j[d]("_tbnd",void 0,a.external.startE),j[d]("tbnd_","_tbnd",k)));m&&(a[b].pt=m)}catch(n){};a.tickAboveFold=function(c){var i=0;if(c.offsetParent){do i+=c.offsetTop;while(c=c.offsetParent)}c=i;750>=c&&a[b].load[d]("aft")};var q=!1;function r(){q||(q=!0,a[b].load[d]("firstScrollTime"))}a.addEventListener?a.addEventListener("scroll",r,!1):a.attachEvent("onscroll",r);
})();</script>
<meta content='true' name='MSSmartTagsPreventParsing'/>
<meta content='blogger' name='generator'/>
<link href='http://www.new-fresh-proxies.blogspot.com/favicon.ico' rel='icon' type='image/x-icon'/>
<link href='http://new-fresh-proxies.blogspot.com/' rel='canonical'/>
<link rel="alternate" type="application/atom+xml" title="New Fresh Proxies - Atom" href="http://new-fresh-proxies.blogspot.com/feeds/posts/default" />
<link rel="alternate" type="application/rss+xml" title="New Fresh Proxies - RSS" href="http://new-fresh-proxies.blogspot.com/feeds/posts/default?alt=rss" />
<link rel="service.post" type="application/atom+xml" title="New Fresh Proxies - Atom" href="http://www.blogger.com/feeds/2001908494944967755/posts/default" />
<link rel="EditURI" type="application/rsd+xml" title="RSD" href="http://www.blogger.com/rsd.g?blogID=2001908494944967755" />
<link rel="openid.server" href="http://www.blogger.com/openid-server.g" />
<link rel="openid.delegate" href="http://new-fresh-proxies.blogspot.com/" />
<!--[if IE]> <script> (function() { var html5 = ("abbr,article,aside,audio,canvas,datalist,details," + "figure,footer,header,hgroup,mark,menu,meter,nav,output," + "progress,section,time,video").split(','); for (var i = 0; i < html5.length; i++) { document.createElement(html5[i]); } try { document.execCommand('BackgroundImageCache', false, true); } catch(e) {} })(); </script> <![endif]-->
<title>New Fresh Proxies</title>
<link type='text/css' rel='stylesheet' href='//www.blogger.com/static/v1/widgets/3950009988-widget_css_bundle.css' />
<link type="text/css" rel="stylesheet" href="//www.blogger.com/dyn-css/authorization.css?targetBlogID=2001908494944967755&zx=c6675d47-80ec-48ee-9c0f-613c9419f172"/>
<style id='page-skin-1' type='text/css'><!--
/*
-----------------------------------------------
Theme Name: Harmonika
Theme URL: http://newwpthemes.com/wordpress-theme/harmonika/
Description: Harmonika is a free WordPress theme with options page and supports the post thumbnails. Suitable for any niche.
Author: NewWpThemes.com
Author URI: http://newwpthemes.com/
Version: 1.0
Tags: Ads Ready, Two Columns, Right Sidebar, Fixed Width, Blogging, Options Page
Template Name: Harmonika
Template URI: http://themecraft.net/2010/06/harmonika-blogger-template
Version: 1.0
Author: Theme Craft
Author URI: http://www.themecraft.net
Harmonika Blogger Template comes under a Creative Commons Attribution-Noncommercial-Share Alike 2.5 Malaysia License.
This means it is free to use on your blog, and you must keep the footer link intact, also as a respect to the designer and converter.
We sincerely need your respect to continue our free Blogger template production, thank you.
ThemeCraft.net
----------------------------------------------- */
/* Use this with templates/template-twocol.html */
#navbar-iframe{
height:0px;
visibility:hidden;
display:none;
}
/* -----------------------------------------------------------------------
Blueprint CSS Framework 0.8
http://blueprintcss.org
* Copyright (c) 2007-Present. See LICENSE for more info.
* See README for instructions on how to use Blueprint.
* For credits and origins, see AUTHORS.
* This is a compressed file. See the sources in the 'src' directory.
----------------------------------------------------------------------- */
/* reset.css */
html, body, div, span, object, iframe, h1, h2, h3, h4, h5, h6, p, blockquote, pre, a, abbr, acronym, address, code, del, dfn, em, img, q, dl, dt, dd, ol, ul, li, fieldset, form, label, legend, table, caption, tbody, tfoot, thead, tr, th, td {margin:0;padding:0;border:0;font-weight:inherit;font-style:inherit;font-size:100%;font-family:inherit;}
body {line-height:1.5;}
caption, th, td {text-align:left;font-weight:normal;}
blockquote:before, blockquote:after, q:before, q:after {content:"";}
blockquote, q {quotes:"" "";}
img {max-width: 100%; /* not working in IE6*/}
a img {max-width: 100%; /* not working in IE6*/ border:none;}
/* typography.css */
body {font-size:75%;color:#222;background:#000;font-family:"Helvetica Neue", Arial, Helvetica, sans-serif;}
h1, h2, h3, h4, h5, h6 {font-weight:normal;}
h1 {font-size:3em;line-height:1;margin-bottom:0.5em;}
h2 {font-size:2em;margin-bottom:0.75em;}
h3 {font-size:1.5em;line-height:1;margin-bottom:1em;}
h4 {font-size:1.2em;line-height:1.25;margin-bottom:1.25em;}
h5 {font-size:1em;font-weight:bold;margin-bottom:1.5em;}
h6 {font-size:1em;font-weight:bold;}
h1 img, h2 img, h3 img, h4 img, h5 img, h6 img {margin:0;}
p {margin:0 0 1.5em;}
p img.left {float:left;margin:1.5em 1.5em 1.5em 0;padding:0;}
p img.right {float:right;margin:1.5em 0 1.5em 1.5em;}
a:focus, a:hover {color:#000;}
a {color:#009;text-decoration:underline;}
blockquote {margin:1.5em;color:#666;font-style:italic;}
strong {font-weight:bold;}
em, dfn {font-style:italic;}
dfn {font-weight:bold;}
sup, sub {line-height:0;}
abbr, acronym {border-bottom:1px dotted #666;}
address {margin:0 0 1.5em;font-style:italic;}
del {color:#666;}
pre {margin:1.5em 0;white-space:pre;}
pre, code, tt {font:1em 'andale mono', 'lucida console', monospace;line-height:1.5;}
li ul, li ol {margin:0 1.5em;}
ul, ol {margin:0 1.5em 1.5em 1.5em;}
ul {list-style-type:disc;}
ol {list-style-type:decimal;}
dl {margin:0 0 1.5em 0;}
dl dt {font-weight:bold;}
dd {margin-left:1.5em;}
table {margin-bottom:1.4em;width:100%;}
th {font-weight:bold;}
thead th {background:#c3d9ff;}
tr.even td {background:#e5ecf9;}
tfoot {font-style:italic;}
caption {background:#eee;}
.small {font-size:.8em;margin-bottom:1.875em;line-height:1.875em;}
.large {font-size:1.2em;line-height:2.5em;margin-bottom:1.25em;}
.hide {display:none;}
.quiet {color:#666;}
.loud {color:#000;}
.highlight {background:#ff0;}
.added {background:#060;color:#fff;}
.removed {background:#900;color:#fff;}
.first {margin-left:0;padding-left:0;}
.last {margin-right:0;padding-right:0;}
.top {margin-top:0;padding-top:0;}
.bottom {margin-bottom:0;padding-bottom:0;}
/* forms.css */
label {font-weight:bold;}
fieldset {padding:1.4em;margin:0 0 1.5em 0;border:1px solid #ccc;}
legend {font-weight:bold;font-size:1.2em;}
input.text, input.title, textarea, select {margin:0.5em 0;border:1px solid #bbb;}
input.text:focus, input.title:focus, textarea:focus, select:focus {border:1px solid #666;}
input.text, input.title {width:300px;padding:5px;}
input.title {font-size:1.5em;}
textarea {width:390px;height:250px;padding:5px;}
.error, .notice, .success {padding:.8em;margin-bottom:1em;border:2px solid #ddd;}
.error {background:#FBE3E4;color:#8a1f11;border-color:#FBC2C4;}
.notice {background:#FFF6BF;color:#514721;border-color:#FFD324;}
.success {background:#E6EFC2;color:#264409;border-color:#C6D880;}
.error a {color:#8a1f11;}
.notice a {color:#514721;}
.success a {color:#264409;}
/* grid.css */
.container {width:950px;margin:0 auto;}
.showgrid {background:url(src/grid.png);}
.column, div.span-1, div.span-2, div.span-3, div.span-4, div.span-5, div.span-6, div.span-7, div.span-8, div.span-9, div.span-10, div.span-11, div.span-12, div.span-13, div.span-14, div.span-15, div.span-16, div.span-17, div.span-18, div.span-19, div.span-20, div.span-21, div.span-22, div.span-23, div.span-24 {float:left;margin-right:10px;}
.last, div.last {margin-right:0;}
.span-1 {width:30px;}
.span-2 {width:70px;}
.span-3 {width:110px;}
.span-4 {width:150px;}
.span-5 {width:190px;}
.span-6 {width:230px;}
.span-7 {width:270px;}
.span-8 {width:310px;}
.span-9 {width:350px;}
.span-10 {width:390px;}
.span-11 {width:430px;}
.span-12 {width:470px;}
.span-13 {width:510px;}
.span-14 {width:550px;}
.span-15 {width:590px;}
.span-16 {width:630px;}
.span-17 {width:670px;}
.span-18 {width:710px;}
.span-19 {width:750px;}
.span-20 {width:790px;}
.span-21 {width:830px;}
.span-22 {width:870px;}
.span-23 {width:910px;}
.span-24, div.span-24 {width:950px;margin:0;}
input.span-1, textarea.span-1, input.span-2, textarea.span-2, input.span-3, textarea.span-3, input.span-4, textarea.span-4, input.span-5, textarea.span-5, input.span-6, textarea.span-6, input.span-7, textarea.span-7, input.span-8, textarea.span-8, input.span-9, textarea.span-9, input.span-10, textarea.span-10, input.span-11, te...
cURL won't be able to get it directly, since it won't execute JavaScript. But if the content comes from an AJAX request, you can make a request to that endpoint directly.
Use your browser's dev tools (or Firebug) to see which requests the page makes.
A couple of things:
Where is your 'output' coming from? I see nothing in your code that displays it ...
I also think your preg_grep statement is incorrect. You're searching an empty array and saving the result to the variable you just pulled your data into. Try:
$array = preg_grep('/\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\:\d{1,5}\b/', $source1);
When I run the code and dump $source1['content'] directly after the get_data call, I get a crap-ton of IP addresses ...
It seems to me like either a timeout or a problem with your Regexp.
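One more detail: get_data() in the question sets CURLOPT_HEADER to true, so the response headers are prepended to the body, which is why the output starts with "HTTP/1.1 200 OK". Setting the option to false is the simplest fix. If the headers are needed as well, the raw response can be split using the header size that cURL reports through curl_getinfo($ch, CURLINFO_HEADER_SIZE), read before curl_close(). The helper below is a minimal sketch of that split, and its name is hypothetical.

```php
<?php
// Hypothetical helper: splits a raw cURL response (fetched with
// CURLOPT_HEADER enabled) into its header block and its body.
// $headerSize is the value of curl_getinfo($ch, CURLINFO_HEADER_SIZE).
function splitHeadersAndBody(string $raw, int $headerSize): array
{
    return [
        'headers' => substr($raw, 0, $headerSize),
        'body'    => substr($raw, $headerSize),
    ];
}
```

Only the 'body' part should then be searched for proxy addresses.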
Why not stick to file_get_contents like you tried in the first place?
$content = file_get_contents('http://www.new-fresh-proxies.blogspot.com.au');
preg_match_all('/(\d+\.\d+\.\d+\.\d+(:\d+)?)/', $content, $matches);
print_r($matches[1]);
This will print out a list of IPs:
Array
(
[0] => 1.204.168.15:6673
[1] => 1.234.45.130:80
[2] => 1.34.163.101:8080
[3] => 1.34.29.89:8080
[4] => 1.34.8.221:3128
....
Hope that helps.
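A pattern like \d+\.\d+\.\d+\.\d+ also matches strings that are not valid addresses, such as version numbers or octets above 255. If that matters, the matches can be filtered afterwards. The following is a minimal sketch under the assumption that only IPv4 address:port pairs are wanted. The function name extractProxies is hypothetical, while filter_var() with FILTER_VALIDATE_IP is part of PHP itself.

```php
<?php
// Hypothetical helper: extracts unique, plausible "ip:port" pairs from text.
function extractProxies(string $content): array
{
    preg_match_all(
        '/\b(\d{1,3}(?:\.\d{1,3}){3}):(\d{1,5})\b/',
        $content,
        $matches,
        PREG_SET_ORDER
    );

    $proxies = [];
    foreach ($matches as [, $ip, $port]) {
        // Reject octets above 255 and ports outside 1..65535.
        $validIp = filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false;
        $validPort = (int) $port >= 1 && (int) $port <= 65535;
        if ($validIp && $validPort) {
            $proxies[] = $ip . ':' . $port;
        }
    }

    // Drop duplicates and reindex the result.
    return array_values(array_unique($proxies));
}
```

Calling extractProxies($content) on the downloaded page returns a flat list that can be printed with print_r() as above.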