I need to parse a German site with simple_html_dom. I have a problem with German umlauts, which I assumed was because UTF-8 doesn't support umlauts. As far as I know, converting the text from UTF-8 to UTF-16 or ISO-8859-1 solves the problem. I use cURL to fetch the page content. The page's charset is ISO-8859-1. I tried setting CURLOPT_ENCODING to ISO-8859-1, but cURL always returns UTF-8 text. I don't know what to do.
Here is the code of the method:
public function testsec()
{
require_once DIR_SYSTEM.'library'.DIRECTORY_SEPARATOR.'simpleHtml'.DIRECTORY_SEPARATOR.'simple_html_dom.php';
$regexpSecond = "~Möglicherweise.*? Vielen Dank~su";
$headers = array(
"User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0",
"Accept: text/plain",
"Connection: keep-alive",
);
$fp = fopen(DIR_ADMIN.'logCurl.txt','w+');
$head = fopen(DIR_ADMIN.'headers.txt','w+');
$curl = curl_init("http://test.site.com/bla-bla-bla");
curl_setopt($curl, CURLOPT_RETURNTRANSFER,true);
curl_setopt($curl, CURLOPT_ENCODING , "UTF-16");
curl_setopt($curl, CURLOPT_VERBOSE, 1);
curl_setopt($curl, CURLOPT_STDERR, $fp);
curl_setopt($curl, CURLOPT_HEADER ,$headers);
curl_setopt($curl, CURLOPT_WRITEHEADER, $head);
$result = curl_exec($curl);
curl_close($curl);
fclose($fp);
fclose($head);
$html = str_get_html($result);
echo mb_detect_encoding($result); //utf-8
}
Response headers:
HTTP/1.1 200 OK
Date: Sun, 03 Jul 2016 05:22:34 GMT
Server: Apache
Set-Cookie: JTLSHOP=c1qv3vafghmf3ih43g5m96epi4; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: max-age=1, private, must-revalidate
Pragma: no-cache
Vary: Accept-Encoding
X-Powered-By: PleskLin
Connection: close
Transfer-Encoding: chunked
Content-Type: text/html; charset=iso-8859-1
UTF-8 does support umlauts.
http://www.periodni.com/unicode_utf-8_encoding.html#german_special_characters
If you want to convert the charset, use the iconv function.
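A minimal sketch of that conversion, assuming the body really arrives as ISO-8859-1 as the Content-Type header above says. Note that CURLOPT_ENCODING only controls the Accept-Encoding request header (compression such as gzip), so it does not convert character sets. The sample string is made up for illustration; in the question it would be $result from curl_exec().

```php
<?php
// Hypothetical sample: "Möglicherweise" encoded as ISO-8859-1 (ö is the single byte 0xF6).
$latin1 = "M\xF6glicherweise";

// Convert the ISO-8859-1 bytes to UTF-8 before parsing or matching with a /u regex.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

// mb_check_encoding() validates the bytes strictly instead of guessing.
$rawIsUtf8       = mb_check_encoding($latin1, 'UTF-8');
$convertedIsUtf8 = mb_check_encoding($utf8, 'UTF-8');

// The pattern is written with escaped bytes so it does not depend on the source file's encoding.
$matches = preg_match("~M\xC3\xB6glicherweise~u", $utf8);
```

After the conversion, the string can be passed to str_get_html() and matched with the /u modifier, which requires valid UTF-8 input.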
I have a problem logging in to a website with CURL and PHP.
I tested with the Firefox add-on HttpRequester and it worked.
Result of the login request:
POST https://www.balatarin.com/sessions
Content-Type: application/x-www-form-urlencoded
session[login]=testeruni&session[password]=123456789&session[remember_me]=1&commit=%D9%88%D8%B1%D9%88%D8%AF&utf8=%E2%9C%93&authenticity_token
-- response --
200 OK
Server: shield
Date: Thu, 19 Jan 2017 13:51:54 GMT
Content-Type: text/html; charset=utf-8
status: 200 OK
X-Frame-Options: SAMEORIGIN
X-XSS-Protection: 1; mode=block
x-ua-compatible: IE=Edge,chrome=1
Etag: W/"7418542e936fbdfe20002faf11876845"
Cache-Control: must-revalidate, private, max-age=0
Set-Cookie: _balat_session_new=BAh7C0kiDHVzZXJfaWQGOgZFRmkD964BSSIPc2Vzc2lvbl9pZAY7AEZJIiUzZGUxMzIyN2ZhZDVmMDUzOGE3OGY0YTRhZDkzNmUyMQY7AFRJIhZpbnB1dF9kZXZpY2VfdHlwZQY7AEZJIgpNT1VTRQY7AEZJIhRob3Zlcl9zdXBwb3J0ZWQGOwBGVEkiCmZsYXNoBjsARm86JUFjdGlvbkRpc3BhdGNoOjpGbGFzaDo6Rmxhc2hIYXNoCToKQHVzZWRvOghTZXQGOgpAaGFzaHsGOgtub3RpY2VUOgxAY2xvc2VkRjoNQGZsYXNoZXN7BjsKSSI22YbYrtiz2Kog2YXYtNiq2LHaqSDahtmG2K8g2KjYp9mE2KfahtmHINi02YjbjNivLgY7AFQ6CUBub3cwSSIQX2NzcmZfdG9rZW4GOwBGSSIxT3krNk5nM1NTM2IreXc4SUtxbW9yN2NmMXQrdUNLWWdubFRRYmpidmtNTT0GOwBG--2c2a72f8ec27564250ba084d97998aefba4af11a; path=/; secure; HttpOnly geo=0
X-Request-Id: 521288561d7cfff0ef8fe8d72080760c
X-Runtime: 0.188862
X-Rack-Cache: miss
Content-Encoding: gzip
Via: 1.1 google
Alt-Svc: clear
Expires: Thu, 19 Jan 2017 13:51:54 GMT
X-Firefox-Spdy: h2
But it does not log in with cURL in PHP. I tested all the headers in my cURL request, but it still does not log in; it only works with HttpRequester.
public function actionLoggin()
{
$url = 'https://www.balatarin.com/sessions';
$headers[] = 'Content-Type: application/x-www-form-urlencoded';
$headers[] = 'Host: www.balatarin.com';
$headers[] = 'User-Agent: Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:50.0) Gecko/20100101 Firefox/50.0';
$headers[] = 'Referer: https://www.balatarin.com/login';
$params = array(
'session[login]' => 'testeruni',
'session[password]' => '123456789',
'session[remember_me]' => '0',
'commit' => 'ورود',
'utf8' => '✓',
'authenticity_token' => '',
);
//open connection
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($params));
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_COOKIEJAR, 'bala_cookie.txt');
curl_setopt($ch, CURLOPT_COOKIEFILE, 'bala_cookie.txt');
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($ch);
curl_close($ch);
echo $result;
}
Here is my cookie file:
# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.
www.balatarin.com FALSE / FALSE 0 logged_in 1
#HttpOnly_www.balatarin.com FALSE / TRUE 0 _balat_session_new BAh7CToOcmV0dXJuX3RvMDoMdXNlcl9pZGkDj60BOhJsb2dpbl9yZXRyaWVzMEkiD3Nlc3Npb25faWQGOgZFRkkiJTgwN2ZmMDRjMGUzMzkyMDIyZWY5YzBmZTQxN2FmZWMzBjsIVA%3D%3D--d47dd61bc9900449cca69ebd727041c3946a13ba
www.balatarin.com FALSE / FALSE 0 geo 0
www.balatarin.com FALSE / FALSE 1516368886 corr b8ed93fa279a469a637b
I'm trying to embed Facebook's posts (e.g. video) using oEmbed format. According to Facebook documentation, oEmbed is now supported. I'm trying this PHP code:
$json_post = @file_get_contents('https://www.facebook.com/plugins/video/oembed.json/?url={MY VIDEO URL HERE}');
$oembed = json_decode($json_post);
var_dump($oembed);
I have already used the same code for Instagram with success, but now I'm getting a NULL result. oEmbed works fine if I enter the URL directly in the browser. Am I missing something?
Thanks.
Update
I tried with cURL:
$url='https://www.facebook.com/plugins/video/oembed.json/?url=https%3A%2F%2Fwww.facebook.com%2Ffacebook%2Fvideos%2F10153231379946729%2F';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_HEADER, TRUE);
//curl_setopt($ch, CURLOPT_NOBODY, TRUE); // remove body
curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
$page = curl_exec($ch);
$httpCode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
print_r($page);
curl_close($ch);
Now I get:
HTTP/1.1 302 Found
Location: https://www.facebook.com/unsupportedbrowser
access-control-allow-method: OPTIONS
Access-Control-Expose-Headers: X-FB-Debug, X-Loader-Length
Access-Control-Allow-Origin: https://www.facebook.com
Vary: Origin
Access-Control-Allow-Credentials: true
Content-Type: text/html
X-FB-Debug: gGcZzyllZadlcn/6jz2HqqouIcDnhTzxzR+etWXhZEnOcditfsaIUw0WjgO3nELHzveRCYa1UM86D3LA/nLnNw==
Date: Wed, 11 Jan 2017 10:18:47 GMT
Connection: keep-alive
Content-Length: 0
HTTP/1.1 200 OK
X-XSS-Protection: 0
public-key-pins-report-only: max-age=500; pin-sha256="WoiWRyIOVNa9ihaBciRSC7XHjliYS9VwUGOIud4PB18="; pin-sha256="r/mIkG3eEpVdm+u/ko/cwxzOMo1bk4TyHIlByibiA5E="; pin-sha256="q4PO2G2cbkZhZ82+JgmRUyGMoAeozA+BSXVXQWB8XWQ="; report-uri="http://reports.fb.com/hpkp/"
Pragma: no-cache
Cache-Control: private, no-cache, no-store, must-revalidate
Expires: Sat, 01 Jan 2000 00:00:00 GMT
X-Content-Type-Options: nosniff
Strict-Transport-Security: max-age=15552000; preload
X-Frame-Options: DENY
Vary: Accept-Encoding
Content-Type: text/html
X-FB-Debug: zwArox8KyM3BtwLymhiARCTltrrcE/pDqSWdqbHgstXVBEbIXG57Od2MfDnqgqSX5Tj43qoe8uYhphzwoZcXeg==
Date: Wed, 11 Jan 2017 10:18:48 GMT
Transfer-Encoding: chunked
Connection: keep-alive
Still waiting for a reply.
Thank You.
Set the user agent with cURL and try again:
$browser = 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.16 (KHTML, like Gecko) Chrome/24.0.1304.0 Safari/537.16';
curl_setopt($ch, CURLOPT_USERAGENT, $browser);
Here is the same approach with file_get_contents:
$options = array(
'http'=>array(
'method'=>"GET",
'header'=>"Accept-language: en\r\n" .
"Cookie: foo=bar\r\n" . // check function.stream-context-create on php.net
"User-Agent: Mozilla/5.0 (iPad; U; CPU OS 3_2 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Version/4.0.4 Mobile/7B334b Safari/531.21.10\r\n" // i.e. an iPad
)
);
$context = stream_context_create($options);
$json_post = @file_get_contents('https://www.facebook.com/plugins/video/oembed.json/?url=https%3A%2F%2Fwww.facebook.com%2Ffacebook%2Fvideos%2F10153231379946729%2F', false, $context);
$oembed = json_decode($json_post);
var_dump($oembed);
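As a side note on the NULL result: with errors suppressed, a failed file_get_contents() returns false, and json_decode() then yields NULL, which hides the real cause. The sketch below separates the two cases. The helper name decodeOembed and the sample payload are hypothetical, for illustration only.

```php
<?php
// Hypothetical helper: turn a raw oEmbed response into an array or fail loudly.
function decodeOembed($body): array
{
    if ($body === false) {
        // file_get_contents() returns false when the request itself fails.
        throw new RuntimeException('Request failed; no response body.');
    }
    $data = json_decode($body, true);
    if (!is_array($data)) {
        // json_decode() returns null for input that is not valid JSON (an HTML error page, for instance).
        throw new RuntimeException('Response is not a JSON object: ' . json_last_error_msg());
    }
    return $data;
}

// Made-up sample payload, standing in for the body returned by file_get_contents().
$oembed = decodeOembed('{"type":"video","width":500}');
```

In real use the argument would be the return value of file_get_contents($url, false, $context), called without the error-suppression operator so that warnings stay visible.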
In my first cURL request I upload a file and set a $_SESSION variable with the name, extension, etc. In my second cURL request I want to move the uploaded file from the tmp folder to the user folder, but unfortunately the $_SESSION variable is empty. Why?
The first request looks like this:
$upload = curl_init();
curl_setopt($upload, CURLOPT_URL, "http://localhost/upload/" );
curl_setopt($upload, CURLOPT_POST, true );
curl_setopt($upload, CURLOPT_RETURNTRANSFER, true );
curl_setopt($upload, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"] );
curl_setopt($upload, CURLOPT_HTTPHEADER, array("Set-Cookie: data=" . urldecode($cookie) ));
curl_custom_postfields($upload, $fields, $files);
$res = curl_exec($upload);
curl_close($upload);
And the second request, which follows the first:
$submit = curl_init();
curl_setopt($submit, CURLOPT_URL, "http://localhost/" );
curl_setopt($submit, CURLOPT_RETURNTRANSFER, true );
curl_setopt($submit, CURLOPT_USERAGENT, $_SERVER["HTTP_USER_AGENT"] );
curl_setopt($submit, CURLOPT_POST, count($fields));
curl_setopt($submit, CURLOPT_POSTFIELDS, $fields_string );
curl_setopt($submit, CURLOPT_HEADER, true);
curl_setopt($submit, CURLOPT_HTTPHEADER, array("Cookie: data=" . urldecode($cookie) ));
$res = curl_exec($submit);
curl_close($submit);
Is there any option to keep the session alive? I think it is the same problem I ran into with AJAX requests when I started using JavaScript.
My response headers:
HTTP/1.1 200 OK
Date: Wed, 04 Mar 2015 02:24:22 GMT
Server: Apache/2.2.12 (Win32) DAV/2 mod_ssl/2.2.12 OpenSSL/0.9.8k mod_autoindex_color PHP/5.3.0 mod_perl/2.0.4 Perl/v5.10.0
X-Powered-By: PHP/5.3.0
Set-Cookie: PHPSESSID=g7hc328ij8lr63mps6ub44gat2; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Content-Length: 22
Content-Type: text/html
I think you need a cookie jar to keep track of your session:
curl_setopt($ch, CURLOPT_COOKIEJAR, "/tmp/cookies");
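A sketch of how that could look for the two requests in the question, assuming both go to the same host. PHP ties $_SESSION to the PHPSESSID cookie that the server sends in its Set-Cookie response header, so the second request has to send that cookie back; Set-Cookie is a response header and has no effect when sent in a request. The helper name newSessionHandle and the cookie file path are hypothetical.

```php
<?php
// Hypothetical helper: build a handle whose cookies live in one shared file.
function newSessionHandle(string $url, string $cookieFile)
{
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_COOKIEFILE, $cookieFile); // send cookies saved by earlier requests
    curl_setopt($ch, CURLOPT_COOKIEJAR, $cookieFile);  // save cookies (e.g. PHPSESSID) when the handle is released
    return $ch;
}

// Assumed location for the cookie file; any writable path would do.
$cookieFile = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'curl_session_cookies.txt';

// First request: the PHPSESSID from the Set-Cookie response header lands in the jar.
$upload = newSessionHandle('http://localhost/upload/', $cookieFile);
// curl_exec($upload); curl_close($upload); unset($upload);

// Second request: the stored PHPSESSID is sent back, so the same session is loaded.
$submit = newSessionHandle('http://localhost/', $cookieFile);
// curl_exec($submit); curl_close($submit);
```

The curl_exec() calls are commented out so the sketch runs without a server. The jar is only written once the first handle is released, so that has to happen before the second request reads the file.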
I am trying to download a file from box.net using the API in PHP.
I wrote the code as per the documentation,
but the response contains some strange text.
Here's my code:
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "https://api.box.com/2.0/files/3934139624/content ");
curl_setopt($ch, CURLOPT_HEADER, false);
curl_setopt($ch, CURLOPT_HTTPGET,true);
curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, false);
curl_setopt($ch, CURLOPT_BINARYTRANSFER, true);
curl_setopt($ch, CURLOPT_RETURNTRANSFER,true);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, array("Authorization: BoxAuth api_key={MyApikey}&auth_token={Mytoken}"));
$result = curl_exec($ch);
die('DIE');
I am getting a response something like this:
PK![Content_Types].xml ... word/rels/document.xml.rels ... (binary data, truncated)
Can anyone tell me how I can handle this kind of response?
Thanks in advance.
As per the box.net API documentation:
The response to this request will simply be the complete data of the
file itself.
So all you need to do is save the file content locally.
In the response headers, check the Content-Type; right now it is XML.
$result = curl_exec($ch);
$fp = fopen('test.xml','wb');
fwrite($fp, $result);
fclose($fp);
@GBD the following comes in the response headers:
HTTP/1.1 302 Found
Server: nginx
Date: Wed, 14 Nov 2012 09:11:51 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
Cache-control: private
Location: https://dl.boxcloud.com/bc/1/85f471520cf611a05025a5f/JolueqOGpciD6dgYhecNBoVpYxkvmYe1ZLheZor6BF4DUBIelMQTkFwYIys3nIibNIIEHUp447tBZLaXDzIbNQ,,/a44510a2b21219463fade41d6b36dabf/
Content-Length: 0
HTTP/1.1 200 OK
Server: nginx
Date: Wed, 14 Nov 2012 09:11:52 GMT
Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Content-Length: 19944
Connection: keep-alive
Cache-control: private
Accept-Ranges: bytes
Content-Disposition: attachment;filename="cloud computing proposal.docx";filename*=UTF-8''cloud%20computing%20proposal.docx
X-Content-Type-Options: nosniff
Accept-Ranges: bytes
Also, the file saved as XML couldn't be opened.
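The final response above reports a .docx Content-Type and a Content-Disposition filename, so one option is to save the body under that name instead of as XML. The sketch below only shows pulling the filename out of a raw header block. The helper name filenameFromHeaders and the fallback name are hypothetical, and the header text is copied from the response above.

```php
<?php
// Hypothetical helper: extract the plain filename="..." value from a raw header block.
function filenameFromHeaders(string $rawHeaders, string $fallback = 'download.bin'): string
{
    if (preg_match('/^Content-Disposition:.*?filename="([^"]+)"/mi', $rawHeaders, $m)) {
        return basename($m[1]); // basename() drops any path components sent by the server
    }
    return $fallback;
}

// Header lines taken from the 200 response shown above.
$headers = "HTTP/1.1 200 OK\r\n"
    . "Content-Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document\r\n"
    . "Content-Disposition: attachment;filename=\"cloud computing proposal.docx\";filename*=UTF-8''cloud%20computing%20proposal.docx\r\n";

$name = filenameFromHeaders($headers);

// $result would be the body returned by curl_exec(); it is binary, so write it unchanged:
// file_put_contents($name, $result);
```

This assumes the headers are available as a string, for example by collecting them separately from the body.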
When I post XML content from one server to another, it is not getting added.
I'm using cURL to post the XML files to the other server, but I am getting the following response:
HTTP/1.1 200 OK
Date: Thu, 21 Jul 2011 08:13:02 GMT
Server: Apache/2.2.8 (Ubuntu) PHP/5.2.4-2ubuntu5.6 with Suhosin-Patch mod_ssl/2.2.8 OpenSSL/0.9.8g
X-Powered-By: PHP/5.2.4-2ubuntu5.6
Set-Cookie: PHPSESSID=6846cb7e65f6f6d6d87f163a681f0543; path=/
Expires: Thu, 19 Nov 1981 08:52:00 GMT
Cache-Control: no-store, no-cache, must-revalidate, post-check=0, pre-check=0
Pragma: no-cache
Content-Length: 5721
Content-Type: text/html; charset=UTF-8
This is my code:
$file_path= WWW_ROOT.$xmlfilename;
$xmldata = file_get_contents($file_path);
$request = 'http://www.sample.com/someaction';
$postargs = 'xml='.urlencode($xmldata).'&filename='.urlencode($xmlfilename);
// Get the curl session object
$session = curl_init($request);
// Set the POST options.
curl_setopt($session, CURLOPT_POST, true);
curl_setopt($session, CURLOPT_POSTFIELDS, $postargs);
curl_setopt($session, CURLOPT_HEADER, true);
curl_setopt($session, CURLOPT_RETURNTRANSFER, true);
// Do the POST and then close the session
$response = curl_exec($session);
print_r( $response);
Note: allow_url_fopen and curl are enabled in both servers.
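Try assigning it like this:
$postargs = array('xml' => $xmldata, 'filename' => $xmlfilename);
When CURLOPT_POSTFIELDS is given an array, cURL sends the request as multipart/form-data, so the values should not be passed through urlencode(). Both items should then appear in $_POST['xml'] and $_POST['filename'] on the receiving side (or the equivalent if it is not PHP).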
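If the request should stay application/x-www-form-urlencoded, as in the question's code, http_build_query() can build the encoded string from an array so that nothing is encoded by hand. This is only a sketch; the XML snippet and filename are made-up sample values.

```php
<?php
// Made-up sample values standing in for $xmldata and $xmlfilename.
$xmldata     = '<a>1</a>';
$xmlfilename = 'my file.xml';

// http_build_query() URL-encodes each value and joins the pairs with "&".
$postargs = http_build_query(array(
    'xml'      => $xmldata,
    'filename' => $xmlfilename,
));

// The string can then be passed to cURL unchanged:
// curl_setopt($session, CURLOPT_POSTFIELDS, $postargs);
```

Passing a string keeps the urlencoded content type, while passing an array switches cURL to multipart/form-data.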
EDIT
OK, you may need to look at streaming the XML file using CURLOPT_READFUNCTION.
See this for a bit of an example: http://zingaburga.com/2011/02/streaming-post-data-through-php-curl-using-curlopt_readfunction/