Parsing PHP with fopen() encoding issue

I'm having issues parsing a CSV file in PHP, using fopen() to read API data.
My code works with a URL that displays the CSV file in the browser, as in 1) below. But I get random characters in the output for a URL that ends in format=csv, as in 2) below.
1) Working URL: Returned expected values
https://www.kimonolabs.com/api/csv/duo2mkw2?apikey=yjEl780lSQ8IcVHkItiHzzUZxd1wqSJv
2) Not Working URL: Returns random characters
https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv
Here is my code, using URL (2) above:
<?php
$f_pointer=fopen("https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv","r");
while(! feof($f_pointer)){
$ar=fgetcsv($f_pointer);
echo $ar[1];
echo "<br>";
}
?>
Output: For URL mentioned in (2) above:
root#MorryServer:/# php testing.php
?IU?Q?JL?.?/Q?R??/)?J-.?))VH?/OM?K-NI?T0?P?*ͩT0204jzԴ?H???X???# D??K
Correct Output: If I use URL Type as stated in (1)
root#MorryServer:/# php testing.php
PHP Notice: Undefined offset: 1 in /testing.php on line 24
jackpot€2,893,210

This is an encoding problem, but with the HTTP content encoding (compression) rather than the character encoding.
The file itself contains UTF-8 characters, which fgetcsv() reads without trouble because it is binary safe. Line endings are Unix format ("\n").
The output on the terminal is scrambled. Looking at the response headers, we see:
GET https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv --> 200 OK
Connection: close
Date: Sat, 11 Jul 2015 13:15:24 GMT
Server: nginx/1.6.2
Content-Encoding: gzip
Content-Length: 123
Content-Type: text/csv; charset=UTF-8
Last-Modified: Fri, 10 Jul 2015 11:43:49 GMT
Client-Date: Sat, 11 Jul 2015 13:15:23 GMT
Client-Peer: 107.170.197.156:443
Client-Response-Num: 1
Client-SSL-Cert-Issuer: /C=GB/ST=Greater Manchester/L=Salford/O=COMODO CA Limited/CN=COMODO RSA Domain Validation Secure Server CA
Client-SSL-Cert-Subject: /OU=Domain Control Validated/OU=PositiveSSL/CN=www.parsehub.com
Note the Content-Encoding: gzip header: fgetcsv() reading from a plain URL evidently does not handle gzip encoding. The scrambled string is just the gzipped content of the "file".
Use PHP's zlib functions to decompress the data before parsing it.
Proof:
srv:~ # lwp-download 'https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv' data
123 bytes received
srv:~ # file data
data: gzip compressed data, was "tcW80-EcI6Oj2TYPXI-47XwK.csv", from Unix, last modified: Fri Jul 10 11:43:48 2015, max compression
srv:~ # gzip -d < data
"title","jackpot"
"Lotto Results for Wednesday 08 July 2015","€2,893,210"
To get the proper output, only a minimal change is needed. Just prefix the URL with the compress.zlib:// stream wrapper:
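<?php
// The compress.zlib:// wrapper decompresses the gzip stream while reading.
$f_pointer = fopen("compress.zlib://https://www.parsehub.com/api/v2/projects/tM9MwgKrh0c4b81WDT_4FkaC/last_ready_run/data?api_key=tD3djFMGmyWmDUdcgmBVFCd3&format=csv", "r");
if ($f_pointer === false) {
    die("invalid URL");
}
$ar = array();
// Loop on fgetcsv() itself so that a trailing false is never appended at EOF.
while (($row = fgetcsv($f_pointer)) !== false) {
    $ar[] = $row;
}
fclose($f_pointer);
print_r($ar);
?>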
Outputs:
Array
(
[0] => Array
(
[0] => title
[1] => jackpot
)
[1] => Array
(
[0] => Lotto Results for Wednesday 08 July 2015
[1] => €2,893,210
)
)
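If the response body has already been fetched some other way (for example with cURL), the same idea can be applied by hand. The following is only a minimal sketch: parse_maybe_gzipped_csv() is a hypothetical helper name, and the body is simulated locally with gzencode() instead of being downloaded. It checks for the gzip magic bytes, decompresses with gzdecode(), and then parses the text with fgetcsv() through a temporary stream.

```php
<?php
// Hypothetical helper: decompress a body if it is gzip data, then parse it as CSV.
function parse_maybe_gzipped_csv(string $body): array
{
    // gzip data always starts with the two magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        if ($decoded === false) {
            throw new RuntimeException('gzip decoding failed');
        }
        $body = $decoded;
    }

    // Put the text into a temporary stream so fgetcsv() can handle quoting.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $body);
    rewind($stream);

    $rows = [];
    while (($row = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
        // fgetcsv() reports a blank line as an array holding a single null.
        if ($row !== [null]) {
            $rows[] = $row;
        }
    }
    fclose($stream);

    return $rows;
}

// Simulated gzip-compressed response body (an assumption, not a real download).
$body = gzencode("\"title\",\"jackpot\"\n\"Lotto Results for Wednesday 08 July 2015\",\"€2,893,210\"\n");
$rows = parse_maybe_gzipped_csv($body);
print_r($rows);
```

Checking the magic bytes means the helper also accepts an uncompressed body, which is convenient when the server only compresses some responses.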

Related

How to get the matched records from graph facebook API?

I need to get the image URL and image name from a Facebook API response. I already have the response headers and have tried the following, but it does not give me what I need. Please help me get the location URL and the image name.
preg_match('/Location: (.*?)\n/', $header, $matches);
output:
HTTP/2 302
x-app-usage: {"call_count":16,"total_cputime":0,"total_time":4}
x-fb-rlafr: 0
location: https://xxxxx.net/v/cccc/cccc/130282202_3518020318246580_4104659942029629494_o.jpg?_nc_cat=104&ccb=2&_nc_sid=9e2e56&_nc_ohc=pErMyD3PYFkAX8b7JiO&_nc_ht=scontent-ort2-1.xx&tp=6&oh=db3843917c53f747c3c3f860ca9144d1&oe=6040C6ED
expires: Sat, 01 Jan 2000 00:00:00 GMT
x-fb-request-id: dddddd
strict-transport-security: max-age=15552000; preload
x-fb-trace-id: dddddd
facebook-api-version: v3.2
content-type: image/jpeg
x-fb-rev: 1003270116
cache-control: private, no-cache, no-store, must-revalidate
pragma: no-cache
access-control-allow-origin: *
x-fb-debug: cvvvvvvvvvvvvvvvvvvvvvvvvvvv
content-length: 0
date: Fri, 05 Feb 2021 06:41:05 GMT
alt-svc: h3-29=":443"; ma=3600,h3-27=":443"; ma=3600
$img_array[$key]['url'] = trim(substr($matches['0'],10)); // to get the location url
// print_r($img_array[$key]['url']);
$img_array[$key]['name'] = substr($b['name'],0,-16); // to get the image name

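The header name in this response is lowercase (location:), so the pattern has to use the same case:
preg_match('/location: (.*?)\n/', $header, $matches);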
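A slightly more tolerant variant, sketched here under a few assumptions, matches the header name case-insensitively and reads the URL from the capture group instead of cutting a fixed number of characters off the full match. extract_location() is a hypothetical helper name and the header text is a shortened stand-in for the real response.

```php
<?php
// Hypothetical helper: return the value of the Location header, or null if absent.
function extract_location(string $header): ?string
{
    // "m" lets ^ and $ match at line boundaries, "i" ignores the header name's case.
    if (preg_match('/^location:\s*(.+?)\s*$/mi', $header, $matches) === 1) {
        return $matches[1];
    }
    return null;
}

// Shortened stand-in for the response headers (example.com is a placeholder host).
$header = "HTTP/2 302\r\nx-fb-rlafr: 0\r\nlocation: https://example.com/v/photo_o.jpg?oh=abc\r\ncontent-length: 0\r\n";

$url = extract_location($header);
// The file name is the last segment of the URL path, without the query string.
$name = basename(parse_url($url, PHP_URL_PATH));

echo $url, PHP_EOL;
echo $name, PHP_EOL;
```

Using the capture group avoids depending on the exact length of the "location: " prefix.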

cURL HTTP request from WikiMedia API not working

I've read tons of cURL tutorials (I'm using PHP) and there's always the same basic code, which doesn't work for me! No specific errors, just no result.
I want to make an HTTP request to Wikipedia and get the result in JSON format.
Here's the code :
$handle = curl_init();
$url = "http://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json";
curl_setopt_array($handle,
array(
CURLOPT_URL => $url,
CURLOPT_RETURNTRANSFER => true
)
);
$output = curl_exec($handle);
if (!$output) {
exit('cURL Error: '.curl_error($handle));
}
$result= json_decode($output,true);
print_r($result);
curl_close($handle);
Would like to know what I'm doing wrong.

Your code is correct but it seems Wikipedia doesn't send back the data when using PHP curl (maybe some headers or other parameters must be set for it to work).
If all you need is to retrieve some data though, you can simply use file_get_contents which works fine:
$output = file_get_contents("http://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json");
echo $output;
Edit:
Just for information, I found what the issue is. When running curl -v on that URL, the following comes up:
* Trying 91.198.174.192...
* Connected to fr.wikipedia.org (91.198.174.192) port 80 (#0)
> GET /w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json HTTP/1.1
> Host: fr.wikipedia.org
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 301 Moved Permanently
< Date: Wed, 17 May 2017 13:54:31 GMT
< Server: Varnish
< X-Varnish: 852298595
< X-Cache: cp3031 int
< X-Cache-Status: int
< Set-Cookie: WMF-Last-Access=17-May-2017;Path=/;HttpOnly;secure;Expires=Sun, 18 Jun 2017 12:00:00 GMT
< Set-Cookie: WMF-Last-Access-Global=17-May-2017;Path=/;Domain=.wikipedia.org;HttpOnly;secure;Expires=Sun, 18 Jun 2017 12:00:00 GMT
< X-Client-IP: 86.214.172.57
< Location: https://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json
< Content-Length: 0
< Connection: keep-alive
<
* Connection #0 to host fr.wikipedia.org left intact
So what's happening is that the server answers the http URL with a 301 redirect and an empty body; the actual content is on the https URL. Requesting https://fr.wikipedia.org/w/api.php?action=query&titles=Albert%20Einstein&prop=info&format=json directly should therefore work.
It works with file_get_contents because that function follows the redirect automatically, whereas cURL does not follow redirects unless told to.
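The original cURL code can also be kept and simply told to follow redirects. This is a minimal sketch, not a definitive fix: build_curl_options() and fetch_json() are hypothetical helper names, and the User-Agent string is a placeholder (Wikimedia asks API clients to identify themselves). Note that with an empty redirect body curl_exec() returns an empty string, so the sketch tests for false explicitly rather than using !$output.

```php
<?php
// Hypothetical helper: cURL options that return the body and follow redirects.
function build_curl_options(string $url): array
{
    return [
        CURLOPT_URL            => $url,
        CURLOPT_RETURNTRANSFER => true,
        CURLOPT_FOLLOWLOCATION => true, // follow the 301 from http to https
        CURLOPT_MAXREDIRS      => 5,    // guard against redirect loops
        CURLOPT_USERAGENT      => 'ExampleScript/1.0 (placeholder contact)',
    ];
}

// Hypothetical helper: fetch a URL and decode its JSON body into an array.
function fetch_json(string $url): array
{
    $handle = curl_init();
    curl_setopt_array($handle, build_curl_options($url));

    $output = curl_exec($handle);
    if ($output === false) {
        // A transport-level failure; an empty body is not treated as an error here.
        throw new RuntimeException('cURL error: ' . curl_error($handle));
    }

    $result = json_decode($output, true);
    if (!is_array($result)) {
        throw new RuntimeException('Response was not valid JSON');
    }
    return $result;
}
```

Calling print_r(fetch_json($url)) with the http URL from the question should then print the decoded result, assuming the API still redirects in the same way.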

PHP: Need to find certain data from a string

$string = "Response 22: 404 (8345ms), headers: Accept-Ranges=bytes,
Cache-Control=no-cache, no-store, private, Connection=close,
Content-Encoding=gzip, Content-Language=it-it, Content-Length=1674,
Content-Location=index.html.it-it, Content-Type=text/html;
charset=utf-8, Date=Wed, 24 Sep 2014 19:01:30 GMT,
ETag='eb1-50331586750c0;503ac178f62dd', Last-Modified=Tue, 16 Sep 2014
16:35:55 GMT, Server=Apache,
Strict-Transport-Security=max-age=31536000; includeSubDomains,
TCN=choice, Vary=negotiate,accept,accept-language,Accept-Encoding,
X-Frame-Options=SAMEORIGIN, X-UA-Compatible=IE=Edge";
Here I want to grab the response number (=> 22), the response code (=> 404) and its duration in milliseconds (=> 8345ms).
I think I have to use a regex, but I am new to that. Can you please give any suggestions?

Response\s*(\d+):\s*(\d+)\s*\((\S+)?\)
Try this. It captures the three values as groups. See the demo:
http://regex101.com/r/qC9cH4/3
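In PHP the pattern above can be applied with preg_match(). The following is a small sketch; parse_response_line() is a hypothetical helper name and the input string is shortened from the one in the question.

```php
<?php
// Hypothetical helper: pull the response number, status code and duration out of a log line.
function parse_response_line(string $line): ?array
{
    $pattern = '/Response\s*(\d+):\s*(\d+)\s*\((\S+)?\)/';
    if (preg_match($pattern, $line, $m) !== 1) {
        return null; // the line does not have the expected shape
    }
    return [
        'number' => (int) $m[1], // e.g. 22
        'code'   => (int) $m[2], // e.g. 404
        'time'   => $m[3] ?? '', // e.g. "8345ms" (empty if the parentheses are empty)
    ];
}

$string = "Response 22: 404 (8345ms), headers: Accept-Ranges=bytes, Cache-Control=no-cache";
$parsed = parse_response_line($string);
print_r($parsed);
```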

extracting field from twitter feed using json_decode

I am trying to grab my twitter feed using the following code:
// Make the request and get the response into the $json variable
$json = $twitter->setGetfield($getfield)
->buildOauth($url, $requestMethod)
->performRequest();
// It's JSON, so decode it (json_decode returns an object unless true is passed as the second argument)
$result = json_decode($json);
// Access the created_at property of the decoded object
echo $result->created_at;
?>
I get the result of:
Thu Oct 25 18:40:50 +0000 2012
If I try to get the text with:
echo $result->text;
I get this error:
Notice: Undefined property: stdClass::$text in /Library/WebServer/Documents/include/twitter_noformat/items.php on line 35
A partial var_dump of my data format includes this:
{"created_at":"Thu Aug 01 16:12:18 +0000 2013",
"id":362969042497175553,
"id_str":"362969042497175553",
"text":"A warm welcome to our new international students from China, Hong Kong and Germany! http:\/\/t.co\/GLvt3GynJV",
"source":"web",
"truncated":false,"in_reply_to_status_id":null,"in_reply_to_status_id_str":null,"in_reply_to_user_id":null,"in_reply_to_user_id_str":null,"in_reply_to_screen_name":null,"user":
My question is:
created_at gives me a value. id gives me a value. Why doesn't text? I know nothing about JSON btw. I'm not a very advanced programmer, but the pattern looks the same to me.
Edit: Well I found a cool snippet that converted my twitter array to something more readable. The function goes like this:
// It's JSON, so decode it into an object
$result = json_decode($json);
// Recursively print every key and value, indented by nesting level
$pretty = function($v='',$c=" ",$in=-1,$k=null)use(&$pretty){$r='';if(in_array(gettype($v),array('object','array'))){$r.=($in!=-1?str_repeat($c,$in):'').(is_null($k)?'':"$k: ").'<br>';foreach($v as $sk=>$vl){$r.=$pretty($vl,$c,$in+1,$sk).'<br>';}}else{$r.=($in!=-1?str_repeat($c,$in):'').(is_null($k)?'':"$k: ").(is_null($v)?'<NULL>':"<strong>$v</strong>");}return$r;};
echo $pretty($result);
The results now look like this:
statuses_count: 583
lang: en
status:
created_at: Thu Aug 01 21:10:10 +0000 2013
id: 363044004444651522
id_str: 363044004444651522
text: #CalStateEastBay AD Sara Lillevand Judd '86 honored for her work as an athletic adminstrator. http://t.co/WzOqjIDrBw
This is strange because that makes text look like it's part of an object?
I have determined that twitter kicks back an array of objects. Those objects can have a lot of items(?) As I mentioned previously though I can echo $result->created_at; but not text. They are both at the same level of the array.
thanks in advance for your help,
Donovan

Alright here was my solution after a day of research:
$result = json_decode( $json );
echo "Text:" . $result->status->text . "<br />";
text is a property of the nested status object, not of the top-level object. I could echo created_at only because that key exists at two levels: once on the top-level object and once inside status. Here is the structure again:
created_at: Thu Oct 25 18:40:50 +0000 2012
favourites_count: 1
utc_offset: -25200
time_zone: Pacific Time (US & Canada)
geo_enabled: 1
verified:
statuses_count: 583
lang: en
status:
created_at: Thu Aug 01 21:10:10 +0000 2013
id: 363044004444651522
id_str: 363044004444651522
text: #CalStateEastBay AD Sara Lillevand Judd '86 honored for her work as an athletic adminstrator. http://t.co/WzOqjIDrBw
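The difference between the two levels can be reproduced with a small hand-written sample. This is only a sketch: the JSON below is a trimmed, made-up stand-in for the real API response, keeping just the keys discussed above.

```php
<?php
// Trimmed stand-in for the API response: created_at exists at both levels, text only inside status.
$json = '{"created_at":"Thu Oct 25 18:40:50 +0000 2012","statuses_count":583,'
      . '"status":{"created_at":"Thu Aug 01 21:10:10 +0000 2013","text":"Example tweet text"}}';

$result = json_decode($json);

// Top-level property: the outer object's created_at.
$accountCreated = $result->created_at;

// Nested properties: these belong to the status object.
$tweetCreated = $result->status->created_at;
$tweetText    = $result->status->text;

// There is no top-level text property; ?? avoids the "Undefined property" notice.
$topLevelText = $result->text ?? null;

echo "Account created: $accountCreated", PHP_EOL;
echo "Tweet created:   $tweetCreated", PHP_EOL;
echo "Tweet text:      $tweetText", PHP_EOL;
var_dump($topLevelText);
```

Dumping the decoded value with print_r($result) or var_dump($result) shows the nesting directly and is usually the quickest way to find where a field lives.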

List files in dropbox folder

I'm having problems with the Dropbox API.
When I try to get the metadata for my folder, I get output like this:
{"hash":"10f86b5b7c9c9276501f67a71ecd41c9","thumb_exists":false,"bytes":0,"path":"\/","is_dir":true,"size":"0 bytes","root":"app_folder","contents":[{"revision":7,"rev":"7069e0896","thumb_exists":true,"bytes":19749,"modified":"Tue, 20 Mar 2012 05:06:43 +0000","client_mtime":"Mon, 26 Sep 2011 11:50:43 +0000","path":"\/1_sml.jpg","is_dir":false,"icon":"page_white_picture","root":"dropbox","mime_type":"image\/jpeg","size":"19.3 KB"},{"revision":6,"rev":"6069e0896","thumb_exists":true,"bytes":15797,"modified":"Tue, 20 Mar 2012 05:06:43 +0000","client_mtime":"Mon, 26 Sep 2011 11:51:09 +0000","path":"\/2_sml.jpg","is_dir":false,"icon":"page_white_picture","root":"dropbox","mime_type":"image\/jpeg","size":"15.4 KB"},{"revision":5,"rev":"5069e0896","thumb_exists":true,"bytes":13349,"modified":"Tue, 20 Mar 2012 05:06:43 +0000","client_mtime":"Mon, 26 Sep 2011 11:51:26 +0000","path":"\/3_sml.jpg","is_dir":false,"icon":"page_white_picture","root":"dropbox","mime_type":"image\/jpeg","size":"13 KB"},{"revision":4,"rev":"4069e0896","thumb_exists":true,"bytes":8838,"modified":"Tue, 20 Mar 2012 05:06:43 +0000","client_mtime":"Mon, 26 Sep 2011 11:51:46 +0000","path":"\/4_sml.jpg","is_dir":false,"icon":"page_white_picture","root":"dropbox","mime_type":"image\/jpeg","size":"8.6 KB"},{"revision":3,"rev":"3069e0896","thumb_exists":true,"bytes":99646,"modified":"Tue, 20 Mar 2012 04:57:58 +0000","client_mtime":"Tue, 20 Sep 2011 14:14:26 +0000","path":"\/bg.jpg","is_dir":false,"icon":"page_white_picture","root":"dropbox","mime_type":"image\/jpeg","size":"97.3 KB"}],"icon":"folder"}
My problem is that I would like to output only the name of each image/file, but I can't find the right way to do it. I thought I could do it this way:
$info = json_encode($dropbox->getMetaData(''));
foreach($info->contents->path as $file){
echo $file;
}
But i get this error:
Warning: Invalid argument supplied for foreach() in /home/djrasmusp/rasmusp.com/db/index.php on line 16
Can anyone give me a helping hand with this?

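Try json_decode instead of json_encode (and maybe pass true as the second parameter to get an associative array instead of stdClass). Also note that contents is a list of entries and each entry has its own path, so loop over contents and read path from each item.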
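As a minimal sketch of that suggestion, assuming the metadata is available as a JSON string like the output shown in the question: list_file_names() is a hypothetical helper name, and the sample JSON is cut down to the keys that matter here.

```php
<?php
// Hypothetical helper: return the file names listed in a metadata JSON string.
function list_file_names(string $json): array
{
    // true as the second argument yields associative arrays instead of stdClass objects.
    $info = json_decode($json, true);
    if (!is_array($info) || !isset($info['contents']) || !is_array($info['contents'])) {
        return [];
    }

    $names = [];
    // contents is a list; each entry carries its own path.
    foreach ($info['contents'] as $entry) {
        if (empty($entry['is_dir'])) {
            $names[] = ltrim($entry['path'], '/'); // drop the leading slash
        }
    }
    return $names;
}

// Cut-down sample in the same shape as the metadata output.
$json = '{"path":"\/","is_dir":true,"contents":['
      . '{"path":"\/1_sml.jpg","is_dir":false},'
      . '{"path":"\/bg.jpg","is_dir":false}]}';

$names = list_file_names($json);
print_r($names);
```

If the library call already returns a decoded array or object rather than a JSON string, the json_decode step is unnecessary and the loop can run over its contents directly.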

I had the same problem, here is my solution:
$metaData = $dropbox->metaData($path);
foreach($metaData['body']->contents as $file){
$f = str_replace("/", "", $file->path);
echo "<li>$f</li>";
}

tested by me :)
require_once('bootstrap.php');
$metaData = $dropbox->metaData();
foreach($metaData['body']->contents as $file)
{
echo "<pre>";
echo $file->path;
}
