Following the example in the docs, I search my type like this:
$params = [
'index' => $this->index,
'type' => $this->type,
"scroll" => "30s",
"size" => 10,
'body' => $json
];
$response = $this->es->search($params);
Now $response includes 10 results and a _scroll_id. How can I use it to paginate my results? I am looking at this example:
https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/_search_operations.html#_scrolling
The docs then suggest:
while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
$scroll_id = $response['_scroll_id'];
$response = $client->scroll([
"scroll_id" => $scroll_id, //...using our previously obtained _scroll_id
"scroll" => "30s" // and the same timeout window
]
);
}
But it is not clear whether I have to issue a new request for each page, or whether the while loop is meant to store all the results in a variable and pass that to the template.
Is there a practical example?
Thanks
After the first request with the scroll parameter, Elasticsearch returns the normal payload (_shards, took, hits, etc.) plus a _scroll_id. hits contains the total number of matching documents, so you have both the items per page and the total item count.
To work out the number of pages, divide the total by the page size and round up.
To fetch the next batch of data from Elasticsearch, run this:
curl 'localhost:9200/_search/scroll' -d '{
"scroll" : "30s",
"scroll_id" : "cXVlcnlUaGVuRmV0Y2g7NTsxMzU0MDQzNjY6QnNHMjd0bXlTZ3Ftd1dkblRUd3NQZzsxMDE1NzI4Mjc6ajFzVmtLQUdSaEduRWFRVi1GZE05UTsxMDE1NzI4MjY6ajFzVmtLQUdSaEduRWFRVi1GZE05UTsxMDE1ODAzODc6TURuUG5nbzRUVU9NUUFjSERqM2hIQTsxMDE1ODAzODg6TURuUG5nbzRUVU9NUUFjSERqM2hIQTswOw=="
}'
With scroll you can only receive the next batch; you can't jump to a particular page.
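A minimal sketch of how the loop from the docs might be used to gather every hit into one array before handing it to a template. It assumes a client exposing search() and scroll() as in the question; the FakeClient class and its data are purely hypothetical stand-ins so the sketch runs without a cluster. Treating hits.total as a plain integer is also an assumption about the Elasticsearch version in use.

```php
<?php
// Hypothetical in-memory stand-in for the Elasticsearch client, for illustration only.
class FakeClient
{
    private array $docs;
    private int $offset = 0;
    private int $size = 10;

    public function __construct(array $docs)
    {
        $this->docs = $docs;
    }

    public function search(array $params): array
    {
        $this->size = $params['size'];
        $this->offset = 0;
        return $this->nextBatch();
    }

    public function scroll(array $params): array
    {
        return $this->nextBatch();
    }

    private function nextBatch(): array
    {
        $hits = array_slice($this->docs, $this->offset, $this->size);
        $this->offset += $this->size;
        return [
            '_scroll_id' => 'fake-scroll-id',
            'hits' => ['total' => count($this->docs), 'hits' => $hits],
        ];
    }
}

// Collect every hit by following the scroll until an empty batch comes back.
function collectAllHits($client, array $params): array
{
    $response = $client->search($params);
    $all = [];
    while (isset($response['hits']['hits']) && count($response['hits']['hits']) > 0) {
        $all = array_merge($all, $response['hits']['hits']);
        $response = $client->scroll([
            'scroll_id' => $response['_scroll_id'],
            'scroll' => $params['scroll'],
        ]);
    }
    return $all;
}

// The "simple math" for the page count: total hits divided by page size, rounded up.
function pageCount(int $total, int $size): int
{
    return (int) ceil($total / $size);
}

$client = new FakeClient(range(1, 25));
$hits = collectAllHits($client, ['scroll' => '30s', 'size' => 10]);
echo count($hits), " hits in ", pageCount(25, 10), " pages\n";
```

Each pass through the loop is a new request to Elasticsearch, so collecting everything this way only makes sense when the full result set fits comfortably in memory.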
I want to iterate over the objects in a bucket. I REALLY need to paginate this - we have 100's of thousands of objects in the bucket. Our bucket looks like:
bucket/MLS ID/file 1
bucket/MLS ID/file 2
bucket/MLS ID/file 3
... etc
Simplest version of my code follows. I know the value I'm setting into $params['nextToken'] is wrong, I can't figure out how or where to get the right one. $file_objects is a 'Google\Cloud\Storage\ObjectIterator', right?
// temp: pages of 10, out of a total of 100. I really want pages of 100
// out of all (in my test bucket, I have about 700 objects)
$params = [
'prefix' => $mls_id,
'maxResults' => 10,
'resultLimit' => 100,
'fields' => 'items/id,items/name,items/updated,nextPageToken',
'pageToken' => NULL
];
while ( $file_objects = $bucket->objects($params) )
{
foreach ( $file_objects as $object )
{
print "NAME: {$object->name()}\n";
}
// I think that this might need to be encoded somehow?
// or how do I get the requested nextPageToken???
$params['pageToken'] = $file_objects->nextResultToken();
}
So - I don't understand maxResults vs resultLimit. It would seem that resultLimit would be the total that I want to see from my bucket, and maxResults the size of my page. But maxResults doesn't seem to affect anything, while resultLimit does.
maxResults = 100
resultLimit = 10
produces 10 objects.
maxResults = 10
resultLimit = 100
spits out 100 objects.
maxResults = 10
resultLimit = 0
dumps out all 702 in the bucket, with maxResults having no effect at all. And at no point does "$file_objects->nextResultToken();" give me anything.
What am I missing?
The objects method automatically handles pagination for you. It returns an ObjectIterator object.
The resultLimit parameter limits the total number of objects to return across all pages. The maxResults parameter sets the maximum number to return per page.
If you use a foreach over the ObjectIterator object, it'll iterate through all objects, but note that there are also other methods in ObjectIterator, like iterateByPage.
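To make the distinction concrete, here is a small simulation. It does not call the Google Cloud library: the function simulatePages and its arguments are hypothetical and only mimic the semantics described above, where maxResults caps each page and resultLimit caps the total. Using null to mean "no total limit" is an assumption of this sketch.

```php
<?php
// Hypothetical simulation of paged listing; this is NOT the Google Cloud Storage API.
function simulatePages(array $items, int $maxResults, ?int $resultLimit = null): Generator
{
    if ($resultLimit !== null) {
        $items = array_slice($items, 0, $resultLimit); // total cap across all pages
    }
    foreach (array_chunk($items, $maxResults) as $page) {
        yield $page; // each page holds at most $maxResults items
    }
}

// Made-up object names standing in for the bucket contents.
$names = array_map(fn($i) => "file $i", range(1, 702));

$pages = iterator_to_array(simulatePages($names, 10, 100), false);
echo count($pages), " pages, ", array_sum(array_map('count', $pages)), " objects\n";
```

Iterating item by item hides the page boundaries entirely, which would be consistent with maxResults appearing to have no effect in a plain foreach.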
Ok, I think I got it. I found the documentation far too sparse and misleading. The code I came up with:
$params = [
'prefix' => <my prefix here>,
'maxResults' => 100,
//'resultLimit' => 0,
'fields' => 'items/id,items/name,items/updated,nextPageToken',
'pageToken' => NULL
];
// Note: setting 'resultLimit' to 0 does not work, I found the
// docs misleading. If you want all results, don't set it at all
// Get the first set of objects per those parameters
$object_iterator = $bucket->objects($params);
// in order to get the next_result_token, I had to get the current
// object first. If you don't, nextResultToken() always returns
// NULL
$current = $object_iterator->current();
$next_result_token = $object_iterator->nextResultToken();
while (true)
{
$object_page_iterator = $object_iterator->iterateByPage();
foreach ($object_page_iterator->current() as $file_object )
{
print " -- {$file_object->name()}\n";
}
// stop only after the last page has been printed; testing the token
// in the while condition would skip the final page
if (!$next_result_token)
{
break;
}
// here is where you use the page token retrieved earlier - get
// a new set of objects
$params['pageToken'] = $next_result_token;
$object_iterator = $bucket->objects($params);
// Once again, get the current object before trying to get the
// next result token
$current = $object_iterator->current();
$next_result_token = $object_iterator->nextResultToken();
print "NEXT RESULT TOKEN: {$next_result_token}\n";
}
This seems to work for me, so now I can get to the actual problem. Hope this helps someone.
I'm trying to use the Google Sheets API (in PHP using google/apiclient and OAuth 2.0) to get data from a spreadsheet.
This spreadsheet has defined some "Filter views" (defined in Data > Filter views) where for example one of the filter display only rows where the Price column is greater than x.
I'm trying to find a method to get already filtered data using one of the existing Filter views, but I can't find it. Is there any way to do it?
I'm not sure what your current script looks like, but at the moment there is unfortunately no method for directly retrieving the values filtered by a filter view. A workaround is required, and its flow is as follows.
Flow of this workaround:
1. Retrieve the settings of the filter view (filterViews) you want to use. The "spreadsheets.get" method can be used for this.
2. Create a new basic filter on the sheet you want to use, from the retrieved settings of the filter view. The "spreadsheets.batchUpdate" method can be used for this.
3. Retrieve the rowMetadata values of the sheet. The "spreadsheets.get" method can be used for this. In rowMetadata, the rows hidden by the filter have the property "hiddenByFilter": true, so you can tell the hidden rows from the showing rows.
4. Delete the created basic filter.
With the above flow, the filtered values can be retrieved.
Sample script:
If you can already read and write Google Spreadsheet values using the Sheets API with googleapis for PHP, you can use the following script as a sample of the workaround above.
$service = new Google_Service_Sheets($client); // Please use $client from your script.
$spreadSheetId = '###'; // Please set the Spreadsheet ID.
$sheetName = 'Sheet1'; // Please set the sheet name.
$filterViewName = 'sampleFilter1'; // Please set the filter view name.
// 1. Retrieve the settings of the filter view (`filterViews`) you want to use.
$sheets = $service->spreadsheets->get($spreadSheetId, ["ranges" => [$sheetName], "fields" => "sheets"])->getSheets();
$sheetId = $sheets[0]->getProperties()->getSheetId();
$filterViews = $sheets[0]->getFilterViews();
$filterView = array();
foreach ($filterViews as $i => $f) {
if ($f->getTitle() == $filterViewName) {
array_push($filterView, $f);
};
};
if (count($filterView) == 0) return;
// 2. Create new basic filter to the sheet you want to use using the retrieved settings of the filter view.
$obj = $filterView[0];
$obj['range']['sheetId'] = $sheetId;
$requests = [
new Google_Service_Sheets_Request(['clearBasicFilter' => ['sheetId' => $sheetId]]),
new Google_Service_Sheets_Request([
'setBasicFilter' => [
'filter' => [
'criteria' => $obj['criteria'],
'filterSpecs' => $obj['filterSpecs'],
'range' => $obj['range'],
'sortSpecs' => $obj['sortSpecs'],
]
]
])
];
$batchUpdateRequest = new Google_Service_Sheets_BatchUpdateSpreadsheetRequest(['requests' => $requests]);
$service->spreadsheets->batchUpdate($spreadSheetId, $batchUpdateRequest);
// 3. Retrieve the values of `rowMetadata` of the sheet.
$sheets = $service->spreadsheets->get($spreadSheetId, ["ranges" => [$sheetName], "fields" => "sheets"])->getSheets();
$rowMetadata = $sheets[0]->getData()[0]->getRowMetadata();
$filteredRows = array(
'hiddenRows' => array(),
'showingRows' => array()
);
foreach ($rowMetadata as $i => $r) {
if (isset($r['hiddenByFilter']) && $r['hiddenByFilter'] === true) {
array_push($filteredRows['hiddenRows'], $i + 1);
} else {
array_push($filteredRows['showingRows'], $i + 1);
};
};
// 4. Delete the created basic filter.
$requests = [new Google_Service_Sheets_Request(['clearBasicFilter' => ['sheetId' => $sheetId]])];
$batchUpdateRequest = new Google_Service_Sheets_BatchUpdateSpreadsheetRequest(['requests' => $requests]);
$service->spreadsheets->batchUpdate($spreadSheetId, $batchUpdateRequest);
print(json_encode($filteredRows)); // print() cannot output an array directly
Result:
When the above script is run against the following sample Spreadsheet:
[Image: the sheet before the filter view is set]
[Image: the sheet after the filter view is set]
Result value
From the above Spreadsheet, the following result is obtained.
{
"hiddenRows": [2, 3, 5, 6, 8, 9],
"showingRows": [1, 4, 7, 10, 11, 12, 13, 14, 15]
}
hiddenRows contains the row numbers hidden by the filter.
showingRows contains the row numbers that are still showing.
Note:
IMPORTANT: If the sheet already uses a basic filter, this sample script clears it. Please be careful about this, and use a sample Spreadsheet when you test the script.
In this sample script, the hidden rows and showing rows are retrieved. Using these values, you can retrieve the filtered values.
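As a rough sketch of that last step, the showing row numbers can be used to pick rows out of the sheet's values. The function name pickShowingRows and the sample data are hypothetical; $values is assumed to be a 0-indexed array of rows starting at sheet row 1 (for example, fetched separately through the Sheets API), while the row numbers are 1-based as in the result above.

```php
<?php
// Keep only the rows whose 1-based row number appears in $showingRows.
// $values is assumed to be a 0-indexed array of rows starting at sheet row 1.
function pickShowingRows(array $values, array $showingRows): array
{
    $picked = [];
    foreach ($showingRows as $rowNumber) {
        $index = $rowNumber - 1; // convert the 1-based row number to a 0-based index
        if (array_key_exists($index, $values)) {
            $picked[] = $values[$index];
        }
    }
    return $picked;
}

// Hypothetical sample data standing in for the sheet's values.
$values = [
    ['Item', 'Price'],
    ['a', 10],
    ['b', 20],
    ['c', 30],
];
$filtered = pickShowingRows($values, [1, 4]);
print(json_encode($filtered) . "\n");
```

The existence check is there because the values array may be shorter than the row metadata, for instance when trailing rows are empty.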
References:
Method: spreadsheets.get
Method: spreadsheets.batchUpdate
Related thread.
Is there a way to check if a row is hidden by a filter view in Google Sheets using Apps Script?
This thread is for Google Apps Script. Unfortunately, I couldn't find a PHP script for this, so I added the sample script.
I have a field called "arrivalDate" and this field is a string. Each document has an arrivalDate in string format (ex: 20110128). I want my output to be something like this (date and the number of records that have that date):
Date : how many records have that date
20110105 : 5 records
20120501 : 2 records
20120602 : 15 records
I already have the query to get these results, and I am trying to display the aggregated results in PHP from Elasticsearch.
This is what I have so far:
$json = '{"aggs": { "group_by_date": { "terms": { "field": "arrivalDate" } } } }';
$params = [
'index' => 'pickups',
'type' => 'external',
'body' => $json
];
$results = $es->search($params);
However, I don't know how to display the results in PHP. For example, to display the total number of documents I would do echo $results['hits']['total']. How can I display all the dates along with the number of records each one has?
I'd suggest using aggregations in the same way you construct the query; in my experience it seems to work quicker. Please see the code below:
'aggs' => [
'group_by_date' => [
'terms' => [
'field' => 'arrivalDate',
'size' => 500
]
]
]
Following that, instead of the usual $results['hits']['hits'], read from $results['aggregations'], and access the returned data through the buckets of the named aggregation.
For the aggregation shown above, it would be something along the lines of:
foreach ($results['aggregations']['group_by_date']['buckets'] as $bucket) {
// each bucket holds the term (key) and the number of documents with it (doc_count)
echo $bucket['key'] . ' : ' . $bucket['doc_count'] . " records\n";
}
There will be a better way of accessing the array within the array, however, the above loop system works well for me. If you have any issues with accessing the data, let me know.
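For reference, a small sketch of what the relevant part of the response looks like and how it could be turned into the "date : N records" lines from the question. The helper formatBuckets and the sample array are hypothetical; the array is hand-written to resemble a terms aggregation response, not real data.

```php
<?php
// Turn the buckets of a terms aggregation into "key : N records" lines.
function formatBuckets(array $results, string $aggName): array
{
    $lines = [];
    // Fall back to an empty list if the aggregation is missing from the response.
    $buckets = $results['aggregations'][$aggName]['buckets'] ?? [];
    foreach ($buckets as $bucket) {
        $lines[] = $bucket['key'] . ' : ' . $bucket['doc_count'] . ' records';
    }
    return $lines;
}

// Hand-written sample shaped like a terms aggregation response (not real data).
$results = [
    'hits' => ['total' => 22, 'hits' => []],
    'aggregations' => [
        'group_by_date' => [
            'buckets' => [
                ['key' => '20120602', 'doc_count' => 15],
                ['key' => '20110105', 'doc_count' => 5],
                ['key' => '20120501', 'doc_count' => 2],
            ],
        ],
    ],
];

echo implode("\n", formatBuckets($results, 'group_by_date')), "\n";
```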
I have written an application that searches tweets for a specific keyword using the Twitter API, but I am unable to find a way to sort the tweets in the response so that the latest ones are displayed first.
I am referring to https://dev.twitter.com/docs/api/1.1/get/search/tweets and my code is below.
I have included all necessary files and set all required parameters.
function search(array $query)
{
$toa = new TwitterOAuth(CONSUMER_KEY, CONSUMER_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET);
return $toa->get('search/tweets', $query);
}
$query = array(
"q" => "Sachin Tendulkar",
"count" => 10,
"result_type" => "popular"
);
$results = search($query);
Any help on this would be appreciated. Thanks
To display the latest tweets, set result_type to recent.
$query = array(
"q" => "Sachin Tendulkar",
"count" => 10,
"result_type" => "recent"
);
More about the result_type parameter:
mixed: include both popular and real-time results in the response.
recent: return only the most recent results in the response.
popular: return only the most popular results in the response.
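If the order still needs to be enforced on the client side, the returned tweets could also be sorted by their creation time. This is only a sketch: it assumes each status is an object with a created_at string in Twitter's usual format, and the helper names and sample statuses are hypothetical.

```php
<?php
// Parse Twitter's created_at format, e.g. "Wed Aug 27 13:08:45 +0000 2008".
function tweetTimestamp(string $createdAt): int
{
    $date = DateTime::createFromFormat('D M d H:i:s O Y', $createdAt);
    return $date === false ? 0 : $date->getTimestamp();
}

// Sort statuses newest first; each status is assumed to be an object with created_at.
function sortNewestFirst(array $statuses): array
{
    usort(
        $statuses,
        fn($a, $b) => tweetTimestamp($b->created_at) <=> tweetTimestamp($a->created_at)
    );
    return $statuses;
}

// Hypothetical sample statuses, not real API output.
$statuses = [
    (object) ['text' => 'older', 'created_at' => 'Mon Sep 16 20:20:19 +0000 2013'],
    (object) ['text' => 'newer', 'created_at' => 'Tue Sep 17 08:00:00 +0000 2013'],
];
foreach (sortNewestFirst($statuses) as $status) {
    echo $status->text, "\n";
}
```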
I'm creating some analytics script using PHP and MongoDB and I am a bit stuck. I would like to get the unique number of visitors per day within a certain time frame.
{
"_id": ObjectId("523768039b7e7a1505000000"),
"ipAddress": "127.0.0.1",
"pageId": ObjectId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId": "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime": ISODate("2013-09-16T20:20:19.0Z")
}
The fields to filter on are uniqueVisitorId and recordedTime.
I've created a database object in PHP that opens a database connection when it is constructed; the MongoDB PHP functions are then simply mapped to public functions that use that connection.
Anyhow, so far I get the number of visitors per day with:
public function GetUniqueVisitorsDiagram() {
// MAP
$map = new MongoCode('function() {
day = new Date(Date.UTC(this.recordedTime.getFullYear(), this.recordedTime.getMonth(), this.recordedTime.getDate()));
emit({day: day, uniqueVisitorId:this.uniqueVisitorId},{count:1});
}');
// REDUCE
$reduce = new MongoCode("function(key, values) {
var count = 0;
values.forEach(function(v) {
count += v['count'];
});
return {count: count};
}");
// STATS
$stats = $this->database->Command(array(
'mapreduce' => 'statistics',
'map' => $map,
'reduce' => $reduce,
"query" => array(
"recordedTime" =>
array(
'$gte' => $this->startDate,
'$lte' => $this->endDate
)
),
"out" => array(
"inline" => 1
)
));
return $stats;
}
How would I filter this data correctly to get unique visitors? Or would it be better to use aggregation? If so, could you be so kind as to help me out with a code snippet?
The $group operator in the aggregation framework was designed for exactly this use case and will likely be ~10 to 100 times faster. Read up on the group operator here: http://docs.mongodb.org/manual/reference/aggregation/group/
And the php driver implementation here: http://php.net/manual/en/mongocollection.aggregate.php
You can combine the $group operator with other operators to further limit your aggregations. It's probably best you do some reading up on the framework yourself to better understand what's happening, so I'm not going to post a complete example for you.
$m=new MongoClient();
$db=$m->super_test;
$db->gjgjgjg->insert(array(
"ipAddress" => "127.0.0.1",
"pageId" => new MongoId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId" => "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime" => new MongoDate(strtotime("2013-09-16T20:20:19.0Z"))
));
var_dump($db->gjgjgjg->find(array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week')))))->count()); // Prints 1
$res=$db->gjgjgjg->aggregate(array(
array('$match'=>array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week'))),'uniqueVisitorId'=>array('$ne'=>null))),
array('$project'=>array('day'=>array('$dayOfMonth'=>'$recordedTime'),'month'=>array('$month'=>'$recordedTime'),'year'=>array('$year'=>'$recordedTime'))),
array('$group'=>array('_id'=>array('day'=>'$day','month'=>'$month','year'=>'$year'), 'c'=>array('$sum'=>1)))
));
var_dump($res['result']);
To answer the question entirely:
$m=new MongoClient();
$db=$m->super_test;
$db->gjgjgjg->insert(array(
"ipAddress" => "127.0.0.1",
"pageId" => new MongoId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId" => "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime" => new MongoDate(strtotime("2013-09-16T20:20:19.0Z"))
));
var_dump($db->gjgjgjg->find(array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week')))))->count()); // Prints 1
$res=$db->gjgjgjg->aggregate(array(
array('$match'=>array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week'))),'uniqueVisitorId'=>array('$ne'=>null))),
array('$project'=>array('day'=>array('$dayOfMonth'=>'$recordedTime'),'month'=>array('$month'=>'$recordedTime'),'year'=>array('$year'=>'$recordedTime'),'uniqueVisitorId'=>1)), // keep uniqueVisitorId so the first $group can use it
array('$group'=>array('_id'=>array('day'=>'$day','month'=>'$month','year'=>'$year','v'=>'$uniqueVisitorId'), 'c'=>array('$sum'=>1))),
array('$group'=>array('_id'=>array('day'=>'$_id.day','month'=>'$_id.month','year'=>'$_id.year'),'c'=>array('$sum'=>1)))
));
var_dump($res['result']);
Something close to that is what you're looking for, I believe.
It will return a set of documents that have the date as the _id, together with the count of unique visitors for that day irrespective of the ID itself; it only detects whether the ID is there.
Since you want it per day, I reckon you can actually replace the date parts with a single $dayOfYear field.
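A sketch of what that $dayOfYear variant might look like, built as plain arrays so it can be inspected without a database. The function name is hypothetical, and $start and $end stand for whatever date objects the driver expects (MongoDate in the code above). The year is kept next to the day of year here so that ranges spanning a new year do not merge; that is a choice made for this sketch, not something the answer above states.

```php
<?php
// Build the pipeline as plain arrays; $start and $end are whatever date objects
// the driver expects (MongoDate in the answer above).
function uniqueVisitorsPerDayPipeline($start, $end): array
{
    return [
        ['$match' => [
            'recordedTime' => ['$gte' => $start, '$lte' => $end],
            'uniqueVisitorId' => ['$ne' => null],
        ]],
        // Reduce each visit to its year, day of year and visitor ID.
        ['$project' => [
            'year' => ['$year' => '$recordedTime'],
            'day' => ['$dayOfYear' => '$recordedTime'],
            'uniqueVisitorId' => 1,
        ]],
        // First group: one document per (year, day, visitor).
        ['$group' => [
            '_id' => ['year' => '$year', 'day' => '$day', 'v' => '$uniqueVisitorId'],
        ]],
        // Second group: count the distinct visitors for each (year, day).
        ['$group' => [
            '_id' => ['year' => '$_id.year', 'day' => '$_id.day'],
            'c' => ['$sum' => 1],
        ]],
    ];
}

// Placeholder strings stand in for real date objects in this sketch.
$pipeline = uniqueVisitorsPerDayPipeline('START', 'END');
echo json_encode($pipeline, JSON_PRETTY_PRINT), "\n";
```

The resulting array would be passed to aggregate() in place of the inline pipeline.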