How to limit the number of items I fetch from DynamoDB with PartiQL? - php

I have this query which I run in PHP:
$result = $client->executeStatement([
    'Limit' => 1,
    'Statement' => "SELECT * FROM transactions WHERE completed = 0",
]);
I have tried the query function as well, but its Limit option is likewise not an actual result limit:
$result = $db->query(array(
    'TableName' => 'transactions',
    'IndexName' => 'completed-index',
    'Count' => 1,        // note: Count and ScannedCount are response fields,
    'ScannedCount' => 1, // not request parameters
    'Limit' => 1,
    'KeyConditions' => array(
        'completed' => array(
            'AttributeValueList' => array(
                array('N' => '1')
            ),
            'ComparisonOperator' => 'EQ'
        ),
    ),
));
According to their documentation, Limit doesn't necessarily mean the number of matching items:
The maximum number of items to evaluate (not necessarily the number of matching items). If DynamoDB processes the number of items up to the limit while processing the results, it stops the operation and returns the matching values up to that point.
Can anyone tell me if there is an actual way to limit the number of rows returned, just like we do in SQL databases?

You can only limit how much data is read from disk (a pre-filter), not how much is returned (a post-filter).
DynamoDB never lets you request unbounded work. If DynamoDB allowed you to ask for just 1 row with a filter condition that never matched anything, it could potentially have to read the full database trying to find that 1 row. Behavior like that is what causes relational databases to have issues at scale.
Now, if you're not specifying a filter condition and your query is fully indexed, the amount read will match the amount returned, so the limit should act pretty much like you'd want.
Otherwise you might have to make repeated calls to page through the results until you have as many rows as you want, as in the sketch below.
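Here is a minimal paging sketch in PHP, assuming a DynamoDbClient $client and the transactions table / completed-index from the question (the key value is illustrative). It keeps issuing Query calls, feeding LastEvaluatedKey back in as ExclusiveStartKey, until it has collected the desired number of items or runs out of data:
$wanted = 10;
$items = array();
$lastKey = null;

do {
    $params = array(
        'TableName' => 'transactions',
        'IndexName' => 'completed-index',
        'KeyConditions' => array(
            'completed' => array(
                'AttributeValueList' => array(array('N' => '0')),
                'ComparisonOperator' => 'EQ'
            )
        ),
        // Limit caps items *read* per call, not items returned overall
        'Limit' => $wanted - count($items)
    );
    if ($lastKey !== null) {
        $params['ExclusiveStartKey'] = $lastKey;
    }
    $result = $client->query($params);
    $items = array_merge($items, $result['Items']);
    $lastKey = isset($result['LastEvaluatedKey']) ? $result['LastEvaluatedKey'] : null;
} while (count($items) < $wanted && $lastKey !== null);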

Related

Remove data escaping for a specific field in CakePHP

I am using a subquery for the id field.
$db = $this->AccountRequest->getDataSource();
$subQuery = $db->buildStatement(
    array(
        'fields' => array('MAX(id)'),
        'table' => $db->fullTableName($this->AccountRequest),
        'alias' => 'MaxRecord',
        'limit' => null,
        'offset' => null,
        'order' => null,
        'group' => array("user_id")
    ),
    $this->AccountRequest
);
$searching_parameters = array(
    #"AccountRequest.id IN " => "(SELECT MAX( id ) FROM `account_requests` GROUP BY user_id)"
    "AccountRequest.id IN " => "(" . $subQuery . ")"
);
$this->Paginator->settings = array(
    #'fields' => array('AccountRequest.*'),
    'conditions' => $searching_parameters,
    'limit' => $limit,
    'page' => $page_number,
    #'group' => array("AccountRequest.user_id"),
    'order' => array(
        'AccountRequest.id' => 'DESC'
    )
);
$data = $this->Paginator->paginate('AccountRequest');
This structure produces the following query:
SELECT
`AccountRequest`.`id`,
`AccountRequest`.`user_id`,
`AccountRequest`.`email`,
`AccountRequest`.`emailchange`,
`AccountRequest`.`email_previously_changed`,
`AccountRequest`.`first_name`,
`AccountRequest`.`first_namechange`,
`AccountRequest`.`f_name_previously_changed`,
`AccountRequest`.`last_name`,
`AccountRequest`.`last_namechange`,
`AccountRequest`.`l_name_previously_changed`,
`AccountRequest`.`reason`,
`AccountRequest`.`status`,
`AccountRequest`.`created`,
`AccountRequest`.`modified`
FROM
`syonserv_meetauto`.`account_requests` AS `AccountRequest`
WHERE
`AccountRequest`.`id` IN '(SELECT MAX(id) FROM `syonserv_meetauto`.`account_requests` AS `MaxRecord` WHERE 1 = 1 GROUP BY user_id)'
ORDER BY
`AccountRequest`.`id` DESC
LIMIT 25
In the subquery, it adds extra single quotes, so the query produces an error.
How can I remove these single quotes from the subquery?
Thanks
What are you trying to achieve with the subquery?
The MAX(id) just means it will pull the id with the largest value, i.e. the most recent insert. The subquery is completely redundant when you can just ORDER BY id DESC.
Using MAX() will return only one record; if this is what you want to achieve, you can replicate it by adding LIMIT 1.
If the subquery is just an example and is meant to be on another table, I would just run the query that gets the most recent id before running the main query, as sketched below. Getting the last inserted id in a separate query is very quick and I can't see much of a performance loss. I think it will result in cleaner code that's easier to follow, too.
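For instance, a hypothetical two-step version in CakePHP (model and field names taken from the question) might look like:
// Step 1: fetch the most recent id on its own
$latest = $this->AccountRequest->find('first', array(
    'fields' => array('AccountRequest.id'),
    'order' => array('AccountRequest.id' => 'DESC')
));
$latestId = $latest['AccountRequest']['id'];

// Step 2: use $latestId in the conditions of the main query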
Edit 1: From the comments it sounds like all you're trying to get is a particular user's latest account_requests.
You don't need the subquery at all. My query below will get the most recent account record for the user id you choose.
$this->Paginator->settings = array(
    'fields' => array('AccountRequest.*'),
    'conditions' => array(
        'AccountRequest.user_id' => $userID // you need to set the $userID
    ),
    'page' => $page_number,
    'order' => array(
        'AccountRequest.id DESC' // shows most recent first
    ),
    'limit' => 1 // set to however many you want the maximum to be
);
The other thing you could mean is to get multiple entries from multiple users and display them ordered by user first, and then from most recent to oldest within each user. MySQL lets you order by more than one field; in that case try:
$this->Paginator->settings = array(
    'conditions' => array(
        'AccountRequest.user_id' => $userID // you need to set the $userID
    ),
    'page' => $page_number,
    'order' => array(
        'AccountRequest.user_id', // order by the users first
        'AccountRequest.id DESC'  // then order their requests from recent to old
    )
);
If the example data you added to the question is irrelevant and you are only concerned with how to do nested subqueries, that has already been answered here:
CakePHP nesting two select queries
However, I still think that, based on the data in the question, you can avoid using a nested query.

project the sum of values in a mongo subdocument

I have a Mongo collection that I'm trying to aggregate, in which I need to be able to filter the results based on a sum of values from a subdocument. Each of my entries has a subdocument that looks like this:
{
    "_id": <MongoId>,
    "clientID": "some ID",
    <Other fields I can filter on normally>
    "bidCompData": [
        {
            "lineItemID": "210217",
            "qtyBid": 3,
            "priceBid": 10.25,
            "qtyComp": 0,
            "description": "Lawn Mowed",
            "invoiceID": 23
        },
        {
            <More similar entries>
        }
    ]
}
What I'm trying to do is filter on the sum of qtyBid in a given record. For example, my user could specify that they only want records whose total qtyBid across all of the bidCompData entries is greater than 5. My research shows that I can't use $sum outside of the $group stage in the pipeline, but I need to sum just the qtyBid values for each individual record. Presently my pipeline looks like this:
array(
    array('$project' => $basicProjection), // fields to project, calculated earlier using the input parameters
    array('$match' => $query),
    array('$group' => array(
        '_id' => array('clientID' => '$clientID'),
        'count' => array('$sum' => 1)
    ))
)
I tried adding another group and an unwind before the group I presently have in my pipeline, so that I could get the sum there, but it doesn't let me keep any fields besides the _id and the sum field. Is there a way to do this without using $where? My database is large and I can't afford the speed hit of the JavaScript execution.
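For what it's worth, a common workaround is to $unwind the array and re-$group on _id, carrying the other fields through with $first accumulators. A sketch using the field names from the question, assuming a MongoCollection handle $collection (every extra field you need to keep gets its own $first):
$pipeline = array(
    array('$match' => $query),          // apply the cheap filters first
    array('$unwind' => '$bidCompData'),
    array('$group' => array(
        '_id' => '$_id',
        'clientID' => array('$first' => '$clientID'),
        'bidCompData' => array('$push' => '$bidCompData'), // rebuild the array
        'totalQtyBid' => array('$sum' => '$bidCompData.qtyBid')
    )),
    array('$match' => array('totalQtyBid' => array('$gt' => 5)))
);
$results = $collection->aggregate($pipeline);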

MySQL-like LIMIT (records) in DynamoDB?

I am using the AWS PHP SDK to query DynamoDB.
The requirement is to fetch records in batches of 10 items: when the user scrolls down the screen, fetch the previous 10 items, and so on.
How can I instruct DynamoDB to fetch only 10 items? I used the Limit option, but as per the DynamoDB documentation, Limit is the number of records to evaluate, not the number of records DynamoDB will return.
How can I set a limit on the number of records to return, so that on the next run I can use LastEvaluatedKey as ExclusiveStartKey for the next 10?
Here is the snippet of my code:
// It means: get the top $limit posts
$iterator = $dynamo->query(array(
    'TableName' => 'posts',
    'IndexName' => 'college_id-unix_timestamp-index',
    'ScanIndexForward' => false,
    'ExclusiveStartKey' => $lastEvaluatedKey,
    'KeyConditions' => array(
        'college_id' => array(
            'AttributeValueList' => array(
                array('S' => (string)$collegeId)
            ),
            'ComparisonOperator' => 'EQ'
        )
    ),
    'Limit' => (int)$limit,
));

Compound Indexes on MongoDB

Sorry for my English; I need help with MongoDB indexes. I have a capped collection (size: 10 GB) with some fields for my application logs.
Example structure: Logs[_id, userId, sum, type, time, response, request]. I have created a compound index: [userId, time, type]. I fetch two arrays of records grouped by userId for today: one where 'type' is not null, and one where it is 1. Here are my two example queries:
$group = array(
    array(
        '$match' => array(
            'userId' => $userId,
            'time' => array('$gt' => date("Y-m-d")),
            'type' => array('$ne' => null)
        )
    ),
    array(
        '$group' => array(
            "_id" => '$userId',
            "total" => array('$sum' => '$sum'),
            "count" => array('$sum' => 1)
        )
    )
);
$results = $collections->aggregate($group);
$group = array(
    array(
        '$match' => array(
            'userId' => $userId,
            'time' => array('$gt' => date("Y-m-d")),
            'type' => 1
        )
    ),
    array(
        '$group' => array(
            "_id" => '$userId',
            "count" => array('$sum' => 1)
        )
    )
);
$results2 = $collections->aggregate($group);
If the current user has more than 100,000 documents in the collection for today, the query is very slow (more than 10 seconds). Please give me some advice on creating the right index :) Thanks.
Based on the explain that you posted, the correct index is being used (BtreeCursor), the query is answered from the index alone (i.e. it is a covered index query: indexOnly is true), and nothing is being matched (n = 0) in this case. So that all checks out generally, though the $ne clause in the first example is not going to be very efficient.
However the main issue based on the explain is likely the fact that the index does not appear to be fully in memory. There are 13 yields listed and the most common reason for a query like this to yield is when it has to fault to disk to page something in. Since, as mentioned previously, it is only using the index, those yields imply faults to disk for the index and hence indicate that the whole index is not in memory.
If you re-run the query immediately after this it should be faster (assuming the index can actually fit into available memory) because the index will have been paged in by the first run. If it is still slow on the second run and showing yields, then you either don't have enough memory to hold the index in memory or something else is evicting it from memory and you essentially have memory contention causing performance problems.
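A quick way to test that warm-cache theory is to time a cold run against an immediate second run. A minimal sketch using the $collections and $group variables from the question:
$start = microtime(true);
$collections->aggregate($group); // cold run: the index may be paged in from disk
$cold = microtime(true) - $start;

$start = microtime(true);
$collections->aggregate($group); // warm run: the index should now be in memory
$warm = microtime(true) - $start;

printf("cold: %.3fs, warm: %.3fs\n", $cold, $warm);
If the warm run is dramatically faster, the index fits in memory and the slowness is page-in cost; if it is still slow, suspect memory contention.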

zend framework 2 and doctrine query builder

I am trying to write a query using the Doctrine query builder, based on the parameters I am getting from an array.
Here's my array:
$query = array(
    'field' => 'number',
    'from' => '1',
    'to' => '100',
    'Id' => '2',
    'Decimation' => '10'
);
The query that I am trying to write is:
select * from table where (number between 1 AND 100) AND (Id = 2) AND number mod 10 = 0
Here's where I stand now:
if (is_array($parameters['query'])) {
    $queryBuilder->select()
        ->where(
            $queryBuilder->expr()->between($parameters['query']['field'], $parameters['query']['from'], $parameters['query']['to']),
            $queryBuilder->expr()->eq('Id', '=?1'),
            $queryBuilder->expr()->eq($parameters['query']['field'], 'mod 10 = 0')
        )
        ->setParameter(array(1 => $parameters['query']['Id']));
}
I just can't wrap my head around this for some reason. Help, anyone?
Not tested or anything, just typed directly into the SO answer box:
$queryBuilder->select('table')
    ->from('My\Table\Entity', 'table')
    ->where($queryBuilder->expr()->andX(
        $queryBuilder->expr()->between('table.number', ':from', ':to'),
        $queryBuilder->expr()->eq('table.id', ':id')
    ))
    ->andWhere('MOD(table.number, :decimation) = 0')
    ->setParameters(array(
        'from' => $parameters['query']['from'],
        'to' => $parameters['query']['to'],
        'id' => $parameters['query']['Id'],
        'decimation' => $parameters['query']['Decimation']
    ));
It does not dynamically let you set which field the where condition applies to. That is most likely a bad idea anyway without whitelisting the values you want to allow; once that is done, a simple modification to the code above will do (just interpolate the value in place of number in the 'table.number' string). A whitelist sketch follows.
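Something like this hedged sketch, where the list of allowed field names is an assumption you would adapt to your entity:
// Hypothetical whitelist: only these columns may be used dynamically.
$allowedFields = array('number', 'id');

$field = $parameters['query']['field'];
if (!in_array($field, $allowedFields, true)) {
    throw new \InvalidArgumentException("Field '$field' is not filterable.");
}

// Now "table.$field" is safe to interpolate into the expressions above.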
I kinda got it.
The main idea here was to use the andWhere clause: multiple where clauses were what I wanted, plus the MOD operator that Doctrine provides. It worked out pretty well. In between, the requirement changed as well.
