Slow distinct operation with MongoDB in PHP

I'm using MongoDB to store user activity logs. In my real-time report, I need to count the distinct users of a specific type. At the beginning it runs fast, but it becomes slower as the collection grows.
Here is the code I used:
$connection = new MongoClient();
$result = $collection->distinct('user', array('type' => $type, 'ctime' => array('$gte' => $start)));
$total = count($result);
$total is the total number of unique users.
Can anyone suggest how to improve the query to get better performance?
Many thanks.

Use $collection->ensureIndex(array('user' => 1)); to create an index on the user field.
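For illustration only, a minimal sketch of creating that index with the legacy MongoClient driver used in the question. The database/collection names are placeholders, and the extra compound index covering the query's type and ctime filter is my assumption, not part of the answer above.
// Sketch, assuming the legacy MongoClient driver from the question.
$connection = new MongoClient();
$collection = $connection->mydb->user_logs; // hypothetical database and collection names

// Index suggested above: distinct() reads its values from the 'user' field.
$collection->ensureIndex(array('user' => 1));

// Assumption: a compound index covering the filter fields may also help,
// since the distinct() call filters on 'type' and 'ctime'.
$collection->ensureIndex(array('type' => 1, 'ctime' => 1, 'user' => 1));

// The original query, unchanged.
$result = $collection->distinct('user', array(
    'type'  => $type,
    'ctime' => array('$gte' => $start),
));
$total = count($result);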

Related

Array to string conversion error in CodeIgniter when inserting an array of data into a database table

I'm trying to insert an array of data into a database table, but I get an "Array to string conversion" error.
This is the post function in my controller. First I post an array of data. The values of the array are names and numbers, not ids; the only id is kodejdwl. This array is passed to my model:
function index_post() {
    $data = array(
        'kodejdwl' => $this->post('kodejdwl'),
        'tahun_akad' => $this->post('kode_tahun_akad'),
        'semester' => $this->post('semester'),
        'mk' => $this->post('mk'),
        'ruangan' => $this->post('ruangan'),
        'nama_dosen' => $this->post('nama_dosen'),
        'namakelas' => $this->post('nama_kelas'),
        'jam_mulai' => $this->post('jam_mulai'),
        'jam_selesai' => $this->post('jam_selesai'),
    );
}
After the data from the code above is passed to the model, I create new variables holding the id that corresponds to each value in the data array. For example, if the value of $data['mk'] is Website, then its id is 1, and that id is stored in $kodemk; I do this for each value in the data. Then I build $new_data, which stores an array of the ids I just looked up, and insert that array into a table in my database. I thought it would be fine, but it throws an "Array to string conversion" error. What should I do so I can insert this data into the table?
public function insert($data){
    $this->db->select('thn_akad_id');
    $tahunakad_id = $this->db->get_where('tik.thn_akad', array('tahun_akad' => $data['tahun_akad'], 'semester_semester_nm' => $data['semester']))->result();
    $this->db->flush_cache();
    $this->db->select('kodemk');
    $kode_mk = $this->db->get_where('tik.matakuliah', array('namamk' => $data['mk']))->result();
    $this->db->flush_cache();
    $ruangan = $this->db->get_where('tik.ruangan', array('namaruang' => $data['ruangan']), 1)->result();
    $this->db->flush_cache();
    $this->db->select('nip');
    $nip_dosen = $this->db->get_where('tik.staff', array('nama' => $data['nama_dosen']))->result();
    $this->db->flush_cache();
    $this->db->select('kodeklas');
    $kodeklas = $this->db->get_where('tik.kelas', array('namaklas' => $data['namakelas']))->result();
    $this->db->flush_cache();
    $this->db->select('kode_jam');
    $kode_mk = $this->db->get_where('tik.wkt_kuliah', array('jam_mulai' => $data['jam_mulai'], 'jam_selesai' => $data['jam_selesai']))->result();
    $this->db->flush_cache();
    $new_data = array(
        'kodejdwl' => $data['kodejdwl'],
        'thn_akad_thn_akad_id' => $tahunakad_id,
        'matakuliah_kodemk' => $kode_mk,
        'ruangan_namaruang' => $ruangan,
        'staff_nip' => $nip_dosen,
        'kelas_kodeklas' => $kodeklas,
    );
    $insert = $this->db->insert('tik.jadwal_kul', $new_data);
    return $this->db->affected_rows();
}
You probably want to use row() instead of result(), because it will contain only the single result you want. If you want to use result() and store multiple values, you'll have to use implode() to concatenate them and store them as a string.
I've written a possible solution to your problem; some things were missing, so I've mentioned them in the comments. See if this helps you.
public function insert($data){
    $this->db->select('thn_akad_id');
    $tahunakad_id = $this->db->get_where('tik.thn_akad', array('tahun_akad' => $data['tahun_akad'], 'semester_semester_nm' => $data['semester']))->row(); // use row() here
    $this->db->flush_cache();
    $this->db->select('kodemk');
    $kode_mk = $this->db->get_where('tik.matakuliah', array('namamk' => $data['mk']))->row();
    $this->db->flush_cache();
    // replace your_ruangan_column with your desired column name
    $this->db->select('your_ruangan_column');
    $ruangan = $this->db->get_where('tik.ruangan', array('namaruang' => $data['ruangan']), 1)->row();
    $this->db->flush_cache();
    $this->db->select('nip');
    $nip_dosen = $this->db->get_where('tik.staff', array('nama' => $data['nama_dosen']))->row();
    $this->db->flush_cache();
    $this->db->select('kodeklas');
    $kodeklas = $this->db->get_where('tik.kelas', array('namaklas' => $data['namakelas']))->row();
    $this->db->flush_cache();
    // Not sure where this next value is being used, but you can use it the same way as the others
    $this->db->select('kode_jam');
    // duplicate variable name here -- this overwrites the $kode_mk above (fix this)
    $kode_mk = $this->db->get_where('tik.wkt_kuliah', array('jam_mulai' => $data['jam_mulai'], 'jam_selesai' => $data['jam_selesai']))->row();
    $this->db->flush_cache();
    $new_data = array(
        'kodejdwl' => $data['kodejdwl'],
        'thn_akad_thn_akad_id' => $tahunakad_id->thn_akad_id, // $tahunakad_id is an object keyed by the table column name thn_akad_id
        'matakuliah_kodemk' => $kode_mk->kodemk, // ...
        'ruangan_namaruang' => $ruangan->your_ruangan_column, // ...
        'staff_nip' => $nip_dosen->nip, // ...
        'kelas_kodeklas' => $kodeklas->kodeklas // ...
    );
    $insert = $this->db->insert('tik.jadwal_kul', $new_data);
    return $this->db->affected_rows();
}
You are making a total of 7 separate trips to the database. Best practice recommends minimizing trips to the database for best performance. In truth, your task can be performed in a single trip, so long as you set up the correct INSERT query with SELECT subqueries.
I don't know what your non-English words mean, so I will use generalized terms in my demo (I've tested this successfully in my own CI project). I am also reducing the total number of subqueries to 3 to cut down the redundancy in my snippet.
$value1 = $this->db->select('columnA')->where('cond1', $val1)->get_compiled_select('childTableA');
$value2 = $this->db->select('columnB')->where('cond2', $val2)->get_compiled_select('childTableB');
$value3 = $this->db->select('columnC')->where('cond3', $val3)->get_compiled_select('childTableC');

return (int)$this->db->query(
    "INSERT INTO parentTable
        (column1, column2, column3)
     VALUES (
        ($value1),
        ($value2),
        ($value3)
     )"
);
// to mirror your affected-rows return... 1 will be returned on a successful insert, or 0 on failure
Granted, this isn't using the Active Record technique to form the complete INSERT query, but that is because CI doesn't allow subqueries in the VALUES portion (say, if you were to use the set() method). I am guessing this is because different databases use differing syntax to form these kinds of INSERTs -- I don't know.
The bottom line is, so long as you are fetching a single column value from a single row in each of these sub-SELECTs, this single query will run faster and with far less code bloat than running N individual queries. Because all of the variables involved are injected into the SQL string using get_compiled_select(), the stability/security integrity should be the same.
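As an untested sketch only, here is roughly how that pattern could map onto the original table and column names from the question; the reduced column list and the escape() call on kodejdwl are my assumptions, not part of the answer above.
public function insert($data){
    // Compile the sub-SELECTs as SQL strings (no trip to the database yet).
    $thn = $this->db->select('thn_akad_id')
                    ->where('tahun_akad', $data['tahun_akad'])
                    ->where('semester_semester_nm', $data['semester'])
                    ->get_compiled_select('tik.thn_akad');
    $mk  = $this->db->select('kodemk')
                    ->where('namamk', $data['mk'])
                    ->get_compiled_select('tik.matakuliah');
    $nip = $this->db->select('nip')
                    ->where('nama', $data['nama_dosen'])
                    ->get_compiled_select('tik.staff');

    // Single trip: the sub-SELECTs run inside the INSERT's VALUES list.
    return (int)$this->db->query(
        "INSERT INTO tik.jadwal_kul (kodejdwl, thn_akad_thn_akad_id, matakuliah_kodemk, staff_nip)
         VALUES (" . $this->db->escape($data['kodejdwl']) . ", ($thn), ($mk), ($nip))"
    );
}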

Predicting future IDs used before saving to the DB

I am saving a complex dataset in Laravel 4.2 and I am looking for ways to improve this.
A $bit has several $bobs. A single $bob can be one of several different classes. I am trying to duplicate a single $bit and all its associated $bobs, and save all of this to the DB with as few calls as possible.
$newBit = $this->replicate();
$newBit->save();

$bobsPivotData = [];
foreach ($this->bobs as $index => $bob) {
    $newBob = $bob->replicate();
    $newBobs[] = $newBob->toArray();
    $bobsPivotData[] = [
        'bit_id' => $newBit->id,
        'bob_type' => get_class($newBob),
        'bob_id' => $newBob->id,
        'order' => $index
    ];
}

// I now want to save all the $bobs kept in $newBobs[]
DB::table('bobs')->insert($newBobs);

// Saving all the pivot data in one go
DB::table('bobs_pivot')->insert($bobsPivotData);
My problem is that I can't access $newBob->id inside the loop, because the $newBobs aren't inserted until after the loop.
I am looking for the best way to reduce saves to the DB. My best guess is that if I can predict the ids that are going to be used, I can do all of this in one loop. Is there a way I can predict these ids?
Or is there a better approach?
You could insert the bobs first and then use the generated ids to build the pivot data. This isn't a great solution in a multi-user environment, since new bobs could be inserted in the interim and throw the ids off, but it may suffice for your application.
$newBit = $this->replicate();
$newBit->save();

$bobsPivotData = [];
foreach ($this->bobs as $bob) {
    $newBob = $bob->replicate();
    $newBobs[] = $newBob->toArray();
}

$insertId = DB::table('bobs')->insertGetId($newBobs);
$insertedBobs = DB::table('bobs')->where('id', '>=', $insertId)->get();

foreach ($insertedBobs as $index => $newBob) {
    $bobsPivotData[] = [
        'bit_id' => $newBit->id,
        'bob_type' => get_class($newBob),
        'bob_id' => $newBob->id,
        'order' => $index
    ];
}

// Saving all the pivot data in one go
DB::table('bobs_pivot')->insert($bobsPivotData);
I have not tested this, so some pseudo-code to be expected.

DynamoDB Count Group By

We are trying to query a DynamoDB table and need to get a count of items within a grouping. How can this be done?
I have tried this, but it doesn't work once I add the second number:
$search = array(
    'TableName' => 'dev_adsite_rating',
    'Select' => 'COUNT',
    'KeyConditions' => array(
        'ad_id' => array(
            'ComparisonOperator' => 'EQ',
            'AttributeValueList' => array(
                array('N' => 1039722, 'N' => 1480)
            )
        )
    )
);
$response = $client->query($search);
The sql version would look something like this:
select ad_id, count(*)
from dev_adsite_rating
where ad_id in(1039722, 1480)
group by ad_id;
So, is there a way for us to achieve this? I cannot find anything on it.
Trying to perform a query like this on DynamoDB is slightly trickier than in an SQL world. To perform something like this, you'll need to consider a few things:
EQ-only hash key: to perform this kind of query, you'll need to make two separate queries (i.e. ad_id EQ 1039722 and ad_id EQ 1480).
Paginate through each query: because DynamoDB returns your result set in increments, you'll need to paginate through your results.
Running count: take the Count property from each response and add it to a running total as you paginate through the results of both queries (see the Query API documentation); a sketch follows below.
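A rough, untested sketch of that approach in the same AWS SDK for PHP style as the question; the numbers-as-strings attribute format and the ExclusiveStartKey pagination loop are assumptions about the SDK usage, not code from the answer above.
$adIds = array('1039722', '1480');
$counts = array();

foreach ($adIds as $adId) {
    $total = 0;
    $lastEvaluatedKey = null;

    do {
        $search = array(
            'TableName' => 'dev_adsite_rating',
            'Select' => 'COUNT',
            'KeyConditions' => array(
                'ad_id' => array(
                    'ComparisonOperator' => 'EQ',
                    // DynamoDB expects numeric values to be passed as strings
                    'AttributeValueList' => array(array('N' => $adId))
                )
            )
        );
        if ($lastEvaluatedKey !== null) {
            $search['ExclusiveStartKey'] = $lastEvaluatedKey;
        }

        $response = $client->query($search);

        // add this page's count to the running total
        $total += $response['Count'];
        $lastEvaluatedKey = isset($response['LastEvaluatedKey']) ? $response['LastEvaluatedKey'] : null;
    } while ($lastEvaluatedKey !== null);

    $counts[$adId] = $total;
}
// $counts now holds one total per ad_id, like the SQL GROUP BY would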
You could add a Lambda function triggered by the DynamoDB stream to aggregate your data on the fly; in your case, it would add +1 to the relevant counters. Your search function would then simply retrieve the aggregated data directly.
Example: if you have a weekly online voting system where you need to store each vote (also to check that no user votes twice), you could aggregate the votes on the fly using something like this:
export const handler: DynamoDBStreamHandler = async (event: DynamoDBStreamEvent) => {
    await Promise.all(event.Records.map(async record => {
        if (record.dynamodb?.NewImage?.vote?.S && record.dynamodb?.NewImage?.week?.S) {
            await addVoteToResults(record.dynamodb.NewImage.vote.S, record.dynamodb.NewImage.week.S)
        }
    }))
}
where addVoteToResults is something like:
export const addVoteToResults = async (vote: string, week: string) => {
    await dynamoDbClient.update({
        TableName: 'table_name',
        Key: { week: week },
        UpdateExpression: 'add #vote :inc',
        ExpressionAttributeNames: {
            '#vote': vote
        },
        ExpressionAttributeValues: {
            ':inc': 1
        }
    }).promise();
}
Afterwards, when the voting is closed, you can retrieve the aggregated votes per week with a single get statement. This solution also helps spread the write/read load, rather than producing a huge spike when executing your search function.

How to clear a table in hbase?

I want to empty a table in HBase, e.g. user. Is there any command or function to empty the table without deleting it?
My table structure is :
$mutations = array(
    new Mutation( array(
        'column' => 'username:1',
        'value' => $name
    ) ),
    new Mutation( array(
        'column' => 'email:1',
        'value' => $email
    ) )
);
$hbase->mutateRow("user", $key, $mutations);
Can someone help me?
If you execute this in HBase shell:
> truncate 'yourTableName'
Then HBase will execute these operations on 'yourTableName':
> disable 'yourTableName'
> drop 'yourTableName'
> create 'yourTableName', 'f1', 'f2', 'f3'
Another efficient option is to actually delete the table and then recreate it with the same settings as before.
I don't know how to do this in PHP, but I do know how to do it in Java. The corresponding actions in PHP should be similar; you just need to check what the API looks like.
In Java using HBase 0.90.4:
// Remember the "schema" of your table
HBaseAdmin admin = new HBaseAdmin(yourConfiguration);
HTableDescriptor td = admin.getTableDescriptor(Bytes.toBytes("yourTableName"));

// Delete your table
admin.disableTable("yourTableName");
admin.deleteTable("yourTableName");

// Recreate your table
admin.createTable(td);
Using the hbase shell, truncate <table_name> will do the task, e.g. truncate 'customer_details', where customer_details is the table name.
The truncate command in the hbase shell will do the job for you.
The HBase Thrift API (which is what PHP uses) doesn't provide a truncate command, only deleteTable and createTable functionality (what's the difference from your point of view?).
Otherwise, you have to scan to get all the keys and call deleteAllRow for each key, which isn't a very efficient option.
For this purpose you can use HAdmin, a UI tool for Apache HBase administration. There are "Truncate table" and even "Delete table" buttons on the alter-table page.
Using the alter command:
alter '<table_name>', NAME => 'column_family', TTL => <number_of_seconds>
Here number_of_seconds is the duration after which the data will be automatically deleted.
There's no single command to clear an HBase table, but you can use two workarounds: disable, delete, and create the table again, or scan all records and delete each one.
Actually, disabling, deleting, and creating the table again takes about 4 seconds.
// get HBase client
$client = <Your code here>;
$t = "table_name";

$tables = $client->getTableNames();
if (in_array($t, $tables)) {
    if ($client->isTableEnabled($t))
        $client->disableTable($t);
    $client->deleteTable($t);
}

$descriptors = array(
    new ColumnDescriptor(array("name" => "c1", "maxVersions" => 1)),
    new ColumnDescriptor(array("name" => "c2", "maxVersions" => 1))
);
$client->createTable($t, $descriptors);
If there isn't a lot of data in the table, scanning all rows and deleting each one is much faster.
$client = <Your code here>;
$t = "table_name";
// I don't remember if the list of column families is actually needed here
$columns = array("c1", "c2");

$scanner = $client->scannerOpen($t, "", $columns);
while ($result = $client->scannerGet($scanner)) {
    $client->deleteAllRow($t, $result[0]->row);
}
In this case the data is not physically deleted; it is only marked as deleted and stays in the table until the next major compaction.
Perhaps using one of these two commands:
DELETE FROM your_table WHERE 1;
Or
TRUNCATE your_table;
Regards!

Zend Framework paginator (Zend_Paginator) results too slow

I have a query that is running way too slow; the page takes a few minutes to load.
I'm doing a table join on tables with over 100,000 records. Is my query grabbing all the records, or only the amount needed for the page? Do I need to put a limit in the query? If I do, won't that give the paginator the wrong record count?
$paymentsTable = new Donations_Model_Payments();
$select = $paymentsTable->select(Zend_Db_Table::SELECT_WITH_FROM_PART);
$select->setIntegrityCheck(false)
->from(array('p' => 'tbl_payments'), array('clientid', 'contactid', 'amount'))
->where('p.clientid = ?', $_SESSION['clientinfo']['id'])
->where('p.dt_added BETWEEN \''.$this->datesArr['dateStartUnix'].'\' AND \''.$this->datesArr['dateEndUnix'].'\'')
->join(array('c' => 'contacts'), 'c.id = p.contactid', array('fname', 'mname', 'lname'))
->group('p.id')
->order($sortby.' '.$dir)
;
$payments=$paymentsTable->fetchAll($select);
// paginator
$paginator = Zend_Paginator::factory($payments);
$paginator->setCurrentPageNumber($this->_getParam('page'), 1);
$paginator->setItemCountPerPage('100'); // items pre page
$this->view->paginator = $paginator;
$payments=$payments->toArray();
$this->view->payments=$payments;
Please see revised code below. You need to pass the $select to Zend_Paginator via the correct adapter. Otherwise you won't see the performance benefits.
$paymentsTable = new Donations_Model_Payments();
$select = $paymentsTable->select(Zend_Db_Table::SELECT_WITH_FROM_PART);
$select->setIntegrityCheck(false)
->joinLeft('contacts', 'tbl_payments.contactid = contacts.id')
->where('tbl_payments.clientid = 39')
->where(new Zend_Db_Expr('tbl_payments.dt_added BETWEEN "1262500129" AND "1265579129"'))
->group('tbl_payments.id')
->order('tbl_payments.dt_added DESC');
// paginator
$paginator = new Zend_Paginator(new Zend_Paginator_Adapter_DbTableSelect($select));
$paginator->setCurrentPageNumber($this->_getParam('page', 1));
$paginator->setItemCountPerPage('100'); // items pre page
$this->view->paginator = $paginator;
In your code, you are first selecting and fetching every record that matches your condition (the select ... from ... and the fetchAll call on the line just after), and only then using the paginator, on the results returned by that fetchAll call.
With that, I'd say that, yes, all of your 100,000 records are fetched from the DB, manipulated by PHP, and passed to Zend_Paginator, which has to work with them... only to discard almost all of them.
Using Zend_Paginator, you should be able to pass it an instance of Zend_Db_Select and let it execute the query, specifying the required limit.
Maybe the examples of the DbSelect and DbTableSelect adapters might help you understand how this can be achieved (sorry, I don't have any working example).
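As an untested sketch of what that adapter approach could look like: for the Zend_Db_Table_Select built in the question, the DbTableSelect adapter shown in the first answer is the matching one (DbSelect is its plain Zend_Db_Select counterpart); the variable names below are taken from the question.
// Assumption: $select is the statement built in the question, without the manual fetchAll().
$adapter   = new Zend_Paginator_Adapter_DbTableSelect($select);
$paginator = new Zend_Paginator($adapter);
$paginator->setCurrentPageNumber($this->_getParam('page', 1));
$paginator->setItemCountPerPage(100);
// Only the current page's rows are fetched from the database;
// the adapter issues its own COUNT query for the total row count.
$this->view->paginator = $paginator;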
I personally count the results via COUNT(*) and pass that to Zend_Paginator. I never understood why you'd wire Zend_Paginator right into the database results; I can see the pluses and minuses, but really, it goes too far IMHO.
Bearing in mind that you only want 100 results, you're fetching 100,000+ and then Zend_Paginator is throwing them away. Realistically you just want to give it a count.
$items = Eurocreme_Model::load_by_type(array('type' => 'list', 'from' => $from, 'to' => MODEL_PER_PAGE, 'order' => 'd.id ASC'));
$count = Eurocreme_Model::load_by_type(array('type' => 'list', 'from' => 0, 'to' => COUNT_HIGH, 'count' => 1));
$paginator = Zend_Paginator::factory($count);
$paginator->setItemCountPerPage(MODEL_PER_PAGE);
$paginator->setCurrentPageNumber($page);
$this->view->paginator = $paginator;
$this->view->items = $items;
