I need to parse a fairly large XML file (varying between about a hundred kilobytes and several hundred kilobytes), which I'm doing using Xml#parse(String, ContentHandler). I'm currently testing this with a 152KB file.
During parsing, I also insert the data in an SQLite database using calls similar to the following: getWritableDatabase().insert(TABLE_NAME, "_id", values). All of this together takes about 80 seconds for the 152KB test file (which comes down to inserting roughly 200 rows).
When I comment out all insert statements (but leave in everything else, such as creating ContentValues etc.) the same file takes only 23 seconds.
Is it normal for the database operations to have such a big overhead? Can I do anything about that?
You should do batch inserts.
Pseudocode:
db.beginTransaction();
try {
    for (entry : listOfEntries) {
        db.insert(entry);
    }
    db.setTransactionSuccessful();
} finally {
    db.endTransaction();
}
That increased the speed of inserts in my apps dramatically.
Update:
@Yuku provided a very interesting blog post: Android using inserthelper for faster insertions into sqlite database
Since the InsertHelper mentioned by Yuku and Brett is deprecated now (API level 17), it seems the right alternative recommended by Google is using SQLiteStatement.
I used the database insert method like this:
database.insert(table, null, values);
After I also experienced some serious performance issues, the following code sped my 500 inserts up from 14.5 s to only 270 ms, amazing!
Here is how I used SQLiteStatement:
private void insertTestData() {
String sql = "insert into producttable (name, description, price, stock_available) values (?, ?, ?, ?);";
SQLiteDatabase database = dbHandler.getWritableDatabase();
database.beginTransaction();
SQLiteStatement stmt = database.compileStatement(sql);
for (int i = 0; i < NUMBER_OF_ROWS; i++) {
//generate some values
stmt.bindString(1, randomName);
stmt.bindString(2, randomDescription);
stmt.bindDouble(3, randomPrice);
stmt.bindLong(4, randomNumber);
long entryID = stmt.executeInsert();
stmt.clearBindings();
}
database.setTransactionSuccessful();
database.endTransaction();
dbHandler.close();
}
Compiling the SQL insert statement helps speed things up. It can also require more effort to lock everything down and prevent possible injection, since it's now all on your shoulders.
Another approach which can also speed things up is the under-documented android.database.DatabaseUtils.InsertHelper class. My understanding is that it actually wraps compiled insert statements. Going from non-compiled transacted inserts to compiled transacted inserts was about a 3x gain in speed (2ms per insert to .6ms per insert) for my large (200K+ entries) but simple SQLite inserts.
Sample code:
SQLiteDatabase db = getWritableDatabase();
//use the db you would normally use for db.insert, and the "table_name"
//is the same one you would use in db.insert()
InsertHelper iHelp = new InsertHelper(db, "table_name");
//Get the indices you need to bind data to
//Similar to Cursor.getColumnIndex("col_name");
int first_index = iHelp.getColumnIndex("first");
int last_index = iHelp.getColumnIndex("last");
try
{
db.beginTransaction();
for(int i=0 ; i<num_things ; ++i)
{
//need to tell the helper you are inserting (rather than replacing)
iHelp.prepareForInsert();
//do the equivalent of ContentValues.put("field","value") here
iHelp.bind(first_index, thing_1);
iHelp.bind(last_index, thing_2);
//the db.insert() equivalent
iHelp.execute();
}
db.setTransactionSuccessful();
}
finally
{
db.endTransaction();
}
db.close();
If the table has an index on it, consider dropping it prior to inserting the records and then adding it back after you've committed your records.
If using a ContentProvider:
@Override
public int bulkInsert(Uri uri, ContentValues[] bulkinsertvalues) {
int QueryType = sUriMatcher.match(uri);
int returnValue=0;
SQLiteDatabase db = mOpenHelper.getWritableDatabase();
switch (QueryType) {
case SOME_URI_IM_LOOKING_FOR: //replace this with your real URI
db.beginTransaction();
for (int i = 0; i < bulkinsertvalues.length; i++) {
//get an individual result from the array of ContentValues
ContentValues values = bulkinsertvalues[i];
//insert this record into the local SQLite database using a private function you create, "insertIndividualRecord" (replace with a better function name)
insertIndividualRecord(uri, values);
returnValue++; //count the inserted rows so bulkInsert() can return the total
}
db.setTransactionSuccessful();
db.endTransaction();
break;
default:
throw new IllegalArgumentException("Unknown URI " + uri);
}
return returnValue;
}
Then the private function to perform the insert (still inside your content provider):
private Uri insertIndividualRecord(Uri uri, ContentValues values){
//see content provider documentation if this is confusing
if (sUriMatcher.match(uri) != THE_CONSTANT_IM_LOOKING_FOR) {
throw new IllegalArgumentException("Unknown URI " + uri);
}
//example validation if you have a field called "name" in your database
if (values.containsKey(YOUR_CONSTANT_FOR_NAME) == false) {
values.put(YOUR_CONSTANT_FOR_NAME, "");
}
//******add all your other validations
//**********
//time to insert records into your local SQLite database
SQLiteDatabase db = mOpenHelper.getWritableDatabase();
long rowId = db.insert(YOUR_TABLE_NAME, null, values);
if (rowId > 0) {
Uri myUri = ContentUris.withAppendedId(MY_INSERT_URI, rowId);
getContext().getContentResolver().notifyChange(myUri, null);
return myUri;
}
throw new SQLException("Failed to insert row into " + uri);
}
Related
I'm testing the performance of a system I'm building, and it's really slow; I don't know why, or whether it should be this slow. What I'm testing is how many single INSERTs I can do against the database, and I get around 22 per second. That sounds really slow, and when I tried doing the inserts in a single big SQL query I could insert 30000 records in about 0.5 seconds. In real life the inserts are made by different users of the system, so the overhead of connecting, sending the query, parsing the query, etc. will always be there. What I have tried so far:
mysqli with as little code as possible. = 22 INSERT per second
PDO with as little code as possible. = 22 INSERT per second
Changing the connection host from localhost to 127.0.0.1 = 22 INSERT per second
mysqli without statement object and check for SQL-injection = 22 INSERT per second
So something seems to be wrong here.
System specs:
Intel i5
16 GB RAM
7200 RPM disk drive
Software:
Windows 10
XAMPP, fairly new with MariaDB
DB engine: InnoDB.
The code I used to do the tests:
$amountToInsert = 1000;
//$fakeData is an array with randomly generated emails
$fakeData = getFakeData($amountToInsert);
$db = new DatabaseHandler();
for ($i = 0; $i < $amountToInsert; $i++) {
$db->insertUser($fakeData[$i]);
}
$db->closeConnection();
The class that calls the database:
class DatabaseHandler {
private $DBHOST = 'localhost';
private $DBUSERNAME = 'username';
private $DBPASSWORD = 'password';
private $DBNAME = 'dbname';
private $DBPORT = 3306;
private $mDb;
private $isConnected = false;
public function __construct() {
$this->mDb = new mysqli($this->DBHOST, $this->DBUSERNAME
, $this->DBPASSWORD, $this->DBNAME
, $this->DBPORT);
$this->isConnected = true;
}
public function closeConnection() {
if ($this->isConnected) {
$threadId = $this->mDb->thread_id;
$this->mDb->kill($threadId);
$this->mDb->close();
$this->isConnected = false;
}
}
public function insertUser($user) {
$this->mDb->autocommit(true);
$queryString = 'INSERT INTO `users`(`email`, `company_id`) '
.'VALUES (?, 1)';
$stmt = $this->mDb->prepare($queryString);
$stmt->bind_param('s', $user);
if ($stmt->execute()) {
$stmt->close();
return 1;
} else {
$stmt->close();
return 0;
}
}
}
The "users" table has 4 columns with the following structure:
id INT unsigned primary key
email VARCHAR(60)
company_id INT unsigned INDEX
guid TEXT
I'm at a loss here and don't really know where to look next. Any help in the right direction would be very much appreciated.
As explained in the comments, InnoDB is to blame. By default this engine is very cautious and doesn't utilize the disk cache; it makes sure the data has actually been written to disk before returning a success message. So you basically have two options.
Most of the time you just don't care about the confirmed write, so you can configure MySQL by setting this option to zero:
innodb_flush_log_at_trx_commit = 0
As long as it's set this way, your InnoDB writes will be almost as fast as MyISAM's.
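If you can't edit the server configuration, this variable is dynamic, so as a rough sketch it can also be changed at runtime from PHP (assuming your MySQL user has the SUPER privilege; the credentials below are placeholders and the setting applies server-wide until restart):
// Sketch: set the flush behaviour at runtime instead of in my.cnf
$mysqli = new mysqli('localhost', 'username', 'password', 'dbname'); // placeholder credentials
$mysqli->query('SET GLOBAL innodb_flush_log_at_trx_commit = 0');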
Another option is wrapping all your writes in a single transaction. Since it requires only a single confirmed write for the whole batch, it will be reasonably fast too.
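Here is a minimal sketch of that approach with mysqli, reusing one prepared statement for the `users` table from the question (the credentials are placeholders and $fakeData stands for the array of generated emails):
// Sketch: one transaction and one prepared statement for the whole batch
$db = new mysqli('localhost', 'username', 'password', 'dbname'); // placeholder credentials
$stmt = $db->prepare('INSERT INTO `users` (`email`, `company_id`) VALUES (?, 1)');
$stmt->bind_param('s', $email); // binds $email by reference
$db->begin_transaction(); // PHP 5.5+; $db->autocommit(false) works on older versions
foreach ($fakeData as $email) { // $fakeData: array of generated email strings
    $stmt->execute(); // each execute() reuses the compiled statement
}
$db->commit(); // a single confirmed write for all the rows
$stmt->close();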
Of course, it's sensible to prepare your query only once when doing multiple inserts, but the speed gain is negligible compared to the issue above, so it counts neither as an explanation nor as a remedy for such an issue.
Your test isn't a very good way of judging performance, because you are preparing a statement 1000 times. That's not the way prepared statements are supposed to be used: the statement is prepared once and different parameters are bound to it multiple times. Try this:
public function __construct() {
$this->mDb = new mysqli($this->DBHOST, $this->DBUSERNAME
, $this->DBPASSWORD, $this->DBNAME
, $this->DBPORT);
$this->isConnected = true;
$queryString = 'INSERT INTO `users`(`email`, `company_id`) '
.'VALUES (?, 1)';
$this->stmt_insert = $this->mDb->prepare($queryString);
}
and
public function insertUser($user) {
$this->stmt_insert->bind_param('s', $user);
if ($this->stmt_insert->execute()) {
return 1;
} else {
return 0;
}
}
And you will see a huge boost in performance. So, to recap: there's nothing wrong with your system, it's just the test that was bad.
Update:
Your Common Sense has a point about preparing in advance and reusing the prepared statement not giving a big boost; I tested it and found the gain to be about 5-10%.
However, there is something that does give a big boost: turning autocommit off! Inserting 100 records, which previously took about 4.7 seconds on average, dropped to under 0.5 s on average:
$con->autocommit(false);
// ... run the prepared-statement inserts in a loop here ...
$con->commit();
I'm trying to use a MERGE INTO statement in my php file to update or insert into a MySQL database for a multiplayer game.
Here's a full description of what I'm trying to accomplish:
The php file is called with the following line from a javascript file:
xmlhttp.open('GET', "phpsqlajax_genxml.php?" + "lat=" + lla[0] + "&heading=" + truckHeading + "&lng=" + lla[1] + "&velocity0=" + vel0 + "&velocity1=" + vel1 + "&velocity2=" + vel2 + "&id=" + playerNumber, true);
This will be sending the php file information to update the database with. Either this will be a new player and the first time this information has been sent, meaning that a new row in the database will be created, or it will be a current player who just needs to have their information updated.
If it is a new player the "id" that is sent will be one that doesn't yet exist in the database.
For some reason the database isn't being updated, nor are new rows being added. I'm thinking it's a syntax error because I don't have much experience using MERGE statements. Could someone with experience with this please let me know what I might be doing wrong?
Here is the code before the MERGE INTO statement so you can understand which variables are which:
$id = $_GET['id'];
$lat = $_GET['lat'];
$lng = $_GET['lng'];
$heading = $_GET['heading'];
$velocity0 = $_GET['velocity0'];
$velocity1 = $_GET['velocity1'];
$velocity2 = $_GET['velocity2'];
(id is the column name; $id is the id being passed in.)
Here is my current MERGE INTO statement in my php file:
MERGE INTO markers USING id ON (id = $id)
WHEN MATCHED THEN
UPDATE SET lat = $lat, lng = $lng, heading = $heading, velocityX = $velocity0, velocityY = $velocity1, velocityZ = $velocity2
WHEN NOT MATCHED THEN
INSERT (id, name, address, lat, lng, type, heading, velocityX, velocityY, velocityZ) VALUES ($id, 'bob', 'Poop Lane', $lat, $lng, 'Poop', $heading, $velocity0, $velocity1, $velocity2)
PHP's database libraries invariably have their various function calls return FALSE if anything failed during the call. Assuming you're on mysql_/mysqli_, then you should be doing something like this:
$sql = "MERGE INTO ....";
$result = mysql_query($sql);
if ($result === FALSE) {
die(mysql_error());
}
It is poor practice to NOT check the return values from database calls. Even if the query string is 100% syntactically valid, there are far too many ways for a query to fail. Assuming everything works is the easiest way to get yourself into a very bad situation. As well, when things do fail, the lack of error handling will simply hide the actual reason for the error, and then you end up on SO getting answers like this.
Oh, and before I forget... MySQL doesn't support "MERGE INTO...", so your whole query is a syntax error. Look into using "REPLACE INTO..." or "INSERT ... ON DUPLICATE KEY UPDATE ..." instead.
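For example, the MERGE above could be expressed roughly like this with INSERT ... ON DUPLICATE KEY UPDATE, assuming id is the primary (or a unique) key on markers. This is only a sketch, shown with a mysqli prepared statement ($mysqli is assumed to be your open connection) instead of interpolating the $_GET values directly:
// Sketch: upsert for the markers table (id must be PRIMARY/UNIQUE for this to work)
$sql = "INSERT INTO markers
          (id, name, address, lat, lng, type, heading, velocityX, velocityY, velocityZ)
        VALUES (?, 'bob', 'Poop Lane', ?, ?, 'Poop', ?, ?, ?, ?)
        ON DUPLICATE KEY UPDATE
          lat = VALUES(lat), lng = VALUES(lng), heading = VALUES(heading),
          velocityX = VALUES(velocityX), velocityY = VALUES(velocityY), velocityZ = VALUES(velocityZ)";
$stmt = $mysqli->prepare($sql);
$stmt->bind_param('idddddd', $id, $lat, $lng, $heading, $velocity0, $velocity1, $velocity2);
if (!$stmt->execute()) {
    die($stmt->error); // same advice as above: always check for failures
}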
I have some photos (not big, only 8 KB) in a MySQL database on my desktop. The field type is BLOB. I want to export the table to an XML file and then upload it to my website's database, but it isn't working. Here is what I have done:
Exporting the data to XML (on my desktop computer):
FileStream fs = new FileStream(filename,FileMode.Create,FileAccess.Write,FileShare.None);
StreamWriter sw = new StreamWriter(fs,Encoding.ASCII);
ds.WriteXml(sw); //write the xml from the dataset ds
Then I upload the XML from my Joomla website. I load the XML before inserting it into the database:
...
$obj = simplexml_load_file($filename);
$cnt = count($obj->mydata); //mydata is the table name in the xml tag
for($i=0;$i<$cnt;$i++)
{
...
$myphoto = 'NULL';
if(!empty($obj->mydata[$i]->myphoto))
{
$myphoto = base64_code($obj->mydata[$i]->myphoto);
}
//insert to the database
$sqlinsert = "insert into jos_myphoto (id,myphoto) values(".$i.",".$myphoto.")";
...
}
...
It keeps telling me 'DB function failed'. When the value of $myphoto is null the query works fine, but if $myphoto is not null the error appears. I think there is something wrong with the line
$myphoto = base64_code($obj->mydata[$i]->myphoto).
I tried removing the base64_code function but it didn't work. How can I solve this problem? Sorry for my bad English.
Your data may contain characters which need escaping, so run it through the mysql_real_escape_string() function and try again.
It is always a good habit to store data using this function, which also saves you from SQL injection.
And put quotes around the column data.
$sqlinsert = "insert into jos_myphoto (id,myphoto)
values(".$i.",'".mysql_real_escape_string($myphoto)."')";
With a list of around 100,000 key/value pairs (both string, mostly around 5-20 characters each) I am looking for a way to efficiently find the value for a given key.
This needs to be done in a PHP website. I am familiar with hash tables in Java (which is probably what I would do if working in Java) but am new to PHP.
I am looking for tips on how I should store this list (in a text file or in a database?) and search this list.
The list would have to be updated occasionally but I am mostly interested in look up time.
You could do it as a straight PHP array, but Sqlite is going to be your best bet for speed and convenience if it is available.
PHP array
Just store everything in a php file like this:
<?php
return array(
'key1'=>'value1',
'key2'=>'value2',
// snip
'key100000'=>'value100000',
);
Then you can access it like this:
<?php
$s = microtime(true); // gets the start time for benchmarking
$data = require('data.php');
echo $data['key2'];
var_dump(microtime(true)-$s); // dumps the execution time
Not the most efficient thing in the world, but it's going to work. It takes 0.1 seconds on my machine.
Sqlite
PHP should come with sqlite enabled, which will work great for this kind of thing.
This script will create a database for you from start to finish with similar characteristics to the dataset you describe in the question:
<?php
// this will *create* data.sqlite if it does not exist. Make sure "/data"
// is writable and *not* publicly accessible.
// the ATTR_ERRMODE bit at the end is useful as it forces PDO to throw an
// exception when you make a mistake, rather than internally storing an
// error code and waiting for you to retrieve it.
$pdo = new PDO('sqlite:'.dirname(__FILE__).'/data/data.sqlite', null, null, array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));
// create the table if you need to
$pdo->exec("CREATE TABLE stuff(id TEXT PRIMARY KEY, value TEXT)");
// insert the data
$stmt = $pdo->prepare('INSERT INTO stuff(id, value) VALUES(:id, :value)');
$id = null;
$value = null;
// this binds the variables by reference so you can re-use the prepared statement
$stmt->bindParam(':id', $id);
$stmt->bindParam(':value', $value);
// insert some data (in this case it's just dummy data)
for ($i=0; $i<100000; $i++) {
$id = $i;
$value = 'value'.$i;
$stmt->execute();
}
And then to use the values:
<?php
$s = microtime(true);
$pdo = new PDO('sqlite:'.dirname(__FILE__).'/data/data.sqlite', null, null, array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));
$stmt = $pdo->prepare("SELECT * FROM stuff WHERE id=:id");
$stmt->bindValue(':id', 5);
$stmt->execute();
$value = $stmt->fetchColumn(1);
var_dump($value);
// the number of seconds it took to do the lookup
var_dump(microtime(true)-$s);
This one is waaaay faster. 0.0009 seconds on my machine.
MySQL
You could also use MySQL for this instead of Sqlite, but if it's just one table with the characteristics you describe, it's probably going to be overkill. The above Sqlite example will work fine using MySQL if you have a MySQL server available to you. Just change the line that instantiates PDO to this:
$pdo = new PDO('mysql:host=your.host;dbname=your_db', 'user', 'password', array(PDO::ATTR_ERRMODE=>PDO::ERRMODE_EXCEPTION));
The queries in the sqlite example should all work fine with MySQL, but please note that I haven't tested this.
Let's get a bit crazy: Filesystem madness
Not that the Sqlite solution is slow (0.0009 seconds!), but this is about four times faster on my machine. Also, Sqlite may not be available, setting up MySQL might be out of the question, etc.
In this case, you can also use the file system:
<?php
$s = microtime(true); // more hack benchmarking
class FileCache
{
protected $basePath;
public function __construct($basePath)
{
$this->basePath = $basePath;
}
public function add($key, $value)
{
$path = $this->getPath($key);
file_put_contents($path, $value);
}
public function get($key)
{
$path = $this->getPath($key);
return file_get_contents($path);
}
public function getPath($key)
{
$split = 3;
$key = md5($key);
if (!is_writable($this->basePath)) {
throw new Exception("Base path '{$this->basePath}' was not writable");
}
$path = array();
for ($i=0; $i<$split; $i++) {
$path[] = $key[$i];
}
$dir = $this->basePath.'/'.implode('/', $path);
if (!file_exists($dir)) {
mkdir($dir, 0777, true);
}
return $dir.'/'.substr($key, $split);
}
}
$fc = new FileCache('/tmp/foo');
/*
// use this crap for generating a test example. it's slow to create though.
for ($i=0;$i<100000;$i++) {
$fc->add('key'.$i, 'value'.$i);
}
//*/
echo $fc->get('key1');
var_dump(microtime(true)-$s);
This one takes 0.0002 seconds for a lookup on my machine. This also has the benefit of being reasonably constant regardless of the cache size.
It depends on how frequently you would access your array; think of it this way: how many users can access it at the same time? There are many advantages to storing it in a database, and here you have two options: MySQL and SQLite.
SQLite works more like a text file with SQL support; you can save a few milliseconds during queries since it is located within reach of your application. Its main disadvantage is that it can add only one record at a time (same as a text file).
I would recommend SQLite for arrays with static content like GeoIP data, translations, etc.
MySQL is a more powerful solution, but it requires authentication and may be located on a separate machine.
PHP arrays will do everything you need. But shouldn't that much data be stored in a database?
http://php.net/array
I'm using PHP and MySQL and
I have a table with 3 fields (ID, Username, PID).
I want the PID field to contain strings of 8 unique characters.
My solution is to generate the random string in PHP and check if it exists. If it exists then it will generate another string.
Is there any better solution that will save processing time, like a MySQL trigger or something like that?
This will give you a random 8 character string:
substr(str_pad(dechex(mt_rand()), 8, '0', STR_PAD_LEFT), -8);
Found here: http://www.richardlord.net/blog/php-password-security
Or if the username field is unique you could also use:
substr(md5('username value'), 0, 8);
Though a collision is extremely unlikely, particularly for the MD5 version, neither case guarantees a unique string, so I would probably do something like this:
// Handle user registration or whatever...
function generatePID($sUsername) {
return substr(md5($sUsername), 0, 8);
}
$bUnique = false;
$iAttempts = 0;
while (!$bUnique && $iAttempts < 10) {
$aCheck = $oDB->findByPID(generatePID("username value")); // Query the database for a PID matching whats generated
if (!$aCheck) { // If nothing is found, exit the loop
$bUnique = true;
} else {
$iAttempts++;
}
}
// Save PID and such...
... which would probably only cost one 'check' query, maybe two in rare cases, and would ensure a unique string.
Do the characters need to be random? Or just unique? If they only need to be unique, you could use a timestamp. Basing the value on time will ensure uniqueness.
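As a rough sketch of the timestamp idea (with the caveat that two requests in the same microsecond could still collide, so it is not a hard guarantee):
// Sketch: derive an 8-character value from the current time
// uniqid() is based on the current time in microseconds
$pid = substr(uniqid(), -8); // e.g. a hex-looking string such as "0a1b2c3d"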
If you go another route, you'll have to check your generated value against the database until you end up with a unique value.
Why not do this the correct way and use UUIDs (aka GUIDs), which are always unique, with no need to check whether they are or not? A UUID may be 36 characters, but you get the benefit of storing it as hex, which saves disk space and increases speed over standard CHAR data.
You can read the comments on the PHP doc for functions that do this.
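As an illustrative sketch (one way of doing it, not the only one), you can build a version 4 UUID from random bytes and pack it into 16 bytes for a BINARY(16) column; note that random_bytes() requires PHP 7+:
// Sketch: random (version 4) UUID, plus a compact 16-byte form for storage
function uuid_v4() {
    $bytes = random_bytes(16);
    $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40); // set version 4
    $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80); // set RFC 4122 variant
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
}
$uuid   = uuid_v4();                            // 36-character text form
$packed = hex2bin(str_replace('-', '', $uuid)); // 16 bytes, fits a BINARY(16) column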
You can create an 8-character random string directly in MySQL like this:
CAST(MD5(RAND()) as CHAR(8))
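For instance, used inside the INSERT itself (a sketch only: MD5(RAND()) is not guaranteed to be unique, so a UNIQUE index on PID is still advisable; the connection, table and username are placeholders):
$link = new mysqli('localhost', 'user', 'password', 'mydb'); // placeholder credentials
$link->query("INSERT INTO `users` (`Username`, `PID`)
              VALUES ('someuser', CAST(MD5(RAND()) AS CHAR(8)))");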
My solution is to generate the random string in PHP and check if it exists. If it exists then it will generate another string.
This is the wrong way to do it. The web server will run multiple instances of your code concurrently, and sooner or later, two instances will store the same PID in your database.
The correct way to solve this problem is to make the PID column UNIQUE, and don't bother with any pre-checks. Just run the INSERT query, and check the result.
If the result is a 1062 (ER_DUP_ENTRY) error, generate a new PID and try again.
Any other database error should be dealt with like you normally would.
Perhaps something like this (untested):
<?php
/* $link = MySQLi connection */
if (!($stmt = mysqli_prepare ($link, 'INSERT INTO `t` (`ID`, `Username`, `PID`) VALUES (?, ?, ?)'))) {
/* Prepare error */
}
if (!mysqli_stmt_bind_param ($stmt, 'iss', $id, $user, $pid)) {
/* Bind error */
}
$e = 0;
for ($i = 0; $i < 10; $i++) {
$pid = /* generate random string */;
if (mysqli_stmt_execute ($stmt))
break; /* success */
$e = mysqli_stmt_errno ($stmt);
if ($e !== 1062)
break; /* other error */
}
mysqli_stmt_close ($stmt);
if ($e) {
if ($e === 1062) {
/* Failed to generate unique PID */
} else {
/* Other database error */
}
} else {
/* success */
}
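If the PID column isn't already declared UNIQUE, you would add the constraint once, up front. A sketch using the same $link connection and `t` table as above (the index name is arbitrary):
/* Sketch: let the database itself reject duplicate PIDs (run once) */
mysqli_query ($link, 'ALTER TABLE `t` ADD UNIQUE KEY `uniq_pid` (`PID`)');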
If you're set on 8 characters for the PID value then you'll need something to generate the string and check that it doesn't already exist.
$alphabet = range('A','Z');
// get all the PIDs from the database
$sql = "select PID from mytable";
// save those all to an array
$pid_array = results of query saved to array
shuffle($alphabet);
$pid_offer = implode('', array_slice($alphabet, 0, 8));
while (in_array($pid_offer, $pid_array)) {
    shuffle($alphabet);
    $pid_offer = implode('', array_slice($alphabet, 0, 8));
}
// found unique $pid_offer...
Race conditions still exist.
If the string doesn't need to be random, then use the ID value, which is probably an auto-increment integer, and start the count for that at 10000000.
Then just do a simple A=1, B=2, C=3, etc. replacement on the digits in that number to generate your string.
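A minimal sketch of that digit-to-letter substitution (the exact mapping and the starting offset are arbitrary here):
// Sketch: map each digit of the auto-increment ID to a letter (1=A, 2=B, ..., 0=J)
function idToPid($id) {
    return strtr((string) $id, '1234567890', 'ABCDEFGHIJ');
}
echo idToPid(10000000); // AJJJJJJJ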
Your mileage may vary.
--Mark