I'm generating quite a substantial (~50 MB) SQLite3 database in memory that I need to write to disk once generation is complete. What is the best way of approaching this in PHP?
I have tried creating a structurally identical SQLite3 database on disk and then populating it with INSERTs, but it is far too slow. I have also drawn a blank looking through the online PHP SQLite3 docs.
What I have found is the SQLite3 Backup API, but I'm not sure how best to approach interfacing with it from PHP. Any ideas?
The backup API is not available in PHP.
If you wrap all INSERTs into a single transaction, the speed should be OK.
You could avoid the separate temporary database and make the on-disk database almost as fast by increasing the page cache size to be larger than 50 MB, disabling journaling, and disabling synchronous writes.
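For illustration, here is a minimal sketch of both suggestions, assuming PHP's built-in SQLite3 class; the file path, table, and data are placeholders:

    <?php
    // Open (or create) the on-disk database.
    $db = new SQLite3('/tmp/output.db');

    // Trade durability for speed during the bulk load (assumption: a crash
    // mid-generation is acceptable because the data can be regenerated).
    $db->exec('PRAGMA journal_mode = OFF');
    $db->exec('PRAGMA synchronous = OFF');
    $db->exec('PRAGMA cache_size = -51200'); // ~50 MB page cache (negative = KiB)

    $db->exec('CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, payload TEXT)');

    // One transaction around all INSERTs: one fsync instead of one per row.
    $rows = ['a', 'b', 'c']; // placeholder for your generated data
    $db->exec('BEGIN');
    $stmt = $db->prepare('INSERT INTO items (payload) VALUES (:payload)');
    foreach ($rows as $row) {
        $stmt->bindValue(':payload', $row, SQLITE3_TEXT);
        $stmt->execute();
        $stmt->reset();
    }
    $db->exec('COMMIT');
    $db->close();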
Related
I have a large database, about 100 GB, whose records need to be synced with a remote database: each record must be checked against the online DB and updated/inserted if it is missing. I have tried some methods to increase the speed, but the transfer is still far too slow. The methods I tried are:
A simple script that matches each record and uploads it to the online database, which is very, very slow.
Generated a MySQL dump, compressed it, and transferred it online, then checked and updated the records there. The dump was too big to transfer (it took far too long).
Kindly suggest other methods to transfer the DB.
Try MySQLDumper:
http://sourceforge.net/projects/mysqldumper/files/
It can take a MySQL dump and restore it; it's awesome, and its speed is good.
Is PHP able to handle SQLite data as an in-memory DB?
I have a <50 MB database and would like a PHP script to do SELECTs (and if possible also UPDATEs) against the SQLite file without slow disk reads or writes each time the script is run.
With Java and C++ I know great use cases for this, but how can I force PHP to access the SQLite data in memory without reloading the file again and again?
There are multiple ways to do it:
Do nothing, and let the OS cache the database in disk caches / memory buffers. This is good if you have a small database (and <50 MB is small), and if you have lots of memory.
Use a tmpfs and copy your database file in it, then open it in PHP.
Use sqlite::memory: (but you will start from a blank database).
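A minimal sketch of options 2 and 3, assuming PDO with the pdo_sqlite driver; the file paths are placeholders:

    <?php
    // Option 2: copy the database into a tmpfs mount (e.g. /dev/shm on Linux)
    // and open the copy; reads and writes then hit RAM, not disk.
    copy('/var/data/app.sqlite', '/dev/shm/app.sqlite');
    $db = new PDO('sqlite:/dev/shm/app.sqlite');

    // Option 3: a pure in-memory database; it starts empty and vanishes
    // when the connection is closed.
    $mem = new PDO('sqlite::memory:');
    $mem->exec('CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)');

Note that with the tmpfs approach any UPDATEs only touch the copy in RAM, so you would have to copy the file back to disk if the changes need to persist.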
An in-memory SQL database might be what you are looking for.
First, what I intend to do is to use memory to store the most recent "user update" records for each user.
I am new to MySQL. How can I create tables in memory?
The official website says we can set ENGINE = MEMORY when creating a table. But the documentation seems to claim that those in-memory tables are only for reading, not for writing.
I have simply no idea how to do that.
I have been stuck on this problem for a few days. I can't install memcache or any PHP extension on the server, as I'm not using a Virtual Private Server; all I can do is transfer scripts and files into the httpdocs folder. I have also tried using flat files as a buffer/cache, but I found that I cannot write/create files in the server's directory due to denied permission, and I am not allowed to change this permission.
Using MySQL to buffer may be the only choice left for me. Hope someone can give me some hints.
Thank you.
p.s. I am using Linux Apache server running PHP, with MySQL as DB.
ENGINE = MEMORY tables can be used for both reads and writes.
The only thing to be careful of is that all data in a memory table disappears when the server crashes, is turned off, rebooted, etc. (As you would expect for an in-memory table.)
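A minimal sketch, assuming PDO with MySQL and placeholder credentials, showing that a MEMORY table accepts writes like any other table (note that MEMORY tables do not support BLOB/TEXT columns):

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

    $pdo->exec('CREATE TABLE IF NOT EXISTS recent_updates (
        user_id INT NOT NULL,
        updated_at DATETIME NOT NULL,
        note VARCHAR(255),
        PRIMARY KEY (user_id)
    ) ENGINE = MEMORY');

    // Writes work normally; the data simply lives in RAM.
    $stmt = $pdo->prepare('REPLACE INTO recent_updates (user_id, updated_at, note)
                           VALUES (?, NOW(), ?)');
    $stmt->execute([42, 'profile changed']);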
You really should read carefully about the MEMORY engine of MySQL. The data is stored in RAM, so when the server is powered off or rebooted, the RAM is cleared and the data is wiped. A MEMORY table should be the fastest-access table type in MySQL, but it only stores temporary data, with no durability guarantee.
If I understood correctly, you are trying to make a static cache of some sort of data generated from PHP, aren't you? The easiest way is to write it as a solid file cache in your www directory, either HTML or JS. If you can't chmod your directory to be writable, then storing it in MySQL should be fine too, but only if that actually helps.
The idea of caching data is to reduce SQL queries, disk I/O, and code generation. But using a MEMORY table costs too much memory. Storing the cache in a normal MyISAM table should be fine too, and will save you a lot of background work.
However, there are two things to consider: 1) whether the cache exists when accessed; 2) whether the cache is up to date.
Giving your result some sort of key should be a good idea, so that PHP checks for the cached data first: if it doesn't exist, generate the cache and then display it; otherwise, display the cache directly.
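A minimal sketch of that check-then-generate pattern, assuming a hypothetical cache table with cache_key (primary key), content, and updated_at columns:

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

    function cachedFetch(PDO $pdo, string $key, callable $generate): string {
        // 1) Does a fresh cache entry exist?
        $stmt = $pdo->prepare(
            'SELECT content FROM cache
             WHERE cache_key = ? AND updated_at > NOW() - INTERVAL 5 MINUTE');
        $stmt->execute([$key]);
        $hit = $stmt->fetchColumn();
        if ($hit !== false) {
            return $hit;                    // fresh cache: display directly
        }

        // 2) Missing or stale: regenerate, store, then display.
        $content = $generate();
        $pdo->prepare('REPLACE INTO cache (cache_key, content, updated_at)
                       VALUES (?, ?, NOW())')
            ->execute([$key, $content]);
        return $content;
    }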
What is the advantage of a PHP data cache?
Where can I use it? Is it good only for browser searches, or is it also good for data export to CSV or text?
How can I achieve data caching using PHP?
There's a bunch of caches for PHP; for starters I would recommend APC: http://php.net/manual/de/book.apc.php. Since CSV and text are just strings, every cache backend, including APC, is perfectly fine for caching them.
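A minimal sketch with APC's store/fetch functions, assuming the APC extension is installed; the key and generator function are placeholders:

    <?php
    $key = 'report_csv';

    // Try the cache first; $success distinguishes a miss from a cached false.
    $csv = apc_fetch($key, $success);
    if (!$success) {
        $csv = build_report_csv();   // hypothetical expensive generator
        apc_store($key, $csv, 300);  // cache for 300 seconds
    }
    echo $csv;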
If you're generating data and you want to prevent computation of the same data over and over again, a good cache to look into is Memcached. Memcached is a piece of software that you run on your server. It stores anything you give it in key/value format in memory, and returns those values when you ask for them on subsequent requests. It isn't persistent (if your server goes down, everything is wiped out), though this can be useful for debugging or management purposes.
Companies like Digg and Facebook, which both rely heavily on PHP, use Memcached extensively to make sure their respective sites are fast.
Personally, I use Memcached to store things like URL routing information (40ms/request speed increase), feed caching (1-3sec/request speed increase), and social graph caching (300-400ms/request speed increase). Depending on what kind of computation you're performing, you can see various types of increases. Generally, for reasonably sized data sets (i.e.: a 1000+ line CSV file), you'll see pretty substantial increases in speed. Keep in mind, though, that Memcached uses RAM and not disk space for storage, so you can easily run out of memory if it is not properly configured. Placing Memcached on a separate server can help to alleviate this, especially on servers with PHP scripts that use a lot of memory.
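For example, here is a minimal sketch with the PECL Memcached extension; the server address, key, and routing function are placeholders:

    <?php
    $mc = new Memcached();
    $mc->addServer('127.0.0.1', 11211);

    $routes = $mc->get('url_routes');
    if ($routes === false) {                  // cache miss: recompute and store
        $routes = compute_url_routes();       // hypothetical expensive routine
        $mc->set('url_routes', $routes, 600); // expire after 10 minutes
    }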
Hope this helps!
I need a simple way for multiple running PHP scripts to share data.
Should I create a MySQL DB with a RAM storage engine, and share data via that (can multiple scripts connect to the same DB simultaneously?)
Or would flat files with one piece of data per line be better?
Flat files? Nooooooo...
Use a good DB engine (MySQL, SQLite, etc). Then, for maximum performance, use memcached to cache content.
In this way, you have the ease and reliability of sharing data between processes using proven server software that handles concurrency, etc... But you get the speed of having your data cached.
Keep in mind a couple things:
MySQL has a query cache. If you are issuing the same queries repeatedly, you can gain a lot of performance without adding a caching layer (see the snippet after this list).
MySQL is really fast anyway. Have you load-tested to demonstrate it is not fast enough?
Please don't use flat files, for the sanity of the maintainers.
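A quick way to check whether the query cache is enabled and being hit, assuming a MySQL version that still ships it (it was removed in MySQL 8.0) and placeholder credentials:

    <?php
    $pdo = new PDO('mysql:host=localhost;dbname=test', 'user', 'pass');

    // Is the cache enabled, and how big is it?
    foreach ($pdo->query("SHOW VARIABLES LIKE 'query_cache%'") as $row) {
        echo $row['Variable_name'], ' = ', $row['Value'], PHP_EOL;
    }

    // Hit/miss counters accumulate while the server runs.
    foreach ($pdo->query("SHOW STATUS LIKE 'Qcache%'") as $row) {
        echo $row['Variable_name'], ' = ', $row['Value'], PHP_EOL;
    }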
If you're just looking to have shared data, as fast as possible, and you can hold it all in RAM, then memcached is the perfect solution.
If you'd like persistence of data, then use a DBMS, like MySQL.
Generally, a DB is better, however, if you are sharing a small, mostly static amount of data, there might be performance benefits (and simplicity) of doing it with flat files.
For anything other than trivial data sharing, however, I would pick a DB.
1- Where the flat file can be useful:
Flat files can be faster than a database, but only in very specific applications.
They are faster if the data is read from start to finish without any searching or writing.
If the data doesn't fit in memory and needs to be read fully to get the job done, a flat file 'can' be faster than a database. Also, if there are many more writes than reads, flat files shine: most default database setups make read queries wait for writes to finish in order to maintain indexes and foreign keys, which usually makes write queries slower than simple reads.
TL;DR version:
Use flat files for job-based systems (e.g., simple log parsing), not for web search queries.
2- Flat file pitfalls:
If you're going with a flat file, you will need to synchronize your scripts when the file changes using a custom lock mechanism, which can lead to slowdowns, corruption, and even deadlock if you have a bug.
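A minimal sketch of that synchronization using PHP's flock(); the file path is a placeholder:

    <?php
    $fp = fopen('/var/data/shared.dat', 'c+'); // create if missing, read/write

    if (flock($fp, LOCK_EX)) {        // block until we hold the exclusive lock
        $data = stream_get_contents($fp);
        // ... modify $data ...
        ftruncate($fp, 0);
        rewind($fp);
        fwrite($fp, $data);
        fflush($fp);
        flock($fp, LOCK_UN);          // release so other scripts can proceed
    }
    fclose($fp);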
3- RAM-based database?
Most databases keep in-memory caches for query results and search indexes, which makes them very hard to beat with a flat file. Because they already cache in memory, making a database run entirely from memory is usually ineffective and dangerous; it is better to properly tune the database configuration.
If you're looking to optimize performance using RAM, I would first look at running your PHP scripts, HTML pages, and small images from a RAM drive, where the caching is more likely to be crude and to hit the hard drive systematically for static data that never changes.
Better results can be reached with a load balancer, or clustering with backplane connections, up to a RAM-based SAN array. But that's a whole other topic.
4- Can multiple scripts connect to the same DB simultaneously?
Yes, it's called connection pooling. In PHP (client side), the function to open a persistent connection is mysql_pconnect (http://php.net/manual/en/function.mysql-pconnect.php).
You can configure the maximum number of open connections in php.ini, I think. A similar setting on the MySQL server side defines the maximum number of concurrent client connections in /etc/mysql/my.cnf.
You must do this to take advantage of the CPU's parallel processing and to avoid PHP scripts waiting for each other's queries to finish. It greatly increases performance under heavy load.
There is also a connection pool/thread pool in the Apache configuration for regular web clients; see httpd.conf.
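mysql_pconnect belongs to the old mysql extension, which has since been removed from PHP; here is a minimal sketch of the same idea with mysqli, which opens a persistent connection via the 'p:' host prefix (credentials are placeholders):

    <?php
    // The 'p:' prefix asks mysqli to reuse an idle connection to this
    // host/user/db instead of opening a new one for each request.
    $db = new mysqli('p:localhost', 'user', 'pass', 'mydb');

    $result = $db->query('SELECT NOW() AS t');
    echo $result->fetch_assoc()['t'], PHP_EOL;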
Sorry for the wall of text, was bored.
Louis.
If you're running them on multiple servers, a filesystem-based approach will not cut it (unless you've got a consistent shared filesystem, which is unlikely and may not be scalable).
Therefore you'll need a server-based database anyway to allow the sharing of data between web servers. If you're serious about either performance or availability, your application will support multiple web servers.
I would say that the MySQL DB would be the better choice unless you have some mechanism in place to deal with locks on the flat files (and some way to control access). In this case the DB layer (regardless of the specific DBMS) is acting as an indirection layer, letting you not worry about it.
Since the OP doesn't specify a web server (and PHP actually can run from a commandline) then I'm not certain that the caching technologies are what they're after here. The OP could be looking to do some sort of flying data transform that isn't website driven. Who knows.
If your system has a PHP cache (that caches compiled PHP code in memory, like APC), try putting your data into a PHP file, as PHP code. If you have to write data, there are some security issues.
I need a simple way for multiple running PHP scripts to share data.
APC and memcached are both good options, depending on context. Shared memory may also be an option.
Should I create a MySQL DB with a RAM storage engine, and share data via that (can multiple scripts connect to the same DB simultaneously?)
That's also a decent option, but will probably not be as fast as APC or memcached.
Or would flat files with one piece of data per line be better?
If this is read-only data, that's a possibility -- but may be slower than any of the options above. Especially if the data is large. Rather than writing custom parsing code, however, consider simply building a PHP array, and include() the file.
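A minimal sketch of the include() approach, with a hypothetical file name; an opcode cache such as APC will keep the compiled array in memory across requests:

    <?php
    // Writer side: dump the data as PHP source.
    $data = ['alice' => 3, 'bob' => 7];
    file_put_contents('/tmp/shared_data.php',
        '<?php return ' . var_export($data, true) . ';');

    // Reader side: include() returns the array; with an opcode cache the
    // compiled file is served from memory on subsequent requests.
    $shared = include '/tmp/shared_data.php';
    echo $shared['bob'], PHP_EOL;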
If this is a datastore that may be accessed by several writers simultaneously, by all means do NOT use a flat file! Writing to a flat file from multiple processes is likely to lead to file corruption. You can lock the file, but you risk lock contention issues, and long lock wait times.
Handling concurrent writes is the reason applications like mysql and memcached exist.