I'm about to implement a memcached class which can be extended by our database class; however, I have looked at many different ways of doing this.
My first question: what is the point of Memcached::set(), as it seems to replace the value of the key? Does this not defeat the object of caching your results?
My second question: technically speaking, what is the fastest/best way to update the value of a key without checking the results every time the query is executed? Otherwise the retrieved results would have to be checked constantly, there would be no point in caching, and every request would still end up hitting the MySQL database.
Lastly, what is the best way of creating a key? Most people recommend MD5, but the MD5 of a PDO query string is the same regardless of the bound parameters.
For example:
$key = MD5("SELECT * FROM test WHERE category=?");
The category could produce many different result sets, yet the key would always be the same and its value would constantly be replaced. Is there a best practice for this?
You set a cache entry when you have to read the database, so that next time you don't have to read the database first. You check the cache; if the value is not there, or is otherwise out of date, you fall back to the database read and set the key again.
As for a key name, it depends very much on the expected values of the category. If it were a simple integer or string, I'd use a key like test.category:99 or test.category:car. If it were likely to be more than that, it may be useful to encode it so there are no spaces in it (say, with urlencode()).
Finally, if it were any more complex than that - test:category:{MD5(category)}.
Since the key is only a reference to the data and you'll never be using it in any kind of SQL query, putting the value in there is not generally going to be a security issue.
Since you control when the cache is set, if the underlying database entry is changed, it's simple to also update the cache with the new data at the same time - you just have to use the same key.
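Putting that together, here's a minimal sketch of the read-through pattern, assuming the PHP Memcached extension, a local memcached server, and the test table from the question (the TTL is arbitrary):

$mc = new Memcached();
$mc->addServer('127.0.0.1', 11211);

function getByCategory(PDO $pdo, Memcached $mc, $category)
{
    // The key embeds the bound value, so each category gets its own entry
    $key = 'test.category:' . rawurlencode($category);

    $rows = $mc->get($key);
    if ($mc->getResultCode() === Memcached::RES_SUCCESS) {
        return $rows; // cache hit: no database read
    }

    // Cache miss: fall back to the database, then prime the cache
    $stmt = $pdo->prepare('SELECT * FROM test WHERE category = ?');
    $stmt->execute([$category]);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

    $mc->set($key, $rows, 300); // expire after five minutes
    return $rows;
}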
I need to regularly check many (hundreds of thousands of) rows and verify that their current state matches the latest stored version in another database. Is there a way to get some sort of unique value for a row to match them on, or would I have to check the rows column by column manually?
The source database is a SQL Server 2005 database and the table doesn't have a timestamp mechanism for create, update, and/or delete actions. I've looked around to check whether there is any row metadata available, but the only thing I found is the pseudo-column %%lockres%%, which doesn't provide date or time information.
I'm limited in my tools, but I have a webserver running Apache and PHP and direct access to the source and destination databases. I only have read permissions on the source database.
What would be the most efficient way to compare the data while maintaining performance on the source database?
It's simple. Just create a column in that table and name it anything; in my case I called it token.
If you want the token generated automatically when a user registers, you can use this:
$token = bin2hex(random_bytes(20));
$sql_query = "INSERT INTO `table_name` (`token`) VALUES ('$token')";
Here bin2hex() converts binary to hexadecimal, and random_bytes() generates cryptographically secure random bytes; the argument is the number of bytes you want (20 bytes gives 40 hex characters).
Or you can simply run this query to backfill the existing rows:
$token = bin2hex(random_bytes(20));
"UPDATE `table_name` SET `token`='$token'";
If this still doesn't resolve your problem, feel free to ask again and I'll suggest another method.
Since you only have read access to the source database, I'd suggest using an "alternative" database to store the information. I can think of a few different approaches, each with different pros and cons.
Approach 1
Computing hashes for the modification checks without storing them anywhere would mean querying all the data again every time, which is very slow. Instead, I'd use a separate table to store the hashes, so you can first check whether the stored hash matches and only then update.
Basically, when inserting data, you calculate the hash locally from that data and compare it to the helper database. If they don't match, you know the data is out of sync, so you can update the real database and save the new hash to the helper database (see the sketch after this list).
Pros:
Only necessary updates to the real database
Cons:
Slower than storing the hash value in the real database
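A minimal sketch of Approach 1, assuming an SQLite file as the helper store (the table/column names and the row-ID scheme are made up):

$local = new PDO('sqlite:/path/to/hashes.db');
$local->exec('CREATE TABLE IF NOT EXISTS row_hashes (row_id TEXT PRIMARY KEY, hash TEXT)');

// Returns true when the row is out of sync; the caller then updates
// the real database (in production, record the hash only after that
// update succeeds)
function rowChanged(PDO $local, $rowId, array $row)
{
    $hash = md5(implode('|', $row));

    $stmt = $local->prepare('SELECT hash FROM row_hashes WHERE row_id = ?');
    $stmt->execute([$rowId]);

    if ($stmt->fetchColumn() === $hash) {
        return false; // in sync, skip the real update
    }

    $local->prepare('REPLACE INTO row_hashes (row_id, hash) VALUES (?, ?)')
          ->execute([$rowId, $hash]);
    return true;
}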
Approach 2
Always update the record in the real database. This is the simplest solution, and as long as you don't need to update thousands of records at the same time and the remote database can handle the extra load, the performance impact shouldn't be that big. It's just simple update operations.
Pros:
Simple and easy to do
Cons:
Extra load to real database
Approach 3
Just get permission to modify the remote database. If you are going to maintain this for a long time, it may well be the best option in the long run.
Pros:
It will work fastest
Cons:
You need to get permission to modify the table.
While I say database, at its simplest it could be just a plain text file, an SQLite database, or anything else that lets you handle the local operations.
I have a foreign database that I need to replicate in-house. The DB is hosted by an uncooperative vendor, so I have little control over it. I do have the ability to write reports that I call via an API to bring the data down as I move it into a local Oracle DB. However, several of the tables do NOT have unique fields, I can't add one, and in fact, after some investigation, it appears the only way to get a unique field is to concatenate the entire row, which is not desirable for obvious reasons. I can, however, execute a PHP script on the report as it's called, so my idea was to create an MD5 hash of the entire row and store that at the beginning of the table as the key.
I realize that it's possible to have collisions, but they should be very rare. Given the pitfalls of not having a key, I'd prefer losing a record every few million rows to having thousands of duplicates should we ever have to reflow the DB. Are there any other pitfalls of this method I should be worried about? I'm trying to make the best of a bad situation; I realize this isn't a perfect solution. I just want to make sure I give my boss the best information regarding the pitfalls so we're not surprised down the road. And yes, the vendor's getting dumped; that's why we're replicating the DB.
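One note on the concatenation itself: hashing the raw concatenated row lets ('ab', 'c') and ('a', 'bc') collapse to the same key, so it's worth joining columns with an explicit delimiter first. A minimal sketch (the column handling is an assumption; adapt to your report format):

// Use a delimiter that is unlikely to appear in the data, so that
// ('ab', 'c') and ('a', 'bc') produce different hashes
function rowKey(array $row)
{
    return md5(implode("\x1F", array_map('strval', $row)));
}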
I have just been tasked with recovering/rebuilding an extremely large and complex website that had no backups and was fully lost. I have a complete (hopefully) copy of all the PHP files however I have absolutely no clue what the database structure looked like (other than it is certainly at least 50 or so tables...so fairly complex). All data has been lost and the original developer was fired about a year ago in a fiery feud (so I am told). I have been a PHP developer for quite a while and am plenty comfortable trying to sort through everything and get the application/site back up and running...but the lack of a database will be a huge struggle. So...is there any way to simulate a MySQL connection to some software that will capture all incoming queries and attempt to use the requested field and table names to rebuild the structure?
It seems to me that if I start clicking through the application and it issues a query like
SELECT name, email, phone FROM contact_table WHERE contact_id='1'
...there should be a way to capture that info and assume there was a table called "contact_table" that had at least 4 fields with those names... If I can do that repetitively, each time adding some sample data to the discovered fields and then moving on to another page, then eventually I should have a rough copy of most of the database structure (at least all public-facing parts). This would be MUCH easier than manually reading all the code and pulling out every reference, reading all the joins and subqueries, and sorting through it all manually.
Anyone ever tried this before? Any other ideas for reverse-engineering the database structure from PHP code?
mysql> SET GLOBAL general_log=1;
With this configuration enabled, the MySQL server writes every query to a log file (datadir/hostname.log by default), even those queries that have errors because the tables and columns don't exist yet.
http://dev.mysql.com/doc/refman/5.6/en/query-log.html says:
The general query log can be very useful when you suspect an error in a client and want to know exactly what the client sent to mysqld.
As you click around in the application, it should generate SQL queries, and you can have a terminal window open running tail -f on the general query log. As you see queries run that reference tables or columns that don't exist yet, create those tables and columns. Then repeat clicking around in the app.
A number of things may make this task even harder:
If the queries use SELECT *, you can't infer the names of columns or even how many columns there are. You'll have to inspect the application code to see what column names are used after the query result is returned.
If INSERT statements omit the list of column names, you can't know what columns there are or how many. On the other hand, if INSERT statements do specify a list of column names, you can't know if there are more columns that were intended to take on their default values.
Data types of columns won't be apparent from their names, nor string lengths, nor character sets, nor default values.
Constraints, indexes, primary keys, foreign keys won't be apparent from the queries.
Some tables may exist (for example, lookup tables), even though they are never mentioned by name by the queries you find in the app.
Speaking of lookup tables, many databases have sets of initial values stored in tables, such as all possible user types and so on. Without the knowledge of the data for such lookup tables, it'll be hard or impossible to get the app working.
There may have been triggers and stored procedures. Procedures may be referenced by CALL statements in the app, but you can't guess what the code inside triggers or stored procedures was intended to be.
This project is bound to be very laborious, time-consuming, and involve a lot of guesswork. The fact that the employer had a big feud with the developer might be a warning flag. Be careful to set the expectations so the employer understands it will take a lot of work to do this.
PS: I'm assuming you are using a recent version of MySQL, such as 5.1 or later. If you use MySQL 5.0 or earlier, you should just add log=1 to your /etc/my.cnf and restart mysqld.
Crazy task. Is the code such that the DB queries are at all abstracted? Could you replace the query functions with something which would log the tables, columns and keys, and/or actually create the tables or alter them as needed, before firing off the real query?
Alternatively, it might be easier to do some text processing, regex matching, grep/sort/uniq on the queries in all of the PHP files. The goal would be to get it down to a manageable list of all tables and columns in those tables.
I once had a similar task, fortunately I was able to find an old backup.
If you could find a way to extract the queries (say, regex-match all occurrences of mysql_query or whatever extension was used to query the database), you could then use something like php-sql-parser to parse the queries, and hopefully from that you would be able to get a list of most tables and columns. However, that is only half the battle. The other half is determining the data types for every single column, and that would be rather impossible to do automatically from PHP. It would basically require you to inspect it line by line. There are best practices, but who's to say that the old dev followed them? Determining whether a column called "date" should be stored as DATE, DATETIME, INT, or VARCHAR(50) with some sort of ugly manual string handling can only be done by looking at the actual code.
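A rough sketch of that extraction step (the path is a placeholder, and the pattern is deliberately naive: it misses queries built up in variables or by concatenation):

$queries = [];
foreach (glob('/path/to/site/*.php') as $file) {
    $src = file_get_contents($file);
    // Grab string literals passed directly to mysql_query()
    if (preg_match_all('/mysql_query\s*\(\s*["\'](.+?)["\']/s', $src, $m)) {
        foreach ($m[1] as $q) {
            $queries[] = $q;
        }
    }
}
print_r(array_unique($queries));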
Good luck!
You could build some triggers with the BEFORE action time, but unfortunately this will only work for INSERT, UPDATE, or DELETE commands.
http://dev.mysql.com/doc/refman/5.0/en/create-trigger.html
I have a MySQL table with about 9.5K rows; these won't change much, but I may slowly add to them.
I have a process where, if someone scans a barcode, I have to check whether that barcode matches a value in this table. What would be the fastest way to accomplish this? I should mention there is no pattern to these values.
Here are some thoughts:
AJAX call to a PHP file that queries the MySQL table (my guess is this would be the slowest).
Load the MySQL table into a PHP array at login. Then, when scanning, make an AJAX call to a PHP file that checks the array.
Load the table into an array at login. When viewing the scanning page, somehow load that array into a JavaScript array and check with JavaScript. (This seems to me to be the fastest because it eliminates both the AJAX call and the MySQL query. Would it be efficient to split it into smaller arrays so I don't lag the server and browser?)
Honestly, I'd never load the entire table for anything. All I'd do is make an AJAX request back to a PHP gateway that then queries the database, and returns the result (or nothing). It can be very fast (as it only depends on the latency) and you can cache that result heavily (via memcached, or something like it).
There's really no reason to ever load the entire array for "validation"...
It's much faster to use a well-indexed MySQL table than to look through an array for something.
But in the end it all depends on what you really want to do with the data.
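A minimal sketch of such a gateway (the DSN, table, and column names are assumptions):

// barcode_check.php -- answers whether a scanned barcode exists
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');

$barcode = $_GET['barcode'] ?? '';

$stmt = $pdo->prepare('SELECT 1 FROM barcodes WHERE barcode = ? LIMIT 1');
$stmt->execute([$barcode]);

header('Content-Type: application/json');
echo json_encode(['valid' => (bool) $stmt->fetchColumn()]);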
As you mention, your table contains around 9.5K rows. There is no reason to load all that data at login or on the scanning page.
Better to index your table and make an AJAX call whenever required.
Best of Luck!!
While 9.5K rows are not that many, the corresponding amount of data would still need some time to transfer.
Therefore, and in general, I'd propose running the validation of values on the server side. AJAX is the right technology to do this quite easily.
Loading all 9.5K rows only to find one specific row is definitely a waste of resources. Run a SELECT query for the single value instead.
Exposing PHP functionality on the client side / AJAX
Have a look at the xajax project, which allows you to expose whole PHP classes or single methods as AJAX methods on the client side. Moreover, xajax helps with the exchange of parameters between client and server.
Indexing the searched attributes
Please ensure that the column holding the barcode value is indexed. If the verification process tends to be slow, look out for MySQL table scans.
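Adding the index is a one-time statement; a sketch with assumed table/column names (run EXPLAIN on your SELECT afterwards to confirm the index is actually used):

$pdo->exec('ALTER TABLE barcodes ADD INDEX idx_barcode (barcode)');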
Avoiding table scans
To avoid table scans and keep your queries fast, use fixed-size fields. VARCHAR, among other types, makes queries slower, since rows no longer have a fixed size, and without fixed-size rows the database cannot easily predict the location of the next row in the result set. Therefore, use e.g. CHAR(20) instead of VARCHAR(20).
Finally: Security!
Don't forget that any data transferred to the client side may expose sensitive data. While your 9.5K rows may not get rendered by the client's browser, the rows do exist in the generated HTML page. Using "view source", any user would be able to figure out all valid numbers.
Exposing valid barcode values may or may not be a security problem in your project context.
PS: While not related to your question, I'd propose using PHPExcel for reading and writing spreadsheet data. Unlike other solutions, e.g. PEAR-based frameworks, PHPExcel has no external dependencies.
I just saw the first comment to this question, Inserting into a serialized array in PHP, and it made me wonder: why? Especially since, when you use database-managed sessions (database-based session handling), that is exactly what happens: the session handler inserts a serialized array into a database field.
There's nothing wrong with this in certain contexts. Session management is definitely one of those instances where it would be deemed acceptable. The thing to remember is that if you ever find yourself trying to relate the serialized data to other fields in your database, you've made a huge design flaw, and unfortunately that is something I have seen people try to do.
Take any "never do x" with a grain of salt as almost any technique can be the correct one in certain circumstances. The advice is usually directed towards noobies who are very apt to misunderstand proper usage and code themselves into a very nasty corner.
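For context, the session case looks roughly like this; a minimal sketch of a database-backed handler (the sessions table, its columns, and the $pdo connection are assumptions), where the payload column holds exactly the kind of serialized blob the question asks about:

class DbSessionHandler implements SessionHandlerInterface
{
    private $pdo;

    public function __construct(PDO $pdo) { $this->pdo = $pdo; }

    public function open($path, $name): bool { return true; }
    public function close(): bool { return true; }

    public function read($id): string
    {
        $stmt = $this->pdo->prepare('SELECT payload FROM sessions WHERE id = ?');
        $stmt->execute([$id]);
        $payload = $stmt->fetchColumn();
        return $payload === false ? '' : $payload; // '' means a new session
    }

    public function write($id, $data): bool
    {
        // $data is PHP's serialized session array, stored as an opaque blob
        $stmt = $this->pdo->prepare(
            'REPLACE INTO sessions (id, payload, updated_at) VALUES (?, ?, ?)'
        );
        return $stmt->execute([$id, $data, time()]);
    }

    public function destroy($id): bool
    {
        return $this->pdo->prepare('DELETE FROM sessions WHERE id = ?')->execute([$id]);
    }

    public function gc($max_lifetime): int
    {
        $stmt = $this->pdo->prepare('DELETE FROM sessions WHERE updated_at < ?');
        $stmt->execute([time() - $max_lifetime]);
        return $stmt->rowCount();
    }
}

session_set_save_handler(new DbSessionHandler($pdo), true);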
How certain are you that you'll never want to get at that data from any platform other than PHP?
I don't know about PHP's form of serialization, but the default binary serialization format from every platform I do know about is not interoperable with other platforms... and in general it's not a good idea to put data encoded for just a single frontend into a database.
Even if you don't end up using any other languages, it means the database itself isn't going to know anything about the information - so you won't be able to query on it etc. Maybe that's not a problem in your case - but it's definitely something to bear in mind.
The main argument against serialized data is that it is hard to search through, and impossible to search efficiently, i.e., without retrieving the records in the first place.
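To make that concrete, here's what a serialized PHP array looks like once it's sitting in a text column, and the kind of fragile pattern match you'd need to "query" it (the users table and prefs column are made up):

$data = serialize(['role' => 'admin', 'age' => 42]);
// stored value: a:2:{s:4:"role";s:5:"admin";s:3:"age";i:42;}

// The only way to find all admins is a full-scan LIKE, which breaks
// as soon as the string length or key order changes:
// SELECT * FROM users WHERE prefs LIKE '%s:5:"admin"%'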
It depends on the data. By storing a language-specific data structure in a field, you're tied to that language, and you're also giving up everything the DB can give you: you won't have indexes on specific fields, can't run simple updates, can't extract partial data, and can't have data checks, referential integrity, and so on.