I need to regularly check many (hundreds of thousands of) rows and verify whether their current state matches the latest stored version in another database. Is there a way to get some sort of unique value for a row so I can match them, or would I have to compare the rows column by column?
The source database is a SQL Server 2005 database, and the table has no timestamp mechanism for create, update and/or delete actions. I've looked around for row-level metadata, but the only thing available is the pseudo column %%lockres%%, and that doesn't provide any date or time information.
I'm limited in my tools, but I have a webserver running Apache and PHP and direct access to the source and destination databases. I only have read permissions on the source database.
What would be the most efficient way to compare the data while maintaining performance on the source database?
It's simple. Just create a column in that table; name it anything you like. In my case I named it token.
Now, if you want this token to be generated automatically when a user registers, you can do it like this:
$token = bin2hex(random_bytes(20));
$sql_query = "INSERT INTO `table_name` (`token`) VALUES ('$token')";
Here, bin2hex() converts binary data to its hexadecimal representation, and random_bytes() generates cryptographically secure random bytes; its argument is the number of random bytes you want (so 20 bytes becomes 40 hex characters).
Or you can simply run this query against your table:
$token = bin2hex(random_bytes(20));
// Note: without a WHERE clause this sets the same token on every row
"UPDATE `table_name` SET `token`='$token'";
If this still doesn't resolve your question, feel free to ask me again and I will suggest another method to solve the problem.
Since you don't have write access to the source database, I'd suggest using an "alternative" database for storing the information. I can think of a few different approaches, each with different pros and cons.
Approach 1
Using hashes outside of the table for the modification checks would require querying all the data again every time, which makes it a very slow operation, so I'd use a separate table to store the hashes, where you could always first check whether the value matches and only then update it.
Basically, when inserting data, you calculate the local hash from that data and compare it against the helper database. If they don't match, you know the data is out of sync, so you update the data in the real database and save the new hash to the helper database (a sketch follows the pros and cons below).
Pros:
Only necessary updates to the real database
Cons:
Slower than using hash value in database
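As a rough sketch of Approach 1 in PHP, assuming a PDO connection $local to the helper database (MySQL or SQLite, hence REPLACE INTO), a helper table row_hashes(row_id, hash), and a hypothetical syncRowToDestination() function; all names are made up for illustration:

// Compute a hash over the source row's columns in a fixed order
$hash = md5(implode('|', [$row['col1'], $row['col2'], $row['col3']]));

// Compare against the hash stored for this row in the helper table
$stmt = $local->prepare("SELECT hash FROM row_hashes WHERE row_id = ?");
$stmt->execute([$row['id']]);
$stored = $stmt->fetchColumn();

if ($stored !== $hash) {
    // Row is new or changed: sync it, then remember the new hash
    syncRowToDestination($row);
    $upsert = $local->prepare("REPLACE INTO row_hashes (row_id, hash) VALUES (?, ?)");
    $upsert->execute([$row['id'], $hash]);
}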
Approach 2
Always update the record in the real database. This is the simplest solution, and unless you need to update thousands of records at the same time, and provided the remote database can handle the extra load, the performance impact shouldn't be that big. They are just simple update operations.
Pros:
Simple and easy to do
Cons:
Extra load to real database
Approach 3
Just get permission to modify the remote database. If you are going to maintain this for a long time, it may simply be the best option going forward.
Pros:
It will work fastest
Cons:
You need to get permission to modify the table.
While I say database, at its simplest it could be just a plain text file, an SQLite database, or anything else that lets you handle the local operations.
Related
I'm programming a browser game in which there is a spells table, an items table, etc. Each table has thousands of rows. What I do to handle this is the following.
Upon login, I store the entire database in the user's session. That includes only the tables that are not going to be changed by the user's input. For example, the spells table contains only information about the spells: how much damage they deal, what level the player needs to have that spell, etc. The user only reads that data, never writes to it.
Let's say that the user wants to buy a specific spell. I can't afford to have the PHP code check every array in the session variable for the spell ID. Instead:
<?php
// Load all database spells
$stmt = $db->prepare("SELECT * FROM spells");
$stmt->execute();
// FETCH_UNIQUE keys the result array by the first column (the spell ID)
$result = $stmt->fetchAll(\PDO::FETCH_UNIQUE|\PDO::FETCH_ASSOC);
$_SESSION["spells_db"] = $result;
?>
So, what happens is: I store all database spells in this session variable. Using \PDO::FETCH_UNIQUE|\PDO::FETCH_ASSOC, I make each spell's array key its spell ID. This way I already know the spell's key.
If I ever need to look up spell information by ID, the ID of the spell is also the key of that spell's row in the array. So instead of using in_array() and making PHP search every single row of the array to find which inner array contains the relevant spell ID, I can just tell it which row it is. This saves a lot of performance.
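For example, a lookup is then a direct array access instead of a search (using the $_SESSION["spells_db"] array from the snippet above; the damage column is just an assumption about the spells table):

$spellId = 42; // e.g. taken from validated user input
$spell = $_SESSION["spells_db"][$spellId] ?? null; // O(1) lookup by spell ID
if ($spell !== null) {
    echo $spell["damage"];
}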
But on the other side, each individual user stores the entire database in their session. In time this will cause my website to have scalability issues. I know it is better to store data in the session than to query the database every time to ask whether something has changed. In my case, when something gets changed, I first change it in the session, then I change it in the database. And every time a user refreshes the page, the session data is displayed. But when it comes to storing something as large as the entire database, my head starts to spin. So, any advice on how to deal with this? Thank you for your time.
I suggest you test it first using the database. I suppose it's MySQL. It can handle gigabytes of data and millions of rows in a table, fast. The important thing is indexing. Thousands of rows is not too much for MySQL (assuming you don't have huge rows with several varchar(5000) columns and such).
(Those keys you mentioned should probably be the indexes in your database table, and I have a gut feeling they are your auto-increment primary keys, so they will be selected fast.)
PHP session data must be stored somewhere too
If you left session storage at the default, then the data is stored in a file on disk. That means disk writes, and those are slower than any modern database (even on an SSD), because databases cache (in RAM) and optimize.
If you store sessions in RAM and you do have a lot of data, you will definitely run out of RAM.
If you store your session in the database... you know
KISS.
If you are updating both $_SESSION and the database table, that adds complexity and sluggishness, plus potential errors and potential consistency issues.
Assuming that you are fetching one spell from the spells table, that will take about 1ms. And you can have multiple queries running simultaneously.
I suggest you use the database heavily without $_SESSION, time your actions, and then decide which ones need speeding up. Then adding indexes, etc. might help. Or switching to $_SESSION might be warranted.
Don't get sucked into "premature optimization".
A bigger problem will occur if your game gets popular -- a single server will not suffice. But once you spread the game across multiple servers, $_SESSION becomes unusable -- it is limited to one server.
I am creating an application with a click to call button on an html page.
There will be one person manning the phone. I want this person to be able to set a variable with a boolean value on my server: 1 is available, 0 is unavailable.
I could create a single field SQL table but this feels like overkill, or I could read and write to a text file containing just one character.
What is the most correct way to store a single value?
I know it seems like overkill to use a small database table for this.
If your application already uses a database, this is by far the best way to proceed. Your database technology has all kinds of support for storing data so it doesn't get lost. But, don't stand up a database and organize your application to use it just for this one data point; a file will be easier in that case.
(WordPress does something similar; it uses a table called wp_options containing a lot of one-off settings values.)
I suggest your table contain two columns (or maybe more), agent_id and available. Then, if you happen to add another person taking telephone calls, your app will be ready to handle that growth. Your current person can have agent_id = 0.
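A minimal sketch of that table (the table name is my invention; the exact types are up to you):

CREATE TABLE agent_availability (
    agent_id  INT NOT NULL PRIMARY KEY,
    available TINYINT(1) NOT NULL DEFAULT 0  -- 1 = available, 0 = unavailable
);

-- the person manning the phone toggles the flag like this:
UPDATE agent_availability SET available = 1 WHERE agent_id = 0;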
If you have a DB set up, I'd use it.
That's what DBs are for: persisting changeable data. Otherwise you are basically writing your own separate DB system for the sake of one setting, which would be uberkill in my eyes!
There is value in consistency and flexibility. What if I suddenly need to store an expected return time? How do I do that in a text file? How do I differentiate the columns? How do I manipulate the data? MySQL already answers all these questions for you.
As a team member, I'd expect most of my dev colleagues (and new hires) to know how to use MySQL.. I wouldn't want them to have to work with, extend or debug a separate bespoke file persistence system that I had tacked on.
If you are worried about having lots of one row tables dotted about, you could use a single table for miscellaneous singular config variables which need updating regularly.
We have a table like this:
Table: `setting`
Columns: `key_string` VARCHAR, `value` VARCHAR
And you could store your variable as
['key_string' => 'telephone_service_available', 'value' => '1']
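Updating and reading the flag is then plain SQL, using the table above:

UPDATE setting SET value = '1' WHERE key_string = 'telephone_service_available';
SELECT value FROM setting WHERE key_string = 'telephone_service_available';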
In this specific case, a simple file check (does a file exist or not) is probably the simplest thing you can do here. It also has the benefit that you only check whether the file exists; you don't have to read the file's contents.
But if you ever need to store even one more piece of information, you have to take a completely different approach.
It depends on what you want to do with the information afterwards.
If you use it within a web application, store it in the session.
Or try a flat-file database like SQLite (no active DBMS needed). It's easy, and you can extend it very easily.
Or store just the binary information by creating a file: if the file is not there, the setting is off.
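A minimal sketch of that file-based flag in PHP (the file name is made up; any path the web user can write to will do):

$flag = __DIR__ . '/agent_available.flag';

// the agent toggles availability:
touch($flag);     // mark as available
// unlink($flag); // mark as unavailable

// the click-to-call page only checks for existence, no reading needed:
$available = file_exists($flag);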
I have a program that creates logs and these logs are used to calculate balances, trends, etc for each individual client. Currently, I store everything in separate MYSQL tables. I link all the logs to a specific client by joining the two tables. When I access a client, it pulls all the logs from the log_table and generates a report. The report varies depending on what filters are in place, mostly date and category specific.
My concern is the performance of my program as we accumulate more logs and clients. My intuition tells me to store the log information in the user_table in the form of a serialized array, so only one query is used for the entire session. I can then take that log array and filter it using PHP, whereas before it was filtered in a MySQL query (using multiple methods, such as BETWEEN for dates and other comparisons).
My question is, do you think performance would be improved if I used serialized arrays to store the logs as opposed to using a MYSQL table to store each individual log? We are estimating about 500-1000 logs per client, with around 50000 clients (and growing).
It sounds like you don't understand what makes databases powerful. It's not about "storing data", it's about "storing data in a way that can be indexed, optimized, and filtered". You don't store serialized arrays, because the database can't do anything with that. All it sees is a single string without any structure that it can meaningfully work with. Using it that way voids the entire reason to even use a database.
Instead, figure out the schema for your array data, and then insert your data properly, with one field per dedicated table column so that you can actually use the database as a database, allowing it to optimize its storage, retrieval, and database algebra (selecting, joining and filtering).
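For example, instead of a serialized blob in user_table, the logs could live in their own properly-typed table (column names here are only illustrative):

CREATE TABLE log (
    id        INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    client_id INT UNSIGNED NOT NULL,
    category  VARCHAR(50)   NOT NULL,
    amount    DECIMAL(12,2) NOT NULL,
    logged_at DATE          NOT NULL,
    INDEX idx_client_date (client_id, logged_at) -- makes date-filtered reports fast
);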
Are serialized arrays in a DB, filtered in native PHP, faster? No, of course not. You've forced the database to act as a flat file, with the extra DBMS overhead on top.
Is using the database properly faster than native PHP? Usually, yes, by a lot.
Plus, and this part is important, it means your database can live "anywhere", including on a faster machine next to your webserver, so that it can return results in 0.1s instead of PHP pegging the CPU at 100% to filter your data and preventing users of your website from getting page results because you've blocked all the threads. In fact, for that very reason it makes absolutely no sense to keep this task in PHP, even if you implement your schema and queries badly, forget to cache results and run subsequent searches inside those cached results, forget to index the tables on columns for extremely fast retrieval, etc.
PHP is not for doing all the heavy lifting. It should ask other things for the data it needs, and act as the glue between "a request comes in", "response base data is obtained" and "response is sent back to the client". It should start up, make the calls, generate the result, and die as fast as it can again.
It really depends on how you need to use the data. You might want to look into storing it with MongoDB if you don't need to search within that data. If you do, leave it in individual rows and create your indexes in a way that makes lookups fast.
If you have 10 billion rows, and need to look up 100 of them to do a calculation, it should still be fast if you have your indexes done right.
Now if you have 10 billion rows and you want to do a sum on 10,000 of them, it would probably be more efficient to save that total somewhere. Whenever a new row is added, removed or updated that would affect that total, you can change that total as well. Consider a bank, where all items in the ledger are stored in a table, but the balance is stored on the user account and is not calculated based on all the transactions every time the user wants to check his balance.
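A sketch of that bookkeeping in PHP, keeping the stored balance in step with the ledger inside one transaction (table and column names are made up):

$db->beginTransaction();
// add the ledger row
$db->prepare("INSERT INTO ledger (account_id, amount) VALUES (?, ?)")
   ->execute([$accountId, $amount]);
// adjust the cached total in the same transaction
$db->prepare("UPDATE account SET balance = balance + ? WHERE id = ?")
   ->execute([$amount, $accountId]);
$db->commit();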
I have a data-structure type question that I don't really know the answer to. Essentially, I have four permission controls (isSecret, canEdit, isActive and hasPage) that I need to store for a number of different tables.
I have two solutions in mind, but I'm not sure which is the best performance wise:
Store each permission as a separate column on each table. To me this seems to be the fastest way to access the data when querying, but because PHP will handle the permissions 90% of the time, it seems inefficient.
Have a single permissions column where the permission names (sec, edt, act, has) are stored as a comma-separated string. This gives me flexibility to introduce new or different permissions in the future, looks neat in my database, and is easy to use in both PHP and MySQL (I can use an IN lookup for queries, and explode the string to work with it as an array in PHP). This column would be a varchar of 40 characters, allowing me to store up to 10 different permissions (3 letters plus a comma each).
Option 2 was my preferred solution until I realised that the IN command might be resource intensive. I thought it might take a performance hit to run an IN command on every row in my table when trying to filter out inactive pages. To solve this, I could just fetch every single row and then filter them out with PHP, but again, I'm not sure how efficient this would be.
Ideally, I think sub-columns would be the best solution (a main permissions column with four sub-columns, one for each of my permissions) that could then be queried easily (i.e. WHERE permission.canEdit = 1).
Results will eventually be cached using memcache (when I am able to figure out how to use it and an effective method for clearing it), but I don't want to have to rely on this.
I think MySQL SETs would be what you need.
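A sketch of what that could look like, using the abbreviations from the question (the table name page is just an example):

CREATE TABLE page (
    id          INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    title       VARCHAR(255) NOT NULL,
    permissions SET('sec', 'edt', 'act', 'has') NOT NULL DEFAULT ''
);

-- filter out inactive pages without exploding strings in PHP:
SELECT id, title FROM page WHERE FIND_IN_SET('act', permissions);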
I'm trying to optimize my PHP and MySQL, but my understanding of SQL databases is shoddy at best. I'm creating a website (mostly for learning purposes) which allows users to make different kinds of posts (image/video/text/link).
Here are the basics of what I'm storing:
Auto - int (key index)
User ID - varchar
Post ID - varchar
Post Type - varchar (YouTube, Vimeo, image, text, link)
File Name - varchar (original image name or link title)
Source - varchar (external link or name of file + ext)
Title - varchar (post title picked by user)
Message - text (user's actual post)
Date - int (Unix timestamp)
I have other data relevant to the post stored in other tables, which I grab with the post ID (like user information), but I'm really doubting whether this is the way I should be storing information. I do use PDO, but I'm afraid this format might just be extremely slow.
Would there be any sense in storing the post information in another format? I don't want excessively large tables, so from a performance standpoint should I store some information as a blob/binary/xml/json?
I can't seem to find any good resources on PHP/MySQL optimization. Most information I come across tends to be 5-10 years old, content you have to pay for, too low-level, or just straight documentation which can't hold my attention for more than half an hour.
Databases are made to store data, and they are fast at retrieving it. Do not switch to anything else; stick with a database.
Try not to store pictures and videos in a database. Store them on disk and keep a reference to them in a database table (a sketch follows below).
Finally, catch up on database normalization, it will help you in getting your database in optimal condition.
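A sketch of that pattern in PHP (paths, names and the post table are made up): move the upload to disk and store only the file name:

$dir  = __DIR__ . '/uploads/';
$ext  = pathinfo($_FILES['image']['name'], PATHINFO_EXTENSION); // validate this!
$name = bin2hex(random_bytes(8)) . '.' . $ext; // generated, collision-safe name
move_uploaded_file($_FILES['image']['tmp_name'], $dir . $name);

// keep only the reference in the database
$stmt = $db->prepare("INSERT INTO post (user_id, file_name) VALUES (?, ?)");
$stmt->execute([$userId, $name]);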
What you have seems okay, but you have missed the important bit about indexes and keys.
Firstly, I am assuming that your primary key will be field 1. Okay, no problems there, but make sure that you also stick an index on userID, PostID, Date and probably a composite on UserID, Date.
Secondly, are you planning on having search functions on these? In that case you may need to enable full-text search.
Don't muck around trying to store data as JSON or other such things. Store it plain and simple. The last thing you want to be doing is extracting a field from the database just to see what is inside. If your database can't work it out, it is bad design.
On that note, there isn't anything wrong with large tables. As long as they are indexed nicely, a small table or large table will make very little difference in terms of accessing it (short of huge badly written SQL joins), so worry about simplicity to be able to get the data back from it.
Edit: A primary key is a lovely way to identify a row by a unique column of some sort. So, if you want to delete a row in your example, you might specify DELETE FROM yourTable WHERE ID=6, and you know that this will only delete one row, as only one row can have ID=6.
On the other hand, an index is different from a key, in that it is like a cheat sheet the database uses to know where certain information is inside the table. For example, if you have an index on the UserID column, when you pass a userID in a query, the database won't have to look through the entire table; it looks at the index and knows the location of all the rows for that user.
A composite index takes this one step further: if you know you will constantly query data on both UserID and ContentType, you can add a composite index (meaning an index on BOTH fields in one index), which allows the database to return only the data you specify in a query using both those columns, without having to sift through the entire table, or even through all of a user's posts to find the right content type.
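A sketch of adding those indexes, assuming the table is called post with columns matching the question:

ALTER TABLE post ADD INDEX idx_user (user_id);
ALTER TABLE post ADD INDEX idx_post (post_id);
ALTER TABLE post ADD INDEX idx_date (`date`);
-- composite index for queries that filter on both columns at once:
ALTER TABLE post ADD INDEX idx_user_date (user_id, `date`);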
Now, indexes take up some extra space on the server, so keep that in mind, but if your tables grow to be larger (which is perfectly fine) the improved efficiency is staggering.
At this time, stick with an RDBMS. Once you are comfortable with PHP and MySQL, there will be more to learn later, like NoSQL, MongoDB, etc., but as everything has its purpose, for your current needs this is quite right and will not slow you down. Your table schema seems right, with a few modifications.
User ID and Post ID should be integers, and since this table stores posts, Post ID should be auto-incremented and be the primary key.
Another thing: you are using two fields, file name and source. Note that the file name will be the name of the uploaded file, but if by source you mean the complete path of the file, then the DB is not the place to store complete paths. Generate the path with a PHP function every time you need to access it, rather than storing it in the DB; otherwise, if you ever need to change the path, it will be a lot of overhead.
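For example (the constant and helper are hypothetical), store only the file name and build the full path in PHP:

define('UPLOAD_BASE', '/var/www/uploads/'); // single place to change the path

function uploadPath(string $fileName): string {
    return UPLOAD_BASE . $fileName;
}

$src = uploadPath($row['file_name']); // e.g. for an <img src="..."> tag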
You also asked about BLOBs, etc. Note that it is better to store files in the file system, not in the DB; field types like BLOB are for when someone wants to store a file inside a DB table, which I don't recommend here.