In CodeIgniter I am looking for a way to do some post processing on queries on a specific table/model. I can think of a number of ways of doing this, but I can't figure out any particularly nice way that would work well in the long run.
So what I am trying to do is something like this:
I have a table with a serial number column which is stored as an int (so it can be used as AI and PK, which might or might not be a great idea, but that's how it is right now anyway). In all circumstances where this serial number is used (in views, search queries, the real world etc.) it is used with a three-letter prefix. I can add this in the view or wherever needed, but my question is more about what would be the best design choice. Is there a good way to add a column ('ABC' + serial) after queries so that it is mostly transparent to the rest of the application? Perhaps something similar to CakePHP's afterFind() hook?
You can do that in the query itself:
SELECT CONCAT(prefix, serial_number) AS prefixed FROM table_name
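If you want this to be transparent to the rest of the application, one option is to put the concatenation in a database view, so every query against the view gets the prefixed column for free. Here is a minimal sketch of that idea, assuming a fixed prefix 'ABC' and hypothetical names (items, items_prefixed, prefixed_serial). It uses Python's built-in sqlite3 so it can be run as-is; SQLite concatenates with || where MySQL would use CONCAT().

```python
import sqlite3

PREFIX = "ABC"  # assumption: one fixed three-letter prefix for the whole table

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (serial INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)
conn.executemany("INSERT INTO items (name) VALUES (?)", [("widget",), ("gadget",)])

# The view exposes the stored int plus a derived, prefixed text column.
# In MySQL the expression would be CONCAT('ABC', serial) instead of ||.
conn.execute(
    f"CREATE VIEW items_prefixed AS "
    f"SELECT serial, '{PREFIX}' || serial AS prefixed_serial, name FROM items"
)

rows = conn.execute(
    "SELECT prefixed_serial, name FROM items_prefixed ORDER BY serial"
).fetchall()
print(rows)
```

The model can then read from the view and keep writing to the underlying table, so the int stays the real key.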
Basically, I have tons of files with some data. Each one differs, some lack some variables (null), etc.; classic stuff.
The part where it gets somewhat interesting is that each file can have up to 1000 variables and has at least ~800 values that are not null, so I thought: "Hey, I need 1000 columns". Another thing to mention is that they are integers, bools, text, everything. They differ in size and type. Each variable is under 100 bytes in all files, although they vary.
I found this question: "Work around SQL Server maximum columns limit 1024 and 8kb record size".
I'm unfamiliar with the capacities of SQL servers and with table design. The people who answered that question say the asker should reconsider the design, but I can't do that. I can, however, convert what I already have, as long as I still have those 1000 variables.
I'm willing to use any SQL server, but I don't know what suits my requirements best. If doing something else is better, please say so.
What I need to do with this data is look at it, compare it, and search within it. I don't need the ability to modify it. I thought of just keeping the files as plain text and reading from them, but that requires seconds of PHP runtime to view data from just a few of these files, and that is too much, even before considering that I need to check about 1000 or more of these files to do any search.
So the question is: what is the fastest way of having 1000+ entities with 1000 variables each, and searching/comparing any variable I wish within them? And if it's SQL, which SQL server works best for this sort of thing?
Sounds like you need a different kind of database for what you're doing. Consider a document database, such as MongoDB, or one of the other not-only-SQL database flavors that allows for manipulation of data in different ways than a traditional table structure.
I just saw the note mentioning that you're only reading as well. I've had good luck with Solr on a similar dataset.
You want to use an EAV (entity-attribute-value) model. This is pretty common.
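To make the EAV idea concrete, here is a minimal sketch: one row per entity (file) and one row per non-null variable. All names (entity, entity_value, store, find) are made up for illustration, values are stored as text for simplicity, and sqlite3 is used only so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    entity_id   INTEGER PRIMARY KEY,
    source_name TEXT NOT NULL
);
CREATE TABLE entity_value (
    entity_id  INTEGER NOT NULL REFERENCES entity(entity_id),
    attr_name  TEXT NOT NULL,
    attr_value TEXT,
    PRIMARY KEY (entity_id, attr_name)
);
-- Index so "which entities have attribute X = Y" does not scan every row.
CREATE INDEX idx_attr ON entity_value (attr_name, attr_value);
""")


def store(conn, source_name, variables):
    """Insert one entity and one row per non-null variable."""
    cur = conn.execute("INSERT INTO entity (source_name) VALUES (?)", (source_name,))
    entity_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO entity_value VALUES (?, ?, ?)",
        [(entity_id, k, str(v)) for k, v in variables.items() if v is not None],
    )
    return entity_id


def find(conn, name, value):
    """Return the source names of all entities where attribute name == value."""
    rows = conn.execute(
        "SELECT e.source_name FROM entity e "
        "JOIN entity_value v ON v.entity_id = e.entity_id "
        "WHERE v.attr_name = ? AND v.attr_value = ? "
        "ORDER BY e.source_name",
        (name, str(value)),
    )
    return [r[0] for r in rows]


store(conn, "a.txt", {"cpu": "x86", "ram_gb": 8, "ssd": True})
store(conn, "b.txt", {"cpu": "arm", "ram_gb": 8, "ssd": None})  # null is simply not stored
matches = find(conn, "ram_gb", 8)
print(matches)
```

Missing variables cost nothing in this layout, which is the main reason it copes with "up to 1000 variables, some null" better than a 1000-column table.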
You are asking for the best way. I can give an answer (how I solved it), but I can't say whether it is the 'best' way in your environment. I had the problem of collecting inventory data from many thousands of PCs (no, not NSA, just kidding).
My solution was:
One table per PC (File for you?)
Table File:
one row per file, PK FILE_ID
Table File_data
one row per column in file, PK FILE_ID, ATTR_ID, ATTR_NAME, ATTR_VALUE, (ATTR_TYPE)
The File_data table was fairly big (>1e6 rows), but the DB handled it quickly.
HTH
EDIT:
I was pretty brief in my answer, so I want to add some information about my (still working) solution:
The 'per info source' table has more than the two fields PK and FILE_ID, namely ISOURCE and ITYPE, which describe where the data came from (I had many sources) and what basic type of information it is. This helps to give queries some structure. I did not need to include data from 'switches' or 'monitors' when searching for USB devices (edit: today I probably would).
The attributes table had more fields, too. I will mention just two: ISOURCE and ITYPE. Yes, the same names as above, with a slightly different meaning but the same idea behind them.
What you would have to put into these fields definitely depends on your data.
I am sure that if you take a closer look at what information you have to collect, you will find some 'key values' for that.
For storage, XML is probably the best way to go. There is really good support for XML in SQL.
For queries, if they are direct SQL queries, 1000+ rows isn't a lot and XML will be plenty fast. If you're moving towards a million+ rows, you're probably going to want to take the data that is most selective out of the XML and index that separately.
Link: http://technet.microsoft.com/en-us/library/hh403385.aspx
So, the context is: I have a site in which many pages may need the information about one table, say for instance, 'films'. This table has many fields, like title, language, year, description, director... And perhaps in one page I need only the title and the id of some rows and in another I also need the description.
So the question is: should I code a database manager (I am using MySQL) that retrieves all the fields of the rows that satisfy a condition (I guess the WHERE clause should be passed as a parameter)? Or should I be able to specify which fields are needed? I think this cannot be done easily with mysqli (because prepared statements require specifying the number of fetched fields beforehand), so for this to work I would need to use PDO instead, which I haven't used yet. Is this last approach worth it? Or is there not really a big difference in performance if I retrieve the whole information about those rows?
Thank you in advance.
Based upon the comments above, my answer to your question(s) is:
Retrieving some fields vs all fields isn't a real performance consideration until you are dealing with one or more CLOB/TEXT columns which have a lot of text in them. Good database practice indicates you should always specify which fields are returned from a query.
Any query against any table should have a where clause to restrict the number of rows returned. Especially if you are looking to query exactly one row.
Your question implies you are writing a wrapper layer around the queries to hide this complexity. Don't do this. Get an existing PHP library that does this work for you. See for example: "Good PHP ORM Library?". There are a number of subtle issues, like security, which you will overlook.
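To illustrate just the first two points (name the columns you want, and always restrict rows with a bound parameter), here is a rough sketch. It is not meant as a replacement for an existing library. The films table comes from the question; the fetch_films helper and the whitelist are hypothetical, and sqlite3 stands in for MySQL. Column names cannot be bound as parameters, so they are checked against a fixed whitelist before being put into the SQL text.

```python
import sqlite3

# Assumption: the set of selectable columns is known up front.
ALLOWED_COLUMNS = {"id", "title", "language", "year", "description", "director"}


def fetch_films(conn, columns, language):
    """Select only the requested columns for films in one language."""
    unknown = set(columns) - ALLOWED_COLUMNS
    if not columns or unknown:
        raise ValueError(f"invalid column selection: {sorted(unknown)}")
    # Identifiers come from the whitelist; the value is bound with a placeholder.
    sql = f"SELECT {', '.join(columns)} FROM films WHERE language = ?"
    return conn.execute(sql, (language,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE films (id INTEGER PRIMARY KEY, title TEXT, language TEXT, "
    "year INTEGER, description TEXT, director TEXT)"
)
conn.execute(
    "INSERT INTO films VALUES (1, 'Alien', 'en', 1979, 'A long description', 'Ridley Scott')"
)
conn.execute(
    "INSERT INTO films VALUES (2, 'Amelie', 'fr', 2001, 'Another description', 'Jean-Pierre Jeunet')"
)

titles = fetch_films(conn, ["id", "title"], "en")
print(titles)
```

The whitelist check is exactly the kind of subtle security detail the answer warns about: without it, a column list built from user input is an injection vector.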
I am developing a URL bookmark application (PHP and MySQL) for myself. With this application I will store URLs in MySQL.
The question is, should I store URLs in a TEXT column, or should I first parse the URL and store its components (host, path, query, fragment) in separate columns in one table? The latter also gives me the chance to generate statistical data by grouping servers etc. Or maybe I should store servers in a separate table and use a JOIN. What do you think?
Thanks.
I'd go with storing them in TEXT columns to start. As your application grows, you can build up the parsing and analysis functionality if you really want to. From what it sounds like, it's all just pie-in-the-sky functionality right now. Do what you need to get the basic application up and running first so that you have something to work with. You can always refactor and go from there.
The answer depends on how you would like to use this data in the future.
If you want to analyze the different parts of the URL, splitting them is the way to go.
If not, both the INSERT and the SELECT will be faster if you store them in just one field.
If you know the URLs are no longer than 255 characters, varchar(255) will be better than text for performance reasons.
If you seriously think that you're going to be using it for getting interesting data, then sure, do it as a series of columns. Honestly, I'd say it'd probably just be easier to do it as a single column though.
Also, don't forget that it's easy for you to convert back and forth if you want to later. Single to multiple is just a SELECT, a regex, and an INSERT (into another table); multiple to single is just an INSERT ... SELECT with CONCAT.
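The back-and-forth conversion is easy at the application level too. A small sketch using Python's standard urllib.parse (in PHP, parse_url() plays the same role); the split_url and join_url helper names and the example URL are made up:

```python
from urllib.parse import urlsplit, urlunsplit


def split_url(url):
    """Break a URL into the components you might store in separate columns."""
    p = urlsplit(url)
    return {
        "scheme": p.scheme,
        "host": p.netloc,  # netloc also carries the port and userinfo, if present
        "path": p.path,
        "query": p.query,
        "fragment": p.fragment,
    }


def join_url(parts):
    """Rebuild the single-column form from the stored components."""
    return urlunsplit(
        (parts["scheme"], parts["host"], parts["path"], parts["query"], parts["fragment"])
    )


url = "https://example.com/docs/page?id=7#top"
parts = split_url(url)
print(parts)
print(join_url(parts))
```

Since the round trip is cheap, starting with a single column and splitting later (if the statistics ever become real) loses nothing.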
Users can do advanced searches (there are many possible parameters):
/search/?query=toto&topic=12&minimumPrice=0&maximumPrice=1000
I would like to store the search parameters (after the /search/?) for an email alert.
I have 2 possibilities:
Storing the raw request (query=toto&topicId=12&minimumPrice=0&maximumPrice=1000) in a table with a structure like id, parameters.
Storing the request in a structured table id, query, topicId, minimumPrice, maximumPrice, etc.
Each solution has its pros and cons. Of course solution 2 is the cleaner one, but is it really worth the extra effort?
If you already have implemented such a solution and have experienced the maintenance of it, what is the best solution?
The better solution should be the best for each dimension:
Rigidity
Fragility
Viscosity
Performance
Daniel's solution is likely to be the cleanest solution, but I get your point about performance. I'm not very familiar with PHP, but there should be some db abstraction library that takes care of relations and multiple inserts so that you get the best performance, right? I only mention it because there may not be a real performance issue. Do you have load tests that point to an issue, perhaps?
Anyway, if it is between your original 2 solutions, I would have to select the first. Having a table with column names (like your solution #2) is just asking for trouble. If you add new params, you have to modify the table columns. And there is the ever present issue of "what do we put to indicate not selected vs left empty?"
So I don't agree that solution 2 is cleaner.
You could have a table consisting of three columns: search_id, key, value with the two first being the primary key. This way you can reconstruct a particular search if you have the ID of a saved search. This also allows you to expand with additional search keywords without having to actually modify your table.
If you wish, you can also have key be a foreign key to another table containing valid search terms to ensure integrity. Whether you want to do that depends on your specific needs though.
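A minimal sketch of that three-column layout, under some assumptions: the table name saved_search_param and the helpers are hypothetical, the columns are called param_key and param_value because KEY is a reserved word in MySQL, and sqlite3 stands in for the real database. All parameters of one search go in with a single executemany call.

```python
import sqlite3
from urllib.parse import parse_qsl, urlencode

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE saved_search_param ("
    " search_id   INTEGER NOT NULL,"
    " param_key   TEXT NOT NULL,"
    " param_value TEXT NOT NULL,"
    " PRIMARY KEY (search_id, param_key))"
)


def save_search(conn, search_id, query_string):
    """Store one row per parameter of the raw query string."""
    pairs = parse_qsl(query_string)
    conn.executemany(
        "INSERT INTO saved_search_param VALUES (?, ?, ?)",
        [(search_id, k, v) for k, v in pairs],
    )


def load_search(conn, search_id):
    """Rebuild a query string for a saved search (parameter order is alphabetical)."""
    rows = conn.execute(
        "SELECT param_key, param_value FROM saved_search_param "
        "WHERE search_id = ? ORDER BY param_key",
        (search_id,),
    ).fetchall()
    return urlencode(rows)


original = "query=toto&topicId=12&minimumPrice=0&maximumPrice=1000"
save_search(conn, 1, original)
restored = load_search(conn, 1)
print(restored)
```

Adding a new search parameter later needs no schema change, and a parameter the user did not select simply has no row.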
Well that's completely dependent on what you want to do with the data. For the PHP part, you need to process it anyway, either on insertion or selection time.
For a really large number of parameters you may save some time on database management/maintenance with the 1st, since you don't need to change anything about your database schema.
Daniel's answer is a generic solution, but if you consider performance an issue, you may end up doing too many inserts on the database side for a single search (one for each parameter). Too many inserts is a common source of performance problems.
You know your resources.
I have a database which holds URLs in a table (along with many other details about each URL). I have another table which stores strings that I'm going to use to perform searches on each and every link. My database will be big; I'm expecting at least 5 million entries in the links table.
The application which communicates with the user is written in PHP. I need some suggestions about how I can search over all the links with all the patterns (n x m searches) without causing a high load on the server and without losing speed. I want it to operate at high speed with low resource usage. Any hints or suggestions in pseudo-code are welcome.
Right now I don't know whether to use SQL commands to perform these searches and have some help from PHP also or completely do it in PHP.
First I'd suggest that you rethink the layout. It seems a little unnecessary to run this query for every user. Try instead to create a result table into which you insert the results of that query, which you run once and then again every time the patterns change.
Otherwise, make sure you have indexes (full text) set on the fields you need. For the query itself you could join the tables:
SELECT
yourFieldsHere
FROM
theUrlTable AS tu
JOIN
thePatternTable AS tp ON tu.link LIKE CONCAT('%', tp.pattern, '%');
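Combining the two suggestions (a result table plus the join), a rough sketch could look like the following. theUrlTable, thePatternTable, link and pattern are the names from the query above; the id columns, the link_matches table and the rebuild_matches helper are assumptions for illustration. It runs on sqlite3, where concatenation is || rather than CONCAT().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE theUrlTable     (id INTEGER PRIMARY KEY, link TEXT NOT NULL);
CREATE TABLE thePatternTable (id INTEGER PRIMARY KEY, pattern TEXT NOT NULL);
-- Hypothetical result table: one row per (link, pattern) hit.
CREATE TABLE link_matches (
    link_id    INTEGER NOT NULL,
    pattern_id INTEGER NOT NULL,
    PRIMARY KEY (link_id, pattern_id)
);
""")


def rebuild_matches(conn):
    """Recompute all matches; call this when links or patterns change."""
    conn.execute("DELETE FROM link_matches")
    # Note: % and _ inside a stored pattern act as LIKE wildcards here.
    conn.execute(
        "INSERT INTO link_matches (link_id, pattern_id) "
        "SELECT tu.id, tp.id FROM theUrlTable AS tu "
        "JOIN thePatternTable AS tp ON tu.link LIKE '%' || tp.pattern || '%'"
    )


conn.executemany(
    "INSERT INTO theUrlTable VALUES (?, ?)",
    [(1, "http://example.com/news/today"), (2, "http://example.org/shop")],
)
conn.executemany(
    "INSERT INTO thePatternTable VALUES (?, ?)", [(1, "news"), (2, "example")]
)
rebuild_matches(conn)

matches = conn.execute(
    "SELECT link_id, pattern_id FROM link_matches ORDER BY link_id, pattern_id"
).fetchall()
print(matches)
```

User-facing requests then read from link_matches only, so the expensive n x m scan happens once per change instead of once per request.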
I would say that you pretty definitely want to do that in the SQL code, not the PHP code. Also, searching on the strings of the URLs is going to be a long operation, so perhaps some form of hashing would be good. I have seen someone use a variant of a Zobrist hash for this before (Google will bring back a load of results).
Hope this helps,
Dan.
Do as much searching as you practically can within the database. If you're ending up with an n x m result set, and start with at least 5 million hits, that's a LOT of data to be repeatedly slurping across the wire (or socket, however you're connecting to the db) just to end up throwing away most (a lot?) of it each time. Even if the DB's native search capabilities ('like' matches, regexp, full-text, etc.) aren't up to the task, culling unwanted rows BEFORE they get sent to the client (your code) will still be useful.
You must optimize your tables in the DB. Use an md5 hash: add a new column holding the md5 of the text and index it, so that exact-match lookups use the index and find the text faster.
But it doesn't help if you use LIKE '%text%'.
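A small sketch of the md5 column idea, with hypothetical names (links, link_md5, add_link, find_exact) and sqlite3 standing in for the real database. The hash is fixed-length and cheap to index, so an exact-match lookup does not have to compare long strings; as said above, it does nothing for substring searches.

```python
import hashlib
import sqlite3


def md5_hex(text):
    """Hex digest used as a short, fixed-length lookup key (not for security)."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE links (
    id       INTEGER PRIMARY KEY,
    link     TEXT NOT NULL,
    link_md5 TEXT NOT NULL
);
CREATE INDEX idx_links_md5 ON links (link_md5);
""")


def add_link(conn, link):
    cur = conn.execute(
        "INSERT INTO links (link, link_md5) VALUES (?, ?)", (link, md5_hex(link))
    )
    return cur.lastrowid


def find_exact(conn, link):
    """Look up by hash via the index, then confirm the full string matches."""
    row = conn.execute(
        "SELECT id FROM links WHERE link_md5 = ? AND link = ?",
        (md5_hex(link), link),
    ).fetchone()
    return row[0] if row else None


first_id = add_link(conn, "http://example.com/news/today")
add_link(conn, "http://example.org/shop")
print(find_exact(conn, "http://example.com/news/today"))
print(find_exact(conn, "news"))  # a substring is not an exact match
```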
You can use Sphinx or Lucene.