Currently I have two websites:
1. A website connected to a MySQL database on Host A.
2. A website connected to an MS Access database on Host B.
Is there any way that, if I update the database on Host B, the database on Host A can be updated automatically?
Thank you. Really appreciate your help.
Two options:
(1) At the database level, with what's commonly called ETL (Extract, Transform and Load). In the Microsoft world you'd use SSIS (which comes as part of MS SQL Server) to move data about. This is a common approach within an enterprise, particularly if you have a lot of control over the environment.
(2) Some sort of "service"-based approach. Maybe you provide an interface (like a web service) so that one application can call the other. The issue with that is that you need to build it into the application - but you seem to be after a database-driven solution (?)
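To make option (2) a bit more concrete, here is a minimal sketch of what such an interface could look like on the MySQL side, assuming the site on Host B calls it whenever it changes a record. The endpoint name, table, columns and credentials are all invented for illustration - treat it as a sketch, not a finished design.

    <?php
    // update.php on Host A - a hypothetical endpoint the Host B site could call
    // after it updates its own (Access) database. All names are placeholders.
    $payload = json_decode(file_get_contents('php://input'), true);
    if (!$payload) {
        http_response_code(400);
        exit('invalid payload');
    }

    $pdo = new PDO('mysql:host=localhost;dbname=site_a', 'user', 'pass');
    $stmt = $pdo->prepare(
        'INSERT INTO customers (id, name, email) VALUES (:id, :name, :email)
         ON DUPLICATE KEY UPDATE name = VALUES(name), email = VALUES(email)'
    );
    $stmt->execute([
        ':id'    => $payload['id'],
        ':name'  => $payload['name'],
        ':email' => $payload['email'],
    ]);
    echo json_encode(['status' => 'ok']);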
Have a think about what you're trying to do and who should be responsible for that - are you sure it's the database?
Regarding your specific technology - I'm not sure about MySQL as I haven't used it much myself; I don't know of any 'easy' way to have MySQL and Access talk to each other, so you may have to write something.
The data you're exchanging - how much and how often? How timely (can you have one poll the other every hour, or does it need to be 'real-time')?
You could consider using a (new) third system that brokered communication between the two databases, so that they could remain ignorant of each other and of the need to update.
Is it likely you'll have a third database to update later (or a 4th, etc.)?
Can you change the database platform to something that is common across both / all sites, and which has some sort of messaging / updating system built in?
I'm currently developing an application which allows doctors to dynamically generate invoices. The thing is, each doctor requires 6 different database tables, and there could be around 50 doctors connected and working with the database (writing and reading) at the same time.
What I wanted to know is whether the structure of my application makes sense. For each doctor, I create a personal SQLite3 database (all databases are secured) which only he can connect to. I'll have something like 200 SQLite databases - are there any problems with that? I thought it could be better than using one big MySQL database for everyone.
Is this solution viable? Will I have problems to deal with? I've never built an application with so many users, but I thought this could be the best solution.
Firstly, to answer your question: no, you probably will not have any significant problems if a single SQLite database is used by only one person (user) at a time. If you highly value certain edge cases, like the ability to move some users/databases to another server, this might be a very good solution.
But it is not a terribly good design. The usual way is to have all data in the same database, with tables having a field which identifies which rows belong to which user. The application code is responsible for maintaining security (i.e. not letting users see data which doesn't belong to them), and indexes in the database (which you should use in all cases, even in your own design) are responsible for making it fast.
There are a large number of tutorials which could help you to make a better database design; a random google result is http://www.profsr.com/sql/sqless02.htm .
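As a rough illustration of that shared-table approach (table and column names here are invented, so treat this as a sketch rather than a finished design): each row carries a doctor_id, there is an index on that column, and every query filters on it.

    <?php
    // Hypothetical shared-schema sketch: one database, one set of tables,
    // with a doctor_id column identifying the owner of each row.
    //
    // e.g. CREATE TABLE invoices (
    //        id        INT AUTO_INCREMENT PRIMARY KEY,
    //        doctor_id INT NOT NULL,
    //        amount    DECIMAL(10,2) NOT NULL,
    //        INDEX idx_invoices_doctor (doctor_id)  -- keeps per-doctor lookups fast
    //      );
    $pdo = new PDO('mysql:host=localhost;dbname=invoicing', 'app', 'secret');

    $doctorId = 42; // in practice this comes from the logged-in session, never from user input

    // The application enforces security by always filtering on the owner's id.
    $stmt = $pdo->prepare('SELECT id, amount FROM invoices WHERE doctor_id = ?');
    $stmt->execute([$doctorId]);
    $invoices = $stmt->fetchAll(PDO::FETCH_ASSOC);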
I'm going to try to make this as brief as possible while covering all points - I work as a PHP/MySQL developer currently. I have a mobile app idea with a friend and we're going to start developing it.
I'm not saying it's going to be fantastic, but if it catches on, we're going to have a LOT of data.
For example, we'd have "clients," for lack of a better term, who would have anywhere from 100-250,000 "products" listed. Assuming the best, we could have hundreds of clients.
The client would edit data through a web interface; the mobile interface would just make calls to the web server, which would return JSON (probably).
I'm a lowly CMS-developing kinda guy, so I'm not sure how to handle this. My question is more or less about performance; the most I've ever seen in a MySQL table was 340k rows, and it was already sort of slow (granted, it wasn't the best server either).
I just can't fathom a table with 40 million rows (and potential to continually grow) running well.
My plan was to have a "core" database that held the name of the "real" database, so when a user came in and tried to access a client's data, it would go to the core database and figure out which database to get the information from.
I'm not concerned with data separation or data security (it's not private information).
Yes, it's possible, and my company does it. I'm certainly not going to say it's smart, though. We have a SaaS marketing automation system. Some clients' databases have 1 million+ records. We deal with a second "common" database that has a "fulfillment" table tracking emails, letters, phone calls, etc. with over 4 million records, plus numerous other very large shared tables. With proper indexing, optimizing, maintaining a separate DB-only server, and possibly clustering (which we don't yet have to do), you can handle a LOT of data... in many cases, those who think it can only handle a few hundred thousand records work on a competing product for a living. If you still doubt whether it's valid, consider that per MySQL's clustering metrics, an 8-server cluster can handle 2.5 million updates PER SECOND. Not too shabby at all...
The problem with using two databases is juggling multiple connections. Is it tough? No, not really. You create different objects and reference your connection classes based on which database you want. In our case, we hit the main database's company class to deduce the client DB name and then build the second connection based on that. But when you're juggling those connections back and forth you can run into errors that require extra debugging. It's not just "Is my query valid?" but "Am I actually getting the correct database connection?" In our case, a dropped session can cause all sorts of PDO errors to fire because the system can no longer keep track of which client database to access. Plus, from a maintainability standpoint, it's a scary process trying to push table structure updates to 100 different live databases. Yes, it can be automated. But one slip-up and you've knocked a LOT of people down and made a ton of extra work for yourself. Now, calculate the extra development and testing required to juggle connections and push updates... that will be your measure of whether it's worthwhile.
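A rough sketch of that lookup-then-connect flow (table, column and credential names are invented; this is an assumption about the shape of such code, not our actual classes):

    <?php
    // Hypothetical "core database tells us which client database to open" pattern.
    function clientConnection(PDO $core, int $clientId): PDO
    {
        // The core DB stores one row per client with the name of its dedicated database.
        $stmt = $core->prepare('SELECT db_name FROM clients WHERE id = ?');
        $stmt->execute([$clientId]);
        $dbName = $stmt->fetchColumn();

        if ($dbName === false) {
            throw new RuntimeException("Unknown client $clientId");
        }
        // Build the second connection based on what the core DB told us.
        return new PDO("mysql:host=db-server;dbname=$dbName", 'app', 'secret',
                       [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
    }

    $core     = new PDO('mysql:host=db-server;dbname=core', 'app', 'secret');
    $client   = clientConnection($core, 42);
    $products = $client->query('SELECT COUNT(*) FROM products')->fetchColumn();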
My recommendation? Find a host that allows you to put two machines on the same local network. We chose Linode, but who you use is irrelevant. Start out with your dedicated database server, plan ahead to do clustering when it's necessary. Keep all your content in one DB, index and optimize religiously. Finally, find a REALLY good DB guy and treat him well. With that much data, a great DBA would be a must.
I am developing a web application using PHP and MySQL. This application runs in three different locations:
On the internet
Head office
Branch office
The application runs on a local server at the head office and at the branch office. An internet connection is not available at all times. Customers place orders through all three locations. My problem is, I want to synchronize the data among these three databases and keep all three up to date. Is there any way to do this?
I'm using SymmetricDS to synchronize databases. It is capable of synchronizing or replicating data between nodes (servers/databases), only pushing or pulling the data you define. It is Java-based software with a steep learning curve, but it really does the job.
SymmetricDS can be set up to push changes from one node to the two other nodes, thus making sure that all three nodes contain the same data. You need to make sure that primary keys are unique keys, and not auto-incremented values assigned by the database, as those will most likely be an issue across the three different databases you'd like to synchronize.
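One common way to meet that requirement is to generate primary keys in the application, for example as UUIDs. The snippet below is a hypothetical sketch (table and column names are invented); it is not part of SymmetricDS itself.

    <?php
    // Client-generated UUIDs instead of AUTO_INCREMENT ids, so rows created
    // independently at each location cannot collide when synchronized.
    function uuid_v4(): string
    {
        $bytes = random_bytes(16);
        $bytes[6] = chr((ord($bytes[6]) & 0x0f) | 0x40); // version 4
        $bytes[8] = chr((ord($bytes[8]) & 0x3f) | 0x80); // RFC 4122 variant
        return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($bytes), 4));
    }

    $pdo  = new PDO('mysql:host=localhost;dbname=shop', 'user', 'pass');
    $stmt = $pdo->prepare('INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)');
    $stmt->execute([uuid_v4(), 'Jane Doe', 199.00]);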
The software installs triggers on the database and captures changes when INSERT, UPDATE or DELETE (and other) operations are carried out. These data changes are then applied on the other nodes. The software needs to run at each location, but does not need an internet connection that is available at all times.
I did worry in the beginning that triggers on all my tables would slow down performance, but this has not been a problem at all. I can't say that we've discovered any issues with performance after the triggers were installed.
Have a look at http://symmetricds.org/ for more details.
Try Schema and Data Comparison tools in dbForge Studio for MySQL.
We also have stand-alone tools:
dbForge Data Compare
dbForge Schema Compare
Is there any difference between a CMS and high-traffic websites (like news portals) in logic and database design and optimization (PHP and MySQL)?
I have searched for PHP site scalability on Stack Overflow, and memcached comes up in the majority of answers.
Are there techniques for MySQL optimization? (I'm looking for a book on this topic. I have searched on Amazon, but I don't know what the best choice is.)
Thanks in advance
This isn't so easy to answer.
There are different approaches and a variety of opinions, but I'll try to cover some common scenarios. But first, some basics.
Most web applications can be separated into the application and the database.
Database usage can be separated into transactional (OLTP) and analytical (OLAP).
In the best case you can just start a number of application servers and distribute traffic among them. They all have a connection to the same database server and can work independently.
This can however be difficult if you have other shared data, sessions, etc.
You can accomplish this by simply adding multiple IP addresses to your domain name in DNS.
Or you use load balancing techniques to forward the clients to different servers.
Application scaling is generally very easy; the database is much more complex.
The first thing to do is usually to set up one or more replication servers which have the same data as the main database. They can be cascaded, but they have one serious disadvantage: their data is not always up to date. In general it is no more than a few seconds old, but it can be more under load. For many use cases this is fine.
Big sites that just display information could simply replicate their database to some slave servers, set up some application servers (it's good practice to run one slave and one application server on the same machine and let that application server access that database slave), and everything is fine.
Every OLAP query can be directed to a slave. OLAP queries are those that don't modify anything and don't need 100% up-to-date data.
Everything else, however, needs to be written to the very same source database server from which every other server gets its copy - for example, every comment on an article.
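In application code, that read/write split often looks something like the sketch below (host names and the plain-PDO routing are assumptions, not a particular library):

    <?php
    // Hypothetical read/write split: writes go to the master (source),
    // reads that tolerate slightly stale data go to a replica.
    $master  = new PDO('mysql:host=db-master;dbname=site',  'app', 'secret');
    $replica = new PDO('mysql:host=db-replica;dbname=site', 'app', 'secret');

    // Write: a new comment must hit the master.
    $stmt = $master->prepare('INSERT INTO comments (article_id, body) VALUES (?, ?)');
    $stmt->execute([123, 'Nice article!']);

    // Read: listing comments can be served by the replica,
    // accepting that it may lag a few seconds behind.
    $stmt = $replica->prepare('SELECT body FROM comments WHERE article_id = ? ORDER BY id');
    $stmt->execute([123]);
    $comments = $stmt->fetchAll(PDO::FETCH_COLUMN);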
If this bottleneck gets too tight you can go in two directions:
Sharding
Master-master replication
Sharding means you decide in the application server where to store and where to fetch your data.
For example, every comment that starts with 'a' goes to server A, 'b' to server B, and so on.
That's a contrived example, but it's basically how it works; usually some internal IDs are involved.
If possible, it's good to shard data so that it can be pulled completely from a single server again.
In the example above, if I wanted all comments for an article I would have to ask every server from A to Z and merge the results. This is inefficient but possible, because those servers can be replicated. This is called mapping (you could check the famous Google MapReduce algorithm, which basically does just this).
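A toy version of that application-side routing (shard hosts and the modulo rule are invented for illustration; real systems usually use a lookup table or consistent hashing). Sharding by article ID also means all comments for one article land on the same shard, which is the "pull it completely from one server" property mentioned above:

    <?php
    // Hypothetical sharding sketch: the application picks the shard from the article id.
    $shards = [
        0 => new PDO('mysql:host=shard-a;dbname=site', 'app', 'secret'),
        1 => new PDO('mysql:host=shard-b;dbname=site', 'app', 'secret'),
    ];

    function shardFor(int $articleId, array $shards): PDO
    {
        // Simple modulo routing on an internal id; every comment of one article
        // lands on the same shard, so reading them back needs only that server.
        return $shards[$articleId % count($shards)];
    }

    $articleId = 123;
    $db   = shardFor($articleId, $shards);
    $stmt = $db->prepare('INSERT INTO comments (article_id, body) VALUES (?, ?)');
    $stmt->execute([$articleId, 'Sharded comment']);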
Master-master replication means that you write your data to different master servers and they synchronize with each other; data isn't stored separately as it is with sharding.
This has to be done if your application is not able to decide on its own where to store and fetch data.
You just write to any master server, every server gets everything, and everybody is happy?
No... because this involves another serious problem:
Conflicts! Imagine two users each enter a comment. Comment A gets stored on server A, comment B gets stored on server B. Which ID should we use? Which one comes first?
The best approach is to design the application so that it avoids these cases, for example by using distinct keys per server.
What usually happens otherwise is conflict resolution, prioritization and so on. Oracle has a lot of features at this level and MySQL is still behind, but the trend is toward much more complex distributed data stores anyway...
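On MySQL, one common way to avoid the duplicate-ID conflict is to stagger each master's auto-increment sequence using the auto_increment_increment and auto_increment_offset settings. The sketch below shows the idea (host names and credentials are invented; in practice these values usually live in each server's configuration file rather than being set from PHP):

    <?php
    // Hypothetical sketch: give each master its own auto-increment "lane"
    // so concurrently inserted comments can never claim the same id.
    $masterA = new PDO('mysql:host=master-a;dbname=site', 'admin', 'secret');
    $masterB = new PDO('mysql:host=master-b;dbname=site', 'admin', 'secret');

    // Step by the number of masters...
    $masterA->exec('SET GLOBAL auto_increment_increment = 2');
    $masterB->exec('SET GLOBAL auto_increment_increment = 2');

    // ...and give each one a different starting offset.
    $masterA->exec('SET GLOBAL auto_increment_offset = 1'); // generates 1, 3, 5, ...
    $masterB->exec('SET GLOBAL auto_increment_offset = 2'); // generates 2, 4, 6, ...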
Well, I don't think I explained this perfectly, but you should at least get some keywords from the text that you can investigate further.
Sure, there are all sorts of things you can do to optimize your PHP/MySQL web applications for high traffic websites. However, most of them depend on your specific situation, which you haven't given in your question.
Your database should be well structured regardless of whether you have a high-traffic site or not. If you use an off-the-shelf CMS, this is typically fine. Aside from good application architecture, there is no one-size-fits-all solution.
Many database libraries come set up for multiple database connections - but I've never actually known of a scripting application that needed to connect to two databases during its run. (Compiled, daemon-running languages are a different matter.)
I understand having database slaves so that you can spread the load out - but usually on startup only one of them is chosen to handle that script's needs.
So why would a PHP or Ruby application need to connect to more than one database? Or rather, why would you split your data up among several databases?
The only thing I can think of is bad design from a slowly evolving system that started off in multiple separate parts.
Are you talking about different physical database servers or different databases in the "schema" sense?
Regarding physical servers, if you're using MySQL replication you might write to a master and always read from a slave. This helps split the load between the databases.
The simple answer is "scalability".
The ready availability of replication and clustering in a number of database products makes multiple database use a definite 'this must be possible'. Any decent ORM should know how to connect to multiple databases as required.
But even when the main application doesn't connect to more than one, there will often be other needs that do. Report generation, either scripted or ad hoc, often involves queries that run for a long time. These are best run on database replicas dedicated to (and configured for) these queries so they don't disrupt the main application.
Another good use is a type of scripted processing. Many apps will have a regular process that needs to rummage through a large part of the database. Whilst updates obviously have to go to the master, the big read queries can be run off a replica.
Of course, the obvious need is simple performance. I oversaw a webapp and database that grew from surviving comfortably on one MySQL database on a 32-bit dual-core machine with 3GB to needing two 8-core 64-bit servers with 8GB. Once it reached this stage, it relied on the database handler directing traffic to both servers. We had a window of about 50 minutes in a day where it could survive on just one database.
I have a Ruby application that connects to multiple databases. One database contains user login credentials (which is shared between several other projects). Another database contains archived data that my application tracks and compares (that only my application accesses). Another database contains data regarding physical machine resources which my application uses to generate new data (these resources are used by several different applications). By splitting the data into multiple databases, different applications only access the data that they need to be accessing.
It is all too frequently the case that some of the data you need is stored in The Wrong Database. Sometimes it's personnel records in a PeopleSoft (Oracle) database. Maybe it's Enterprise CRM data on Informix. Or some departmental database stored in MS SQL Server. Whatever it is, it's in a different database, but you still need access (hopefully read-only).
Unless your primary database is magic-based, it isn't going to be able to provide you with remote table access for every other database out there. (Most will only provide remote access to other databases of the same type, e.g. MySQL->MySQL.) When that all too frequent situation occurs, you'll have no other option but to have multiple database connections, and be glad that your framework supports it.
I have a site that connects to two databases. One powers the website content (CMS DB); the other drives a web application that runs within the site (large amounts of non-CMS data). In fact, the latter uses replication.
I don't feel that's bad design. If one set of data has no relation to the other, then it makes sense even from a pure organization perspective to house it in a separate DB. Otherwise, people would just put all their tables in one DB.
For added security, I always create two accounts for every database: a read-only account (good for SELECT) and a read-write account (for SELECT, UPDATE, INSERT, DELETE and whatever else I might need). On some pages, I may need to use both accounts, thus I will consume two connections for only one database.
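For instance, it might look something like this sketch (account names, credentials and the table are invented for illustration):

    <?php
    // Hypothetical sketch: two PDO connections to the same database, one using a
    // read-only MySQL account and one using a read-write account.
    $readOnly  = new PDO('mysql:host=localhost;dbname=site', 'site_ro', 'ro_pass');
    $readWrite = new PDO('mysql:host=localhost;dbname=site', 'site_rw', 'rw_pass');

    // SELECTs go through the account that has only the SELECT privilege...
    $title = $readOnly->query('SELECT title FROM pages WHERE id = 1')->fetchColumn();

    // ...while anything that modifies data uses the read-write account.
    $stmt = $readWrite->prepare('UPDATE pages SET views = views + 1 WHERE id = ?');
    $stmt->execute([1]);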
Well, reading from one and writing to another is a very common use case. It's easy and fun to write a data access layer that reads from one connection (reading from the slave), and writes to another (the master). A single script might make multiple reads before writing -- perhaps some lookups are necessary for validation, for instance.
Scripting languages are also frequently used for integration. You might have two off-the-shelf codebases, both of which want to maintain their own database. Your integration code might want to talk to both of them.
In general, you can usually design out of using more than one connection, but in general, I don't see anything fundamentally wrong with using connections to more than one database.
Other reasons to have multiple databases: we have one application that everyone can access. We also have client databases that are very different from client to client. It is easier to maintain the application that all clients use (and which is maintained by a different team) if the client-specific data is separated out into their own databases. It is also easier to move a client to a new server when they become a large enterprise client, rather than remaining among the smaller clients who run on a server with many other clients.
Further, there are types of data that are transactional and need to be in databases set to full recovery mode with full transaction logging. Other data is only populated from imports and does not need transaction logging, which might slow down the system as the log grows enough to handle a 10,000,000-record import. Such data is often split out into a separate database so it can be in simple recovery mode; it is not necessary to recover it from the transaction log if there is a problem, since it can easily be recovered by re-running the import.
Then data is split out into data warehouses which are optimized for reporting, not transactions. Again, these reporting databases are usually separate databases (often on separate servers).
Then you have the databases for multiple different COTS applications (we have accounting databases, credit card transaction processing databases, HR databases, our project management database). A particular website might need to access more than one of these or transfer information from one to the other. Believe me, vendors won't let you copy their database structure into one database to rule them all.
We have several hundred databases here on many different servers.