I will be getting tens of thousands of XML documents that I'll need to query. The queries need to span all the XML files, not just individual files. For example, I might need:
Return the <name> value from the XML file whose <publish_date> is the most recent
What technologies or approach can I use for this scenario?
Loop through each XML file and execute an XPath query? This would be too expensive and not scalable.
Consume the XML and insert it into a database that has been modeled to respect the XML's schema? Then just do regular SQL queries to get the data I need?
Use an XML database?
Would XQuery be an option?
This needs to be part of a PHP/MySQL solution.
Take your XML files and insert them into eXist-db. You can insert them easily from PHP by doing either an HTTP POST or PUT against its REST API (depending on your needs). If you insert them into the same collection, you can then from PHP do an HTTP GET or POST sending an XQuery that queries all of the documents in that collection, for example:
collection("/db/your-collection-of-documents")//name[../publish_date gt "2014-06-14"]
If you can be more specific about your XML, I could update this answer with the REST URI that you would need to use and an appropriate XQuery.
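A minimal PHP sketch of the query side, assuming a local eXist-db instance on its default port (8080) and the hypothetical collection name used above; eXist-db's REST server accepts an XQuery in the _query GET parameter:

<?php
// Send the XQuery to eXist-db's REST interface; collection name, host,
// and credentials here are all assumptions for illustration.
$xquery = 'collection("/db/your-collection-of-documents")'
        . '//name[../publish_date gt "2014-06-14"]';

$url = 'http://localhost:8080/exist/rest/db/your-collection-of-documents'
     . '?_query=' . urlencode($xquery);

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERPWD, 'admin:');   // default admin account, empty password
$response = curl_exec($ch);   // an XML wrapper around the matching <name> elements
curl_close($ch);

echo $response;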
How can I get all of the tables from http://www.imq21.com/market/summary? Should I use a DOM HTML parser?
Literally, you cannot get all the tables easily.
However, the tables on that site are not Flash, so you can GET the entire raw HTML of the page, apply regular expressions to parse the items one by one, create your own class to store the output, and add the fetched data to the table you want.
Note:
It is not an API, so you need to visit a different link for each table and apply the same technique (fetch the HTML, apply the regular expression, manipulate the data, and store it in your table) for every table you need.
It may be illegal, or at least against the site's terms, to fetch data from someone else's website like that :D ... If the site is owned by you, you can simply write an API for your program to fetch the data directly (faster, and considerably less work).
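As a sketch of the fetch-and-parse step, here is a minimal PHP version using DOMDocument/DOMXPath (the "DOM HTML" approach the question mentions), which copes with malformed markup more robustly than a regular expression. The URL comes from the question; everything else is an assumption, and this only works if the tables are in the server-rendered HTML rather than loaded by JavaScript:

<?php
// Fetch the page and dump every cell of every <table> it contains.
$html = file_get_contents('http://www.imq21.com/market/summary');

$doc = new DOMDocument();
libxml_use_internal_errors(true);   // real-world HTML is rarely well-formed
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$i = 0;
foreach ($xpath->query('//table') as $table) {
    echo 'Table ', $i++, "\n";
    foreach ($xpath->query('.//tr', $table) as $row) {
        $cells = [];
        foreach ($xpath->query('.//td|.//th', $row) as $cell) {
            $cells[] = trim($cell->textContent);
        }
        echo implode("\t", $cells), "\n";
    }
}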
I have different sources, such as S3 (JSON files) and an API, and I have to bring all the data into a single format to store it in a DB.
I tried to parse the files and API responses in my PHP back end, but it is too slow.
Are there best practices or advice on how to do this the right way?
I'm going to define an interface with all the required methods, and a class for every source that implements the interface.
If I will be working with hundreds or thousands of files per hour, is this approach the best way to do it?
P.S. Currently the project is built on top of the Symfony2 framework.
I guess you are being forced by a traditional RDBMS to convert all sources to a specific format.
You may use schema-less systems like MongoDB, Cassandra, or even the JSON type in MySQL 5.7 to store three fields: id, source_type and source_json. This way you create several classes that know how to parse each source_type (e.g. S3) and use them accordingly.
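A minimal sketch of how the asker's interface idea could feed such a three-column table; the class, table, and column names are illustrative assumptions (PHP 7.1+ syntax):

<?php
// One parser class per source; each knows how to normalize its raw input to JSON.
interface SourceParser {
    public function sourceType(): string;
    public function toJson(string $raw): string;
}

class S3JsonParser implements SourceParser {
    public function sourceType(): string { return 's3'; }
    public function toJson(string $raw): string {
        return $raw;   // S3 files are already JSON in this scenario
    }
}

class ApiParser implements SourceParser {
    public function sourceType(): string { return 'api'; }
    public function toJson(string $raw): string {
        // Re-encode the API response so every source ends up as plain JSON.
        return json_encode(json_decode($raw, true));
    }
}

// In MySQL 5.7+, source_json can be declared as a native JSON column.
function store(PDO $pdo, SourceParser $parser, string $raw): void {
    $stmt = $pdo->prepare(
        'INSERT INTO documents (source_type, source_json) VALUES (?, ?)'
    );
    $stmt->execute([$parser->sourceType(), $parser->toJson($raw)]);
}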
I'm trying to get a large amount of data (about 3M rows) and I have only two options to do that.
Call an API and retrieve the 3M JSON objects.
Import a CSV file containing the 3M rows.
I haven't tested either of these solutions yet to tell which one is best in terms of speed.
If you want to retrieve simple data such as lists or rows with a few columns, option #2 (CSV) is the better one; you can read below a set of advantages and disadvantages:
Pros
Less bandwidth is needed, because JSON requires extra syntax characters to maintain its format, while CSV only needs a separator character.
Processing the data is faster, because CSV only needs to be split on the separator character, while JSON has to be parsed.
Big-data technologies such as Hadoop have integrated parsing for the CSV format, while JSON needs a specific parsing function (for example, using the Hive language).
Cons
The data is unstructured and more difficult for humans to read.
You have to take care that the separator character does not appear in the data fields (or is properly quoted).
If the data contains complex values such as tuples, arrays, and structures, JSON is better because:
It keeps a clear and structured format.
It doesn't repeat data in order to reference it, because one key can hold multiple values.
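A small PHP sketch of the processing difference in practice; the file names are assumptions. The CSV path can stream the 3M rows one at a time, while json_decode() needs the whole document in memory at once:

<?php
// CSV: constant memory, row by row, split on the separator.
$fh = fopen('rows.csv', 'r');
$header = fgetcsv($fh);
while (($row = fgetcsv($fh)) !== false) {
    $record = array_combine($header, $row);
    // ... insert $record into the database ...
}
fclose($fh);

// JSON: the entire 3M-object array is parsed and held in memory at once.
$objects = json_decode(file_get_contents('rows.json'), true);
foreach ($objects as $record) {
    // ... insert $record into the database ...
}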
I built an Android app that retrieves my blog posts from a custom feed in XML format.
The amount of data in my blog will grow in the near future, so I need to change the way I retrieve it.
I decided to build a PHP web service to query my posts and send the necessary data back to the app (JSON-encoded).
There are two different ways to build the web service: write a direct query against the MySQL database, or build a query using the PHP the_content(), the_title(), etc. functions (WP_Query).
It looks like querying the MySQL database directly is more flexible but more complicated.
Which is the best solution between the ones above?
Thanks for your support
I would suggest using WP_Query, because the database schema may evolve with future WordPress updates. You would then have to adapt a raw MySQL query, but not your WP_Query, because the change in the MySQL database will have been handled by the WordPress update itself.
Furthermore, WP_Query offers a lot of customization, and it's likely you won't even need to reach the database directly.
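A minimal sketch of such an endpoint built on WP_Query, assuming the code runs inside WordPress (e.g. in a plugin); the query arguments and response fields are illustrative:

<?php
$query = new WP_Query([
    'post_type'      => 'post',
    'posts_per_page' => 10,
    'orderby'        => 'date',
    'order'          => 'DESC',
]);

$posts = [];
while ($query->have_posts()) {
    $query->the_post();
    $posts[] = [
        'title'   => get_the_title(),
        'date'    => get_the_date('c'),
        'content' => apply_filters('the_content', get_the_content()),
    ];
}
wp_reset_postdata();

wp_send_json($posts);   // sets the JSON Content-Type header and echoes the data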
In the back end of my iOS project, the admin saves the data into a DB or into an XML file, so whenever he wants he can simply add an entry.
In the iOS app, I want to retrieve that data.
If I use XML, I can parse the XML file directly, since the data is already in XML format (when the admin added a value, the XML file was updated).
If I use JSON, I have to connect to the DB, get the result of the query, and then encode it into JSON.
So, which do you think would be faster in terms of the response arriving on the phone?
Is there any other option that I didn't take into account?
I have read all of these similar questions:
JSON and XML comparison [closed],
What's better: Json or XML (PHP) [closed],
JSON or XML: Just Decide (April 2012; by Mark Nottingham)
and many more, but I want to ask something specific to my project.
It depends on lots of different things:
amount of data
cpu time needed to generate the data
network bandwidth/latency
mobile phone's hardware
...
But because the mobile network is generally the bottleneck, the less redundant transfer format will probably be the most efficient, and that is JSON in this case.
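For the DB-plus-JSON path described in the question, a minimal PHP sketch; the DSN, credentials, and table/column names are all assumptions:

<?php
// Query the DB and encode the result as JSON for the app.
$pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');

$rows = $pdo->query('SELECT id, title, created_at FROM entries')
            ->fetchAll(PDO::FETCH_ASSOC);

header('Content-Type: application/json');
echo json_encode($rows);   // no closing tags repeated per field, unlike XML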