Indexing Excel files in PHP

For a customer I'm working on a small project to index a bunch (around 30) of Excel spreadsheets. The main goal of the project is to search quickly through the uploaded Excel files. I've googled for a solution but haven't found an easy one yet.
Some options I'm considering:
- Do something manually with PHPExcel and MySQL, storing column information in meta tables. Use the table's FULLTEXT options to return search results.
- Use a document store (like MongoDB) to store the files and combine this with ElasticSearch / Solr to get fast results.
- A combination of both: use Solr on top of the relational database.
I think the second option is a bit overkill; I don't want to spend too much time on this problem. I'd like to hear some opinions about this, and other suggestions are welcome :)

I agree with the others. I've done several systems in the past that suck spreadsheets into a database. It's an excellent way of getting a familiar user interface without any programming. I've tended to use email to get the spreadsheets to a central location, where they were read either by MS Access or, in more recent years, by PHP into a MySQL database.
PHP is particularly good as you can connect it easily to a mail server to automatically read and process the spreadsheets.
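If you go with the first option from the question, a minimal sketch of the import step could look like this, assuming PHPExcel is installed and a hypothetical excel_cells table with a FULLTEXT index on the content column (table, column and connection names are made up for illustration):

<?php
// Minimal sketch: read every sheet of an uploaded workbook with PHPExcel
// and store each row as one searchable text blob. The excel_cells table,
// its columns and the connection details are assumptions, not from the question.
require_once 'PHPExcel/IOFactory.php';

$pdo = new PDO('mysql:host=localhost;dbname=search', 'user', 'pass');
$insert = $pdo->prepare(
    'INSERT INTO excel_cells (file_name, sheet_name, row_num, content) VALUES (?, ?, ?, ?)'
);

$excel = PHPExcel_IOFactory::load('/path/to/upload.xls');
foreach ($excel->getWorksheetIterator() as $sheet) {
    foreach ($sheet->toArray() as $rowNum => $row) {
        // concatenate the row so a FULLTEXT search can match any cell value
        $insert->execute(array('upload.xls', $sheet->getTitle(), $rowNum + 1, implode(' ', $row)));
    }
}

// Searching then becomes a single FULLTEXT query, e.g.:
// SELECT file_name, sheet_name, row_num FROM excel_cells
// WHERE MATCH(content) AGAINST('some search term');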

Related

How to create log files using PHP and MySQL in the fastest way?

I need to create millions of log records using PHP and MySQL, or write them to Excel or PDF.
And I need to do this in the fastest way possible.
I tried the following method to insert data:
$cnt = 200000;
for ($i = 1; $i <= $cnt; $i++)
{
    // one INSERT per iteration; NOW() fills the time column
    $sql = "INSERT INTO logs (log_1, log_2, time) VALUES ('abcdefgh.$i', 'zyxwvu.$i', NOW())";
    $query = mysql_query($sql);
}
But it's taking too much time to do the operation. Please help me if anybody knows a solution.
As per my understanding, you don't want error logs? You want to insert records into a database, Excel, and PDF, and those can be millions or billions of records, right?
Well, I had a similar problem months ago. There are several ways to do this, but I really don't know which is the quickest:
1. Archive engine: use the ARCHIVE storage engine for the table; this engine was created to store big amounts of data such as logs.
2. MongoDB: I haven't tested this database yet, but I've read a lot about it and it seems to work very well in these situations.
3. Files: I considered this solution while I was trying to fix my problem, but it was the worst of all (for me at least, because I need the data in a database to build reports, so I would need a daemon to parse the files and store them in the database).
4. Database partitioning: this solution is compatible with the first one (and even with your current engine type); just check this link to create some partitions for your tables (see the sketch below).
FYI: My current solution is Archive engine + Partitions by month
Hope it helps
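To make options 1 and 4 concrete, here is a rough sketch of the log table from the question created with the ARCHIVE engine and monthly RANGE partitions. The database name and partition boundaries are invented, and note that MySQL 8.0 only supports partitioning for InnoDB, so this applies to 5.1-5.7:

<?php
// Sketch only: ARCHIVE engine (option 1) combined with partitioning by month (option 4).
$pdo = new PDO('mysql:host=localhost;dbname=logs_db', 'user', 'pass');
$pdo->exec("
    CREATE TABLE logs (
        log_1 VARCHAR(100),
        log_2 VARCHAR(100),
        time  DATETIME NOT NULL
    ) ENGINE=ARCHIVE
    PARTITION BY RANGE (TO_DAYS(time)) (
        PARTITION p2012_01 VALUES LESS THAN (TO_DAYS('2012-02-01')),
        PARTITION p2012_02 VALUES LESS THAN (TO_DAYS('2012-03-01')),
        PARTITION pmax     VALUES LESS THAN MAXVALUE
    )
");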

PHP website without MySQL

I am currently working on an existing website that lists products; there are currently a little over 500 of them.
The website has a text file for every product, and I want to add a search option. I'm thinking of reading all the text files once a day and creating an XML document with the values that can then be searched.
The client indicated that they want to keep adding products and are used to adding them via the text files. There might be over 5000 products in the future, so I think it's best to do this with MySQL. This means importing the current products and creating a CRUD page for products.
Does anyone have experience with a PHP website that does not use MySQL? Is it possible to keep adding text files and just index them once a day even if it would mean having over 5000 products?
5000 seems like an amount that's still manageable to index with a daily cron job. As long as you don't plan on searching them in real time, it should work. It's not ideal, but it would work.
Yes, it is very much possible, though NOT advisable, to use files for this type of transaction.
It is also better to use XML instead of plain TXT files for the job. 5000 products, depending on what kind of data is associated with them, might create problems in the future.
PS
Why not MySQL?
MySQL was made because file-based databases are slow and inaccurate.
Just use MySQL. If you want to keep your old TXT-based data, just build a simple script that imports each file one by one and creates the corresponding tables in your SQL database.
Good luck.
It's possible; however, if this is anything more than a simple online catalogue, then managing transactional integrity is horrendously difficult, and the fact that you're even asking the question implies you are not in a good position to implement the kind of controls required. And as you've already discovered, it doesn't make for easy searching. (BTW, MySQL's fulltext indexing is a very blunt instrument; it's not a huge amount of effort to implement an effective search engine yourself, or there are excellent ones available off the shelf, e.g. mnoGoSearch.)
(As an incidental point, why XML? It makes managing the data much more complicated than it needs to be.)
and create a crud page for products
Why? If the client wants to maintain the data via file uploads and you already need to port the data, then just use the same interface - where the data is stored is not relevant just now.
If there are issues with hosting plus MySQL, then using SQLite gives most of the benefits (although it won't scale as well).
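If you do go the SQLite route, the once-a-day index job can stay very small. A sketch, assuming (purely as an illustration) one "key: value" pair per line in each product text file and a products/ directory; adjust the parsing to whatever the real files look like:

<?php
// Nightly cron sketch: rebuild a SQLite index from the product text files.
// The file layout, directory and field names are assumptions.
$db = new PDO('sqlite:/var/www/data/products.sqlite');
$db->exec('DROP TABLE IF EXISTS products');
$db->exec('CREATE TABLE products (file TEXT, name TEXT, description TEXT)');

$insert = $db->prepare('INSERT INTO products (file, name, description) VALUES (?, ?, ?)');
foreach (glob('/var/www/data/products/*.txt') as $file) {
    $fields = array('name' => '', 'description' => '');
    foreach (file($file, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES) as $line) {
        if (strpos($line, ':') === false) {
            continue; // skip lines that are not "key: value"
        }
        list($key, $value) = explode(':', $line, 2);
        $fields[strtolower(trim($key))] = trim($value);
    }
    $insert->execute(array(basename($file), $fields['name'], $fields['description']));
}

// Searching: SELECT file FROM products WHERE name LIKE '%term%' OR description LIKE '%term%';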

Read and write .dat files in PHP

Recently I've used MaxMind GeoIP to locate country and city based on the IP. It has huge content inside its .dat files, but retrieving those records happens within a second, so I'm curious to learn and use that technology in PHP.
First I've seen some video files using this .dat extension, and now text information. So what is the .dat extension actually? Is it possible to read and write such files in PHP?
Thanks!
As far as I know, the .dat extension just means a generic file in which you can write whatever you need, in whatever format you please.
I mean, you could do that with any file, but generally if you find an XML file you assume that inside you'll find XML-formatted text; .dat files, on the other hand, aren't recognised as something you can decode with specific software if you don't know who wrote them and how.
The files will most likely be in a custom format that they developed; if it's open source you could reimplement it in PHP (if it isn't already written in PHP), or maybe access the data through an API.
The speed will come from the fact that it'll be indexed in some way, or it's like "for every record move 100 bytes further into the file".
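That fixed-offset idea is easy to reproduce in PHP with fseek() and unpack(). A small sketch with a made-up record layout (the real MaxMind .dat format is different and documented by MaxMind):

<?php
// Sketch of offset-based lookups in a binary .dat file.
// Assumed layout: fixed 100-byte records, each a 4-byte big-endian id
// followed by a 96-byte space-padded label. Not a real MaxMind format.
$recordSize = 100;
$fh = fopen('records.dat', 'rb');

function readRecord($fh, $index, $recordSize)
{
    fseek($fh, $index * $recordSize);        // jump straight to the record
    $raw = fread($fh, $recordSize);
    return unpack('Nid/A96label', $raw);     // decode the binary fields
}

$record = readRecord($fh, 42, $recordSize);  // fetch the 43rd record, no scanning
fclose($fh);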
There's a lot of questions here.
First, the file is a database - it stores data. There are lots of database models - relational, hierarchical, object-oriented, vector, hypercube, key-value store... there are implementations of all of these available off the shelf.
Some databases are more apposite to managing particular data structures than others. Geospatial data is a common specialization - so much so that a lot of other database types will provide vector functionality (e.g. mysql and postgresql which are relational databases).
For most database systems, the application using the services of the database does not access the data file directly - instead access is mediated via another process - this is particularly relevant for PHP since it typically runs as multiple independent processes with no sophisticated file locking functionality.
So if you were looking to implement IP to geography info yourself, I'd recommend sticking to a relational database or a nosql keystore (you don't need the geospatial stuff for forward lookups).
But do bear in mind that IP-to-geo lookup data is not nearly as accurate/precise as the people selling the products would have you believe. If your objective is to get accurate position information about your users, the HTML5 geolocation API provides much better data - the problem is the availability of that functionality in users' browsers.

Why do forums store posts in a database?

From looking at the way some forum software stores data in a database (e.g. phpBB uses MySQL databases for storing just about everything), I started to wonder why they do it that way. Couldn't it be just as fast and efficient to use, say, XML with XSLT to store forum topics and posts? Or at least to store the posts in a topic?
There are loads of reasons why they use databases and not flat files. Here are a few off the top of my head.
Referential integrity
Indexes and efficient searching
SQL Joins
Here are a couple more posts you can look at for more information:
If i can store my data in text files and easily can deal with these files, why should i use database like Mysql, oracle etc
Why use MySQL over flatfiles?
Why use SQL database?
But this is exactly what databases have been designed and optimized for, storage and retrieval of data. Using a database allows the forum designer to focus on their problem and not worry about implementing storage as well. It wouldn't make sense to ignore all the work that has been done in the database world and instead implement your own solution. It would take more time, be more buggy, and not run as quickly.
Database engines handle all the problems of concurrency. Imagine that two users try to write to your forum at the same time. If you store the posts in files, the first attempt will lock the file, so the second has to wait for the first to finish.
Also, if you want to search, it's much faster to do it in a database than by scanning all the files.
So basically, files are not a good place for data which can be modified by users simultaneously, and searching is much more efficient in a database.
Simply, easy access to data. It's a lot easier to find posts between two dates, created by a given user, or containing certain keywords. You could do all of the above with flat file storage, but it would be IO intensive and slow. If you had the idea of storing each post in its own file, you'd then have the problem of running out of disk space, not because of lack of capacity, but because you'd have consumed all the available inodes.
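For example, "posts by one user, in a date range, containing a keyword" is a single query against a hypothetical posts table; with flat files you would have to open and scan every file. (Table and column names are invented here, and MATCH ... AGAINST needs a FULLTEXT index on body.)

<?php
// Sketch: the kind of lookup a forum runs constantly.
$pdo = new PDO('mysql:host=localhost;dbname=forum', 'user', 'pass');
$stmt = $pdo->prepare(
    'SELECT post_id, created_at, body
       FROM posts
      WHERE author_id = :author
        AND created_at BETWEEN :from AND :to
        AND MATCH(body) AGAINST(:term)'
);
$stmt->execute(array(
    ':author' => 42,
    ':from'   => '2012-01-01',
    ':to'     => '2012-06-30',
    ':term'   => 'upgrade',
));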
Software such as this usually has a static caching feature - pages that don't change are written out to static HTML files, and those are served instead of hitting the database.
Mixing static caching with relational DB storage provides the best of both worlds.

Database solutions for storing/searching EXIF data

I have thousands of photos on my site (each with a numeric PhotoID) and I have EXIF data (photos can have different EXIF tags as well).
I want to be able to store the data effectively and search it.
Some photos have more EXIF data than others, some have the same, and so on.
Basically, I want to be able to run queries such as 'select all photos that have a GPS location' or 'all photos taken with a specific camera'.
I can't use MySQL (it won't scale well with the massive data size). I thought about Cassandra, but I don't think it lets me query on fields. I looked at SimpleDB, but I would rather: not pay for the system, and I want to be able to run more advanced queries on the data.
Also, I use PHP and Linux, so it would be awesome if it could interface nicely to PHP.
Edit: I would prefer to stick with some form of NoSQL database.
Any ideas?
I also doubt that MySQL would have any load problems, but have a look at CouchDB:
Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API.
Getting started with PHP and the CouchDB API.
CouchDB: The Definitive Guide
CouchDB basics for PHP developers
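Because CouchDB speaks plain HTTP/JSON, you don't even need a dedicated driver; curl from PHP is enough. A minimal sketch, assuming a local CouchDB with a photos database already created (document layout and values are invented):

<?php
// Sketch: store one photo's EXIF tags as a schema-free CouchDB document.
// Different photos can carry different EXIF fields without schema changes.
$doc = array(
    'type' => 'photo',
    'exif' => array(
        'Model'        => 'Canon EOS 5D',   // example values only
        'GPSLatitude'  => 52.37,
        'GPSLongitude' => 4.89,
    ),
);

$ch = curl_init('http://localhost:5984/photos/12345');   // 12345 = the numeric PhotoID
curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'PUT');
curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode($doc));
curl_setopt($ch, CURLOPT_HTTPHEADER, array('Content-Type: application/json'));
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$response = curl_exec($ch);
curl_close($ch);

// Queries such as "all photos with a GPS location" are then expressed as
// CouchDB views (map functions) over the exif fields.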
I would probably personally stick to MySQL, but if you are looking for a NoSQL style system you might want to look into Solr. That allows things like faceted searches (e.g. tells you how many of your current search result fit into each resolution / format / etc and lets you narrow your search that way).
