Is it bad to use JSON files instead of real databases?

I have no access to create/edit any databases but due to the massive amount of content I need some sort of management system which is why I created my own. Here's how it works:
Each blog post has its own .php file which loads static parts of the website like the header or the menu bar. But there are many category pages that display previews of the respective posts, and it would be sooo annoying having to edit the same preview on 10 pages due to a misspelled word or so. That's why I store those previews (not the full content, since there's no need to) in a JSON file.
Is that bad practice? Could this lead to long loading times if the number of previews rose? And could I prevent that by creating multiple JSON files?
Thanks for your advice!

If you are working on a small scale, then using JSON files is fine; however, it would definitely be beneficial to switch to a database management system for storage whenever possible, in PHP or the majority of other languages for that matter.
It can be considered bad practice if JSON is used to store large amounts of data, or if a lot of data is stored in the same file. In that case, yes, using multiple JSON files rather than one large one is indeed more viable, since reading any one file no longer means scanning over all the data.
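To make that concrete, here's a minimal sketch of the split-by-category idea; the data/previews-*.json path and the title/teaser/url fields are placeholders, not part of the original setup:

```php
<?php
// Minimal sketch, assuming each category has its own small previews file,
// e.g. data/previews-news.json, holding an array of preview entries.
// Splitting by category keeps every file small, so a category page only
// parses the previews it actually displays.
function loadPreviews(string $category): array
{
    // basename() guards against path traversal in the category name.
    $file = __DIR__ . '/data/previews-' . basename($category) . '.json';
    if (!is_file($file)) {
        return [];
    }
    $previews = json_decode(file_get_contents($file), true);
    return is_array($previews) ? $previews : [];
}

foreach (loadPreviews('news') as $p) {
    printf(
        '<article><h2><a href="%s">%s</a></h2><p>%s</p></article>',
        htmlspecialchars($p['url']),
        htmlspecialchars($p['title']),
        htmlspecialchars($p['teaser'])
    );
}
```

Fixing a typo then means editing one entry in one file, and every category page that lists that post picks up the change.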

Related

How to use a database with a CDN to improve performance

I have a question similar to this one. I am using jwplayer to play my videos, which I have saved in a CDN. Due to some requirements, I have to save my subtitles in the CDN first, then save both the video file URL and the associated subtitle URLs (English, Chinese, Japanese, etc.) in the DB.
When I make an Ajax call from my JS file to the PHP file to retrieve the data, it takes a long time, and it is causing a performance issue.
I was wondering if there is any DB option in the CDN, so that instead of saving those details in my DB I could save this info (the associated subtitles of one video file) directly in the CDN. Since retrieving from the CDN is much faster, it would surely improve performance.
CDNs just bring static information closer to the users, caching that information in points of presence (PoPs) around the globe, mostly via web servers sitting within those PoPs. So whatever you can't retrieve by HTTP GET will likely be a problem. For example, the legacy RTMP protocol (also video) is supported by legacy CDNs (Level3/Akamai/EdgeCast), but not by the newly formed Cloudflare/CloudFront and so on, because it requires add-ons to the web server and clutters workflows.
Technically, any static database can be stored in a file, and the file can be cached by a CDN. But then, again, it would be your code that takes care of the db->file->db metamorphosis. Therefore, if something is static, you don't really want to use a database for it (to be future/CDN-proof). Subtitles are just text files, so let them be files in asset folders. I appreciate that the high-level architecture might be beyond your control here (due to a specific ingesting system, for instance), but then the answer is that you won't be able to do what you're trying to do, and the resulting performance will suffer.
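To illustrate the "let them be files" point, here is a hedged sketch that writes a static per-video manifest at upload time, so the player can fetch it straight from the CDN instead of going through PHP and the DB on every request; all paths, domains, and field names here are invented for the example:

```php
<?php
// Sketch only: generate a static JSON manifest once, when the video and
// subtitles are ingested. The CDN then caches this file like any other asset.
$videoId  = 'abc123'; // hypothetical ID
$manifest = [
    'video'     => "https://cdn.example.com/videos/{$videoId}.mp4",
    'subtitles' => [
        ['lang' => 'en', 'url' => "https://cdn.example.com/subs/{$videoId}.en.vtt"],
        ['lang' => 'zh', 'url' => "https://cdn.example.com/subs/{$videoId}.zh.vtt"],
        ['lang' => 'ja', 'url' => "https://cdn.example.com/subs/{$videoId}.ja.vtt"],
    ],
];

$dir = __DIR__ . '/assets/manifests';
if (!is_dir($dir)) {
    mkdir($dir, 0755, true);
}
file_put_contents(
    "{$dir}/{$videoId}.json",
    json_encode($manifest, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES)
);
```

The player then requests /assets/manifests/abc123.json from the CDN, and the slow Ajax-to-PHP-to-DB round trip disappears from the playback path.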
If you have the bucks you can look at Continuent.
http://www.continuent.com/solutions/pricing-and-services

PHP gallery: file I/O or database I/O? Displaying a large volume of images on a website

I've been on a project for the past few days and hit a problem displaying large quantities of images (20+ GB total, ~1-2 GB per directory) in a gallery on one area of the site. The site is built on the Bootstrap framework. I've been trying to make massive carousels that ultimately do not function fluidly due to the combined size of /images. Question A: in this situation, do I need I/O from a database, storing the images there? Is that faster than an /images folder on the front end?
And B) in my PHP script I need to assign directories to variables, iterate through them, and display the images inside <li> elements, but how do I go about putting controls on memory usage so as to not overload the browser? Any additions, suggestions, or alternatives would be greatly appreciated. I'm looking for the most direct means to an end here.
Though the question is a little generic, here are some thoughts in regards to your two questions:
A) No, performance pulling images from a database would most likely be worse than pulling straight from the file system. In general, it is not a good idea to store images or other binary data in databases unless you absolutely have to, because databases can't do much with this information and you are just adding an extra layer on top of the file system that doesn't need to be there. You would, however, want to store paths to images in your database, potentially along with other characteristics such as image dimensions, thumbnail paths, keywords, etc. Then your application would read the entries for the images to return the correct paths to the images.
B) You will almost certainly want to implement some sort of paging if you are displaying many hundreds or thousands of photos. If the final display must be a carousel, you will want to investigate the Javascript that drives it to determine how you could hook in a function that retrieves more results from your PHP application via an AJAX call when it reaches the end or near end of the current listing of images. If you are having problems with the browser crashing due to too many images, you will also want to remove images from the first part of the list of <li>s when you load new ones so that it keeps the DOM under control.
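As a rough sketch of that AJAX endpoint, assuming the images simply live in an /images directory (the page size, path, and file filter are all placeholders):

```php
<?php
// Hypothetical paging endpoint: returns one "page" of image file names as
// JSON so the carousel can append more <li> items on demand.
$page    = max(0, (int)($_GET['page'] ?? 0));
$perPage = 30;

$files = array_values(array_filter(
    scandir(__DIR__ . '/images'),
    fn($f) => preg_match('/\.(jpe?g|png|gif)$/i', $f)
));

header('Content-Type: application/json');
echo json_encode(array_slice($files, $page * $perPage, $perPage));
```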
A) It's a bad idea to store that much binary data in a database. Even if the DB allows it, you shouldn't do it: it'll also cost you much more memory, since all your data is held in the database's memory space and then copied into PHP's memory space for you to handle, which eats up twice the memory, plus the overhead of running a database server, querying, and so on. So no, using a database is slower; accessing the filesystem directly is faster, and if you also use Varnish or another front-end caching system, you'll be able to serve content much faster still.
What I would do is store the files on the filesystem; the best servers to handle static serving like that are G-WAN or NGINX. But do your reading and decide for yourself what suits you best. The point is: stay away from Apache, and preferably host all those static files on a separate server running a lightweight HTTP server.
ProTip: save multiple copies of the same image at scaled-down sizes, for example one version at 50% and another at 25% of the original image size. This way you can send the thumbnails first for quick browsing, and when a user decides to view an image you serve up the 50% or 100% size depending on their screen size. You save yourself bandwidth and memory, and you save mobile users a big 3G bill too.
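A quick sketch of that tip using PHP's GD extension; the file paths and JPEG quality value are placeholders:

```php
<?php
// Save scaled-down copies of an image at upload time. imagescale()
// preserves the aspect ratio when only a width is given.
function makeScaledCopy(string $src, string $dst, float $factor): void
{
    $image  = imagecreatefromjpeg($src);
    $scaled = imagescale($image, (int)(imagesx($image) * $factor));
    imagejpeg($scaled, $dst, 85); // 85 = JPEG quality
    imagedestroy($image);
    imagedestroy($scaled);
}

makeScaledCopy('images/photo.jpg', 'images/photo-50.jpg', 0.50);
makeScaledCopy('images/photo.jpg', 'images/photo-25.jpg', 0.25);
```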
B) This is where it makes some sense to use a database: you can index all the directories into a database and use it to store the location of each image in the FS, and perhaps some tags, maybe even the number of views, etc.
On the frontend you'd implement a script that fetches, for example, 50 thumbnails per page; the user can scroll around using some fancy jQuery, and when you need to fetch more, you simply get a new result set with 50 more thumbs, and so on.
This way you'll save yourself memory and bandwidth, and the users will thank you for such a lightweight browsing experience!
Another tip:
If you want to be able to handle bigger traffic, you might want to consider using a CDN. There are many CDN services that aren't as expensive as Amazon S3; a simple search will give you tons of resources!
Happy hacking !

When loading multiple images from network requests, should I return the image data or a link to the image?

I have an iOS app that lists local places in a table view. Each cell has a picture, text, and subtext.
Each cell's detail view also has multiple pictures of the relevant location, as well as a decent amount of text. JSON is the interchange format.
Currently I am sending binary blobs and constructing them into a JPEG once loaded on the device, but I am worried this is intensive for both the device and the server. So I was considering sending a link to each picture and downloading the pictures asynchronously, but I am unaware of what repercussions this would have, especially considering that I am currently using a cheap PHP/MySQL shared hosting plan for the backend.
I am looking for a list of pros and cons for sending the raw image data through JSON vs. a link to that image. Any other options for quickly and efficiently populating a view with multiple network images are welcome.
I think the differences are the following:
1- The user will download more data (link + image > image alone).
2- If the image is on another server, that server might be slower or faster than yours, which affects the image loading speed for the user while minimizing the data transmitted between your own server and the client.
3- If the image is on another server, can you guarantee it will be there whenever your website is up?
4- Loading data via Ajax is already an asynchronous method, so you don't have to worry about using another server. Well, unless your server is as slow as hell; then you should consider another server for the big images, since the load you are putting on your own server matters more than the synchronization.
If other points come to mind, I'll post them here.
I've done a little bit of research into this and asked a few colleagues, so I'll share what tidbits I've come up with.
At some point, the raw image data is going to have to be sent; that is unavoidable.
But I can benefit from lazily loading the image data, so that if my user only looks through 14 tableview cells, I only spend time loading 14 images instead of however many total results are returned from the server (and even fewer if I implement proper caching).
My solution so far is to return 30 (the number of tableview cells I load at one time) JSON objects, each having an "Image_URL":"..." field and putting those into a dictionary. Then, in cellForRowAtIndexPath:, I check to see if the image for that cell is already cached and if not, I make a request for that picture and update the cell.image in the network callback.
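For the PHP/MySQL side of that, a hedged sketch of the endpoint might look like the following; the table and column names are hypothetical, and only the Image_URL field name comes from the approach described above:

```php
<?php
// Sketch: return a batch of 30 rows, each carrying an "Image_URL" field,
// instead of raw image bytes. The app downloads the images lazily.
$offset = max(0, (int)($_GET['offset'] ?? 0));

$pdo  = new PDO('mysql:host=localhost;dbname=app', 'user', 'pass');
$stmt = $pdo->prepare(
    'SELECT name, subtext, image_url AS Image_URL
     FROM places ORDER BY id LIMIT 30 OFFSET :off'
);
$stmt->bindValue(':off', $offset, PDO::PARAM_INT);
$stmt->execute();

header('Content-Type: application/json');
echo json_encode($stmt->fetchAll(PDO::FETCH_ASSOC));
```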
This is pretty simple to do on your own, but SDWebImage seems like a pretty good library for handling corner cases, caching, and other things that aren't covered in a basic implementation. I should note that AFNetworking also includes functionality for asynchronous image downloading.

Database for Content - OK to store HTML?

Basic question is - is it safe to store HTML in a database if I restrict who can submit to it?
I have a pretty simple question. I provide video tutorials and other content. Without spending months writing a proper BBCode parser, I would need to store the HTML so I can have it look exactly the way I want when I grab it from the database.
Basically I plan to store all information in the database about a tutorial series and each episode. I would like to have some formatting for the descriptions for both so I can add multiple paragraphs, ordered and unordered lists, links to required resources, and so on.
I am using PHP and creating my own database. I am using phpMyAdmin to store the information in the table right now. I will use a user with read only rights when I pull the information in the PHP code.
What is the best way to do this? Thank you!
Like others have pointed out, there's nothing dangerous about storing HTML in the DB, but when you display it you need to know the HTML is safe. Seeing as you're the only one editing the HTML, I see no problem.
However, I wouldn't store HTML at all. If all you need are headings, paragraphs, lists, links, images, etc., I'd say Markdown is a perfect fit. The benefit of Markdown is that it looks just like normal text (i.e., you could send your articles as e-mails or save them as txt documents), it takes up a lot less space than HTML, and you don't have to change it when HTML gets updated.
http://michelf.ca/projects/php-markdown/
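For reference, a minimal usage sketch of that library, assuming it is installed via Composer as michelf/php-markdown:

```php
<?php
// Store the Markdown source in the DB; render it to HTML on output.
require __DIR__ . '/vendor/autoload.php';

use Michelf\Markdown;

$text = "Episode 1\n=========\n\n- Download the starter files\n- Watch the video";
$html = Markdown::defaultTransform($text);
echo $html;
```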
From the security point of view it is not less secure to store your HTML in a database than storing it anywhere else - if you are the only author of that HTML. But then again if other people can author HTML in your website then it doesn't matter where you store it - only how you sanitize it and how and where you display it.
Now whether or not it is an efficient way to store HTML is a completely different matter. If I were you I would use some decent templating system and store HTML in files.
Storing HTML code is fine. But if it is not from a trusted source, you need to check it and allow only a secure subset of markup. The HTML Tidy library will help you with that.
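As a hedged illustration of that "secure subset" idea, you could combine strip_tags with PHP's Tidy extension. One caveat worth stating: strip_tags keeps attributes such as onclick, so for genuinely untrusted input a dedicated sanitizer is the safer choice; for self-authored HTML like here, this is plenty. The allowlist below is an assumption:

```php
<?php
// Sketch only: reduce to an allowlist of tags, then let Tidy repair
// whatever malformed markup is left.
$html = '<p>Hello <b>world<script>alert(1)</script>';

$allowed = '<p><a><ul><ol><li><strong><em><h2><h3>';
$clean   = strip_tags($html, $allowed); // drops <script>, keeps allowed tags

$tidy = tidy_parse_string($clean, ['show-body-only' => true], 'utf8');
$tidy->cleanRepair(); // closes the unbalanced <b> and <p>
echo tidy_get_output($tidy);
```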
Also, you need to plan for a future change in website design, so do not use too much markup, only basic tags. To make it look the way you want, use global CSS rules and semantically named classes in the markup.
But even better is to use Markdown or another wiki-like syntax. There are nice JS editors for Markdown with real-time preview (like the one here at Stack Overflow), and you can avoid HTML altogether.
My initial answer to "should I store html in a db" is generally no. Sure it's safe if you know what you're storing, but are you really considering best practices when you ask only that question? The true answer is "It depends".
I'm sure there are things like WordPress that store HTML in a database. However, as a professional website designer, I like to remember the Separation of Concerns principle. How reusable is HTML stored in your database for a mobile app? Is your back end now in charge of display as well as data? Do you have many implementation possibilities for a front end, or are you stuck with whatever the back end portrays? What if you want it in a different color and you've stacked ul within ul within ul? How easy is the CSS styling now? How easy is it to change or update that HTML?
I could be wrong, but even Sitecore and Kentico may store an HTML template in a database somewhere; still, the data associated with that HTML template is a model, not markup baked directly into the template.
So, when you are considering this question, you may want to store your models in one place and your templates in another. That way, when you say "hey, let's build a mobile app", you can grab your data and go, rather than creating yet another table to store the same data.
I made a really big mistake by storing text data in MongoDB GridFS with compression and using mongodump for daily backups. The GridFS store was 1 GB of text files, but after each backup memory usage rose, sometimes by 1 GB a day; after one month it was 20 GB in memory, due to how this backup works.
With MongoDB you should take a snapshot of the data folder rather than run mongodump. The likely reason is that mongodump copies unused data from disk into memory and then produces the BSON dump, so in my case text that had not been touched for a long time got loaded into memory anyway. I think this is simply how the backup works: even right now my MongoDB uses 200 MB of RAM, and after running mongodump it can rise to 3 GB.
So I think the best solution is to use the filesystem for storing HTML files, since even a RAID controller like the PERC H700 has many great caching features, including read-ahead. It has some limitations, like network access, and in my experience some data got corrupted over time and needed chkdsk to repair, as many GB of data were added or removed daily. You should also consider the proper RAID features, like write-through, to prevent data loss on power failure.
SQLite is not designed for extremely big data, so you shouldn't use it here; it is also missing many caching features.
An imperfect solution is to use MariaDB, or your own caching script in Node.js backed by memcached or a Linux ramdisk with maybe 1 GB of hot cache. An internal Node.js caching mechanism can develop memory leaks over time, so I would use Node.js for the network side, let I/O go through filesystem locks, and keep the most-used "hot" files cached in RAM or just leave things as they are.
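In PHP terms, one way to approximate that hot cache is APCu shared memory in front of the filesystem; a sketch, with the paths and the one-hour TTL as assumptions:

```php
<?php
// Serve pre-rendered HTML files from disk, keeping frequently requested
// ones in APCu (shared memory) so hot pages skip the filesystem entirely.
function getPage(string $slug): string
{
    $key  = 'page:' . $slug;
    $html = apcu_fetch($key, $hit);
    if ($hit) {
        return $html;
    }
    $file = __DIR__ . '/pages/' . basename($slug) . '.html';
    $html = is_file($file) ? file_get_contents($file) : '';
    apcu_store($key, $html, 3600); // cache for an hour
    return $html;
}

echo getPage('my-article');
```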

Easiest and fastest way to template, possibly in a PDF

I have been looking extensively for a simple solution to a not-very-complicated problem.
I have a great deal of data in a SQL database which needs to be printed (for example, each entry would have a name, address, phone number, etc.).
The vast majority of the data on the eventual printed page is static; only a small handful of fields need to be 'variables' in the 'template'. Quite beneficially, the areas that the variable data would be dropped into are fixed in location, orientation, and dimensions, so there need be no adjustments to spacing for the other static/redundant data on the page.
I would like to have some form of 'accounting' in the sense that, since the number of pages printed is going to be on the order of tens of thousands, I would like to know which SQL entries have been printed thus far.
I would not like to 'reinvent the wheel' and write a PHP front end which loops through arrays and deposits the SQL data in the right place on the page before or after it is rendered as a PDF...
I would prefer to print directly from the server (*nix), and would be very enthusiastic if there is a way to do this without actually having to render tens of thousands of individual PDFs. With today's open source software packages, which route is the best to take?
(So far, it is looking like if there isn't a simple way, I am going to need to learn LaTeX, Cheetah, and some Python.)
Dabo's report writer is a banded reporting engine like Crystal, which takes as input a set of data (the output of cur.fetchall(), for example) and a report template (an XML string or file), and outputs a PDF or a set of PDFs (it can output a stream of bytes instead of writing to a file directly, if desired).
Dabo's main purpose is a desktop application framework on top of wxPython, but the reporting can be done on the web with no desktop interaction. It does help, though, to design the reports on the desktop using the included report designer.
http://dabodev.com
There will be some installation hurdles and a learning curve, but you'll find this to be an easy task once you are ramped up.
