How to display accented character entities in an iOS app? - php

In my iOS app I get data from an external PHP script which builds and returns strings using queries on a MySQL database. In this database, texts have HTML entities in them, e.g. Josè is written as
Jos&egrave;
When I pass these built strings to my app, all the entities are still there, but I'd like to transform them into human-readable text in my app. I can't find a way to do this.
I saw questions like this one with accepted answers like this, but I can't write a line for each of the hundreds of entities that exist. I mean, I could, but I can't believe there isn't a simpler way to do this.
Also, since I use these strings in many places across many views throughout the app (text views, labels, table view cells, etc.), I think it would be VERY useful to apply the correct transformation in the PHP script itself, rather than in the app. So my final question is: what is the correct way to build a string with entities in it so that when I load it in my iOS app all the entities are readable characters? Thank you to ANYONE who will help me!

You can use Google's GTMNSString+HTML category.
This category possibly covers most of the HTML entities you might want to convert into human-readable format.
Since I'm not able to find the Google code for this category, I'm pointing to an alternative location: MWFeedParser.
GTMNSString+HTML.h
GTMNSString+HTML.m
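If you'd rather do the conversion in the PHP script itself, as the question suggests, PHP's built-in html_entity_decode() covers the full entity table; a minimal sketch (the variable name is made up):
// Decode every HTML entity server-side, before the string reaches the app.
$name = 'Jos&egrave;';
echo html_entity_decode($name, ENT_QUOTES, 'UTF-8'); // prints: Josè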

Related

Store both BBCode and HTML version in database?

On Stack Overflow I've found questions about storing BBCode OR HTML in the database, but what about storing both? For example, I would create a posts DB table with two columns: body_bbcode & body_html.
In body_bbcode I would store the original post submitted by a user (forum member), and in body_html I would store the parsed (HTML) version of that post.
So, for displaying forum posts I would use body_html, but for editing & quoting (replying with a quote) I would use body_bbcode.
The reason I want to do this is that the parser uses regex, and without body_html it would need to convert at least 15 forum posts per topic page. Correct me if I'm wrong, but can't that cause performance issues?
On the other hand, I haven't seen anyone do it like this, so I'm wondering what the disadvantages of this approach are, besides taking up more space in the database.
Also, I'm thinking of adding a new column in which I would store a plain-text version for search purposes, so that the tags themselves aren't searched (for example, body_text).
A well-designed BBCode regex will not hinder performance in any meaningful way.
Do not create "duplicate" columns for BBCode text and HTML text.
A major problem you run into with your suggested approach is that you will inevitably change your HTML markup. (E.g., add a class to HTML links, change the iframe dimensions of YouTube embeds, etc.) Then you're stuck trying to update the data in the HTML column, which would be problematic.
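As a rough sketch of the kind of regex pass being discussed (the tag set and function name are hypothetical, not from the question):
// Convert a few common BBCode tags to HTML in a single preg_replace() pass.
function bbcodeToHtml($bbcode)
{
    $html = htmlspecialchars($bbcode, ENT_QUOTES, 'UTF-8'); // escape raw HTML first
    $patterns = array(
        '#\[b\](.*?)\[/b\]#s' => '<strong>$1</strong>',
        '#\[i\](.*?)\[/i\]#s' => '<em>$1</em>',
        '#\[url=(https?://[^\]]+)\](.*?)\[/url\]#s' => '<a href="$1">$2</a>',
    );
    return preg_replace(array_keys($patterns), array_values($patterns), $html);
}
Running a pass like this over 15 posts is trivial next to the page's database work, which is why caching the HTML rarely pays for the schema headaches.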

Parsing incoming email content from a range of set templates

Working on a project that requires incoming email to be parsed, with certain information extracted and stored in the database. We're using postmarkapp to extract the body content of the email, so we only have the text-only guts of it, but I'm currently a bit stuck on how to parse the email in the most efficient way.
Over time we'll be adding more 'accepted' formats of incoming mail, but to start off we'll probably have 4 common emails coming in; that is, they'll follow the same format, and the information we want to extract (contact details, IDs, links, bio) will be in the same place per supported format.
I'm thinking we'll have an interface that handles the common tasks and each supported format will implement it; however, just how to get that information is where I'm stuck.
Open to any thoughts and ideas on different methods / technologies to do this, ideally PHP, but if we need to use something else, that's fine.
There is a similar feature on a site that I developed. Our users get emails from their suppliers with pricing. They copy and paste the body of the email into a textarea on our site and click a button. Then we parse the text to find products and prices and stick the info into a database.
To do the parsing, we first have to determine the supplier, like you'll need to do to determine which template was used. We look for certain strings in the text - the supplier's name usually, or a line that's unique to their emails. We do that in a method called something like getParserForText(). That method returns a Parser object which implements a simple interface with a parseText() method.
There's a Parser implementation class for each format. The parseText() method in each class is responsible for getting the data out of the text. We looked for ways of making these elegant and generic and have simply not found a really good way to do that. We're using a combination of regular expressions, splitting the string into smaller sections, and walking through the string.
Pseudocode:
$text = $_POST['emailBody'];
$parser = getParserForText($text);
$result = $parser->parseText($text);
if (count($result["errors"]) > 0) {
    // handle errors
} else {
    saveToDatabase($result["prices"]);
}
We have no control over the formats the suppliers use, so we have to resort to things like:
split the text into an array of strings around each line with a date (preg_split())
for each element in that array, the first line contains the date, the next three to six lines contain products and prices
pull the date out and then split the string on new lines
for each line, use a regex to find the price ($000.0000) and pull it out
trim the rest of the line to use as the product name
We use a lot of preg_split(), preg_match_all() and explode(). While it doesn't seem to me to be particularly elegant or generic, the system has been very robust. By leaving a little wiggle room in the regular expressions, we've made it through a number of small format changes without needing to change the code. By "wiggle room" I mean things like: don't search for a space, search for any whitespace. Don't search for a dollar sign and two digits, search for a dollar sign and any number of digits. Little things like that.
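As an illustration of the splitting strategy above (the date pattern and price format here are assumptions, not our real formats):
// Split the text into blocks, each starting with a date line.
$blocks = preg_split('/^(?=\d{1,2}\/\d{1,2}\/\d{2,4})/m', $text, -1, PREG_SPLIT_NO_EMPTY);
foreach ($blocks as $block) {
    $lines = preg_split('/\R/', trim($block));
    $date = array_shift($lines); // the first line holds the date
    foreach ($lines as $line) {
        // "Wiggle room": any whitespace, any number of digits after the dollar sign.
        if (preg_match('/\$\s*(\d+(?:\.\d+)?)/', $line, $m)) {
            $price = $m[1];
            $product = trim(str_replace($m[0], '', $line));
            // ... store $date, $product and $price
        }
    }
}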
EDIT:
Here's a question I asked about it a few years ago:
Algorithms or Patterns for reading text
Since it's generated email, it most likely comes in an easily parsable format, such as one line per instruction: key=value. You can then split the lines on the first =-sign and use the key-value pairs that this gives you.
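A minimal sketch of that approach ($emailBody is assumed to hold the text-only body):
// Build a key-value map from "key=value" lines, splitting on the first = only.
$pairs = array();
foreach (preg_split('/\R/', $emailBody, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    if (strpos($line, '=') === false) {
        continue; // skip lines that carry no instruction
    }
    list($key, $value) = explode('=', $line, 2);
    $pairs[trim($key)] = trim($value);
}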
Regular expressions are great for when you don't have control over the incoming data format, but when you do, it's easier to make sure it is parsable without a regexp.
If the format is too complex for such simple parsing, please give an example of a file using the format so I can make the answer more specific. Same thing if this isn't an answer to what you meant to ask: please give an example of the sort of answer you want.

Reading XML into PHP

I'm trying to determine the best course of action for the display of data for a project I'm working on. My client is currently using a proprietary CMS geared towards managing real estate. It easily lets you add properties, square footage, price, location, etc. The company that runs this CMS provides the data in a pretty straightforward XML file that they say offers access to all of the data my client enters.
I've read up on PHP5's SimpleXML feature and I grasp the basic concepts well enough, but my question is: can I access the XML data in a similar fashion as if I were querying a MySQL database?
For instance, assuming each entry has a unique ID, will I be able to set up a view and display just that record using a URL variable like: http://example.com/apartment.php?id=14
Can you also display results based on values within strings? I'm thinking of a form submit that returns only two-bedroom properties in this case.
Sorry in advance if this is a noob question. I'd rather not build a custom CMS for my client, if for no other reason than that they'd only have to log in to one location and update accordingly.
Some short answers to your questions:
a. Yes, you can access XML data with queries, but using XPath instead of SQL. XPath is for XML what SQL is for databases, though it works quite differently.
b. Yes, you can build a PHP program that receives an id as a parameter and uses it for an XPath search on a given XML file.
c. All data in an XML file is a string, so it is no problem to search for or display strings. Even your example id=14 is handled as a string.
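For example, a minimal sketch of (b), assuming the feed uses an element like <property id="..."> with <title> and <bedrooms> children (the names are guesses; adapt them to the real feed):
// apartment.php?id=14 - fetch one record by id via XPath.
$id = isset($_GET['id']) ? (int) $_GET['id'] : 0; // the cast keeps the XPath safe
$xml = simplexml_load_file('properties.xml');
$matches = $xml->xpath('//property[@id="' . $id . '"]');
if ($matches) {
    echo htmlspecialchars((string) $matches[0]->title);
}
// A form filter works the same way, e.g. all two-bedroom properties:
$twoBeds = $xml->xpath('//property[bedrooms="2"]');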
You might be interested in this further information:
http://www.ibm.com/developerworks/library/x-simplexml.html?S_TACT=105AGX06&S_CMP=LP
http://www.ibm.com/developerworks/library/x-xmlphp1.html?S_TACT=105AGX06&S_CMP=LP
PHP can access XML not only via SimpleXML but also with DOM. SimpleXML accesses elements like PHP arrays; DOM provides a W3C-DOM-compatible API.
See php.net for other ways to access XML, but they don't seem appropriate for your case.

Why do I need to use JSON in PHP and AJAX

I just started doing jQuery last week, and so far I've already made some basic systems with AJAX, like a basic jQuery CRUD and a simple chat system, without referencing others' work, because I decided to test myself on how far I can build systems alone in jQuery (without JSON or XML yet).
But when I decided to look at others' work (hoping to learn good practices and code out there), many or almost every program that deals with AJAX had some JSON in it. So I decided to study and read up on JSON, especially this one, but I guess because it's my first time dealing with it, I'm having a problem sinking it into my brain. Yeah, I know it is a "lightweight way of describing hierarchical data", I also know how to make JSON, like mixing a literal array and object in JS, and how to display it in JS.
But my question is: what's the difference, and what's the advantage over not using it, when I can still get and store data on the server using AJAX and a database without JSON?
By the way, I haven't focused on XML yet because, based on my research, it's better to use JSON in AJAX.
Can you give me some actual scenarios dealing with
s1. AJAX + PHP + MySQL (with what disadvantages?)
and
s2. AJAX + PHP + MySQL + JSON (with what advantages?)
I mean, my focus is to send and get data, and I can already do it with s1.
Sorry if you find my question stupid. TIA. :)
Why use JSON? The answer is portability and structure.
JSON is portable because parsers and writers are available for many, many languages. This means that JSON that a PHP script generates can be very easily understood by a JavaScript script. It is the best way to transmit complex structures like arrays and objects, and have it still be compatible with multiple languages.
JSON provides structure because the data you transmit with it can have consistent formatting. This is instead of transmitting back plain-text (i.e. unformatted) data, like comma-separated or delimited data.
Data that is merely delimited (for example, "BookName1,BookName2,BookName3") is more difficult for humans to understand, debug, and work with. If you wanted to debug a response between your server and your browser and the data was delimited (like my example above), you might have a hard time understanding it. Also, if you want to add different data types, provide separate records, etc., then your custom data format becomes more complicated. Eventually, you might end up reinventing JSON.
As a side note, JSON is indeed better than XML. It is much more efficient space-wise. There are no tag names to take up space. Structure is created via nested braces, instead of verbose tags.
Resources
Here is an interesting article on the differences and pros/cons of XML and JSON: http://www.json.org/xml.html
Examples
Per your request, here is an example of encoding JSON with PHP. This is ripped from the docs:
$arr = array ('a'=>1,'b'=>2,'c'=>3,'d'=>4,'e'=>5);
echo json_encode($arr);
Output:
{"a":1,"b":2,"c":3,"d":4,"e":5}
Contrast this to something like this, without JSON:
a,1
b,2
c,3
d,4
e,5
To parse that, you'd have to iterate through each line, split the values yourself, and then create the array. This isn't that difficult, but imagine you have a nested object:
$arr = array('a' => array(1,2,3), 'b' => array('a' => 1, 'b' => 2), 'c' => 3, 'd' => array(1,2,3,4,5), 'e' => 5); // etc.
With JSON, it's no different to encode it. Just use json_encode. But, encoding this manually, and then decoding it manually would be significantly more work.
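To make the contrast concrete, here is a sketch of the manual route for the flat comma-separated example versus the JSON one-liner:
// Manual: iterate over each line, split the values yourself, rebuild the array.
$arr = array();
foreach (explode("\n", "a,1\nb,2\nc,3\nd,4\ne,5") as $line) {
    list($key, $value) = explode(',', $line, 2);
    $arr[$key] = (int) $value;
}
// JSON: one call, and nesting comes for free.
$arr = json_decode('{"a":1,"b":2,"c":3,"d":4,"e":5}', true);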
In any programming language, you have several different types of data at your disposal, including the very useful array type.
Interchanging data between JavaScript and any server-side language can only happen through strings. That is, you can send and return any text, but there's no way to send a native array or number type.
JSON is an elegant way to express array and other types using only a string. This way you can pass arbitrary data back and forth between different environments and are not limited to pure text. XML solves the same kind of problem, but is often overkill for simple AJAX requests.

How do I design a web interface for browsing text man pages?

I would like to design a web app that allows me to sort, browse, and display various attributes (e.g. title, tag, description) for a collection of man pages.
Specifically, these are R documentation files within an R package that houses a collection of data sets, maintained by several people in an SVN repository. The format of these files is .Rd, which is LaTeX-like, but different.
R has functions for converting these man pages to html or pdf, but I'd like to be able to have a web interface that allows users to click on a particular keyword, and bring up a list (and brief excerpts) for those man pages that have that keyword within the \keyword{} tag.
Also, the generated html is somewhat ugly and I'd like to be able to provide my own CSS.
One obvious option is to load all the metadata I desire into a database like MySQL and design my site to run queries and fetch the appropriate data.
I'd like to avoid that to minimize upkeep for future maintainers. The number of files is small (<500) and the amount of data is small (only a couple of hundred lines per file).
My current leaning is to have a script that pulls the desired metadata from each file into a summary JSON file, then load this summary.json file in PHP, decode it, and loop through the array looking for the items whose attributes match the current query (e.g. all docs with keyword1 AND keyword2).
I was starting in that direction with the following...
$contents = file_get_contents("summary.json");
$c = json_decode($contents, true);
foreach ($c as $ind => $val) {
    // ... keep the entries whose attributes match the current query
}
Another idea was to write a script that would convert these .Rd files to XML. In that case, are there any lightweight frameworks that make it easy to sort and search a small collection of XML files?
I'm not sure if XQuery is overkill or if I have time to dig into it...
I think I'm suffering from too-many-options-syndrome with all the AJAX temptations. Any help is greatly appreciated.
I'm looking for a super simple solution. How might some of you out there approach this?
My approach would be to parse the keywords (from your description I assume they have a special notation to distinguish them from normal words/text) out of the files and store this data as a search index somewhere. It does not have to be MySQL; SQLite would surely be enough for your project.
A search would then be very simple.
Parsing the files could be automated as a post-commit hook in your Subversion repository.
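As a sketch of the parsing step (the file layout and output name are assumptions):
// Pull every \keyword{...} tag out of each .Rd file into one summary array.
$summary = array();
foreach (glob('man/*.Rd') as $file) {
    preg_match_all('/\\\\keyword\{([^}]*)\}/', file_get_contents($file), $m);
    $summary[basename($file)] = $m[1]; // the keywords for this man page
}
file_put_contents('summary.json', json_encode($summary));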
Why don't you create a table SUMMARIES with a column for each of the summary's fields?
Then you could index it with a full-text index, assigning a different weight to each field.
You don't need MySQL; you can use SQLite, which has Google's full-text indexing (FTS3) built in.
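A sketch of that route with PHP's SQLite3 class (assuming your PHP build ships SQLite with FTS3 enabled; the column names are made up):
// Create an FTS3 virtual table and run a full-text query against it.
$db = new SQLite3('manpages.db');
$db->exec('CREATE VIRTUAL TABLE IF NOT EXISTS summaries USING fts3(title, keywords, description)');
$stmt = $db->prepare('SELECT title FROM summaries WHERE summaries MATCH :q');
$stmt->bindValue(':q', 'keyword1 keyword2', SQLITE3_TEXT); // multiple terms imply AND
$result = $stmt->execute();
while ($row = $result->fetchArray(SQLITE3_ASSOC)) {
    echo $row['title'], "\n";
}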
