Squares appearing in text, want to search & replace - php

I have a site where a client has inserted text, probably by copying it from a PDF or Word document.
There is a strange character that appears at random places in the inserted text.
The site is full of them, so I would like to get rid of them programmatically, either by a search and replace in the database or by filtering in PHP with str_replace() when the value is printed.
The character looks like this, and is not recognized when I search for it in the database, or even when I search for it with the browser's built-in search function.
It looks the same in the database as it does when rendered. In some cases it's an empty square, in others a square with a question mark inside it.
I have tried this without luck:
$value = str_replace( '', '', $value );
(The square in the first argument seems to be removed when I save.)
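One possible approach, sketched under assumptions the question does not confirm: the text is stored as UTF-8, and the squares are non-printable control characters, private-use code points, or the Unicode replacement character (U+FFFD). Dumping the raw bytes first shows which character it actually is, so the pattern can be narrowed. The helper names show_bytes() and strip_squares() are made up for this example.

```php
<?php
// Hypothetical helper: show the raw bytes of a value so the odd character can be identified.
function show_bytes(string $value): string
{
    return implode(' ', str_split(bin2hex($value), 2));
}

// Hypothetical helper: strip characters that browsers commonly render as squares.
// Assumes $value is valid UTF-8; with the /u flag preg_replace() returns null on
// invalid UTF-8, in which case the original value is returned unchanged.
function strip_squares(string $value): string
{
    // \p{Cc} = control characters (tab, LF and CR are kept via the lookahead),
    // \p{Co} = private-use code points, U+FFFD = replacement character.
    $clean = preg_replace('/(?![\t\n\r])[\p{Cc}\p{Co}\x{FFFD}]/u', '', $value);
    return $clean === null ? $value : $clean;
}

echo show_bytes("a\x0Bb"), "\n";   // 61 0b 62
echo strip_squares("Hello\x0B world"), "\n";
```

Once the byte sequence is known, the same bytes can be written as an escape such as "\x0B" in str_replace(), which avoids pasting the invisible character into the source file.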

Related

My PHP function does what I want but output converts characters to ASCII

I am using the following function (tried both in my WordPress child theme functions file and in a plugin; it works in both cases) to remove hashtags from post titles. The function does what I want, but all titles (post body content is OK, just titles) are now showing HTML numbers for characters (i.e., ' = &8217; and & = &038; and - = &8211;).
So this title
That's #testing the & and apostrophe #tagstitle #cats #cat #instagramcats
becomes
That&8217;s testing the &038; and apostrophe
Which removes hashtags as desired but creates the character issue.
function remove_hashtags($string){
    return preg_replace('/#(?=[\w-]+)/', '',
        preg_replace('/(?:#[\w-]+\s*)+$/', '', $string));
}
add_filter('the_title', 'remove_hashtags');
I've tried adding additional code:
html_entity_decode('the_title', ENT_QUOTES | ENT_XML1, 'UTF-8');
to the function after reading up on PHP (I'm just learning) but it doesn't seem to work and I'm not sure how to use
html_entities($string)
(question update adding more information)
I basically took the code from here - that had exactly what I needed. I just added the last lines for filtering the WordPress Post Titles.
I don't want to make the question too complicated, but ideally I would like to remove the hashtags from the text and actually create post tags from them. I have found several answers for each part; I just don't know how to put it all together. Forget that though... I really just want to find out why all of a sudden the ASCII numbers are replacing the original punctuation.
If you only want to remove hashtags you could use str_replace("#","",$string);
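A likely cause, assuming WordPress has already converted the punctuation to numeric entities such as &#8217; before the the_title filter runs (which matches the output shown in the question): the pattern /#(?=[\w-]+)/ also removes the "#" inside those entities, leaving &8217;. A minimal sketch that skips any "#" directly preceded by "&" follows. The function name remove_hashtags_keep_entities() is made up for this example.

```php
<?php
// Hypothetical variant of remove_hashtags() that leaves numeric entities (&#8217;) intact.
function remove_hashtags_keep_entities(string $title): string
{
    // 1) Drop a trailing run of hashtags; (?<!&) skips the "#" of an entity.
    $title = preg_replace('/(?:\s*(?<!&)#[\w-]+)+\s*$/', '', $title);
    // 2) Remove the "#" of inline hashtags but keep the word itself.
    return preg_replace('/(?<!&)#(?=[\w-])/', '', $title);
}

echo remove_hashtags_keep_entities(
    'That&#8217;s #testing the &#038; and apostrophe #tagstitle #cats'
), "\n";
```

In WordPress this would be hooked the same way as the original, with add_filter('the_title', 'remove_hashtags_keep_entities');.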

Removing non-utf characters using php5

I am trying to display data on a webpage that contains non-utf8 characters. A user uploads a tab-separated file from a FileMaker database into our Oracle database via a web form. I then display the data through another user webpage.
One record has a character I've not seen before. It is the letters 'VT' in a square, sort of like [VT]. I'm pasting it here, but it probably won't show in this setting, so I'm attaching an image. If it does not show, the line of text with the character in it looks like this:
'Blue is a color[VT]Blue web page to find more information.[VT]Blue is a nice color'
![vt character](http://www.johndcowan.com/graphics/vt-special-chars.png)
I've tried these PHP functions with no luck:
$answer = iconv("utf-8", "utf-8//ignore", $answer);
$answer = htmlspecialchars($answer);
$answer = strip_tags($answer);
Anyone have an idea how to 86 this character?
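A boxed "VT" is how some fonts draw the vertical tab control character (byte 0x0B). That byte is valid in UTF-8, so iconv() with //IGNORE keeps it, and htmlspecialchars() and strip_tags() do not touch it either. A minimal sketch, assuming the character really is a vertical tab and that it stands for a line break in the source data:

```php
<?php
// Assumption: the boxed "VT" is the vertical tab control character (byte 0x0B).
$answer = "Blue is a color\x0BBlue web page to find more information.\x0BBlue is a nice color";

// Replace each vertical tab with a newline (use '' or ' ' instead to drop it).
$answer = str_replace("\x0B", "\n", $answer);

// nl2br() turns the newlines into <br /> tags for display in HTML.
echo nl2br(htmlspecialchars($answer, ENT_QUOTES, 'UTF-8')), "\n";
```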

Conditional Preg_Match and Replace (PHP)

I have preg_matches running. It works as follows:
Search for starting tag
Search for ending tag
This works; however, the page that I get data back from sometimes does not have data for that tag field. So instead of what should be a normal
<Field1>Data Here</Field1>
shows up as
<Field/>
So as you can see above, if there is no data, then rather than omitting the tag it emits a single self-closing tag, which changes the tag itself too. Unfortunately, I need to enter "NA" for that data, which may or may not be present.
(Note the <Field/>, not </Field>.)
I'm curious about any thoughts you might have on a workaround.
* Search for <field></field>
* Also search for ></field>
* Replace ></field> with <field></field> to match it all up.
Here is what I am using currently:
if(preg_match_all('#<(TicketNbr|Summary|Resolution|Site_Name|date_entered|status_description|ServiceType)>\\s*(.*?)\\s*</\\1>#is', $resp, $m) ) {
So I figured I could possibly go right into a
preg_replace, which I believe works like preg_match_all but replaces the matches.
Would a preg_replace work against the above preg_match_all, or could I just tie a
><field/>
into the preg_match_all?
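One way to handle the empty fields, sketched under the assumption that an empty field arrives as a self-closing tag with the same name (for example <Summary/>): expand those tags to <Summary>NA</Summary> with preg_replace first, then run the existing preg_match_all unchanged. The sample $resp below is made up for illustration.

```php
<?php
// Made-up sample response; the real $resp comes from the remote page.
$resp = '<TicketNbr>42</TicketNbr><Summary/><Resolution> Fixed </Resolution>';

$fields = 'TicketNbr|Summary|Resolution|Site_Name|date_entered|status_description|ServiceType';

// Step 1: expand self-closing tags such as <Summary/> to <Summary>NA</Summary>.
$resp = preg_replace('#<(' . $fields . ')\s*/>#i', '<$1>NA</$1>', $resp);

// Step 2: the original match now sees every field, empty or not.
if (preg_match_all('#<(' . $fields . ')>\s*(.*?)\s*</\1>#is', $resp, $m)) {
    // $m[1] holds the tag names, $m[2] the matching values.
    print_r(array_combine($m[1], $m[2]));
}
```

Doing the replacement as a separate step keeps the matching pattern simple, since it never has to deal with two tag shapes at once.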

Actual input contents are not preserved in most browsers [FF, MSIE7/8, etc.]

I'm working on an application (using PHP and JavaScript). Below is a short description of my problem.
There are two forms available in my application: SourceFrm and targetFrm.
I take input on the first form, SourceFrm, and process it on targetFrm.
Below is the input I take from SourceFrm:
1) Enter your data (the id of this input box is 'inputdata'):
2) Enter id (the id of this input box is 'id'):
The data entered by the user is posted to targetFrm for further processing.
On targetFrm:
I simply assign the inputdata value to a PHP variable.
The spaces between words are getting lost (multiple spaces are converted to a single space).
e.g.
The user entered the data below in the input box and submitted it:
inputdata:
This is my     test.
Note that the user added 5 spaces between the words 'my' and 'test'.
After assigning this input data to a PHP variable, I printed the value.
This is the content I get:
Output:
This is my test.
Multiple spaces are converted to a single space. I checked this behaviour in all browsers: FF, MSIE7/8, Opera, Safari, Chrome.
If I use '<pre>' before printing the PHP variable, i.e.:
print "<pre>";
print $inputdata;
then the spaces are not lost (I get the exact content).
My question is how to preserve the exact contents without using '<pre>'.
I use encoding/decoding (htmlentitiesencode() and decode()) functionality in my further data processing, so it may create a conflict if I replace spaces with &nbsp;.
If anyone has any ideas or suggestions, please share them.
-Thanks
When you output your variables to HTML, they are parsed as HTML. Any additional white space is collapsed to one space.
A simple fix would be to replace all spaces with the HTML entity &nbsp; to force browsers to display each space.
I wouldn't store the string with all the &nbsp; in the database, but when you show it, the &nbsp; would ensure that each space is seen.
EDIT
I mean only replace spaces on render...like:
print str_replace(' ', '&nbsp;', $inputdata);
HTML renders consecutive spaces as a single space. I'm not really sure why, but if you check the source of the rendered page containing your string, you'll see that it contains all the spaces; the browser just doesn't show them.
The same goes for other whitespace characters, such as tabs.
The way to deal with it depends on the type of your content. You can replace spaces with &nbsp;, leave it as it is, or do something completely different, e.g. collapse multiple spaces down to one.
It really depends on the nature of your data. The only real situation that comes to my mind where you would need more than one space is indenting things with spaces, which actually isn't a great idea.
Edit: older resource:
http://www.sightspecific.com/~mosh/WWW_FAQ/nbsp.html
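The render-time replacement discussed above can be combined with escaping so that it does not clash with entity encoding done elsewhere. This is only a sketch: the helper name preserve_spaces() is made up, and it assumes the stored value is the raw, unescaped user input.

```php
<?php
// Hypothetical helper: keep runs of spaces visible without using <pre>.
function preserve_spaces(string $text): string
{
    // Escape first, so the "&" of the entities added below is not escaped again.
    $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    // Replace only the second space of each pair, so long lines can still wrap.
    return str_replace('  ', ' &nbsp;', $escaped);
}

$inputdata = 'This is my     test.';
print preserve_spaces($inputdata);
```

An alternative that needs no string replacement at all is to wrap the output in an element styled with the CSS rule white-space: pre-wrap, which tells the browser to keep the spaces as typed.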

PHP "spinning" content via random find/replace

I'm using file_get_contents('mysourcefile.html') to load the contents of mysourcefile.html into mysql db.
I have two things that I want to do to the contents of mysourcefile.html before I insert it into the db.
First...
I'd like to do a find/replace on specific string matches contained in mysourcefile.
For example: the tags that a user may place in their source input files would look something like this:
Welcome to [site-name], located at
[site-url] contact us at [site-email]
if you need help.
And I'd like to do a simple string match replacement on these values as they appear in the source file before they are written to the db. The replacement text would come from the WordPress database setup fields, for example get_option('admin_email') and get_option('home').
Secondly...
I'd also like to allow the user to specify, via a special bracket, a string of words to use in order to randomize the content each time it's imported, using the same input source file.
For example, in the above sentence, it might be encoded in the source file like so:
I'd also [%like|prefer|want%] to
[%allow|permit%] the user to
[%specify|declare|select%] via a
[%special|unique|predefined%] bracket,
a string of [%words|characters|text%]
in which to use in order to randomize
the content from site to site, using
the same input source file.
So I want to parse that content string and do a simple random replacement of each set element to pick one word out of the collection and use that word for the insert.
It's basically a crude content replacement/spinner and I'm looking for some direction and methods which I could use to do it.
For the first part:
$tags = array('[site-name]', '[site-url]', '[site-email]');
$words = array("My Name", "My URL", "My Email");
$content = str_replace($tags, $words, $content);
The second part might be a little trickier. But the process is:
Grab the content between "[%" and "%]" tags.
explode("|", $string);
Pick a random value
So .. you'll need someone who knows Regex.
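The second part can be sketched with preg_replace_callback(): capture the text between "[%" and "%]", split it on "|", and return one option at random. The function name spin_content() and the sample text are made up for this example.

```php
<?php
// Hypothetical helper: replace each [%a|b|c%] group with one randomly chosen option.
function spin_content(string $content): string
{
    return preg_replace_callback(
        '/\[%(.*?)%\]/s',
        function (array $match): string {
            // Split "like|prefer|want" into its options and pick one at random.
            $options = explode('|', $match[1]);
            return $options[array_rand($options)];
        },
        $content
    );
}

// Static tag replacement first (the value is a placeholder, as in the answer above).
$content = "Welcome to [site-name]. I'd also [%like|prefer|want%] to [%allow|permit%] the user.";
$content = str_replace('[site-name]', 'My Name', $content);

echo spin_content($content), "\n";
```

Because each group is resolved independently, every import of the same source file can produce a different combination of words.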
