Strip RTF strings with PHP - Regex [duplicate]

This question already has answers here:
Regular Expression for extracting text from an RTF string
(11 answers)
Closed 9 years ago.
A column in the database I work with contains RTF strings. I would like to strip the RTF markup using PHP, leaving just the sentence in between.
It is an MS SQL Server 2005 database, if I recall correctly.
An example of the kind of string pulled from the database (let me know if you need more; the rest are similar):
{\rtf1\ansi\ansicpg1252\deff0\deflang2057{\fonttbl{\f0\fnil\fcharset0 Tahoma;}}
\viewkind4\uc1\pard\lang1033\f0\fs17 ASSEMBLE COMPONENTS AS DETAILED ON DRAWING.\lang2057\fs17\par
}
I would like this to be stripped to only return:
ASSEMBLE COMPONENTS AS DETAILED ON DRAWING.
Now, I have successfully managed to strip the characters in ASP.NET for a previous project, however I would like to do so using PHP. Here is the regular expression I used in ASP.NET, which works flawlessly may I add:
"(\{.*\})|}|(\\\S+)"
However when I try to use the same expression in PHP with a preg_replace it does not strip half of the characters.
Any regex gurus out there?

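Use this code:
$string = preg_replace('/(\{.*\})|\}|(\\\\\S+)/', '', $string);
Note that I added a '/' delimiter at the beginning and at the end of the regex. The pattern is also single-quoted with the literal backslash written as \\\\: inside a PHP string, \\\S collapses to \\S, which matches a backslash followed by the letter S instead of a backslash followed by non-whitespace characters.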
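To check the pattern against the sample from the question, here is a minimal sketch. The helper name stripRtf is hypothetical, and the approach only suits simple RTF like the sample: any text that sits inside braces on a single line would be removed along with the markup.

```php
<?php
// Hypothetical helper: removes brace groups, stray closing braces and
// control words (a backslash followed by non-whitespace characters).
function stripRtf(string $rtf): string
{
    // Single-quoted so that \\\\ reaches the regex engine as \\ (one literal backslash).
    $pattern = '/(\{.*\})|\}|(\\\\\S+)/';
    return trim(preg_replace($pattern, '', $rtf));
}

// Sample value copied from the question.
$rtf = '{\rtf1\ansi\ansicpg1252\deff0\deflang2057{\fonttbl{\f0\fnil\fcharset0 Tahoma;}}' . "\n"
     . '\viewkind4\uc1\pard\lang1033\f0\fs17 ASSEMBLE COMPONENTS AS DETAILED ON DRAWING.\lang2057\fs17\par' . "\n"
     . '}';

$clean = stripRtf($rtf);
echo $clean, "\n";
```

The trim() call removes the leading space and the newlines that are left behind once the markup is gone.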

Related

Find characters in string and make them a link in php [duplicate]

This question already has an answer here:
PHP regex. Convert [text](url) to text [duplicate]
(1 answer)
Closed 3 years ago.
I have 1000s of posts in Wordpress that have this weird code for a hyperlink in the body copy. For example, I want to find all instances of this:
[Website Name](http://www.website.com)
and turn it into
Website Name
What is the best way to achieve this in php?
$string = "This is a blog post hey check out this website [Website Name](http://www.website.com). It is a real good domain.";
// do some magic

You can use preg_replace with this regex:
\[([^]]+)]\((http[^)]+)\)
It looks for a [, followed by some non-] characters, a ] and (http, then some non-) characters until a ).
This is then replaced with $1. For example:
$string = "This is a blog post hey check out this website [Website Name](http://www.website.com). It is a real good domain.";
echo preg_replace('/\[([^]]+)]\((http[^)]+)\)/', '$1', $string);
Output:
This is a blog post hey check out this website Website Name. It is a real good domain.
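As a small extension of the example above, the same pattern can be applied to a post with several links, either keeping only the link text or building an HTML anchor from both captured groups. This is a sketch: the second link and the example.org URL are made up for illustration.

```php
<?php
// Pattern from the answer above; group 1 is the link text, group 2 the URL.
$pattern = '/\[([^]]+)]\((http[^)]+)\)/';

// Illustrative input with two links (the second one is invented).
$post = 'See [Website Name](http://www.website.com) and [Docs](https://example.org/docs).';

// Keep only the link text.
$textOnly = preg_replace($pattern, '$1', $post);

// Or turn each match into an HTML anchor, escaping both captured parts.
$asHtml = preg_replace_callback($pattern, function (array $m): string {
    return '<a href="' . htmlspecialchars($m[2]) . '">' . htmlspecialchars($m[1]) . '</a>';
}, $post);

echo $textOnly, "\n", $asHtml, "\n";
```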

This weird code is Markdown (used for example here in SO).
If you want to convert it to HTML using PHP you could use this library : https://parsedown.org/
The advantage is that you will convert any other markdown tags and other forms of markdown links present in the posts.

find/replace in PHP string [duplicate]

This question already has answers here:
PHP strtr vs str_replace benchmarking
(3 answers)
Replace text in a string using PHP
(3 answers)
Closed 5 years ago.
I have a PHP string in which I would like to find and replace using the strtr function. The problem is that I have variable fields, so I won't be able to replace by name. The string contains tags like the following:
[field_1=Company]
[field_4=Name]
What makes it difficult is the "Company" and "Name" part of the "tag", which can vary. So I'm basically looking for a way to replace the [field_1] part, where "=Company" and "=Name" must be discarded. Can this be done?
To explain: I'm using "=Company" so users don't just see "field_1" but know the value it represents. However users are able to change the value to what they see fit.

You are probably looking for regular expressions. There is a function in PHP to do a regex replace:
http://php.net/manual/en/function.preg-replace.php
Been a while since I've worked in PHP but you might want to try something like this:
preg_replace('/field_\d/','REPLACEMENT','[field_1=Company]');
Should result in
[REPLACEMENT=Company]
If you want to replace everything except the brackets:
preg_replace('/field_\d+=\w+/','REPLACEMENT','[field_1=Company]');
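If each field number should get its own value, preg_replace_callback() can look the number up and ignore whatever label follows the "=". The sketch below assumes the tags always look like [field_N] or [field_N=Label]. The function name fillFields and the sample values are hypothetical.

```php
<?php
// Hypothetical helper: replaces [field_N] or [field_N=Label] with $values[N].
function fillFields(string $template, array $values): string
{
    return preg_replace_callback(
        '/\[field_(\d+)(?:=[^\]]*)?\]/',
        function (array $m) use ($values): string {
            $id = (int) $m[1];
            // Tags without a known value are left untouched.
            return array_key_exists($id, $values) ? (string) $values[$id] : $m[0];
        },
        $template
    );
}

$template = 'Dear [field_4=Name], welcome to [field_1=Company]. Ref: [field_9=Other]';
$filled = fillFields($template, [1 => 'Acme Ltd', 4 => 'Jane']);
echo $filled, "\n";
```

Because the label after "=" is matched but never used, users can rename it freely without breaking the replacement.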

PHP how to include everything i type in my textbox to mysql database [duplicate]

This question already has answers here:
How do I escape special characters in MySQL?
(8 answers)
Closed 8 years ago.
I was wondering how I can insert everything written in my textbox into a MySQL database.
For example:
textbox = "{\buildrel{\lim}"
What happens is that the \ (backslash) removes the 'b' and the 'l', and the data inserted into my database ends up as
{uildrelim} or something like it; it might remove the { } as well.
Is there any technique or method you can advise, so that everything I put in my textbox is inserted into my database as it is?
I found this solution:
I just need to use str_replace() to replace each single \ with a double \\
$textbox = str_replace('\\', '\\\\', $textbox);
so that {\buildrel{\lim} becomes {\\buildrel{\\lim}

No, you don't use str_replace() for this. addslashes() and stripslashes() are the two functions you are looking for.
Changing a string with str_replace() isn't a smart thing to do here. It wasn't created with this in mind, whereas addslashes() and stripslashes() were. With str_replace() you might forget a character that also needed to be escaped with a slash.
Also, that whole battery of slashes and escaped slashes doesn't make your code very readable :P
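A minimal sketch of the round trip with these two functions, using the value from the question. The variable names are illustrative only.

```php
<?php
// Single quotes keep the backslashes in the sample value literal.
$textbox = '{\buildrel{\lim}';

// addslashes() puts a backslash in front of each backslash, quote and NUL byte.
$escaped = addslashes($textbox);

// stripslashes() reverses that escaping.
$restored = stripslashes($escaped);

echo $escaped, "\n", $restored, "\n";
```

For the actual insert, a prepared statement (PDO or mysqli) sends the value separately from the SQL text, so no manual escaping is needed at all.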

php regex for parsing html [duplicate]

This question already has answers here:
How do you parse and process HTML/XML in PHP?
(31 answers)
Closed 3 years ago.
I need some help parsing HTML: I want to extract everything that starts with http://, contains "abc", and runs up to the first occurrence of ", ' or a blank space.
I have a regex like this, /http:\/\/abc(.*)\"/, but it's not working well :\
Any ideas? :)
P.S. Sorry for my bad English, it's not my native language ;)

StackOverflow tends to prefer an HTML Document Parser over Regular Expressions for parsing HTML.
However, with that said, if you just want URLs from a string that happens to be HTML, I still believe a Regex is fine for the job.
Try preg_match_all:
preg_match_all("/http:\/\/[^\s'\"]*abc[^\s'\"]*/", $string, $matches);

Use a parser instead of a regex.
RegEx match open tags except XHTML self-contained tags
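As a rough sketch of the parser route, PHP's bundled DOMDocument can collect the href of every link and filter it afterwards. This assumes the URLs of interest sit in a href attributes. The function name and the sample markup are hypothetical.

```php
<?php
// Hypothetical helper: returns http:// link targets that contain $needle.
function extractLinksContaining(string $html, string $needle): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if (strpos($href, 'http://') === 0 && strpos($href, $needle) !== false) {
            $urls[] = $href;
        }
    }
    return $urls;
}

// Invented sample markup for illustration.
$html = '<p><a href="http://abc.example.com/page">one</a> '
      . "<a href='http://example.org/'>two</a> "
      . '<a href="http://example.net/abc?x=1">three</a></p>';

$found = extractLinksContaining($html, 'abc');
print_r($found);
```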

If all you want to do is extract URLs, regexen are a good choice. You don't need to get into the parser world.
If you have unix-like command tools you could approximate it very simply (assuming one url per line) with two passes:
grep http myfile.html | grep abc
You can use preg_grep() similarly.
preg_match_all('/http:[^"\' ]+/', $html, $urls);
# $urls[0] contains all the URLs from your document
$abc_urls = preg_grep('/abc/', $urls[0]);

PHP: Regular Expression to get a URL from a string [duplicate]

This question already has answers here:
Closed 12 years ago.
Possible Duplicates:
Identifying if a URL is present in a string
Php parse links/emails
I'm working on some PHP code which takes input from various sources and needs to find the URLs and save them somewhere. The kind of input that needs to be handled is as follows:
http://www.youtube.com/watch?v=IY2j_GPIqRA
Try google: http://google.com! (note exclamation mark is not part of the URL)
Is http://somesite.com/ down for anyone else?
Output:
http://www.youtube.com/watch?v=IY2j_GPIqRA
http://google.com
http://somesite.com/
I've already borrowed one regular expression from the internet which works, but unfortunately wipes the query string out - not good!
Any help putting together a regular expression, or perhaps another solution to this problem, would be appreciated.

Jan Goyvaerts, Regex Guru, has addressed this issue in his blog. There are quite a few caveats, for example extracting URLs inside parentheses correctly. What you need exactly depends on the "quality" of your input data.
For the examples you provided, \b(?:(?:https?|ftp|file)://|www\.|ftp\.)[-A-Z0-9+&@#/%=~_|$?!:,.]*[A-Z0-9+&@#/%=~_|$] works when used in case-insensitive mode.
So to find all matches in a multiline string, use
preg_match_all('/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i', $subject, $result, PREG_PATTERN_ORDER);
$result = $result[0];
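Run against the three sample inputs from the question, the call above can be tried like this. The sketch assumes the character classes contain @# (the usual form of this pattern) and uses illustrative variable names.

```php
<?php
// The three sample inputs from the question, joined into one multiline string.
$subject = "http://www.youtube.com/watch?v=IY2j_GPIqRA\n"
         . "Try google: http://google.com! (note exclamation mark is not part of the URL)\n"
         . "Is http://somesite.com/ down for anyone else?";

// The last character class excludes trailing punctuation such as "!" or ".",
// so a URL cannot end on one of those characters.
$pattern = '/\b(?:(?:https?|ftp|file):\/\/|www\.|ftp\.)'
         . '[-A-Z0-9+&@#\/%=~_|$?!:,.]*[A-Z0-9+&@#\/%=~_|$]/i';

preg_match_all($pattern, $subject, $result, PREG_PATTERN_ORDER);
$urls = $result[0];

print_r($urls);
```

The query string of the first URL is kept, and the exclamation mark after google.com is left out of the match.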

Why not try this one. It is the first result of Googling "URL regular expression".
((https?|ftp|gopher|telnet|file|notes|ms-help):((\/\/)|(\\\\))+[\w\d:#@%\/;$()~_?\+-=\\\.&]*)
Not PHP, but it should work, I just slightly modified it by escaping forward slashes.
source
