I have a string:
$string = 'word1="abc.3" word2="xyz.3"';
How can I replace the point with a comma after xyz in xyz.3, but keep it after abc in abc.3?
You've provided an example, but not a description of when the content should be modified and when it should be left alone. The solution might be simply:
str_replace("xyz.", "xyz,", $input);
But if you want a stricter match, say requiring a digit after the full stop, then:
preg_replace('/xyz\.([0-9]+)/', 'xyz,${1}', $input);
(not tested)
Something like this (sorry, I did this in JavaScript and didn't see the PHP tag):
var stringWithPoint = 'word1="abc.3" word2="xyz.3"';
var withComma = stringWithPoint.replace('xyz.3', 'xyz,3');
In PHP:
$str = 'word1="abc.3" word2="xyz.3"';
echo str_replace('xyz.3', 'xyz,3', $str);
You can use PHP's string functions to replace the point (.) with a comma, once you have isolated the part that should change:
str_replace(".", ",", $word2);
It depends on what the criteria are for replacing or not.
You could split the string into parts (using explode or preg_split), replace the dot in the relevant parts (e.g. with str_replace), then join them back together (implode).
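A minimal sketch of that split/replace/join idea. It assumes the parts are separated by single spaces and that only the part starting with word2= should change; both of those criteria are my guesses, not something the question states.

```php
<?php
// Assumption: parts are separated by single spaces, and only the part
// that starts with "word2=" should have its dot changed to a comma.
$string = 'word1="abc.3" word2="xyz.3"';

$parts = explode(' ', $string);
foreach ($parts as $i => $part) {
    if (strpos($part, 'word2=') === 0) {
        $parts[$i] = str_replace('.', ',', $part);
    }
}
$result = implode(' ', $parts);

echo $result, PHP_EOL; // word1="abc.3" word2="xyz,3"
```

If the criteria are different (say, "every part except the first"), only the condition inside the loop needs to change.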
how about:
$string = 'word1="abc.3" word2="xyz.3"';
echo preg_replace('/\.([^.]+)$/', ',$1', $string);
output:
word1="abc.3" word2="xyz,3"
Related
I'm looking for a solution to strip some HTML from a scraped HTML page. The page has some repetitive data I would like to delete so I tried with preg_replace() to delete the variable data.
Data I want to strip:
Producent:<td class="datatable__body__item" data-title="Producent">Example
Groep:<td class="datatable__body__item" data-title="Produkt groep">Example1
Type:<td class="datatable__body__item" data-title="Produkt type">Example2
....
...
Must be like this afterwards:
Producent:Example
Groep:Example1
Type:Example2
So a big piece is the same except the word within the data-title piece. How could I delete this piece of data?
I tried a few things like this one:
$pattern = '/<td class=\"datatable__body__item\"(.*?)>/';
$tech_specs = str_replace($pattern,"", $tech_specs);
But that didn't work. Is there any solution to this?
Just use a wildcard:
$newstr = preg_replace('/<td class="datatable__body__item" data-title=".*?">/', '', $str);
.*? means match anything but don't be greedy
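For illustration, here is that pattern applied to the three sample rows from the question (the sample string is just those rows joined with newlines):

```php
<?php
// Sample rows copied from the question, joined with newlines.
$str = 'Producent:<td class="datatable__body__item" data-title="Producent">Example' . "\n"
     . 'Groep:<td class="datatable__body__item" data-title="Produkt groep">Example1' . "\n"
     . 'Type:<td class="datatable__body__item" data-title="Produkt type">Example2';

// The lazy .*? stops at the first "> after data-title=", so each tag is
// removed separately and the text after it is kept.
$newstr = preg_replace('/<td class="datatable__body__item" data-title=".*?">/', '', $str);

echo $newstr, PHP_EOL;
```

This should leave Producent:Example, Groep:Example1 and Type:Example2, one per line.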
Assuming that the string looked like this:
$string = 'Producent:<td class="datatable__body__item" data-title="Producent">Example';
You could get the beginning and the end of the string with this:
preg_match('/^(\w+:).*\>(\w+)/', $string, $matches);
echo implode([$matches[1], $matches[2]]);
Which, in this case, will print Producent:Example. You could then add this output to another variable or array you intend to use.
OR, since you mentioned replacing:
$string = preg_replace('/^(\w+:).*\>(\w+)/', '$1$2', $string);
But then again, since the data will probably come as a variable number of lines, handle each line separately:
$string = 'Producent:<td class="datatable__body__item" data-title="Producent">Example
Groep:<td class="datatable__body__item" data-title="Produkt groep">Example1
Type:<td class="datatable__body__item" data-title="Produkt type">Example2';
$stringRows = explode(PHP_EOL, $string);
$pattern = '/^(\w+:).*\>(\w+)/';
$replacement = '$1$2';
foreach ($stringRows as &$stringRow) {
    $stringRow = preg_replace($pattern, $replacement, $stringRow);
}
unset($stringRow); // break the reference left over from the loop
$string = implode(PHP_EOL, $stringRows);
Which will then output the string like you expect.
Explaining my regex:
The first group catches the first word up to and including the colon (:), then another group catches the word after the tag. I had previously specified anchors for both ends, but that didn't work as expected once each line was processed separately, so I kept only the one at the beginning.
^(\w+:) => the word at the beginning of the string, up to and including the colon
.*\> => everything else up to the greater-than sign (escaped with a backslash, although that is not required)
(\w+) => the word after the greater-than sign
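As a possible alternative to splitting into rows, the same pattern could be run once over the whole string with the m modifier, so that ^ matches at the start of every line. This is only a sketch, assuming the lines are separated by "\n":

```php
<?php
// Same sample data as above; lines are assumed to be separated by "\n".
$string = 'Producent:<td class="datatable__body__item" data-title="Producent">Example' . "\n"
        . 'Groep:<td class="datatable__body__item" data-title="Produkt groep">Example1' . "\n"
        . 'Type:<td class="datatable__body__item" data-title="Produkt type">Example2';

// m makes ^ match after each newline; without the s modifier the
// greedy .* still stops at the end of the current line.
$clean = preg_replace('/^(\w+:).*>(\w+)/m', '$1$2', $string);

echo $clean, PHP_EOL;
```

That avoids the explode/implode round trip, at the cost of a slightly less obvious pattern.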
Well, maybe my question wasn't written that well. I had a table which I needed to scrape from a website. I needed the info in the table, but had to clean up some parts as mentioned. This is the solution I finally came up with, and it works. It still needs a few manual replacements, but that is because of the stupid " they use for inch. ;-)
Solution:
// find the table in the source code
foreach($techdata->find('table') as $table){
    // filter out the rows
    foreach($table->find('tr') as $row){
        // take the innertext using simplehtmldom
        $tech_specs = $row->innertext;
        // strip some 'garbage'
        $tech_specs = str_replace(" \t\t\t\t\t\t\t\t\t\t\t<td class=\"datatable__body__item\">","", $tech_specs);
        // find the first word of the string so I can use it
        $spec1 = explode('</td>', $tech_specs)[0];
        // use the found string to strip down the rest of the table
        $tech_specs = str_replace("<td class=\"datatable__body__item\" data-title=\"" . $spec1 . "\">",":", $tech_specs);
        // manual correction because of the " used
        $tech_specs = str_replace("<td class=\"datatable__body__item\" data-title=\"tbv Montage benodigde 19\">",":", $tech_specs);
        // manual correction because of the " used
        $tech_specs = str_replace("<td class=\"datatable__body__item\" data-title=\"19\">",":", $tech_specs);
        // strip some 'garbage'
        $tech_specs = str_replace("\t\t\t\t\t\t\t\t\t\t","\n", $tech_specs);
        $tech_specs = str_replace("</td>","", $tech_specs);
        $tech_specs = str_replace(" ","", $tech_specs);
        // put the clean row in an array ready for usage
        $specs[] = $tech_specs;
    }
}
I need to erase all comments in $string which contains data from some C file.
The thing I need to replace looks like this:
something before that shouldnt be replaced
/*
* some text in between with / or * on many lines
*/
something after that shouldnt be replaced
and the result should look like this:
something before that shouldnt be replaced
something after that shouldnt be replaced
I have tried many regular expressions, but none of them work the way I need.
Here are some of the latest ones:
$string = preg_replace("/\/\*(.*?)\*\//u", "", $string);
and
$string = preg_replace("/\/\*[^\*\/]*\*\//u", "", $string);
Note: the text is in UTF-8, the string can contain multibyte characters.
You would also want to add the s modifier to tell the regex that .* should include newlines. I always think of s as meaning "treat the input text as a single line".
So something like this should work:
$string = preg_replace("/\\/\\*(.*?)\\*\\//us", "", $string);
Example: http://codepad.viper-7.com/XVo9Tp
Edit: Added extra escape slashes to the regex as Brandin suggested because he is right.
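To see the difference the s modifier makes, here is a small comparison on a made-up sample string (the sample text is mine, not from the question):

```php
<?php
// Made-up sample: a block comment spanning several lines, with a
// multibyte character (U+00E9) inside it.
$string = "before\n/*\n * some text with / or * and caf\u{00e9}\n */\nafter";

// Without s, the dot does not match newlines, so nothing is removed.
$withoutS = preg_replace('/\/\*(.*?)\*\//u', '', $string);

// With s, the dot also matches newlines and the whole comment goes.
$withS = preg_replace('/\/\*(.*?)\*\//us', '', $string);

var_dump($withoutS === $string); // bool(true)
echo $withS, PHP_EOL;
```

Note that the newlines around the comment stay in place, so an empty line is left where the comment used to be.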
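I don't think a regexp is a good fit here. What about writing a very small parser to remove the comments? I haven't done any PHP coding for a long time, so I'll just try to give you the idea (a simple algorithm). I haven't tested this; as I said, it's just to give you the idea:
// Copies $string into $buf, skipping every /* ... */ block.
// It does not know about string literals or // comments.
$buf = ''; // holds the source code without comments
$pos = 0;
$len = strlen($string);
while ($pos < $len) {
    if ($string[$pos] === '/' && $pos + 1 < $len && $string[$pos + 1] === '*') {
        $pos += 2; // skip the opening "/*"
        while ($pos < $len) {
            if ($string[$pos] === '*' && $pos + 1 < $len && $string[$pos + 1] === '/') {
                $pos += 2; // skip the closing "*/"
                break;
            }
            $pos++;
        }
        continue;
    }
    $buf .= $string[$pos++];
}
where:
$string is the C source code
$buf is a string that grows as needed and ends up holding the source without comments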
It is very hard to do this perfectly without ending up writing a full C parser.
Consider the following, for example:
// Not using /*-style comment here.
// This line has an odd number of " characters.
while (1) {
printf("Wheee!
(*\/*)
\\// - I'm an ant!
");
/* This is a multiline comment with a // in, and
// an odd number of " characters. */
}
So, from the above, we can see that our problems include:
Multiline comment sequences should be ignored within double quotes, unless those double quotes are themselves part of a comment.
Single-line comment sequences can appear inside double-quoted strings (including multiline ones) and inside multiline comments.
Here's one possibility to address some of those issues, but far from perfect.
// Remove "-strings, //-comments and /*block-comments*/, then restore "-strings.
// Based on regex by mauke of Efnet's #regex.
$file = preg_replace('{("[^"]*")|//[^\n]*|(/\*.*?\*/)}s', '\1', $file);
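A quick check of that regex on a made-up one-off input (the sample string is my own). The trick is that double-quoted strings are matched first and put back via \1, so comment markers inside them are never seen by the other two alternatives:

```php
<?php
// Made-up input: one block comment, one string that contains comment
// markers, and one trailing // comment.
$file = "a /* c */ b \"/* kept */\" // tail\nend";

// Group 1 is only set when a "-string matched, so '\1' restores strings
// and replaces both kinds of comments with nothing.
$file = preg_replace('{("[^"]*")|//[^\n]*|(/\*.*?\*/)}s', '\1', $file);

echo $file, PHP_EOL;
```

It still gets confused by escaped quotes inside strings (\") and by character literals such as '"', which is part of why this is far from perfect.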
Try this:
$string = preg_replace("#\/\*\n?(.*)\*\/\n?#ms", "", $string);
Use # as the regexp delimiter, and replace the u modifier with the right ones: m (PCRE_MULTILINE) and s (PCRE_DOTALL).
Reference: http://php.net/manual/en/reference.pcre.pattern.modifiers.php
It is important to note that my regexp does not handle more than one comment block: (.*) is greedy, so with several blocks it removes everything from the first /* to the last */. Use of "dot match all" is generally not a good idea.
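To make the "more than one comment block" caveat concrete, here is a sketch comparing the greedy pattern with a lazy variant, (.*?), on a made-up input. The lazy variant is my suggestion, not part of the answer above.

```php
<?php
// Made-up input with two separate block comments.
$string = "a\n/* one */\nb\n/* two */\nc";

// Greedy: runs from the first /* to the last */, so "b" is lost too.
$greedy = preg_replace('#/\*\n?(.*)\*/\n?#ms', '', $string);

// Lazy: stops at the first */ after each /*, so "b" survives.
$lazy = preg_replace('#/\*\n?(.*?)\*/\n?#ms', '', $string);

var_dump($greedy); // string(3) "a\nc" (shown escaped)
var_dump($lazy);   // string(5) "a\nb\nc" (shown escaped)
```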
I have the following in a variable, |MyString|
I want to strip the leading | and the ending | returning MyString
What is the quickest and least intensive way of doing this?
Easiest way is probably
$result = trim($input, '|');
http://docs.php.net/trim
e.g.
<?php
$in = '|MyString|';
$result = trim($in, '|');
echo $result;
prints MyString
Check out the str_replace function in PHP: http://php.net/manual/en/function.str-replace.php
This should remove all '|' characters:
str_replace('|','',$myString);
You may be able to use a regular expression to remove only the first and last '|'; alternatively, the string function trim() may also work:
http://www.php.net/manual/en/function.trim.php
So, something like this:
$trimmedMyString = trim($myString, "|");
Worth trying anyway.
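The three approaches mentioned here do not behave the same when there are extra pipes, so a small comparison may help. The sample inputs are made up for the purpose:

```php
<?php
// Made-up inputs: one with a pipe in the middle, one with doubled pipes.
$inner  = '|My|String|';
$double = '||MyString||';

// trim() strips every leading/trailing '|' but leaves inner ones alone.
$trimmedInner  = trim($inner, '|');   // My|String
$trimmedDouble = trim($double, '|');  // MyString

// str_replace() removes every '|', wherever it is.
$replacedInner = str_replace('|', '', $inner); // MyString

// A regex can remove exactly one '|' from each end.
$regexDouble = preg_replace('/^\||\|$/', '', $double); // |MyString|

echo $trimmedInner, ' ', $trimmedDouble, ' ', $replacedInner, ' ', $regexDouble, PHP_EOL;
```

For the plain |MyString| case all three give MyString, and trim() is the simplest of them.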
I made a function to replace words in a string with new words from an array.
This is my code:
function myseo($t){
    $a = array('me','lord');
    $b = array('Mine','TheLord');
    $theseotext = $t;
    $theseotext = str_replace($a,$b, $theseotext);
    return $theseotext;
}
echo myseo('This is me Mrlord');
The output is
This is Mine MrTheLord
which is wrong; it should print
This is Mine Mrlord
because the word Mrlord is not included in the array.
I hope I explained my issue well. Any help, guys?
regards
According to the code it is correct, but you want it to isolate by word. You could simply do this:
function myseo($t){
    $a = array(' me ',' lord ');
    $b = array(' Mine ',' TheLord ');
    return str_replace($a,$b, ' '.$t.' ');
}
echo myseo('This is me Mrlord');
Keep in mind this is kind of a cheap hack, since I surround the string with spaces to ensure that words at both ends are considered. It wouldn't work for punctuated strings, and the result keeps the added leading and trailing space. The alternative would be to break the string apart and replace each word individually.
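Another option, sketched here as an assumption rather than a drop-in fix, is a regular expression with \b word boundaries, which also copes with punctuation. The function name myseo_words and the $map array are made up for this example:

```php
<?php
// Hypothetical variant of myseo() that only replaces whole words.
function myseo_words($t) {
    $map = array('me' => 'Mine', 'lord' => 'TheLord');

    // Escape each word so it is safe inside the pattern.
    $quoted = array();
    foreach (array_keys($map) as $word) {
        $quoted[] = preg_quote($word, '/');
    }

    // \b only matches between a word character and a non-word character,
    // so "lord" inside "Mrlord" is not touched.
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/';

    return preg_replace_callback($pattern, function ($m) use ($map) {
        return $map[$m[0]];
    }, $t);
}

echo myseo_words('This is me Mrlord'), PHP_EOL; // This is Mine Mrlord
```

The match is case-sensitive, like str_replace; words in a different case are left as they are.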
str_replace doesn't look at full words only - it looks at any matching sequence of characters.
Thus, lord matches the latter part of Mrlord.
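Use str_ireplace instead if you need case-insensitive matching. Note that it still matches substrings, so on its own it does not fix the Mrlord case.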
I am trying to create a regular expression (for use in preg_replace) that turns
$str = 'http://www.site.com&ID=1620';
into
$str = 'http://www.site.com';
How would I write a preg_replace to simply remove the &ID=1620 from the string (taking into account that the ID could be of variable length)?
Thanks in advance.
You could use...
$str = preg_replace('/[?&;]ID=\d+/', '', $str);
I'm assuming this is meant to be a normal URL, hence the [?&;]. If that's the case, the & should be a ?.
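If it's part of a larger list of GET params, you are probably better off using the following (with $str holding just the query string, since parse_str does not understand full URLs)...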
parse_str($str, $params);
unset($params['ID']);
$str = http_build_query($params);
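If you start from a full URL rather than a bare query string, one way to combine this with parse_url is sketched below. It assumes a well-formed URL whose query starts with ?; the sample URL is made up, and the question's own string (with & straight after the host) would not parse this way.

```php
<?php
// Made-up, well-formed URL; the query string starts with "?".
$str = 'http://www.site.com?spam=eggs&ID=1620&foo=bar';

$parts  = parse_url($str);
$params = array();
if (isset($parts['query'])) {
    parse_str($parts['query'], $params); // only the query part goes in
}
unset($params['ID']);

// Rebuild the URL from the pieces we have (port, user and fragment
// are ignored in this sketch).
$result = $parts['scheme'] . '://' . $parts['host'];
if (isset($parts['path'])) {
    $result .= $parts['path'];
}
if ($params) {
    $result .= '?' . http_build_query($params, '', '&');
}

echo $result, PHP_EOL; // http://www.site.com?spam=eggs&foo=bar
```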
I'm guessing that & is not allowed as a character in the ID attribute. In that case, you can use
$result = preg_replace('/&ID=[^&]+/', '', $subject);
or (possibly better, thanks to PaulP.R.O.):
$result = preg_replace('/[?&]ID=[^&]+/', '', $subject);
This will remove &ID= (the second version would also remove ?ID=) plus any amount of characters that follow until the next & or end of string. This approach makes sure that any following attributes will be left alone:
$str = 'http://www.site.com?spam=eggs&ID=1620&foo=bar';
will be changed into
$str = 'http://www.site.com?spam=eggs&foo=bar';
You can just use parse_url
(that is, if the URL is of the form http://something.com?id1=1&id2=2):
$url = parse_url($str);
echo "http://{$url['host']}";