PHP: How to preg_split by full stop?

I want to take a string and split it (or explode it) into an array by full-stops (periods).
I used to have:
$processed_data = explode(".", $raw_data);
but this removes the full-stop.
Researching, I found preg_split, so tried:
$processed_data = preg_split('\.', $raw_data, PREG_SPLIT_DELIM_CAPTURE);
with both \. and \\.
but try as I might, I cannot find a way to properly include the full-stop.
Would anyone know the right way to do this?
The expected result is:
The string
$raw_data = 'This is my house. This is my car. This is my dog.';
Is broken into an array by full-stop, eg:
array("This is my house.", "This is my car.", "This is my dog.")

To split a string into sentences:
preg_match_all('~\s*\K[^.!?]*[.!?]+~', $raw_data, $matches);
$processed_data = $matches[0];
Note: if you want to handle edge cases such as abbreviations, a simple regex doesn't suffice; you need NLTK or another NLP tool with a dictionary.
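For the preg_split route the question asks about, here is a minimal sketch. The call in the question fails for two reasons: the pattern has no delimiters, and the flag is passed in the third (limit) position instead of the fourth. The sketch assumes each sentence ends in a full stop followed by whitespace; it splits on that whitespace using a lookbehind, so the full stop stays attached to its sentence.

```php
<?php
$raw_data = 'This is my house. This is my car. This is my dog.';

// Split on the whitespace that follows a full stop.
// The lookbehind (?<=\.) checks for the stop without consuming it,
// so the stop remains part of the preceding piece.
// Arguments: pattern, subject, limit (-1 = no limit), flags.
$processed_data = preg_split('/(?<=\.)\s+/', $raw_data, -1, PREG_SPLIT_NO_EMPTY);

print_r($processed_data);
```

Because the pattern requires whitespace after the stop, a value such as 2.4 is not split, but abbreviations like "e.g. this" still are.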

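You can try this. It puts a marker after each full stop that is followed by a capital letter, then splits on the marker, so the full stop is kept:
$string = preg_replace('/\.\s?([A-Z])/', '.*****$1', $raw_data);
$array = explode('*****', $string);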

Related

How to remove certain Part of JSON

Sorry in advance for my bad English. Here is my JSON return:
https://images-na.ssl-images-amazon.com/images/M/MV5BYzc3OGZjYWQtZGFkMy00YTNlLWE5NDYtMTRkNTNjODc2MjllXkEyXkFqcGdeQXVyNjExODE1MDc#._V1_UY268_CR5,0,182,268_AL_.jpg
How can I remove the part UY268_CR5,0,182,268_AL_, and only that part? I have many of these links, each with a different string there. For example:
https://m.media-amazon.com/images/M/MV5BYWNlMWMxOWYtZWI0Mi00ZTg0LWEwZTMtZTEzZDY0NzAxYTA4XkEyXkFqcGdeQXVyMTQxNzMzNDI#._V1_UX182_CR0,0,182,268_AL_.jpg
As shown, it is different: here I want to remove the part UX182_CR0,0,182,268_AL_. All of my results have almost the same structure, but the end part that I want to remove varies. I am on Laravel, so I am encoding my JSON result in the controller. Is there any way this can be done with PHP?
Update:
Here is the code I tried.
$json = json_decode($data,true);
$slice = str_replace("UY268_CR5,0,182,268_AL_","", $json);
return $slice ['poster'];
The string is removed, but what about the different strings in other URLs like the ones mentioned above?
You can try preg_replace() with a combination of lookahead and lookbehind:
<?php
$re = '/(?<=_V1_)(.+?)(?=\.jpg)/'; // the dot before "jpg" is escaped so it matches a literal "."
$str = 'https://m.media-amazon.com/images/M/MV5BYWNlMWMxOWYtZWI0Mi00ZTg0LWEwZTMtZTEzZDY0NzAxYTA4XkEyXkFqcGdeQXVyMTQxNzMzNDI#._V1_UX182_CR0,0,182,268_AL_.jpg';
$subst = '';
$result = preg_replace($re, $subst, $str, 1);
echo "The result of the substitution is ".$result;
?>
DEMO: https://eval.in/1044470
REF: Regex lookahead, lookbehind and atomic groups
REGEX EXPLANATION: https://regex101.com/r/aHAw5f/1
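To tie this back to the decoded JSON in the question, here is a minimal sketch. The helper name stripPosterSuffix and the example.com payload are hypothetical; only the 'poster' key comes from the question. It assumes every URL contains _V1_ and ends in .jpg, and it leaves any URL that does not match unchanged.

```php
<?php
// Hypothetical helper: removes whatever sits between "_V1_" and the final ".jpg".
function stripPosterSuffix(string $url): string
{
    return preg_replace('/(?<=_V1_).+?(?=\.jpg$)/', '', $url, 1);
}

// Made-up payload for illustration; a real response will differ.
$data = '{"poster": "https://example.com/images/M/abc._V1_UY268_CR5,0,182,268_AL_.jpg"}';

$json = json_decode($data, true);
$json['poster'] = stripPosterSuffix($json['poster']);

echo $json['poster'], "\n";
```

Since the pattern does not name a specific size string, the same call should handle the UX182_CR0,... variant as well.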

Regex to replace a string from one place to another in the same record

I have a record separated by | symbol. I need to replace a string from one place to another in the same record:
My input looks like this:
BANG|ADAR|**285815**|MOTOR|GOOD||INDIA|2.4|SOFTWARE|285816_AKS|SAB_PART|**AKS_PN|285816**
I need to replace 285815 with the string after AKS_PN, in this case I need to replace 285815 with 285816.
With (([^|]*\|){3})(.*) I am able to fetch 285815, but I need help fetching the string after AKS_PN in the same regular expression.
I am aware of how to replace 285815 with 285816. I am using PHP.
Regex solution
You need to use capturing groups. In general:
(everything_before)(interesting_part_1)(between)(interesting_part_in_the_end)
Afterwards, just put it together as you wish
(everything_before)(interesting_part_in_the_end)(between)
This leaves (interesting_part_1) out of the final string.
In your specific example this might come down to
^((?:[^|]*\|){2})([^|]*)\|(.*?AKS_PN)\|(.*)
which would need to be replaced by
$1$4|$3
See an example on regex101.com (still not sure what to do with 285815 here).
Everything in PHP:
<?php
$string = "BANG|ADAR|285815|MOTOR|GOOD||INDIA|2.4|SOFTWARE|285816_AKS|SAB_PART|AKS_PN|285816";
$regex = '~^((?:[^|]*\|){2})([^|]*)\|(.*?AKS_PN)\|(.*)~';
$string = preg_replace($regex, "$1$4|$3", $string);
echo $string;
# BANG|ADAR|285816|MOTOR|GOOD||INDIA|2.4|SOFTWARE|285816_AKS|SAB_PART|AKS_PN
?>
Non-regex solution
You don't even need a regular expression here (it is far too complicated); just split, switch and join afterwards:
<?php
$string = "BANG|ADAR|285815|MOTOR|GOOD||INDIA|2.4|SOFTWARE|285816_AKS|SAB_PART|AKS_PN|285816";
$parts = explode("|", $string);
$parts[2] = $parts[count($parts) - 1];
$string = implode("|", $parts);
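echo $string;
# BANG|ADAR|285816|MOTOR|GOOD||INDIA|2.4|SOFTWARE|285816_AKS|SAB_PART|AKS_PN|285816 (the last field is kept)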
?>
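The split-and-join version above takes the last field, which works only if the value always sits at the end of the record. A minimal sketch that looks up the field following the AKS_PN label instead, assuming the label appears as a field of its own:

```php
<?php
$string = "BANG|ADAR|285815|MOTOR|GOOD||INDIA|2.4|SOFTWARE|285816_AKS|SAB_PART|AKS_PN|285816";
$parts = explode("|", $string);

// Locate the AKS_PN label; the third argument requests strict comparison.
$labelIndex = array_search("AKS_PN", $parts, true);

if ($labelIndex !== false && isset($parts[$labelIndex + 1])) {
    // Copy the value that follows the label into the third field.
    $parts[2] = $parts[$labelIndex + 1];
}

$string = implode("|", $parts);
echo $string, "\n";
```

If the label is missing, the record is written back unchanged.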

How to find which string occurs first in a text among multiple strings?

I have text like this: "wow! It's Amazing.". I need to split this text on either "!" or "." and show the first element of the resulting array (for example $text[0]).
$str="wow! it's, a nice product.";
$text= preg_split('/[!.]+/', $str);
Here $text[0] has the value "wow" only, but I want to know which delimiter occurs first in the text ("!" or "."), so that I can append it to $text[0] and show "wow!".
I want to use this preg_split in smarty templates.
<p>{assign var="desc" value='/[!.]+/'|preg_split:'wow! it's, a nice product.'}
{$desc[0]}.</p>
The above code displays "wow". As far as I have found, there is no preg_match in Smarty; otherwise I would use that.
Any help would be appreciated. Thanks in advance.
Instead of preg_split you should use preg_match:
$str="wow! it's, a nice product.";
if (preg_match('/^[^!.]+[!.]/', $str, $m)) {
    $s = $m[0]; //=> wow!
}
If you must use preg_split only then you can do:
$arr = preg_split('/([^!.]+[!.])/', $str, -1, PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);
$s = $arr[0]; //=> wow!
Try this:
/(.+[!.])(.+)/
It will split the string into two:
$1 => wow!
$2 => it's, a nice product.
see here
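If a regular expression is not required, the same result can be sketched with plain string functions. This assumes, as in the question, that the delimiters are "!" and "." and that only the first sentence is needed; the variable names are illustrative.

```php
<?php
$str = "wow! it's, a nice product.";

// strcspn() returns the length of the leading run that contains
// none of the listed characters, i.e. the offset of the first delimiter.
$length = strcspn($str, '!.');

// If no delimiter exists, $length equals strlen($str) and nothing is appended.
$delimiter = $length < strlen($str) ? $str[$length] : '';
$first = substr($str, 0, $length) . $delimiter;

echo $first, "\n";
```

$delimiter also tells you which of the two characters came first.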

How to parse string pattern in php while the string is complex

I need your help with parsing a string. I have a string with the structure below:
MALANG|TVhHMTAwMDBK MALANGBONG,GARUT|QkRPMjA3MTlK MALANGKE BARAT,MASAMBA|VVBHMjMzMDVK MALANGKE,MASAMBA|VVBHMjMzMDRK
I'm confused about how to parse this string so that I get output like this:
MALANG|TVhHMTAwMDBK
MALANGBONG,GARUT|QkRPMjA3MTlK
MALANGKE BARAT,MASAMBA|VVBHMjMzMDVK
MALANGKE,MASAMBA|VVBHMjMzMDRK
The output pattern is City_Name|RandomCode.
I have tried to explode by space, but the city name sometimes also contains a space. What PHP function could I use to solve this problem?
Try this one out. It fits your example:
$str = 'MALANG|TVhHMTAwMDBK MALANGBONG,GARUT|QkRPMjA3MTlK MALANGKE BARAT,MASAMBA|VVBHMjMzMDVK MALANGKE,MASAMBA|VVBHMjMzMDRK';
$pattern = '/(?<=^| )[A-Z, ]+?\|[A-Za-z0-9]+(?= |$)/';
if (preg_match_all($pattern, $str, $matches)) {
    $parts = $matches[0];
}
You may need to tweak some of the character classes if, say, your city names contain anything other than capital letters, spaces and commas.
Example here - http://codepad.viper-7.com/6ujl3p
Alternatively, if the RandomCode parts are guaranteed to all be 12 characters long, preg_split may be a better fit, e.g.:
$pattern = '/(?<=\|[A-Za-z0-9]{12}) /';
$parts = preg_split($pattern, $str);
Demo here - http://codepad.viper-7.com/Wd4Wmc
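A further variation, sketched under the assumption that codes never contain whitespace and city names never contain a pipe: match each record as "anything up to a pipe, then a run of non-space characters", and split it into city and code afterwards. The array keys 'city' and 'code' are illustrative.

```php
<?php
$str = 'MALANG|TVhHMTAwMDBK MALANGBONG,GARUT|QkRPMjA3MTlK MALANGKE BARAT,MASAMBA|VVBHMjMzMDVK MALANGKE,MASAMBA|VVBHMjMzMDRK';

// [^|]+  the city name (may contain spaces and commas)
// \|     the separator
// \S+    the code, which ends at the next whitespace
preg_match_all('/[^|]+\|\S+/', $str, $matches);

$records = [];
foreach ($matches[0] as $record) {
    // Every match after the first starts with the separating space, so trim it.
    [$city, $code] = explode('|', trim($record), 2);
    $records[] = ['city' => $city, 'code' => $code];
}

print_r($records);
```

This does not depend on the code length or on the characters used in city names.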

php string replace/remove

Say I have strings: "Sports Car (45%)", or "Truck (50%)", how can I convert them to "Sports_Car" and "Truck".
I know about str_replace or whatever but how do I clip the brackets and numbers part off the end? That's the part I'm struggling with.
You can do:
$s = "Sports Car (45%)";
$s = preg_replace(array('/\([^)]*\)/','/^\s*|\s*$/','/ /'),array('','','_'),$s);
See it
There are a few options here, but I would do one of these:
// str_replace() the spaces to _, and rtrim() numbers/percents/brackets/spaces/underscores
$result = str_replace(' ','_',rtrim($str,'01234567890%() _'));
or
// Split by spaces, remove the last element and join by underscores
$split = explode(' ',$str);
array_pop($split);
$result = implode('_',$split);
or you could use one of a thousand regular expression approaches, as suggested by the other answers.
Deciding which approach to use depends on exactly how your strings are formatted, and how sure you are that the format will always remain the same. The regex approach is potentially more complicated but could afford finer-grained control in the long term.
A simple regex should be able to achieve that:
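$str = preg_replace('#\s*\([0-9]+%\)#', '', $str); // strip the "(45%)" part and the space before it
$str = str_replace(' ', '_', $str); // then turn the remaining spaces into underscores
Of course, you could also choose to use strstr() to look for the "(".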
You can do that using explode:
<?php
$string = "Sports Car (45%)";
$arr = explode(" (", $string); // split off the "(45%)" part
$answer = str_replace(" ", "_", $arr[0]); // "Sports_Car"
echo $answer;
?>
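The approaches above can be combined into one small helper. This is a sketch only: the function name labelToSlug is hypothetical, and it assumes the bracketed part always comes at the end of the string.

```php
<?php
// Hypothetical helper: "Sports Car (45%)" becomes "Sports_Car".
function labelToSlug(string $label): string
{
    // Drop a trailing "(...)" group together with the whitespace around it.
    $name = preg_replace('/\s*\([^)]*\)\s*$/', '', $label);

    // Collapse each remaining run of whitespace into a single underscore.
    return preg_replace('/\s+/', '_', trim($name));
}

foreach (['Sports Car (45%)', 'Truck (50%)'] as $label) {
    echo labelToSlug($label), "\n";
}
```

Anchoring the pattern at the end of the string leaves brackets elsewhere in the name untouched.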
