regular expressions for one sentence per line

I am trying to take a textarea value and run it through a regular expression to split it into lines.
So if someone writes a line, presses Enter, writes another line and presses Enter again, I will have an array with one line per array value.
The expression I've come up with so far is:
(.+?)\n|\G(.*)
and this is how I use it (taken from a website I use to test expressions, http://myregextester.com/):
$sourcestring="
this is a sentense yeaa
interesting sentense
yet another sentese
";
preg_match_all('/(.+?)\n|\G(.*)/',$sourcestring,$matches);
echo "<pre>".print_r($matches,true);
However, there is one element in the array that is always empty, and I am trying to find a way to get rid of it.
Thanks in advance.

You don't need a regex for this, just use explode(), like so:
$lines = explode( "\n", trim( $input));
Now each line of the user's $input will be a single array entry in $lines.

This will do, and it gets rid of the empty lines at the beginning and end of the array:
explode("\n", trim($sourcestring));
See example: http://viper-7.com/pNqtvV

There are various types of newlines. In an HTML form context you'll typically receive CR LF line endings. A plain explode() will do, but a regex catches all variations if you use \R. Thus \r\n, \n, \r and others are all handled by:
$lines = preg_split(':\R:', $text);
preg_split() is the regex equivalent of PHP's explode(), so you don't need preg_match_all().
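
As a minimal sketch of the \R approach, the snippet below splits a string with mixed line endings. The helper name split_lines() and the sample text are assumptions made for illustration; PREG_SPLIT_NO_EMPTY is added here so that leading, trailing and doubled line breaks do not produce empty elements.

```php
<?php
// Hypothetical helper (not from the thread): split text on any newline sequence.
function split_lines(string $text): array
{
    // \R matches \r\n, \n, \r and other line-break sequences.
    // PREG_SPLIT_NO_EMPTY drops the empty strings that leading,
    // trailing or doubled line breaks would otherwise produce.
    return preg_split('/\R/', $text, -1, PREG_SPLIT_NO_EMPTY);
}

// Sample input mixing CR LF, LF and a lone CR.
$text = "\r\nthis is a sentence\r\nanother one\nyet another\r";
print_r(split_lines($text));
```

Because the flag removes every empty element, this variant also discards blank lines in the middle of the input; drop the flag and trim() the input instead if blank lines should be kept.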

Related

php preg_match() code for comma separated names

I need to validate input using preg_match() so that it matches a pattern like " anyname1,anyname2,anyname3, ".
Note that there is a comma at the end too. Any letter or number is valid between the commas, and numbers do not have to appear at the end, e.g. "nam1e,Na2me,NNm22," is valid.
I tried ^([A-Za-z0-9] *, *)*[A-Za-z0-9]$ but it did not work. I have gone through other posts too but did not find a complete answer.
Can someone give me an answer for this?

If you just want the actual values without the comma, then you can simply use this:
\w+(?=[,])
http://regex101.com/r/xT6wE4/1

It sounds like you want to validate that the string contains a series of comma-separated alphanumeric substrings, each followed by a comma (including the last one).
In that situation, this should achieve what you want.
$str = "anyname1,anyname2,anyname3,";
$re = '~^([a-z0-9]+,)+$~i';
if (preg_match($re, $str)) {
    // String matches the pattern!
}
else {
    // Nope!
}
If the value stored in $str contains a trailing space like in your example, and you don't want to use trim() on the value, the following regex will allow for whitespace at the end of $str:
~^([a-z0-9]+,)+\s*$~i
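
To see how the pattern behaves on a few inputs, here is a small sketch. The function name is_name_list() and the test strings other than the two taken from the question are assumptions; it uses trim() rather than the \s* variant to handle the surrounding spaces.

```php
<?php
// Hypothetical helper: true when $str is one or more alphanumeric names,
// each followed by a comma (so the trailing comma is required).
function is_name_list(string $str): bool
{
    // trim() removes the surrounding spaces shown in the question's example.
    return preg_match('~^([a-z0-9]+,)+$~i', trim($str)) === 1;
}

var_dump(is_name_list(' anyname1,anyname2,anyname3, ')); // example from the question
var_dump(is_name_list('nam1e,Na2me,NNm22,'));            // example from the question
var_dump(is_name_list('name1,name2'));                   // no trailing comma
var_dump(is_name_list('na me,'));                        // space inside a name
```

Comparing against 1 keeps the return value a strict boolean, since preg_match() returns 0 for no match and false on error.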

Why use such a complex solution for a simple problem? You can do the same in two steps:
1: trim spaces, line feeds, carriage returns and commas:
$line = trim($line," \r\n,");
2: explode on commas to see all the names:
$array = explode(',',$line);
You're not telling us what you're going to use it for, so I cannot know which format you really need. But my point is that you don't need complex string functions to do simple tasks.

^([a-zA-Z0-9]+,)+$
You can simply do this. See the demo:
http://regex101.com/r/yR3mM3/8

Turning multi-line string into multi-element array using regular expressions in PHP

I need to split the following string and put each new line into a new array element.
this is line a.(EOL chars = '\r\n' or '\n')
(EOL chars)
this is line b.(EOL chars)
this is line c.(EOL chars)
this is the last line d.(OPTIONAL EOL chars)
(Note that the last line might not have any EOL characters present. The string also sometimes contains only 1 line, which is by definition the last one.)
The following rules must be followed:
Empty lines (like the second line) should be discarded and not put into the array.
EOL chars should not be included, because otherwise my string comparisons fail.
So this should result in the following array:
[0] => "this is line a."
[1] => "this is line b."
[2] => "this is line c."
[3] => "this is the last line d."
I tried doing the following:
$matches = array();
preg_match_all('/^(.*)$/m', $str, $matches);
return $matches[1];
$matches[1] indeed contains each new line, but:
Empty lines are included as well
It seems that a '\r' character gets smuggled in anyway at the end of the strings in the array. I suspect this has something to do with the regex range '.' which includes everything except '\n'.
Anyway, I've been playing around with '\R' and whatnot, but I just can't find a good regex pattern that follows the two rules I outlined above. Any help please?

Just use preg_split() to split on the regular expression:
// Split on \n, \r is optional..
// The last element won't need an EOL.
$array = preg_split("/\r?\n/", $string);
Note, you might also want to trim($string) if there is a trailing newline, so you don't end up with an extra empty array element.

There is a function just for this - file()

I think preg_split would be the way to go... You can use an appropriate regexp to use any EOL character as separator.
Something like the following (the regexp needs to be a bit more elaborate):
$array = preg_split('/[\n\r]+/', $string);
Hope that helps.

Use preg_split function:
$array = preg_split('/[\r\n]+/', $string);
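
A short sketch of how this pattern meets both rules from the question. The helper name lines_without_blanks() is made up for illustration, and the PREG_SPLIT_NO_EMPTY flag is an addition to the answers above so that an optional trailing EOL does not leave an empty last element.

```php
<?php
// Hypothetical helper name; the pattern is the one from the answers above.
function lines_without_blanks(string $str): array
{
    // [\r\n]+ treats any run of CR/LF characters as a single separator,
    // so empty lines between two EOL sequences disappear.
    // PREG_SPLIT_NO_EMPTY drops the empty element that a leading or
    // trailing EOL would otherwise leave behind.
    return preg_split('/[\r\n]+/', $str, -1, PREG_SPLIT_NO_EMPTY);
}

$str = "this is line a.\r\n\r\nthis is line b.\nthis is line c.\r\nthis is the last line d.";
print_r(lines_without_blanks($str));
```

Lines that contain only spaces or tabs are not empty strings and would still be kept; filtering those out would need an extra step.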

Problem using regex to remove number formatting in PHP

I'm having this issue with a regular expression in PHP that I can't seem to crack. I've spent hours searching to find out how to get it to work, but nothing seems to have the desired effect.
I have a file that contains lines similar to the one below:
Total','"127','004"','"118','116"','"129','754"','"126','184"','"129','778"','"128','341"','"127','477"','0','0','0','0','0','0
These lines are inserted into INSERT queries. The problem is that values like "127','004" are actually supposed to be 127,004, or without any formatting: 127004. The latter is the actual value I need to insert into the database table, so I figured I'd use preg_replace() to detect values like "127','004" and replace them with 127004.
I played around with a Regular Expression designer and found that I could use the following to get my desired results:
Regular Expression
"(\d+)','(\d{3})"
Replace Expression
$1$2
The line on the top of this post would end up like this: (which is what I am after)
Total','127004','118116','129754','126184','129778','128341','127477','0','0','0','0','0','0
This, however, does not work in PHP. Nothing is being replaced at all.
The code I am using is:
$line = preg_replace("\"(\d+)','(\d{3})\"", '$1$2', $line);
Any help would be greatly appreciated!

There are no delimiters in your regex. Delimiters are required in order for PHP to know what is the pattern to match and what is a pattern modifier (e.g. i - case-insensitive, U - ungreedy, ...). Use a character that doesn't occur in your pattern, typically you'll see a slash '/' used.
Try this:
$line = preg_replace("/\"(\d+)','(\d{3})\"/", '$1$2', $line);

You forgot to wrap your regular expression in forward slashes. Try this instead:
"/\"(\d+)','(\d{3})\"/"

Use preg_replace("#\"(\d+)','(\d+)\"#", '$1$2', $s); instead of yours.
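
For a quick check of the delimiter fix, the sketch below runs the delimited pattern on a shortened version of the line from the question. The wrapper function strip_number_formatting() is a name invented for this example.

```php
<?php
// Hypothetical helper: remove the quoted thousands formatting from one line.
function strip_number_formatting(string $line): string
{
    // The slashes are the delimiters PHP requires around the pattern.
    // Inside a single-quoted PHP string only the quotes need escaping.
    return preg_replace('/"(\d+)\',\'(\d{3})"/', '$1$2', $line);
}

// Shortened sample of the line from the question (nowdoc avoids quote escaping).
$line = <<<'TXT'
Total','"127','004"','"118','116"','0','0
TXT;
echo strip_number_formatting($line), "\n";
```

The pattern only covers values with a single thousands separator, as in the sample data; larger values split into three or more groups would need a different pattern.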

PHP explode function with file_get_contents?

<?php
$str = "Hello world. It's a beautiful day.";
print_r (explode(" ",$str));
?>
The above code prints an array as an output.
If I use
<?php
$homepage = file_get_contents('http://www.example.com/data.txt');
print_r (explode(" ",$homepage));
?>
However, it does not display the individual numbers from the text file as separate array elements.
Ultimately I want to read numbers from a text file and print their frequency. The data.txt has 100,000 numbers. One number per line.

A new line is not a space. You have to explode at the appropriate new line character combination. E.g. for Linux:
explode("\n",$homepage)
Alternatively, you can use preg_split and the character class \s, which matches every whitespace character:
preg_split('/\s+/', $homepage);
Another option (maybe faster) might be to use fgetcsv.

If you want the content of a file as an array of lines, there is already a built-in function
var_dump(file('http://www.example.com/data.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES));
See Manual: file()

Try exploding at "\n"
print_r (explode("\n",$homepage));
Also have a look at:
http://php.net/manual/de/function.file.php

You could also solve it with a regular expression:
$homepage = file_get_contents("http://www.example.com/data.txt");
preg_match_all("/^[0-9]+$/m", $homepage, $matches);
$matches[0] will then contain an array of the numbers. The m modifier makes ^ and $ match at every line rather than only at the start and end of the whole string, so only lines consisting solely of digits are retrieved, in case the file is not well formatted (this assumes the file uses \n line endings).

You are not exploding the string on the correct character. You either need to explode on the newline separator (\n) or use a regular expression (slower but more robust). In the latter case, use preg_split.
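
Since the stated goal is to print the frequency of each number, here is one possible sketch that combines the splitting advice above with array_count_values(). The function name number_frequencies() and the inline sample data are assumptions; in the real script the string would come from file_get_contents().

```php
<?php
// Hypothetical helper: count how often each number occurs, one number per line.
function number_frequencies(string $contents): array
{
    // Split on any line-break sequence and skip empty lines.
    $lines = preg_split('/\R/', $contents, -1, PREG_SPLIT_NO_EMPTY);
    // Keep only lines that consist solely of digits after trimming whitespace.
    $numbers = preg_grep('/^[0-9]+$/', array_map('trim', $lines));
    // array_count_values() maps each distinct value to its number of occurrences.
    return array_count_values($numbers);
}

// Sample data standing in for file_get_contents('http://www.example.com/data.txt').
$contents = "3\n7\r\n3\n\n42\n7\n3\n";
print_r(number_frequencies($contents));
```

For a local file, file() with FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES would replace the preg_split() step.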

Regex to strip some lines out of a text file

I need to try and strip out lines in a text file that match a pattern something like this:
anything SEARCHTEXT;anything;anything
where SEARCHTEXT will always be a static value and each line ends with a line break. Any chance someone could help with the regex for this, please? Or give me some ideas on where to start (it has been too many years since I looked at regex).
I am planning on using PHP's preg_replace() for this.
Thanks.

This solution removes all lines in $text which contain the sub-string SEARCHTEXT:
$text = preg_replace('/^.*?SEARCHTEXT.*\n?/m', '', $text);
My benchmark tests indicate that this solution is more than 10 times faster than '/\n?.*SEARCHTEXT.*$/m' (and this one correctly handles the case where the first line matches and the second one doesn't).

Use a regex to match the whole line like so:
^.*SEARCHTEXT.*$
preg_replace would be a good option for this.
$str = preg_replace('/\n?.*SEARCHTEXT.*$/m', '', $str);
The leading \n? matches the line break before the matched line, so matched lines are removed entirely rather than leaving empty lines in the string.
The /m flag makes ^ and $ match at the start and end of each line instead of only at the start and end of the string.
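
A small sketch of the line-removal approach using the '/^.*?SEARCHTEXT.*\n?/m' pattern from the first answer. The wrapper remove_lines_containing() and the sample text are assumptions; preg_quote() is added so the search text is treated literally.

```php
<?php
// Hypothetical helper: delete every line of $text that contains $needle.
function remove_lines_containing(string $text, string $needle): string
{
    // preg_quote() escapes regex metacharacters in the search text,
    // including the / used as delimiter here.
    $pattern = '/^.*?' . preg_quote($needle, '/') . '.*\n?/m';
    return preg_replace($pattern, '', $text);
}

// Sample text with matching lines at the start and in the middle.
$text = "first SEARCHTEXT;a;b\nkeep me\nother SEARCHTEXT;c;d\nkeep me too\n";
echo remove_lines_containing($text, 'SEARCHTEXT');
```

This assumes \n line endings; with \r\n input the pattern would need \R? in place of \n?.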
