preg_match all the occurrences in a line - php

Example (file=xref.tex):
This is a example string and first line with <xref>id1</xref>then,<xref>id2</xref>and with no line breaks<xref>id3</xref>.
This is a second line which has <xref>id4</xref>
Example (file=id):
id1 eqvalue1
id2 eqvalue2
id3 eqvalue3
id4 eqvalue4
Requirement: Every unique id has an equivalent value. I need to replace each id with its equivalent value at every occurrence in the "xref.tex" file.
Tried so far:
$xref=file("xref.tex");
$idfile=file("id");
for($y=0;$y<count($xref);$y++){
for($z=0;$z<count($idfile);$z++){
$idvalue=explode(" ",$idfile[$z]);//exploding based on space character
$id1=$idvalue[0]; //this is equivalent value of unique id
$id2=$idvalue[1]; // this is unique id
preg_match( '/<xref>(.*?)<\/xref/', $xref[$y], $match );
//getting the content between "<xref>"and "</xref>"
if($match[1]===$id2){
$xref[$y]=str_replace($match[1],$id1,$xref[$y]);}
//here first occurrence of id is replaced. how to replace
//second occurrence of id in a line as
//preg_match( '/<xref>(.*?)<\/xref/', $xref[$y], $match )
//this regex focusing on first occurrence only every time.
//???? prob here is how can i do this logic in all the occurrences
//in each line
}
}
}
Expected output:
This is a example string and first line with <xref>eqvalue1</xref>then,<xref>eqvalue2</xref>and with no line breaks<xref>eqvalue3</xref>.
This is a second line which has <xref>eqvalue4</xref>

Here is what I understand. The contents of the file xref.tex are as follows:
<xref>id1</xref><xref>id2</xref><xref>id3</xref><xref>id4</xref> //line 1
<xref>id2</xref><xref>id3</xref> //line 2
<xref>id4</xref> //line 3
... and so on
First of all, you have to fix the regex. You're missing > at the end of it. It should be
/<xref>(.*?)<\/xref>/
Then you need to use preg_match_all instead of preg_match as suggested.
I've modified the code a little bit. This should also work if the same id is repeated in a single line.
$xref = file("xref.tex");
$idfile = file("id", FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES); // drop trailing newlines so values compare cleanly
for ($y = 0; $y < count($xref); $y++)
{
    preg_match_all('/<xref>(.*?)<\/xref>/', $xref[$y], $match); // get all matches and store them in $match
    for ($z = 0; $z < count($idfile); $z++)
    {
        $idvalue = explode(" ", $idfile[$z]);
        $id1 = $idvalue[0]; // unique id
        $id2 = $idvalue[1]; // equivalent value
        // Below, we replace every match in the line that carries this id with its equivalent value. Edit: Maybe not the best way, but it will give you an idea.
        foreach ($match[1] as $matchItem)
            if ($matchItem === $id1)
                $xref[$y] = str_replace("<xref>$id1</xref>", "<xref>$id2</xref>", $xref[$y]);
    }
}
EDIT
You might want to check preg_replace. I think that would be a better solution.

Read the file "id" as space-separated CSV into an array, then read the other file as a string with file_get_contents and apply preg_replace to it using that array.
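A minimal sketch of that idea, not a definitive implementation. It swaps in preg_replace_callback for plain preg_replace so that each id can be looked up in the array, and it uses in-memory strings ($idText and $xrefText are hypothetical stand-ins) where real code would call file_get_contents on "id" and "xref.tex".

```php
<?php
// Hypothetical stand-ins for file_get_contents("id") and file_get_contents("xref.tex").
$idText = "id1 eqvalue1\nid2 eqvalue2\nid3 eqvalue3\nid4 eqvalue4\n";
$xrefText = "first line with <xref>id1</xref>then,<xref>id2</xref>and<xref>id3</xref>.\n"
          . "second line which has <xref>id4</xref>\n";

// Build the lookup table: unique id => equivalent value.
$map = [];
foreach (preg_split('/\R/', trim($idText)) as $row) {
    [$id, $value] = explode(' ', $row, 2);
    $map[$id] = $value;
}

// Replace the content of every <xref>...</xref> whose id is known;
// unknown ids are left untouched.
$result = preg_replace_callback(
    '/<xref>(.*?)<\/xref>/',
    function (array $m) use ($map) {
        return isset($map[$m[1]]) ? '<xref>' . $map[$m[1]] . '</xref>' : $m[0];
    },
    $xrefText
);

echo $result;
```

Because the whole file is processed as one string, every occurrence on every line is handled in a single pass, and the id file is read only once.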

Try this:
$re = '/(<xref>)\D+(\d+)(<\/xref>)/';
$str = "This is a example string and first line with <xref>id1</xref>then,<xref>id2</xref>and with no line breaks<xref>id3</xref>. This is a second line which has <xref>id4</xref>";
$subst = '${1}eqvalue${2}${3}';
$result = preg_replace($re, $subst, $str);
// Note: this only works because every equivalent value here is "eqvalue" followed by the number of its id.

Related

Remove Next line at the beginning of a string in php

I want to remove all the newlines at the beginning of a string.
I tried str_replace("\n",null,$resultContent), but that removes every newline in the string.
Example: I need to remove the newline at the beginning of this string:
"
String here
String following."
I need to delete the newline at the beginning.
Please refer to this page:
http://www.w3schools.com/php/func_string_trim.asp
Use ltrim($resultContent,"\n") to remove all newline characters from the start of the string.
Just explode and take the first result
Do not forget to do some test : if !is_array() .....
$x = explode("\n", $resultContent);
$resultContent = $x[0];
You can also use it like this:
if (strncmp($resultContent, "\n", 1) === 0) { // true if the string starts with a newline
    $resultContent = ltrim($resultContent, "\n"); // strip only the leading newlines and keep the result
}
Not sure whether you just want to strip blank lines, or remove everything after the \n from the first instance of a word... went with the former so hopefully this is what you're after:
$string = "String first line
string second line";
$replaced = preg_replace('/[\r\n]+/', PHP_EOL, $string);
echo $replaced;
Returns:
String first line
string second line
Sounds like ltrim() is what you're looking for:
ltrim — Strip whitespace (or other characters) from the beginning of a
string!
echo $result_string = ltrim($string);
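As a small illustration of the difference from str_replace, here is a sketch using the sample string from the question. Passing "\r\n" as the character list is an assumption, added so that Windows-style line breaks are stripped as well.

```php
<?php
$resultContent = "\n\nString here\nString following.";

// ltrim only touches the start of the string; the newline between
// the two sentences is kept.
$trimmed = ltrim($resultContent, "\r\n");

var_dump($trimmed === "String here\nString following.");
```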

How to get a new line from paragraph using preg_match_all function in php?

I have a paragraph with some new lines:
First line
Second line
Third line
And this is the last line
I want to get the second line from the above paragraph.
So the result I want should be :
"Second line"
I have tried the following script with preg_match_all() function but I don't know why it's not working.
<?php
$pera="First line
Second line
Third line
And this is the last line";
preg_match_all("#\n+{2}.*+#",$pera,$results);
print_r($results);
Do you have any idea how to get the second line from the paragraph?
Any help is much appreciated.
Thanks!
For the purpose demonstrated, explode really performs better, but if you want or have to use a regex, don't use preg_match_all. It finds every match, which you don't need here, so go with preg_match. Then change the pattern:
\n{2}.*
This matches the second line together with the newline characters that precede it. Note that the pattern expects two consecutive newlines, i.e. a blank line between the lines.
https://regex101.com/r/jA3dL9/1
If you want to match without the newlines, use a capturing group:
\n{2}(.*)
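A minimal sketch of the capturing-group approach in PHP. It assumes the lines are separated by single line breaks with no blank lines in between, so it uses \R (any line break sequence) where the pattern above uses \n{2}.

```php
<?php
$pera = "First line\nSecond line\nThird line\nAnd this is the last line";

// preg_match stops at the first match: the first line break,
// followed by the rest of that line (the dot does not match "\n").
if (preg_match('/\R(.*)/', $pera, $m)) {
    $secondLine = $m[1];
    echo $secondLine, "\n";
}
```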
Try this:
$pera="First line
Second line
Third line
And this is the last line";
$results = explode("\n", $pera);
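print_r($results[2]); // index 2 assumes a blank line after each line; with single newlines the second line is $results[1]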
Try with:
$data = array_values(
array_filter(
explode("\r\n", $pera) // or just \n
)
);
echo $data[1]; // index = line number - 1
Demo: http://3v4l.org/gpgOj

RegEx or Similar - Grab string preceding matched value

Here's the deal: I am handling an OCR text document and grabbing UPC information from it with RegEx. That part I've figured out. Then I query a database, and if I don't have a record of that UPC I need to go back to the text document and get the description of the product.
The format on the receipt is:
NAME OF ITEM 123456789012
OTHER NAME 987654321098
NAME 567890123456
So, when I go back the second time to find the name of the item I am at a complete loss. I know how to get to the line where the UPC is, but how can I use something like regex to get the name that precedes the UPC? Or some other method. I was thinking of somehow storing the entire line and then parsing it with PHP, but not sure how to get the line either.
Using PHP.
Get all of the names of the items indexed by their UPCs with a regex and preg_match_all():
$str = 'NAME OF ITEM 123456789012
OTHER NAME 987654321098
NAME 567890123456';
preg_match_all( '/^(.*?)\s+(\d+)/m', $str, $matches);
$items = array();
foreach( $matches[2] as $k => $upc) {
if( !isset( $items[$upc])) {
$items[$upc] = array( 'name' => $matches[1][$k], 'count' => 0);
}
$items[$upc]['count']++;
}
This forms $items so it looks like:
Array (
[123456789012] => Array ( [name] => NAME OF ITEM [count] => 1 )
[987654321098] => Array ( [name] => OTHER NAME [count] => 1 )
[567890123456] => Array ( [name] => NAME [count] => 1 )
)
Now, you can look up any item name you want in O(1) time:
echo $items['987654321098']['name']; // OTHER NAME
You can find the string preceding a value you know with the following regex:
$receipt = "NAME OF ITEM 123456789012\n" .
"OTHER NAME 987654321098\n" .
"NAME 567890123456";
$upc = '987654321098';
if (preg_match("/^(.*?) *{$upc}/m", $receipt, $matches)) {
$name = $matches[1];
var_dump($name);
}
The /m flag on the regex makes the ^ work properly with multi-line input.
The ? in (.*?) makes that part non-greedy, so it doesn't grab the spaces before the UPC.
It would be simpler if you grabbed both the name and the number at the same time during the initial pass. Then, when you check the database to see if the number is present, you already have the name if you need to use it. Consider:
preg_match_all('/^([A-Za-z ]+) (\d+)$/m', $document, $matches, PREG_SET_ORDER);
foreach ($matches as $match) {
$name = $match[1];
$number = $match[2];
if (!order_number_in_database($number)) {
save_new_order($number, $name);
}
}
You can use lookahead assertions to match string preceding the UPC.
http://php.net/manual/en/regexp.reference.assertions.php
Something like this: ^.*?(?=\s*123456789012), used with the m (multiline) flag and with the UPC replaced by that of the item you want to find.
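A minimal sketch of the lookahead approach in PHP, using the receipt from the question. The pattern here is an assumption rather than the exact one quoted above: it uses a lazy .*? so that names containing spaces are matched, and preg_quote to embed the UPC safely.

```php
<?php
$receipt = "NAME OF ITEM 123456789012\n"
         . "OTHER NAME 987654321098\n"
         . "NAME 567890123456";
$upc = '987654321098';

// ^ with the m flag anchors at the start of each line; the lookahead
// (?=...) requires the UPC to follow without making it part of the match.
$pattern = '/^.*?(?=\s*' . preg_quote($upc, '/') . ')/m';

if (preg_match($pattern, $receipt, $m)) {
    $name = $m[0];
    echo $name, "\n";
}
```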
I'm lazy, so I would just use one regex that gets both parts in one shot using matching groups. Then, I would call it every time and put each capture group into name and upc variables. For cases in which you need the name, just reference it.
Use this type of regex:
/([a-zA-Z ]+)\s*(\d*)/
Then you will have the name in the $1 matching group and the UPC in the $2 matching group. Sorry, it's been a while since I've used PHP, so I can't give you an exact code snippet.
Note: the suggested regex assumes you'll only have letters or spaces in your "names"; if that's not the case, you'll have to expand the character class.

In PHP Remove several characters from the beginning of a String?

I need to find a specific line of text in a text file,
and then copy it to a new text-file:
1: I have a text file with several lines of text, eg:
JOHN
MIKE
BEN
*BJAMES
PETE
2: So, I read that text file's contents into an array,
with each line of text placed into a separate element of the array.
3: Then I tested each element of the array,
to find the line that starts with, say: *B ie:
if ( preg_match( "/^\*(B)/",$contents[$a] ) )
Which works ok...
4: Then I copy (WRITE) that line of text, to a new text-file.
Q: So how can I remove the '*B' from that line of text,
BEFORE I WRITE it to the new text-file ?
If you already use preg_match, you can modify your regex to get what you want in another variable.
if (preg_match('/^\*B(.*)$/', $contents[$a], $matches))
{
fwrite($targetPointer, $matches[1]);
}
After calling preg_match, the variable $matches holds the matches for the subparts of the regex enclosed in parentheses. So the relevant part of your line is matched by (.*) and saved into $matches[1].
This approach writes the lines as the file is read, which is more memory efficient:
$sourceFile = new SplFileObject('source.txt');
$destinationFile = new SplFileObject('destination.txt', 'w+');
foreach (new RegexIterator($sourceFile, '/^\*B.*/') as $filteredLine) {
$destinationFile->fwrite(
substr_replace($filteredLine, '', 0, 2)
);
}
With substr or preg_replace.
Have a try with:
$line = preg_replace('/^\*B/', '', $contents[$a], -1, $count); // keep the result; preg_replace does not modify its input
if ($count) {
    fwrite($file, $line);
}
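The substr variant mentioned above could look like the following sketch. The array stands in for the file's lines, and $kept is a hypothetical name; real code would fwrite each stripped line to the target file.

```php
<?php
// Stand-in for the lines read from the text file.
$contents = ["JOHN", "MIKE", "BEN", "*BJAMES", "PETE"];

$kept = [];
foreach ($contents as $line) {
    // strncmp compares only the first 2 characters, so no regex is needed.
    if (strncmp($line, '*B', 2) === 0) {
        $kept[] = substr($line, 2); // drop the leading "*B"
    }
}

print_r($kept);
```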

Delete all lines that doesn't contain a specific word in php

I need some code that can filter an array by removing the entries that don't contain a specific word.
In other words, keep only the entries that contain the word and drop all the others.
Which approach uses fewer resources?
Update: the correct answer to my problem is
<?php
$nomatch = preg_grep("/{$keyword}/i",$array,PREG_GREP_INVERT);
?>
Notice the PREG_GREP_INVERT.
That results in an array ($nomatch) that contains all entries of $array where $keyword IS NOT found.
So you have to remove the invert flag and use it like this :)
$nomatch = preg_grep("/{$keyword}/i",$array);
Now it returns only the lines that contain that specific word.
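A short sketch showing both modes side by side on made-up sample data. The variable names and the use of preg_quote are assumptions; preg_quote keeps a keyword containing regex metacharacters from breaking the pattern.

```php
<?php
$array = ['Line one', 'line with the word magic', 'last line'];
$keyword = 'Magic';

$pattern = '/' . preg_quote($keyword, '/') . '/i';

// Entries that contain the keyword (original keys are preserved).
$match = preg_grep($pattern, $array);

// Entries that do NOT contain the keyword.
$nomatch = preg_grep($pattern, $array, PREG_GREP_INVERT);

print_r($match);
print_r($nomatch);
```

Since preg_grep preserves the original keys, wrap the result in array_values() if a re-indexed list is needed.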
You can use preg_grep with
$nomatch = preg_grep("/$WORD/i",$array,PREG_GREP_INVERT);
A more general solution is to use array_filter with a custom filter
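$word = 'magic'; // the word to filter on
$inverseWordFilter = function ($string) use ($word) // a closure, so $word is in scope
{
    return !preg_match('/' . preg_quote($word, '/') . '/i', $string);
};
$newArray = array_filter($inputArray, $inverseWordFilter);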
The /i at the end of the pattern means case-insensitive; remove it to make the match case-sensitive.
you can use preg_grep with PREG_GREP_INVERT option
$alines[0] = 'Line one';
$alines[1] = 'line with the word magic';
$alines[2] = 'last line';
$word = 'Magic';
for ($i=0;$i<count($alines);++$i)
{
if (stripos($alines[$i],$word)!==false)
{
array_splice($alines,$i,1);
$i--;
}
}
var_dump($alines);
Since this is such a simple problem, I'll give you pseudo-code instead of the actual code - to make sure you still have some fun with it:
Create a new string where you'll keep the result
Split the original text into an array of lines using explode()
Iterate over the lines:
- Check whether the current line contains your specific word (use substr_count())
-- If it does, skip over that line
-- If it does not, append the line to the result
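For reference, the pseudo-code above can be sketched as follows. The sample text and variable names are assumptions, and substr_count is case-sensitive, so "Magic" and "magic" are treated as different words here.

```php
<?php
$text = "Line one\nline with the word magic\nlast line";
$word = 'magic';

$result = ''; // new string that collects the kept lines
foreach (explode("\n", $text) as $line) {
    if (substr_count($line, $word) > 0) {
        continue; // the line contains the word: skip it
    }
    $result .= $line . "\n"; // otherwise append it to the result
}

echo $result;
```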
