import-bot (20211) [Avatar] Offline
#1
That sure helped.
[Originally posted by nkamerka]

Hi Andrew,
That was a great help! I was toying with s///
after posting this, but forgot about the non-greedy operator.
Thanks a lot.
Nimish.
#2
[Originally posted by nkamerka]

#!/usr/bin/perl -w
use strict;
my(@fields, $field);

while (<DATA>)
{
    @fields = split / /;
}

for $field (@fields)
{
    print "$field\n";
}
__END__
{one} {two} {three four} {five} {six seven}

I want to split the line on the spaces between } and {.
With the above program I come close to it! How can I say what I
want to say?
Also, after processing the file, I want to put the
braces back, as they are needed further up the data chain!
Any ideas?
Nimish.
#3
Re: How to say split line where fields enclosed by '{' and '}'?
[Originally posted by nkamerka]

Sorry, the split should be split /} /
Nimish.
#4
Re: How to say split line where fields enclosed by '{' and '}'?
[Originally posted by jandrew]

Nimish,

In this case you might consider extracting your fields with
a m//g operation rather than a split:

@fields = m/\{(.*?)\}/g;

The above uses a non-greedy quantifier .*? (see chapter 10)
to capture just what is between curly braces. It will work
fine for the data you show, but will require more complexity
if your data is more complex (there might be escaped curlies
within a field).
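
As a rough sketch of that more complex case, assume a field escapes a brace with a backslash (\{ or \}). That convention is only an assumption for illustration; the data shown in this thread does not use it, and the sample line below is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical input: the second field contains backslash-escaped braces.
my $line = '{one} {two \{nested\} words} {three four}';

# Inside the braces, accept either an escaped character (\\.) or any
# character that is neither a backslash nor a closing brace.
my @fields = $line =~ m/\{((?:\\.|[^\\}])*)\}/g;

# Remove the escaping backslashes from each captured field.
s/\\(.)/$1/g for @fields;

print "$_\n" for @fields;
```

The non-greedy .*? is replaced here by an explicit description of what may appear inside a field, so an escaped closing brace no longer ends the match early.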

Outputting each field with the braces added back can be as
simple as wrapping each field in braces as you print it:

for $field (@fields) {
    print "{$field}\n";
}

To recreate the entire lines with braces added back, the easiest
method uses the map() and join() functions:

print join(" ", map{"{$_}"} @fields), "\n";

Does that help?

andrew
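
The extraction and the map()/join() rebuild described in this post can be put together as a round trip. This is only a sketch: the uppercase step stands in for whatever processing is actually needed, and $line, @processed, and $rebuilt are illustrative names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = '{one} {two} {three four} {five} {six seven}';

# Capture the text between each pair of braces (non-greedy).
my @fields = $line =~ m/\{(.*?)\}/g;

# Placeholder processing step (an assumption for illustration):
# uppercase each field.
my @processed = map { uc $_ } @fields;

# Wrap each field in braces again and rebuild the line.
my $rebuilt = join ' ', map { '{' . $_ . '}' } @processed;

print "$rebuilt\n";
```

Writing the block as '{' . $_ . '}' rather than a single interpolated string is a style choice here; it keeps the literal braces visually separate from the braces that delimit the map block.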
#6
Re: How to say split line where fields enclosed by '{' and '}'?
[Originally posted by nkamerka]

Hi Andrew,
There seems to be a small change in the data.
The fields are not all enclosed by { and }.
Only the last four fields are.
I can use two m// operators and get the result I
want as follows:
#!/usr/bin/perl -w
use strict;
my(@fields, $field, @last_four);

while (<DATA>)
{
    @fields = m/([^ ]*) /g;
    @last_four = m/\{(.*?)\}/g;
}
my $i = 0;
for $field (@fields)
{
    # print " Field: $field\n";
    print " Field: $field\n" if ($i < 4);
    $i++;
}
for $field (@last_four)
{
    print " Field: $field\n";
}

# $fields[
# print join(" ", map{"{$_}"} @fields), "\n";

__END__
ab-sd t/wo th:ree 488 {five six seven} {abc} {cdf gbf} {asgd sjaj dkdka}

I tried modifying the first match line as follows:
@fields = m/([^ ]*) {4}/g
with the {4} before the blank and also after the blank
in the match pattern, but without success. How can I tell it to capture only
the first four instances of the first pattern and then the second pattern,
all in one match, something like this?
@fields = m/([^ ]*) {4}{(.*?)}/g;
This of course doesn't work!

Nimish.
#7
Re: How to say split line where fields enclosed by '{' and '}'?
[Originally posted by jandrew]

Nimish,

To break it into both kinds of fields (non-space sequences and
sequences within curly braces) we can use alternation:

@fields = m/(\{.*?\}|\S+)/g;

We put the more complex alternative first; if it fails, we just
grab a sequence of non-space characters. This will NOT guarantee
that only the last four fields are curly-brace delimited --- it
allows for: all non-space sequences, or all curly-brace delimited, or
any mixture of non-space fields and curly-brace delimited fields.
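
If the layout does need to be enforced, one possible approach (a sketch, not the only way) is a single anchored pattern that requires four plain fields followed by four brace-delimited fields, with the brace fields then pulled out of the remainder. The names $word, @plain, @braced, and $rest are illustrative, and the sketch assumes that plain fields never contain braces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = 'ab-sd t/wo th:ree 488 {five six seven} {abc} {cdf gbf} {asgd sjaj dkdka}';

# A plain field: a run of characters with no whitespace and no braces.
my $word = qr/[^\s{}]+/;

my (@plain, @braced);

# Require four plain fields, then exactly four {...} groups up to end of line.
if ($line =~ m/^($word) ($word) ($word) ($word) ((?:\{[^{}]*\}\s*){4})$/) {
    @plain = ($1, $2, $3, $4);
    my $rest = $5;                      # copy before the next match resets $5
    @braced = $rest =~ m/\{(.*?)\}/g;
}
else {
    warn "Line does not match the expected layout\n";
}

print " Field: $_\n" for @plain, @braced;
```

A line with a different number of plain or brace-delimited fields falls through to the warning instead of being silently accepted.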

Also note that the curly braces aren't stripped in this case --- but
you can always do that as a separate step:

s/^\{(.*?)\}$/$1/ for @fields;

or take care of it when processing each field (or whatever is to be
done).
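
Put together on the sample line from the question, the alternation and the brace-stripping step might look like the following sketch ($line is an illustrative name; the original program reads from DATA instead).

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $line = 'ab-sd t/wo th:ree 488 {five six seven} {abc} {cdf gbf} {asgd sjaj dkdka}';

# Try a brace-delimited group first; otherwise take a run of
# non-space characters.
my @fields = $line =~ m/(\{.*?\}|\S+)/g;

# Strip the surrounding braces where present; other fields are untouched.
s/^\{(.*?)\}$/$1/ for @fields;

print " Field: $_\n" for @fields;
```

The order of the alternatives matters: with \S+ first, a field such as {five six seven} would be split at its spaces.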

Does that help address the problem?

andrew