Originally Posted by rfr3sh
pinnacle changing their website is giving me headaches now
...
so are td tags and td class tags the same or different
<td class='foo'> means a <td> with a specific ('foo') formatting of the text. The meaning of 'foo' is described elsewhere.
Originally Posted by rfr3sh
pinnacle changing their website is giving me headaches now
for instance if I'm scraping
Milwaukee Brewers |
1.5 |
187 |
I am scraping td tags
however now in the new layout it is displayed as
Milwaukee Brewers M. Parra |
+1.5 -126 |
+187 |
so are td tags and td class tags the same or different
When I want to match a td tag with a regular expression I use:
Code:
m/<td[^>]*>/si
This tells Perl to match a (potentially multiline) string beginning with the substring '<td', and then ending with the closing '>'.
And of course if you wanted to get the contents within the <td> tag pair then (assuming you were reading from STDIN) you could use:
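To make the "same or different" question concrete, here is a small sketch. The sample strings, the class name 'foo' and the sub name has_td_open are made up for illustration. It suggests that a pattern of the form <td[^>]*> matches the opening tag whether or not a class attribute follows the tag name:

```perl
use strict;
use warnings;

# Opening-tag pattern: '<td', then any run of non-'>' characters, then '>'.
my $td_open = qr/<td[^>]*>/si;

# Hypothetical sample tags (not taken from any real page).
my @samples = (
    "<td>",
    "<td class='foo'>",
    "<TD\n   class='foo'>",   # attribute on a second line
    "<tr>",                   # not a td tag
);

# Returns 1 if the string contains an opening td tag, 0 otherwise.
sub has_td_open {
    my ($text) = @_;
    return $text =~ $td_open ? 1 : 0;
}

for my $s (@samples) {
    (my $shown = $s) =~ s/\s+/ /g;    # flatten whitespace for printing
    printf "%-20s %s\n", $shown, has_td_open($s) ? "match" : "no match";
}
```

One caveat: the pattern only checks that the tag starts with 'td', so it would also match a tag whose name merely begins with those letters.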
Code:
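#!perl
use strict;
use warnings;
# Read input line by line (STDIN, or files named on the command line).
while (<>) {
    # Capture the text between an opening <td ...> tag and its closing </td>.
    if (m!<td[^>]*>(.*?)</td[^>]*>!si) {
        warn $1;
    }
}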
With $1 containing the matched substring.
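The loop above reads one line at a time and reports at most one cell per line. If a cell spans several lines, or one line holds several cells, it may be easier to work on the whole page as a single string. A minimal sketch, assuming the page has already been read into a scalar; the sample markup, the class names and the sub name extract_cells are invented for illustration and are not Pinnacle's actual HTML:

```perl
use strict;
use warnings;

# Returns the text between each <td ...> and </td> pair, in document order.
sub extract_cells {
    my ($html) = @_;
    # /g in list context returns every capture; /s lets '.' cross newlines.
    my @cells = $html =~ m!<td[^>]*>(.*?)</td[^>]*>!gsi;
    for (@cells) {
        s/^\s+//;    # trim leading whitespace
        s/\s+$//;    # trim trailing whitespace
    }
    return @cells;
}

# Hypothetical sample row; the class names are made up.
my $html = <<'HTML';
<tr>
  <td class='team'>Milwaukee Brewers</td>
  <td class='spread'>
    +1.5
  </td>
  <td>+187</td>
</tr>
HTML

print "$_\n" for extract_cells($html);
```

This still assumes cells are not nested inside other cells; with nested tables the non-greedy match would stop at the first closing tag it finds.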
Originally Posted by Ganchrow
When I want to match a td tag with a regular expression I use:
Code:
m/<td[^>]*>/si
This tells Perl to match a (potentially multiline) string beginning with the substring '<td', and then ending with the closing '>'.
Man I don't think anyone understands the power of regular expressions. I get so frustrated when I'm unable to use them. I <3 Perl.
I just started with Perl only ever having used Pascal before and I am loving it.
I know perl and I am a software engineer.
For pinny's lines you might be better off in the long run using an actual XML library, not simply matching regular expressions... just my 2 cents though
FWIW, using regexps to parse HTML is probably a bad idea. The module HTML::Parse would help a lot.
Originally Posted by rockchalk24
FWIW, using regexps to parse HTML is probably a bad idea. The module HTML::Parse would help a lot.
And what, pray tell, do you think that the now deprecated HTML::Parse module and its more robust offspring use internally?
- magic
- regular expressions
ganchrow is back???
we missed you man!! hope all is well
here have some SBR points
The power of perl comes with regex...USE IT.
Originally Posted by Ganchrow
And what, pray tell, do you think that the now deprecated HTML::Parse module and its more robust offspring use internally?
- magic
- regular expressions
Yeah, I know it uses regular expressions. That doesn't mean you should use them manually. Your work is a lot easier when most of it is done by a module.
Originally Posted by rockchalk24
Yeah, I know it uses regular expressions. That doesn't mean you should use them manually. Your work is a lot easier when most of it is done by a module.
I'm sorry, my earlier response was overly sarcastic. That was rude of me.
Personally, I'm more of a do-it-yourself type guy and find that packaging my own regex parsing routines works infinitely better than calling prebuilt one-size-fits-all methods. But you're right ... a neophyte Perl programmer (referring to the OP) would likely have an easier go starting off with preexisting modules.
That said, regular expressions are certainly one of Perl's most powerful features and I think it behooves anyone seeking to learn the language to spend time honing his or her regex skillz.
I mean how else could one endeavor to create gems like this:
Code:
perl -e "print grep ord $_,map{y/)!#@'%{];/0-8/;y/,0-9/--/c;s/[^0-9-]//g;$x.=qq~+$_~;chr eval $x;}q=~m#([^\|&*\\ a-z]+)#gi#tr/-0-9//d;JenBird Long Live Tow! print RBS trixtrix Dozer Walker;"
Ganchrow, could I PM you a question?
Originally Posted by rfr3sh
Ganchrow, could I PM you a question?
You can post it here and if I have the time and find it sufficiently interesting I'll give it a shot.
Originally Posted by Ganchrow
I'm sorry, my earlier response was overly sarcastic. That was rude of me.
Personally, I'm more of a do-it-yourself type guy and find that packaging my own regex parsing routines works infinitely better than calling prebuilt one-size-fits-all methods. But you're right ... a neophyte Perl programmer (referring to the OP) would likely have an easier go starting off with preexisting modules.
That said, regular expressions are certainly one of Perl's most powerful features and I think it behooves anyone seeking to learn the language to spend time honing his or her regex skillz.
I mean how else could one endeavor to create gems like this:
Code:
perl -e "print grep ord $_,map{y/)!#@'%{];/0-8/;y/,0-9/--/c;s/[^0-9-]//g;$x.=qq~+$_~;chr eval $x;}q=~m#([^\|&*\\ a-z]+)#gi#tr/-0-9//d;JenBird Long Live Tow! print RBS trixtrix Dozer Walker;"
That kind of complexity reminds me of my favorite use of regular expressions:
Code:
sub is_prime {
    my ($number) = @_;
    return (1 x $number) !~ m/\A (?: 1? | (11+?) (?> \1+ ) ) \Z/xms;
}
(taken from Perl Best Practices)
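For anyone wondering how that works, here is my reading of it along with a small usage sketch. The number is written in unary as a string of that many 1s. The alternative 1? covers 0 and 1, which are not prime. Otherwise (11+?) picks a candidate divisor of two or more 1s and \1+ repeats it one or more times; if that consumes the whole string, the number is a product of two factors that are both at least 2, so the match succeeds and the sub returns false. The limit of 30 below is just an arbitrary choice for the demo:

```perl
use strict;
use warnings;

# Same test as above: write $number in unary (a string of 1s) and see
# whether that string splits into two or more equal groups of length >= 2.
sub is_prime {
    my ($number) = @_;
    return (1 x $number) !~ m/\A (?: 1? | (11+?) (?> \1+ ) ) \Z/xms;
}

# Collect the primes from 0 up to an arbitrary limit of 30.
my @primes = grep { is_prime($_) } 0 .. 30;
print "@primes\n";
```

This prints 2 3 5 7 11 13 17 19 23 29. It is a fun trick rather than a practical test, since the unary string and the backtracking both grow with the size of the number.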