Feb
13
Removing HTML (formatting) tags in Perl (strip HTML tags)
I searched on Google and found roughly the following methods:
http://www.programmingtalk.com/archive/index.php/t-6109.html
This will strip all HTML tags:
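##############################
use strict;
use warnings;
use HTML::TreeBuilder;
use HTML::FormatText;
my $s = '<p>Some <b>sample</b> HTML</p>'; # sample input; replace with the text out of which you wish to strip HTML
my $formatter = HTML::FormatText->new;
my $tree = HTML::TreeBuilder->new;
$tree->parse($s);
$tree->eof; # signal end of input so the parser finishes building the tree
if ($tree) {
    print $formatter->format($tree); # format() returns the plain text; print it
    $tree->delete; # DO NOT OMIT THIS STEP!
}
##############################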
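Or:
$code = "Thisshouldbe"; # sample string; its original HTML markup and ending are missing from this copy
$code =~ s/<.+?>//g; # non-greedy match: remove each <...> on a single line
print "Content-type: text/plain\n\n";
print $code;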
I found these by searching Google for "perl striphtml". Searching Baidu turned up hardly any results.
I'll try them out when I have time.
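As a rough sketch of the regex approach above: the pattern `s/<.+?>//g` misses tags that contain a line break, because `.` does not match a newline by default. The variant below uses `[^>]` instead, which does match newlines, and also removes HTML comments. The helper name `strip_tags` and the sample string are made up for illustration and are not from the quoted forum post. This is still only a regex, so it can break on input such as a `>` inside an attribute value; a real parser like HTML::TreeBuilder is the safer choice for untrusted HTML.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): regex-only tag stripping.
sub strip_tags {
    my ($html) = @_;
    $html =~ s/<!--.*?-->//gs;   # drop comments; /s lets . match newlines
    $html =~ s/<[^>]*>//g;       # drop tags; [^>] also matches newlines
    return $html;
}

# Sample input (an assumption): one tag is split across two lines.
my $code = "<p>This <b>should</b>\nbe <a\nhref=\"x.html\">plain</a> text<!-- note --></p>";
print strip_tags($code), "\n";
```

With the original `s/<.+?>//g`, the `<a ...>` tag in this sample would be left in place because it spans two lines.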



