Friday, September 18, 2009

HTML::Restrict - Easily Strip HTML From Your Documents

I've just released HTML::Restrict to the world. It's a Perl module which allows you to strip HTML from text very easily. Here's an example:



#!/usr/bin/perl

use strict;
use warnings;

use HTML::Restrict;

my $hr = HTML::Restrict->new();
# use default rules to start with (strip away all HTML)
my $processed = $hr->process('<b>i am bold</b>');

# $processed now equals: i am bold

If you want to allow some HTML but not all, you can add a set of rules to allow arbitrary elements and attributes:



#!/usr/bin/perl

use strict;
use warnings;

use HTML::Restrict;

my $hr = HTML::Restrict->new();
$hr->set_rules({
    b   => [],
    img => [qw( src alt )],
});

my $html = q[<b>hello</b> <img src="pic.jpg" alt="me" id="test">];
my $processed = $hr->process( $html );

# $processed now equals: <b>hello</b> <img src="pic.jpg" alt="me">
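
Since the rules live on the object, the same HTML::Restrict instance can presumably be reused for as many strings as you like. Here is a minimal sketch of that idea, using only the new(), set_rules() and process() calls shown above. The @comments data and the choice of allowed tags are hypothetical, made up purely for illustration:

```perl
#!/usr/bin/perl

use strict;
use warnings;

use HTML::Restrict;

# Hypothetical input: comment bodies submitted by users.
my @comments = (
    q[<b>Nice</b> post, <i>thanks</i>!],
    q[<a href="http://example.com/" onclick="steal()">a link</a>],
);

# Allow only bold text and links with an href (an assumed policy);
# any other element or attribute should be stripped.
my $hr = HTML::Restrict->new();
$hr->set_rules({
    b => [],
    a => [qw( href )],
});

# Run every comment through the same object.
my @clean = map { $hr->process($_) } @comments;

print "$_\n" for @clean;
```

With these rules the italic tags and the onclick attribute should not survive, while the bold tags and the href should.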


This has now been released as Open Source software and is available on CPAN.

1 comment:

Anonymous said...

Hi,
just wanted to say that it's awesome.