Regex for stripping all non-ASCII characters

NOTE: If you want to filter telephone number data, take a look at this newer post.

This is important for me because I see a lot of creative formatting in directory entries, especially in telephone number fields, which comes back to haunt me when I export data for reports.

Of course it would be a lot better if the Netscape family of directory servers strictly enforced the schema when it came to phone number values, but unfortunately that train has left the station.

So here it is, a Perl regex to strip any non-ASCII characters (and any ASCII control characters other than line feed and carriage return) from an attribute value. I’ll use a cell phone number as an example.

my $mobile = $entry->get_value('mobile');
for ($mobile) {
    # Keep printable ASCII (0x20-0x7E), LF, and CR; replace everything else with a space.
    s/[^\x20-\x7E\x0A\x0D]/ /g;
}

Note that in this example I replace each offending character with a single space rather than deleting it.
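If you want to try the substitution without a directory connection, here is a minimal standalone sketch. The helper name strip_non_ascii and the sample value are my own assumptions for illustration, not part of the original snippet.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the original post):
# replace every character outside printable ASCII, LF, and CR with a space.
sub strip_non_ascii {
    my ($value) = @_;
    return unless defined $value;          # pass undef through untouched
    $value =~ s/[^\x20-\x7E\x0A\x0D]/ /g;  # same character class as above
    return $value;
}

# Made-up sample value containing a no-break space (0xA0) and a tab (0x09).
my $mobile = "+1\x{A0}555\t0100";
print strip_non_ascii($mobile), "\n";      # prints "+1 555 0100"
```

One thing to keep in mind: if the value arrives as UTF-8 encoded bytes rather than decoded characters, each multi-byte character is replaced by several spaces, one per byte.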

Thanks to the indomitable Mark Smith for this one.