Remove all non-ASCII characters with Perl

When you build one-liners on a source that is not "clean", you may end up with non-ASCII characters that make parsing harder.

To get rid of all of them, you can use Perl.

 
cat dirty-source.txt | perl -pe 's/[^[:ascii:]]//g' > clean-output.txt
 

The same command can also read the file directly; the cleaned text is then printed to standard output.

 
perl -pe 's/[^[:ascii:]]//g' dirty-source.txt
 

To edit the file in place, add the "-i" flag. Giving it a suffix, as in "-i.bak", keeps a backup of the original (e.g. dirty-source.txt.bak).

Hope it helps!
Andrea
