unicode - (grep) Regex to match non-ASCII characters?
On Linux, I have a lot of files. Some of them have non-ASCII characters in their names, but they are all valid. There is a bug in a program that prevents it from working with non-ASCII filenames, and I have to find out how many files are affected. I was going to search for the files, use grep to print the names containing non-ASCII characters, and then do a wc -l to find the number. It does not have to be grep; I can use any standard UNIX tool.
However, is there a regular expression for 'any character that is not an ASCII character'?
This will match a non-ASCII character:
[^\x00-\x7F]
This is a valid Perl-compatible regular expression (PCRE).
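As a minimal sketch of the counting task from the question, the following Python applies the same character class to a list of file names. The function name count_non_ascii_names and the sample names are hypothetical, chosen only for illustration; Python's re module accepts this bracket expression unchanged.

```python
import re

# Same class as above: any character outside the range U+0000..U+007F.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def count_non_ascii_names(names):
    """Return how many names contain at least one non-ASCII character."""
    return sum(1 for name in names if NON_ASCII.search(name))


# Hypothetical sample names, for illustration only.
sample = ["report.txt", "r\u00e9sum\u00e9.pdf", "notes.md", "\u65e5\u672c\u8a9e.txt"]
print(count_non_ascii_names(sample))
```

For a real directory, the names could come from os.listdir(path), or from os.walk(path) if subdirectories should be included.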
You can also use the shorthands:
- [[:ascii:]] - matches a single ASCII character
- [^[:ascii:]] - matches a single non-ASCII character
[^[:print:]] may be enough for you.
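Note that "non-printable" and "non-ASCII" are not the same set, so whether [^[:print:]] catches a given non-ASCII character depends on the regex engine and locale. Python's re module does not support the [:ascii:] and [:print:] bracket classes, so the sketch below uses str.isascii() and str.isprintable() as rough stand-ins. That substitution is an assumption: Python's notion of "printable" is Unicode-based and may not agree with a particular grep or PCRE setup. The helper names and sample names are hypothetical.

```python
def has_non_ascii(name):
    """True if the name contains any character outside the ASCII range."""
    return not name.isascii()


def has_non_printable(name):
    """True if the name contains any character Python considers non-printable."""
    return not name.isprintable()


# Hypothetical names: plain ASCII, an accented letter, and an ASCII control character.
for name in ["plain.txt", "caf\u00e9.txt", "bell\x07.txt"]:
    print(repr(name), has_non_ascii(name), has_non_printable(name))
```

Here the accented name is non-ASCII but printable, while the name with the control character is ASCII but not printable, so the two checks flag different files.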