How to get Vim to highlight non-ascii characters?
Problem Overview
I'm trying to get Vim to highlight non-ASCII characters. Is there an available setting, regex search pattern, or plugin to do so?
Regex Solutions
Solution 1 - Regex
Using a range in a [] character class in your search, you should be able to exclude the ASCII hexadecimal character range, thereby highlighting (assuming you have hlsearch enabled) all characters outside the ASCII range:
/[^\x00-\x7F]
This does a negative match (via [^]) for characters between ASCII 0x00 and ASCII 0x7F (0-127), and appears to work in my simple test. For extended ASCII, of course, extend the range up to \xFF instead of \x7F, using /[^\x00-\xFF].
You may also express it in decimal via \d:
/[^\d0-\d127]
If you need something more specific, such as excluding non-printable characters, you will need to add those ranges to the [] character class.
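As a rough sketch of that more specific case, assume "printable" means the ASCII range from space (0x20) through tilde (0x7E), plus the tab character. That choice of ranges is an assumption made for illustration, not part of the original answer. The commands are meant to be typed in normal mode:

```vim
" Highlight every match of the last search pattern.
:set hlsearch
" Match anything that is neither a tab (0x09) nor printable ASCII (0x20-0x7E).
/[^\x09\x20-\x7E]
```

This flags control characters as well as everything above 0x7F. Widen or narrow the ranges inside the brackets to suit the file.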
Solution 2 - Regex
Yes, there is a native feature for highlighting any matched string. Inside Vim, do:
:help highlight
:help syn-match
syn-match defines a pattern; text that matches it falls into a group.
highlight defines the colors used by that group.
Think of the syntax highlighting you already get for your vimrc files.
So you can use the commands below in your .vimrc file:
syntax match nonascii "[^\x00-\x7F]"
highlight nonascii guibg=Red ctermbg=2
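To check whether the group defined by those two lines is actually in effect in the current buffer, the following built-in commands can be run. This is only a quick sketch; what they print depends on your setup and colorscheme:

```vim
" List the syntax items currently defined for the group.
:syntax list nonascii
" Show the colors attached to the group.
:highlight nonascii
```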
Solution 3 - Regex
For other (from now on less unlucky) folks who end up here via a search engine and can't get non-ASCII highlighting to work, try this (put it in your .vimrc):
highlight nonascii guibg=Red ctermbg=1 term=standout
au BufReadPost * syntax match nonascii "[^\u0000-\u007F]"
This has the added benefit of not colliding with regular (filetype [file extension] based) syntax definitions.
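If the .vimrc is sourced more than once, the autocommand above is registered again each time. A minimal sketch of one common way to avoid that is to wrap it in an augroup; the group name NonAsciiHighlight is made up for this example:

```vim
" Assumption: the augroup name NonAsciiHighlight is hypothetical.
highlight nonascii guibg=Red ctermbg=1 term=standout
augroup NonAsciiHighlight
  " Remove autocommands previously defined in this group.
  autocmd!
  autocmd BufReadPost * syntax match nonascii "[^\u0000-\u007F]"
augroup END
```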
Solution 4 - Regex
This regex works for highlighting as well. It was the first Google hit for "vim remove non-ascii characters", from briceolion.com, and with :set hlsearch it will highlight:
/[^[:alnum:][:punct:][:space:]]/
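Before relying on the highlighting, it can help to know how many such characters the buffer contains. A small sketch using the same pattern with the substitute command's n flag, which only counts:

```vim
" Count matches in the whole buffer; the n flag reports the count
" and leaves the text unchanged.
:%s/[^[:alnum:][:punct:][:space:]]//gn
```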
Solution 5 - Regex
If you are also interested in the non-printable characters, use this one: /[^\x00-\xff]/
I use it in a function:
function! NonPrintable()
  setlocal enc=utf8
  " search() returns 0 when the pattern is not found.
  if search('[^\x00-\xff]') != 0
    " Highlight every match using the Error group.
    call matchadd('Error', '[^\x00-\xff]')
    echo 'Non printable characters in text'
  else
    setlocal enc=latin1
    echo 'All characters are printable'
  endif
endfunction
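A short usage sketch for the function above, assuming it has been sourced from your .vimrc. clearmatches() is a built-in that removes the matches defined for the current window, which includes the one added by matchadd():

```vim
" Run the check on the current buffer.
:call NonPrintable()
" Remove the matches previously defined for the current window.
:call clearmatches()
```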
Solution 6 - Regex
Based on the other answers on this topic and the answer I got here, I've added this to my .vimrc so that I can toggle the non-ASCII highlighting by typing <C-w>1. It also shows inside comments, although you will need to add the comment group for each file syntax you use. That is, if you edit a zsh file, you will need to add zshComment to the line
au BufReadPost * syntax match nonascii "[^\x00-\x7F]" containedin=cComment,vimLineComment,pythonComment
otherwise it won't show the non-ASCII characters (you can also set containedin=ALL if you want to be sure to show non-ASCII characters in all groups). To find out what the comment group is called for a different file type, open a file of that type and enter :sy in Vim, then search the listed syntax items for the comment group.
function! HighlightNonAsciiOff()
  echom "Setting non-ascii highlight off"
  syn clear nonascii
  let g:is_non_ascii_on=0
  augroup HighlightUnicode
    autocmd!
  augroup end
endfunction
function! HighlightNonAsciiOn()
  echom "Setting non-ascii highlight on"
  augroup HighlightUnicode
    autocmd!
    autocmd ColorScheme *
          \ syntax match nonascii "[^\x00-\x7F]" containedin=cComment,vimLineComment,pythonComment |
          \ highlight nonascii cterm=underline ctermfg=red ctermbg=none term=underline
  augroup end
  silent doautocmd HighlightUnicode ColorScheme
  let g:is_non_ascii_on=1
endfunction
function! ToggleHighlightNonascii()
  if g:is_non_ascii_on == 1
    call HighlightNonAsciiOff()
  else
    call HighlightNonAsciiOn()
  endif
endfunction
silent! call HighlightNonAsciiOn()
nnoremap <C-w>1 :call ToggleHighlightNonascii()<CR>
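As an alternative to scanning the :sy output for the name of the comment group, the built-in synID() and synIDattr() functions can report the group under the cursor. A minimal sketch: place the cursor on a comment and run:

```vim
" Print the name of the syntax group under the cursor.
:echo synIDattr(synID(line('.'), col('.'), 1), 'name')
```

The name it prints (zshComment in the zsh case mentioned above) is what goes into the containedin list.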
Solution 7 - Regex
Somehow none of the above answers worked for me.
So I used :1,$ s/[^0-9a-zA-Z,-_\.]//g
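It deletes everything else and keeps most of the characters I am interested in. Note that ,-_ inside the brackets is read as a range (comma through underscore), not as three literal characters.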
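A hedged sketch of a variant where the hyphen is a literal character rather than a range operator, with a preview step before anything is deleted. The exact set of characters to keep is an assumption here. Spaces are not in the set, so they would be removed too; add a space inside the brackets if that is not wanted:

```vim
" Preview: search for everything outside the allowed set. The hyphen is
" placed last so it is a literal character, not a range operator.
/[^0-9a-zA-Z,_.-]
" Delete the matches; an empty pattern reuses the last search pattern.
:%s///g
```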
Solution 8 - Regex
Someone has already answered the question. However, for others who are still having problems, here is another solution to highlight non-ASCII characters in comments (or in any syntax group, for that matter). It's not the best, but it's a temporary fix.
One may try:
:syntax match nonascii "[^\u0000-\u007F]" containedin=ALL contained |
\ highlight nonascii ctermfg=yellow guifg=yellow
This mixes parts of the other solutions. You may remove contained, but, according to the documentation, there may be a potential problem with the item recursing into itself (as I understand it). The related arguments are described in the syn-containedin and syn-contains sections of the help:
:help syn-containedin
:help syn-contains
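For comparison, here is a sketch of the same rule without the contained argument, written as two separate lines for a .vimrc or a sourced script. As documented under :help syn-contained, an item marked contained is not recognized at the top level, so dropping the argument lets the rule also match text that sits outside every other syntax group. Whether that is desirable depends on the file type, so treat this as an assumption to test:

```vim
" Variant without 'contained': the item can also match at the top level,
" outside any other syntax group.
syntax match nonascii "[^\u0000-\u007F]" containedin=ALL
highlight nonascii ctermfg=yellow guifg=yellow
```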
Replicated issue from: Set item to higher highlight priority on vim