How to get Vim to highlight non-ascii characters?

Regex Problem Overview


I'm trying to get Vim to highlight non-ASCII characters. Is there an available setting, regex search pattern, or plugin to do so?

Regex Solutions


Solution 1 - Regex

Using a range in a [] character class in your search, you can exclude the ASCII range and thereby highlight (assuming you have hlsearch enabled) every character that lies outside it:

/[^\x00-\x7F]

This will do a negative match (via [^]) for characters between ASCII 0x00 and ASCII 0x7F (0-127), and appears to work in my simple test. For extended ASCII, of course, extend the range up to \xFF instead of \x7F using /[^\x00-\xFF].

You may also express it in decimal via \d:

/[^\d0-\d127]

If you need something more specific, like exclusion of non-printable characters, you will need to add those ranges into the character class [].
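
As an illustration of adding ranges, here is a minimal sketch that accepts only tab and printable ASCII (0x20 to 0x7E), so that control characters are highlighted along with non-ASCII ones. The choice of which characters count as acceptable is my assumption; adjust the class to taste. Typed interactively:

```vim
" Make sure matches are highlighted
:set hlsearch
" Highlight anything that is not a tab or printable ASCII (0x20-0x7E)
/[^\t\x20-\x7E]
```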

Solution 2 - Regex

Yes, Vim has a native feature for highlighting any matched string. Inside Vim, do:

:help highlight
:help syn-match

syn-match defines a pattern whose matches are assigned to a syntax group. highlight defines the colors used for that group. Think of the syntax highlighting you see in your vimrc files.

So you can put the commands below in your .vimrc file:

syntax match nonascii "[^\x00-\x7F]"
highlight nonascii guibg=Red ctermbg=2

Solution 3 - Regex

For others who end up here via a search engine and still can't get non-ASCII characters highlighted, try this (put it into your .vimrc):

highlight nonascii guibg=Red ctermbg=1 term=standout
au BufReadPost * syntax match nonascii "[^\u0000-\u007F]"

This has the added benefit of not colliding with regular (filetype [file extension] based) syntax definitions.
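
If your .vimrc gets sourced more than once, the autocommand above is registered again each time. A common idiom is to wrap it in an augroup that is cleared first. A minimal sketch, where the group name NonAsciiHighlight is only a placeholder I made up:

```vim
" NonAsciiHighlight is a hypothetical group name; any unused name works
augroup NonAsciiHighlight
  " Remove autocommands previously defined in this group
  autocmd!
  autocmd BufReadPost * syntax match nonascii "[^\u0000-\u007F]"
augroup END
highlight nonascii guibg=Red ctermbg=1 term=standout
```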

Solution 4 - Regex

This regex also works for highlighting. It comes from briceolion.com, the first Google hit for "vim remove non-ascii characters", and with :set hlsearch it highlights the matches:

/[^[:alnum:][:punct:][:space:]]/

Solution 5 - Regex

If you are also interested in characters that cannot be represented in Latin-1 (code points above 0xFF, called "non printable" below), use this one: /[^\x00-\xff]/

I use it in a function:

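function! NonPrintable()
  " 'encoding' is a global option, so :set is used instead of :setlocal
  set encoding=utf-8
  " 'n' leaves the cursor where it is, 'w' wraps around the end of the buffer
  if search('[^\x00-\xff]', 'nw') != 0
    call matchadd('Error', '[^\x00-\xff]')
    echo 'Non printable characters in text'
  else
    set encoding=latin1
    echo 'All characters are printable'
  endif
endfunction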
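
For completeness, a short usage sketch, assuming the function above has been sourced. clearmatches() is Vim's built-in for dropping matches added with matchadd() in the current window:

```vim
" Run the check on the current buffer
:call NonPrintable()
" Later, remove the highlighting that matchadd() put on the current window
:call clearmatches()
```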

Solution 6 - Regex

Based on the other answers on this topic and the answer I got here, I've added the code below to my .vimrc so that I can toggle the non-ASCII highlighting by typing <C-w>1. It also shows inside comments, although you need to add the comment group for each file syntax you use. For example, to edit zsh files, you need to add zshComment to the line

au BufReadPost * syntax match nonascii "[^\x00-\x7F]" containedin=cComment,vimLineComment,pythonComment

otherwise it won't show the non-ASCII characters (you can also set containedin=ALL if you want to be sure to show non-ASCII characters in all groups). To find out what the comment group is called for a different file type, open a file of that type, enter :sy in Vim, and search the listed syntax items for the comment.

" function! lets the file be sourced again without a redefinition error
function! HighlightNonAsciiOff()
  echom "Setting non-ascii highlight off"
  syn clear nonascii
  let g:is_non_ascii_on=0
  " Empty the group so the match is not re-added on the next ColorScheme event
  augroup HighlightUnicode
    autocmd!
  augroup END
endfunction

function! HighlightNonAsciiOn()
  echom "Setting non-ascii highlight on"
  augroup HighlightUnicode
    autocmd!
    " Re-apply the match and its colors whenever the color scheme changes
    autocmd ColorScheme *
          \ syntax match nonascii "[^\x00-\x7F]" containedin=cComment,vimLineComment,pythonComment |
          \ highlight nonascii cterm=underline ctermfg=red ctermbg=none term=underline
  augroup END
  " Fire the autocommand once so the highlight applies immediately
  silent doautocmd HighlightUnicode ColorScheme
  let g:is_non_ascii_on=1
endfunction

function! ToggleHighlightNonascii()
  if g:is_non_ascii_on == 1
    call HighlightNonAsciiOff()
  else
    call HighlightNonAsciiOn()
  endif
endfunction

silent! call HighlightNonAsciiOn()
nnoremap <C-w>1 :call ToggleHighlightNonascii()<CR>
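
As a possible shortcut for finding the group name to put in containedin=, you can ask Vim for the syntax group under the cursor instead of scanning the :sy output. A sketch using the built-in synID() and synIDattr() functions; place the cursor on a comment first:

```vim
" Print the name of the syntax group at the cursor position
:echo synIDattr(synID(line('.'), col('.'), 1), 'name')
```

Depending on the syntax file, this may report a nested group (for example a TODO item inside a comment) rather than the comment group itself, so try a plain part of the comment.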

Solution 7 - Regex

Somehow none of the above answers worked for me.

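So I used the following substitution, which deletes every character outside the listed set rather than highlighting it:

:1,$ s/[^0-9a-zA-Z,-_\.]//g

It keeps most of the characters I am interested in. Note that ,-_ inside the brackets is a range (from , to _ in ASCII order), so punctuation such as : ; < = > ? @ [ ] and ^ is kept as well.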
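
Since a substitution like this changes the buffer, it may help to count the matches first or to confirm each deletion. A small sketch using the strict non-ASCII class from Solution 1 together with the standard n and c flags of :s:

```vim
" Report how many non-ASCII characters there are, without changing anything
:%s/[^\x00-\x7F]//gn
" Delete them one by one, asking for confirmation at each match
:%s/[^\x00-\x7F]//gc
```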

Solution 8 - Regex

Someone has already answered the question. However, for others who are still having problems, here is another way to highlight non-ASCII characters in comments (or in any syntax group, for that matter). It's not the best, but it works as a temporary fix.

One may try:

:syntax match nonascii "[^\u0000-\u007F]" containedin=ALL contained |
			\ highlight nonascii ctermfg=yellow guifg=yellow

This mixes parts of the other solutions. You may remove contained, but according to the documentation that can lead to the item matching recursively inside itself (as I understand it). The other group patterns that can be used in place of ALL are described in the syn-contains section:

:help syn-containedin
:help syn-contains 

Replicated issue from: Set item to higher highlight priority on vim

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | chutsu | View Question on Stackoverflow
Solution 1 - Regex | Michael Berkowski | View Answer on Stackoverflow
Solution 2 - Regex | Steven Ding | View Answer on Stackoverflow
Solution 3 - Regex | PAStheLoD | View Answer on Stackoverflow
Solution 4 - Regex | Grant Bowman | View Answer on Stackoverflow
Solution 5 - Regex | Reman | View Answer on Stackoverflow
Solution 6 - Regex | Werner | View Answer on Stackoverflow
Solution 7 - Regex | user2250246 | View Answer on Stackoverflow
Solution 8 - Regex | nate | View Answer on Stackoverflow