What is the difference between Character.isAlphabetic and Character.isLetter in Java?

Java, Unicode

Java Problem Overview


What is the difference between Character.isAlphabetic() and Character.isLetter() in Java? When should one use one and when should one use the other?

Java Solutions


Solution 1 - Java

According to the API docs, isLetter() returns true if the character has any of the following general category types: UPPERCASE_LETTER (Lu), LOWERCASE_LETTER (Ll), TITLECASE_LETTER (Lt), MODIFIER_LETTER (Lm), OTHER_LETTER (Lo). isAlphabetic() accepts all of those categories, plus LETTER_NUMBER (Nl) and any character that has the Unicode Other_Alphabetic property.
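The category rules above can be checked directly. The following is a minimal sketch, not part of the original answer: the class name LetterVsAlphabetic and the helper classify are made up for illustration, and the sample code points are simply one representative per case (a plain letter, a LETTER_NUMBER, an Other_Alphabetic mark, and a digit).

```java
public class LetterVsAlphabetic {

    /** Reports how both predicates classify a single code point. */
    static String classify(int codePoint) {
        boolean letter = Character.isLetter(codePoint);
        // isAlphabetic only has an int (code point) overload, available since Java 7.
        boolean alphabetic = Character.isAlphabetic(codePoint);
        return String.format("U+%04X letter=%b alphabetic=%b", codePoint, letter, alphabetic);
    }

    public static void main(String[] args) {
        System.out.println(classify('A'));     // Lu: U+0041 letter=true alphabetic=true
        System.out.println(classify(0x2160));  // Nl (ROMAN NUMERAL ONE): letter=false alphabetic=true
        System.out.println(classify(0x0345));  // Mn with Other_Alphabetic: letter=false alphabetic=true
        System.out.println(classify('1'));     // Nd: U+0031 letter=false alphabetic=false
    }
}
```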

What does this mean in practice? Every letter is alphabetic, but not every alphabetic character is a letter. In Java 7 (which uses Unicode 6.0.0), there are 824 characters in the BMP which are alphabetic but not letters. Some examples include U+0345 (a combining mark used in polytonic Greek), Hebrew vowel points (niqqud) starting at U+05B0, Arabic honorifics such as saw ("peace be upon him") at U+0610, Arabic vowel points... the list goes on.
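A count like the one above can be reproduced with a simple scan of the BMP. This is a rough sketch under stated assumptions: the class and method names are hypothetical, and the number printed depends on the Unicode version bundled with the JDK that runs it, so it will generally not be 824 on anything newer than Java 7.

```java
public class AlphabeticNotLetterCount {

    /** True if the code point is alphabetic but not a letter. */
    static boolean isAlphabeticOnly(int codePoint) {
        return Character.isAlphabetic(codePoint) && !Character.isLetter(codePoint);
    }

    /** Counts BMP code points (U+0000..U+FFFF) that are alphabetic but not letters. */
    static int countInBmp() {
        int count = 0;
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            if (isAlphabeticOnly(cp)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The result varies with the JDK's Unicode tables.
        System.out.println("Alphabetic but not letter (BMP): " + countInBmp());
    }
}
```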

For English text, the distinction makes no difference. For some other languages it might, but it is hard to predict in advance what the difference will be in practice. If one has a choice, the best answer may be isLetter(): one can always permit additional characters in the future, whereas reducing the set of accepted characters later might be harder.
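One way to keep that choice easy to revisit is to pass the predicate in rather than hard-coding it. The sketch below is only an illustration of that idea, assuming Java 8 or later for String.codePoints(); NameCheck, STRICT, PERMISSIVE and matchesAll are hypothetical names, and the Hebrew sample string is an arbitrary word written with vowel points.

```java
import java.util.function.IntPredicate;

public class NameCheck {

    // Hypothetical policies: start with the narrower set, widen later if needed.
    static final IntPredicate STRICT = Character::isLetter;
    static final IntPredicate PERMISSIVE = Character::isAlphabetic;

    /** True if the text is non-empty and every code point satisfies the policy. */
    static boolean matchesAll(String text, IntPredicate allowed) {
        return !text.isEmpty() && text.codePoints().allMatch(allowed);
    }

    public static void main(String[] args) {
        // Hebrew letters with vowel points U+05B8 and U+05B9 (Other_Alphabetic marks).
        String pointed = "\u05E9\u05B8\u05DC\u05D5\u05B9\u05DD";

        System.out.println(matchesAll("Hello", STRICT));      // true
        System.out.println(matchesAll(pointed, STRICT));      // false: vowel points are not letters
        System.out.println(matchesAll(pointed, PERMISSIVE));  // true
    }
}
```

Switching from STRICT to PERMISSIVE only accepts more input, which matches the reasoning above that widening later is easier than narrowing.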

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Simon Kissane | View Question on Stackoverflow
Solution 1 - Java | Simon Kissane | View Answer on Stackoverflow