C++: what regex library should I use?
C++ Problem Overview
I'm working on a commercial (not open source) C++ project that runs on a Linux-based system. I need to do some regex matching within the C++ code. (I know: I now have 2 problems.)
QUESTION: What libraries do people who regularly do regex from C/C++ recommend I look into? A quick search has brought the following to my attention:
- Boost.Regex (I need to go read the Boost Software License, but this question is not about software licenses)
- C (not C++) POSIX regex (#include <regex.h>, regcomp, regexec, etc.)
- http://freshmeat.net/projects/cpp_regex/ (I know nothing about this one; seems to be GPL, therefore not usable on this project)
C++ Solutions
Solution 1 - C++
Boost.Regex is very good and is slated to become part of the C++0x standard (it's already in TR1).
Personally, I find Boost.Xpressive much nicer to work with. It is a header-only library and it has some nice features such as static regexes (regexes compiled at compile time).
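Update: If you're using a C++11-compliant compiler and standard library, use std::regex unless you have a good reason to use something else. (GCC 4.8 is NOT compliant here: its std::regex is not functional, and a working implementation only arrived in GCC 4.9.)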
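As a rough illustration of the std::regex route, here is a minimal sketch. The helper name parse_version and the pattern are made up for this example; it assumes a standard library with a working <regex> (for example libstdc++ from GCC 4.9 or later, or libc++).

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from any library): finds the first "major.minor"
// pair in `text`, e.g. "4.9" in "gcc 4.9". Returns true and fills the
// out-parameters on a match, false otherwise.
bool parse_version(const std::string& text, int& major, int& minor) {
    // Function-local static: the pattern is compiled once, on first call.
    // The default grammar is ECMAScript, so \d means "a decimal digit".
    static const std::regex pattern(R"((\d+)\.(\d+))");

    std::smatch m;
    if (!std::regex_search(text, m, pattern)) {
        return false;
    }
    // m[0] is the whole match; m[1] and m[2] are the capture groups.
    major = std::stoi(m[1].str());
    minor = std::stoi(m[2].str());
    return true;
}
```

Building the std::regex object is the expensive step, so the sketch keeps it in a static rather than rebuilding it on every call.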
Solution 2 - C++
Thanks for all the suggestions.
I tried out a few things today, and with the stuff we're trying to do, I opted for the simplest solution where I don't have to download any other 3rd-party library. In the end, I #include <regex.h> and used the standard C POSIX calls regcomp() and regexec(). Not C++, but in a pinch this proved to be the easiest.
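For anyone going the same way, a minimal sketch of those POSIX calls wrapped for use from C++ might look like the following. The wrapper name posix_match is hypothetical, and treating a pattern that fails to compile as "no match" is a simplification made for brevity; real code would probably report the error via regerror().

```cpp
#include <regex.h>   // POSIX: regcomp, regexec, regfree
#include <string>

// Hypothetical wrapper: returns true if `text` contains a match for the
// POSIX extended regular expression `pattern`.
bool posix_match(const std::string& pattern, const std::string& text) {
    regex_t re;
    // REG_EXTENDED selects ERE syntax; REG_NOSUB says we only want a
    // yes/no answer, not the positions of sub-matches.
    if (regcomp(&re, pattern.c_str(), REG_EXTENDED | REG_NOSUB) != 0) {
        return false;  // simplification: a bad pattern is treated as no match
    }
    const int rc = regexec(&re, text.c_str(), 0, nullptr, 0);
    regfree(&re);      // release the memory held by the compiled pattern
    return rc == 0;    // regexec returns 0 on a match, REG_NOMATCH otherwise
}
```

Note that POSIX ERE has no \d shorthand, so character classes such as [0-9] or [[:digit:]] are used instead.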
Solution 3 - C++
In C++ projects past, I have used PCRE with good success. It's very complete and well-tested since it's used in many high profile projects. And I see that Google has contributed a set of C++ wrappers for PCRE recently, too.
Solution 4 - C++
C++ has had a built-in regex library since TR1. AFAIK Boost's regex library is very compatible with it and can be used as a replacement if your standard library doesn't provide TR1.
Solution 5 - C++
Boost has regex in it: http://www.boost.org/doc/libs/1_36_0/libs/regex/doc/html/index.html
That should fill the bill.
Solution 6 - C++
Two more options:
If you can write it in C++11, do the tutorial: http://www.codeguru.com/cpp/cpp/cpp_mfc/stl/article.php/c15339
Note: At the time of writing, the only C++11 regex library that I know works is the clang/LLVM one, and it only works on Mac. The GNU standard library doesn't implement regex yet. I don't know about Visual Studio. Most people still use the Boost regex implementation.
Or you can use ragel to generate a finite state machine to do the parsing for you, and generate the C/C++ code implementation: http://www.complang.org/ragel/
I used it a little to generate code to parse JSON. This ragel file: https://github.com/matiu2/yajp/blob/master/parser/number.rl is used to generate this code: https://github.com/matiu2/yajp/blob/master/parser/json.hpp#L254 as well as a finite state machine diagram (not reproduced here).
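To give a feel for what matching with a finite state machine means, here is a small hand-written sketch. It is not Ragel output and not taken from the linked repository; the function name is_simple_number and the simplified grammar -?[0-9]+(\.[0-9]+)? are assumptions chosen for illustration, and real generated code is typically table- or goto-driven and covers the full JSON number syntax.

```cpp
#include <string>

// Hand-written state machine for the pattern  -?[0-9]+(\.[0-9]+)?
// (a simplified number grammar for illustration, not the full JSON one).
bool is_simple_number(const std::string& s) {
    enum State { Start, Sign, Int, Dot, Frac };
    State state = Start;

    for (char c : s) {
        const bool digit = (c >= '0' && c <= '9');
        switch (state) {
            case Start:                 // nothing consumed yet
                if (c == '-') state = Sign;
                else if (digit) state = Int;
                else return false;
                break;
            case Sign:                  // saw '-', need at least one digit
                if (digit) state = Int;
                else return false;
                break;
            case Int:                   // inside the integer part
                if (digit) state = Int;
                else if (c == '.') state = Dot;
                else return false;
                break;
            case Dot:                   // saw '.', need at least one digit
                if (digit) state = Frac;
                else return false;
                break;
            case Frac:                  // inside the fractional part
                if (!digit) return false;
                break;
        }
    }
    // Only Int and Frac are accepting states.
    return state == Int || state == Frac;
}
```

Each input character causes exactly one transition, so the match runs in a single pass with no backtracking, which is the main attraction of generating such code instead of interpreting a regex at run time.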
Update 1:
LLVM's libc++ regex works on Ubuntu 14.04: libc++-dev - LLVM C++ Standard library (development files). When compiling: clang++ -std=c++11 -lc++ -I/usr/include/c++/v1 ...
Update 2:
I'm currently enjoying Boost Spirit 3. I like it more than regex because it has BNF-style rules and is well thought out. (The older, better-documented Spirit Qi libraries are also available.)
Solution 7 - C++
You can also look at the fast regex library that was developed at the Yandex search engine for matching thousands of patterns against huge amounts of data.
Solution 8 - C++
I've personally always used Boost.Regex (although I don't have much need for regex in C++). Microsoft Research has a regex library too, called GRETA: http://research.microsoft.com/projects/greta/. Apparently it's very fast and supports the full Perl 5 syntax. I haven't used it, but you may want to test it out.
Solution 9 - C++
I faced a similar situation and ended up using Henry Spencer's regexp engine: http://www.codeproject.com/KB/string/spencerregexp.aspx
Solution 10 - C++
No one here has said anything about the one that comes with C++0x. If you are using a compiler and standard library that support C++0x, you could just use that instead of adding another library to your project.