Using named matches from Go regex
Regex Problem Overview
I'm coming from Python, so I'm probably just not looking at this the right way. I'd like to create a fairly complicated regex and access the matched fields by name. I can't seem to find a good example. The closest I've managed to get is this:
package main

import (
	"fmt"
	"regexp"
)

var myExp = regexp.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

func main() {
	fmt.Printf("%+v", myExp.FindStringSubmatch("1234.5678.9"))
	match := myExp.FindStringSubmatch("1234.5678.9")
	for i, name := range myExp.SubexpNames() {
		fmt.Printf("'%s'\t %d -> %s\n", name, i, match[i])
	}
	//fmt.Printf("by name: %s %s\n", match["first"], match["second"])
}
The commented-out line is how I would expect to access the named fields in Python. What's the equivalent way to do this in Go?
Or, if I need to convert the match to a map, what's the most idiomatic way in Go to build and then access the map?
Regex Solutions
Solution 1 - Regex
You can reference your named capture groups by building a map from SubexpNames, as follows:
package main

import (
	"fmt"
	"regexp"
)

var myExp = regexp.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

func main() {
	match := myExp.FindStringSubmatch("1234.5678.9")
	result := make(map[string]string)
	for i, name := range myExp.SubexpNames() {
		// Index 0 is the whole match and unnamed groups have an empty name; skip both.
		if i != 0 && name != "" {
			result[name] = match[i]
		}
	}
	fmt.Printf("by name: %s %s\n", result["first"], result["second"])
}
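If you only need one or two groups, building a map may be unnecessary. Here is a minimal sketch, assuming Go 1.15 or later, where regexp.Regexp has a SubexpIndex method. The helper name namedGroup is hypothetical and not part of the standard library:

```go
package main

import (
	"fmt"
	"regexp"
)

var myExp = regexp.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

// namedGroup (hypothetical helper) returns the text captured by the group
// called name, and false if s does not match or no such group exists.
func namedGroup(re *regexp.Regexp, s, name string) (string, bool) {
	match := re.FindStringSubmatch(s)
	if match == nil {
		return "", false // the pattern did not match at all
	}
	idx := re.SubexpIndex(name) // -1 when no group has this name
	if idx < 0 {
		return "", false
	}
	return match[idx], true
}

func main() {
	first, _ := namedGroup(myExp, "1234.5678.9", "first")
	second, _ := namedGroup(myExp, "1234.5678.9", "second")
	fmt.Printf("by name: %s %s\n", first, second)
}
```

The boolean result lets the caller tell "no match" or "unknown group" apart from a group that matched an empty string.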
Solution 2 - Regex
I don't have the reputation to comment, so forgive me if this shouldn't be an 'answer', but I found the above answer helpful, so I wrapped it into a function:
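func reSubMatchMap(r *regexp.Regexp, str string) map[string]string {
	// match is nil when str does not match, so match[i] below panics in that case.
	match := r.FindStringSubmatch(str)
	subMatchMap := make(map[string]string)
	for i, name := range r.SubexpNames() {
		// Index 0 is the whole match; unnamed groups are stored under the key "".
		if i != 0 {
			subMatchMap[name] = match[i]
		}
	}
	return subMatchMap
}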
Example usage on Playground: https://play.golang.org/p/LPLND6FnTXO
Hope this is helpful to someone else. Love the ease of named capture groups in Go.
Solution 3 - Regex
The other approaches panic with an index-out-of-range error when the regex does not match at all, because FindStringSubmatch returns nil in that case.
The following, however, ranges over the match itself, so it creates a map with whatever named groups were actually found:
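func findNamedMatches(regex *regexp.Regexp, str string) map[string]string {
	match := regex.FindStringSubmatch(str) // nil when str does not match
	names := regex.SubexpNames()
	results := map[string]string{}
	// Ranging over match (not the names) makes the loop a no-op when match is nil.
	for i, value := range match {
		if names[i] != "" { // skip the whole match and unnamed groups
			results[names[i]] = value
		}
	}
	return results
}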
This approach returns a map containing only the named group matches. If the regex does not match, it returns an empty map. I've found that much easier to deal with than a panic when a match isn't found.
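Note that FindStringSubmatch reports an optional group that did not take part in the match as an empty string, so such a group still ends up in the map. If that distinction matters, one possible variation is to use FindStringSubmatchIndex, which reports -1 offsets for those groups. The following is only a sketch; the helper name participatingGroups and the versionExp pattern are made up for illustration:

```go
package main

import (
	"fmt"
	"regexp"
)

// versionExp is a hypothetical pattern with an optional named group.
var versionExp = regexp.MustCompile(`(?P<major>\d+)(?:\.(?P<minor>\d+))?`)

// participatingGroups (hypothetical helper) returns only the named groups that
// took part in the match. The boolean is false when s does not match at all.
func participatingGroups(re *regexp.Regexp, s string) (map[string]string, bool) {
	loc := re.FindStringSubmatchIndex(s)
	if loc == nil {
		return nil, false
	}
	groups := make(map[string]string)
	for i, name := range re.SubexpNames() {
		start, end := loc[2*i], loc[2*i+1]
		if name == "" || start < 0 {
			continue // whole match, unnamed group, or group that did not participate
		}
		groups[name] = s[start:end]
	}
	return groups, true
}

func main() {
	for _, input := range []string{"1.2", "7", "none"} {
		groups, ok := participatingGroups(versionExp, input)
		fmt.Printf("%q -> %v (matched: %t)\n", input, groups, ok)
	}
}
```

With this variation, a key that is present in the map always corresponds to a group that really matched, even if it matched an empty string.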
Solution 4 - Regex
You can use the regroup library for that: https://github.com/oriser/regroup
Example:
package main

import (
	"fmt"

	"github.com/oriser/regroup"
)

var myExp = regroup.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

func main() {
	match, err := myExp.Groups("1234.5678.9")
	if err != nil {
		panic(err)
	}
	fmt.Printf("by name: %s %s\n", match["first"], match["second"])
}
You can also use a struct for that:
package main

import (
	"fmt"

	"github.com/oriser/regroup"
)

type Example struct {
	First  int `regroup:"first"`
	Second int `regroup:"second"`
}

var myExp = regroup.MustCompile(`(?P<first>\d+)\.(\d+).(?P<second>\d+)`)

func main() {
	res := &Example{}
	err := myExp.MatchToTarget("1234.5678.9", res)
	if err != nil {
		panic(err)
	}
	fmt.Printf("by struct: %+v\n", res)
}