Extract string between two strings in Java

Java, Regex, String

Java Problem Overview


I am trying to get the strings between <%= and %>. Here is my implementation:

String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
Pattern pattern = Pattern.compile("<%=(.*?)%>");
String[] result = pattern.split(str);
System.out.println(Arrays.toString(result));

It returns:

[ZZZZL ,  AFFF ]

But my expectation is:

[ dsn , AFG ]

Where am I going wrong, and how can I correct it?

Java Solutions


Solution 1 - Java

Your pattern is fine, but you shouldn't split() on it; you should find() its matches. The following code gives the output you are looking for:

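import java.util.regex.Matcher;
import java.util.regex.Pattern;

String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
// DOTALL lets '.' match line terminators, so a tag may span several lines.
Pattern pattern = Pattern.compile("<%=(.*?)%>", Pattern.DOTALL);
Matcher matcher = pattern.matcher(str);
while (matcher.find()) {
	// group(1) is the text captured between the delimiters.
	System.out.println(matcher.group(1));
}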
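
If the matches are needed as a collection rather than printed one by one, the same find() loop can fill a list. The following is only a minimal sketch: the class name BetweenTags and the helper extractAll are hypothetical names introduced for illustration, and trimming the surrounding whitespace is an assumption (the answer above keeps it).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BetweenTags {

    // Same pattern as in the answer: lazily capture everything between <%= and %>.
    private static final Pattern TAG = Pattern.compile("<%=(.*?)%>", Pattern.DOTALL);

    // Hypothetical helper (not part of the original answer): returns every
    // captured group, with leading and trailing whitespace trimmed.
    static List<String> extractAll(String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = TAG.matcher(input);
        while (matcher.find()) {
            found.add(matcher.group(1).trim());
        }
        return found;
    }

    public static void main(String[] args) {
        String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
        System.out.println(extractAll(str)); // prints [dsn, AFG]
    }
}
```

Compiling the pattern once as a constant avoids recompiling it on every call, which matters only if the helper is called often.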

Solution 2 - Java

I have answered this question here: https://stackoverflow.com/a/38238785/1773972

Basically, use:

StringUtils.substringBetween(str, "<%=", "%>");

Note that substringBetween returns only the first match; StringUtils.substringsBetween, called with the same arguments, returns all matches as a String[].

This requires the Apache Commons Lang library: https://mvnrepository.com/artifact/org.apache.commons/commons-lang3/3.4

This library has many useful methods for working with strings, and it is worth exploring for other areas of your Java code.

Solution 3 - Java

jlordo's approach covers one specific situation. If you try to build a general-purpose method out of it, you may find it difficult to check that 'textFrom' comes before 'textTo'. Otherwise, the method can return a match for some other occurrence of 'textFrom' in the text.

Here is a ready-to-use general-purpose method that addresses this problem:

  /**
   * Get the text between two strings. The delimiting strings are not
   * included in the result.
   *
   * @param text     Text to search in.
   * @param textFrom Text to start cutting from (exclusive).
   * @param textTo   Text to stop cutting at (exclusive).
   */
  public static String getBetweenStrings(
    String text,
    String textFrom,
    String textTo) {

    // Find the first 'textFrom'. If it is missing, return an empty
    // string instead of cutting at a wrong offset:
    int start = text.indexOf(textFrom);
    if (start < 0) {
      return "";
    }
    start += textFrom.length();

    // Look for 'textTo' only after 'textFrom', so that an earlier
    // occurrence of 'textTo' is not matched accidentally:
    int end = text.indexOf(textTo, start);
    if (end < 0) {
      return "";
    }

    return text.substring(start, end);
  }
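
The method above returns only the first match, while the question asks for every one. The same indexOf idea can be repeated in a loop to collect them all. This is a rough sketch under stated assumptions: the class BetweenStrings and the helper getAllBetweenStrings are hypothetical names, empty delimiters are rejected, and an opening delimiter with no closing one after it simply ends the search.

```java
import java.util.ArrayList;
import java.util.List;

class BetweenStrings {

    // Hypothetical helper: collects every substring enclosed by 'textFrom'
    // and the next 'textTo' that follows it. Delimiters are not included.
    static List<String> getAllBetweenStrings(String text, String textFrom, String textTo) {
        if (textFrom.isEmpty() || textTo.isEmpty()) {
            // Assumption: empty delimiters are not meaningful here.
            throw new IllegalArgumentException("Delimiters must not be empty");
        }
        List<String> results = new ArrayList<>();
        int searchFrom = 0;
        while (true) {
            int start = text.indexOf(textFrom, searchFrom);
            if (start < 0) {
                break; // no further opening delimiter
            }
            start += textFrom.length();
            // Search for the closing delimiter only after the opening one.
            int end = text.indexOf(textTo, start);
            if (end < 0) {
                break; // opening delimiter without a closing one
            }
            results.add(text.substring(start, end));
            searchFrom = end + textTo.length();
        }
        return results;
    }

    public static void main(String[] args) {
        String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
        // Each element keeps its surrounding spaces: " dsn " and " AFG ".
        System.out.println(getAllBetweenStrings(str, "<%=", "%>"));
    }
}
```

Passing a start index to indexOf is what guarantees that each 'textTo' is found after its 'textFrom', with no need to cut the string into intermediate copies.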

Solution 4 - Java

Your regex looks correct, but you're splitting with it instead of matching with it. You want something like this:

// Untested code
Matcher matcher = Pattern.compile("<%=(.*?)%>").matcher(str);
while (matcher.find()) {
    // group(1) is the captured text; group() would include <%= and %> too.
    System.out.println(matcher.group(1));
}
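
When matching this way, it helps to keep in mind which group is being read: on a Matcher, group() (the same as group(0)) returns the entire match including the delimiters, while group(1) returns only the text captured by the first pair of parentheses. The sketch below illustrates the difference; the class GroupDemo and the helper firstMatch are hypothetical names used only for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {

    // Hypothetical helper: returns the first match, either the whole match
    // or only the captured part, or null if the pattern does not occur.
    static String firstMatch(String input, boolean captureOnly) {
        Matcher matcher = Pattern.compile("<%=(.*?)%>").matcher(input);
        if (!matcher.find()) {
            return null;
        }
        return captureOnly ? matcher.group(1) : matcher.group();
    }

    public static void main(String[] args) {
        String str = "ZZZZL <%= dsn %> AFFF <%= AFG %>";
        System.out.println(firstMatch(str, false)); // prints <%= dsn %>
        System.out.println(firstMatch(str, true));  // prints dsn with one space on each side
    }
}
```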

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type      | Original Author | Original Content on Stackoverflow
Question          | Tien Nguyen     | View Question on Stackoverflow
Solution 1 - Java | jlordo          | View Answer on Stackoverflow
Solution 2 - Java | Pini Cheyni     | View Answer on Stackoverflow
Solution 3 - Java | Zon             | View Answer on Stackoverflow
Solution 4 - Java | Henry Keiter    | View Answer on Stackoverflow