How do you count the number of occurrences of a certain substring in a SQL varchar?

Sql ServerTsql

Sql Server Problem Overview


I have a column that has values formatted like a,b,c,d. Is there a way to count the number of commas in that value in T-SQL?

Sql Server Solutions


Solution 1 - Sql Server

The first way that comes to mind is to do it indirectly by replacing the comma with an empty string and comparing the lengths

Declare @string varchar(1000)
Set @string = 'a,b,c,d'
select len(@string) - len(replace(@string, ',', ''))

Solution 2 - Sql Server

Quick extension of cmsjr's answer that works for strings with more than one character.

CREATE FUNCTION dbo.CountOccurrencesOfString
(
	@searchString nvarchar(max),
	@searchTerm nvarchar(max)
)
RETURNS INT
AS
BEGIN
	return (LEN(@searchString)-LEN(REPLACE(@searchString,@searchTerm,'')))/LEN(@searchTerm)
END

Usage:

SELECT * FROM MyTable
where dbo.CountOccurrencesOfString(MyColumn, 'MyString') = 1

Solution 3 - Sql Server

You can compare the length of the string with one where the commas are removed:

len(value) - len(replace(value,',',''))

Solution 4 - Sql Server

The answer by @csmjr has a problem in some instances.

His answer was to do this:

Declare @string varchar(1000)
Set @string = 'a,b,c,d'
select len(@string) - len(replace(@string, ',', ''))

This works in most scenarios, however, try running this:

DECLARE @string VARCHAR(1000)
SET @string = 'a,b,c,d ,'
SELECT LEN(@string) - LEN(REPLACE(@string, ',', ''))

For some reason, REPLACE gets rid of the final comma but ALSO the space just before it (not sure why). This results in a returned value of 5 when you'd expect 4. Here is another way to do this which will work even in this special scenario:

DECLARE @string VARCHAR(1000)
SET @string = 'a,b,c,d ,'
SELECT LEN(REPLACE(@string, ',', '**')) - LEN(@string)

Note that you don't need to use asterisks. Any two-character replacement will do. The idea is that you lengthen the string by one character for each instance of the character you're counting, then subtract the length of the original. It's basically the opposite method of the original answer which doesn't come with the strange trimming side-effect.

Solution 5 - Sql Server

Building on @Andrew's solution, you'll get much better performance using a non-procedural table-valued-function and CROSS APPLY:

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
/*	Usage:
	SELECT t.[YourColumn], c.StringCount
	FROM YourDatabase.dbo.YourTable t
		CROSS APPLY dbo.CountOccurrencesOfString('your search string',     t.[YourColumn]) c
*/
CREATE FUNCTION [dbo].[CountOccurrencesOfString]
(
    @searchTerm nvarchar(max),
    @searchString nvarchar(max)
    
)
RETURNS TABLE
AS
    RETURN 
	SELECT (DATALENGTH(@searchString)-DATALENGTH(REPLACE(@searchString,@searchTerm,'')))/NULLIF(DATALENGTH(@searchTerm), 0) AS StringCount

Solution 6 - Sql Server

Declare @string varchar(1000)

DECLARE @SearchString varchar(100)

Set @string = 'as as df df as as as'

SET @SearchString = 'as'

select ((len(@string) - len(replace(@string, @SearchString, ''))) -(len(@string) - 
        len(replace(@string, @SearchString, ''))) % 2)  / len(@SearchString)

Solution 7 - Sql Server

Accepted answer is correct , extending it to use 2 or more character in substring:

Declare @string varchar(1000)
Set @string = 'aa,bb,cc,dd'
Set @substring = 'aa'
select (len(@string) - len(replace(@string, @substring, '')))/len(@substring)

Solution 8 - Sql Server

Darrel Lee I think has a pretty good answer. Replace CHARINDEX() with PATINDEX(), and you can do some weak regex searching along a string, too...

Like, say you use this for @pattern:

set @pattern='%[-.|!,'+char(9)+']%'

Why would you maybe want to do something crazy like this?

Say you're loading delimited text strings into a staging table, where the field holding the data is something like a varchar(8000) or nvarchar(max)...

Sometimes it's easier/faster to do ELT (Extract-Load-Transform) with data rather than ETL (Extract-Transform-Load), and one way to do this is to load the delimited records as-is into a staging table, especially if you may want an simpler way to see the exceptional records rather than deal with them as part of an SSIS package...but that's a holy war for a different thread.

Solution 9 - Sql Server

If we know there is a limitation on LEN and space, why cant we replace the space first? Then we know there is no space to confuse LEN.

len(replace(@string, ' ', '-')) - len(replace(replace(@string, ' ', '-'), ',', ''))

Solution 10 - Sql Server

Use this code, it is working perfectly. I have create a sql function that accept two parameters, the first param is the long string that we want to search into it,and it can accept string length up to 1500 character(of course you can extend it or even change it to text datatype). And the second parameter is the substring that we want to calculate the number of its occurance(its length is up to 200 character, of course you can change it to what your need). and the output is an integer, represent the number of frequency.....enjoy it.


CREATE FUNCTION [dbo].[GetSubstringCount]
(
  @InputString nvarchar(1500),
  @SubString NVARCHAR(200)
)
RETURNS int
AS
BEGIN 
		declare @K int , @StrLen int , @Count int , @SubStrLen int 
		set @SubStrLen = (select len(@SubString))
		set @Count = 0
		Set @k = 1
		set @StrLen =(select len(@InputString))
	While @K <= @StrLen
		Begin
			if ((select substring(@InputString, @K, @SubStrLen)) = @SubString)
				begin
					if ((select CHARINDEX(@SubString ,@InputString)) > 0)
						begin
						set @Count = @Count +1
						end
				end
								Set @K=@k+1
		end
		return @Count
end

Solution 11 - Sql Server

In SQL 2017 or higher, you can use this:

declare @hits int = 0
set @hits = (select value from STRING_SPLIT('F609,4DFA,8499',','));
select count(@hits)

Solution 12 - Sql Server

Improved version based on top answer and other answers:

Wrapping the string with delimiters ensures that LEN works properly. Making the replace character string one character longer than the match string removes the need for division.

CREATE FUNCTION dbo.MatchCount(@value nvarchar(max), @match  nvarchar(max))
RETURNS int
BEGIN
    RETURN LEN('[' + REPLACE(@value,@match,REPLICATE('*', LEN('[' + @match + ']') - 1)) + ']') - LEN('['+@value+']')
END

Solution 13 - Sql Server

DECLARE @records varchar(400)
SELECT @records = 'a,b,c,d'
select  LEN(@records) as 'Before removing Commas' , LEN(@records) - LEN(REPLACE(@records, ',', '')) 'After Removing Commans'

Solution 14 - Sql Server

The following should do the trick for both single character and multiple character searches:

CREATE FUNCTION dbo.CountOccurrences
(
   @SearchString VARCHAR(1000),
   @SearchFor    VARCHAR(1000)
)
RETURNS TABLE
AS
   RETURN (
             SELECT COUNT(*) AS Occurrences
             FROM   (
                       SELECT ROW_NUMBER() OVER (ORDER BY O.object_id) AS n
                       FROM   sys.objects AS O
                    ) AS N
                    JOIN (
                            VALUES (@SearchString)
                         ) AS S (SearchString)
                         ON
                         SUBSTRING(S.SearchString, N.n, LEN(@SearchFor)) = @SearchFor
          );
GO

---------------------------------------------------------------------------------------
-- Test the function for single and multiple character searches
---------------------------------------------------------------------------------------
DECLARE @SearchForComma      VARCHAR(10) = ',',
        @SearchForCharacters VARCHAR(10) = 'de';

DECLARE @TestTable TABLE
(
   TestData VARCHAR(30) NOT NULL
);

INSERT INTO @TestTable
     (
        TestData
     )
VALUES
     ('a,b,c,de,de ,d e'),
     ('abc,de,hijk,,'),
     (',,a,b,cde,,');

SELECT TT.TestData,
       CO.Occurrences AS CommaOccurrences,
       CO2.Occurrences AS CharacterOccurrences
FROM   @TestTable AS TT
       OUTER APPLY dbo.CountOccurrences(TT.TestData, @SearchForComma) AS CO
       OUTER APPLY dbo.CountOccurrences(TT.TestData, @SearchForCharacters) AS CO2;

The function can be simplified a bit using a table of numbers (dbo.Nums):

   RETURN (
             SELECT COUNT(*) AS Occurrences
             FROM   dbo.Nums AS N
                    JOIN (
                            VALUES (@SearchString)
                         ) AS S (SearchString)
                         ON
                         SUBSTRING(S.SearchString, N.n, LEN(@SearchFor)) = @SearchFor
          );

Solution 15 - Sql Server

I finally write this function that should cover all the possible situations, adding a char prefix and suffix to the input. this char is evaluated to be different to any of the char conteined in the search parameter, so it can't affect the result.

CREATE FUNCTION [dbo].[CountOccurrency]
(
@Input nvarchar(max),
@Search nvarchar(max)
)
RETURNS int AS
BEGIN
    declare @SearhLength as int = len('-' + @Search + '-') -2;
	declare @conteinerIndex as int = 255;
	declare @conteiner as char(1) = char(@conteinerIndex);
	WHILE ((CHARINDEX(@conteiner, @Search)>0) and (@conteinerIndex>0))
	BEGIN
		set @conteinerIndex = @conteinerIndex-1;
		set @conteiner = char(@conteinerIndex);
	END;
	set @Input = @conteiner + @Input + @conteiner
	RETURN (len(@Input) - len(replace(@Input, @Search, ''))) / @SearhLength
END 

usage

select dbo.CountOccurrency('a,b,c,d ,', ',')

Solution 16 - Sql Server

Declare @MainStr nvarchar(200)
Declare @SubStr nvarchar(10)
Set @MainStr = 'nikhildfdfdfuzxsznikhilweszxnikhil'
Set @SubStr = 'nikhil'
Select (Len(@MainStr) - Len(REPLACE(@MainStr,@SubStr,'')))/Len(@SubStr)

Solution 17 - Sql Server

this T-SQL code finds and prints all occurrences of pattern @p in sentence @s. you can do any processing on the sentence afterward.

declare @old_hit int = 0
declare @hit int = 0
declare @i int = 0
declare @s varchar(max)='alibcalirezaalivisualization'
declare @p varchar(max)='ali'
 while @i<len(@s)
  begin
   set @hit=charindex(@p,@s,@i)
   if @hit>@old_hit 
    begin
    set @old_hit =@hit
    set @i=@hit+1
    print @hit
   end
  else
    break
 end

the result is: 1 6 13 20

Solution 18 - Sql Server

I ended up using a CTE table for this,

CREATE TABLE #test (
 [id] int,
 [field] nvarchar(500)
)

INSERT INTO #test ([id], [field])
VALUES (1, 'this is a test string http://url, and https://google.com'),
       (2, 'another string, hello world http://example.com'),
       (3, 'a string with no url')

SELECT *
FROM #test

;WITH URL_count_cte ([id], [url_index], [field])
AS
(
    SELECT [id], CHARINDEX('http', [field], 0)+1 AS [url_index], [field]
    FROM #test AS [t]
    WHERE CHARINDEX('http', [field], 0) != 0
    UNION ALL
    SELECT [id], CHARINDEX('http', [field], [url_index])+1 AS [url_index], [field]
    FROM URL_count_cte
    WHERE CHARINDEX('http', [field], [url_index]) > 0
)

-- total urls
SELECT COUNT(1)
FROM URL_count_cte

-- urls per row
SELECT [id], COUNT(1) AS [url_count]
FROM URL_count_cte
GROUP BY [id]

Solution 19 - Sql Server

Using this function, you can get the number of repetitions of words in a text.

/****** Object:  UserDefinedFunction [dbo].[fn_getCountKeywords]    Script Date: 22/11/2021 17:52:00 ******/
DROP FUNCTION IF EXISTS [dbo].[fn_getCountKeywords]
GO
/****** Object:  UserDefinedFunction [dbo].[fn_getCountKeywords]    Script Date: 2211/2021 17:52:00 ******/
SET ANSI_NULLS OFF
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author:		m_Khezrian
-- Create date: 2021/11/22-17:52
-- Description:	Return Count Keywords In Input Text
-- =============================================

Create OR Alter Function [dbo].[fn_getCountKeywords]
	(@Text		nvarchar(max)
	,@Keywords	nvarchar(max)
	)
RETURNS @Result TABLE
(    
   [ID]			int Not Null IDENTITY PRIMARY KEY
  ,[Keyword]	nvarchar(max) Not Null
  ,[Cnt]		int Not Null Default(0)

)
/*With ENCRYPTION*/ As 
Begin
	Declare @Key	nvarchar(max);
	Declare @Cnt	int;
	Declare @I		int;

	Set @I = 0 ;
	--Set @Text = QUOTENAME(@Text);

	Insert Into @Result
		([Keyword])
	Select Trim([value])
	From String_Split(@Keywords,N',')
	Group By [value]
	Order By Len([value]) Desc;

	Declare CntKey_Cursor Insensitive Cursor For
	Select [Keyword]
	From @Result
	Order By [ID];

	Open CntKey_Cursor;
	Fetch Next From CntKey_Cursor Into @Key;
	While (@@Fetch_STATUS = 0) Begin
		Set @Cnt = 0;

		While (PatIndex(N'%'+@Key+'%',@Text) > 0) Begin
			Set @Cnt += 1;
			Set @I += 1 ;
			Set @Text = Stuff(@Text,PatIndex(N'%'+@Key+'%',@Text),len(@Key),N'{'+Convert(nvarchar,@I)+'}');
			--Set @Text = Replace(@Text,@Key,N'{'+Convert(nvarchar,@I)+'}');
		End--While

		Update @Result
			Set [Cnt] = @Cnt
		Where ([Keyword] = @Key);

		Fetch Next From CntKey_Cursor Into @Key;
	End--While
	Close CntKey_Cursor;
	Deallocate CntKey_Cursor;
	Return
 End
GO

--Test
Select *
From dbo.fn_getCountKeywords(
		N'<U+0001F4E3> MARKET IMPACT Euro area Euro CPIarea annual inflation up to 3.0% MaCPIRKET forex'
		,N'CPI ,core,MaRKET , Euro area'
		)		

Go

Solution 20 - Sql Server

Reference https://docs.microsoft.com/en-us/sql/t-sql/functions/string-split-transact-sql?view=sql-server-ver15

Example:

SELECT	s.*
	,s.[Number1] - (SELECT COUNT(Value)
						FROM string_split(s.[StringColumn],',')
						WHERE RTRIM(VALUE) <> '')
FROM TableName AS s

Applies to: SQL Server 2016 (13.x) and later

Solution 21 - Sql Server

You can use the following stored procedure to fetch , values.

IF  EXISTS (SELECT * FROM sys.objects 
WHERE object_id = OBJECT_ID(N'[dbo].[sp_parsedata]') AND type in (N'P', N'PC'))
    DROP PROCEDURE [dbo].[sp_parsedata]
GO
create procedure sp_parsedata
(@cid integer,@st varchar(1000))
as
  declare @coid integer
  declare @c integer
  declare @c1 integer
  select @c1=len(@st) - len(replace(@st, ',', ''))
  set @c=0
  delete from table1 where complainid=@cid;
  while (@c<=@c1)
    begin
      if (@c<@c1) 
        begin
          select @coid=cast(replace(left(@st,CHARINDEX(',',@st,1)),',','') as integer)
          select @st=SUBSTRING(@st,CHARINDEX(',',@st,1)+1,LEN(@st))
        end
      else
        begin
          select @coid=cast(@st as integer)
        end
      insert into table1(complainid,courtid) values(@cid,@coid)
      set @c=@c+1
    end

Solution 22 - Sql Server

The Replace/Len test is cute, but probably very inefficient (especially in terms of memory). A simple function with a loop will do the job.

CREATE FUNCTION [dbo].[fn_Occurences] 
(
	@pattern varchar(255),
	@expression varchar(max)
)
RETURNS int
AS
BEGIN

	DECLARE @Result int = 0;

	DECLARE @index BigInt = 0
	DECLARE @patLen int = len(@pattern)

	SET @index = CHARINDEX(@pattern, @expression, @index)
	While @index > 0
	BEGIN
		SET @Result = @Result + 1;
		SET @index = CHARINDEX(@pattern, @expression, @index + @patLen)
	END

	RETURN @Result

END

Solution 23 - Sql Server

Perhaps you should not store data that way. It is a bad practice to ever store a comma delimited list in a field. IT is very inefficient for querying. This should be a related table.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionOrion AdrianView Question on Stackoverflow
Solution 1 - Sql ServercmsjrView Answer on Stackoverflow
Solution 2 - Sql ServerAndrew BarrettView Answer on Stackoverflow
Solution 3 - Sql ServerGuffaView Answer on Stackoverflow
Solution 4 - Sql ServerbubblekingView Answer on Stackoverflow
Solution 5 - Sql ServerRussell FoxView Answer on Stackoverflow
Solution 6 - Sql ServerNIKHIL THAKURView Answer on Stackoverflow
Solution 7 - Sql ServerImran RizviView Answer on Stackoverflow
Solution 8 - Sql Serveruser1390375View Answer on Stackoverflow
Solution 9 - Sql ServerMartinCView Answer on Stackoverflow
Solution 10 - Sql ServerOne DayView Answer on Stackoverflow
Solution 11 - Sql ServerRudy HinojosaView Answer on Stackoverflow
Solution 12 - Sql ServerJim BergView Answer on Stackoverflow
Solution 13 - Sql ServerShivaView Answer on Stackoverflow
Solution 14 - Sql Servercmfox1970View Answer on Stackoverflow
Solution 15 - Sql ServerArden InsideView Answer on Stackoverflow
Solution 16 - Sql ServerNIKHIL THAKURView Answer on Stackoverflow
Solution 17 - Sql ServerHasan ZafariView Answer on Stackoverflow
Solution 18 - Sql Serveruser802599View Answer on Stackoverflow
Solution 19 - Sql ServerMahmoodView Answer on Stackoverflow
Solution 20 - Sql Serverdaniyal_1363View Answer on Stackoverflow
Solution 21 - Sql ServerNileshView Answer on Stackoverflow
Solution 22 - Sql ServerDarrel LeeView Answer on Stackoverflow
Solution 23 - Sql ServerHLGEMView Answer on Stackoverflow