java codility training Genomic-range-query

JavaAlgorithm

Java Problem Overview


The task is:

A non-empty zero-indexed string S is given. String S consists of N characters from the set of upper-case English letters A, C, G, T.

This string actually represents a DNA sequence, and the upper-case letters represent single nucleotides.

You are also given non-empty zero-indexed arrays P and Q consisting of M integers. These arrays represent queries about minimal nucleotides. We represent the letters of string S as integers 1, 2, 3, 4 in arrays P and Q, where A = 1, C = 2, G = 3, T = 4, and we assume that A < C < G < T.

Query K requires you to find the minimal nucleotide from the range (P[K], Q[K]), 0 ≤ P[i] ≤ Q[i] < N.

For example, consider string S = GACACCATA and arrays P, Q such that:

P[0] = 0    Q[0] = 8
P[1] = 0    Q[1] = 2
P[2] = 4    Q[2] = 5
P[3] = 7    Q[3] = 7

The minimal nucleotides from these ranges are as follows:

    (0, 8) is A identified by 1,
    (0, 2) is A identified by 1,
    (4, 5) is C identified by 2,
    (7, 7) is T identified by 4.

Write a function:

class Solution { public int[] solution(String S, int[] P, int[] Q); } 

that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M characters specifying the consecutive answers to all queries.

The sequence should be returned as:

    a Results structure (in C), or
    a vector of integers (in C++), or
    a Results record (in Pascal), or
    an array of integers (in any other programming language).

For example, given the string S = GACACCATA and arrays P, Q such that:

P[0] = 0    Q[0] = 8
P[1] = 0    Q[1] = 2
P[2] = 4    Q[2] = 5
P[3] = 7    Q[3] = 7

the function should return the values [1, 1, 2, 4], as explained above.

Assume that:

    N is an integer within the range [1..100,000];
    M is an integer within the range [1..50,000];
    each element of array P, Q is an integer within the range [0..N − 1];
    P[i] ≤ Q[i];
    string S consists only of upper-case English letters A, C, G, T.

Complexity:

    expected worst-case time complexity is O(N+M);
    expected worst-case space complexity is O(N), 
         beyond input storage 
         (not counting the storage required for input arguments).

Elements of input arrays can be modified.

My solution is:

class Solution {
    public int[] solution(String S, int[] P, int[] Q) {
        final  char c[] = S.toCharArray();
        final int answer[] = new int[P.length];
        int tempAnswer;
        char tempC;

        for (int iii = 0; iii < P.length; iii++) {
            tempAnswer = 4;
            for (int zzz = P[iii]; zzz <= Q[iii]; zzz++) {
                tempC = c[zzz];
                if (tempC == 'A') {
                    tempAnswer = 1;
                    break;
                } else if (tempC == 'C') {
                    if (tempAnswer > 2) {
                        tempAnswer = 2;
                    }
                } else if (tempC == 'G') {
                    if (tempAnswer > 3) {
                        tempAnswer = 3;
                    }

                }
            }
            answer[iii] = tempAnswer;
        }

        return answer;
    }
}

It is not optimal, I believe it's supposed to be done within one loop, any hint how can I achieve it?

You can check quality of your solution here https://codility.com/train/ test name is Genomic-range-query.

Java Solutions


Solution 1 - Java

Here is the solution that got 100 out of 100 in codility.com. Please read about prefix sums to understand the solution:

public static int[] solveGenomicRange(String S, int[] P, int[] Q) {
        //used jagged array to hold the prefix sums of each A, C and G genoms
        //we don't need to get prefix sums of T, you will see why.
        int[][] genoms = new int[3][S.length()+1];
        //if the char is found in the index i, then we set it to be 1 else they are 0
        //3 short values are needed for this reason
        short a, c, g;
        for (int i=0; i<S.length(); i++) {
            a = 0; c = 0; g = 0;
            if ('A' == (S.charAt(i))) {
                a=1;
            }
            if ('C' == (S.charAt(i))) {
                c=1;
            }
            if ('G' == (S.charAt(i))) {
                g=1;
            }
            //here we calculate prefix sums. To learn what's prefix sums look at here https://codility.com/media/train/3-PrefixSums.pdf
            genoms[0][i+1] = genoms[0][i] + a;
            genoms[1][i+1] = genoms[1][i] + c;
            genoms[2][i+1] = genoms[2][i] + g;
        }

        int[] result = new int[P.length];
        //here we go through the provided P[] and Q[] arrays as intervals
        for (int i=0; i<P.length; i++) {
            int fromIndex = P[i];
            //we need to add 1 to Q[i], 
            //because our genoms[0][0], genoms[1][0] and genoms[2][0]
            //have 0 values by default, look above genoms[0][i+1] = genoms[0][i] + a; 
            int toIndex = Q[i]+1;
            if (genoms[0][toIndex] - genoms[0][fromIndex] > 0) {
                result[i] = 1;
            } else if (genoms[1][toIndex] - genoms[1][fromIndex] > 0) {
                result[i] = 2;
            } else if (genoms[2][toIndex] - genoms[2][fromIndex] > 0) {
                result[i] = 3;
            } else {
                result[i] = 4;
            }
        }

        return result;
    }

Solution 2 - Java

Simple, elegant, domain specific, 100/100 solution in JS with comments!

function solution(S, P, Q) {
    var N = S.length, M = P.length;

    // dictionary to map nucleotide to impact factor
    var impact = {A : 1, C : 2, G : 3, T : 4};

    // nucleotide total count in DNA
    var currCounter = {A : 0, C : 0, G : 0, T : 0};

    // how many times nucleotide repeats at the moment we reach S[i]
    var counters = [];
    
    // result
    var minImpact = [];

    var i;
    
    // count nucleotides
    for(i = 0; i <= N; i++) {
        counters.push({A: currCounter.A, C: currCounter.C, G: currCounter.G});
        currCounter[S[i]]++;
    }

    // for every query
    for(i = 0; i < M; i++) {
        var from = P[i], to = Q[i] + 1;

        // compare count of A at the start of query with count at the end of equry
        // if counter was changed then query contains A
        if(counters[to].A - counters[from].A > 0) {
            minImpact.push(impact.A);
        }
        // same things for C and others nucleotides with higher impact factor
        else if(counters[to].C - counters[from].C > 0) {
            minImpact.push(impact.C);
        }
        else if(counters[to].G - counters[from].G > 0) {
            minImpact.push(impact.G);
        }
        else { // one of the counters MUST be changed, so its T
            minImpact.push(impact.T);
        }
    }
    
    return minImpact;
}

Solution 3 - Java

Java, 100/100, but with no cumulative/prefix sums! I stashed the last occurrence index of lower 3 nucelotides in a array "map". Later I check if the last index is between P-Q. If so it returns the nuclotide, if not found, it's the top one (T):

class Solution {

int[][] lastOccurrencesMap;

public int[] solution(String S, int[] P, int[] Q) {
    int N = S.length();
    int M = P.length;
    
    int[] result = new int[M];
    lastOccurrencesMap = new int[3][N];
    int lastA = -1;
    int lastC = -1;
    int lastG = -1;
    
    for (int i = 0; i < N; i++) {
        char c = S.charAt(i);
        
        if (c == 'A') {
            lastA = i;
        } else if (c == 'C') {
            lastC = i;
        } else if (c == 'G') {
            lastG = i;
        }
        
        lastOccurrencesMap[0][i] = lastA;
        lastOccurrencesMap[1][i] = lastC;
        lastOccurrencesMap[2][i] = lastG;
    }
    
    for (int i = 0; i < M; i++) {
        int startIndex = P[i];
        int endIndex = Q[i];
        
        int minimum = 4;
        for (int n = 0; n < 3; n++) {
            int lastOccurence = getLastNucleotideOccurrence(startIndex, endIndex, n);
            if (lastOccurence != 0) {
                minimum = n + 1; 
                break;
            }
        }
        
        result[i] = minimum;
    }
    return result;
}

int getLastNucleotideOccurrence(int startIndex, int endIndex, int nucleotideIndex) {
    int[] lastOccurrences = lastOccurrencesMap[nucleotideIndex];
    int endValueLastOccurenceIndex = lastOccurrences[endIndex];
    if (endValueLastOccurenceIndex >= startIndex) {
        return nucleotideIndex + 1;
    } else {
        return 0;
    }
}
}

Solution 4 - Java

Here is the solution, supposing someone is still interested.

class Solution {
        public int[] solution(String S, int[] P, int[] Q) {
            int[] answer = new int[P.length];
            char[] chars = S.toCharArray();
            int[][] cumulativeAnswers = new int[4][chars.length + 1];
    
            for (int iii = 0; iii < chars.length; iii++) {
                if (iii > 0) {
                    for (int zzz = 0; zzz < 4; zzz++) {
                        cumulativeAnswers[zzz][iii + 1] = cumulativeAnswers[zzz][iii];
                    }
                }
    
                switch (chars[iii]) {
                    case 'A':
                        cumulativeAnswers[0][iii + 1]++;
                        break;
                    case 'C':
                        cumulativeAnswers[1][iii + 1]++;
                        break;
                    case 'G':
                        cumulativeAnswers[2][iii + 1]++;
                        break;
                    case 'T':
                        cumulativeAnswers[3][iii + 1]++;
                        break;
                }
            }
    
            for (int iii = 0; iii < P.length; iii++) {
                for (int zzz = 0; zzz < 4; zzz++) {
    
                    if ((cumulativeAnswers[zzz][Q[iii] + 1] - cumulativeAnswers[zzz][P[iii]]) > 0) {
                        answer[iii] = zzz + 1;
                        break;
                    }
    
                }
            }
    
            return answer;
        }
    }

Solution 5 - Java

In case anyone cares about C:

#include <string.h>

struct Results solution(char *S, int P[], int Q[], int M) {    
    int i, a, b, N, *pA, *pC, *pG;
    struct Results result;
    
    result.A = malloc(sizeof(int) * M);
    result.M = M;
    
    // calculate prefix sums
    N = strlen(S);
    pA = malloc(sizeof(int) * N);
    pC = malloc(sizeof(int) * N);
    pG = malloc(sizeof(int) * N);
    pA[0] = S[0] == 'A' ? 1 : 0;
    pC[0] = S[0] == 'C' ? 1 : 0;
    pG[0] = S[0] == 'G' ? 1 : 0;
    for (i = 1; i < N; i++) {
        pA[i] = pA[i - 1] + (S[i] == 'A' ? 1 : 0);
        pC[i] = pC[i - 1] + (S[i] == 'C' ? 1 : 0);
        pG[i] = pG[i - 1] + (S[i] == 'G' ? 1 : 0);
    }
    
    for (i = 0; i < M; i++) {
        a = P[i] - 1;
        b = Q[i];
        
        if ((pA[b] - pA[a]) > 0) {
            result.A[i] = 1;
        } else if ((pC[b] - pC[a]) > 0) {
            result.A[i] = 2;
        } else if ((pG[b] - pG[a]) > 0) {
            result.A[i] = 3;
        } else {
            result.A[i] = 4;
        }
    }
    
    
    return result;
}

Solution 6 - Java

Here is my solution Using Segment Tree O(n)+O(log n)+O(M) time

public class DNAseq {


public static void main(String[] args) {
	String S="CAGCCTA";
	int[] P={2, 5, 0};
	int[] Q={4, 5, 6};
	int [] results=solution(S,P,Q);
	System.out.println(results[0]);
}

static class segmentNode{
	int l;
	int r;
	int min;
	segmentNode left;
	segmentNode right;
}



public static segmentNode buildTree(int[] arr,int l,int r){
	if(l==r){
		segmentNode n=new segmentNode();
		n.l=l;
		n.r=r;
		n.min=arr[l];
		return n;
	}
	int mid=l+(r-l)/2;
	segmentNode le=buildTree(arr,l,mid);
	segmentNode re=buildTree(arr,mid+1,r);
	segmentNode root=new segmentNode();
	root.left=le;
	root.right=re;
	root.l=le.l;
	root.r=re.r;
	
	root.min=Math.min(le.min,re.min);
	
	return root;
}

public static int getMin(segmentNode root,int l,int r){
	if(root.l>r || root.r<l){
		return Integer.MAX_VALUE;
	}
	if(root.l>=l&& root.r<=r) {
		return root.min;
	}
	return Math.min(getMin(root.left,l,r),getMin(root.right,l,r));
}
public static int[] solution(String S, int[] P, int[] Q) {
	int[] arr=new int[S.length()];
	for(int i=0;i<S.length();i++){
		switch (S.charAt(i)) {
		case 'A':
			arr[i]=1;
			break;
		case 'C':
			arr[i]=2;
			break;
		case 'G':
			arr[i]=3;
			break;
		case 'T':
			arr[i]=4;
			break;
		default:
			break;
		}
	}
	
	segmentNode root=buildTree(arr,0,S.length()-1);
	int[] result=new int[P.length];
	for(int i=0;i<P.length;i++){
		result[i]=getMin(root,P[i],Q[i]);
	}
	return result;
} }

Solution 7 - Java

Here is my solution. Got %100 . Of course I needed to first check and study a little bit prefix sums.

public int[] solution(String S, int[] P, int[] Q){
		
		int[] result = new int[P.length];

		int[] factor1 = new int[S.length()];
		int[] factor2 = new int[S.length()];
		int[] factor3 = new int[S.length()];
		int[] factor4 = new int[S.length()];
		
		int factor1Sum = 0;
		int factor2Sum = 0;
		int factor3Sum = 0;
		int factor4Sum = 0;
		
		for(int i=0; i<S.length(); i++){
			switch (S.charAt(i)) {
			case 'A':
				factor1Sum++;
				break;
			case 'C':
				factor2Sum++;
				break;
			case 'G':
				factor3Sum++;
				break;
			case 'T':
				factor4Sum++;
				break;
			default:
				break;
			}
			factor1[i] = factor1Sum;
			factor2[i] = factor2Sum;
			factor3[i] = factor3Sum;
			factor4[i] = factor4Sum;
		}
		
		for(int i=0; i<P.length; i++){
			
			int start = P[i];
			int end = Q[i];
			
			if(start == 0){
				if(factor1[end] > 0){
					result[i] = 1;
				}else if(factor2[end] > 0){
					result[i] = 2;
				}else if(factor3[end] > 0){
					result[i] = 3;
				}else{
					result[i] = 4;
				}
			}else{
				if(factor1[end] > factor1[start-1]){
					result[i] = 1;
				}else if(factor2[end] > factor2[start-1]){
					result[i] = 2;
				}else if(factor3[end] > factor3[start-1]){
					result[i] = 3;
				}else{
					result[i] = 4;
				}
			}
			
		}
		
		return result;
	}

Solution 8 - Java

This is my JavaScript solution that got 100% across the board on Codility:

function solution(S, P, Q) {
    let total = [];
    let min;

    for (let i = 0; i < P.length; i++) {
        const substring = S.slice(P[i], Q[i] + 1);
        if (substring.includes('A')) {
            min = 1;
        } else if (substring.includes('C')) {
            min = 2;
        } else if (substring.includes('G')) {
            min = 3;
        } else if (substring.includes('T')) {
            min = 4;
        }
        total.push(min);
    }
    return total;
}

Solution 9 - Java

import java.util.Arrays;
import java.util.HashMap;
class Solution {
    
   static HashMap<Character, Integer > characterMapping = new HashMap<Character, Integer>(){{
    put('A',1);
    put('C',2);
    put('G',3);
    put('T',4);
  }};

  public static int minimum(int[] arr) {

    if (arr.length ==1) return arr[0];

    int smallestIndex = 0;
    for (int index = 0; index<arr.length; index++) {
      if (arr[index]<arr[smallestIndex]) smallestIndex=index;
    }
    return arr[smallestIndex];
  }
    
    public int[] solution(String S, int[] P, int[] Q) {
        final char[] characterInput = S.toCharArray();
    final int[] integerInput = new int[characterInput.length];

    for(int counter=0; counter < characterInput.length; counter++) {
      integerInput[counter] = characterMapping.get(characterInput[counter]);
    }

    int[] result = new int[P.length];

    //assuming P and Q have the same length
    for(int index =0; index<P.length; index++) {

      if (P[index]==Q[index]) {
        result[index] = integerInput[P[index]];
        break;
      }
      final int[] subArray = Arrays.copyOfRange(integerInput, P[index], Q[index]+1);
      final int minimumValue = minimum(subArray);
      result[index]= minimumValue;
    }
    return result;
    }
}

Solution 10 - Java

Here is a C# solution, the basic idea is pretty much the same as the other answers, but it may be cleaner:

using System;

class Solution
{
    public int[] solution(string S, int[] P, int[] Q)
    {
        int N = S.Length;
        int M = P.Length;
        char[] chars = {'A','C','G','T'};

        //Calculate accumulates
        int[,] accum = new int[3, N+1];
        for (int i = 0; i <= 2; i++)
        {
            for (int j = 0; j < N; j++)
            {
                if(S[j] == chars[i]) accum[i, j+1] = accum[i, j] + 1;
                else accum[i, j+1] = accum[i, j];
            }
        }

        //Get minimal nucleotides for the given ranges
        int diff;
        int[] minimums = new int[M];
        for (int i = 0; i < M; i++)
        {
            minimums[i] = 4;
            for (int j = 0; j <= 2; j++)
            {
                diff = accum[j, Q[i]+1] - accum[j, P[i]];
                if (diff > 0)
                {
                    minimums[i] = j+1;
                    break;
                }
            }
        }

        return minimums;
    }
}

Solution 11 - Java

Here's 100% Scala solution:

def solution(S: String, P: Array[Int], Q: Array[Int]): Array[Int] = {


    val resp = for(ind <- 0 to P.length-1) yield {

      val sub= S.substring(P(ind),Q(ind)+1)


      var factor = 4

      if(sub.contains("A")) {factor=1}
      else{
        if(sub.contains("C")) {factor=2}
        else{
          if(sub.contains("G")) {factor=3}
        }
      }
      factor

    }

    return resp.toArray

  }

And performance: https://codility.com/demo/results/trainingEUR4XP-425/

Solution 12 - Java

Hope this helps.

public int[] solution(String S, int[] P, int[] K) {
        // write your code in Java SE 8
        char[] sc = S.toCharArray();
        int[] A = new int[sc.length];
        int[] G = new int[sc.length];
        int[] C = new int[sc.length];
        
        int prevA =-1,prevG=-1,prevC=-1;
        
        for(int i=0;i<sc.length;i++){
            if(sc[i]=='A')
               prevA=i;
            else if(sc[i] == 'G')
               prevG=i;
            else if(sc[i] =='C')
               prevC=i;
            A[i] = prevA;
            G[i] = prevG;
            C[i] = prevC;
            //System.out.println(A[i]+ " "+G[i]+" "+C[i]);

        }
        int[] result = new int[P.length];

        for(int i=0;i<P.length;i++){
            //System.out.println(A[P[i]]+ " "+A[K[i]]+" "+C[P[i]]+" "+C[K[i]]+" "+P[i]+" "+K[i]);

            if(A[K[i]] >=P[i] && A[K[i]] <=K[i]){
                  result[i] =1;
            }
            else if(C[K[i]] >=P[i] && C[K[i]] <=K[i]){
                  result[i] =2;
            }else if(G[K[i]] >=P[i] && G[K[i]] <=K[i]){
                  result[i] =3;
            }
            else{
                result[i]=4;
            }
        }
        
        return result;
    }

Solution 13 - Java

If someone is still interested in this exercise, I share my Python solution (100/100 in Codility)

def solution(S, P, Q):

    count = []
    for i in range(3):
        count.append([0]*(len(S)+1))

    for index, i in enumerate(S):
        count[0][index+1] = count[0][index] + ( i =='A')
        count[1][index+1] = count[1][index] + ( i =='C')
        count[2][index+1] = count[2][index] + ( i =='G')

    result = []

    for i in range(len(P)):
      start = P[i]
      end = Q[i]+1

      if count[0][end] - count[0][start]:
          result.append(1)
      elif count[1][end] - count[1][start]:
          result.append(2)
      elif count[2][end] - count[2][start]:
          result.append(3)
      else:
          result.append(4)

    return result

Solution 14 - Java

Python Solution with explanation

The idea is to hold an auxiliary array per nucleotide X, with position i (ignoring zero) is how many times X has occurred as of now. And so if we need the number of occurrences of X from position f to position t, we could take the following equation:

> aux(t) - aux(f)

Time complexity is:

> O(N+M)

def solution(S, P, Q):
    n = len(S)
    m = len(P)

    aux = [[0 for i in range(n+1)] for i in [0,1,2]]
    
    for i,c in enumerate(S):
        aux[0][i+1] = aux[0][i] + ( c == 'A' )
        aux[1][i+1] = aux[1][i] + ( c == 'C' )
        aux[2][i+1] = aux[2][i] + ( c == 'G' )

    result = []

    for i in range(m):
        fromIndex , toIndex = P[i] , Q[i] +1
        if   aux[0][toIndex] - aux[0][fromIndex] > 0:
            r = 1
        elif aux[1][toIndex] - aux[1][fromIndex] > 0:
            r = 2
        elif aux[2][toIndex] - aux[2][fromIndex] > 0:
            r = 3
        else:
            r = 4

        result.append(r)

    return result

Solution 15 - Java

This is a Swift 4 solution to the same problem. It is based on @codebusta's solution above:

public func solution(_ S : inout String, _ P : inout [Int], _ Q : inout [Int]) -> [Int] {
var impacts = [Int]()
var prefixSum = [[Int]]()
for _ in 0..<3 {
    let array = Array(repeating: 0, count: S.count + 1)
    prefixSum.append(array)
}

for (index, character) in S.enumerated() {
    var a = 0
    var c = 0
    var g = 0
    
    switch character {
    case "A":
        a = 1
        
    case "C":
        c = 1
        
    case "G":
        g = 1
        
    default:
        break
    }
    
    prefixSum[0][index + 1] = prefixSum[0][index] + a
    prefixSum[1][index + 1] = prefixSum[1][index] + c
    prefixSum[2][index + 1] = prefixSum[2][index] + g
}

for tuple in zip(P, Q) {
    if  prefixSum[0][tuple.1 + 1] - prefixSum[0][tuple.0] > 0 {
        impacts.append(1)
    }
    else if prefixSum[1][tuple.1 + 1] - prefixSum[1][tuple.0] > 0 {
        impacts.append(2)
    }
    else if prefixSum[2][tuple.1 + 1] - prefixSum[2][tuple.0] > 0 {
        impacts.append(3)
    }
    else {
        impacts.append(4)
    }
}

   return impacts
 }

Solution 16 - Java

pshemek's solution constrains itself to the space complexity (O(N)) - even with the 2-d array and the answer array because a constant (4) is used for the 2-d array. That solution also fits in with the computational complexity - whereas mine is O (N^2) - though the actual computational complexity is much lower because it skips over entire ranges that include minimal values.

I gave it a try - but mine ends up using more space - but makes more intuitive sense to me (C#):

public static int[] solution(String S, int[] P, int[] Q)
{
	const int MinValue = 1;
	Dictionary<char, int> stringValueTable = new Dictionary<char,int>(){ {'A', 1}, {'C', 2}, {'G', 3}, {'T', 4} };

	char[] inputArray = S.ToCharArray();
	int[,] minRangeTable = new int[S.Length, S.Length]; // The meaning of this table is [x, y] where x is the start index and y is the end index and the value is the min range - if 0 then it is the min range (whatever that is)
	for (int startIndex = 0; startIndex < S.Length; ++startIndex)
	{
		int currentMinValue = 4;
		int minValueIndex = -1;
		for (int endIndex = startIndex; (endIndex < S.Length) && (minValueIndex == -1); ++endIndex)
		{
			int currentValue = stringValueTable[inputArray[endIndex]];
			if (currentValue < currentMinValue)
			{
				currentMinValue = currentValue;
				if (currentMinValue == MinValue) // We can stop iterating - because anything with this index in its range will always be minimal
					minValueIndex = endIndex;
				else
					minRangeTable[startIndex, endIndex] = currentValue;
			}
			else
				minRangeTable[startIndex, endIndex] = currentValue;
		}

		if (minValueIndex != -1) // Skip over this index - since it is minimal
			startIndex = minValueIndex; // We would have a "+ 1" here - but the "auto-increment" in the for statement will get us past this index
	}

	int[] result = new int[P.Length];
	for (int outputIndex = 0; outputIndex < result.Length; ++outputIndex)
	{
		result[outputIndex] = minRangeTable[P[outputIndex], Q[outputIndex]];
		if (result[outputIndex] == 0) // We could avoid this if we initialized our 2-d array with 1's
			result[outputIndex] = 1;
	}

	return result;
}

In pshemek's answer - the "trick" in the second loop is simply that once you've determined you've found a range with the minimal value - you don't need to continue iterating. Not sure if that helps.

Solution 17 - Java

In ruby (100/100)

def interval_sum x,y,p
	p[y+1] - p[x]
end

def solution(s,p,q)

	#Hash of arrays with prefix sums

	p_sums = {}
	respuesta = []


	%w(A C G T).each do |letter|
		p_sums[letter] = Array.new s.size+1, 0
	end 

	(0...s.size).each do |count|
		%w(A C G T).each do |letter|
			p_sums[letter][count+1] = p_sums[letter][count] 
		end if count > 0

		case s[count]
		when 'A'
			p_sums['A'][count+1] += 1
		when 'C'
			p_sums['C'][count+1] += 1
		when 'G'
			p_sums['G'][count+1] += 1
		when 'T'
			p_sums['T'][count+1] += 1
		end	

	end




	(0...p.size).each do |count|


	  	x = p[count]
		y = q[count]

			
		if interval_sum(x, y, p_sums['A']) > 0 then
			respuesta << 1
			next
		end	

		if interval_sum(x, y, p_sums['C']) > 0 then
			respuesta << 2
			next
		end	

		if interval_sum(x, y, p_sums['G']) > 0 then
			respuesta << 3
			next
		end	

		if interval_sum(x, y, p_sums['T']) > 0 then
			respuesta << 4
			next
		end	

	end

	respuesta

end

Solution 18 - Java

The php 100/100 solution:

function solution($S, $P, $Q) {
    $S      = str_split($S);
    $len    = count($S);
    $lep    = count($P);
    $arr    = array();
    $result = array();
    $clone  = array_fill(0, 4, 0);
    for($i = 0; $i < $len; $i++){
        $arr[$i] = $clone;
        switch($S[$i]){
            case 'A':
                $arr[$i][0] = 1;
                break;
            case 'C':
                $arr[$i][1] = 1;
                break;
            case 'G':
                $arr[$i][2] = 1;
                break;
            default:
                $arr[$i][3] = 1;
                break;
        }
    }
    for($i = 1; $i < $len; $i++){
        for($j = 0; $j < 4; $j++){
            $arr[$i][$j] += $arr[$i - 1][$j];
        }
    }
    for($i = 0; $i < $lep; $i++){
        $x = $P[$i];
        $y = $Q[$i];
        for($a = 0; $a < 4; $a++){
            $sub = 0;
            if($x - 1 >= 0){
                $sub = $arr[$x - 1][$a];
            }
            if($arr[$y][$a] - $sub > 0){
                $result[$i] = $a + 1;
                break;
            }
        }
    }
    return $result;
}

Solution 19 - Java

This program has got score 100 and performance wise has got an edge over other java codes listed above!

The code can be found here.

public class GenomicRange {

final int Index_A=0, Index_C=1, Index_G=2, Index_T=3;
final int A=1, C=2, G=3, T=4; 

public static void main(String[] args) {

	GenomicRange gen = new GenomicRange();
	int[] M = gen.solution( "GACACCATA", new int[] { 0,0,4,7 } , new int[] { 8,2,5,7 } );
	System.out.println(Arrays.toString(M));
} 

public int[] solution(String S, int[] P, int[] Q) {

	int[] M = new int[P.length];
	char[] charArr = S.toCharArray();
	int[][] occCount = new int[3][S.length()+1];

	int charInd = getChar(charArr[0]);
	
	if(charInd!=3) {
		occCount[charInd][1]++;
	}
			
	for(int sInd=1; sInd<S.length(); sInd++) {

		charInd = getChar(charArr[sInd]);
		
		if(charInd!=3)
			occCount[charInd][sInd+1]++;
		
		occCount[Index_A][sInd+1]+=occCount[Index_A][sInd];
		occCount[Index_C][sInd+1]+=occCount[Index_C][sInd];
		occCount[Index_G][sInd+1]+=occCount[Index_G][sInd];
	}

	for(int i=0;i<P.length;i++) {
		
		int a,c,g;
		
		if(Q[i]+1>=occCount[0].length) continue;
		
		a =  occCount[Index_A][Q[i]+1] - occCount[Index_A][P[i]];
		c =  occCount[Index_C][Q[i]+1] - occCount[Index_C][P[i]];
		g =  occCount[Index_G][Q[i]+1] - occCount[Index_G][P[i]];
		
		M[i] = a>0? A : c>0 ? C : g>0 ? G : T;    
	}

	return M;
}

private int getChar(char c) {

	return ((c=='A') ? Index_A : ((c=='C') ? Index_C : ((c=='G') ? Index_G : Index_T)));  
}
}

Solution 20 - Java

Here's a simple javascript solution which got 100%.

function solution(S, P, Q) {
    var A = [];
    var C = [];
    var G = [];
    var T = [];
    var result = [];
    var i = 0;
    
    S.split('').forEach(function(a) {
        if (a === 'A') {
            A.push(i);
        } else if (a === 'C') {
            C.push(i);
        } else if (a === 'G') {
            G.push(i);
        } else {
            T.push(i);
        }
        
        i++;
    });
    
    function hasNucl(typeArray, start, end) {
        return typeArray.some(function(a) {
            return a >= P[j] && a <= Q[j];
        });
    }
    
    for(var j=0; j<P.length; j++) {
        if (hasNucl(A, P[j], P[j])) {
            result.push(1)
        } else if (hasNucl(C, P[j], P[j])) {
            result.push(2);
        } else if (hasNucl(G, P[j], P[j])) {
            result.push(3);
        } else {
            result.push(4);
        }
    }
    
    return result;
}

Solution 21 - Java

perl 100/100 solution:

sub solution {
    my ($S, $P, $Q)=@_; my @P=@$P; my @Q=@$Q;
    
    my @_A = (0), @_C = (0), @_G = (0), @ret =();
    foreach (split //, $S)
    {
        push @_A, $_A[-1] + ($_ eq 'A' ? 1 : 0);
        push @_C, $_C[-1] + ($_ eq 'C' ? 1 : 0);
        push @_G, $_G[-1] + ($_ eq 'G' ? 1 : 0);
    }
    
    foreach my $i (0..$#P)
    {
        my $from_index = $P[$i];
        my $to_index = $Q[$i] + 1;
        if ( $_A[$to_index] - $_A[$from_index] > 0 )
        {
            push @ret, 1;
            next;
        }
        if ( $_C[$to_index] - $_C[$from_index] > 0 )
        {
            push @ret, 2;
            next;
        }
        if ( $_G[$to_index] - $_G[$from_index] > 0 )
        {
            push @ret, 3;
            next;
        }
        push @ret, 4
    }
    
    return @ret;
}

Solution 22 - Java

Java 100/100

class Solution {
public int[] solution(String S, int[] P, int[] Q) {
	int     qSize       = Q.length;
	int[]   answers     = new int[qSize];
	
	char[]  sequence    = S.toCharArray();
	int[][] occCount 	= new int[3][sequence.length+1];
	
	int[] geneImpactMap = new int['G'+1];
	geneImpactMap['A'] = 0;
	geneImpactMap['C'] = 1;
	geneImpactMap['G'] = 2;

	if(sequence[0] != 'T') {
		occCount[geneImpactMap[sequence[0]]][0]++;
	}
	
	for(int i = 0; i < sequence.length; i++) {
		occCount[0][i+1] = occCount[0][i];
		occCount[1][i+1] = occCount[1][i];
		occCount[2][i+1] = occCount[2][i];
		
		if(sequence[i] != 'T') {
			occCount[geneImpactMap[sequence[i]]][i+1]++;
		}
	}

	for(int j = 0; j < qSize; j++) {
		for(int k = 0; k < 3; k++) {
			if(occCount[k][Q[j]+1] - occCount[k][P[j]] > 0) {
				answers[j] = k+1;
				break;
			}
			
			answers[j] = 4;
		}			 
	}
	
	return answers;
}
} 

Solution 23 - Java

simple php 100/100 solution

function solution($S, $P, $Q) {
    $result = array();
    for ($i = 0; $i < count($P); $i++) {
        $from = $P[$i];
        $to = $Q[$i];
        $length = $from >= $to ? $from - $to + 1 : $to - $from + 1;
        $new = substr($S, $from, $length);
        
        if (strpos($new, 'A') !== false) {
            $result[$i] = 1;
        } else {
            if (strpos($new, 'C') !== false) {
                $result[$i] = 2;
            } else {
                if (strpos($new, 'G') !== false) {
                    $result[$i] = 3;
                } else {
                   $result[$i] = 4;
                }
            }
        }
    }
    return $result;
}

Solution 24 - Java

Here's my Java (100/100) Solution:

class Solution {
    private ImpactFactorHolder[] mHolder;
    private static final int A=0,C=1,G=2,T=3;
    
    public int[] solution(String S, int[] P, int[] Q) { 
        mHolder = createImpactHolderArray(S);
        
        int queriesLength = P.length;
        int[] result = new int[queriesLength];
        
        for (int i = 0; i < queriesLength; ++i ) {
            int value = 0;
            if( P[i] == Q[i]) {
              value = lookupValueForIndex(S.charAt(P[i])) + 1;
            } else {
             value = calculateMinImpactFactor(P[i], Q[i]);
            }
            result[i] = value;
        }
        return result;    
        
    }
    
    public int calculateMinImpactFactor(int P, int Q) {
        int minImpactFactor = 3;
        
        for (int nucleotide = A; nucleotide <= T; ++nucleotide ) {
            int qValue = mHolder[nucleotide].mOcurrencesSum[Q];
            int pValue = mHolder[nucleotide].mOcurrencesSum[P];
            // handling special cases when the less value is assigned on the P index
            if( P-1 >= 0 ) {
                pValue = mHolder[nucleotide].mOcurrencesSum[P-1] == 0 ? 0 : pValue;
            } else if ( P == 0 ) {
                pValue = mHolder[nucleotide].mOcurrencesSum[P] == 1 ? 0 : pValue;
            }
            
            if ( qValue - pValue > 0) {
                minImpactFactor = nucleotide;
                break;
            } 
        }        
        return minImpactFactor + 1;
    } 
    
    public int lookupValueForIndex(char nucleotide) {
        int value = 0;
        switch (nucleotide) {
            case 'A' :
                    value = A;
                    break;
                case 'C' :
                    value = C;
                    break;
                case 'G':
                   value = G;
                    break;
                case 'T':
                    value = T;
                    break;
                default:                    
                    break;
        }
        return value;
    }
    
    public ImpactFactorHolder[] createImpactHolderArray(String S) {
        int length = S.length();
        ImpactFactorHolder[] holder = new ImpactFactorHolder[4];
        holder[A] = new ImpactFactorHolder(1,'A', length);
        holder[C] = new ImpactFactorHolder(2,'C', length);
        holder[G] = new ImpactFactorHolder(3,'G', length);
        holder[T] = new ImpactFactorHolder(4,'T', length);
        int i =0;
        for(char c : S.toCharArray()) {
            int nucleotide = lookupValueForIndex(c);
            ++holder[nucleotide].mAcum;
            holder[nucleotide].mOcurrencesSum[i] = holder[nucleotide].mAcum;  
            holder[A].mOcurrencesSum[i] = holder[A].mAcum;
            holder[C].mOcurrencesSum[i] = holder[C].mAcum;
            holder[G].mOcurrencesSum[i] = holder[G].mAcum;
            holder[T].mOcurrencesSum[i] = holder[T].mAcum;
            ++i;
        }
        
        return holder;
    }
    
    private static class ImpactFactorHolder {
        public ImpactFactorHolder(int impactFactor, char nucleotide, int length) {
            mImpactFactor = impactFactor;
            mNucleotide = nucleotide;
            mOcurrencesSum = new int[length];
            mAcum = 0;
        }
        int mImpactFactor;
        char mNucleotide;
        int[] mOcurrencesSum;
        int mAcum;
    }
}

Link: https://codility.com/demo/results/demoJFB5EV-EG8/ I'm looking forward to implement a Segment Tree similar to @Abhishek Kumar solution

Solution 25 - Java

My C++ solution

vector<int> solution(string &S, vector<int> &P, vector<int> &Q) {

    vector<int> impactCount_A(S.size()+1, 0);
    vector<int> impactCount_C(S.size()+1, 0);
    vector<int> impactCount_G(S.size()+1, 0);

    int lastTotal_A = 0;
    int lastTotal_C = 0;
    int lastTotal_G = 0;
    for (int i = (signed)S.size()-1; i >= 0; --i) {
        switch(S[i]) {
            case 'A':
                ++lastTotal_A;
                break;
            case 'C':
                ++lastTotal_C;
                break;
            case 'G':
                ++lastTotal_G;
                break;
        };
    
        impactCount_A[i] = lastTotal_A;
        impactCount_C[i] = lastTotal_C;
        impactCount_G[i] = lastTotal_G;
    }

    vector<int> results(P.size(), 0);

    for (int i = 0; i < P.size(); ++i) {
        int pIndex = P[i];
        int qIndex = Q[i];
    
        int numA = impactCount_A[pIndex]-impactCount_A[qIndex+1];
        int numC = impactCount_C[pIndex]-impactCount_C[qIndex+1];
        int numG = impactCount_G[pIndex]-impactCount_G[qIndex+1];
    
        if (numA > 0) {
            results[i] = 1;
        }
        else if (numC > 0) {
            results[i] = 2;
        }
        else if (numG > 0) {
            results[i] = 3;
        }
        else {
            results[i] = 4;
        }
    }

    return results;
}

Solution 26 - Java

/* 100/100 solution C++. Using prefix sums. Firstly converting chars to integer in nuc variable. Then in a bi-dimensional vector we account the occurrence in S of each nucleoside x in it's respective prefix_sum[s][x]. After we just have to find out the lower nucluoside that occurred in each interval K.

*/ . vector solution(string &S, vector &P, vector &Q) {

int n=S.size();
int m=P.size();
vector<vector<int> > prefix_sum(n+1,vector<int>(4,0));
int nuc;

//prefix occurrence sum
for (int s=0;s<n; s++) {
	nuc = S.at(s) == 'A' ? 1 : (S.at(s) == 'C' ? 2 : (S.at(s) == 'G' ? 3 : 4) );		
	for (int u=0;u<4;u++) {
		prefix_sum[s+1][u] = prefix_sum[s][u] + ((u+1)==nuc?1:0);
	}
}

//find minimal impact factor in each interval K
int lower_impact_factor;

for (int k=0;k<m;k++) {

	lower_impact_factor=4;
	for (int u=2;u>=0;u--) {
		if (prefix_sum[Q[k]+1][u] - prefix_sum[P[k]][u] != 0)
			lower_impact_factor = u+1;
	}
	P[k]=lower_impact_factor;
}

return P;

}

Solution 27 - Java

   static public int[] solution(String S, int[] P, int[] Q) {
    // write your code in Java SE 8

    int A[] = new int[S.length() + 1], C[] = new int[S.length() + 1], G[] = new int[S.length() + 1];

    int last_a = 0, last_c = 0, last_g = 0;

    int results[] = new int[P.length];
    int p = 0, q = 0;
    for (int i = S.length() - 1; i >= 0; i -= 1) {
        switch (S.charAt(i)) {
            case 'A': {
                last_a += 1;
                break;
            }
            case 'C': {
                last_c += 1;
                break;
            }

            case 'G': {
                last_g += 1;
                break;
            }

        }
        A[i] = last_a;
        G[i] = last_g;
        C[i] = last_c;
    }


    for (int i = 0; i < P.length; i++) {
        p = P[i];
        q = Q[i];

        if (A[p] - A[q + 1] > 0) {
            results[i] = 1;
        } else if (C[p] - C[q + 1] > 0) {
            results[i] = 2;
        } else if (G[p] - G[q + 1] > 0) {
            results[i] = 3;
        } else {
            results[i] = 4;
        }

    }
    return results;
}

Solution 28 - Java

I know a lot more out there but that's my answer it got 100/100 .. I hope it is a bit readable

class GenomeCounter
{

    public char GenomeCode { get; private set; }
    public int Value { get; private set; }
    private List<long> CountFromStart;
    private long currentCount;

    public GenomeCounter(char genomeCode, int value)
    {
        CountFromStart = new List<long>();
        GenomeCode = genomeCode;
        currentCount = 0;
        Value = value;
    }
    public void AddCounter()
    {
        CountFromStart.Add(currentCount);
    }
    public void Increment()
    {
        currentCount++;
        Touch();
    }

    public long GetCountAt(int i)
    {
        return CountFromStart[i];
    }
}

class Solution
{
    static private readonly Dictionary<char, int> genomes = new Dictionary<char, int>{
        {  'A',1 },
        { 'C',2 },
        { 'G',3 },
        {'T',4}
        };
    private Dictionary<char, GenomeCounter> GenomeCounters;
    public Solution()
    {
        GenomeCounters = new Dictionary<char, GenomeCounter>();

        foreach (var genome in genomes)
        {
            GenomeCounters[genome.Key] = new GenomeCounter(genome.Key, genome.Value);
        }

    }

    
    private int GetMinBetween(string S, int First, int Last)
    {
        if (First > Last) throw new Exception("Wrong Input");
        int min = GenomeCounters[S[First]].Value;
        foreach (var genomeCount in GenomeCounters)
        {
            if (genomeCount.Value.GetCountAt(First) < (genomeCount.Value.GetCountAt(Last)))
            {
                if (min > genomeCount.Value.Value) min = genomeCount.Value.Value;
            }
        }
        return min;

    }
    private void CalculateTotalCount(string S)
    {

        for (var i = 0; i < S.Length; i++)
        {
            foreach (var genome in GenomeCounters)
            {
                if (genome.Key == S[i]) genome.Value.Increment();
                else genome.Value.AddCounter();
            }
        }

    }
    public int[] solution(string S, int[] P, int[] Q)
    {
        // write your code in C# 6.0 with .NET 4.5 (Mono)
        int M = P.Length;
        int N = S.Length;
        List<int> Mins = new List<int>();
    
        CalculateTotalCount(S);
        for (int i = 0; i < M; i++)
        {
            Mins.Add(GetMinBetween(S, P[i], Q[i]));
        }
        return Mins.ToArray();
    }
}

Solution 29 - Java

scala solution 100/100

import scala.annotation.switch
import scala.collection.mutable

object Solution {
  def solution(s: String, p: Array[Int], q: Array[Int]): Array[Int] = {

    val n = s.length

    def arr = mutable.ArrayBuffer.fill(n + 1)(0L)

    val a = arr
    val c = arr
    val g = arr
    val t = arr

    for (i <- 1 to n) {
      def inc(z: mutable.ArrayBuffer[Long]): Unit = z(i) = z(i - 1) + 1L

      def shift(z: mutable.ArrayBuffer[Long]): Unit = z(i) = z(i - 1)

      val char = s(i - 1)
      (char: @switch) match {
        case 'A' => inc(a); shift(c); shift(g); shift(t);
        case 'C' => shift(a); inc(c); shift(g); shift(t);
        case 'G' => shift(a); shift(c); inc(g); shift(t);
        case 'T' => shift(a); shift(c); shift(g); inc(t);
      }
    }

    val r = mutable.ArrayBuffer.fill(p.length)(0)

    for (i <- p.indices) {
      val start = p(i)
      val end = q(i) + 1
      r(i) =
        if (a(start) != a(end)) 1
        else if (c(start) != c(end)) 2
        else if (g(start) != g(end)) 3
        else if (t(start) != t(end)) 4
        else 0
    }

    r.toArray
  }
}

Solution 30 - Java

I think I'm using dynamic programming. Here is my solution. Little space. Code is mot really clean, just show my idea.

class Solution {
public int[] solution(String S, int[] P, int[] Q) {
    int[] preDominator = new int[S.length()];
    int A = -1;
    int C = -1;
    int G = -1;
    int T = -1;
    
    for (int i = 0; i < S.length(); i++) {
        char c = S.charAt(i);
        if (c == 'A') { 
            A = i;
            preDominator[i] = -1;
        } else if (c == 'C') {
            C = i;
            preDominator[i] = A;
        } else if (c == 'G') {
            G = i;
            preDominator[i] = Math.max(A, C);
        } else {
            T = i;
            preDominator[i] = Math.max(Math.max(A, C), G);
        }
    }
    
    int N = preDominator.length;
    int M = Q.length;
    int[] result = new int[M];
    for (int i = 0; i < M; i++) {
        int p = P[i];
        int q = Math.min(N, Q[i]);
        for (int j = q;;) {
            if (preDominator[j] < p) {
                char c = S.charAt(j);
                if (c == 'A') {
                    result[i] = 1;
                } else if (c == 'C') {
                    result[i] = 2;
                } else if (c == 'G') {
                    result[i] = 3;
                } else {
                    result[i] = 4;
                }
                break;
            }
            j = preDominator[j];
        }
    }
    return result;
}

}

Solution 31 - Java

I implemented a Segment Tree solution in Kotlin

import kotlin.math.*
  
fun solution(S: String, P: IntArray, Q: IntArray): IntArray {

    val a = IntArray(S.length)
    for (i in S.indices) {
        a[i] = when (S[i]) {
            'A' -> 1
            'C' -> 2
            'G' -> 3
            'T' -> 4
            else -> throw IllegalStateException()
        }
    }
    
    val segmentTree = IntArray(2*nextPowerOfTwo(S.length)-1)
    constructSegmentTree(a, segmentTree, 0, a.size-1, 0)
    
    val result = IntArray(P.size)
    for (i in P.indices) {
        result[i] = rangeMinQuery(segmentTree, P[i], Q[i], 0, a.size-1, 0)
    }
    return result
}

fun constructSegmentTree(input: IntArray, segmentTree: IntArray,  low: Int,  high: Int,  pos: Int) {

    if (low == high) {
        segmentTree[pos] = input[low]
        return
    }
    val mid = (low + high)/2
    constructSegmentTree(input, segmentTree, low, mid, 2*pos+1)
    constructSegmentTree(input, segmentTree, mid+1, high, 2*pos+2)
    segmentTree[pos] = min(segmentTree[2*pos+1], segmentTree[2*pos+2])
}

fun rangeMinQuery(segmentTree: IntArray, qlow:Int, qhigh:Int ,low:Int, high:Int, pos:Int): Int {

    if (qlow <= low && qhigh >= high) {
        return segmentTree[pos]
    }
    if (qlow > high || qhigh < low) {
        return Int.MAX_VALUE
    }
    val mid = (low + high)/2
    return min(rangeMinQuery(segmentTree, qlow, qhigh, low, mid, 2*pos+1), rangeMinQuery(segmentTree, qlow, qhigh, mid+1, high, 2*pos+2))
}

fun nextPowerOfTwo(n:Int): Int {
    var count = 0
    var number = n
    if (number > 0 && (number and (number - 1)) == 0) return number
    while (number != 0) {
        number = number shr 1
        count++
    }
    return 1 shl count
}

Solution 32 - Java

Here is python solution with little explanation hope it helps some one. Python codility 100%

def solution(S, P, Q):
"""
https://app.codility.com/demo/results/training8QBVFJ-EQB/
100%

Idea is consider solution as single dimensional array and use concept of prefix some ie.
stores the value in array for p,c and g based on frequency
array stores the frequency of p,c and g for all positions
Example -
    # [0, 0, 1, 1, 1, 1, 1, 2] - prefix some of A - represents the max occurrence of A as 2 in array
    # [0, 1, 1, 1, 2, 3, 3, 3] - prefix some of C - represents the max occurrence of A as 3 in array
    # [0, 0, 0, 1, 1, 1, 1, 1] - prefix some of G - represents the max occurrence of A as 1 in array

# To find the query answers we can just use prefix some and find the distance between position

S = CAGCCTA

P[0] = 2    Q[0] = 4
P[1] = 5    Q[1] = 5
P[2] = 0    Q[2] = 6

Given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting
of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries.

The part of the DNA between positions 2 and 4 contains nucleotide G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2.
The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4.
The part between positions 0 and 6 (the whole string) contains all nucleotide, in particular nucleotide A whose impact factor is 1, so the answer is 1.

   N is an integer within the range [1..100,000];
   M is an integer within the range [1..50,000];
   each element of arrays P, Q is an integer within the range [0..N − 1];
   P[K] ≤ Q[K], where 0 ≤ K < M;
   string S consists only of upper-case English letters A, C, G, T.

Ref - https://github.com/ghanan94/codility-lesson-solutions/blob/master/Lesson%2005%20-%20Prefix%20Sums/PrefixSums.pdf

:return:  return the values [2, 4, 1]
   """
# two d array - column size is 3 for a,c,g - not taking size 4 since that will be part of else ie. don`t need to calculate
# row size is the length of DNA sequence
prefix_sum_two_d_array = [[0 for i in range(len(S) + 1)] for j in range(3)]
# find the prefix some of all nucleotide in given sequence
for i, nucleotide in enumerate(S):
    # store prefix some of each
    # nucleotide == 'A -> 1 if true 0 if false
    # [0, 0, 1, 1, 1, 1, 1, 2] - prefix some of A - represents the max occurrence of A as 2 in array
    prefix_sum_two_d_array[0][i + 1] = prefix_sum_two_d_array[0][i] + (nucleotide == 'A')
    # store prefix some of c
    # [0, 1, 1, 1, 2, 3, 3, 3] - prefix some of C - represents the max occurrence of A as 3 in array
    prefix_sum_two_d_array[1][i + 1] = prefix_sum_two_d_array[1][i] + (nucleotide == 'C')
    # store prefix some of g
    # [0, 0, 0, 1, 1, 1, 1, 1] - prefix some of G - represents the max occurrence of A as 1 in array
    prefix_sum_two_d_array[2][i + 1] = prefix_sum_two_d_array[2][i] + (nucleotide == 'G')

#print(prefix_sum_two_d_array)

# now to find the query answers we can just use prefix some and find the distance between position
query_answers = []
for position in range(len(P)):
    # for each query of p
    # find the start index from p
    start_index = P[position]
    # find the end index from Q
    end_index = Q[position] + 1
    # find the value from prefix some array - just subtract end index and start index to find the value
    if prefix_sum_two_d_array[0][end_index] - prefix_sum_two_d_array[0][start_index]:
        query_answers.append(1)
    elif prefix_sum_two_d_array[1][end_index] - prefix_sum_two_d_array[1][start_index]:
        query_answers.append(2)
    elif prefix_sum_two_d_array[2][end_index] - prefix_sum_two_d_array[2][start_index]:
        query_answers.append(3)
    else:
        query_answers.append(4)

return query_answers


result = solution("CAGCCTA", [2, 5, 0], [4, 5, 6])
print("Sol " + str(result))
# Sol [2, 4, 1]

Solution 33 - Java

This solution got also 100 out of 100 points. Its time complexity is O(n + m). In the first step it calculates the prefix sums of all nucleotides in the sequence s for all nucleotides in NUCLEOTIDES. This is 4n ∈ O(n) where n is the length of the sequence s. In the second step it searches for the lowest impact factor of each given range defined by p[i] and q[i]. Therefore it calculates the prefix sum delta of each nucleotide starting with the lowest impact factor. Whenever the delta is greater than zero the nucleotide exists in the given range. The complexity for the second step is 4m ∈ O(m) where m is the number of given ranges.

I used some collections for the nucleotides and their impact factors. As such the algorithm is more generic and might be more readable.

import java.util.*;

class Solution {

    private static final Map<Character, Integer> FACTORS = new LinkedHashMap<Character, Integer>(){{
        put('A', 1);
        put('C', 2);
        put('G', 3);
        put('T', 4);
    }};

    private static final Map<Character, Integer> INDEXES;
    
    private static final Character[] NUCLEOTIDES;

    static {
        // calculate the indexes of the impact factors
        INDEXES = new HashMap<Character, Integer>(){{
            int i = 0;
            // assumes the factors are sorted ascending
            for (char c : FACTORS.keySet()) {
                put(c, i);
                i++;
            }
        }};
        // cache the factors
        NUCLEOTIDES = FACTORS.keySet().toArray(new Character[FACTORS.size()]);
    }

    public int[] solution(String s, int[] p, int[] q) {
        final int n = s.length();
        final int m = p.length;
        final int l = FACTORS.size();
        // init the table for the prefix sums
        final int[][] t = new int[n][];
        final int[] r = new int[l];
        // init the result
        final int[] result = new int[m];
        // calculate the table with the prefix sums
        for (int i = 0; i < n; i++) {
            final char c = s.charAt(i);

            r[INDEXES.get(c)]++;
            t[i] = r.clone();
        }
        // search for the lowest impact factor
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < l; j++) {
                if (t[q[i]][j] - (p[i] > 0 ? t[p[i] - 1][j] : 0) > 0) {
                    result[i] = FACTORS.get(NUCLEOTIDES[j]);
                    break;
                }
            }
        }
        return result;
    }
} 

Solution 34 - Java

Swift solution, it could be shorter and use a dictionary instead of 4 different arrays for sure but made it this way to make it a little bit more clear

public func solution(_ S : inout String, _ P : inout [Int], _ Q : inout [Int]) -> [Int] {
        
        var arrayS: [Character] = Array(S)
        var minimunImpactFactorsArray: [Int] = []

        var prefixSumsA: [Int] = Array(repeating: 0, count: S.count + 1)
        var prefixSumsC: [Int] = Array(repeating: 0, count: S.count + 1)
        var prefixSumsG: [Int] = Array(repeating: 0, count: S.count + 1)
        var prefixSumsT: [Int] = Array(repeating: 0, count: S.count + 1)
        
        var a: Int = 0
        var c: Int = 0
        var g: Int = 0
        var t: Int = 0
        
        for (k, letter) in S.enumerated() {
            a = 0
            c = 0
            g = 0
            t = 0
            
            switch letter {
            case "A":
                a = 1
            case "C":
                c = 1
            case "G":
                g = 1
            case "T":
                t = 1
            default:
                break
            }
            
            prefixSumsA[k+1] = prefixSumsA[k] + a
            prefixSumsC[k+1] = prefixSumsC[k] + c
            prefixSumsG[k+1] = prefixSumsG[k] + g
            prefixSumsT[k+1] = prefixSumsT[k] + t
        }

        func sum(p: [Int], x: Int, y: Int) -> Int {
            return p[y+1] - p[x]
        }
        
        for (index, pk) in P.enumerated() {
            let qk = Q[index]
            
            if pk == qk {
                switch arrayS[pk] {
                case "A":
                    minimunImpactFactorsArray.append(1)
                case "C":
                    minimunImpactFactorsArray.append(2)
                case "G":
                    minimunImpactFactorsArray.append(3)
                case "T":
                    minimunImpactFactorsArray.append(4)
                default:
                    break
                }
                
                continue
            }
            
            if (sum(p: prefixSumsA, x: pk, y: qk)) > 0 {
                minimunImpactFactorsArray.append(1)
            } else if (sum(p: prefixSumsC, x: pk, y: qk)) > 0 {
                minimunImpactFactorsArray.append(2)
            } else if (sum(p: prefixSumsG, x: pk, y: qk)) > 0 {
                minimunImpactFactorsArray.append(3)
            } else {
                minimunImpactFactorsArray.append(4)
            }
        }
        
        return minimunImpactFactorsArray
    
    }

Solution 35 - Java

Here is a C++ solution:

#include <algorithm>

std::vector<int> solution(std::string &S, std::vector<int> &P, std::vector<int> &Q)
{
    std::vector<int> genA(S.size()+1, 0);
    std::vector<int> genC(S.size()+1, 0);
    std::vector<int> genG(S.size()+1, 0);

    for (size_t i = 0; i < S.size(); i++) 
    {
        int a = 0; 
        int c = 0; 
        int g = 0;
        
        if ('A' == S[i]) 
        {
            a = 1;
        }
        else if ('C' == S[i]) 
        {
            c = 1;
        }
        else if ('G' == S[i]) 
        {
            g = 1;
        }
           
        genA[i+1] = genA[i] + a;
        genC[i+1] = genC[i] + c;
        genG[i+1] = genG[i] + g;
    }

    std::vector<int> res(P.size());
    
    for (size_t i = 0; i < P.size(); i++) 
    {
        int ip = P[i];
        int iq = Q[i] + 1;
        
        if (genA[iq] - genA[ip] > 0) 
        {
            res[i] = 1;
        } 
        else if (genC[iq] - genC[ip] > 0) 
        {
            res[i] = 2;
        } 
        else if (genG[iq] - genG[ip] > 0) 
        {
            res[i] = 3;
        } 
        else 
        {
            res[i] = 4;
        }
    }

    return res;
}

Solution 36 - Java

Here is a simple 100/100 solution in Javascript

function solution(S, P, Q) {
    const len = S.length
    const M = P.length
    let result = []

    for (let i=0;i<M;i++) {
        const queryString = S.substring(P[i], Q[i]+1)
        if (queryString.includes('A')) {
            result.push(1)
        }
        else if(queryString.includes('C')) {
            result.push(2)
        }
        else if(queryString.includes('G')) {
            result.push(3)
        }
        else {
            result.push(4)
        }
    }
    return result
}

Solution 37 - Java

Javascript solution 100/100

function solution(S, P, Q) {
    const values = { A: 1, C: 2, G: 3, T: 4};
    const keys = Object.keys(values)
    const result = []
    for (i = 0; i < P.length; i++) {
        const string = S.slice(P[i], Q[i] + 1)
        let min;
        for (j = 0; j < keys.length; j++) {
            if (string.includes(keys[j])) {
                min = values[keys[j]];
                break;
            }
        }
        result.push(min)
    }
    return result
}

Solution 38 - Java

Based on the solutions answered by many users, witch where very helpful and I thank you all, I could make using streams with 100/100 !!!

Map<Character, Integer> map = new HashMap<>();
    map.put('A', 0);
    map.put('C', 1);
    map.put('G', 2);
    map.put('T', 3);
    int[][] arr = IntStream.range(0, S.length()).collect(() -> new int[S.length()][4], (acc, i) -> {
        acc[i][map.get(S.charAt(i))] += 1;
        if (i + 1 != S.length()) {
            acc[i + 1][0] += acc[i][0];
            acc[i + 1][1] += acc[i][1];
            acc[i + 1][2] += acc[i][2];
            acc[i + 1][3] += acc[i][3];
        }
    }, Arrays::fill);
    return IntStream.range(0, P.length)
            .map(i -> {
                return 1 + IntStream.range(0, 4).filter(j -> arr[Q[i]][j] - (P[i] > 0 ? arr[P[i] - 1][j] : 0) > 0).min().orElse(-1);
            }).toArray();

Solution 39 - Java

My solution using lambdas

 public int[] solution(String S, int[] P, int[] Q) {
    CharSequence charSequence;
    int length = P.length;
    int[] result = new int[length];
    for (int i=0; i<length; i++){
        int left = P[i];
        int right = Q[i]+1;
        charSequence = S.subSequence(left, right);
        String[] segment = charSequence.toString().split("");

        int countA, countC, countG, countT = 0;
        countA = (int) Arrays.stream(segment).filter(s -> s.equals("A")).count();
        countC = (int) Arrays.stream(segment).filter(s -> s.equals("C")).count();
        countG = (int) Arrays.stream(segment).filter(s -> s.equals("G")).count();
        countT = (int) Arrays.stream(segment).filter(s -> s.equals("T")).count();

        int min = 0;
        if(countA>0){
            min = 1;
        }else if(countC>0){
            min = 2;
        }else if(countG>0){
            min = 3;
        }else if(countT>0){
            min = 4;
        }
        result[i] = min;

    }
    return result;
}

Solution 40 - Java

My 100/100 solution in Go.

func DnaSeqSolution(S string, P []int, Q []int) []int {

	sol := make([]int, len(P), len(P))

	prefixSum := make([][]int, 3, 3)
	for i := 0; i < 3; i++ {
		prefixSum[i] = make([]int, len(S)+1, len(S)+1)
	}

	// calculate prefix sum Arr for A,C,G
	for i := 1; i <= len(S); i++ {
		ch := S[i-1]
		var a, c, g int
		if ch == 'A' {
			a = 1
		}
		if ch == 'C' {
			c = 1
		}
		if ch == 'G' {
			g = 1
		}

		prefixSum[0][i] = prefixSum[0][i-1] + a
		prefixSum[1][i] = prefixSum[1][i-1] + c
		prefixSum[2][i] = prefixSum[2][i-1] + g
	}

	// Figure out whether A,C or G is present in between the indices or not
	for i := 0; i < len(P); i++ {
		end := Q[i] + 1
		beg := P[i]

		if prefixSum[0][end]-prefixSum[0][beg] > 0 {
			sol[i] = 1
			continue
		}
		if prefixSum[1][end]-prefixSum[1][beg] > 0 {
			sol[i] = 2
			continue
		}
		if prefixSum[2][end]-prefixSum[2][beg] > 0{
			sol[i] = 3
			continue
		}
		sol[i] = 4
	}

	return sol

}

Solution 41 - Java

 public static int[] solution(String S, int[] P, int[] Q) {
		 
		 HashMap<String, Integer> hm= new HashMap<String, Integer>();
		 hm.put("A", 1);hm.put("C", 2);hm.put("G", 3);hm.put("T", 4);
		 
		 char[] arr=S.toCharArray();
		 List<Integer> tempList=new ArrayList<Integer>();
		 
		 for(int i=0;i<=Q.length-1;i++) {
			 int minVal=Integer.MAX_VALUE;
			 int minRange = P[i];
			 int maxRange = Q[i];
			 for(int j=minRange;j<=maxRange;j++) {
				 String valueOf = String.valueOf(arr[j]);
				if(hm.containsKey(valueOf)) {
					 if(hm.get(valueOf)<minVal) {
						 minVal=hm.get(valueOf);
					 }
				 }
			 }
			 tempList.add(minVal);
		 }
		 return tempList.stream().mapToInt(x->x.intValue()).toArray();
		 
	    }

Solution 42 - Java

My 100% JavaScript solution with O(N + M) time complexity and no use of advanced built-in methods such as .includes, .substring, etc:

function solution(S, P, Q) {

    // initialize prefix sums for A, C, G (you don't need T)
    const A = [0];
    const C = [0];
    const G = [0];

    // calculate prefix sums for A, C, G
    for (let i = 0, len = S.length; i < len; i++) {
        A.push(A[i] + Number("A" === S[i]));
        C.push(C[i] + Number("C" === S[i]));
        G.push(G[i] + Number("G" === S[i]));
    }

    // calculate the result using prefix sums
    const result = [];
    for (let i = 0, len = P.length; i < len; i++) {
        const from = P[i];
        const to = Q[i] + 1;
        if (A[to] - A[from] > 0) {
            result.push(1);
        } else if (C[to] - C[from] > 0) {
            result.push(2);
        } else if (G[to] - G[from] > 0) {
            result.push(3);
        } else {
            result.push(4); // this is why you don't need T
        }
    }
    return result;

}

Solution 43 - Java

this works for me

 #include <unordered_map>


 vector<int> solution(string &S, vector<int> &P, vector<int> &Q) {

    vector<int> r;
    
    unordered_map<char, vector<int>> acum;
    acum.insert(make_pair('A', vector<int>(S.length())));
    acum.insert(make_pair('C', vector<int>(S.length())));
    acum.insert(make_pair('G', vector<int>(S.length())));
    acum.insert(make_pair('T', vector<int>(S.length())));
        
    int a = 0, c = 0, g = 0, t = 0;
    
    for(unsigned int i=0; i < S.length(); i++){
        char ch = S.at(i);
        if(ch == 'C') c++;
        else if(ch == 'G') g++;
        else if(ch == 'T') t++;
        else if(ch == 'A') a++;
        acum.at('C')[i] = c;
        acum.at('G')[i] = g;
        acum.at('T')[i] = t;
        acum.at('A')[i] = a;
    }
    
    for(unsigned int i = 0; i < P.size(); i++){
        int init = P[i], end = Q[i];
        if(S.at(init) == 'A' || acum.at('A')[end] - acum.at('A')[init] > 0) r.emplace_back(1);
        else if(S.at(init) == 'C' ||acum.at('C')[end] - acum.at('C')[init] > 0) r.emplace_back(2);
        else if(S.at(init) == 'G' || acum.at('G')[end] - acum.at('G')[init] > 0) r.emplace_back(3);
        else if(S.at(init) == 'T' || acum.at('T')[end] - acum.at('T')[init] > 0) r.emplace_back(4);
    }
    
    return r;
    
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionpshemekView Question on Stackoverflow
Solution 1 - JavacodebustaView Answer on Stackoverflow
Solution 2 - JavadimaaanView Answer on Stackoverflow
Solution 3 - JavaSmolaView Answer on Stackoverflow
Solution 4 - JavapshemekView Answer on Stackoverflow
Solution 5 - JavaThiago NegriView Answer on Stackoverflow
Solution 6 - Java Abhishek KumarView Answer on Stackoverflow
Solution 7 - JavaTürkmen Mustafa DemirciView Answer on Stackoverflow
Solution 8 - JavaNickView Answer on Stackoverflow
Solution 9 - JavaViswanathView Answer on Stackoverflow
Solution 10 - JavaPedroView Answer on Stackoverflow
Solution 11 - Javamilos.aiView Answer on Stackoverflow
Solution 12 - JavaVishwasView Answer on Stackoverflow
Solution 13 - JavaFran SandiView Answer on Stackoverflow
Solution 14 - Javauser1854182View Answer on Stackoverflow
Solution 15 - JavaХристо УзуновView Answer on Stackoverflow
Solution 16 - JavaSrcXformView Answer on Stackoverflow
Solution 17 - JavaMaruccioView Answer on Stackoverflow
Solution 18 - JavaAlexander M.View Answer on Stackoverflow
Solution 19 - Javahrajesh4uView Answer on Stackoverflow
Solution 20 - JavancabralView Answer on Stackoverflow
Solution 21 - JavaMarcin WojciechowskiView Answer on Stackoverflow
Solution 22 - JavaRadu GanceaView Answer on Stackoverflow
Solution 23 - JavadontHaveNameView Answer on Stackoverflow
Solution 24 - JavamoxiView Answer on Stackoverflow
Solution 25 - JavacheshyView Answer on Stackoverflow
Solution 26 - JavaJose LadeiraView Answer on Stackoverflow
Solution 27 - JavaRemarioView Answer on Stackoverflow
Solution 28 - JavaIbrahim MagdyView Answer on Stackoverflow
Solution 29 - JavadyrkinView Answer on Stackoverflow
Solution 30 - JavaMo WangView Answer on Stackoverflow
Solution 31 - JavaDiego Marin SantosView Answer on Stackoverflow
Solution 32 - JavaDinakar Prasad MauryaView Answer on Stackoverflow
Solution 33 - JavawitrinView Answer on Stackoverflow
Solution 34 - JavaJefferson Vélez EscobarView Answer on Stackoverflow
Solution 35 - JavapaulsepoliaView Answer on Stackoverflow
Solution 36 - JavaDeveloperView Answer on Stackoverflow
Solution 37 - JavaSelvio PerezView Answer on Stackoverflow
Solution 38 - JavaGuillermo AbbonaView Answer on Stackoverflow
Solution 39 - JavaLeandro MaroView Answer on Stackoverflow
Solution 40 - JavaAmit BasuriView Answer on Stackoverflow
Solution 41 - JavaKunalView Answer on Stackoverflow
Solution 42 - JavanoviewpointView Answer on Stackoverflow
Solution 43 - JavaSalvatore GonzalezView Answer on Stackoverflow