Parsing one terabyte of text and efficiently counting the number of occurrences of each word

C#, Algorithm

C# Problem Overview


Recently I came across an interview question asking to create an algorithm in any language that should do the following:

  1. Read 1 terabyte of content
  2. Count the occurrences of each recurring word in that content
  3. List the top 10 most frequently occurring words

Could you let me know the best possible way to create an algorithm for this?

Edit:

OK, let's say the content is in English. How can we find the top 10 words that occur most frequently in that content? My other doubt is: if they deliberately feed us unique data, our buffer will overflow the heap. We need to handle that as well.

C# Solutions


Solution 1 - C#

Interview Answer

This task is interesting without being too complex, so it's a great way to start a good technical discussion. My plan for tackling this task would be:

  1. Split the input data into words, using whitespace and punctuation as delimiters
  2. Feed every word found into a Trie structure, with the counter updated in the node representing each word's last letter
  3. Traverse the fully populated tree to find nodes with highest counts

In the context of an interview ... I would demonstrate the idea of Trie by drawing the tree on a board or paper. Start from empty, then build the tree based on a single sentence containing at least one recurring word. Say "the cat can catch the mouse". Finally show how the tree can then be traversed to find highest counts. I would then justify how this tree provides good memory usage, good word lookup speed (especially in the case of natural language for which many words derive from each other), and is suitable for parallel processing.

Draw on the board

Draw the example trie

Demo

The C# program below goes through 2 GB of text in 75 seconds on a 4-core Xeon W3520, maxing out 8 threads. Performance is around 4.3 million words per second with less-than-optimal input parsing code. With the Trie structure to store words, memory is not an issue when processing natural language input.

Notes:

  • test text obtained from Project Gutenberg
  • input parsing code assumes line breaks and is pretty sub-optimal
  • removal of punctuation and other non-word characters is not done very well
  • handling one large file instead of several smaller ones would require a small amount of code to start the reader threads at specified offsets within the file.

using System;
using System.Collections.Generic;
using System.Collections.Concurrent;
using System.IO;
using System.Threading;

namespace WordCount
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Counting words...");
            DateTime start_at = DateTime.Now;
            TrieNode root = new TrieNode(null, '?');
            Dictionary<DataReader, Thread> readers = new Dictionary<DataReader, Thread>();
            
            if (args.Length == 0)
            {
                args = new string[] { "war-and-peace.txt", "ulysees.txt", "les-miserables.txt", "the-republic.txt",
                                      "war-and-peace.txt", "ulysees.txt", "les-miserables.txt", "the-republic.txt" };
            }

            if (args.Length > 0)
            {
                foreach (string path in args)
                {
                    DataReader new_reader = new DataReader(path, ref root);
                    Thread new_thread = new Thread(new_reader.ThreadRun);
                    readers.Add(new_reader, new_thread);
                    new_thread.Start();
                }
            }

            foreach (Thread t in readers.Values) t.Join();

            DateTime stop_at = DateTime.Now;
            Console.WriteLine("Input data processed in {0} secs", new TimeSpan(stop_at.Ticks - start_at.Ticks).TotalSeconds);
            Console.WriteLine();
            Console.WriteLine("Most commonly found words:");

            List<TrieNode> top10_nodes = new List<TrieNode> { root, root, root, root, root, root, root, root, root, root };
            int distinct_word_count = 0;
            int total_word_count = 0;
            root.GetTopCounts(ref top10_nodes, ref distinct_word_count, ref total_word_count);
            top10_nodes.Reverse();
            foreach (TrieNode node in top10_nodes)
            {
                Console.WriteLine("{0} - {1} times", node.ToString(), node.m_word_count);
            }

            Console.WriteLine();
            Console.WriteLine("{0} words counted", total_word_count);
            Console.WriteLine("{0} distinct words found", distinct_word_count);
            Console.WriteLine();
            Console.WriteLine("done.");
        }
    }

    #region Input data reader

    public class DataReader
    {
        static int LOOP_COUNT = 1;
        private TrieNode m_root;
        private string m_path;        

        public DataReader(string path, ref TrieNode root)
        {
            m_root = root;
            m_path = path;
        }

        public void ThreadRun()
        {
            for (int i = 0; i < LOOP_COUNT; i++) // fake a large data set by parsing a smaller file multiple times
            {
                using (FileStream fstream = new FileStream(m_path, FileMode.Open, FileAccess.Read))
                {
                    using (StreamReader sreader = new StreamReader(fstream))
                    {
                        string line;
                        while ((line = sreader.ReadLine()) != null)
                        {
                            string[] chunks = line.Split(null);
                            foreach (string chunk in chunks)
                            {
                                m_root.AddWord(chunk.Trim());
                            }
                        }
                    }
                }
            }
        }
    }

    #endregion

    #region TRIE implementation

    public class TrieNode : IComparable<TrieNode>
    {
        private char m_char;
        public int m_word_count;
        private TrieNode m_parent = null;
        private ConcurrentDictionary<char, TrieNode> m_children = null;

        public TrieNode(TrieNode parent, char c)
        {
            m_char = c;
            m_word_count = 0;
            m_parent = parent;
            m_children = new ConcurrentDictionary<char, TrieNode>();            
        }

        public void AddWord(string word, int index = 0)
        {
            if (index < word.Length)
            {
                char key = word[index];
                if (char.IsLetter(key)) // should do that during parsing but we're just playing here! right?
                {
                    if (!m_children.ContainsKey(key))
                    {
                        m_children.TryAdd(key, new TrieNode(this, key));
                    }
                    m_children[key].AddWord(word, index + 1);
                }
                else
                {
                    // not a letter! retry with next char
                    AddWord(word, index + 1);
                }
            }
            else
            {
                if (m_parent != null) // empty words should never be counted
                {
                    lock (this)
                    {
                        m_word_count++;                        
                    }
                }
            }
        }

        public int GetCount(string word, int index = 0)
        {
            if (index < word.Length)
            {
                char key = word[index];
                if (!m_children.ContainsKey(key))
                {
                    return -1;
                }
                return m_children[key].GetCount(word, index + 1);
            }
            else
            {
                return m_word_count;
            }
        }

        public void GetTopCounts(ref List<TrieNode> most_counted, ref int distinct_word_count, ref int total_word_count)
        {
            if (m_word_count > 0)
            {
                distinct_word_count++;
                total_word_count += m_word_count;
            }
            if (m_word_count > most_counted[0].m_word_count)
            {
                most_counted[0] = this;
                most_counted.Sort();
            }
            foreach (char key in m_children.Keys)
            {
                m_children[key].GetTopCounts(ref most_counted, ref distinct_word_count, ref total_word_count);
            }
        }

        public override string ToString()
        {
            if (m_parent == null) return "";
            else return m_parent.ToString() + m_char;
        }

        public int CompareTo(TrieNode other)
        {
            return this.m_word_count.CompareTo(other.m_word_count);
        }
    }

    #endregion
}

Here is the output from processing the same 20 MB of text 100 times across 8 threads.

Counting words...
Input data processed in 75.2879952 secs

Most commonly found words:
the - 19364400 times
of - 10629600 times
and - 10057400 times
to - 8121200 times
a - 6673600 times
in - 5539000 times
he - 4113600 times
that - 3998000 times
was - 3715400 times
his - 3623200 times

323618000 words counted
60896 distinct words found

Solution 2 - C#

A great deal here depends on some things that haven't been specified. For example, are we trying to do this once, or are we trying to build a system that will do this on a regular and ongoing basis? Do we have any control over the input? Are we dealing with text that's all in a single language (e.g., English) or are many languages represented (and if so, how many)?

These matter because:

  1. If the data starts out on a single hard drive, parallel counting (e.g., map-reduce) isn't going to do any real good -- the bottleneck is going to be the transfer speed from the disk. Making copies to more disks so we can count faster will be slower than just counting directly from the one disk.
  2. If we're designing a system to do this on a regular basis, most of our emphasis is really on the hardware -- specifically, have lots of disks in parallel to increase our bandwidth and at least get a little closer to keeping up with the CPU.
  3. No matter how much text you're reading, there's a limit on the number of discrete words you need to deal with -- whether you have a terabyte or even a petabyte of English text, you're not going to see anything like billions of different words in English. Doing a quick check, the Oxford English Dictionary lists approximately 600,000 words in English.
  4. Although the actual words are obviously different between languages, the number of words per language is roughly constant, so the size of the map we build will depend heavily on the number of languages represented.

That mostly leaves the question of how many languages could be represented. For the moment, let's assume the worst case. ISO 639-2 has codes for 485 human languages. Let's assume an average of 700,000 words per language, and an average word length of, say, 10 bytes of UTF-8 per word.

Stored as a simple linear list, that means we can store every word in every language on earth, along with an 8-byte frequency count, in a little less than 6 gigabytes. If we use something like a Patricia trie instead, we can probably plan on that shrinking at least somewhat -- quite possibly to 3 gigabytes or less, though I don't know enough about all those languages to be at all sure.
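
As a quick sanity check on that estimate (the language count, words per language, and bytes per word are the assumptions from the previous paragraph, not measured figures), the arithmetic works out roughly like this:

using System;

// Back-of-the-envelope check of the memory estimate above.
// All figures are the assumptions from the text, not measurements.
long languages = 485;            // ISO 639-2 codes
long wordsPerLanguage = 700_000;
long bytesPerWord = 10;          // average UTF-8 length per word
long bytesPerCount = 8;          // 64-bit frequency counter

long totalWords = languages * wordsPerLanguage;
long totalBytes = totalWords * (bytesPerWord + bytesPerCount);
Console.WriteLine($"{totalWords:N0} words, ~{totalBytes / (1024.0 * 1024 * 1024):F1} GiB");
// prints: 339,500,000 words, ~5.7 GiB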

Now, the reality is that we've almost certainly overestimated the numbers in a number of places there -- quite a few languages share a fair number of words, many (especially older) languages probably have fewer words than English, and glancing through the list, it looks like some are included that probably don't have written forms at all.

Summary: Almost any reasonably new desktop/server has enough memory to hold the map entirely in RAM -- and more data won't change that. For one (or a few) disks in parallel, we're going to be I/O-bound anyway, so parallel counting (and such) will probably be a net loss. We probably need tens of disks in parallel before any other optimization means much.

Solution 3 - C#

You can try a map-reduce approach for this task. The advantage of map-reduce is scalability, so even for 1 TB, 10 TB, or 1 PB the same approach will work, and you will not need to do a lot of work to modify your algorithm for the new scale. The framework will also take care of distributing the work among all the machines (and cores) in your cluster.

First - create the (word, occurrences) pairs.
The pseudo-code for this will be something like this:

map(document):
  for each word w:
     EmitIntermediate(w,"1")

reduce(word,list<val>):
   Emit(word,size(list))

Second, you can easily find the top-K highest-occurrence words with a single iteration over the pairs. The main idea is to hold a min-heap of the top K elements and, while iterating, make sure the heap always contains the top K elements seen so far. When you are done, the heap contains the top K elements.
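
In C#, that min-heap pass might look something like the sketch below (this assumes .NET 6's PriorityQueue; the pairs argument stands in for whatever sequence of (word, count) pairs the reduce phase produced):

using System;
using System.Collections.Generic;

// Sketch of the top-K selection described above: stream over the
// (word, count) pairs while keeping a min-heap of the K best seen so far.
static IEnumerable<(string Word, long Count)> TopK(
    IEnumerable<(string Word, long Count)> pairs, int k)
{
    // PriorityQueue dequeues the smallest priority first, i.e. a min-heap on count.
    var heap = new PriorityQueue<(string Word, long Count), long>();
    foreach (var pair in pairs)
    {
        if (heap.Count < k)
        {
            heap.Enqueue(pair, pair.Count);
        }
        else if (pair.Count > heap.Peek().Count)
        {
            heap.Dequeue();               // evict the current smallest
            heap.Enqueue(pair, pair.Count);
        }
    }
    while (heap.Count > 0)
        yield return heap.Dequeue();      // emitted smallest first
}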

A more scalable alternative (though slower if you have few machines) is to use the map-reduce sorting functionality to sort the data by occurrence count and just grep the top K.

Solution 4 - C#

Three things of note for this.

Specifically: the file is too large to hold in memory, the word list is (potentially) too large to hold in memory, and the word count can be too large for a 32-bit int.

Once you get through those caveats, it should be straightforward. The game is managing the potentially large word list.

If it's any easier (to keep your head from spinning).

"You're running a Z-80 8 bit machine, with 65K of RAM and have a 1MB file..."

Same exact problem.

Solution 5 - C#

It depends on the requirements, but if you can afford some error, streaming algorithms and probabilistic data structures can be interesting because they are very time and space efficient and quite simple to implement, for instance:

  • Heavy hitters (e.g., Space Saving), if you are interested only in the top n most frequent words
  • Count-min sketch, to get an estimated count for any word

Those data structures require only a very small, constant amount of space (the exact amount depends on the error you can tolerate).

See http://alex.smola.org/teaching/berkeley2012/streams.html for an excellent description of these algorithms.
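
As an illustration of how little state a count-min sketch actually needs, here is a minimal C# sketch (the depth, width, and hash function below are illustrative choices, not tuned values):

using System;

// Minimal count-min sketch: constant memory, estimates that may overshoot
// but never undershoot the true count.
public class CountMinSketch
{
    private readonly long[,] _counts;
    private readonly int _depth;   // number of independent hash rows
    private readonly int _width;   // counters per row

    public CountMinSketch(int depth = 4, int width = 1 << 20)
    {
        _depth = depth;
        _width = width;
        _counts = new long[depth, width];
    }

    public void Add(string word)
    {
        for (int row = 0; row < _depth; row++)
            _counts[row, Bucket(word, row)]++;
    }

    public long Estimate(string word)
    {
        long best = long.MaxValue;
        for (int row = 0; row < _depth; row++)
            best = Math.Min(best, _counts[row, Bucket(word, row)]);
        return best;
    }

    private int Bucket(string word, int row)
    {
        // Seeded FNV-1a style hash; good enough for a sketch, not production-grade.
        unchecked
        {
            uint hash = 2166136261u ^ (uint)(row * 16777619);
            foreach (char c in word)
                hash = (hash ^ c) * 16777619u;
            return (int)(hash % (uint)_width);
        }
    }
}

Widening the rows reduces over-counting and adding rows reduces the chance of an unlucky collision, so the memory/error trade-off is explicit and bounded.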

Solution 6 - C#

I'd be quite tempted to use a DAWG (wikipedia, and a C# writeup with more details). It's simple enough to add a count field on the leaf nodes, it's memory efficient, and it performs very well for lookups.

EDIT: Though have you tried simply using a Dictionary<string, int>, where the string is the word and the int is its count? Perhaps you're trying to optimize too early?

editor's note: This post originally linked to this wikipedia article, which appears to be about another meaning of the term DAWG: A way of storing all substrings of one word, for efficient approximate string-matching.
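
For comparison, the plain Dictionary<string, int> idea from the edit above is only a few lines. A rough sketch (the file name is a placeholder, and a 64-bit count is used here since terabyte-scale counts can overflow an int):

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Plain dictionary word count: works as long as the set of *distinct* words
// (not the input itself) fits in memory.
var counts = new Dictionary<string, long>();
foreach (string line in File.ReadLines("input.txt"))   // placeholder file name
{
    foreach (string word in line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        counts[word] = counts.TryGetValue(word, out long n) ? n + 1 : 1;
}

foreach (var pair in counts.OrderByDescending(p => p.Value).Take(10))
    Console.WriteLine($"{pair.Key} - {pair.Value} times");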

Solution 7 - C#

A different solution could be to use an SQL table and let the database handle the data as well as it can. First create a table with a single field, word, containing one row for each word in the collection.

Then use the following query (sorry for any syntax issues, my SQL is rusty - this is actually pseudo-code):

SELECT word, COUNT(*) AS c FROM myTable GROUP BY word ORDER BY c DESC

The general idea is to first generate a table (which is stored on disk) with all words, and then use a query to count and sort the (word, occurrences) pairs for you. You can then just take the top K from the retrieved list.


To all: if there are indeed any syntax or other issues in the SQL statement, feel free to edit.

Solution 8 - C#

First, I only recently "discovered" the Trie data structure, and zeFrenchy's answer was great for getting me up to speed on it.

I did see several people in the comments making suggestions on how to improve its performance, but those were only minor tweaks, so I thought I'd share what I found to be the real bottleneck: the ConcurrentDictionary.

I'd wanted to play around with thread-local storage, and this sample gave me a great opportunity to do that. After some minor changes to use a Dictionary per thread and then combine the dictionaries after the Join(), I saw performance improve by ~30% (processing 20 MB 100 times across 8 threads went from ~48 sec to ~33 sec on my box).

The code is pasted below, and you'll notice not much changed from the accepted answer.

P.S. I don't have more than 50 reputation points so I could not put this in a comment.

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading;

namespace WordCount
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Counting words...");
            DateTime start_at = DateTime.Now;
            Dictionary<DataReader, Thread> readers = new Dictionary<DataReader, Thread>();
            if (args.Length == 0)
            {
                args = new string[] { "war-and-peace.txt", "ulysees.txt", "les-miserables.txt", "the-republic.txt",
                                      "war-and-peace.txt", "ulysees.txt", "les-miserables.txt", "the-republic.txt" };
            }

            List<ThreadLocal<TrieNode>> roots;
            if (args.Length == 0)
            {
                roots = new List<ThreadLocal<TrieNode>>(1);
            }
            else
            {
                roots = new List<ThreadLocal<TrieNode>>(args.Length);

                foreach (string path in args)
                {
                    ThreadLocal<TrieNode> root = new  ThreadLocal<TrieNode>(() =>
                    {
                        return new TrieNode(null, '?');
                    });

                    roots.Add(root);

                    DataReader new_reader = new DataReader(path, root);
                    Thread new_thread = new Thread(new_reader.ThreadRun);
                    readers.Add(new_reader, new_thread);
                    new_thread.Start();
                }
            }

            foreach (Thread t in readers.Values) t.Join();

            foreach(ThreadLocal<TrieNode> root in roots.Skip(1))
            {
                roots[0].Value.CombineNode(root.Value);
                root.Dispose();
            }

            DateTime stop_at = DateTime.Now;
            Console.WriteLine("Input data processed in {0} secs", new TimeSpan(stop_at.Ticks - start_at.Ticks).TotalSeconds);
            Console.WriteLine();
            Console.WriteLine("Most commonly found words:");

            List<TrieNode> top10_nodes = new List<TrieNode> { roots[0].Value, roots[0].Value, roots[0].Value, roots[0].Value, roots[0].Value, roots[0].Value, roots[0].Value, roots[0].Value, roots[0].Value, roots[0].Value };
            int distinct_word_count = 0;
            int total_word_count = 0;
            roots[0].Value.GetTopCounts(top10_nodes, ref distinct_word_count, ref total_word_count);

            top10_nodes.Reverse();
            foreach (TrieNode node in top10_nodes)
            {
                Console.WriteLine("{0} - {1} times", node.ToString(), node.m_word_count);
            }

            roots[0].Dispose();

            Console.WriteLine();
            Console.WriteLine("{0} words counted", total_word_count);
            Console.WriteLine("{0} distinct words found", distinct_word_count);
            Console.WriteLine();
            Console.WriteLine("done.");
            Console.ReadLine();
        }
    }

    #region Input data reader

    public class DataReader
    {
        static int LOOP_COUNT = 100;
        private TrieNode m_root;
        private string m_path;

        public DataReader(string path, ThreadLocal<TrieNode> root)
        {
            m_root = root.Value;
            m_path = path;
        }

        public void ThreadRun()
        {
            for (int i = 0; i < LOOP_COUNT; i++) // fake a large data set by parsing a smaller file multiple times
            {
                using (FileStream fstream = new FileStream(m_path, FileMode.Open, FileAccess.Read))
                using (StreamReader sreader = new StreamReader(fstream))
                {
                    string line;
                    while ((line = sreader.ReadLine()) != null)
                    {
                        string[] chunks = line.Split(null);
                        foreach (string chunk in chunks)
                        {
                            m_root.AddWord(chunk.Trim());
                        }
                    }
                }
            }
        }
    }

    #endregion

    #region TRIE implementation

    public class TrieNode : IComparable<TrieNode>
    {
        private char m_char;
        public int m_word_count;
        private TrieNode m_parent = null;
        private Dictionary<char, TrieNode> m_children = null;

        public TrieNode(TrieNode parent, char c)
        {
            m_char = c;
            m_word_count = 0;
            m_parent = parent;
            m_children = new Dictionary<char, TrieNode>();            
        }

        public void CombineNode(TrieNode from)
        {
            foreach(KeyValuePair<char, TrieNode> fromChild in from.m_children)
            {
                char keyChar = fromChild.Key;
                if (!m_children.ContainsKey(keyChar))
                {
                    m_children.Add(keyChar, new TrieNode(this, keyChar));
                }
                m_children[keyChar].m_word_count += fromChild.Value.m_word_count;
                m_children[keyChar].CombineNode(fromChild.Value);
            }
        }

        public void AddWord(string word, int index = 0)
        {
            if (index < word.Length)
            {
                char key = word[index];
                if (char.IsLetter(key)) // should do that during parsing but we're just playing here! right?
                {
                    if (!m_children.ContainsKey(key))
                    {
                        m_children.Add(key, new TrieNode(this, key));
                    }
                    m_children[key].AddWord(word, index + 1);
                }
                else
                {
                    // not a letter! retry with next char
                    AddWord(word, index + 1);
                }
            }
            else
            {
                if (m_parent != null) // empty words should never be counted
                {
                    m_word_count++;                        
                }
            }
        }

        public int GetCount(string word, int index = 0)
        {
            if (index < word.Length)
            {
                char key = word[index];
                if (!m_children.ContainsKey(key))
                {
                    return -1;
                }
                return m_children[key].GetCount(word, index + 1);
            }
            else
            {
                return m_word_count;
            }
        }

        public void GetTopCounts(List<TrieNode> most_counted, ref int distinct_word_count, ref int total_word_count)
        {
            if (m_word_count > 0)
            {
                distinct_word_count++;
                total_word_count += m_word_count;
            }
            if (m_word_count > most_counted[0].m_word_count)
            {
                most_counted[0] = this;
                most_counted.Sort();
            }
            foreach (char key in m_children.Keys)
            {
                m_children[key].GetTopCounts(most_counted, ref distinct_word_count, ref total_word_count);
            }
        }

        public override string ToString()
        {
            return BuildString(new StringBuilder()).ToString();
        }

        private StringBuilder BuildString(StringBuilder builder)
        {
            if (m_parent == null)
            {
                return builder;
            }
            else
            {
                return m_parent.BuildString(builder).Append(m_char);
            }
        }

        public int CompareTo(TrieNode other)
        {
            return this.m_word_count.CompareTo(other.m_word_count);
        }
    }

    #endregion
}

Solution 9 - C#

As a quick general algorithm I would do this.

Create a map whose keys are the words themselves and whose values are the occurrence counts.

for each string in content:
    if string is a valid key for the map:
        increment the value associated with that key
    else:
        add a new key/value pair with the word as the key and a count of one
done

Then you could just find the largest values in the map:

create an array of size 10 holding (word, count) pairs

for each value in the map:
    if the current pair has a count larger than the smallest count in the array:
        replace that pair with the current one

print all pairs in the array
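
Translated into C#, those two passes might look roughly like this sketch (the words sequence stands in for whatever tokenizer produces the input strings; a 64-bit count is used so terabyte-scale counts can't overflow):

using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of the two passes above: count into a map, then scan the map once,
// keeping a small fixed-size list of the 10 largest entries.
static IEnumerable<KeyValuePair<string, long>> Top10(IEnumerable<string> words)
{
    var counts = new Dictionary<string, long>();
    foreach (string w in words)
        counts[w] = counts.TryGetValue(w, out long n) ? n + 1 : 1;

    var top = new List<KeyValuePair<string, long>>(10);
    foreach (var pair in counts)
    {
        if (top.Count < 10)
        {
            top.Add(pair);
            continue;
        }
        // Find the smallest of the current top 10 and replace it if beaten.
        int min = 0;
        for (int i = 1; i < top.Count; i++)
            if (top[i].Value < top[min].Value) min = i;
        if (pair.Value > top[min].Value) top[min] = pair;
    }
    return top.OrderByDescending(p => p.Value);
}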

Solution 10 - C#

Well, the first thought is to manage a database in the form of a hashtable/array or whatever to save each word's occurrence count, but given the data size I would rather:

  • Get the first 10 words found whose occurrence count is >= 2
  • Get how many times these words occur in the entire string, and delete them while counting
  • Start again; once you have two sets of 10 words each, you keep the 10 most frequent words of both sets
  • Do the same for the rest of the string (which doesn't contain these words anymore).

You can even try to be more efficient and start with the first 10 words found whose occurrence count is >= 5, for example, or more; if none are found, reduce this value until 10 words are found. This way you have a good chance of avoiding the memory-intensive step of saving all words' occurrence counts, which is a huge amount of data, and you can save scanning rounds (in a good case).

But in the worst case you may have more rounds than in a conventional algorithm.

By the way, it's a problem I would try to solve with a functional programming language rather than OOP.

Solution 11 - C#

Well, personally, I'd split the file into chunks of, say, 128 MB, maintaining two in memory at all times while scanning; any discovered word is added to a hash list along with a list of counts, and then I'd iterate the list of counts at the end to find the top 10...

Solution 12 - C#

The method below will only read your data once and can be tuned for memory sizes.

  • Read the file in chunks of say 1GB
  • For each chunk make a list of say the 5000 most occurring words with their frequency
  • Merge the lists based on frequency (1000 lists with 5000 words each)
  • Return the top 10 of the merged list

Theoretically you might miss words, although I think that chance is very, very small.
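
A rough sketch of that pipeline (the 1 GB chunk size and 5000-word list size are the figures from the list above; the file name is a placeholder):

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Sketch of the chunked approach above: count each ~1 GB of text separately,
// keep only the most frequent words per chunk, then merge the partial lists.
const int WordsPerChunk = 5000;
var merged = new Dictionary<string, long>();
var chunkCounts = new Dictionary<string, long>();
long charsInChunk = 0;

void FlushChunk()
{
    foreach (var p in chunkCounts.OrderByDescending(x => x.Value).Take(WordsPerChunk))
        merged[p.Key] = merged.TryGetValue(p.Key, out long m) ? m + p.Value : p.Value;
    chunkCounts.Clear();
    charsInChunk = 0;
}

foreach (string line in File.ReadLines("input.txt"))          // placeholder file name
{
    foreach (string w in line.Split((char[])null, StringSplitOptions.RemoveEmptyEntries))
        chunkCounts[w] = chunkCounts.TryGetValue(w, out long n) ? n + 1 : 1;

    charsInChunk += line.Length;
    if (charsInChunk >= 1L << 30) FlushChunk();                // roughly 1 GB for ASCII text
}
FlushChunk();                                                  // final partial chunk

foreach (var p in merged.OrderByDescending(x => x.Value).Take(10))
    Console.WriteLine($"{p.Key} - {p.Value} times");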

Solution 13 - C#

Storm is the technology to look at. It separates the role of data input (spouts) from processing (bolts). Chapter 2 of the Storm book solves this exact problem and describes the system architecture very well - http://www.amazon.com/Getting-Started-Storm-Jonathan-Leibiusky/dp/1449324010

Storm is more real-time processing, as opposed to batch processing with Hadoop. If your data is pre-existing, you can distribute the load to different spouts and spread it across different bolts for processing.

This approach will also support data growing beyond terabytes, as the data will be analysed as it arrives in real time.

Solution 14 - C#

Very interesting question. It relates more to logic analysis than coding. With the assumption of English and valid sentences, it becomes easier.

You don't have to count all words, just the ones with a length less than or equal to the average word length of the given language (for English it is about 5.1). Therefore you will not have problems with memory.

As for reading the file, you should choose a parallel mode, reading chunks (of a size of your choice) by manipulating file positions to land on white space. If you decide to read chunks of 1 MB, for example, all chunks except the first one should be a bit wider (+22 bytes on the left and +22 bytes on the right, where 22 is the length of the longest English word - if I'm right). For parallel processing you will need a concurrent dictionary, or local collections that you then merge.
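
A rough sketch of that kind of offset-based chunk reading (here the chunk boundaries are aligned forward to the next white space instead of using a fixed overlap, which achieves the same goal; the path, offsets, and ASCII decoding are illustrative assumptions):

using System;
using System.IO;
using System.Text;

// Reads one chunk of a large file starting at a byte offset, adjusting the
// boundaries to whitespace so no word is split between chunks. Each worker
// thread would call this with its own offset.
static string ReadChunkAlignedToWhitespace(string path, long offset, int chunkSize)
{
    using var stream = new FileStream(path, FileMode.Open, FileAccess.Read);
    stream.Seek(offset, SeekOrigin.Begin);

    int b;
    if (offset > 0)
    {
        // Skip the (possibly partial) word at the start; the previous chunk owns it.
        while ((b = stream.ReadByte()) != -1 && !char.IsWhiteSpace((char)b)) { }
    }

    var buffer = new byte[chunkSize];
    int read = stream.Read(buffer, 0, chunkSize);
    var text = new StringBuilder(Encoding.ASCII.GetString(buffer, 0, read));

    // Read past the nominal end until the next whitespace so the last word is whole.
    while ((b = stream.ReadByte()) != -1 && !char.IsWhiteSpace((char)b))
        text.Append((char)b);

    return text.ToString();
}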

Keep in mind that normally the top ten will largely overlap with a typical stop-word list (which suggests another, reverse approach that is also valid as long as the content of the file is ordinary).

Solution 15 - C#

Try to think of the special data structures used to approach this kind of problem. In this case a special kind of tree, a trie, stores strings in a specific, very efficient way. Alternatively, you could build your own solution for counting words. I guess this TB of data would be in English, and we have around 600,000 words in general use, so it would be possible to store only those words and count how often each string is repeated; this solution would need a regex to eliminate some special characters. The first solution will be faster, I'm pretty sure.

http://en.wikipedia.org/wiki/Trie

Here is an implementation of a trie in Java:
http://algs4.cs.princeton.edu/52trie/TrieST.java.html

Solution 16 - C#

MapReduce
Word counting can be achieved efficiently through MapReduce using Hadoop. https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Example%3A+WordCount+v1.0 Large files can be parsed through it. It uses multiple nodes in the cluster to perform this operation.

public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        String line = value.toString();
        StringTokenizer tokenizer = new StringTokenizer(line);
        while (tokenizer.hasMoreTokens()) {
            word.set(tokenizer.nextToken());
            output.collect(word, one);
        }
    }
}

public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, IntWritable> output, Reporter reporter) throws IOException {
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | VIRA | View Question on Stackoverflow
Solution 1 - C# | zeFrenchy | View Answer on Stackoverflow
Solution 2 - C# | Jerry Coffin | View Answer on Stackoverflow
Solution 3 - C# | amit | View Answer on Stackoverflow
Solution 4 - C# | Will Hartung | View Answer on Stackoverflow
Solution 5 - C# | alienhard | View Answer on Stackoverflow
Solution 6 - C# | Matt Searles | View Answer on Stackoverflow
Solution 7 - C# | amit | View Answer on Stackoverflow
Solution 8 - C# | Clay Ver Valen | View Answer on Stackoverflow
Solution 9 - C# | zanegray | View Answer on Stackoverflow
Solution 10 - C# | CloudyMarble | View Answer on Stackoverflow
Solution 11 - C# | Chibueze Opata | View Answer on Stackoverflow
Solution 12 - C# | IvoTops | View Answer on Stackoverflow
Solution 13 - C# | NiladriBose | View Answer on Stackoverflow
Solution 14 - C# | user1088520 | View Answer on Stackoverflow
Solution 15 - C# | blueberry0xff | View Answer on Stackoverflow
Solution 16 - C# | Saurabh | View Answer on Stackoverflow