Object Oriented Programming - how to avoid duplication in processes that differ slightly depending on a variable

C#, OOP

C# Problem Overview


Something that comes up quite a lot in my current work is that there is a generalised process that needs to happen, but then the odd part of that process needs to happen slightly differently depending on the value of a certain variable, and I'm not quite sure what's the most elegant way to handle this.

I'll use the example that we usually have, which is doing things slightly differently depending on the country we're dealing with.

So I have a class, let's call it Processor:

public class Processor
{
    public string Process(string country, string text)
    {
        text.Capitalise();

        text.RemovePunctuation();

        text.Replace("é", "e");

        var split = text.Split(",");

        string.Join("|", split);
    }
}

Except that only some of those actions need to happen for certain countries. For example, only 6 countries require the capitalisation step. The character to split on might change depending on the country. Replacing the accented 'e' might only be required depending on the country.

Obviously you could solve it by doing something like this:

public string Process(string country, string text)
{
    if (country == "USA" || country == "GBR")
    {
        text.Capitalise();
    }

    if (country == "DEU")
    {
        text.RemovePunctuation();
    }

    if (country != "FRA")
    {
        text.Replace("é", "e");
    }

    var separator = DetermineSeparator(country);
    var split = text.Split(separator);

    string.Join("|", split);
}

But when you're dealing with all the possible countries in the world, that becomes very cumbersome. And regardless, the if statements make the logic harder to read (at least, if you imagine a more complex method than the example), and the cyclomatic complexity starts to creep up pretty fast.

So at the moment I'm sort of doing something like this:

public class Processor
{
    CountrySpecificHandlerFactory handlerFactory;

    public Processor(CountrySpecificHandlerFactory handlerFactory)
    {
        this.handlerFactory = handlerFactory;
    }

    public string Process(string country, string text)
    {
        var handlers = this.handlerFactory.CreateHandlers(country);
        handlers.Capitaliser.Capitalise(text);

        handlers.PunctuationHandler.RemovePunctuation(text);

        handlers.SpecialCharacterHandler.ReplaceSpecialCharacters(text);

        var separator = handlers.SeparatorHandler.DetermineSeparator();
        var split = text.Split(separator);

        string.Join("|", split);
    }
}

Handlers:

public class CountrySpecificHandlerFactory
{
    private static IDictionary<string, ICapitaliser> capitaliserDictionary
                                    = new Dictionary<string, ICapitaliser>
    {
        { "USA", new Capitaliser() },
        { "GBR", new Capitaliser() },
        { "FRA", new ThingThatDoesNotCapitaliseButImplementsICapitaliser() },
        { "DEU", new ThingThatDoesNotCapitaliseButImplementsICapitaliser() },
    };

    // Imagine the other dictionaries like this...

    public CountrySpecificHandlers CreateHandlers(string country)
    {
        return new CountrySpecificHandlers
        {
            Capitaliser = capitaliserDictionary[country],
            PunctuationHandler = punctuationDictionary[country],
            // etc...
        };
    }
}

public class CountrySpecificHandlers
{
    public ICapitaliser Capitaliser { get; set; }
    public IPunctuationHandler PunctuationHandler { get; set; }
    public ISpecialCharacterHandler SpecialCharacterHandler { get; set; }
    public ISeparatorHandler SeparatorHandler { get; set; }
}

Which equally I'm not really sure I like. The logic is still somewhat obscured by all of the factory creation and you can't simply look at the original method and see what happens when a "GBR" process is executed, for example. You also end up creating a lot of classes (in more complex examples than this) in the style GbrPunctuationHandler, UsaPunctuationHandler, etc... which means that you have to look at several different classes to figure out all of the possible actions that could happen during punctuation handling. Obviously I don't want one giant class with a billion if statements, but equally 20 classes with slightly differing logic also feels clunky.

Basically I think I've got myself into some sort of OOP knot and don't quite know a good way of untangling it. I was wondering if there was a pattern out there that would help with this type of process?

C# Solutions


Solution 1 - C#

I would suggest encapsulating all options in one class:

public class ProcessOptions
{
    public bool Capitalise { get; set; }
    public bool RemovePunctuation { get; set; }
    public bool Replace { get; set; }
    public char ReplaceChar { get; set; }
    public char ReplacementChar { get; set; }
    public bool SplitAndJoin { get; set; }
    public char JoinChar { get; set; }
    public char SplitChar { get; set; }
}

and pass it into the Process method:

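public string Process(ProcessOptions options, string text)
{
    // Strings are immutable, so each step's result must be assigned back.
    // Capitalise and RemovePunctuation are the question's helper methods,
    // assumed here to return the new string.
    if (options.Capitalise)
        text = text.Capitalise();

    if (options.RemovePunctuation)
        text = text.RemovePunctuation();

    if (options.Replace)
        text = text.Replace(options.ReplaceChar, options.ReplacementChar);

    if (options.SplitAndJoin)
    {
        var split = text.Split(options.SplitChar);
        return string.Join(options.JoinChar, split);
    }

    return text;
}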

Solution 2 - C#

When the .NET framework set out to handle these sorts of problems, it didn't model everything as string. So you have, for instance, the CultureInfo class:

> Provides information about a specific culture (called a locale for unmanaged code development). The information includes the names for the culture, the writing system, the calendar used, the sort order of strings, and formatting for dates and numbers.

Now, this class may not contain the specific features that you need, but you can obviously create something analogous. And then you change your Process method:

public string Process(CountryInfo country, string text)

Your CountryInfo class can then have a bool RequiresCapitalization property, etc, that helps your Process method direct its processing appropriately.
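As a rough illustration of that idea, here is a minimal sketch. The property names other than RequiresCapitalization, the CountryProcessor class, the two sample countries and their rule values are all assumptions made for the example, and ToUpperInvariant stands in for whatever "capitalise" really means in the question.

```csharp
using System;

// Hypothetical CountryInfo: the country carries its own processing rules,
// so callers pass a typed object instead of a raw string code.
public sealed class CountryInfo
{
    public string Code { get; }
    public bool RequiresCapitalization { get; }
    public char Separator { get; }

    private CountryInfo(string code, bool requiresCapitalization, char separator)
    {
        Code = code;
        RequiresCapitalization = requiresCapitalization;
        Separator = separator;
    }

    // Sample instances with made-up rule values.
    public static readonly CountryInfo Usa = new CountryInfo("USA", true, ',');
    public static readonly CountryInfo Fra = new CountryInfo("FRA", false, ';');
}

public static class CountryProcessor
{
    public static string Process(CountryInfo country, string text)
    {
        // Stand-in for the question's Capitalise step.
        if (country.RequiresCapitalization)
            text = text.ToUpperInvariant();

        return string.Join("|", text.Split(country.Separator));
    }
}
```

With these sample values, CountryProcessor.Process(CountryInfo.Usa, "a,b") returns "A|B" and CountryProcessor.Process(CountryInfo.Fra, "a;b") returns "a|b". The method body no longer mentions any country by name; it only asks the CountryInfo what is required.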

Solution 3 - C#

Maybe you could have one Processor per country?

public class FrProcessor : Processor {
	protected override string Separator => ".";
	
	protected override string ProcessSpecific(string text) {
		return text.Replace("é", "e");
	}
}

public class UsaProcessor : Processor {
	protected override string Separator => ",";
	
	protected override string ProcessSpecific(string text) {
		return text.Capitalise().RemovePunctuation();
	}
}

And one base class to handle common parts of the processing:

public abstract class Processor {
	protected abstract string Separator { get; }
	
	// Default: no country-specific processing, return the text unchanged.
	protected virtual string ProcessSpecific(string text) => text;
	
	private string ProcessCommon(string text) {
		var split = text.Split(Separator);
		return string.Join("|", split);
	}
	
	public string Process(string text) {
		var s = ProcessSpecific(text);
		return ProcessCommon(s);
	}
}

Also, you should rework your return values: the code won't compile as you wrote it, because methods declared to return a string sometimes don't return anything.

Solution 4 - C#

You can create a common interface with a Process method...

public interface IProcessor
{
	string Process(string text);
}

Then you implement it for each country...

public class Processors
{
	public class GBR : IProcessor
	{
		public string Process(string text)
		{
			return $"{text} (processed with GBR rules)";
		}
	}

	public class FRA : IProcessor
	{
		public string Process(string text)
		{
			return $"{text} (processed with FRA rules)";
		}
	}
}

You can then create a common method for instantiating and executing each country related class...

// also place these in the Processors class above
public static IProcessor CreateProcessor(string country)
{
	var typeName = $"{typeof(Processors).FullName}+{country}";
	var processor = (IProcessor)Assembly.GetAssembly(typeof(Processors)).CreateInstance(typeName);
	return processor;
}

public static string Process(string country, string text)
{
	var processor = CreateProcessor(country);
	return processor?.Process(text);
}

Then you just need to create and use the processors like so...

// create a processor object for multiple use, if needed...
var processorGbr = Processors.CreateProcessor("GBR");
Console.WriteLine(processorGbr.Process("This is some text."));
	
// create and use a processor for one-time use
Console.WriteLine(Processors.Process("FRA", "This is some more text."));

Here's a working dotnet fiddle example...

You place all the country-specific processing in each country class. Create a common class (inside the Processors class) for all the actual individual methods, so that each country processor becomes a list of calls to those common methods rather than a copy of the code in each country class.

Note: You'll need to add...

using System.Reflection;

in order for the static method to create an instance of the country class.

Solution 5 - C#

A few versions ago, the C# switch was given full support for pattern matching, so that "multiple countries match" case is easily done. While it still has no fall-through ability, one input can match multiple cases with pattern matching. It could maybe make that if-spam a bit clearer.

Now, a switch can usually be replaced with a collection. You need to use delegates and a Dictionary. Process can be replaced with:

public delegate string ProcessDelegate(string text);

Then you could make a Dictionary:

var Processors = new Dictionary<string, ProcessDelegate>(){
  { "USA", EnglishProcessor },
  { "GBR", EnglishProcessor },
  { "DEU", GermanProcessor }
};

I used method names to supply the delegates, but you could use lambda syntax to provide the entire code there. That way you could hide the whole collection like you would any other large collection, and the code becomes a simple lookup:

ProcessDelegate currentProcessor = Processors[country];
// Pass the text to process, not the country code.
string processedString = currentProcessor(text);

Those are pretty much the two options. You may want to consider using Enumerations instead of strings for the matching, but that is a minor detail.
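For the first option, the pattern-matching switch, a minimal sketch might look like the following. It assumes C# 9 or later (for the `or` pattern), and the SwitchProcessor name and the per-country rules are invented for illustration; ToUpperInvariant and Trim stand in for the real processing steps.

```csharp
using System;

public static class SwitchProcessor
{
    public static string Process(string country, string text)
    {
        // One arm can cover several countries via the C# 9 "or" pattern.
        // Only the first matching arm is evaluated.
        return country switch
        {
            "USA" or "GBR" => text.ToUpperInvariant(),
            "DEU" => text.Trim(),
            _ => text   // countries without special rules pass through
        };
    }
}
```

Here SwitchProcessor.Process("GBR", "abc") returns "ABC", while SwitchProcessor.Process("FRA", "abc") returns the text unchanged.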

Solution 6 - C#

I would perhaps (depending on the details of your use-case) go with the Country being a "real" object instead of a string. The keyword is "polymorphism".

So basically it would look like this:

public interface Country {
   string Process(string text);
}

Then you can create specialized countries for those that you need. Note: you don't have to create a Country object for every country; you can have a LatinlikeCountry, or even a GenericCountry. There you can collect what should be done, even re-using others, like:

public class France : Country {
   public string Process(string text) {
      return new GenericCountry().Process(text)
         .Replace('a', 'b');
   }
}

Or similar. Country may actually be Language, I'm not sure about the use-case, but you get the point.

Also, the method of course should not be Process() it should be the thing that you actually need done. Like Words() or whatever.
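To make the re-use idea concrete, here is a small self-contained sketch. GenericCountry is not defined in this answer, so its body below is an assumption (split on commas, join with pipes, as in the question), and UnitedStates is a made-up example of a country that adds one step on top of the generic rules.

```csharp
using System;

public interface Country
{
    string Process(string text);
}

// Hypothetical default rules shared by most countries.
public class GenericCountry : Country
{
    public string Process(string text) =>
        string.Join("|", text.Split(','));
}

// A specialised country re-uses the generic rules and adds its own step.
public class UnitedStates : Country
{
    private readonly Country generic = new GenericCountry();

    public string Process(string text) =>
        generic.Process(text).ToUpperInvariant();
}
```

Calling new UnitedStates().Process("a,b") returns "A|B", whereas new GenericCountry().Process("a,b") returns "a|b". Callers only depend on the Country interface and never branch on a country code.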

Solution 7 - C#

You want to delegate to (nod to chain of responsibility) something that knows about its own culture. So use or make a Country or CultureInfo type construct, as mentioned above in other answers.

But in general and fundamentally your problem is you are taking procedural constructs like 'processor' and applying them to OO. OO is about representing real world concepts from a business or problem domain in software. Processor does not translate to anything in the real world apart from software itself. Whenever you have classes like Processor or Manager or Governor, alarm bells should ring.

Solution 8 - C#

> I was wondering if there was a pattern out there that would help with this type of process

Chain of responsibility is the kind of thing you may be looking for, but in OOP it is somewhat cumbersome...

What about a more functional approach with C#?

using System;


namespace Kata {

  class Kata {


    static void Main() {

      var text = "     testing this thing for DEU          ";
      Console.WriteLine(Process.For("DEU")(text));

      text = "     testing this thing for USA          ";
      Console.WriteLine(Process.For("USA")(text));

      Console.ReadKey();
    }

    public static class Process {

      public static Func<string, string> For(string country) {

        Func<string, string> baseFnc = (string text) => text;

        var aggregatedFnc = ApplyToUpper(baseFnc, country);
        aggregatedFnc = ApplyTrim(aggregatedFnc, country);

        return aggregatedFnc;

      }

      private static Func<string, string> ApplyToUpper(Func<string, string> currentFnc, string country) {

        string toUpper(string text) => currentFnc(text).ToUpper();

        Func<string, string> fnc = null;

        switch (country) {
          case "USA":
          case "GBR":
          case "DEU":
            fnc = toUpper;
            break;
          default:
            fnc = currentFnc;
            break;
        }
        return fnc;
      }

      private static Func<string, string> ApplyTrim(Func<string, string> currentFnc, string country) {

        string trim(string text) => currentFnc(text).Trim();

        Func<string, string> fnc = null;

        switch (country) {
          case "DEU":
            fnc = trim;
            break;
          default:
            fnc = currentFnc;
            break;
        }
        return fnc;
      }
    }
  }
}

NOTE: It does not have to be all static, of course. If the Process class needs state, you can use an instance class or a partially applied function ;) .

You can build the Process for each country on startup, store each one in an indexed collection and retrieve them when needed with O(1) cost.
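A minimal sketch of that startup-time construction follows. The Pipelines, Compose and Identity names are hypothetical, and the per-country steps simply mirror the ToUpper and Trim rules from the code above.

```csharp
using System;
using System.Collections.Generic;

public static class Pipelines
{
    private static string Identity(string text) => text;

    // Chains the given steps left to right into a single function.
    private static Func<string, string> Compose(params Func<string, string>[] steps)
    {
        Func<string, string> pipeline = Identity;
        foreach (var step in steps)
        {
            var previous = pipeline;   // capture the pipeline built so far
            pipeline = text => step(previous(text));
        }
        return pipeline;
    }

    // Built once when the class is first used; later lookups are
    // dictionary reads (O(1) on average).
    private static readonly Dictionary<string, Func<string, string>> ByCountry =
        new Dictionary<string, Func<string, string>>
        {
            ["USA"] = Compose(s => s.ToUpper()),
            ["GBR"] = Compose(s => s.ToUpper()),
            ["DEU"] = Compose(s => s.ToUpper(), s => s.Trim()),
        };

    // Countries without an entry get the unchanged text back.
    public static Func<string, string> For(string country) =>
        ByCountry.TryGetValue(country, out var pipeline) ? pipeline : Identity;
}
```

Pipelines.For("DEU")("  abc  ") returns "ABC", and an unlisted country such as "FRA" gets the identity function, so the text comes back as it went in.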

Solution 9 - C#

> I’m sorry that I long ago coined the term “objects” for this topic because it gets many people to focus on the lesser idea. The big idea is messaging.
>
> ~ Alan Kay, On Messaging

I would simply implement routines Capitalise, RemovePunctuation etc. as subprocesses that can be messaged with a text and country parameters, and would return a processed text.

Use dictionaries to group countries that fit a specific attribute (if you prefer lists, that would work as well with only a slight performance cost). For example: CapitalisationApplicableCountries and PunctuationRemovalApplicableCountries.

/// Runs like a pipe: passing the text through several stages of subprocesses
public string Process(string country, string text)
{
    text = Capitalise(country, text);
    text = RemovePunctuation(country, text);
    // And so on and so forth...

    return text;
}

private string Capitalise(string country, string text)
{
    if ( ! CapitalisationApplicableCountries.ContainsKey(country) )
    {
        /* skip */
        return text;
    }

    /* do the capitalisation */
    return capitalisedText;
}

private string RemovePunctuation(string country, string text)
{
    if ( ! PunctuationRemovalApplicableCountries.ContainsKey(country) )
    {
        /* skip */
        return text;
    }

    /* do the punctuation removal */
    return punctuationFreeText;
}

private string Replace(string country, string text)
{
    // Implement it following the pattern demonstrated earlier.
}

Solution 10 - C#

I feel that the information about the countries should be kept in data, not in code. So instead of a CountryInfo class or CapitalisationApplicableCountries dictionary, you could have a database with a record for each country and a field for each processing step, and then the processing could go through the fields for a given country and process accordingly. The maintenance is then mainly in the database, with new code only needed when new steps are needed, and the data can be human readable in the database. This assumes the steps are independent and don't interfere with each other; if that is not so things are complicated.
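To sketch what that could look like in code: the example below uses an in-memory array of comma-separated rows as a stand-in for the database table. The schema (country, capitalise, removePunctuation), the row values and the DataDrivenProcessor name are all assumptions for illustration, and the two step implementations are placeholders for the real ones.

```csharp
using System;
using System.Linq;

public static class DataDrivenProcessor
{
    // Stand-in for rows read from a database table.
    // Hypothetical columns: country, capitalise, removePunctuation.
    private static readonly string[] Rows =
    {
        "USA,true,true",
        "DEU,false,true",
        "FRA,false,false",
    };

    // One step per flag column, in the same order as the columns.
    private static readonly Func<string, string>[] Steps =
    {
        text => text.ToUpperInvariant(),
        text => new string(text.Where(c => !char.IsPunctuation(c)).ToArray()),
    };

    public static string Process(string country, string text)
    {
        var row = Rows.Select(r => r.Split(','))
                      .FirstOrDefault(fields => fields[0] == country);
        if (row == null)
            throw new ArgumentException($"No rules stored for '{country}'.", nameof(country));

        // Apply each step whose flag is set for this country.
        for (var i = 0; i < Steps.Length; i++)
        {
            if (bool.Parse(row[i + 1]))
                text = Steps[i](text);
        }
        return text;
    }
}
```

With these sample rows, DataDrivenProcessor.Process("USA", "Hi, there!") returns "HI THERE" and DataDrivenProcessor.Process("FRA", "Hi!") returns "Hi!". Supporting another country means adding a row; code changes only when a new step (a new column) is introduced.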

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

| Content Type | Original Author | Original Content on Stackoverflow |
| --- | --- | --- |
| Question | John Darvill | View Question on Stackoverflow |
| Solution 1 - C# | Michał Turczyn | View Answer on Stackoverflow |
| Solution 2 - C# | Damien_The_Unbeliever | View Answer on Stackoverflow |
| Solution 3 - C# | Corentin Pane | View Answer on Stackoverflow |
| Solution 4 - C# | Reinstate Monica Cellio | View Answer on Stackoverflow |
| Solution 5 - C# | Christopher | View Answer on Stackoverflow |
| Solution 6 - C# | Robert Bräutigam | View Answer on Stackoverflow |
| Solution 7 - C# | Frank | View Answer on Stackoverflow |
| Solution 8 - C# | jlvaquero | View Answer on Stackoverflow |
| Solution 9 - C# | Igwe Kalu | View Answer on Stackoverflow |
| Solution 10 - C# | Steve J | View Answer on Stackoverflow |