An Interface Segregation Challenge: Part 1

Chris Rodgers · 20 April, 2017

A couple of years ago I created an extension to the wonderful CsvHelper to allow support for Excel. The project has very few users (~13k downloads from nuget) and, aside from bugfixes, hasn’t changed much since the initial implementation.

One of the minor issues encountered during the initial implementation was having to implement interface members that weren’t actually required by the classes that would make use of the implementations, and didn’t actually make sense to implement. Classic interface segregation principle breakage!

A note before I continue: none of this is meant as a criticism of the underlying library. Quite the opposite, in fact - the library itself is very well designed and the code is of a uniformly high standard! For the internal use cases, why shouldn’t the interfaces have all of the members that may be required?

For example: the interface ICsvParser requires you to implement the member long CharPosition { get; }. The class CsvReader does not actually make use of this member, so it’s a trivial matter to just return a default value. However, in the beta’s for version 3, more members have been introduced, such as TextReader TextReader { get; }. This makes it look more attractive to make changes, particularly as a major version increment is more accomodating of breaking changes.

For some background, my Excel extension provides implementations of ICsvParser and ICsvSerializer which are used by CsvReader and CsvWriter respectively. The primary issue then, with the ICsvParser and ICsvSerializer interfaces.

I started by identifying the members of each interface that are actually required by CsvReader and CsvWriter.

public interface ICsvParser : IDisposable
{
    TextReader TextReader { get; }
    ICsvParserConfiguration Configuration { get; }
    long CharPosition { get; }
    long BytePosition { get; }
    int Row { get; }
    int RawRow { get; }
    string RawRecord { get; }
    string[] Read();
}

From the members defined by ICsvParser only Configuration, Row and Read are required by CsvReader, meaning TextReader, CharPosition, BytePosition, RawRow and RawRecord can be pulled out to another interface. In this scenario I actually pulled Configuration, Row and Read out to another interface IParser and had ICsvParser inherit from it. This means CsvReader can just depend on the IParser interface, making the rest of the values optional.

The next stage is to make it so that that rest of the members are accessible without casting or any other unpleasantness. I planned to do this by making CsvReader an abstract generic class, where the parser is a generic type argument. I also added a concrete class of the same name containing all of the custom constructors - this allows the end users a seamless experience.

public interface ICsvParser : IParser
{
    TextReader TextReader { get; }
}
public interface IExcelParser : IParser
{
    XLWorkbook Workbook { get; }
}
public abstract class CsvReader<TParser> : ICsvReader<TParser> where TParser : IParser
{
    public TParser Parser { get; }
}
public class CsvReader : CsvReader<ICsvParser>
{ 
    // The same api surface as previous
}
public class ExcelReader : CsvReader<IExcelParser>
{ 
    // A potential implementation for my extension.
}

This means all of the members of ICsvParser can continue to be accessed from the Parser property on CsvReader - for example csvReader.Parser.TextReader and the (hypothetical) members of IExcelParser can be accessed in a similar manner, for example excelReader.Parser.Workbook.

I approached ICsvSerializer in a similar manner.

public interface ICsvSerializer : IDisposable
{
    TextWriter TextWriter { get; }
    ICsvSerializerConfiguration Configuration { get; }
    void Write( string[] record );
}

Of these members, only Configuration and Write are used by the CsvWriter class, so they can be safely extracted to new interface (ISerializer) from which ICsvSerializer will inherit.

I made similar changes to CsvWriter, making it abstract and giving it a type argument representing the type of serializer it works with. A concrete derived type with the same name provides seamless access to the same api surface area as before.

I have submitted all of this in a pull request, but that doesn’t make it a given that it will be accepted in its current format, or indeed, at all.

Read part 2

Twitter, Facebook