A couple of years ago I created an extension to the wonderful CsvHelper to allow support for Excel. The project has very few users (~13k downloads from nuget) and, aside from bugfixes, hasn’t changed much since the initial implementation.
One of the minor issues encountered during the initial implementation was having to implement interface members that weren’t actually required by the classes that would make use of the implementations, and didn’t actually make sense to implement. Classic interface segregation principle breakage!
A note before I continue: none of this is meant as a criticism of the underlying library. Quite the opposite, in fact - the library itself is very well designed and the code is of a uniformly high standard! For the internal use cases, why shouldn’t the interfaces have all of the members that may be required?
For example: the interface ICsvParser
requires you to implement the member long CharPosition { get; }
. The class CsvReader
does not actually make use of this member, so it’s a trivial matter to just return a default value. However, in the beta’s for version 3, more members have been introduced, such as TextReader TextReader { get; }
. This makes it look more attractive to make changes, particularly as a major version increment is more accomodating of breaking changes.
For some background, my Excel extension provides implementations of ICsvParser
and ICsvSerializer
which are used by CsvReader
and CsvWriter
respectively. The primary issue then, with the ICsvParser
and ICsvSerializer
interfaces.
I started by identifying the members of each interface that are actually required by CsvReader
and CsvWriter
.
public interface ICsvParser : IDisposable
{
TextReader TextReader { get; }
ICsvParserConfiguration Configuration { get; }
long CharPosition { get; }
long BytePosition { get; }
int Row { get; }
int RawRow { get; }
string RawRecord { get; }
string[] Read();
}
From the members defined by ICsvParser
only Configuration
, Row
and Read
are required by CsvReader
, meaning TextReader
, CharPosition
, BytePosition
, RawRow
and RawRecord
can be pulled out to another interface. In this scenario I actually pulled Configuration
, Row
and Read
out to another interface IParser
and had ICsvParser
inherit from it. This means CsvReader
can just depend on the IParser
interface, making the rest of the values optional.
The next stage is to make it so that that rest of the members are accessible without casting or any other unpleasantness. I planned to do this by making CsvReader
an abstract generic class, where the parser is a generic type argument. I also added a concrete class of the same name containing all of the custom constructors - this allows the end users a seamless experience.
public interface ICsvParser : IParser
{
TextReader TextReader { get; }
}
public interface IExcelParser : IParser
{
XLWorkbook Workbook { get; }
}
public abstract class CsvReader<TParser> : ICsvReader<TParser> where TParser : IParser
{
public TParser Parser { get; }
}
public class CsvReader : CsvReader<ICsvParser>
{
// The same api surface as previous
}
public class ExcelReader : CsvReader<IExcelParser>
{
// A potential implementation for my extension.
}
This means all of the members of ICsvParser
can continue to be accessed from the Parser
property on CsvReader
- for example csvReader.Parser.TextReader
and the (hypothetical) members of IExcelParser
can be accessed in a similar manner, for example excelReader.Parser.Workbook
.
I approached ICsvSerializer
in a similar manner.
public interface ICsvSerializer : IDisposable
{
TextWriter TextWriter { get; }
ICsvSerializerConfiguration Configuration { get; }
void Write( string[] record );
}
Of these members, only Configuration
and Write
are used by the CsvWriter
class, so they can be safely extracted to new interface (ISerializer
) from which ICsvSerializer
will inherit.
I made similar changes to CsvWriter
, making it abstract and giving it a type argument representing the type of serializer it works with. A concrete derived type with the same name provides seamless access to the same api surface area as before.
I have submitted all of this in a pull request, but that doesn’t make it a given that it will be accepted in its current format, or indeed, at all.
Read part 2