Frank’s Scratchpad: generics

Showing posts with label generics. Show all posts

Sunday, April 25, 2010

A minimalistic native 64 bit array implementation for .NET (updated code)

If you ever felt the need to process huge amounts of data via a algorithm implemented using .NET/the CLR, you’ve surely ran into the 2^31-items-limit of the CLR’s current array implementation that only supports Int32 array indices (this also affects other collections like List<T> as those use standard arrays for storage internally).
You can try to circumvent this limitation by implementing your own array-like data-structure, either by emulating continuous storage via a collection of standard .NET-arrays (partition your data in chunks with 2^31 items each), or you can use native APIs and some evil pointer arithmetic to get maximum performance.
A while ago I tried to implement the latter approach in C#, which isn’t a big deal, only a matter of some unsafe-blocks for pointer arithmetic and a call to Marshal.AllocHGlobal for allocating memory on the unmanged heap. However, when I tried to make that custom collection into a generic one, I ran into an unsolvable problem:

public unsafe T this[long index]

get

        return *((T*)pBase + index);

set

        *((T*)pBase + index) = value;

This code does not compile. The reason for that is, that there is no way to tell the C# compiler that T shall be constrained to unmanaged types.
Interestingly, F# 2.0 does feature such a constraint! This is how a minimalistic F# implementation of such an native 64 bit array could look like:

namespace NativeTools

#nowarn "9"

#nowarn "42"

open System

open System.Runtime

open Microsoft.FSharp.NativeInterop

module internal PointerArithmetic =

    [<CompiledName("AddIntPtrToIntPtr")>]

    [<Unverifiable>]

    let inline addNativeInt (x: nativeptr<'T>) (n: nativeint) : nativeptr<'T> =

        (NativePtr.toNativeInt x) + n * (# "sizeof !0" type('T) : nativeint #) |> NativePtr.ofNativeInt

    // "reinterpret_cast<IntPtr>(x)"... EVIL!

    [<CompiledName("Int64ToIntPtr")>]

    [<Unverifiable>]

    let inline int64ToNativeint (x: int64) = (# "" x : nativeint #)

    [<CompiledName("AddInt64ToIntPtr")>]

    [<Unverifiable>]

    let inline addInt64 (x: nativeptr<'a>) (o: int64) : nativeptr<'a> = addNativeInt x (int64ToNativeint o)

[<Sealed>]

type NativeArray64<'T when 'T: unmanaged>(length: int64) =

    let itemSize: int64 = (int64)(InteropServices.Marshal.SizeOf(typeof<'T>))

    let mutable isDisposed = false

    let allocatedBytes = length * itemSize

    let blob = InteropServices.Marshal.AllocHGlobal(nativeint allocatedBytes)

    let pBlobBase: nativeptr<'T> = NativePtr.ofNativeInt blob

    let disposeLock = new Object()

    member this.Length = length

    member this.BaseAddress = pBlobBase

    member this.ItemSize = itemSize

    member this.IsDisposed = isDisposed

    member this.AllocatedBytes = allocatedBytes

    member private this.Free () =

        lock disposeLock (fun () ->

            if isDisposed

                then ()

                else InteropServices.Marshal.FreeHGlobal blob

                     isDisposed <- true

    member this.Item

        with get (idx: int64) =

                        NativePtr.read (PointerArithmetic.addInt64 pBlobBase idx)

        and  set (idx: int64) (value: 'T) =

                        NativePtr.write (PointerArithmetic.addInt64 pBlobBase idx) value

    member private this.Items = seq {

            for i in 0L .. length - 1L do

                yield this.[i]

    override this.Finalize () = this.Free()

    interface IDisposable with

        member this.Dispose () =

            GC.SuppressFinalize this

            this.Free()

    interface Collections.Generic.IEnumerable<'T> with

        member this.GetEnumerator () : Collections.Generic.IEnumerator<'T> =

            this.Items.GetEnumerator()

        member this.GetEnumerator () : Collections.IEnumerator =

            this.Items.GetEnumerator() :> Collections.IEnumerator

UPDATE 2010-04-25: Removed a few bugs.

You can use this data structure in your C# code like a normal array:

var length = 8L * 1024L * 1024L * 1024L;

// allocate a byte-array of 8 GiB

using(arr = new NativeTools.NativeArray64<byte>(length))

    arr[0] = 123;

    arr[length-1] = 222;

    Console.WriteLine("Allocated " + arr.AllocatedBytes);

// auto-disposed ...

Saturday, April 11, 2009

Generic .NET Math — First Benchmarks

As promised, I did some first benchmarking of different methods to implement generic arithmetic for .NET. I implemented simple vector classes using the Operator<T> class found in the MiscUtil library, another generic vector class using the approach described in my last posting (using INumeric/Calculators) and compared that to a completely non-generic vector implementation. The benchmarking method solved the following recursive problem for i = 100,000,000 on a Core 2 Quad Q9550:

sum_i := sum_(i-1) + <u_i, v> + <u_i, t>
u_i := u_(i-1) * sum_i

As the figure above shows, the non-generic version is far superior to the other implementations, especially in Long Mode (x64). The x86 version of the interface constraint approach (INumeric/Calculator) is only a tiny bit faster than MiscUtil’s Operator<T> class using runtime code generation. For x64 however, it’s about twice as fast.

References:

Friday, April 10, 2009

Generic Arithmetic for .NET

To me, one of the major design flaws and sources of frustration in .NET’s generics system is the absence of suppport for some sort of common arithmetic interface for numeric types (float, double, int, etc.).

But why should anyone care about such things? Well, the problem becomes obvious as soon as you try to implement—say—a generic vector class:

(Note: I'll only show the relavant parts of the code here to avoid clutter and confusion)

   1:  // a vector in R^n

   2:  class Vector<T>

   3:  {

   4:      T[] components;

5:

   6:      // ctors, properties etc. ...

7:

   8:      // operator defintion, here: vector addition

   9:      public staticVector<T> operator +(Vector<T> lhs,

  10:                                        Vector<T> rhs)

  11:      {

  12:          Vector<T> res = new Vector<T>(lhs.Length);

13:

  14:          for (int i = 0; i < res.Length; ++i)

  15:          {

  16:                       // compiler error!

  17:              res[i] = lhs[i] + rhs[i];

  18:          }

19:

  20:          return res;

  21:      }

  22:  }

The reason for the compiler error is, that there’s simply not enough information available to the compiler to check, whether T supports op_Addition. Now, if the CLI designers had included some interface definition like

   1:  interface INumeric<T>

   2:  {

   3:      T Add(T rhs);

   4:      T Sub(T rhs);

   5:      T Mul(T rhs);

   6:      T Div(T rhs);

   7:      // etc.

... (or similar) and if they had implemented this interface for every numeric primitive type, we could simply use an interface constraint to solve our problem:

   1:  // a vector in R^n

   2:  class Vector<T> where T: INumeric<T>

   3:  {

   4:      T[] components;

5:

   6:      // ctors, properties ...

7:

   8:      public static Vector<T> operator +(Vector<T> lhs, Vector<T> rhs)

   9:      {

  10:          Vector<T> res = new Vector<T>(lhs.Length);

11:

  12:          for (int i = 0; i < res.Length; ++i)

  13:          {

  14:              res[i] = lhs[i].Add(rhs[i]);       // OK, T implements INumeric<T>

  15:          }

16:

  17:          return res;

  18:      }

  19:  }

(Note: Operators are always statically defined in .NET, hence we can only use normal prefix method calls as static methods can never be virtual, thus neither abstract nor part of an interface defintion.)

Unfortunately, such an interface does not exist. Several people already tried to circumvent this limitation. One of the first approaches uses special numerical interfaces that implement arithmetic operations for concrete types. This technique works reasonably well, besides having the disadvantage of forcing the user to explicitly provide an implementation of such an interfaces (or at least a type name of a type implementing the according interface).

More recently, a new solution came up, which uses runtime code generation by building and compiling expression trees on the fly (or using LCG for older .NET version). This indeed is a very elegant solution to the problem. Using an implementation of this concept like the one found in the MiscUtil library you can simply write something like

   1:  T sum = Operator<T>.Add(lhs, rhs);

... inside your generic math algorithms.

The appropriate "Add" method for type T is generated on-the-fly upon the first call and cached for later reuse. The drawback of this method is the need for delegate invocations, which tend to be a lot more expensive than interface calls (at least my first benchmarks indicate that, especially on x64; curiously, the effect is much less severe on x86).

Interestingly, the latest F# CTP also provides generic vector and matrix classes (namespace Microsoft.FSharp.Math in assembly FSharp.PowerPack.dll). You can instantiate e.g. matrices like that:

   1:  Vector<float> v = VectorModule.Generic.create<float>(3, 3, 0f);

This returns a 3×3 matrix of floats initialized to 0.0f. So how did they do that? As far as I understand, their approach is very similar to the first method I described (using interfaces to define "calculator" types). To avoid having the user explicitly specify the operators’ implementation they use some kind of “type registry”, mapping each known numeric type to an appropriate implementation of an interface that defines basic arithmetic operations.

I’ve tried to do a quick n’ dirty implementation of this idea in C# which resulted in something like this (I haven’t yet checked if this actually runs...):

   1:  namespace GenericMath

   2:  {

   3:      public interface INumeric<T>

   4:      {

   5:          T Add(T lhs, T rhs);

   6:          T Sub(T lhs, T rhs);

   7:          T Mul(T lhs, T rhs);

   8:          T Div(T lhs, T rhs);

   9:          T Abs(T x);

  10:          T Zero { get; }

  11:          T One { get; }

  12:      }

13:

  14:      class FloatNumerics: INumeric<float>

  15:      {

  16:          public float Add(float lhs, float rhs)

  17:          {

  18:              return lhs + rhs;

  19:          }

20:

  21:          public float Sub(float lhs, float rhs)

  22:          {

  23:              return lhs - rhs;

  24:          }

25:

  26:          public float Mul(float lhs, float rhs)

  27:          {

  28:              return lhs * rhs;

  29:          }

30:

  31:          public float Div(float lhs, float rhs)

  32:          {

  33:              return lhs / rhs;

  34:          }

35:

  36:          public float Abs(float x)

  37:          {

  38:              return Math.Abs(x);

  39:          }

40:

  41:          public float Zero

  42:          {

  43:              get { return 0f; }

  44:          }

45:

  46:          public float One

  47:          {

  48:              get { return 1f; }

  49:          }

  50:      }

51:

  52:      // implementations for other types ...

53:

  54:      public static class ArithmeticAssociations

  55:      {

  56:          static readonly IDictionary<Type, object> operationsDict = new Dictionary<Type, object>

  57:          {

  58:              { typeof(float), new FloatNumerics() },

  59:              { typeof(double), new DoubleNumerics() },

  60:              { typeof(int), new Int32Numerics() }

  61:          };

62:

  63:          static string numericIfaceName = typeof(INumeric<>).FullName;

  64:          static string numericIfaceNameShort = typeof(INumeric<>).Name;

65:

  66:          public static bool UnregisterType<T>()

  67:          {

  68:              return operationsDict.Remove(typeof(T));

  69:          }

70:

  71:          public static void RegisterType<T>(INumeric<T> arithmetics)

  72:          {

  73:              operationsDict[typeof(T)] = arithmetics;

  74:          }

75:

  76:          public static INumeric<T> TryGetNumericOps<T>()

  77:          {

  78:              Type t = typeof(T);

79:

  80:              if (!operationsDict.ContainsKey(t))

  81:              {

  82:                  throw new ArgumentException("No implementation of "

  83:                      + numericIfaceNameShort + "<" + t.Name + "> registered.");

  84:              }

85:

  86:              Type opsType = operationsDict[t].GetType();

87:

  88:              if (opsType.GetInterface(numericIfaceName) == null)

  89:              {

  90:                  throw new ArgumentException("Arithmetic object associated with "

  91:                      + t.Name + " does not implement " + numericIfaceNameShort);

  92:              }

93:

  94:              return (INumeric<T>)operationsDict[t];

  95:          }

  96:      }

97:

  98:      // a sample class that uses INumeric + ArithmeticAssociations

  99:      public sealed class Vector<T>

 100:      {

 101:          static INumeric<T> ops = ArithmeticAssociations.TryGetNumericOps<T>();

 102:

 103:          T[] components;

 104:

 105:          public T this[int i]

 106:          {

 107:              get { return components[i]; }

 108:              set { components[i] = value; }

 109:          }

 110:

 111:          public int Length { get { return components.Length; } }

 112:

 113:          public Vector(int size)

 114:          {

 115:              components = new T[size];

 116:          }

 117:

 118:          public static T operator *(Vector<T> lhs, Vector<T> rhs)

 119:          {

 120:              T res = ops.Zero;

 121:

 122:              for (int i = 0; i < lhs.Length; ++i)

 123:              {

 124:                  res = ops.Add(res, ops.Mul(lhs[i], rhs[i]));

 125:              }

 126:

 127:              return res;

 128:          }

 129:

 130:          public static Vector<T> operator *(T lambda, Vector<T> vec)

 131:          {

 132:              Vector<T> result = new Vector<T>(vec.Length);

 133:

 134:              for (int i = 0; i < result.Length; ++i)

 135:              {

 136:                  result[i] = ops.Mul(vec[i], lambda);

 137:              }

 138:

 139:              return result;

 140:          }

 141:

 142:          // other properties, methods and operators ...

 143:      }

 144:

 145:      // ...

 146:  }

Objects of the type Vector can now be created by this simple line:

   1:  Vector<float> v = new Vector<float>(3);

As you can see, this is totally transparent to the user (no need to specifiy a "Calculator" type parameter) and doesn’t have the potential performance problems of the delegate/Func method.

Maybe someone finds this useful and can even improve the idea. I also plan to do a performance comparision of the different methods soon.

Update 2009-04-11:
I found another nice blog entry that covers the same topic and also presents a solution similar to the one showed above.