You would probably hear about new types in
C# 8 such as Index
and Range.
If you answer no, it’s high time to take a look and see how they help us with collections
in C#. Also for supporting these types Microsoft added two new operators such
as ^ and ..
- System.Index represents an index into a sequence.
- The index from end operator ^, which specifies that an index is relative to the end of the sequence.
- System.Range represents a sub range of a sequence.
- The range operator .., which specifies the start and end of a range as its operands.
System.Index operator ^(int
fromEnd);
System.Range operator ..(Index start = 0,
Index end = ^0);
Microsoft came up with the idea of using
this sequence quite a long time ago. They have already implemented work with
sequences in F#
// Sequence that has an increment.
seq { 0 .. 10 .. 100 }
seq { for
i in 1 .. 10 -> i * i }
Probably if you work with Python, you will
probably use or see similar functionality there:
>>>
"Hello,
world!"[3:9]
'lo, wo'
>>>
string = "Hello,
world!"
>>>
string[:5]
'Hello'
>>>
string[-6:-1]
'world'
>>>
string[-9:]
'o, world!'
>>>
string[:-8]
'Hello'
>>>
string[:]
'Hello, world!'
So, let’s take a look how it was
implemented in C#.
static void Main(string[] args)
{
var words = new string[]
{
// index from
start index from end
"The", // 0 ^9
"quick", // 1 ^8
"brown", // 2 ^7
"fox", // 3 ^6
"jumped", // 4 ^5
"over", // 5 ^4
"the", // 6 ^3
"lazy", // 7 ^2
"dog" // 8 ^1
}; // 9 (or
words.Length) ^0
Console.WriteLine($"The last
word is {words[^1]}");
var quickBrownFox = words[1..4];
var lazyDog = words[^2..^0];
var allWords = words[..]; // contains
"The" through "dog".
var firstPhrase = words[..4]; // contains
"The" through "fox"
var lastPhrase = words[6..]; // contains
"the", "lazy" and "dog"
Range phrase = 1..4;
var text = words[phrase];
}
This example was grabbed from Microsoft
documentation about C#
8 new features. I would suggest checking
out how it works under the hood.
private static void Main(string[] args)
{
string[] obj = new string[9];
obj[0] = "The";
obj[1] = "quick";
obj[2] = "brown";
obj[3] = "fox";
obj[4] = "jumped";
obj[5] = "over";
obj[6] = "the";
obj[7] = "lazy";
obj[8] = "dog";
string[] array = obj;
string[] array2 = array;
Console.WriteLine("The last word
is "
+ array2[new Index(1, true).GetOffset(array2.Length)]);
string[] subArray = RuntimeHelpers.GetSubArray(array, new Range(1, 4));
string[] subArray2 = RuntimeHelpers.GetSubArray(array, new Range(new Index(2, true), new Index(0, true)));
string[] subArray3 = RuntimeHelpers.GetSubArray(array, Range.All);
string[] subArray4 = RuntimeHelpers.GetSubArray(array, Range.EndAt(4));
string[] subArray5 = RuntimeHelpers.GetSubArray(array, Range.StartAt(6));
Range range = new Range(1, 4);
string[] subArray6 = RuntimeHelpers.GetSubArray(array, range);
}
In this example, you will find how ^
operator works. In original code we have line of code for output the text “The
last word is dog”.
Console.WriteLine($"The last
word is {words[^1]}");
Compiler just creates extension method
using new type Index and function GetOffset.
Console.WriteLine("The last word
is "
+ array2[new Index(1, true).GetOffset(array2.Length)]);
If we decide to take 3 elements from the
end, compiler will generate for us something like this.
Original:
Console.WriteLine($"The last
word is {words[^3]}");
Compiler generated:
Console.WriteLine("The last word
is "
+ array2[new Index(3, true).GetOffset(array2.Length)]);
Here the compiler creates Index structure
with the first parameter index value which has to be more than zero, and the
second parameter indicating the index direction, false means from the start and true
means that we start from the end of index.
//
// Summary:
//
Initializes a new System.Index with a specified index position and a
value that
// indicates
if the index is from the start or the end of a collection.
//
// Parameters:
// value:
// The index
value. It has to be greater or equal than zero.
//
// fromEnd:
// A boolean
indicating if the index is from the start (false) or from the end (true)
// of a
collection.
public Index(int value, bool fromEnd = false);
And function GetOffset calculates the
offset from the start using the given collection length.
The
new indexer will be implemented by converting the argument of type Index into
an int and emitting a call to the int based indexer. For discussion purposes, let’s
use the example of receiver[expr]. The conversion of expr to int will occur as
follows:
- When the argument is of the form ^expr2 and the type of expr2 is int, it will be translated to receiver.Length - expr2.
- Otherwise, it will be translated as expr.GetOffset(receiver.Length).
Range:
C#
has no syntactic way to access "ranges" or "slices" of
collections. Usually users are forced to implement complex structures to
filter/operate on slices of memory, or resort to LINQ methods like
list.Skip(5).Take(2). With the addition of System.Span<T> and other
similar types, it becomes more important to have this kind of operation
supported on a deeper level in the language/runtime, and have the interface
unified.
The
language will introduce a new range operator x..y. It is a binary infix operator
that accepts two expressions. Either operand can be omitted (examples below),
and they have to be convertible to System.Index. It will be lowered to the
appropriate System.Range factory method call.
In
example above for creating new Range structure we copied the information from
Index(2, true) and Index(0, true) (where Index(0, true) mean .Length of your
collection)
var lazyDog = words[^2..^0];
So, compiler generated for you code like
this:
string[] subArray2 = RuntimeHelpers.GetSubArray(array, new Range(new Index(2, true), new Index(0, true)));
Here you can find how this GetSubArray
function is implemented:
public static T[] GetSubArray<T>(T[] array, Range range)
{
if (array == null)
{
ThrowHelper.ThrowArgumentNullException(ExceptionArgument.array);
}
(int offset, int length) = range.GetOffsetAndLength(array.Length);
if (default(T)! != null || typeof(T[]) == array.GetType()) // TODO-NULLABLE:
default(T) == null warning (https://github.com/dotnet/roslyn/issues/34757)
{
// We know the type
of the array to be exactly T[].
if (length == 0)
{
return Array.Empty<T>();
}
var dest = new T[length];
Buffer.Memmove(
ref Unsafe.As<byte, T>(ref dest.GetRawSzArrayData()),
ref Unsafe.Add(ref Unsafe.As<byte, T>(ref array.GetRawSzArrayData()), offset),
(uint)length);
return dest;
}
else
{
// The array is
actually a U[] where U:T.
T[] dest = (T[])Array.CreateInstance(array.GetType().GetElementType()!, length);
Array.Copy(array, offset, dest, 0, length);
return dest;
}
}
Net core compiler (roslyn) under the hood
uses Array.Copy to clone your information from one array to another.
Range structure has three important
properties called All, End, Start.
var allWords = words[..]; // contains
"The" through "dog".
var firstPhrase = words[..4]; // contains
"The" through "fox"
var lastPhrase = words[6..]; // contains
"the", "lazy" and "dog"
And here you will see how compiler uses
properties described above for generated code.
string[] subArray3 = RuntimeHelpers.GetSubArray(array, Range.All);
string[] subArray4 = RuntimeHelpers.GetSubArray(array, Range.EndAt(4));
string[] subArray5 = RuntimeHelpers.GetSubArray(array, Range.StartAt(6));
These EndAt() and StartAt() methods are
called the property, which I have mentioned above.
/// <summary>Create a Range
object starting from start index to the end of the collection.</summary>
public static Range StartAt(Index start) => new Range(start, Index.End);
/// <summary>Create a Range
object starting from first element in the collection to the end Index.</summary>
public static Range EndAt(Index end) => new Range(Index.Start, end);
Range constructor has two Index parameters
which allow you to setup start and end index range.
/// <summary>Construct a Range
object using the start and end indexes.</summary>
/// <param name="start">Represent the
inclusive start index of the range.</param>
/// <param name="end">Represent the exclusive
end index of the range.</param>
public Range(Index start, Index end)
{
Start = start;
End = end;
}
Here are a few examples how you can
combine and use your Range in C#.
Range phrase = 1..4;
Range phrase1 = 1..^3;
Range phrase2 = ^3..^1;
Range slice1 = 2..^3; //Range.Create(2,
Index.CreateFromEnd(3))
Range slice2 = ..^3; //Range.ToEnd(Index.CreateFromEnd(3))
Range slice3 = 2..; //Range.FromStart(2)
Range slice4 = ..; //Range.All
You
need to know some important things about Index and Range support in existing
library types.
Implicit
Index support
If
you want to use Index implicitly in you code, your .net code should meet the
following criteria:
- The type is Countable.
- The type has an accessible instance indexer which takes a single int as the argument.
- The type does not have an accessible instance indexer which takes an Index as the first parameter. The Index must be the only parameter or the remaining parameters must be optional.
By
Countable type Microsoft means the
type which support has property named Length
or Count with an accessible getter
and a return type of int.
Implicit Range support
If
you want to use Range implicitly in
you code, you .net code should meet the following criteria:
- The type is Countable.
- The type has an accessible member named Slice which has two parameters of type int.
- The type does not have an instance indexer which takes a single Range as the first parameter. The Range must be the only parameter or the remaining parameters must be optional.
For
such types, the language will bind as if there is an index member of the form T
this [Range range] where T is the return type of the Slice method including any
ref style annotations. The new member will also have matching accessibility
with Slice. Below you will find example provided by Microsoft how you can
implement your type for supporting Range
type.
C#
language has special cases of the following known types:
- string: the method Substring will be used instead of Slice.
- array: the method System.Reflection.CompilerServices.GetSubArray will be used instead of Slice.
So,
if you want to create your own collections and want them to support Index and
Range, your collection must be countable and support slicing (methods Slice(),
Substring(), GetSubArray() or indexer T this[Range range]).
Conclusions:
The
new operators (^ and ..) are syntactic sugar. The functionality
can be implemented by explicit calls to Index
and Range factory methods, but it
will result in a lot more boilerplate code, and the experience will be
unintuitive. In addition, I would like to say that C# tries to grab the best
things from functional languages to C# to improve development experience. I
like this “syntactic” sugar and I will definitely use this syntax to improve
existing codebase.
No comments:
Post a Comment