Sterling has too many projects Blogging about programming, microcontrollers & electronics, 3D printing, and whatever else...

Reintroducing ArrayHash

One of the earliest modules I wrote for Raku (then Perl 6) was ArrayHash. ArrayHash is basically a Hash that preserves key insertion order. I don’t remember what I originally wrote it to do, but I do still use it in one application I use every year or so to help me with a complicated task at work.

I don’t actually think the work itself is very interesting, but you might learn some bits about Raku internals through my work, so I hope this is useful to someone.

Going back to ArrayHash, the problem with it was that I wrote it at the same time that Perl 6 was initially being released. This means the code was mostly written before the GLR, the Great List Refactor. It also means that it was written a long time ago, and Raku has come a long way since then. I don’t have any particular reason to work on this thing, but it seemed like a fun project to dink around with right now, so I did.

To solve the problems it had, I re-engineered it. I didn’t fully rewrite it, but I did decide to go back to first principles to make sure it works like Raku’s Array and Hash do now. Since the initial writing, all sorts of details like how binding works got rejiggered a little, the way list flattening works got firmed up, etc. I wanted to take that into account. I think I got it mostly right.

The major changes are to the way arguments are passed to constructors and methods:

  1. You can no longer pass named arguments to any of the constructors or methods (unless it is an option modifying the constructor or method).

  2. You can no longer set up bound scalars during construction.

  3. You can now pass either Pairs or pairs of values to the constructors and methods.

Let’s consider each of those in detail.

Named Arguments No More

There are several reasons for this change. The clearest reason is that the constructors for neither Hash nor Array accept named arguments. For example:

my @array := Array.new(a => 1, 'b' => 2);
dd @array; #> Array element = [:b(2)]

The first argument to the Array constructor in the code above is a named argument. The second is a positional argument, a Pair object. The named argument is ignored, and the positional argument goes into the constructed value. This can be confusing, but when being passed to a function or method call, Raku treats anything in named form as a named argument:

Array.new(named-a => 1, :named-b(2), :named-c<3>);

All of those above are named forms. There are some workarounds to this. Consider the following modification to the first example:

my @array := Array.new( (a => 1, 'b' => 2) );
dd @array; #> Array element = [:a(1), :b(2)]

By adding that extra set of parentheses, you are changing the context for the Raku parser so that it now encounters Pairs in a list rather than named arguments in a call. In this case, any Pair will be constructed as a Pair. Then, because the arguments to new are flattened, the Pairs get passed in as positional arguments.

This is not my favorite nuance of Raku, but it is there, so you need to be aware of it.
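To make the distinction concrete, here is a minimal sketch in core Raku. The helper count-args is made up for illustration and is not part of ArrayHash; it only reports how many positional and how many named arguments the callee actually received.

```raku
# Hypothetical helper: returns (number of positionals, number of nameds).
sub count-args(*@pos, *%named) {
    return (@pos.elems, %named.elems);
}

say count-args(a => 1, 'b' => 2);     # (1 1): bare key is named, quoted key is positional
say count-args((a => 1), 'b' => 2);   # (2 0): a parenthesized Pair is positional
say count-args(:a(1), :b<2>);         # (0 2): colon-pair forms are named
say count-args(Pair.new('a', 1));     # (1 0): an explicit Pair object is positional
```

Quoting the key, wrapping the Pair in parentheses, or building it with Pair.new all get a Pair through as a positional argument.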

In addition to the obvious reason I gave above, there are a couple of other practical reasons. Named arguments are intended for use as settings and configuration, so using them to populate the ArrayHash presented problems. The ArrayHash constructor itself took a named argument, :multivalued. Because of that, passing a key named multivalued would likely not work as expected. This is not a problem if you use the multi-hash constructor, but it was still an overall inconsistency in the interface.

Another problem is that allowing named arguments meant testing was complicated. Named arguments have no order. In fact, the order will probably be different every time you run the program. Therefore, I had to do all sorts of acrobatics in the tests to be able to cope with some elements being inserted in a different order each time.

It also made certain things work unnaturally because it is perfectly legal to mix positional and named arguments while calling a function. But when the method receives the arguments, it only knows the order of the positional ones; the named ones have no position at all. This means someone might expect a named argument to come before a positional, but it never would. I would process the positionals in order first and then process the named arguments in whatever order I got them, which generally didn’t match the order given in the function call.
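A small sketch of that ordering problem, again in core Raku with a made-up helper (arrival-order is not ArrayHash code). It collects keys the way described above: positionals first in call order, then the named arguments, which arrive in an unordered Hash and are sorted here only so the output is stable.

```raku
# Hypothetical helper: lists the keys in the order the callee can recover them.
sub arrival-order(*@pos, *%named) {
    my @keys = @pos.map(*.key);       # positional Pairs keep their call order
    @keys.append: %named.keys.sort;   # named arguments have no order; sorted for stable output
    return @keys;
}

# The caller wrote z first, but the callee cannot know that.
say arrival-order(z => 26, 'a' => 1, y => 25, 'b' => 2);   # [a b y z]
```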

Anyway, this change makes everything cleaner and nicer.

No Binding at Construction Time

Previously, I did some acrobatics to ensure that values passed at construction time were bound. This was a mistake. It caused headaches. Binding has consequences and resulted in action at a distance. Consider this example:

my $b = 10;
my %hash := array-hash('a' => 1, 'b' => $b);
%hash<a> = 10;

The code above works in the new code because there is no binding. In the old code, though, it would have failed with an error like:

Cannot assign to an immutable value

This is because in Raku, a binding effectively ties two names to the same container. The container in the case of $b is a Scalar. The 1 value set on key 'a' has no container. Therefore, the binding is directly to the constant. Constants cannot be changed, so any attempt to change it will lead to the exception.

And so, to avoid these problems, and because neither Hash nor Array works anything like this, I have gotten rid of this mistake. If you want to bind now, you need to use the binding operator:
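The same container behavior can be seen with a plain Hash. This is a sketch in core Raku, not the old ArrayHash code, and the variable names are only for illustration.

```raku
my %h;
my $b = 10;

%h<b> := $b;     # bind the key to $b's Scalar container
%h<b> = 11;      # the assignment goes through the shared container
say $b;          # 11

%h<a> := 1;      # bind the key directly to the literal 1; no container is involved
my $failed = False;
try {
    %h<a> = 2;   # there is nothing to assign into, so this throws
    CATCH { default { $failed = True } }
}
say $failed;     # True
```

The first assignment is the action at a distance: writing to the hash silently changes $b. The second is the immutable-value failure.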

my $b = 10;
my %hash := array-hash('a' => 1);
%hash<b> := $b;

That works as expected.

Pairs and pairs

And now for a feature added rather than removed: You can send Pair objects or pairs of objects as arguments wherever elements are passed. This should work very similarly to how Hash constructors and methods work. For example:

my %hash := ArrayHash.new: 'a', 1, 'b' => 2, 'c', 'd' => 3;
%hash.push: 'e', 4, 'f' => 5;

Basically, all the methods now have the ability to take either explicit Pairs as arguments or pairs of objects that can be made into Pair objects. Just so you know the specifics, it works according to the following rules, which are similar to how these arguments work when passed to Hash methods.

The first item is expected to be a key or Pair:

  1. If expecting a key or Pair and a Pair object is encountered, that Pair is used. The next item is expected to be a key or Pair.

  2. If expecting a key or Pair and a non-Pair object is encountered, it will be treated as a key. The next item is expected to be a value.

  3. If expecting a value, any object (even a Pair) encountered next will be treated as the value of a newly constructed Pair combined with the immediately preceding key.

  4. If expecting a value and the end of the argument list is encountered, an X::Hash::Store::OddNumber exception will be thrown.
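The rules above can be sketched in core Raku as follows. This is only an illustration under my own assumptions, not the code ArrayHash actually uses: the helper name pairs-from-args is hypothetical, and the values passed for found and last when building the exception are a guess at sensible ones.

```raku
# Hypothetical helper, not part of ArrayHash: turns a mixed argument list into Pairs.
sub pairs-from-args(**@items) {
    my @queue = @items;
    my @pairs;
    while @queue {
        my $head = @queue.shift;
        if $head ~~ Pair {
            # Rule 1: a Pair where a key or Pair is expected is used as-is.
            @pairs.push: $head;
            next;
        }
        # Rule 4: a key with nothing after it is an error.
        X::Hash::Store::OddNumber.new(found => @items.elems, last => $head).throw
            unless @queue;
        # Rules 2 and 3: a non-Pair is a key, and whatever follows is its value.
        my $value = @queue.shift;
        @pairs.push: Pair.new($head, $value);
    }
    return @pairs;
}

my @p = pairs-from-args('a', 1, 'b' => 2, 'c', 'd' => 3);
say @p.map(*.key);       # (a b c)
say @p[2].value.key;     # d, because the Pair 'd' => 3 became the value for key 'c'
```

Note how the last Pair in the call is swallowed as a value: that is rule 3 at work, and it is the same thing that happens in the ArrayHash.new example above.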

So that is pretty much it. There are many other changes of smaller consequence. Please file a bug report if you see an issue with the changes.

Cheers.

The content of this site is licensed under Attribution 4.0 International (CC BY 4.0).