Larray (aka Larry)

Overview

Larray offers the notion of “ticks”, but the axes themselves are not named. The model seems to be something like data with coordinates

Importantly,

  • Pure Python implementation
  • Is not an ndarray
    • therefore, lots of redefined functionality
    • also lots of presumed intention of data (shuffling labels, group means, ...)
    • not lightweight
  • Does not offer named axes
  • Only one (class of) dtype!!
  • Can do n-D
  • Good mixed indexing

Construction

Larrays can be constructed from an array-like object and tick names for each axis. Alternatively, Larrays can be constructed from a number of data-with-coordinates representations.

Here’s how to create a larry using fromtuples (note the cast to float, and the filled-in NaN):

>>> data = [('a', 'a', 1), ('a', 'b', 2), ('b', 'a', 3)]
>>> larry.fromtuples(data)
label_0
    a
    b
label_1
    a
    b
x
array([[  1.,   2.],
       [  3.,  NaN]])

Here are examples of fromdict and fromlist:

>>> data = {('a', 'c'): 1, ('a', 'd'): 2, ('b', 'c'): 3, ('b', 'd'): 4}
>>> larry.fromdict(data)
label_0
    a
    b
label_1
    c
    d
x
array([[ 1.,  2.],
       [ 3.,  4.]])

>>> data = [[1, 2, 3, 4], [('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd')]]
>>> larry.fromlist(data)
label_0
    a
    b
label_1
    c
    d
x
array([[ 1.,  2.],
       [ 3.,  4.]])

Indexing

Indexing using the bracket syntax arr[ <slicer> ] seems to return you exactly what numpy would slice out of the underlying array. All slicing works, with the exception of “fancy” indexing, and ellipsis indexing, and the use of np.newaxis.

There is also a smart slicer riding along with the larrays that can slice with label information. It seems to nicely blend labels and regular integer slicing. To disambiguate possible integer labels and integer indexing, labels always must be enclosed in a list.:

>>> arr = la.larry(np.arange(6).reshape(2,3), [ ['u', 'v'], [2,5,3], ])
>>> arr
label_0
    u
    v
label_1
    2
    5
    3
x
array([[0, 1, 2],
       [3, 4, 5]])
>>> arr.lix[['u']]
label_0
    2
    5
    3
x
array([0, 1, 2])
>>> arr.lix[['u'],2:5]
2
>>> arr.lix[['u'],[2]:[5]]
0
>>> arr.lix[['u'],[2]:[3]]
label_0
    2
    5
x
array([0, 1])

Binary Operations (arithmetic)

Binary operations are not, in general, numpy-thonic

Alignment

Larray seems to want to only make binary operations on data with identical coordinates. Furthermore, it will re-align the data if necessary. Therefore, this example is ok:

>>> y1 = larry([1, 2], [['a', 'z']])
>>> y2 = larry([1, 2], [['z', 'a']])

What is y1 + y2?

>>> y1 + y2
label_0
    a
    z
x
array([3, 3])

But this fails

>>> z1 = larry([1, 2], [['a', 'b']])
>>> z2 = larry([3, 4], [['c', 'd']])

>>> z1 + z2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "la/la/deflarry.py", line 494, in __add__
    x, y, label = self.__align(other)
  File "la/la/deflarry.py", line 731, in __align
    raise IndexError, 'A dimension has no matching labels'
IndexError: A dimension has no matching labels

Intersections and Broadcasting

Binary ops can introduce an implicit intersection operation, for example (this would be illegal code in numpy):

>>> arr = la.larry(np.arange(6).reshape(2,3), [ ['u', 'v'], ['x','y','z']])
>>> arr2 = la.larry(np.arange(9).reshape(3,3), [ ['u', 'v', 'w'], ['x', 'y', 'z']] )
>>> arr2 + arr
label_0
    u
    v
label_1
    x
    y
    z
x
array([[ 0,  2,  4],
       [ 6,  8, 10]])

According to the matched-coordinates operation rule, broadcasting does not happen

>>> arr3 = la.larry([4,5,6], [['x','y','z']])
>>> arr3 + arr
------------------------------------------------------------
Traceback (most recent call last):
  File "<ipython console>", line 1, in <module>
  File "/Users/mike/usr/lib/python2.5/site-packages/la/deflarry.py", line 583, in __add__
    x, y, label = self.__align(other)
  File "/Users/mike/usr/lib/python2.5/site-packages/la/deflarry.py", line 820, in __align
    raise IndexError, msg
IndexError: Binary operation on two larrys with different dimension

Table Of Contents

Previous topic

DataArray

Next topic

Pandas

This Page