summary |
shortlog |
changelog |
graph |
tags |
bookmarks |
branches |
files |
changeset |
file |
latest |
revisions |
annotate |
diff |
comparison |
raw |
help

doc/tutorial.rst

author | Dmitriy Morozov <dmitriy@mrzv.org> |

Thu, 10 May 2012 15:42:21 -0700 | |

branch | dev |

changeset 247 | ad3aefb5a0e0 |

parent 211 | 347b3461965a |

permissions | -rw-r--r-- |

Added wasserstein_distance()

.. _tutorial: Brief Tutorial ============== It suffices to import only the necessary commands from module :mod:`dionysus`, but for the purposes of the tutorial, we import everything:: from dionysus import * Read in a points file (for :func:`points_file` the points should appear one per line with coordinates separated with spaces; of course, one can read in the points however one wants):: points = [p for p in points_file('points_filename')] print "Number of points:", len(points) Complexes --------- If the points are in :math:`\mathbb{R}^2` or :math:`\mathbb{R}^3`, one can construct an alphashape filtration:: simplices = Filtration() fill_alpha2D_complex(points, simplices) # for 2D, or fill_alpha3D_complex(points, simplices) # for 3D Functions :ref:`fill_alpha*_complex <alphashapes>` fill the ``simplices`` with all the :class:`simplices <Simplex>` of the Delaunay triangulation. Each one has its attribute ``data`` set to a pair: the smallest value of the squared distance function on the dual Voronoi cell and whether the simplex is critical or not (i.e. whether its dual cell does or does not contain a critical point of the distance function). See :ref:`alphashapes` for more details, and :ref:`alpha-shape-example` for a full example. As a result, if one wanted only those simplices whose alpha shape value did not exceed 10, one could obtain them as follows:: simplices10 = [s for s in simplices if s.data[0] <= 10] If the point set lies in higher dimensions, one may construct a Rips complex on it. This complex requires only pairwise distances, which makes it very versatile. One must first construct an instance of a class providing such distances (e.g. :class:`PairwiseDistances` for explicit points, see :ref:`distances` for more details), and then pass it to the :class:`Rips` complex class:: distances = PairwiseDistances(points) rips = Rips(distances) Usually, because of space restrictions, generation of a Rips complex has to be restricted to a :math:`k`-skeleton of the complex and some maximal parameter :math:`max`. In the following example :math:`k = 3` and :math:`max = 50`:: simplices = Filtration() rips.generate(3, 50, simplices.append) :meth:`Rips.generate` takes a skeleton and a maximum distance cutoffs, and a function which is called with each of the simplices in the complex (in this case, Python list `simplices`' ``append`` method). Persistence ----------- There are two ways of computing persistence in Dionysus. The first *offline* way, required by :class:`StaticPersistence` (and its derivatives), is to set up the entire filtration at once, compute its persistence in one operation, and then examine the pairings. The second way is to feed simplices one by one in an *online* manner and manually keep track of the pairs that are created. :class:`ZigzagPersistence` and :class:`CohomologyPersistence` accept their input this way, Offline ^^^^^^^ For the first approach, i.e. to use :class:`StaticPersistence`, one must put the sort the filtration with respect to some ordering (for example, :func:`data_dim_cmp` for alpha shapes or :meth:`Rips.cmp` for the Rips complex):: simplices.sort(data_dim_cmp) # for the alpha shapes simplices.sort(rips.cmp) # for the rips complex Creating an instance of :class:`StaticPersistence` initialized with the filtration really initializes a boundary matrix. The subsequent call to :meth:`~StaticPersistence.pair_simplices` reduces the matrix to compute the persistence of the filtration:: p = StaticPersistence(simplices) p.pair_simplices() Once the simplices are paired, one may examine the pairings by iterating over the instance of :class:`StaticPersistence`. We can use an auxilliary map ``smap`` to remap the persistence indices into the actual simplices:: smap = p.make_simplex_map(simplices) for i in p: if i.sign(): birth = smap[i] if i.unpaired(): print birth.dimension(), birth.data, "inf" continue death = smap[i.pair()] print birth.dimension(), birth.data, death.data The iterator ``i`` over the instance ``p`` of :class:`StaticPersistence` is of type :class:`SPNode`, and represents individual simplices taken in the filtration order. It knows about its own :meth:`~SPNode.sign` and :meth:`~SPNode.pair` as well as whether it is :math:`~SPNode.unpaired`. :meth:`StaticPersistence.make_simplex_map` creates a map that we use to get the actual simplices: ``smap[i]`` and ``smap[i.pair()]``. The previous code snippet prints out the persistence diagrams of the given filtration. Online ^^^^^^ Class :class:`ZigzagPersistence` accepts additions and removals of the simplices one by one, and returns an internal representation of the simplices. (:class:`CohomologyPersistence` works in the same way, but accepts only additions of the simplices.) When one adds a simplex via :meth:`ZigzagPersistence.add`, one must provide its boundary in this internal representation together with a *birth value* which :class:`ZigzagPersistence` will store in case a class is born as a result of the addition. :meth:`~ZigzagPersistence.add` returns a pair ``(i,d)``. ``i`` is the internal representation of the newly added simplex, which one must record for future use in the boundaries of its cofaces (below it is recorded in the dictionary ``complex``). ``d`` is `None` in case of the birth, otherwise, it is the previously recorded birth value of the class that dies as a result of the addition. The following code adds all the ``simplices`` to a zigzag:: simplices.sort(data_dim_cmp) complex = {} zz = ZigzagPersistence() for s in simplices: i,d = zz.add([complex[sb] for sb in s.boundary], (s.dimension(), s.data)) complex[s] = i if d is not None: # we have a death dimension, birth = d # we previously stored the (dimension, data) pair print dimension, birth, s.data Similarly, one can remove simplices by calling :meth:`ZigzagPersistence.remove` with the internal index previously returned by :meth:`~ZigzagPersistence.add`:: for s in reversed(simplices): d = zz.remove(complex[s], (s.dimension() - 1, s.data)) del complex[s] if d is not None: dimension, birth = d print dimension, birth, s.data Naturally, :meth:`~ZigzagPersistence.remove` returns only `d`. If one wants to add or remove an entire set of simplices "at once" (i.e. with all births being assigned the same value), there are two helper functions: :func:`add_simplices` and :func:`remove_simplices`. See the :ref:`bindings reference <python-bindings>` for more details, and :ref:`triangle-zigzag-example` for an example of :class:`ZigzagPersistence`. See :ref:`rips-pairwise-cohomology` for an example of :class:`CohomologyPersistence`. .. _speed-up-suggestions: Speed-up suggestions -------------------- Currently, when the choice comes between efficiency and flexibility, the Python bindings err on the side of flexibility. There is hope that in the future the choice won't really be necessary. Meanwhile, one can use a few techniques that speed up computation at the expense of memory. Note, however, that since the recent switch of :class:`PairwiseDistances` to C++ rather than pure Python, it is not clear whether these deliver a substantial speed-up: * To avoid (possibly expensive) computation of distances during Rips complex generation, store :class:`ExplicitDistances` (see :ref:`distances`):: distances = PairwiseDistances(points) distances = ExplicitDistances(distances) * To avoid the computation of simplex sizes in the Rips complex during the initialization of a :class:`Filtration`, store them explicitly in :attr:`Simplex.data` attribute (this is not done by default to save memory); then use :func:`data_dim_cmp` when sorting the :class:`Filtration`:: rips = Rips(distances) simplices = Filtration() rips.generate(..., simplices.append) for s in simplices: s.data = rips.eval(s) simplices.sort(data_dim_cmp) .. include:: substitutions.aux