(A jupyter notebook with math and code(spark) is available on github repo )
Isomap stands for isometric mapping. Isomap is a non-linear dimensionality reduction method based on the spectral theory which tries to preserve the geodesic distances in the lower dimension. Isomap starts by creating a neighborhood network. After that, it uses graph distance to the approximate geodesic distance between all pairs of points. And then, through eigenvalue decomposition of the geodesic distance matrix, it finds the low dimensional embedding of the dataset. In non-linear manifolds, the Euclidean metric for distance holds good if and only if neighborhood structure can be approximated as linear. If neighborhood contains holes, then Euclidean distances can be highly misleading. In contrast to this, if we measure the distance between two points by following the manifold, we will have a better approximation of how far or near two points are. Let’s understand this with an extremely simple 2-D example. Suppose our data lies on a circular manifold in a 2-D structure like in the image below.
In order to follow along with this article, you will need experience with Python code, and a beginners understanding of Deep Learning. We will operate under the assumption that all readers have access to sufficiently powerful machines, so they can run the code provided.
If you do not have access to a GPU, we suggest accessing it through the cloud. There are many cloud providers that offer GPUs. DigitalOcean GPU Droplets are currently in Early Availability, learn more and sign up for interest in GPU Droplets here.
For instructions on getting started with Python code, we recommend trying this beginners guide to set up your system and preparing to run beginner tutorials.
We will reduce the data to 1-D using the Euclidean distances and approximate geodesic distances. Now, if we look at the 1-D mapping based on the Euclidean metric, we see that for points which are far apart(a & b) have been mapped poorly. Only the points which can be approximated to lie on a linear manifold(c & d) give satisfactory results. On the other hand, see the mapping with geodesic distances, it nicely approximates the close points as neighbors and far away points as distant. The geodesic distances between two points in the image are approximated by graph distance between the two points. Thus, Euclidean distances should not be used for approximating the distance between two points in non-linear manifolds while geodesic distances can be used.
Isomap uses the above principle to create a similarity matrix for eigenvalue decomposition. Unlike other non-linear dimensionality reduction like LLE & LPP which only use local information, isomap uses the local information to create a global similarity matrix. The isomap algorithm uses euclidean metrics to prepare the neighborhood graph. Then, it approximates the geodesic distance between two points by measuring shortest path between these points using graph distance. Thus, it approximates both global as well as the local structure of the dataset in the low dimensional embedding.
Let’s have a basic understanding of few concepts which we need to implement the Isomap algorithm.
Pregel is a distributed programming model developed by Google for processing large-scale graphs. It is the inspiration behind the Apache giraph project and GraphX library of spark. Pregel is basically a message-passing interface based on the idea that a vertex’s state should depend on its neighbors. A pregel computation takes as input a graph and a set of vertex states. At every iteration which is called superstep, it processes messages received at a vertex and updates the vertex state. After that, it decides which of its neighbors should receive the message at next superstep and what should be the message. Thus, messages are passed along edges and computation happens only at the vertices. The graph is not passed across the network only messages. Computation stops at maximum iterations or when no messages are left to pass. Let’s understand it using a simple example. Suppose, We need to find the degree of each vertex for the graph given below. Image shown below represents a single iteration of pregel model.
At initialization, every vertex’s degree is 0. We can send an empty message as an initial message to start the computation. At the end of superstep 1, Each vertex sends message 1 through each of its edges. At next superstep, each vertex sums the messages received and update its degree.
Isomap is closely related to the original multidimensional scaling algorithm proposed by the Torgerson and Gower. In fact, it is an extension of the classical multidimensional scaling. The Classical multidimensional algorithm gives a closed form solution to the dimensionality reduction problem. Classical MDS uses the euclidean distances as the similarity metric while isomap uses geodesic distances. Steps of classical MDS are
Isomap differs from classical MDS in initial few steps only. Instead of using euclidean metric for dissimilarity, it uses graph distances. Steps of the Isomap algorithm are:
Neighbourhood graph: Create a neighborhood graph and adjacency matrix from the dataset.
Dissimilarity Matrix: After neighborhood search, we will use spark’s graphX library for calculating the geodesic distances between the points. While creating our neighborhood network, we have to make sure that the resulting graph is a single connected component. If not, then our similarity matrix will remain incomplete and results will be incoherent. We need to iterate over the different values of neighborhood selection parameter to get the fully connected graph. As of now, spark does not have the shortest path function for the weighted graph. We will have to implement it. The code below presents a shortest path algorithm using pregel like api of graphX. The code is from the shortest path function for unweighted graphs of graphX lib. Functions addMaps and sendMessage have been modified to support weighted graphs.
def ShortestPath(Verts: RDD[(VertexId, imMap[Long, Double])],
Edges: RDD[Edge[Double]], landmarks: Seq[Long] = Seq()):
Graph[imMap[Long,Double],Double] = {
val g = Graph(Verts, Edges)
type SPMap = Map[VertexId, Double]
def makeMap(x: (VertexId, Double)*) = Map(x: _*)
def incrementMap(spmap1: SPMap, spmap2: SPMap, d: Double): SPMap = {
spmap1.map { case (k, v) =>
if (v + d < spmap2.getOrElse(k, Double.MaxValue)) k -> (v + d)
else -1L -> 0.0
}
}
def addMaps(spmap1: SPMap, spmap2: SPMap): SPMap = {
(spmap1.keySet ++ spmap2.keySet).map {
k => k -> math.min(spmap1.
getOrElse(k, Double.MaxValue),
spmap2.getOrElse(k,
Double.MaxValue))
}(collection.breakOut)
}
var spGraph: Graph[imMap[Long,Double],Double] = null
if (landmarks.isEmpty){
spGraph = g.mapVertices { (vid, attr) => makeMap(vid -> 0)}
}
else{
spGraph = g.mapVertices { (vid, attr) =>
if (landmarks.contains(vid)) makeMap(vid -> 0) else makeMap()}
}
val initialMessage = makeMap()
def vertexProgram(id: VertexId, attr: SPMap, msg: SPMap): SPMap = {
addMaps(attr, msg)
}
def sendMessage(edge: EdgeTriplet[SPMap, Double]):
Iterator[(VertexId, SPMap)] = {
val newAttr = incrementMap(edge.srcAttr, edge.dstAttr, edge.attr) - (-1)
if (!newAttr.isEmpty) Iterator((edge.dstId, newAttr))
else Iterator.empty
}
val h = Pregel(spGraph, initialMessage)(vertexProgram, sendMessage, addMaps)
return(h)
}
Before eigenvalue decomposition, we have to square the distance and double center the squared similarities matrix. After eigenvalue decomposition select the first K eigenvectors with K highest eigenvalues.
What we have implemented is the vanilla version isomap. It requires a lot of time and computing power. It has two bottlenecks first Calculation of dissimilarity matrix requires O(N2) operations where N is the number of the samples and the second calculation of pairwise graph distances. If N is huge, which is true generally in case of big datasets, it becomes impractical. Solution to this problem is Landmark Isomap. Landmark isomap is based on landmark MDS. Landmark MDS selects a group of points termed as Landmarks and implements classical MDS on them. Based on the mapping obtained from classical MDS, remaining points are mapped in the low dimensional embedding using distance-based triangulation. Steps for Landmark classical scaling
Isomap performs poorly when manifold is not well sampled and contains holes. As mentioned earlier neighborhood graph creation is tricky and slightly wrong parameters can produce bad results.
In this article, we discussed another manifold learning algorithm IsoMap(Isometric Mapping).
In the beginning of the post, we talked about what is isometric mapping and how it is different from other dimension reduction algorithms. Then, we had a brief discussion on pregel API. Later on, we implemented an isomap algorithm in scala using spark’s GraphX library. For people wishing to go deeper in distributed graph processing GraphX is a good starting point.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!