Subgraphs generator algorithm

In this review, we present a survey of frequent subgraph mining applications that deal with biomolecular graphs, such as interaction networks, protein graph structures and chemical structures.īy definition, a graph consists of nodes and edges. One common problem in graph theory consists of finding the underlying subgraph patterns in graphs, which are also referred to as network motifs or graphlets.

Graphs or networks are ubiquitous data types, pervasive in multiple domains, from social sciences to medicine, biology and chemistry. We give an overview of various subgraph mining algorithms from a bioinformatics perspective and present several of their potential biomedical applications. In this review, we provide an introduction to subgraph mining for life scientists. Thus far, information about the bioinformatics application of subgraph mining remains scattered over heterogeneous literature. These techniques have seen numerous applications and are able to tackle a range of biological research questions, spanning from the detection of common substructures in sets of biomolecular compounds, to the discovery of network motifs in large-scale molecular interaction networks. The definition of which subgraphs are interesting and which are not is highly dependent on the application. Subgraph mining techniques focus on the discovery of patterns in graphs that exhibit a specific network structure that is deemed interesting within these data sets. Messaging Protocols (Some are no longer needed.Searching for interesting common subgraphs in graph data is a well-studied problem in data mining. tasks for the next stage) which would further be distributed (again maybe randomly) among all the task queues. After processing it will produce many subgraphs(i.e. A slave will pick a task from any of the task queue(maybe randomly) and process it. Each slave also has a task queue which actually consists of subgraphs.This way no slave gets loaded and checking uniqueness is properly distributed. And each slave knows which bloom filter it must contact to check a hash value of a subgraph. Each slave will maintain a bloom filter for a particular range of hash values. Initally, we shard the range of hash values.

We use a distributed bloom filter on these values to maintain only distinct subgraphs.

Suppose, we map(hash) each subgraph to a value(128 bit number).

Master is just for initialization and bookeeping.

We implement this algorithm in a distributed environment using a master-slave architecture.

Keep only distinct subgraphs in each step using bloomfilter (which maybe cleared after each step to improve accuracy).

In this way you will obtain all the subgraphs with exactly i edges from subgraphs of (i-1) edges((i-1)th stage).

In ith stage pick all subgraphs generated in (i-1)th stage, and extend the graphs by exactly one edge.

In the ith stage, generate all the subgraphs of the given graph that have exactly i edges.

Initailly each edge represents a connected subgraph of the given graph.

Distributed/Parallel Graph Algorithms Generating all connected subgraphs of a given graph Presentation: Algorithm

YOUR CART

Subgraphs generator algorithm