ImaGraph - An improved representation of images for GNNs.

If you're a CXO, founder, or investor, follow me on LinkedIn & Twitter, or join my newsletter on my website here. I share the latest simplified AI research and tactical advice on building AI products.

Please note this is my original research work. Do not use without authorization and citation.


Currently there are three ways of creating image representations for neural networks. The first converts the image into a grid and processes each part separately, but its computation cost is too high. The second tiles the image with windows, converting it into a sequence of tiles to reduce computation cost. Neither of these approaches is suitable for graph processing. The third takes the tiles from the previous approach and links them into a graph. Even this approach is limited in that it is not object oriented. Every real-world image application deals with objects as its unit of work, whether it is image editing, image generation, or image translation.

Here we propose a contour-based image representation for GNNs, which is much closer to an ideal object-based representation. The benefit of this approach will be a tremendous boost in neural understanding of image data, reducing the computation cost and time that limit GNNs today.


Images are built from objects, and objects are the focus of every image processing task. However, current image representation methods do not focus on objects.

Here we propose a contour-based approach, since contours are fundamental properties of objects in an image. Every object has a contour; otherwise it would merge with the background or with other objects and would not be identifiable.

In this approach, the image is first processed to identify contours with a pre-defined tolerance threshold. The image is then broken into layers of closed contours (of uneven shape and size). With a low threshold, we get a mix of broken objects and background. All of these layers are added as nodes in a graph, linked by the proximity of their contours.
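A minimal sketch of this layering step, assuming a grayscale image and a single intensity threshold standing in for the contour tolerance (the `extract_layers` name and the flood-fill formulation are illustrative, not part of the proposal):

```python
from collections import deque

def extract_layers(image, threshold):
    """Split a grayscale image into connected layers at a threshold.

    Pixels on the same side of the threshold that touch (4-connectivity)
    form one layer; each layer becomes a node of the graph. Returns a
    label map (same shape as the image) and the number of layers found.
    """
    h, w = len(image), len(image[0])
    labels = [[-1] * w for _ in range(h)]
    next_id = 0
    for sy in range(h):
        for sx in range(w):
            if labels[sy][sx] != -1:
                continue
            # Flood-fill the region this pixel belongs to.
            side = image[sy][sx] >= threshold
            labels[sy][sx] = next_id
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and labels[ny][nx] == -1
                            and (image[ny][nx] >= threshold) == side):
                        labels[ny][nx] = next_id
                        queue.append((ny, nx))
            next_id += 1
    return labels, next_id

# Toy 4x4 image: a bright square on a dark background -> two layers.
img = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
labels, n = extract_layers(img, threshold=5)
print(n)  # 2: one background layer, one object layer
```

A production version would trace actual contour polygons with a tolerance (e.g. via polygon simplification) rather than thresholding pixels, but the node-per-layer structure is the same.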

Contour layers that touch each other are directly connected, and the connection grows stronger with the length of their shared boundary.
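The touching-layer rule can be sketched as counting adjacent pixel pairs that belong to different layers, with the count serving as the edge weight. This assumes a label map where each pixel holds the id of its layer; the `boundary_edges` helper and the toy map are illustrative:

```python
def boundary_edges(labels):
    """Build weighted graph edges between touching layers.

    Counts horizontally/vertically adjacent pixel pairs whose labels
    differ; each pair contributes one unit of shared boundary, so a
    longer shared boundary yields a stronger connection.
    """
    h, w = len(labels), len(labels[0])
    weights = {}
    for y in range(h):
        for x in range(w):
            for ny, nx in ((y, x + 1), (y + 1, x)):  # right and down neighbours
                if ny < h and nx < w and labels[y][x] != labels[ny][nx]:
                    key = tuple(sorted((labels[y][x], labels[ny][nx])))
                    weights[key] = weights.get(key, 0) + 1
    return weights

# Label map for a 2x2 object (layer 1) on a background (layer 0):
labels = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
print(boundary_edges(labels))  # {(0, 1): 8} -- the square's full perimeter
```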

This representation also requires the network to learn only which layers are included in which objects, and this is easier to learn than what current approaches demand.

With different convolutions at different thresholds, even objects that would be undetectable with current tiling-based approaches can be identified.
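One way to see the effect of sweeping thresholds: an object too faint to separate from the background at a high threshold splits off cleanly at a lower one, so pooling passes over several thresholds recovers layers any single pass would miss. The `count_regions` helper below is a hypothetical sketch of one such pass:

```python
from collections import deque

def count_regions(image, threshold):
    """Count 4-connected layers of the image thresholded at `threshold`."""
    h, w = len(image), len(image[0])
    seen = [[False] * w for _ in range(h)]
    regions = 0
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx]:
                continue
            regions += 1
            side = image[sy][sx] >= threshold
            seen[sy][sx] = True
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and (image[ny][nx] >= threshold) == side):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return regions

# A faint object (value 3) beside a bright one (value 9) on a 0 background:
img = [
    [0, 0, 0, 0, 0],
    [0, 9, 0, 3, 0],
    [0, 0, 0, 0, 0],
]
# The high threshold sees only the bright object; the low one sees both.
print([count_regions(img, t) for t in (5, 2)])  # [2, 3]
```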
