Color clustering – Part #1: Clustering by color relevance

Hi all,

In a previous post, I promised you an article about color clustering & gif creation. After around half a year, I decided to refactor the code a bit & tell you more about it. I wouldn’t like to tire you with one huge article about the whole process, though, so I have divided it into three parts, which together make up the Color Clustering Epic.

The parts are the following:

  • Part 1: Color Clustering by color relevance
  • Part 2: Image generation by color aggregation
  • Part 3: Gif generation

Let’s dive into it!

Color Clustering by color relevance

But what exactly do I call Color Clustering? It is a way of grouping relevant colors by examining their actual pixel populations in a target image and their RGB relevance, where a pixel’s color is specified by its red, green & blue values (0 <= value <= 255). The RGB relevance of one color to another can be determined by:

  • finding the Euclidean distance between the RGB components of the two colors
  • comparing that distance with a fixed offset number: the colors are relevant if the distance does not exceed the offset.

If the offset is quite small, the number of groups will be quite big. For instance, consider the following image, which is quite “monolithic” color-wise:

[image: 27118-2-1353337757]

This image consists mostly of red, black, white & some blue. Let’s take two “almost” red pixels: one from the center of the image, and one from the upper right. We decide upon a small offset equal to 4. The first pixel (p) has values r=220, g=8, b=10 and the second one (q) has r=240, g=40, b=10. By applying the Euclidean distance

$$d(p, q) = \sqrt{(p_r - q_r)^2 + (p_g - q_g)^2 + (p_b - q_b)^2}$$

their distance is ~37.7, and 37.7 > 4 (the offset value). Although both pixels are mainly red, by picking such a small offset we decide that these two red pixels are irrelevant, when they should be relevant. Thus, we should increase the value of our offset in order to capture relevant colors, but without overdoing it. If we pick a very big offset (1000, for example), every color will be marked as relevant to every other one, since no two RGB colors are more than ~441.7 apart, and the whole image collapses into a single group. In my example, I have assigned the offset the value 100.
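To make the comparison concrete, here is a minimal sketch of the relevance check for the two pixels p and q from the example. The function names `rgb_distance` and `is_relevant` are chosen only for illustration; they are not part of the code shown later in this article.

```python
from math import sqrt

def rgb_distance(p, q):
    """Euclidean distance between two (r, g, b) tuples."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def is_relevant(p, q, offset):
    """Two colors count as relevant when their distance does not exceed the offset."""
    return rgb_distance(p, q) <= offset

p = (220, 8, 10)   # "almost" red pixel from the center
q = (240, 40, 10)  # "almost" red pixel from the upper right

print(round(rgb_distance(p, q), 2))  # 37.74
print(is_relevant(p, q, 4))          # False: the offset is too strict
print(is_relevant(p, q, 100))        # True
```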

That is the basic idea, and all you need to grasp in order to follow the rest of this article.

Setup

The code has been developed on Ubuntu Linux, and the dependencies are the following

  • Python 2.7
  • PIL Python module for reading & writing pixel values from/to image

Code

1. Read image file

from PIL import Image

def image_pixels(file):
    img = Image.open(file)

    # load() returns a pixel-access object indexed as pixels[x, y];
    # this is not a list, nor is it list()'able
    pixels = img.load()
    width, height = img.size

    # Collect the pixel tuples, column by column
    all_pixels = []
    for x in range(width):
        for y in range(height):
            cpixel = pixels[x, y]
            all_pixels.append(cpixel)
    return all_pixels, img.size

We use PIL’s Image module to read the pixel value tuples & store them in a list, which is returned together with the image size.

2. Build color table

def color_counter(pixels):
    T = {}
    colors = 0  # number of different colors
    total = 0   # number of pixels processed
    for i in range(len(pixels)):
        # Keep only r,g,b; the alpha value of RGBA pixels is ignored
        r, g, b = pixels[i][:3]
        key = '%s,%s,%s' % (r, g, b)
        if key in T:
            T[key] += 1
        else:
            T[key] = 1
            colors += 1
        total += 1
    print('Different: %d, amount of pixels: %d' % (colors, total))
    assert(len(pixels) == total)
    return T, colors, total

With the pixel tuple list in hand, we can build a color table: a dictionary that has as keys all unique colors encountered in the image, and as values the number of pixels of each color.
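For comparison, the same kind of table can be sketched with `collections.Counter` from the standard library. This is an alternative under stated assumptions (the name `color_table` and the five-pixel sample list are made up for the example), not the function used in the rest of the series.

```python
from collections import Counter

def color_table(pixels):
    """Map 'r,g,b' keys to the number of pixels that have that color."""
    # Slicing to [:3] drops the alpha value of RGBA tuples.
    return Counter('%d,%d,%d' % tuple(px[:3]) for px in pixels)

# Hypothetical 5-pixel "image": three reds (one with alpha) and two blacks.
pixels = [(255, 0, 0), (0, 0, 0), (255, 0, 0, 255), (0, 0, 0), (255, 0, 0)]
table = color_table(pixels)

print(table['255,0,0'])     # 3
print(table['0,0,0'])       # 2
print(len(table))           # 2 different colors
print(sum(table.values()))  # 5 pixels in total
```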

3. Color Clustering

Having obtained the color table, we know all the colors contained in the target image and also the number of pixels of each color. We are now ready to compute the Euclidean distance between the colors in the color table and group them into color clusters.

import copy
from math import sqrt

# transfer_absorbed_colors, json_out and get_filename are helper functions
# that are not shown in this part of the series.

def cluster_relevant_colors(table, offset, img_name, size):
    substitute = copy.deepcopy(table)
    keys = list(table.keys())
    deleted = 0

    # Holds key => [values]: 'color' => [absorbed color1, absorbed color2, absorbed color3],
    # so we know which color was absorbed by another one
    absorbed = {}
    for i in range(len(keys)):
        key = keys[i]
        if key not in substitute:
            continue
        r1, g1, b1 = [int(c) for c in key.split(',')]
        on_absorb = False
        for j in range(i + 1, len(keys)):
            key2 = keys[j]
            # Stop as soon as the current color has itself been absorbed
            if key not in substitute:
                break
            if key2 not in substitute:
                continue
            # Find the Euclidean distance between these two colors,
            # to check if they are relevant with respect to their rgb components
            r2, g2, b2 = [int(c) for c in key2.split(',')]
            dr, dg, db = abs(r1 - r2), abs(g1 - g2), abs(b1 - b2)
            ediff = sqrt((dr * dr) + (dg * dg) + (db * db))
            if ediff <= offset:
                on_absorb = True
                # Transfer pixel populations + remove the weak color
                # from the table (it has been absorbed)
                if substitute[key] >= substitute[key2]:
                    absorbed = transfer_absorbed_colors(key2, key, absorbed)
                    substitute[key] += substitute[key2]
                    del substitute[key2]
                else:
                    absorbed = transfer_absorbed_colors(key, key2, absorbed)
                    substitute[key2] += substitute[key]
                    del substitute[key]
                deleted += 1
        # Current color neither absorbed nor was absorbed by any color
        if not on_absorb:
            if key not in absorbed:
                absorbed[key] = []
    # Output the absorbed table & the remaining colors
    json_out(absorbed, name=get_filename(img_name, 'json'))
    json_out(substitute, name=get_filename(img_name, 'clusterjson'))

 

The first step is to create a copy of the original color table (substitute), on which we will operate. Then, for each color key, we iterate over the remaining color keys & calculate the Euclidean distance between the current color (key) and the color encountered during the iteration (key2). If the distance is smaller than or equal to the provided offset, one of the colors is absorbed by the other. To decide which one, we check the population of each color in our table: we assume that the color with the larger pixel population is the dominant color. The dominant color absorbs the other color and its population. In the end, the substitute color table contains as keys all dominant colors, which are more than the offset apart from each other, and as values the total number of pixels for each of them. The absorbed color table, in turn, contains all dominant colors along with the colors that were absorbed by them.
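The absorption process can be tried out on a toy color table with the simplified sketch below. It is an illustration under assumptions, not the function above: the names `distance` and `cluster` are made up, the results are returned instead of written to JSON, and because the helper transfer_absorbed_colors is not shown in this part, the bookkeeping of the absorbed table (the winner inherits the loser plus everything the loser had absorbed) is only a guess at what it does.

```python
from math import sqrt

def distance(key1, key2):
    """Euclidean distance between two 'r,g,b' keys."""
    c1 = [int(c) for c in key1.split(',')]
    c2 = [int(c) for c in key2.split(',')]
    return sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

def cluster(table, offset):
    """Return (substitute, absorbed) for a color table and an offset."""
    substitute = dict(table)
    absorbed = {}
    keys = sorted(table)  # fixed visiting order, so the toy run is reproducible
    for i, key in enumerate(keys):
        for key2 in keys[i + 1:]:
            if key not in substitute:
                break  # the current color has itself been absorbed
            if key2 not in substitute or distance(key, key2) > offset:
                continue
            # The color with the larger population is the dominant one.
            if substitute[key] >= substitute[key2]:
                winner, loser = key, key2
            else:
                winner, loser = key2, key
            # Assumed bookkeeping: the winner inherits the loser and
            # everything the loser had absorbed earlier.
            absorbed.setdefault(winner, []).append(loser)
            absorbed[winner].extend(absorbed.pop(loser, []))
            substitute[winner] += substitute.pop(loser)
        if key in substitute:
            absorbed.setdefault(key, [])
    return substitute, absorbed

# Hypothetical table: two reds and two near-blacks.
table = {'220,8,10': 50, '240,40,10': 30, '0,0,0': 100, '10,10,10': 5}
substitute, absorbed = cluster(table, 100)
print(substitute)  # {'220,8,10': 80, '0,0,0': 105}
print(absorbed)    # each dominant color maps to the one color it absorbed
```

Note that with this greedy approach the outcome can depend on the order in which the keys are visited, so a different ordering of the same table may produce slightly different clusters.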

{
    "252,156,105": 34951,
    "50,68,88": 92838,
    "255,30,0": 187878,
    "171,44,131": 3198,
    "0,4,0": 339535,
    "100,126,175": 505,
    "104,9,0": 136452,
    "156,196,248": 20,
    "179,113,17": 34,
    "255,214,190": 4588,
    "255,86,177": 1
}

This JSON shows the contents of the substitute dict, which contains all dominant colors along with their populations. As we can see, the majority of the image’s pixels are black (0,4,0: 339,535 pixels), red (255,30,0: 187,878 pixels) and bordeaux (104,9,0: 136,452 pixels), as expected. All variations of the dominant colors have been absorbed by them.
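To check such a result quickly, the dominant colors can be ranked by population. The snippet below is a small sketch that embeds the populations listed above as a string; the variable names are arbitrary, and in practice the data would be read from the JSON file written by the clustering step.

```python
import json

substitute_json = '''
{
    "252,156,105": 34951, "50,68,88": 92838, "255,30,0": 187878,
    "171,44,131": 3198, "0,4,0": 339535, "100,126,175": 505,
    "104,9,0": 136452, "156,196,248": 20, "179,113,17": 34,
    "255,214,190": 4588, "255,86,177": 1
}
'''

substitute = json.loads(substitute_json)
total = sum(substitute.values())

# Sort the dominant colors by pixel population, largest first.
ranked = sorted(substitute.items(), key=lambda kv: kv[1], reverse=True)

print(total)  # 800000
for color, count in ranked[:3]:
    print('%s: %d pixels (%.1f%%)' % (color, count, 100.0 * count / total))
```

For this image, the three largest clusters (black, red and bordeaux) account for roughly 83% of the 800,000 pixels.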

The contents of the absorbed & substitute tables will be used in the next part of the epic. For now, I won’t provide the full source; that will come with the last & final chapter of the Color Clustering epic.

I will, however, provide some gifs to show you what you will be able to generate by the end of the Color Clustering epic. Of course, I do not own any of the initial artwork that was used to generate the gifs.

Hope you enjoyed,

kazeone

 

Appetizers

[gif: baroness_blue] Baroness – Blue Record

[gif: pelican_australasia_aggr] Pelican – Australasia

[gif: vol_4_aggr] Black Sabbath – Vol. 4

[gif: om_agios_aggr] Om – Advaitic Songs

(We do not own any of the artwork that was used to generate the presented gifs.)
