Redis data structure change notifications

Hi all,

Everyone knows Redis, right? Redis is an open source (BSD licensed) data structure server with pub/sub features: applications subscribe to Redis channels & get notified when a message is published on them, while other apps (on the other side) generate the updates by publishing them to particular Redis channels.

In this post, I assume that you are already familiar with Redis: you know how to write a script that publishes to or subscribes to a channel, & you know how to manipulate some of the Redis data structures, such as Hashes, Sets & Lists.

We are interested in receiving data structure updates in order to know that something has changed; if our app cares about that particular change, it should do something useful with it. In addition, we would like our app to watch only one specific Redis data structure, the one that holds the data of interest for our app.

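Redis supports pub/sub for channels, but can a data structure have its own channel? Yes, by using Redis Keyspace Notifications, available since version 2.8.0. When an operation modifies a data structure, Redis generates an event & propagates it through that channel. Your app just needs to listen to the channel in order to be notified about the event. Note that keyspace notifications are disabled by default; they have to be enabled through the notify-keyspace-events configuration option (for example, CONFIG SET notify-keyspace-events KA in redis-cli).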

The name of that particular channel has the form __keyspace@0__:<data structure name>, where 0 is the number of the database. Thus, if your app is interested in the list named “registeredArtists”, it should subscribe & listen to the channel __keyspace@0__:registeredArtists.
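As a small illustration of this naming rule, the channel name can be built with a helper like the one below. The function name keyspace_channel and its db parameter are hypothetical and exist only for this sketch, which assumes database 0 unless told otherwise.

```python
def keyspace_channel(key, db=0):
    """Return the keyspace notification channel name for `key` in database `db`."""
    # `keyspace_channel` is a made-up helper name, not part of redis-py
    return '__keyspace@%d__:%s' % (db, key)


if __name__ == '__main__':
    print(keyspace_channel('registeredArtists'))
```

Keeping the rule in one place avoids typos in the double underscores when several keys have to be watched.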

Demo

Requires:

  • Linux
  • Redis
  • Python
  • the redis module for Python (installed via pip)

1.1 Client

In order to show you the functionality, I prepared a small Python demo:

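from __future__ import print_function  # keeps the script runnable on Python 2 & 3

from sys import argv
from time import sleep

from redis import StrictRedis


def show_msg(msg):
    # Handler invoked for every message published on the subscribed channel
    print('Latest List Operation =>', msg)


def listen_for_data_structure_operations(dt_key, host='localhost'):
    redis = StrictRedis(host=host)
    s = redis.pubsub()
    # Subscribe to the keyspace channel of the key (database 0) with show_msg as handler
    s.subscribe(**{'__keyspace@0__:%s' % dt_key: show_msg})

    while True:
        # Messages that have a handler are dispatched to it; anything else
        # (e.g. the subscribe confirmation) is returned here
        msg = s.get_message()
        if msg:
            print(msg)
        sleep(1)


if __name__ == '__main__':
    if len(argv) > 1:
        key = argv[1]
    else:
        key = 'alist'
    listen_for_data_structure_operations(key)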

 

If you run the script with the name of a Redis data structure as its argument, it subscribes to the channel that reports the operations applied to that particular data structure. The data structure does not need to exist at the time you subscribe.

gclkaze@tzertzelos:~/Tzertzelos/Redis$ python dtUpdateTest.py registeredArtists
{'pattern': None, 'type': 'subscribe', 'channel': '__keyspace@0__:registeredArtists', 'data': 1L}

1.2 Server

Now, use the terminal to connect to Redis & push an element to the list:

gclkaze@tzertzelos:~/Tzertzelos/Scriptz/Redis$ redis-cli
127.0.0.1:6379> LPUSH registeredArtists "Motorhead"
(integer) 1

OK, our new list named registeredArtists has length equal to 1, and its only element is the string Motorhead.

1.3 Client-side update

The notification generated by the LPUSH on the server should now be visible on the client side!

gclkaze@tzertzelos:~/Tzertzelos/Scriptz/Redis$ python dtUpdateTest.py registeredArtists
{'pattern': None, 'type': 'subscribe', 'channel': '__keyspace@0__:registeredArtists', 'data': 1L}
Latest List Operation => {'pattern': None, 'type': 'message', 'channel': '__keyspace@0__:registeredArtists', 'data': 'lpush'}

Yea! Our client was able to capture the event generated by the LPUSH that applied to the list that resides in the server.

Redis’ pubsub feature really simplifies the implementation of a push mechanism between the server & the app, which is actually a core feature of the system.

Now go & experiment with Redis!

kazeone

Linx

An introduction to Redis data types and abstractions

Redis Commands

Redis Keyspace Notifications

Musix

Truckfighters – Gravity X (2005)


Python function invocation from C++

Hi there,

Today’s mission is to load a Python module & call a module function from C++ by using Python’s C API. OK, that is cool, but why should we do that? Why not just rewrite our Python modules in C/C++ & perform the job in a native way?

  • well, because you are going to lose a significant amount of time rewriting the module(s), testing them & verifying that everything is fine
  • rewriting a small Python module in C/C++ is fine, but what if you just want to use existing functionality that is already implemented in Python? Let’s not reinvent the wheel!
  • you really want to just use the module & treat it as a black box. Implementation details of the module’s functionality may be way out of scope. Let’s keep the requirements simple!

Let’s put on some nice music (all tracks/artists that were listened to & contributed to this production are provided in the Musix section) & start our experiment!

Setup

The code has been developed on Ubuntu Linux, and the dependencies are the following

  • C++ 11
  • Python 2.7
  • Python-dev :
    gclkaze@tzertzelos$: sudo apt-get install python-dev

Assumptions

Just to be clear & precise:

  • we are not taking multi-threading into account; if our program loads a module & attempts to call some Python functions from multiple threads, we do not guarantee that the parallel function calls will succeed.
  • for now, all arguments of the test Python functions hold integer values.
  • exception handling has not been checked at all (a nice TODO)

Our test module

spam.py

def foo():
    print 'Foo'
    return 6
def foo2(x,y):
    print 'Foo2'
    return x*y

 

Steps

0. Our class, the PythonInterpreter class

The following snippet depicts the PythonInterpreter class, responsible for loading & calling a requested Python function from a given module.

#include <Python.h> // The Python (Interpreter) C API
#include <cstdio>   // printf
#include <cstdlib>  // atoi
#include <iostream> // for std::cout printing utility
#include <vector>
#include <string>
#include <utility> // std::pair
#include <map>

namespace Interpret{
    //Some enums for error checking
    enum InterpreterStatus{
        OK = 0,
        FUNCTION_NOT_FOUND = -1,
        MODULE_NOT_FOUND = -2,
        RETURNED_NULL = -3,
        NOT_CALLABLE_OBJECT = -4,
        UNKNOWN_ERROR = -5,
        ARGUMENT_ERROR = -6,
        UNKNOWN_INTEGER_ERROR = -7
    };

    class PythonInterpreter 
    {

    private:
        // References to the "sys" module & its "path" list. The current
        // directory is appended to sys.path so that the interpreter can
        // find our module; both references are released in cleanUp().
        std::vector<PyObject*> m_Globals; 
        PyObject * m_CurrentModule = nullptr; // The current loaded Python module
        // The dictionary object of the current loaded Python module, 
        // contains module namespace information, such as module functions,
        // module globals etc 
        PyObject * m_CurrentTable = nullptr; 
        // We need to create this from a string containing the 
        // module's name, in order to import the module later.
        PyObject * m_CurrentModuleName = nullptr; 
        bool m_Initialized = false;
        // Module name -> (module object, module name object)
        std::map<std::string, std::pair<PyObject*, PyObject*>> m_LoadedModules;

        // Helpers defined in the steps below (access levels chosen for this listing)
        void addCwdInPythonPath();
        PyObject* setFunctionArgs(const std::vector<std::string>& args);
        void cleanUp();
    public:
        PythonInterpreter(){}
        ~PythonInterpreter();
        InterpreterStatus loadModule(const std::string& module);
        InterpreterStatus callFunction(const std::string& function,
                                       const std::vector<std::string>& args);
    };
}

After we have instantiated our PythonInterpreter, we would like first to load our Python module and, if everything goes fine (for instance, the module exists), to specify a particular module function, pass the arguments (if it accepts any) and call the function.

1. Load Module

InterpreterStatus PythonInterpreter::loadModule(const std::string& module)
{
    if(!m_Initialized) {
        Py_Initialize();
        addCwdInPythonPath();
        m_Initialized = true;
        //std::cout << "PythonInterpreter initialized!" << std::endl;
    }
    // If the module has not been loaded before, import it & remember it
    auto it = m_LoadedModules.find(module);
    if(it == m_LoadedModules.end()) {
        //std::cout << "Module init" << std::endl;
        PyObject* name = PyString_FromString(module.c_str());
        if(!name) return UNKNOWN_ERROR;

        // Load the module object
        PyObject* loaded = PyImport_Import(name);
        if(!loaded) {
            Py_DECREF(name); // do not leak the name object on failure
            return MODULE_NOT_FOUND;
        }
        m_CurrentModuleName = name;
        m_CurrentModule = loaded;

        // m_CurrentTable is a borrowed reference 
        m_CurrentTable = PyModule_GetDict(m_CurrentModule);
        m_LoadedModules[module] = std::make_pair(m_CurrentModule,
                                              m_CurrentModuleName);
    }
    else {
        // The module was loaded earlier: reuse the stored pointers
        m_CurrentModule = it->second.first;
        m_CurrentModuleName = it->second.second;
        m_CurrentTable = PyModule_GetDict(m_CurrentModule);
    }
    return OK;
}

1.1 Add current working directory in the Python path lazily

The first call to loadModule calls Py_Initialize, which prepares the Python interpreter’s runtime to accept the subsequent Python C API calls, and then adds the current working directory to Python’s path by calling the class method addCwdInPythonPath, depicted here.

void PythonInterpreter::addCwdInPythonPath()
{
    //The following is equivalent to:
    //from sys import path
    //path.append('.')
    PyObject *sys = PyImport_ImportModule("sys");
    PyObject *path = PyObject_GetAttrString(sys, "path");
    PyObject *cwd = PyString_FromString(".");
    PyList_Append(path, cwd);
    // PyList_Append does not steal the reference, so release our own
    Py_DECREF(cwd);
    // Kept until cleanUp() releases them
    m_Globals.push_back(sys);
    m_Globals.push_back(path);
}

This step is of great importance: if it is omitted, the Python runtime won’t know where to find our Python module (spam.py). If the module lives in a different directory, we will need to add that directory to our Python path by using a variation of the previous method.
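For readers who think in Python first, a rough Python-level equivalent of such a variation could look like this. The helper name add_dir_to_python_path is made up for this sketch; the C++ class does not contain it.

```python
import os
import sys


def add_dir_to_python_path(directory):
    """Append `directory` (as an absolute path) to sys.path unless it is already there."""
    # Hypothetical helper: mirrors what a C++ variation of addCwdInPythonPath would do
    path = os.path.abspath(directory)
    if path not in sys.path:
        sys.path.append(path)
    return path


if __name__ == '__main__':
    print(add_dir_to_python_path('.'))
```

Checking for duplicates keeps sys.path from growing when the helper is called more than once.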

 

1.2 Store loaded module pointers in the map

In the Python C API, every Python object (module, function, string, integer, float, list, dictionary) is treated as a PyObject; then, based on type checking, we treat each of them differently (as a module, string, integer, float, dictionary & so on).

In our class, we use a map from the module name (a string) to a pair of PyObject* (the module object & its name object) in order to know which modules have already been loaded, allowing us to skip loading a module that has been loaded before.

Thus, we first need to check if the requested module exists in our map; if not, we try to load it & then add it to our map. Otherwise, we fetch its PyObject* from our map.
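The same load-or-reuse idea can be sketched in Python for brevity. The names loaded_modules & load_module are hypothetical stand-ins for m_LoadedModules & loadModule, while importlib.import_module plays the role of PyImport_Import.

```python
import importlib

# Hypothetical stand-in for m_LoadedModules: module name -> module object
loaded_modules = {}


def load_module(name):
    """Return the module called `name`, importing it only on the first request."""
    if name not in loaded_modules:
        try:
            loaded_modules[name] = importlib.import_module(name)
        except ImportError:
            return None  # corresponds to MODULE_NOT_FOUND in the C++ class
    return loaded_modules[name]


if __name__ == '__main__':
    # The second request is served from the dictionary
    print(load_module('json') is load_module('json'))
```

Python already caches imports in sys.modules, so the explicit dictionary here only mirrors the bookkeeping done by the C++ class.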

2. Call Module Function

2.1 Pass the argument to the Python function

We need to pass the integer arguments (if any) to the Python function. In order to do that, we need to:

  • first create a Python tuple with size equal to the number of arguments to marshal, by using PyTuple_New
  • iterate over our string argument vector
  • for each argument, extract its integer value & store it in our argument tuple by calling PyTuple_SetItem

PythonInterpreter::setFunctionArgs performs the previous operations & returns a filled PyObject* tuple that will be used in our function invocation.

PyObject* PythonInterpreter::setFunctionArgs(const std::vector<std::string>& args)
{
    int argsN = (int)args.size();
    auto pArgs = PyTuple_New(argsN );
    if (!pArgs) return nullptr;
    PyObject *pValue = nullptr;

    for (auto i = 0; i < argsN ; i++) {
        // atoi yields 0 for strings that are not numbers
        pValue = PyInt_FromLong(atoi(args[i].c_str()));
        if (!pValue) {
            PyErr_Print();
            Py_DECREF(pArgs); // release the partially filled tuple
            return nullptr;//UNKNOWN_INTEGER_ERROR;
        }
        // PyTuple_SetItem steals the reference to pValue
        PyTuple_SetItem(pArgs, i, pValue);  
    }
    return pArgs;
}

 

2.2 Call Module Function with arguments

The PythonInterpreter::callFunction takes as arguments the name of the function that we want to call and a list of string arguments that will be passed in our call.

InterpreterStatus PythonInterpreter::callFunction(const std::string& function, 
                               const std::vector<std::string>& args)
{
    // No module has been loaded successfully yet
    if(!m_CurrentTable) return MODULE_NOT_FOUND;

    // Borrowed reference, must not be Py_DECREF'ed
    PyObject* currentFunction = PyDict_GetItemString(m_CurrentTable, 
                                                    function.c_str());
    if(!currentFunction) return FUNCTION_NOT_FOUND;

    if (PyCallable_Check(currentFunction)) {
        PyObject *pValue = nullptr;
        // Prepare the argument list for the call
        int argsN = (int)args.size();
        if( argsN > 0 ) {
            auto pArgs = setFunctionArgs(args);
            if(!pArgs) return UNKNOWN_INTEGER_ERROR;   
            pValue = PyObject_CallObject(currentFunction, pArgs);
            // The tuple is not needed after the call
            Py_DECREF(pArgs);
        } 
        else {
            pValue = PyObject_CallObject(currentFunction, nullptr);
        }

        if (pValue != nullptr)  {
            printf("Return of call : %d\n", (int)PyInt_AsLong(pValue));
            Py_DECREF(pValue);
        }
        else {
            // The call raised a Python exception
            PyErr_Print();
            return ARGUMENT_ERROR;
        }
    } 
    else {
        //std::cout << "Cant call it" << std::endl;
        PyErr_Print();
        return NOT_CALLABLE_OBJECT;
    }
    return OK;
}

We first need to check if the requested function is included in our module’s table, which we saved in the previous step while calling PythonInterpreter::loadModule.

If the function is not included in the module, ofc we won’t be able to call it. Thus, we return a nice & precise FUNCTION_NOT_FOUND.

Otherwise, we check if there are any arguments &, if so, we set up the tuple holding our integer arguments (by calling PythonInterpreter::setFunctionArgs) & finally call PyObject_CallObject, passing it the argument tuple. If there are no arguments to be passed, we just call PyObject_CallObject with a nullptr as the tuple.

After the call to the Python function, the variable pValue holds a reference to the return value of the Python function. In our case, after calling “foo()”, pValue refers to the return value “6”. After calling “foo2(2,4)”, pValue refers to the return value “8”. Because pValue is a PyObject*, we need to convert it by using PyInt_AsLong in order to retrieve the integer value that is the result of our function call.

After executing the function call, pValue, along with the tuple that we created earlier (if there were arguments), needs to be released by calling Py_DECREF. This decrements their reference counts, marking them as not needed anymore & allowing the Python run-time to free them.
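What callFunction does through the C API can be mirrored in a few lines of Python, which may help when reading the C++ version. This is only a sketch: call_function is a hypothetical name, and the string status values merely echo the enum constants defined earlier.

```python
import types


def call_function(module, function, args):
    """Look up `function` in `module`, call it with integer args, return (status, result)."""
    target = vars(module).get(function)          # like PyDict_GetItemString
    if target is None:
        return 'FUNCTION_NOT_FOUND', None
    if not callable(target):                     # like PyCallable_Check
        return 'NOT_CALLABLE_OBJECT', None
    try:
        # Like setFunctionArgs; unlike atoi, int() rejects non-numeric strings
        int_args = tuple(int(a) for a in args)
    except ValueError:
        return 'UNKNOWN_INTEGER_ERROR', None
    try:
        return 'OK', target(*int_args)           # like PyObject_CallObject
    except TypeError:
        # Raised, for instance, when the number of arguments does not match
        return 'ARGUMENT_ERROR', None


if __name__ == '__main__':
    # A tiny in-memory module standing in for spam.py
    spam = types.ModuleType('spam')
    spam.foo2 = lambda x, y: x * y
    print(call_function(spam, 'foo2', ['2', '4']))
```

No reference counting shows up here because the Python runtime does it for us; in the C++ version the Py_DECREF calls take over that job.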

3. Cleaning stuff up

Before destroying our PythonInterpreter instance, we would like first to release our stored PyObjects, such as the ones holding the “sys” module & its “path” list (filled in by the call to PythonInterpreter::addCwdInPythonPath in step 1.1) & the module pointers stored in our module map.

void PythonInterpreter::cleanUp()
{
    // Release the loaded module objects & their name objects
    //std::cout << "Destroying " << m_LoadedModules.size() << " modules" << std::endl;
    for(auto &var : m_LoadedModules) {
        Py_DECREF(var.second.first);
        Py_DECREF(var.second.second);
    }

    m_LoadedModules.clear();
    // Release the "sys" module & its "path" list
    //std::cout << "Destroying " << m_Globals.size() << " globals" << std::endl;
    for(auto &var : m_Globals) {
        Py_DECREF(var);
    }
    // Clearing the vector makes a second cleanUp() call harmless
    m_Globals.clear();
}

PythonInterpreter::~PythonInterpreter()
{
    cleanUp();
    // Finish the Python Interpreter
    if(m_Initialized) {
        //std::cout << "PythonInterpreter destroyed!" << std::endl;
        Py_Finalize();
    }
}

After the previous step, we can stop the Python run-time safely by calling Py_Finalize, which shuts the interpreter down & frees the memory it allocated.

4. Call them!

4.1 demoMain.cpp

#include "demo.h" // Contains the version of PythonInterpreter, that was used for this demo
#include <cassert> // Our favorite buddy

int main(int argc, char** argv)
{
    // Stack allocation: the destructor (& thus Py_Finalize) runs when main returns
    Interpret::PythonInterpreter py;

    //import spam
    auto res = py.loadModule("spam");
    assert (res == Interpret::OK);

    //spam.foo2(1,3)
    res = py.callFunction("foo2", {"1","3"});
    assert (res == Interpret::OK);

    //spam.foo()
    res = py.callFunction("foo", {});
    assert (res == Interpret::OK);

    return 0; // 0 signals success to the shell
}

 

4.2 Compile them

gclkaze@tzertzelos:~/C++/CallPythonFromC$ make demo
g++ -std=c++11 -Wall -o demo -g -I/usr/include/python2.7 demoMain.cpp demo.h -lpython2.7

4.3 Run

gclkaze@tzertzelos:~/C++/CallPythonFromC$ ./demo
Foo2
Return of call : 3
Foo
Return of call : 6

You just called the foo & foo2 functions of the Python module spam from C++ using the Python C API. Congratz!

I hope you enjoyed this tutorial. I am providing the demo’s code & the full version of my class on github (the link is below). The canonical version of the class contains all the code shown here plus some extensions for verifying the data returned by the Python function call, as well as unfinished functionality for handling the results in a threaded manner (many calls, many threads, many results).

Now go extend your C++ base with your existing Python utilities,

cheers

kazeone

Linx

Musix

 

C++ 11 helpers: char to wchar (& vice versa) easy conversions

Hello all,

I wanted to share two useful functions for converting between char & wchar_t strings in C++ 11. A colleague shared the info with me & I thought it might be interesting to share it with you guys. Thus, no more calls to mbtowc & wctomb (what kind of names are these?) from stdlib, & no more worrying about the max buffer size.

C++ 11 style char 2 wchar converters

#include <locale>
#include <codecvt>
#include <string>

std::wstring s2ws(const std::string& str)
{
       typedef std::codecvt_utf8<wchar_t> convert_typeX;
       std::wstring_convert<convert_typeX, wchar_t> converterX;

       return converterX.from_bytes(str);
}

std::string ws2s(const std::wstring& wstr)
{
       typedef std::codecvt_utf8<wchar_t> convert_typeX;
       std::wstring_convert<convert_typeX, wchar_t> converterX;

       return converterX.to_bytes(wstr);
}

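For more info about the converters, check the links below. Keep in mind that std::wstring_convert & std::codecvt_utf8 have been deprecated since C++17, so compilers targeting newer standards may warn about them.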

Kazeone

 

Links

wctomb Convert wide character to multibyte sequence

mbtowc Convert multibyte sequence to wide character

wstring_convert

codecvt_utf8

Testing & Code Generation: Generate function call permutations based on arguments

Hello all,

I’ve been kinda busy for the last 2 years with a big home project which is evolving every day. New features are added, more web services are written, more complicated database queries are formed… & after 2 years, this small baby has grown to 17K lines of code distributed in more than 150 files. When I started it, I asked myself, “Should I write tests?” & 0 seconds later I ofc took the decision to add tests for each feature in the project. Thus, a feature is considered complete if & only if there are appropriate tests for it, accompanied ofc by comments & encapsulated in test functions/cases with self-describing names like “test_add_user_with_duplicated_email()”, in order to know what each scenario/test case/function is about (yeah, if you write tests for each feature, then just by looking at the test function names, all supported features of the system are revealed before your eyes 🙂 ).

OK, tests are good!

But… you need to admit that, OMG, it is really boring sometimes! You need to think of all the corner cases & write lots of mock code to simulate the behavior of the environment on which your feature will have an impact, while you are really eager to continue with the implementation of the next feature.

I am sure that many of you, when writing tests for a function, often call the function multiple times with different arguments & assert on its result. In the faulty argument cases, you assert that the function returned False, & True otherwise. Well, imagine you are testing a function that takes 4 arguments, each of which can be either valid or invalid: this gives 2^4 = 16 different argument permutations for the function. What if the function takes 8? Then it is 2^8 = 256 different permutations!! Are you going to write by hand 255 function calls with faulty arguments plus 1 correct call in the end?

No man, I really have better things to do than this!

You know what I did? I wrote a program that generates all 256 function calls for me. And ofc the number of arguments can be arbitrary.

from __future__ import print_function  # runs on both Python 2 & Python 3


def generate_truth_table(perms):
    # bin(8) == '0b1000', so l == 5 for 8 permutations
    b = str(bin(perms))
    l = len(b) - 1
    t = {}
    for i in range(0, perms):
        _b = str(bin(i))
        # Left-pad with zeros, drop the 'b' of the '0b' prefix,
        # then drop the extra leading zero
        s = '%s%s' % ((l - len(_b)) * '0', _b)
        s = s.replace('b', '')
        s = '%s' % s[1:]
        #print s
        t[i] = [int(c) for c in s]
    return t


def generate_function_call_permutation(func, args, indent=1):
    perms = pow(2, len(args))
    t = generate_truth_table(perms)
    keys = sorted(t.keys())
    known_prefix = 'known_'
    unknown_prefix = 'unknown_'
    func_perm = []
    for i in range(0, len(keys)):
        k = keys[i]
        l = t[k]
        assert len(l) == len(args)
        gen_args = []
        for j in range(0, len(l)):
            # 0 -> faulty ("unknown") value, 1 -> correct ("known") value
            if l[j] == 0:
                arg = '%s%s' % (unknown_prefix, args[j])
            else:
                arg = '%s%s' % (known_prefix, args[j])
            gen_args.append(arg)
        row = ','.join([str(j) for j in l])
        if i < len(keys) - 1:
            current_func = 'assert %s(%s) == False #%s' % (func, ','.join(gen_args), row)
        else:
            # Last row: all arguments are correct, so the call must succeed
            current_func = 'assert %s(%s) #%s' % (func, ','.join(gen_args), row)
        func_perm.append(current_func)

    for i in range(0, len(func_perm)):
        print(indent * 3 * ' ', func_perm[i])


if __name__ == '__main__':
    func = 'ap.add_url'
    args = ['artist_id', 'desc', 'url']
    generate_function_call_permutation(func, args)

 

The depicted Python program (generate_function_call_permutations.py) takes as input a function name along with a list of the names of the function’s arguments, & first generates a truth table based on the number of arguments:

for 3 arguments:

2^3 = 8 argument permutations
000
001
010
011
100
101
110
111

 

For each row of the previous table, & for each argument in that row:

If 0 is encountered, add the prefix “unknown_” to the argument’s name. Otherwise, add the prefix “known_” to the argument’s name.

In this way, we generate each tested function call based on the generated truth table & we assign the appropriate argument name based on the table.
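The same rows can also be produced with the standard library instead of binary-string manipulation. The sketch below is an alternative, not the generator described in this post; truth_table & prefixed_args are hypothetical names, & it relies on itertools.product.

```python
from itertools import product


def truth_table(n):
    """Return all 2**n rows of 0/1 values in counting order (000, 001, ...)."""
    return [list(row) for row in product((0, 1), repeat=n)]


def prefixed_args(row, args):
    """Prefix each argument name with 'known_' for 1 & 'unknown_' for 0."""
    return [('known_' if bit else 'unknown_') + name
            for bit, name in zip(row, args)]


if __name__ == '__main__':
    for row in truth_table(3):
        print(row, prefixed_args(row, ['artist_id', 'desc', 'url']))
```

Since product already yields the rows in counting order, the all-correct row comes last, which is the row that gets the plain assert.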

Let’s see what our generator produces for the previous input (func = 'ap.add_url', args = ['artist_id','desc','url']):

    assert ap.add_url(unknown_artist_id,unknown_desc,unknown_url) == False #0,0,0
    assert ap.add_url(unknown_artist_id,unknown_desc,known_url) == False #0,0,1
    assert ap.add_url(unknown_artist_id,known_desc,unknown_url) == False #0,1,0
    assert ap.add_url(unknown_artist_id,known_desc,known_url) == False #0,1,1
    assert ap.add_url(known_artist_id,unknown_desc,unknown_url) == False #1,0,0
    assert ap.add_url(known_artist_id,unknown_desc,known_url) == False #1,0,1
    assert ap.add_url(known_artist_id,known_desc,unknown_url) == False #1,1,0
    assert ap.add_url(known_artist_id,known_desc,known_url) #1,1,1

 

How to use it?

  • declare known_* & unknown_* variables & assign them a correct/expected value & an incorrect value respectively
  • copy & paste the generated code
  • & you are done!

Demo

def test_artist_portfolio(db_name=None):
    reset_db(db_name=db_name)
    db = db_builder(db=db_name)
    attendees = 1000
    DIY = 1
    _city = 'Athens'
    country = 'Greece'
    name = 'NoMeansNo'
    continent_id = 3
    country_id = 300
    state = ''
    setup_mock_city('Athens',state,country_id,continent_id,db=db)
    setup_mock_artist(name,country,attendees,DIY,_city,db=db)
    ap = artist_portfolio(db=db)

    # The faulty ("unknown") description & url are assumed to be empty strings
    known_artist_id, unknown_artist_id = 1, 2
    known_desc, unknown_desc = 'this a description of the url', ''
    known_url, unknown_url = 'https://www.youtube.com/watch?v=SIUmApXqJAk', ''

    assert ap.add_url(unknown_artist_id,unknown_desc,unknown_url) == False #0,0,0
    assert ap.add_url(unknown_artist_id,unknown_desc,known_url) == False #0,0,1
    assert ap.add_url(unknown_artist_id,known_desc,unknown_url) == False #0,1,0
    assert ap.add_url(unknown_artist_id,known_desc,known_url) == False #0,1,1
    assert ap.add_url(known_artist_id,unknown_desc,unknown_url) == False #1,0,0
    assert ap.add_url(known_artist_id,unknown_desc,known_url) == False #1,0,1
    assert ap.add_url(known_artist_id,known_desc,unknown_url) == False #1,1,0
    assert ap.add_url(known_artist_id,known_desc,known_url) #1,1,1

 

Now go run your tests &, if they pass, go & implement some features. Don’t waste your time writing the same function call over & over again!

Hope you enjoyed!

kazeone