Debugging Go applications on FreeBSD

Go uses a standardised debugging standard called DWARF. Yes, the pun is intended because DWARF was developed alongside ELF (Executable and Linking Format)

The be able to debug you need a debugger wich understands this standard.  The version Go build emits is DWARF3, GDB version 7.x supports this version. The standard debugger (used to compile the system) is version 6 9DWARF2 only), so you need to install a newer one. (/usr/ports/devel/gdb is version 7.x).

Instead of the usual `go run MyBuggyProgram.go`, you need to first build the program:
go build MyBuggyProgram.go

and start it with the debugger:
/usr/local/bin/gdb74 EchoSCTPServer -d $GOROOT

The environment variable GOROOT is optional but is useful if you want to debug the Go code itself, it has to point to a directory where Go is build from source.

Once within the debugger the usual commands are valid:

  • Ctrl+l: clear screen.
  • run: run programme
  • break: set break point
  • next: execute current line and move on the next
  • print (variable): print the value of (variable)
  • step: step into a function while executing

An example debug session:

(gdb) b 33
Note: breakpoint 1 also set at pc 0x400c8b.
Breakpoint 2 at 0x400c8b: file /home/olivier/projects/msc-report-src/echo-go-sctp/server/EchoSCTPServer.go, line 33.
(gdb) run
(gdb) s
net.ListenPacket (net="sctp", addr="localhost:4242", noname=void) at /home/olivier/projects/go-sctp/src/pkg/net/dial.go:213
213 func ListenPacket(net, addr string) (PacketConn, error) {
(gdb) s
214 afnet, a, err := resolveNetAddr("listen", net, addr)
(gdb) n
215 if err != nil {
(gdb) p err
$1 = 0
(gdb) p a
$2 = {IP = {array = "\177", len = 4, cap = 4}, Port = 4242}

More info can be found on the Go blog

Happy debugging 🙂

Olivier

Setting up a development environment for Go

A short blog post on how I’ve setup my development environment so I can develop on Go.

I’ve limited myself to FreeBSD (amd64) and Vim. However most of this setup should also work on other Unix environments and editors (e.g Emacs).

First step: install Go specific vim plugins. The Go source code ships with vim plugins which can be found in $GOROOT/misc/vim. The readme.txt explains the different options to install the plugins. You can soft link to this directory or copy everything straight into your ~/.vim/ directory. These plugins enable syntax highlighting, indentation and documentation lookup.

The second step is to enable you to jump around the code and go to definitions of specific functions. To enable this I use the (rather) well known ctags tools. A version of ctags which support Go can be found here: https://github.com/lyosha/ctags-go.git

Installations is simple, make install (watch out this overwrites previously installed ctags in /usr/local/bin). After this do a `ctags –recurse –sort=yes` at the top of your source tree, this will generater a tags file which can will be used by vim to lookup locations of code definitions. (C-[ : jump to definition, C-t : jump back)

The third step (last) is code completions. This can be done with a plugin called Omni Complete in combination with the Go autocompletion daemon. If your Go environment variables are setup correctly, the following command should do the trick: go get -u github.com/nsf/gocode.  The daemon works in combination with the Go vim plugins. In Vim C-x C-o will open the menu with context sensitive suggestions.

Happy coding!

 

Varnish reordering query string

(Update: now listed as a module on the official Varnish site)
(Update: this code is being used in production without any problems in several companies I worked for)

In Varnish the URL is the key to the caching. If it recognises a previously requested URL it will look if it’s available in its cache and deliver this back.  There is a small problem with URLs which have parameters. Take a look at the following queries:

http://localhost/test?ddd=444&bbb=222&ccc=333&aaa=111
http://localhost/test?ccc=333&aaa=111&ddd=444&bbb=222
http://localhost/test?bbb=222&ccc=333&ddd=444&aaa=111

Each of them will return the same result, the parameters are the same, only the order is different.  Varnish treats each of them as a different query and will, in this case, do three separate requests to the backend and cache all of them.

To deal with this I’ve written a small bit of C code that can be embedded in the varnish configuration file which will order the parameters so URLs with unordered parameters will become the ordered and therefor have an equal cache key.

To follow our example the three URLs all get ordered like this:

http://localhost/test?aaa=111&bbb=222&ccc=333&ddd=444

How it works:
I tokenise the url, put the parameters in a binary tree and do an in order traversal to get them out again. Performance of this method is on average O(n log(n)).

Code can be found here: https://github.com/cyberroadie/varnish-urlsort
It has a FreeBSD license so please feel free to use it. One warning though: I haven’t used it in a live environment, so do take care!!

Olivier

Update: As suggested on the Varnish mailing list it now also compiles as a Varnish module and can be used like this (once installed):

import urlsort;
sub vcl_recv {
  set req.url = urlsort.sortquery(req.url);
}

Update:

Checked memory usage with valgrind, no leaks 🙂

valgrind -v --dsymutil=yes ./urlsort "http://localhost/test?ddd=444&bbb=222&ccc=333&aaa=111"
==66758== 
==66758== HEAP SUMMARY:
==66758== in use at exit: 6,241 bytes in 33 blocks
==66758== total heap usage: 39 allocs, 6 frees, 6,445 bytes allocated
==66758== 
==66758== Searching for pointers to 33 not-freed blocks
==66758== Checked 488,280 bytes
==66758== 
==66758== LEAK SUMMARY:
==66758== definitely lost: 0 bytes in 0 blocks
==66758== indirectly lost: 0 bytes in 0 blocks
==66758== possibly lost: 0 bytes in 0 blocks
==66758== still reachable: 6,241 bytes in 33 blocks
==66758== suppressed: 0 bytes in 0 blocks
==66758== Rerun with --leak-check=full to see details of leaked memory
==66758== 
==66758== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
--66758-- 
--66758-- used_suppression: 1 OSX107:blah
==66758== 
==66758== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 1 from 1)
iPhone travel apps for London

London travel iPhone apps

So here is a list of essential iPhone apps you need when travelling in London.

iPhone London travel apps

Before you start your journey check the tube status for line and station closures and the state of the service, particularly handy during the weekends when there’s lots of planned engineering work.

Plan your route with TubeMap, it also tells you departure times, has ‘find station’ and a ‘shortest route calculator’ functionality.

Tube Exits

Tube Exits is pure genius. It tells you what tube carriage you have to get on so you walk the shortest distance underground when changing lines or finding the exit. The developer of this app travelled the whole London Undergound the find the shortest route in all the stations.

Travel Deluxe app

Travel Deluxe helps you plan your journey through London on any public transport available. (Yes Boris bikes too).

With the London Cycle app you can find the closet docking station near you of the Barclay Cycle Hire scheme. It also tells you how many bikes and empty spaces are available per station. Comes with a timer so you can keep an eye on the costs.(<30mins is free + days access).

Bus Checker app

And last but not least: London Bus Checker. It helps you locate the bus stop you’re at and tells you how long it takes for the bus to arrive. Comes with map and bus routes.

One-to-one or one-to-many? That is the question!

Overview

There are two models of programming in SCTP, one-to-one and one-to-many. The one-to-many comes with all SCTP has to offers, the one-to-one model only a limited set of features can be implemented. However implementing the one-to-one model is very similar to how you would implement the same functionality in TCP. This makes migrating an existing application a relatively painless exercise. If you want to upgrade your existing TCP application to an one-to-many SCTP application significant retooling is needed [1].

Spot the difference

The easiest way to spot the difference between the two is by looking at how the endpoint for communication is created:

One-to-One style

sd = socket(PF_INET, SOCK_STREAM, IPPROTO_SCTP);

SOCK_STREAM stands for stream socket, the data stream is clearly associated with one socket.

One-to-Many style

sd = socket(PF_INET, SOCK_SEQPACKET, IPPROTO_SCTP);

SOCK_SEQPACKET stands for sequenced packet stream. The data stream is sequenced but there is no mention of a socket in the name.

The slightly less obvious way to see the difference is the way the connection between the client and server is set up.

Program Flow

Here is the one-to-one abstraction model:

One to One TCP lookalike

The server creates a passive socket with a listen() followed by an accept() and waits for a connection to come in.  The client creates an active socket and establishes a connection with a connect(). The moment accept() gets a connection request it creates a new socket and allocates a new file descriptor. The association between the two systems gets created explicitly and incoming connections are handled iteratively.

This is the one-to-many model:

One to many, full on SCTP

After the listen() a connection can come in from multiple clients. The association between the systems will be set up implicitly as soon as the client sends a message. As soon as the client closes the association the server releases the association resources too.

So what’s the difference?

Sending data during connection setup

Only the one-to-many is capable of sending data on the third leg of the four-way handshake SCTP does during connection setup [3].  This speeds up data transmission.

Iterative or Concurrent? (another question)

With the one-to-many model multiple associations can transport data over the same socket. The different connections are handled iteratively. With the sctp_peeloff() function an association can be detached in its own separate socket and/or thread. So if you want to you can use a concurrent model.

Connection State

Because of the connectionless nature of the one-to-many mode a lot of the connection state gets handled by the underlying SCTP transport stack and is of no concern for the application.

Conclusion

The one-to-many model has many advantages; it gives you a clear choice on how to handle your connections. It is even possible to combine an iterative server model with a concurrent one. Data can already be sent whilst setting up the association. And last but not least the application has less connection state to maintain.

The one-to-one model on the other hand not only gives you an easy migration path from an existing TCP application it also makes it easy to switch between TCP and SCTP in the same application; the only difference is the socket() call and maybe a setsockopt().

In a following post I will get into more detail on how to implement this. A good book with lots of examples and detailed explanation on how SCTP (and other protocols) work is Unix Network Programming, well worth a read.

References

[1][2] W. R. Stevens, B. Fenner, and A. M. Rudoff, Unix Network Programming: Sockets Networking API v. 1, 3rd ed. Addison Wesley, 2003. p.267, p271

[3] “sctp,” FreeBSD Man Pages. [Online]. Available: http://www.freebsd.org/cgi/man.cgi?query=sctp&manpath=FreeBSD+8.2-RELEASE. [Accessed: 24-Dec-2011].

Zookeeper Single File Leader Election with Retry Logic

In a previous post I explained how to implement leader election as suggested on the Zookeeper website. I posted my solution on the Zookeeper mailing list and got some useful tips. The first one was on how to do leader election. The method suggested is a single file leader election. Works like this:

  1. Try creating ephemeral ZNode
    1. Succes, become leader
    2. Fail, stay inactive
  2. Set watch on ZNode
  3. If ZNode disappears goto 1.

All clients try create the same ephemeral node, which has to be unique, so only one client will be able to create the node. The client who creates the node first becomes the leader, the rest  of the clients stay inactive and wait for the node to disappear before trying to create a node again.

The second suggestion I was given is to keep the connection to Zookeeper in some sort of retry logic. Assuming things will go wrong we have to make sure there is a system in place which can recover from a bad situation.
Here a class diagram of all objects involved:
Class diagram single file leader election
The following sequence diagram explains the flow of the program in case we have a successful election, followed by a signal indicating the node has disappeared.

The ZnodeMonitor implements the Watcher interface which has the process method. As soon as the ZNodeMonitor is set as  the Watcher the Zookeeper can talk back to it in case something changes.

First the client connects to the zookeeper by constructing a RecoverableZookeeper. This happens in the start method, called by the SpeakerServer:

public void start() throws IOException {
    this.zk = new RecoverableZookeeper(connectionString, this);
}

The ZNodeMonitor sets itself as a Watcher (this). This gives Zookeeper the opportunity to call back the ZNodeMonitor as soon as it is connected to the server. It does this by sending a None event type with a SyncConnected state.

@Override
public void process(WatchedEvent watchedEvent) {
    switch (watchedEvent.getType()) {
        case None:
            processNoneEvent(watchedEvent);
            break;
        case NodeDeleted:
            listener.stopSpeaking();
            createZNode();
    }
    try {
        zk.exists(ROOT, this);
    } catch (Exception e) {
        shutdown(e);
    }
}

public void processNoneEvent(WatchedEvent event) {
    switch (event.getState()) {
        case SyncConnected:
            createZNode();
            break;
        case AuthFailed:
        case Disconnected:
        case Expired:
        default:
            listener.stopSpeaking();
            break;
    }
}

Next the client tries to create a ZNode:

public void createZNode() {
    try {
        zk.create(ROOT, listener.getProcessName().getBytes());
        listener.startSpeaking();
    } catch (Exception e) {
        // Something went wrong, lets try set a watch first before
        // we take any action
    }
}

After this (successful or not) it tries setting a watch via the exist() method (also in the process method, see above).

The retry logic is encapsulated in the RecoverableZookeeper. The create() method wraps the original Zookeeper create method in the following retry logic:

public String create(String path, byte[] data) throws KeeperException, InterruptedException {
    RetryCounter retryCounter = retryCounterFactory.create();
    while (true) {
        try {
            return zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL);
        } catch (KeeperException e) {
            logger.debug("Error code: " + e.code());
            switch (e.code()) {
                case NODEEXISTS:
                    if (retryCounter.shouldRetry()) {
                        byte[] currentData = zk.getData(path, false, null);
                        if (currentData != null && Arrays.equals(currentData, data)) {
                            return path;
                        }
                        throw e;
                    }
                    throw e;
                case CONNECTIONLOSS:
                case OPERATIONTIMEOUT:
                    if (!retryCounter.shouldRetry()) {
                        break;
                    }
                default:
                    throw e;
            }
        }
        retryCounter.sleepUntilNextRetry();
        retryCounter.useRetry();
}

If the creation of the Znode fails it retries until the reties get exhausted. In case the node already exists it first checks if it created the node itself, otherwise it will throw an exception. In case of a loss of connection or an operation timeout, it keeps retrying.

The complete code is available in my github account. Any questions or suggestions please feel free to send me an email or post a comment. This code also include some simple Ruby scripts to create configuration files and to start and stop multiple Zookeeper servers on a single machine. Handy if you want to test different scenarios.

For a full implementation of retry logic with Zookeeper I recommend the Netflix Zookeeper Library (Curator) which implements all of this and much more and is tested in a large scale environment.

Code Gem: Retry Logic in HBase

Found this retry logic whilst reading HBase codebase. It encapsulates retry logic in what I think is an elegant manner. It keeps track of retries and the time between retries grows exponential.

This is how you can use it in your code:

public String doAndRetryThis() throws InterruptedException {
    RetryCounter retryCounter = retryCounterFactory.create();
    while (true) {
        try {
            return doThis();
        } catch (Exception e) {
            if (!retryCounter.shouldRetry()) {
                throw e;
           }
        }
        retryCounter.sleepUntilNextRetry();
        retryCounter.useRetry();
    }
}

And this is how it is implemented:


public class RetryCounter {
 private static final Log LOG = LogFactory.getLog(RetryCounter.class);
 private final int maxRetries;
 private int retriesRemaining;
 private final int retryIntervalMillis;
 private final TimeUnit timeUnit;

public RetryCounter(int maxRetries,
 int retryIntervalMillis, TimeUnit timeUnit) {
 this.maxRetries = maxRetries;
 this.retriesRemaining = maxRetries;
 this.retryIntervalMillis = retryIntervalMillis;
 this.timeUnit = timeUnit;
 }

public int getMaxRetries() {
 return maxRetries;
 }

/**
 * Sleep for a exponentially back off time
 * @throws InterruptedException
 */
 public void sleepUntilNextRetry() throws InterruptedException {
 int attempts = getAttemptTimes();
 long sleepTime = (long) (retryIntervalMillis * Math.pow(2, attempts));
 LOG.info("The " + attempts + " times to retry after sleeping " + sleepTime
 + " ms");
 timeUnit.sleep(sleepTime);
 }

public boolean shouldRetry() {
 return retriesRemaining > 0;
 }

public void useRetry() {
 retriesRemaining--;
 }

 public int getAttemptTimes() {
 return maxRetries-retriesRemaining+1;
 }
}

And it comes with its little Factory:

public class RetryCounterFactory {
 private final int maxRetries;
 private final int retryIntervalMillis;

public RetryCounterFactory(int maxRetries, int retryIntervalMillis) {
 this.maxRetries = maxRetries;
 this.retryIntervalMillis = retryIntervalMillis;
 }

public RetryCounter create() {
 return new RetryCounter(
 maxRetries, retryIntervalMillis, TimeUnit.MILLISECONDS
 );
 }
}

Code can be found here and here