Posts Tagged ‘C++’

Smoothie – Smooth Animation Engine

Thursday, June 10th, 2010

Smoothie

What is Smoothie?

Smoothie is a C++ implementation of a physical simulation engine for GUI animations. It calculates the current (per frame) position of a UI element by it’s final position.

What is Smoothie good for (features) ?

  • Animations are smooth, velocity does not change abruptly, feels natural & easy for the eye to track.
  • Animations are parametric yet simple, only a final position is needed to start an animation.
  • Final position may be changed during an ongoing animation.
  • Animation time can be limited, a distant final position doesn’t mean a long animation.
  • Animations can be started by a drag & release action, resulting animation may be constrained.
  • Low CPU usage, works on mobile devices.
  • OS independent.
  • Open source (MIT license)

Usage

Setup

The following code will be used in all the examples.

#include "Smoothie.h"
// this function simulates 25 fps
unsigned int simulator25fps(void)
{
    static unsigned int ticks = 0;
    ticks += 40;
    return ticks;
}

// assign an implementation
Smoothie::getTimeMsecFuncPtr Smoothie::getTimeMsec = simulator25fps;
// print x space marks followed by an asterisk
void show(int x)
{
    while(x--)
        printf(" ");
    printf("*\n");
}

In the following examples I will use a smoothie as defined below:

// create a smoothie positioned at 0, that traverses a distance of 180 win 2000ms (=2 seconds)
Smoothie smoothie(0, 180, 2000);

Requesting an animation of a distance less than 180, will take less that 2000ms.

Requesting an animation of a distance greater than 180, will cause the smoothie to skip the section with the highest speed, so that the total time of the animation will be 2000ms.

Example 1: Going from 0 to 120

// start animation
smoothie.setFinalPosition(120);

unsigned int timeTillIdle;
do
{
    // calculate current position
    int x = smoothie.calcCurrentPosition(&timeTillIdle);
    show(x); // show on the screen the value of x
} while(timeTillIdle); // stop when smoothie is idle: timeTillIdle==0

Example 2: Going from 0 to 120, aborting at 60

// start animation
smoothie.setFinalPosition(120);

unsigned int timeTillIdle;
do
{
    // calculate current position
    int x = smoothie.calcCurrentPosition(&timeTillIdle);
    // request to go back to 0 if we reached 60

    if(x >= 60 && smoothie.getFinalPosition() == 120)
        smoothie.setFinalPosition(0);

    show(x); // show on the screen the value of x

} while(timeTillIdle); // stop when smoothie is idle: timeTillIdle==0

Example 3: “glide” animation after a drag-release action

// drag was released at position=0 with velocity=500.
// final position is in the range of [50..120], bounce size is 20
smoothie.setCurrentConstraint(0, 500, 50, 120, 20);

unsigned int timeTillIdle;
do
{
    // calculate current position
    int x = smoothie.calcCurrentPosition(&timeTillIdle);
    show(x); // show on the screen the value of x
} while(timeTillIdle); // stop when smoothie is idle: timeTillIdle==0

Download Smoothie Source

C++ MicroSleep(int microsecs) from userland Windows/UNIX

Monday, January 11th, 2010

Recently I was investigating various strategies to measure point-to-point bandwidth on the Internet. Most of the recent methods use the packet-pair technique. They involve sending multiple packets to the destination with a precise time spacing between the packets and then inferring network properties based on the change in time spacing at the receiving end. In order to get reliable results and especially given the speed of the Internet today, the sending packet spacing needs to be in the order of microsecs.

The ideal platform for making these measurements would be via a kernel driver or a real-time OS. However, I needed them from userland on both Windows and Linux. How can we achieve a microsec sleep on these platforms? Windows only provides a millisec-resolution sleep via Sleep() and the UNIX function usleep() doesn’t garanttee precision, nor even a resolution better than a millisec. Both Windows and UNIX can also provide microsec sleeping via the select() function, but measurements showed that the resolution was no better.

Below is my implementation of a quasi-precise microsec sleep on both Windows and UNIX. The idea is to obtain the start time in microsecs and then sit in a tight loop waiting for the timer to reach the required sleep period. The problem with this approach is that it consumes CPU during the entire wait. To alleviate this, we can measure the time resolution of the system sleep call and the system scheduler yield call. Then, we can use these functions in a controlled manner in order to perform a nice sleep for most of the desired period, reserving the tight wait loop for the last part.

I observed that the scheduler yield call (on Linux this is sched_yield() and on Windows this is Sleep(0)) typically takes a few microseconds, so it is possible to use this call during the tight wait loop if precision down to this level is not required or if you want to avoid aggresive use of the CPU. The function below accepts a boolean (aggressive) to control whether or not to do this.

    #ifdef _WINDOWS
    #include <windows.h>
    #else
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <time.h>
    #include <sys/time.h>
    #include <unistd.h>
    #include <sched.h>
    #include <math.h>
    #endif
    
    #ifdef _WINDOWS
    #define MICROTIME_SAVE(t) QueryPerformanceCounter( (LARGE_INTEGER*)&t )
    #define MICROTIME_DIFF(t1, t2, freq) (int)((t2.QuadPart >= t1.QuadPart ? \
      t2.QuadPart - t1.QuadPart : \
      0xffffffffULL - t1.QuadPart + 1 + t2.QuadPart) * 1000000ULL / freq.QuadPart)
    #define USLEEP(t) Sleep((int)((t > 1000) ? t / 1000 : 1))
    #define SCHED_YIELD() Sleep(0)
    #define MICROTIME_ADD(t, i, freq) \
      (t.QuadPart += ((unsigned long long)i * freq.QuadPart / 1000000ULL))
    #define MICROTIME_SUB(t, i, freq) \
      (t.QuadPart -= ((unsigned long long)i * freq.QuadPart / 1000000ULL))
    #define MICROTIME_REACHED(curr, exp) ((curr.QuadPart < exp.QuadPart) ? \
      (exp.QuadPart - curr.QuadPart > 0x80000000) :\
      (curr.QuadPart - exp.QuadPart < 0x80000000))
    #else
    #define MICROTIME_SAVE(t) gettimeofday(&t, NULL)
    #define MICROTIME_DIFF(t1, t2, freq) (int)((t2.tv_sec - t1.tv_sec) * 1000000 + \
      (t2.tv_usec - t1.tv_usec))
    #define USLEEP(t) usleep(t)
    #define SCHED_YIELD() sched_yield()
    #define MICROTIME_ADD(t, i, freq) { t.tv_usec += i; if(t.tv_usec >= 1000000) \
      { t.tv_sec++; t.tv_usec -= 1000000; } }
    #define MICROTIME_SUB(t, i, freq) { if(i > t.tv_usec) \
    { t.tv_sec--; t.tv_usec = 1000000 - (i - t.tv_usec); } \
      else { t.tv_usec -= i; } }
    #define MICROTIME_REACHED(curr, exp) (curr.tv_sec > exp.tv_sec || \
      (curr.tv_sec == exp.tv_sec && curr.tv_usec > exp.tv_usec))
    #endif
    int MicroSleep(int microsecs, bool aggressive)
    {
        static int  minSleep = 1;
        static int  minYield = 1;
        static bool bNotInit = true;
        int interval, diff;
    #ifdef _WINDOWS
        static ULARGE_INTEGER freq;
        ULARGE_INTEGER tv1, tv2, exp, exp2, curr;
    #else
        static int freq;
        struct timeval tv1, tv2, exp, exp2, curr;
    #endif
    
        if(bNotInit)
        {
    #ifdef _WINDOWS
            // Get clock frequence
            QueryPerformanceFrequency( (LARGE_INTEGER*)&freq );
    #else
            freq = 1;
    #endif
    
            // Find out the resolution of usleep().
            for(int i = 0; i  minSleep)
                {
                    minSleep = diff;
                }
            }
            minSleep *= 2;
    
            // Find out the resultion of sched_yield()
            for(int i = 0; i  minYield)
                {
                    minYield = interval;
                }
            }
            minYield *= 2;
    
            bNotInit = false;
        }
    
        MICROTIME_SAVE(tv1);
        exp = tv1;
        MICROTIME_ADD(exp, microsecs, freq);
        interval = microsecs - minSleep - 2 * minYield;
        if(interval > 0)
        {
            USLEEP(interval);
        }
        if(!aggressive && (microsecs > minYield))
        {
            // We can only use sched_yield() until we reach within minYield of the
            // dead-line. Work out the cutoff for its use.
            exp2 = exp;
            MICROTIME_SUB(exp2, 4 * minYield, freq);
            while(1)
            {
                MICROTIME_SAVE(curr);
                if(MICROTIME_REACHED(curr, exp2))
                {
                    break;
                }
                SCHED_YIELD();
            }
        }
    
        while(1)
        {
            MICROTIME_SAVE(curr);
            if(MICROTIME_REACHED(curr, exp))
            {
                break;
            }
        }
    
        MICROTIME_SAVE(tv2);
    
        return MICROTIME_DIFF(tv1, tv2, freq);
    }

Testing shows that this code produces precise results with occasional lack of precision of the order of 10 – 50 microsecs. These outliers are the result of context switching. This can be reduced by running the thread at a higher priority level. In a later blog, I will demonstrate how to create a worker thread that runs at a high priority and provides a precise microsec frequency signal that other threads can listen to via a semaphore.

C++ Bitwise Shift Left Acting Like a Rotation

Monday, November 16th, 2009

Just when I assumed that I knew everything about the C/C++ programming language …

Simple C/C++ question: What is the value of the following expression on a x86 32-bit system?

unsigned int i = 1;
i <<= 33;

Answer: i = 2

What!? It can’t be. It should be 0. A 32-bit integer a logical 33-bit shift to the left should contain no more bits, shouldn’t it?

I am old enough that my first programming language was assembly, so I know a thing or two about logical shift operations and rotations. In the former, the most significant bit is shifted off into the carry flag and the least significant bit is cleared. In the latter, the most signigicant bit is moved into the least significant bit. So, given the above result, it looked like the expression was being coded as a rotation operation. But that I could not accept. So I dug a little deeper:

unsigned int i = 0×8000;
i <<= 1;

This gives i = 0, as expected. So we don’t have a rotation going on here.

Which implies that what is happening is this:

unsigned int shift = 33;
unsigned int i = 1;
i <<= (shift % 32);

We checked the assembly code to see how( i <<= 33) is encoded for an x86 CPU. It uses the “sall” instruction, arithmetic shift left with argument 33. In other words, it is the CPU that is doing the modulo (shift % 32) for a 32 bit value and (shift % 64) for a 64 bit value.

Whow, that was surprising news. I know next to nothing about CPU design, but I’m going to guess that this instruction is wired in the CPU in such a way that the shift amount is some kind of key that affects which part of the circuitry is used to perform the operation. Hence the maximum value of the key would be the maximum number of bits in the value being manipulated. They probably wire the lower 5 bits (for a 32 bit value) of the shift amount and use this to select the circuitry that will be doing the actual moving of bits.  Presumably this is some kind of optimization to perform the operation in one cycle instead of doing a shift by 1 for the number of times in the shift amount in which case it would take many cycles to perform the operation.

The moral of the story: if you have to do a shift operation by a variable amount, always check that the shift amount doesn’t exceed the bit-width of the value that you are manipulating.