Sunday, October 30, 2016

Caffe on Joule - Part two: CMake soup

I assume that you now have CMake and OpenCV installed as detailed in the previous post on Caffe. This post is mostly about "cmake"ing the dependencies and sticking them together to compile Caffe. First we need to get back into the docker container that the Intel® System Studio IoT Edition (ISSI) uses. To do this, fire up the ISSI workspace, which also starts the docker container, then list the running containers and open a bash shell in the ISSI container (replace <container-id> with the ID that "docker ps" prints):
docker ps
docker exec -i -t <container-id> /bin/bash
Now that we are inside the ISSI docker container, without further ado, let's download, build and install a fresh copy of ATLAS [1]. The commands are taken from [2]:
cd /home/root/
wget http://freefr.dl.sourceforge.net/project/math-atlas/Stable/3.10.3/atlas3.10.3.tar.bz2
wget http://www.netlib.org/lapack/lapack-3.5.0.tgz
tar xjvf atlas3.10.3.tar.bz2
cd ATLAS
mkdir build
cd build
../configure --prefix=/usr/local --with-netlib-lapack-tarfile=/home/root/lapack-3.5.0.tgz --nof77 -v 2 --cripple-atlas-performance
make build
make install
Then we go on to install the HDF5 library according to [3]:
cd /home/root/
wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.0-patch1/src/CMake-hdf5-1.10.0-patch1.tar.gz
tar xzvf CMake-hdf5-1.10.0-patch1.tar.gz
cd CMake-hdf5-1.10.0-patch1
./build-unix.sh
cd build
cmake .
make
make install
Then let's do protobuf:
cd /home/root/
wget https://github.com/google/protobuf/archive/master.zip -O protobuf.zip
unzip protobuf.zip
cd protobuf-master/
./autogen.sh
./configure
make
make check
make install
Then we have LMDB:
cd /home/root/
wget https://github.com/LMDB/lmdb/archive/mdb.master.zip
unzip mdb.master.zip
cd lmdb-mdb.master/libraries/liblmdb/
make && make install
Then gflags:
cd /home/root/
wget https://github.com/gflags/gflags/archive/master.zip -O gflags.zip
unzip gflags.zip
cd gflags-master/ && mkdir build && cd build
cmake .. && make && make install
Then glog, yes life is hard:
cd /home/root/
wget https://github.com/google/glog/archive/master.zip -O glog.zip
unzip glog.zip
cd glog-master/ && mkdir build && cd build
cmake .. && make && make install
Finally we get to caffe:
cd /home/root/
wget https://github.com/BVLC/caffe/archive/master.zip -O caffe.zip
unzip caffe.zip
cd caffe-master && mkdir build && cd build
ccmake ..
Notice that this time we are using ccmake instead of cmake, because we need to do some surgery to get things working. In the ccmake dialog press "c" and you will be presented with an error, something about HDF5 not being configured properly. To get past this quickly we are going to set the HDF5 variables manually. First press "e" to exit the error screen, then press "t" to enter advanced mode. Yes, you are now officially a CMAKE NINJA. We'll have to manually copy the following values to their corresponding variables in ccmake:
HDF5_CXX_INCLUDE_DIR:/usr/local/myhdf5/include/
HDF5_C_INCLUDE_DIR:/usr/local/myhdf5/include/
HDF5_hdf5_LIBRARY_RELEASE:/usr/local/myhdf5/lib/libhdf5.a
HDF5_hdf5_cpp_LIBRARY_RELEASE:/usr/local/myhdf5/lib/libhdf5_cpp.a
HDF5_hdf5_hl_LIBRARY_RELEASE:/usr/local/myhdf5/lib/libhdf5_hl.a
HDF5_hdf5_hl_cpp_LIBRARY_RELEA:/usr/local/myhdf5/lib/libhdf5_hl_cpp.a
CMAKE_EXE_LINKER_FLAGS:-ldl -lszip -lz -L/home/root/CMake-hdf5-1.8.17/build/_CPack_Packages/Linux/TGZ/HDF5-1.8.17-Linux/HDF_Group/HDF5/1.8.17/lib/
While you are in ccmake, you had better turn off these flags: USE_LEVELDB, USE_CUDNN and BUILD_SHARED_LIBS, and turn on CPU_ONLY. Now press "c" and then "g", then "make" Caffe! :)

With this, Caffe is built and you can run it in the docker container. You can also try to run it on the Joule, but you will have missing libraries. To fix this, copy all the lib files from the docker container into a folder on the Joule, for example under /home/root/, and set LD_LIBRARY_PATH to that directory.

Even with all this work, Caffe still won't work on the Joule. If you remember, we built ATLAS in the docker container, so it uses instructions that work on the CPU of your development machine, and as soon as Caffe starts computing something on the Joule it will crash. To remedy this we should build the ATLAS libraries on the Joule device itself, but that is a story for another day. gl hf.

[1] http://math-atlas.sourceforge.net/
[2] http://theworldofinteledison.blogspot.com.es/2015/04/getting-scipy-working-on-edison.html
[3] https://support.hdfgroup.org/HDF5/release/cmakebuild.html

Caffe on Intel Joule

Caffe on Joule - Part one: return of OpenCV

TL;DR:
I have a Caffe. I have a Joule. UH: Caffe-Joule [1]:

The long story: Caffe is a deep learning framework written primarily in C++ [2]. Essentially it is a package that allows training and deploying deep neural networks. It uses protobufs to define a language for constructing different networks, and uses GPUs and/or BLAS libraries to speed up the training and testing process.

Joule is a new hardware platform that was just released by Intel [3]. It features a quad-core Atom CPU, 4 GB of RAM, and 16 GB of eMMC, all in a package half the size of a credit card. That is, it packs the computational power of a laptop into a package that fits in a pocket.

Now I am sure there is a ton of new things that we can do with the new Joule platform, but my prediction is that performing deep neural network computations is the main breakthrough that it achieves. With deep learning the device can use microphones and camera images to form an understanding of the surroundings that has previously been impossible to achieve.

In this article I will show how to set up and run a sample OpenCV program on the Intel Joule. The next article will continue with the installation of Caffe.

To get started, first go ahead to [4] and complete it up to and including the section "Setting up the Intel® System Studio IoT Edition (Linux*)". You now have a project set up that blinks the LED on the board. We'll transform this project into one that uses OpenCV to access a USB camera and save images to root's home directory.

First, we shall copy the following piece of code in place of the contents of the main file:
/*
Hey! this is a Hack. don't use for production.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*/

#include <csignal>  //Signal handling from the C/C++ standard library, used here to allow for clean exits.
#include <cstdlib>  //exit()
#include <unistd.h> //usleep()
#include <iostream> //The most commonly-included file. This allows you to use cin and cout to print to the console and collect user input.

#include <mraa.hpp> //Intel's MRAA Library. This library allows the programmer to address the GPIO pins on the development board.

#include <cv.h>
#include "opencv2/highgui.hpp"
#include "opencv2/imgproc.hpp"
#include <stdio.h>


using namespace std; //Commonly considered bad practice, this command prevents some simple mistakes that can be difficult to track down for a beginner. Generally, though, you wouldn't do this.
using namespace cv;
//This is a global variable which will hold our GPIO object. (General Purpose In/Out)
mraa::Gpio *gpio;

//This function is called a "signal handler." It allows a signal like "SIGINT" to exit the program cleanly.
//In this case, the SIGINT will be generated by the OS when you press Ctrl+C on your keyboard.
void signal_handler(int sig) {
 delete gpio;
 cout << "Exiting." << endl;
 exit(0);
}

//The main entry point for your program. This is the function that the OS (OSTRO) calls when it "runs" your code.
int main(int argc, char **argv) {

 signal(SIGINT, signal_handler); //This sets the event handler to the SIGINT signal. This is the signal generated by Ctrl+c.

 cout << "Hello from Intel on Joule! again" << endl //Remember, c++ isn't whitespace-sensitive, so you can use "carriage returns" (split lines) in the middle of function calls.
 << "Press Ctrl+c to exit..." << endl;    //This is very useful when the function call will be long and otherwise unwieldy.

 gpio = new mraa::Gpio(100); //Instantiate the GPIO object so that we can access the pin. (Pin 100)


 Mat image;

 cout << "going to test new stuff" << endl;

 VideoCapture capture;

 int c = 0;
 if(!capture.open(c))
  cout << "Capture from camera #" <<  c << " didn't work" << endl;
 else
  cout << "Capture from camera #" <<  c << " actually worked holy shit!" << endl;

 Mat frame;

 if( capture.isOpened() )
 {
  cout << "Video capturing has been started ..." << endl;

  for(;;)
  {
   capture >> frame;
   if( frame.empty() ){
    cout << "empty frame received?" << endl;
    usleep(100);
    continue; //Nothing to save for an empty frame, so try again.
   }

   Mat frame1 = frame.clone();
   imwrite("coolFile.jpg",frame1);
   cout << "new frame saved" << endl;
   usleep(100);
  }
 }


 delete gpio; //Dispose of the object, now that we no longer need it. This is only needed because the object was allocated on the heap instead of the stack. (I.e. created with a pointer.)
 return 1; //As always, don't forget to return something. In this case, if the code reached this point, there was a problem, so we'll return 1 here.
}
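
A side note on the signal handler above: calling delete, cout and exit from inside a handler works for a hack like this one, but those calls are not guaranteed to be safe in a signal handler. A more conservative pattern is to only set a flag in the handler and let the main loop notice it. The following is a minimal sketch of that idea, not the code used in this project; the names g_stop, request_stop and run_until_stopped are made up for illustration.

```cpp
#include <csignal>
#include <functional>

// Set by the handler, polled by the main loop. volatile std::sig_atomic_t
// is a type the standard allows a signal handler to write to.
volatile std::sig_atomic_t g_stop = 0;

// The handler does nothing except record that a stop was requested.
extern "C" void request_stop(int /*sig*/) {
    g_stop = 1;
}

// Hypothetical helper: runs `step` repeatedly until SIGINT arrives and
// returns the number of completed iterations. Cleanup (such as deleting
// the gpio object) can then happen in normal code after it returns.
int run_until_stopped(const std::function<void()>& step) {
    std::signal(SIGINT, request_stop);
    int iterations = 0;
    while (!g_stop) {
        step();
        ++iterations;
    }
    return iterations;
}
```

With something like this, the body of the for(;;) loop would become the step function, and "delete gpio" followed by "return 0" would run after the loop ends.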

Now you should try to build this, and you will be presented with the error
fatal error: cv.h: No such file or directory
And this makes perfect sense, because we have not yet installed OpenCV. To do so, we first find the docker container that was created with the Eclipse workspace:
docker ps
This will show you information about the running container:
CONTAINER ID        IMAGE                                   COMMAND
257c608539a3        inteliotdevkit/intel-iot-ostro:latest   "/bin/bash"
Note the container ID and launch a shell inside the container, putting that ID in place of <container-id>:
docker exec -i -t <container-id> /bin/bash
Now you should be inside a bash shell. The next step is to download a copy of CMake and install it. We can get a copy at [5]. Untar it, bootstrap, make and make install:
cd /home/root/
wget https://cmake.org/files/v3.7/cmake-3.7.0-rc3.tar.gz
tar xzvf cmake-3.7.0-rc3.tar.gz
cd cmake-3.7.0-rc3
./bootstrap
make
make install
That's it, now you have a fresh copy of CMake installed.
Now for OpenCV, you can get a download link for version 3.1 at [6]. This has to be "cmake"ed, "make"ed and "make install"ed:
cd /home/root
wget https://github.com/Itseez/opencv/archive/3.1.0.zip
unzip 3.1.0.zip
cd opencv-3.1.0/
mkdir build
cd build
cmake ..
make
make install
Although OpenCV is now installed in the docker container, the project will not build yet. We need to show Eclipse CDT where it can find the include files and the lib files. In the "Project Explorer" window, right-click the project and choose "Properties". In the left pane of the dialog that pops up, open the "C/C++ Build" branch by clicking the black arrow beside it, then choose "Settings". At the top of the right pane there is a dropdown box; select "[All configurations]" from it. Now, in the lower part of the right pane, select "Includes" under "IoT Ostro 64-bit G++ compiler". Finally, add the following line to the "Include paths (-I)" sub-pane:
"//${DOCKER_IMAGE}${DOCKER_SYSROOT}/usr/local/include/opencv/"
The following image should make things clearer:

If we try to build the project now, it will compile, but it will give you linker errors such as:
/usr/local/include/opencv2/core/cvstd.hpp:625: undefined reference to `cv::String::allocate(unsigned long)'
To fix this as well, we have to link with the OpenCV libraries. This is done in the same dialog as before. In the left pane of the dialog choose "Libraries" under "IoT Ostro 64-bit G++ Linker". Now add the following libraries one by one to the top right pane:
opencv_core
opencv_imgcodecs
opencv_highgui
opencv_imgproc
opencv_videoio
And that's it folks, your OpenCV project should build and get deployed to the device in question. As soon as it starts running, it will probe the USB and open the camera, if any. Then it will start saving images to the file named "/home/root/coolFile.jpg" on the device.
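
Note that the program overwrites the same coolFile.jpg on every iteration. If you would rather keep every frame, one option is to generate a numbered file name per frame. This is only a sketch; frame_filename is a hypothetical helper that is not part of the program above, and the "frame_" naming scheme is an arbitrary choice.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: builds a name such as "frame_000042.jpg" from a
// running counter, so each saved frame gets its own file.
std::string frame_filename(unsigned long index) {
    char buf[32];
    std::snprintf(buf, sizeof(buf), "frame_%06lu.jpg", index);
    return std::string(buf);
}
```

In the capture loop you would then keep a counter and pass frame_filename(counter++) to imwrite instead of the fixed name. Keep an eye on the free space of the eMMC if you do this.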
In the next post I will elaborate how we can compile and run caffe on the device.

[1] https://youtu.be/Qu5G443dQ4A
[2] https://github.com/BVLC/caffe
[3] https://software.intel.com/en-us/iot/hardware/joule
[4] https://software.intel.com/en-us/getting-started-on-joule
[5] https://cmake.org/download/
[6] http://opencv.org/downloads.html

Friday, January 16, 2015

cuda matrix transpose code and test

This is essentially the code from here:

http://devblogs.nvidia.com/parallelforall/efficient-matrix-transpose-cuda-cc/

I have expanded it to handle non-square matrices whose sizes are not a multiple of the tile size:

/**
 * Written by Amir Hossein Bakhtiary, use as you wish. Shouldn't have any copyright problems.
 */
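// The header names were lost from this listing; these are the ones the code below needs.
#include <cstdlib>   // rand
#include <iostream>  // std::cout

#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/generate.h>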


const int TILE_DIM = 32;
const int BLOCK_ROWS = 8;

template <typename Dtype>
__global__ void transposeCoalesced(Dtype *odata, const Dtype *idata, int rows,int cols)
{
  __shared__ Dtype tile[TILE_DIM][TILE_DIM+1];

  int x = blockIdx.x * TILE_DIM + threadIdx.x;
  int y = blockIdx.y * TILE_DIM + threadIdx.y;

//  if (x >= cols||y >= rows){
//      return;
//  }

  int maxJ = TILE_DIM;
  int maxJ2 = TILE_DIM;
  int otherMaxJ = rows - y;
  if (maxJ > otherMaxJ)
    maxJ = otherMaxJ;


  if ( x < cols ){
    for (int j = 0; j < maxJ; j += BLOCK_ROWS)
     tile[threadIdx.y+j][threadIdx.x] = idata[(y+j)*cols + x];
  }
  __syncthreads();

  x = blockIdx.y * TILE_DIM + threadIdx.x;  // transpose block offset
  y = blockIdx.x * TILE_DIM + threadIdx.y;

  int otherMaxJ2 = cols - y;
  if (maxJ2 > otherMaxJ2){
      maxJ2 = otherMaxJ2;
  }
  if ( x < rows){
    for (int j = 0; j < maxJ2; j += BLOCK_ROWS)
       odata[(y+j)*rows + x] = tile[threadIdx.x][threadIdx.y + j];
  }

}

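// Despite the name, this returns the transpose of the row-major rows x cols matrix "mat".
template <typename Dtype>
thrust::device_vector<Dtype> invertMatrix(thrust::device_vector<Dtype> mat,unsigned rows,unsigned cols){
    thrust::device_vector<Dtype> retval(rows*cols);
    // The kernel uses blockIdx.x for columns and blockIdx.y for rows, so the grid has to follow that order.
    dim3 dimGrid((cols+TILE_DIM-1)/TILE_DIM,(rows+TILE_DIM-1)/TILE_DIM, 1);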
    //dim3 dimGrid((nHashes)/TILE_DIM,(nSamples)/TILE_DIM, 1);
    dim3 dimBlock(TILE_DIM, BLOCK_ROWS, 1);

    transposeCoalesced<<< dimGrid, dimBlock>>>(retval.data().get(),mat.data().get(),rows,cols);
    return retval;
}


template <typename Dtype>
void testInvert(){
    thrust::host_vector<Dtype> h_vec(1000);
    thrust::generate(h_vec.begin(), h_vec.end(), rand);
    int rows = 3 , cols = 4;
    thrust::device_vector<Dtype> a = h_vec;

    thrust::device_vector<Dtype> aInv = invertMatrix(a,rows,cols);
    for (int i = 0 ; i < rows; i++){
        for (int j = 0; j < cols; j++){
            std::cout << (a[i*cols+j]) << " " << aInv[j*rows+i] << std::endl;
        }
    }
}


int main()
{
    testInvert<int>(); // The element type was lost from this listing; int is assumed because rand() returns int.
}
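
The test above only prints pairs of values for you to eyeball. If you want something that can be checked automatically, a plain host-side reference is enough to compare against. The sketch below is ordinary C++ rather than CUDA, and the names tiles_needed and transpose_reference are made up for illustration; it uses the same row-major indexing as the test loop (a[i*cols+j] against aInv[j*rows+i]) and the same round-up division that is used for the grid size.

```cpp
#include <cstddef>
#include <vector>

// Integer ceiling division: how many tiles of size tile_dim cover n elements.
inline int tiles_needed(int n, int tile_dim) {
    return (n + tile_dim - 1) / tile_dim;
}

// Reference transpose of a row-major rows x cols matrix.
// The result is a row-major cols x rows matrix.
template <typename T>
std::vector<T> transpose_reference(const std::vector<T>& in, int rows, int cols) {
    std::vector<T> out(static_cast<std::size_t>(rows) * cols);
    for (int i = 0; i < rows; ++i) {
        for (int j = 0; j < cols; ++j) {
            out[static_cast<std::size_t>(j) * rows + i] =
                in[static_cast<std::size_t>(i) * cols + j];
        }
    }
    return out;
}
```

To use it, copy the first rows*cols elements of the input and the device result back into host vectors and compare the result with the output of transpose_reference. Sizes such as 33 x 65 are worth trying, since they exercise the partially filled tiles.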

Thursday, September 9, 2010

PIMPL

... is a very nice method for use in C++ for abstraction, encapsulation and overall niceness. Why did I not know this before?
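
For anyone who has not met it, the idiom (usually spelled pimpl, for "pointer to implementation") looks roughly like this. It is a generic single-file sketch, not code from any of my projects; the Widget class and its members are invented for illustration, and in real use the two halves would live in a header and a source file.

```cpp
#include <memory>
#include <string>
#include <utility>

// --- what would normally live in widget.h ---
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();                          // defined where Impl is complete
    Widget(Widget&&) noexcept;
    Widget& operator=(Widget&&) noexcept;

    const std::string& name() const;
    int bump();                         // increments and returns a hidden counter

private:
    struct Impl;                        // only forward-declared in the header
    std::unique_ptr<Impl> impl_;
};

// --- what would normally live in widget.cpp ---
struct Widget::Impl {
    std::string name;
    int counter;
};

Widget::Widget(std::string name) : impl_(new Impl{std::move(name), 0}) {}
Widget::~Widget() = default;
Widget::Widget(Widget&&) noexcept = default;
Widget& Widget::operator=(Widget&&) noexcept = default;

const std::string& Widget::name() const { return impl_->name; }
int Widget::bump() { return ++impl_->counter; }
```

Users of the header never see the data members, so the implementation can change without touching (or recompiling) the code that includes it.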

Wednesday, August 18, 2010

Posientation!

Finally I have found a word to describe a variable that holds both the position and orientation. Actually, I had to come up with it myself and it's a word I am proud of: Posientation = Position + Orientation.

I must say it is just very difficult to program when you don't have correct variable names. I had looked for this many times and now that I have it my design looks a lot nicer.
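
To give an idea of what such a variable can look like, here is a small sketch. It assumes a 2-D setting with a single heading angle, which may well differ from what your own design needs; the field names, the units and the advance function are all just for illustration.

```cpp
#include <cmath>

// Hypothetical 2-D example: a position plus a heading angle.
struct Posientation {
    double x;      // position (assumed unit: metres)
    double y;
    double theta;  // orientation in radians, counter-clockwise from the +x axis
};

// Move `distance` along the current heading; the orientation is unchanged.
inline Posientation advance(const Posientation& p, double distance) {
    return Posientation{p.x + distance * std::cos(p.theta),
                        p.y + distance * std::sin(p.theta),
                        p.theta};
}
```

Having one name for the pair means that functions like this take and return a single value instead of juggling separate position and angle arguments.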

Sunday, August 1, 2010

gui complete

That new program I started... Now, for approx. 300 lines of code, I have something that exits when I press 'q'! Let's get this rolling.