Dec 9, 2012

http://coccinellery.org/


Coccinellery is a gallery of semantic patches meant to inspire users of Coccinelle. Currently the semantic patches come from patches we have submitted to the Linux kernel. We are working on improving Coccinellery, and your suggestions and contributions are welcome.

http://coccinellery.org/

"Coccinelle is a program matching and transformation engine which provides the language SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code. Coccinelle was initially targeted towards performing collateral evolutions in Linux. Such evolutions comprise the changes that are needed in client code in response to evolutions in library APIs, and may include modifications such as renaming a function, adding a function argument whose value is somehow context-dependent, and reorganizing a data structure. Beyond collateral evolutions, Coccinelle is successfully used (by us and others) for finding and fixing bugs in systems code."

Nov 4, 2012

Do you want to become a pilot? Read me first.

Fly by Wire is a book by William Langewiesche whose central story is the successful landing of an Airbus A320 on the Hudson River on January 15, 2009.

But that is only the book's central story. The author has an entertaining way of writing and shares details about many of the people involved, such as bird-strike specialists, air traffic controllers, Airbus engineers, and test pilots. There is also a very interesting analysis of the impact of the Airbus fly-by-wire system, and of how turbulent, and boring, the career of a commercial pilot can be.

If you are considering becoming a pilot, read this book first. It may point you toward the right kind of flying, or convince you that there are better ways to make a living.

Sep 23, 2012

Free Mobile Wifi: FreeWifi_secure on Android


I'm in France and decided to give Free Mobile a try. They offer a good service bundle for €20/month.

See: http://mobile.free.fr/

One important feature for me is having Internet access at home, and the "Accès FreeWiFi illimité", or unlimited access to FreeWifi, is what I was expecting to use there.

The problem is that I bought my cell phone in Brazil, and it was not working with FreeWifi. This is related to how the connection authentication works. See these links:

http://code.google.com/p/seek-for-android/wiki/EapSimAka

http://forum.xda-developers.com/showthread.php?t=1639437

http://www.freenews.fr/spip.php?article12150

https://docs.google.com/spreadsheet/ccc?key=0AmQ-TvlJlw9-dHQtQjdCUFVWSnY5T2xMeXBvRGx1Z1E

But you may not need to do anything by hand on your phone. Just try the FreeWifiConfig app:

https://play.google.com/store/apps/details?id=org.bubuabu.freewificonfig

It will automatically configure your phone to connect to FreeWifi_secure.

But note:
1 - It requires a Free Mobile SIM card (GSM chip)
2 - It may not work with all phones

Sep 13, 2012

Winter of the World is on my Kindle!

Today Amazon sent Winter of the World by Ken Follett to my Kindle! I think I won't sleep for a few nights. :-D



Aug 5, 2012

Benchmark: SanDisk Cruzer Blade USB Flash Drive 16GB


The Kingston and SanDisk 16GB pendrives are in the same price range, and I was curious to compare them. The overall performance is poor, but a bit better than the Kingston DT101.

Write performance: 4.2 MB/s
ReWrite performance: 3.5 MB/s
Read performance: 24.5 MB/s
Random Seeks: 657.4 / sec

Compared to the Samsung S2 Portable 500GB USB 2.0 External HD:
8.2 times slower for write
4.9 times slower for rewrite
1.8 times slower for reading
5.3 times more seeks per second

Full performance output:


[root@ace ~]# bonnie++ -n 0 -u 0 -r 7000 -f -b -d /mnt
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ace.home        14G            4265   0  3576   0           24593   3 657.4  23
Latency                        5727ms   11334ms              1788ms   86612us

1.96,1.96,ace.home,1,1344179290,14G,,,,4265,0,3576,0,,,24593,3,657.4,23,,,,,,,,,,,,,,,,,,,5727ms,11334ms,,1788ms,86612us,,,,,,

See: Benchmark: Samsung S2 Portable 500GB USB 2.0 External HD
See more benchmarks: http://blog.parahard.com/search?q=benchmark

Aug 4, 2012

Benchmark: Kingston DataTraveler 101 DT101 G2 16GB


I was curious about the performance of this Kingston USB flash drive. The overall performance is poor.

Write performance: 3.7 MB/s
ReWrite performance: 3.0 MB/s
Read performance: 22.5 MB/s
Random Seeks: 4.3 / sec

Compared to the Samsung S2 Portable 500GB USB 2.0 External HD:
9.4 times slower for write
5.7 times slower for rewrite
2.0 times slower for reading
28.79 times fewer seeks per second

Full performance output:

[root@ace ~]# bonnie++ -n 0 -u 0 -r 7000 -f -b -d /mnt
Using uid:0, gid:0.
Writing intelligently...done
Rewriting...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Version  1.96       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
ace.home        14G            3708   0  3063   0           22568   3   4.3   0
Latency                        6309ms   22399ms             11761us   29385ms

1.96,1.96,ace.home,1,1344111786,14G,,,,3708,0,3063,0,,,22568,3,4.3,0,,,,,,,,,,,,,,,,,,,6309ms,22399ms,,11761us,29385ms,,,,,,

See: Benchmark: Samsung S2 Portable 500GB USB 2.0 External HD
See more benchmarks: http://blog.parahard.com/search?q=benchmark

Jul 21, 2012

Mounting and converting vdi and qcow2 image files

1. Converting a VirtualBox VDI disk file for use with KVM
$ qemu-img convert -O qcow2 diskname.vdi newdiskname.qcow2

2. Mounting raw disk images
# losetup /dev/loop0 image.img
# kpartx -a /dev/loop0
# mount /dev/mapper/loop0p1 /mnt/image

3. Mounting qcow2 disk images

# modprobe nbd max_part=63
# qemu-nbd -c /dev/nbd0 image.img
# mount /dev/nbd0p1 /mnt/image

Optional: LVM
Scanning for LVM on disk images:

# vgscan
# vgchange -ay VolGroupName
# mount /dev/VolGroupName/LogVolName /mnt/image

Finishing:

# umount /mnt/image
# vgchange -an VolGroupName
# killall qemu-nbd
# kpartx -d /dev/loop0
# losetup -d /dev/loop0




From: http://jaysonrowe.blogspot.com.br/2011/10/convert-virtualbox-vdi-disk-for-use.html
From: http://alexeytorkhov.blogspot.com.br/2009/09/mounting-raw-and-qcow2-vm-disk-images.html

Jun 20, 2012

It's oooover! It's oooover! It's all oooover!



As soon as I learned the last grade I was still waiting for, I remembered Galvão Bueno deliriously celebrating the 1994 World Cup final.

I have finally graduated!

It's oooover! It's oooover! It's all oooover!

Jun 9, 2012

Enable power button on minimal install of Fedora 17

After a fresh Fedora 17 minimal install, pressing the physical power button has no effect. This applies to both virtual machines and physical servers. To fix it, just run:

# yum install acpid
# reboot

Jun 3, 2012

Communication between two MSP430

I like the LaunchPad kit for its price, under USD 5, but I also like it for its simplicity.

The LaunchPad kit includes two MSP430 chips. It is very easy to make the second chip work with just a few components.

Schematic used for the second MSP430

This schematic is from: http://www.msp430launchpad.com/2010/07/schematic-images-and-explanation.html

I made two simple programs that communicate with each other. The source code is at:
https://github.com/petersenna/msp430/blob/master/rx-device.c
https://github.com/petersenna/msp430/blob/master/tx-device.c


Lightning fast boot with Fedora 17

See Harald Hoyer's article: Fedora 17 Boot Optimization (from 15 to 2.5 seconds).

One of the boot steps that consumes about 3 seconds is decompressing and running the initrd. If you do not use LVM, software RAID, or partition encryption, you may not need an initrd.

But GRUB 2 is configured to always boot with an initrd. If you edit /etc/grub2.cfg by hand, the change only lasts until your next kernel update. To avoid this, I made a GRUB configuration file that generates entries without an initrd, even for new kernels. Check it out at:

https://github.com/petersenna/Fedora17-fastboot


[peter@ace Fedora17-fastboot]$ systemd-analyze
Startup finished in 1433ms (kernel) + 2150ms (userspace) = 3584ms

3.5 seconds is good for me and I still have a fully operational Fedora 17. :-D

May 1, 2012

SMS UPS on Linux

The software that SMS provides for communicating with its UPS units is precarious, to say the least. It is not well finished and consumes an enormous amount of resources. It is hard to believe that they ship an application server just for that. So I decided to write my own software to talk to the UPS. Here is the output of version 0.1:



It is just a simple executable that does not require anything complex. The source code is available for download at: https://github.com/petersenna/nobreakSMS.

There are also some binaries at: https://github.com/petersenna/nobreakSMS/tree/master/userland/binary


*** UPDATE, DECEMBER 2017 ***
Try nut, the Network UPS Tools. SMS UPS units appear in its list of compatible devices. nut is available in most Linux distributions.

Mar 18, 2012

Emdebian Grip 2.0

The Grip flavor of the Emdebian project allows you to create thinner Debian root file systems. I've created one that includes apt, vim-tiny, net-tools, iputils-ping, and isc-dhcp-client. Later I added grub and kernel 2.6.32-5-686. The total size of this Emdebian install is about 66 MB, which is 3.5 times smaller than the usual 230 MB required for a regular Debian root file system.

Want to test it? I've created a 128 MB bootable image file for the x86 architecture. You will need at least 48 MB of RAM for this image to work. You can download it here. For testing, use a USB pen drive and a personal computer that can boot from USB. The procedure:
1 - After downloading, uncompress using gzip:
# gzip -d peters-emdebian.img.gz

2 - Copy the image to the destination pen drive using dd. Be careful to use the correct destination, as dd will overwrite it. Replace "X" with the correct drive letter. All data on the destination storage will be lost.
# dd if=peters-emdebian.img of=/dev/sdX bs=8k

3 - Boot using the storage you've created. Don't tell anyone, but the root password is: 3dp

You can use $ df -h and $ cat /proc/meminfo to check the resource consumption of this Emdebian install.

Generating the Emdebian root filesystem from a Debian Squeeze box is easy. Making a bootable image requires a little more work, but also works fine. To generate the rootfs, download this file and run:
# multistrap -d grip-squeeze -f grip-squeeze.conf

The Emdebian root filesystem will be saved in the grip-squeeze folder.

More at Emdebian Grip web site.


Mar 7, 2012

Performance Overhead and Comparative Performance of 4 Virtualization Solutions

Virtualization is being sold as a solution for data center hardware idleness. Increasing hardware usage from less than 20% to more than 70% is possible with virtualization solutions. This can bring advantages such as more available computing power and fewer servers consuming electricity, space, and services. But at what cost? What is the price of virtualization in terms of computing power and I/O? What is the overhead of the virtualization layer?

Sample graph showing read performance for Linux

Feb 4, 2012

GitHub Social Coding




GitHub is a simple-to-use and powerful git repository host with a great web interface. If your project is open source, there is no problem if your repository is also open, right? What could be better for open source projects than free hosting on a feature-rich and reliable service?

But if you are not the good guy and want to keep your git repositories private, it looks fair to charge you a little money to keep your code safe.

This looks to be GitHub's basic business model, and the prices are attractive.

For those who have never been in touch with versioning and code repositories, GitHub is also a good place to start. The step-by-step documentation will guide you through the process of using git. It is really easy to get started.

Github also encourages you to make friends and to grow your social network around the code you share. I would like to be your friend on GitHub, so you are invited to see my profile: https://github.com/petersenna

If you are looking for a job, GitHub can also help. You can create your "Job Profile" and select "Available for hire". The idea seems to be that your open code repositories are your portfolio.

Jan 31, 2012

C code optimization benchmark

Steve Oualline discusses C code optimization in his book Practical C Programming. I was curious about the real performance gains. The benchmark results are at the end of the post.

How can this C code be optimized?
matrix1.c

#define X_SIZE 60
#define Y_SIZE 30
int matrix[X_SIZE][Y_SIZE];
void initmatrix(void)
{
    int x, y;

    for (x = 0; x < X_SIZE; ++x) {
        for (y = 0; y < Y_SIZE; ++y) {
            matrix[x][y] = -1;
        }
    }
}
void main()
{
    initmatrix();
}
The first suggested optimization is to use the "register" qualifier for the index variables x and y:
matrix2.c

#define X_SIZE 60
#define Y_SIZE 30
int matrix[X_SIZE][Y_SIZE];
void initmatrix(void)
{
    register int x, y;

    for (x = 0; x < X_SIZE; ++x) {
        for (y = 0; y < Y_SIZE; ++y) {
            matrix[x][y] = -1;
        }
    }
}
void main()
{
    initmatrix();
}
The next suggested optimization is to order the for loops so that the innermost loop is the one that does the most iterations:
matrix3.c

#define X_SIZE 60
#define Y_SIZE 30
int matrix[X_SIZE][Y_SIZE];
void initmatrix(void)
{
    register int x, y;

    for (y = 0; y < Y_SIZE; ++y) {
        for (x = 0; x < X_SIZE; ++x) {
            matrix[x][y] = -1;
        }
    }
}
void main()
{
    initmatrix();
}
The trickiest change to understand is Y_SIZE going from 30 to 32. This activates a feature of most C compilers that converts multiplications by a power of 2 (2, 4, 8, ...) into shifts. This results in a performance gain in the address arithmetic used to reach matrix[x][y]: the compiler replaces a multiplication with a shift, which is cheaper. A small check program illustrating this follows matrix4.c below.
matrix4.c
#define X_SIZE 60
#define Y_SIZE 32
int matrix[X_SIZE][Y_SIZE];
void initmatrix(void)
{
    register int x, y;

    for (y = 0; y < Y_SIZE; ++y) {
        for (x = 0; x < X_SIZE; ++x) {
            matrix[x][y] = -1;
        }
    }
}
void main()
{
    initmatrix();
}
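
As a quick sanity check of the reasoning above, this small standalone program (not from the book) verifies that with Y_SIZE = 32 the element offset x * Y_SIZE + y is the same as (x << 5) + y, which is the substitution the compiler is expected to make.

/* Standalone check (not from the book): with Y_SIZE = 32, the offset
 * of matrix[x][y], which is x * Y_SIZE + y elements from the start of
 * the array, can be computed with a shift instead of a multiply. */
#include <assert.h>
#include <stdio.h>

#define X_SIZE 60
#define Y_SIZE 32

int main(void)
{
    int x, y;

    for (x = 0; x < X_SIZE; ++x) {
        for (y = 0; y < Y_SIZE; ++y) {
            /* x * 32 == x << 5, so the multiply can become a shift */
            assert(x * Y_SIZE + y == (x << 5) + y);
        }
    }
    printf("offsets computed with a shift match the multiplication\n");
    return 0;
}
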
Reducing the number of loops and taking control of the pointer arithmetic yourself is a great performance optimization.
matrix5.c

#define X_SIZE 60
#define Y_SIZE 30
int matrix[X_SIZE][Y_SIZE];
void initmatrix(void)
{
    register int index;
    register int *matrix_ptr;

    matrix_ptr = &matrix[0][0];
    for (index = 0; index < X_SIZE * Y_SIZE; ++index) {
        *matrix_ptr = -1;
        matrix_ptr++;
    }
}
void main()
{
    initmatrix();
}
Reducing the number of variables needed for the pointer arithmetic also improves performance:
matrix6.c
#define X_SIZE 60
#define Y_SIZE 32
int matrix[X_SIZE][Y_SIZE];
void initmatrix(void)
{
    register int *matrix_ptr;

    for (matrix_ptr = &matrix[0][0];
         matrix_ptr <= &matrix[X_SIZE - 1][Y_SIZE - 1];
         ++matrix_ptr) {
        *matrix_ptr = -1;
    }
}
void main()
{
    initmatrix();
}
It looks like there is nothing left to optimize. You can always write some assembly, but that may not be a good idea. The library function memset() can be used to fill the matrix: "Frequently used library subroutines like memset are often coded in assembly language and may make use of special processor-dependent tricks to do the job faster than could be done in C".
matrix7.c
#include <string.h>
#define X_SIZE 60
#define Y_SIZE 30
int matrix[X_SIZE][Y_SIZE];
void initmatrix(void)
{
    memset(matrix, -1, sizeof(matrix));
}

void main()
{
    initmatrix();
}
There is overhead in a function call. It is possible to do better with a macro.
matrix8.c
#include <string.h>
#define X_SIZE 60
#define Y_SIZE 30
int matrix[X_SIZE][Y_SIZE];
#define initmatrix() \
    memset(matrix, -1, sizeof(matrix))


void main()
{
    initmatrix();
}
The improvements look good, but how efficient is each optimization? I measured them in clock cycles and found that the gain is processor dependent.

For clock cycles, lower is better.

Results for: Intel(R) Core(TM) i7-2620M CPU @ 2.70GHz

            clock cycles    times faster
matrix1()   11102.151556    1
matrix2()    6400.36597     1.7346119906
matrix3()    6379.460394    1.740296337
matrix4()    5952.497506    1.8651249404
matrix5()    2154.262528    5.153574094
matrix6()    1907.350431    5.8207193474
matrix7()     792.123493    14.0156827239
matrix8()     780.254779    14.2288799182


Results for: Intel(R) Core(TM)2 Quad CPU    Q8400  @ 2.66GHz

            clock cycles    times faster
matrix1()   17175.114362    1
matrix2()    8153.467501    2.1064797719
matrix3()    8063.182452    2.1300664427
matrix4()    8497.82453     2.0211189701
matrix5()    4300.083046    3.9941355035
matrix6()    4321.695819    3.9741608575
matrix7()    1569.097383    10.945856228
matrix8()    1560.792718    11.0040969335


Results for: AMD Athlon(tm) 7750 Dual-Core Processor @ 2.7GHz

            clock cycles    times faster
matrix1()   25319.969906    1
matrix2()   10329.498185    2.4512294259
matrix3()    8558.934585    2.9583086136
matrix4()    9480.851235    2.6706430972
matrix5()    5544.608885    4.5665926003
matrix6()    5577.454075    4.5397002943
matrix7()     643.046753    39.3750062307
matrix8()     631.545791    40.0920570873


So, it is real! On Intel you can get about 2 times the performance with simple changes, without touching pointer arithmetic. If you take control of the pointer arithmetic and drop some variables, the gain can reach almost 6 times. By using highly specialized subroutines such as memset(), it can reach 14 times. That is much better than I was expecting.

On AMD, using the specialized subroutines can result in a speedup of more than 40 times.

The clock cycle counts are not integers because the values shown are the arithmetic mean of 256 measurements.

For the graphs that show results in clock cycles, lower is better.
C code optimization benchmark: Core i7


C code optimization benchmark: Core 2 Quad


C code optimization benchmark: Athlon X2


rdtscbench was used to run the benchmarks. The source code is available here. The command line for rdtscbench was: "# ./rdtscbench 256 8". It is also available at: https://github.com/petersenna/rdtscbench
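
For readers curious about how the cycle counting works, below is a minimal sketch of measuring elapsed clock cycles with the x86 time stamp counter, assuming GCC on x86/x86_64. It only illustrates the idea and is not the rdtscbench source; see the repository above for the real implementation.

/* Minimal sketch of TSC-based cycle counting on x86 with GCC.
 * This is an illustration only, not the rdtscbench source code. */
#include <stdint.h>
#include <stdio.h>

static inline uint64_t rdtsc(void)
{
    uint32_t lo, hi;

    /* RDTSC returns the 64-bit time stamp counter in EDX:EAX */
    __asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
    return ((uint64_t)hi << 32) | lo;
}

int main(void)
{
    uint64_t start, end;
    volatile int sink = 0;
    int i;

    start = rdtsc();
    for (i = 0; i < 1000; ++i)   /* the code being measured goes here */
        sink += i;
    end = rdtsc();

    printf("elapsed clock cycles: %llu\n", (unsigned long long)(end - start));
    return 0;
}
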

Jan 23, 2012

How to recompile software with hardware optimization?


This may be useful for compiling local applications that you want to run faster.

Try this on your computer:
$ echo "" | gcc -march=native -v -E - 2>&1 | grep cc1
On my computer it returned:
 /usr/libexec/gcc/x86_64-redhat-linux/4.6.1/cc1 -E -quiet -v - -march=corei7-avx -mcx16 -msahf -mno-movbe -maes -mpclmul -mpopcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mavx -msse4.2 -msse4.1 --param l1-cache-size=32 --param l1-cache-line-size=64 --param l2-cache-size=4096 -mtune=corei7-avx
This command probes the local computer for optimization flags. To use them:
$ CFLAGS="[the -march and -m flags printed above]" ./configure
You may consider adding the "-O3" flag, which enables levels 1, 2, and 3 of compile-time optimization. There is more information about -O3 in the gcc man page. To do that, use this instead of the previous line:
$ CFLAGS="-O3 [the -march and -m flags printed above]" ./configure

From: http://blog.mybox.ro/2011/11/02/how-to-recompile-software-with-hardware-optimizations/