Friday, October 31, 2008

Slower with age ... 2nd. part

Phoronix has prepared a second test, this time also testing Fedora 7 to 10 and comparing the results with the big U. In my opinion, Fedora became slower in some areas, but some results show big problems with the U, like this:


So, improving user-friendliness doesn't mean you need better hardware with every release.

Testing many versions of the same distro can show its evolution. Anyway, I'm happy with my Mandriva 2009.

Tuesday, October 28, 2008

BioPython

The other day I needed to download some sequences from NCBI GenBank. I had a list of ids, and typically I would use BioPerl to connect and get the sequence of each one, with code like this:

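#!/usr/bin/perl -w
use strict;
use Bio::DB::GenBank;

my $gb = Bio::DB::GenBank->new;
open my $fh, '<', 'genes_list' or die "cannot open genes_list: $!\n";
while (my $id = <$fh>) {
    chomp $id;
    next unless length $id;              # skip blank lines
    my $seq = $gb->get_Seq_by_id($id);   # one request per id
    print $seq->seq, "\n";               # raw sequence, one per line
}
close $fh;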


But Broadcast, with Mac OS X 10.5.x, cannot compile the BioPerl modules (I found many problems when trying to compile from source, because many dependencies are broken). So I took a look at BioPython, installed it (with some warnings and missing optional packages), and used the following script:

#!/usr/bin/python
from Bio import Entrez

Entrez.email = "mymail@something.org"  # Always tell NCBI who you are
with open("genes_list", "r") as f:
    for line in f:
        myid = line.strip()  # drop the trailing newline
        if not myid:
            continue  # skip blank lines
        handle = Entrez.efetch(db="nucleotide", id=myid, rettype="fasta")
        print(handle.read())
        handle.close()

and this works well ...

I know this is not very efficient, because it requires one call per id, but sometimes that is better when you are downloading big sequences or very many of them.
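If one call per id ever becomes too slow, a middle ground is to group the ids into small batches, since the NCBI E-utilities accept several ids separated by commas in a single efetch request. The helper below is only a sketch: the name `batches` and the default batch size of 50 are my own assumptions, not something from the script above, and it only builds the id strings without contacting NCBI.

```python
def batches(ids, size=50):
    """Yield comma-separated id strings with at most `size` ids each.

    `ids` can be any iterable of strings, for example the lines of genes_list.
    Blank entries are skipped and surrounding whitespace is removed.
    """
    clean = [i.strip() for i in ids if i.strip()]
    for start in range(0, len(clean), size):
        yield ",".join(clean[start:start + size])


# Small demonstration with made-up ids (hypothetical, not real accessions)
for id_string in batches(["id1\n", "id2\n", "\n", "id3\n"], size=2):
    print(id_string)
```

Each yielded string could then be passed as the id argument of Entrez.efetch in place of a single id, which keeps every download reasonably small while reducing the number of calls.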

My two recommendations:
1. Master one computer language but be familiar with others. Better still if you master more than one.
2. Don't waste time when you know there is more than one solution; if one fails, try the next option.

Monday, October 27, 2008

Slower with age ...

I will not comment on this article, since I still want to keep my U-emo-friends, so please read the original at Phoronix: Ubuntu 7.04 to 8.10 Benchmarks: Is Ubuntu Getting Slower? Yes, but it still looks nice in brown/gold ;)

Friday, October 24, 2008

I, computer

Are my computers happier?
Source: Abstruse Goose

Thursday, October 23, 2008

X-rays with a sticky tape

This week one of the most famous scientific journals published this report:

Camara CG, Escobar JV, Hird JR, Putterman SJ, "Correlation between nanosecond X-ray flashes and stick–slip friction in peeling tape.", Nature 455, 1089-1092 (23 October 2008) | doi:10.1038/nature07378
http://www.nature.com/nature/journal/v455/n7216/abs/nature07378.html

Abstract:
Relative motion between two contacting surfaces can produce visible light, called triboluminescence. This concentration of diffuse mechanical energy into electromagnetic radiation has previously been observed to extend even to X-ray energies. Here we report that peeling common adhesive tape in a moderate vacuum produces radio and visible emission, along with nanosecond, 100-mW X-ray pulses that are correlated with stick–slip peeling events. For the observed 15-keV peak in X-ray energy, various models give a competing picture of the discharge process, with the length of the gap between the separating faces of the tape being 30 or 300 μm at the moment of emission. The intensity of X-ray triboluminescence allowed us to use it as a source for X-ray imaging. The limits on energies and flash widths that can be achieved are beyond current theories of tribology.

Translation:
If you peel tape in a vacuum, it emits X-rays which can be detected with a Geiger counter or photographic film.

Please watch the video http://www.nature.com/nature/videoarchive/x-rays/

Wednesday, October 22, 2008

Björk teaches you about electronics



From: http://hackaday.com/2008/10/20/bjork-teaches-you-about-electronics/

Tuesday, October 21, 2008

Bioinformatics Career Survey 2008

Bioinformatics Zen has released the results of the Bioinformatics Career Survey 2008 as a text file. The survey includes data from ~650 people from academia and industry. It's interesting to take a look at the data, so I summarized it in some graphics:

[Charts: Career; Background; Bioinformatics area; Computer Language]

Wednesday, October 15, 2008

Perl BioGolf

Do you know what a Perl Golf problem is? A general problem is posed and you try to solve it with the minimal number of characters in a Perl script; whoever writes the least wins. Sometimes it is a good habit to look at, admire and think about these beautiful pearls. There are usually a lot of them on the Perl Monks website.

Today I was looking for a simpler and more effective subroutine to translate a DNA/RNA sequence into the corresponding peptide using the standard genetic code. I had been using the typical solution, with a hash storing the code and reading the sequence in blocks with substr or pop/shift.

I found these solutions in a Perl Golf challenge:


# Typical solution hashing the codes:
sub f0 { #by tadman
my %g = (
# . - Stop
'UAA'=>'.','UAG'=>'.','UGA'=>'.',
# A - Alanine
'GCU'=>'A','GCC'=>'A','GCA'=>'A','GCG'=>'A',
# C - Cysteine
'UGU'=>'C','UGC'=>'C',
# D - Aspartic Acid
'GAU'=>'D','GAC'=>'D',
# E - Glutamic Acid
'GAA'=>'E','GAG'=>'E',
# F - Phenylalanine
'UUU'=>'F','UUC'=>'F',
# G - Glycine
'GGU'=>'G','GGC'=>'G','GGA'=>'G','GGG'=>'G',
# H - Histidine
'CAU'=>'H','CAC'=>'H',
# I - Isoleucine
'AUU'=>'I','AUC'=>'I','AUA'=>'I',
# K - Lysine
'AAA'=>'K','AAG'=>'K',
# L - Leucine
'CUU'=>'L','CUC'=>'L','CUA'=>'L','CUG'=>'L',
'UUA'=>'L','UUG'=>'L',
# M - Methionine
'AUG'=>'M',
# N - Asparagine
'AAU'=>'N','AAC'=>'N',
# P - Proline
'CCU'=>'P','CCC'=>'P','CCA'=>'P','CCG'=>'P',
# Q - Glutamine
'CAA'=>'Q','CAG'=>'Q',
# R - Arginine
'CGU'=>'R','CGC'=>'R','CGA'=>'R','CGG'=>'R',
'AGA'=>'R','AGG'=>'R',
# S - Serine
'UCU'=>'S','UCC'=>'S','UCA'=>'S','UCG'=>'S',
'AGU'=>'S','AGC'=>'S',
# T - Threonine
'ACU'=>'T','ACC'=>'T','ACA'=>'T','ACG'=>'T',
# V - Valine
'GUU'=>'V','GUC'=>'V','GUA'=>'V','GUG'=>'V',
# W - Tryptophan
'UGG'=>'W',
# Y - Tyrosine
'UAU'=>'Y','UAC'=>'Y',
);
$_=pop;s/.{1,3}/$g{$&}/g;$_
}

# Second solution using the non-specific code.
sub f2{ #by MeowChow
my @r = qw(UA[AG]|UGA GC. - UG[UC] GA[UC] GA[AG] UU[UC] GG. CA[UC] AU[^G] - AA[AG] CU.|UU[AG] AUG AA[UC] - CC. CA[AG] CG.|AG[AG] UC.|AG[UC] AC. - GU. UGG - UA[UC] ^);
((my$t=pop)=~s|..?.?|chr 64+(grep$&=~/$r[$_]/,0..26)[0]|eg);$t=~y/@Z/./d;
$t
}

# Third solution including regex and substitutions
sub f3 { #by no_slogan
$_="KNNKtIIIMRSSRQHHQplr.YY.sLFFL.CCWEDDEavg";
s/[a-z]/uc$&x4/eg;@x=/./g;join"",@x[map{$x=0;$x=$x*4|6&ord for/./g;$x/2}pop=~/.../g]
}

# Fourth solution similar to 3rd.
sub f4 { #by srawls
$_="KNNKtIIIMRSSRQHHQplr.YY.sLFFL.CCWEDDEavg";s/[a-z]/uc$&x4/eg;
join"",(/./g)[map{$x=0;$x=$x*4|6&ord for/./g;$x/2}pop=~/.../g]
}

# Fifth solution inverse from 3rd and 4th
sub f5 { #by tachyon
@_{'UAAUAGUGAGCUGCCGCAGCGUGUUGCGAUGACGAAGAGUUUUUCGGUGGCGGAGGGCAUCACAUUAUCAUAAAAAAGCUUCUCCUACUGUUAUUGAUGAAUAACCCUCCCCCACCGCAACAGCGUCGCCGACGGAGAAGGUCUUCCUCAUCGAGUAGCACUACCACAACGGUUGUCGUAGUGUGGUAUUAC'=~/(...)/g}=split//,'...AAAACCDDEEFFGGGGHHIIIKKLLLLLLMNNPPPPQQRRRRRRSSSSSSTTTTVVVVWYY';
$_=pop;
s/..?.?/$_{$&}/g;
$_
}

# Sixth solution and this is the fastest solution
sub f6 { #by tadman
$_=pop;
y/UCAG/0123/;
s/(.)(.)(.)/substr"FFLLSSSSYY..CC.WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG",$1<<4|$2<<2|$3,1/ge; y/0123//d;
$_
}

All the solutions originally had fewer bytes, but I added some line breaks to present clearer code (really?).

I use the last solution; I just changed the code to ATGC (DNA code) instead of AUGC (RNA code).
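For readers who prefer Python, the idea behind that last solution can be sketched as follows: rank each base (U/T=0, C=1, A=2, G=3), combine the three ranks of a codon into a number from 0 to 63, and use it as an index into the same 64-letter string. The names `TABLE`, `RANK` and `translate` are mine, and accepting both T and U is my own convenience; this is only an illustration of the technique, not a port of the golfed code.

```python
# Amino acids ordered by codon index, with bases ranked U/T=0, C=1, A=2, G=3
# (same 64-letter string as in the Perl solution; "." marks a stop codon).
TABLE = "FFLLSSSSYY..CC.WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
RANK = {"U": 0, "T": 0, "C": 1, "A": 2, "G": 3}


def translate(seq):
    """Translate a DNA or RNA string codon by codon.

    Incomplete trailing codons are ignored; any other letter raises KeyError.
    """
    seq = seq.upper()
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        a, b, c = (RANK[base] for base in seq[i:i + 3])
        peptide.append(TABLE[a << 4 | b << 2 | c])  # index = 16*a + 4*b + c
    return "".join(peptide)


print(translate("ATGGCCTAA"))
```

No hash of 64 codons is needed; the whole genetic code lives in the order of the letters in the string.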

That's why Perl rules in Bioinformatics.

Monday, October 13, 2008

Mandriva 2009

Last week Mandriva released the 2009 version. Because I'm a Mandriva fan, I immediately downloaded the One-KDE ISO, burned it and installed it on my old HP laptop (PIII 1 GHz, 256 MB RAM, 20 GB HD, Wi-Fi card).

The LiveCD ran perfectly, showed me the new KDE4 desktop and made a clean install without problems (many other Linux LiveCDs have problems just booting on old hardware like this). I like to use a separate partition for /home, so my partition table looks like:
  • / 5GB
  • swap - 512 MB
  • /geexbox - 100MB
  • /home - rest
Yes, I want to install GeeXboX; it's great for watching movies.

Some good points are the new design, the fast boot, the best hardware detection and many friendly menus to configure everything. Remarkable is the improved URPMI: it is fast, it now supports simultaneous package downloads, and the best part is the --auto-orphans option, which checks for unused or broken packages and suggests uninstalling them, cleaning the system, even the kernel, by removing unused drivers or modules. Before this I needed to do it manually; now it is automatic.

KDE is a little heavy for this laptop, so I installed XFCE and LXDE as alternatives.

sudo urpmi task-xfce
sudo urpmi task-lxde

A bad point is that I need to wait for the x86_64 version to upgrade my other laptop, which has an AMD Athlon 64 X2.

My desktop:


Download Mandriva: http://www.mandriva.com/en/download
Notes of the release: http://wiki.mandriva.com/en/2009.0_Notes
A tour of the release: http://wiki.mandriva.com/en/2009.0_Tour

Wednesday, October 8, 2008

Rules for BioComputing Happiness

Inspired by the article "al3x's rules for computing happiness" by Alex Payne, I want to extend this theme to my areas: bioinformatics, computational biology and systems biology.

    Software

  1. Use as little software as possible.
  2. Use software that does one thing well.
  3. Do not use software that does many things poorly.
  4. Try to understand how a piece of software works before using it.
  5. Do not use web applications that should be desktop applications.
  6. Do not use desktop applications that should be web applications.
  7. Do not use software that isn't made specifically for your operating system.
  8. Use a plain text editor that you know well. Not a word processor, a plain text editor.
  9. Do not use your text editor for tasks other than editing text.
  10. Do not use software that's unmaintained.
  11. Do not use unpublished software.
  12. Try to use Open Source code.
  13. Stay in touch with the developers and users through forums, mailing lists, ...
  14. If you don't have a formal IT department, learn to maintain your systems.

    Hardware

  1. Some basic analyses do not require powerful computers; you can run many of them locally or through web services. But if you are in a big project, or the data is on the order of GBs or more, consider buying a multicore server or building a Linux cluster.
  2. Use a Mac or Linux for personal computing or development; Windows ... don't waste your time.
  3. Use Linux or BSD on commodity hardware for server computing.
  4. The only peripheral you absolutely need is a hard disk or network drive to put backups on, but get one as big as possible.
  5. Buy as large an external display as you can afford if you'll be working on the computer for more than three hours at a time.
  6. If you'll work with DBs locally, be sure you have an appropriate internet connection.

    File Formats

  1. Keep as much as possible in plain text. Not Word or Excel documents, plain text.
  2. For tasks that plain text doesn't fit, store documents in an open standard file format if possible.

    Coding

  1. Learn and master a computer language (C, Perl, Python, Ruby, Java, Bash), but be open to others.
  2. Learn to use the terminal and its commands; don't be afraid.
  3. Comment your code and try to update it regularly.
  4. Automate tasks that will need re-running after a DB update.
  5. Debug your code and try to be modular in your designs.
  6. Remember your objectives; don't waste time solving common tasks or reinventing the wheel.
  7. First check that everything works well; later put a pretty interface on it.

Finally, be prepared for parasites or people who don't believe in bioinformatics.

Tuesday, October 7, 2008

Phishing with Free Software

The other day I received this email. Fortunately the spam filter detected it, but the content is different: other times I have received similar emails for proprietary software, especially the MS Office suite, which is so expensive that many people want a cheap (and illegal) version. But this time the reference is to a Free Software office suite, OpenOffice, which in a few days will release its new version 3.0.

This is the infamous message:

From Suite 2009
To XXXXXX@XXXXXX.XXX
Date October 6th 2008 18:55
Subject Download Open Office 2009


Open Office Suite 2009
Open, Create & Edit Your Files

Download Office Suite 2009 Here

Edit Word, Excel & Power Point files- 100% MS Office Compatible.
Read and write PDF files just like Adobe.

Here's how to download Open Office 2009:

1. Go to: Download Page
2. Download Open Office 2009
3. Receive access immediately

This software package is the best way to edit your documents.
Publish all of your documents online in the HTML format.

Thank you for choosing us, the worldwide leader in Open Office 2009.

For More Information Visit our Website

Thank You,

David Matthews
Office Solutions




If you want to stop receiving mail, please go to:
http://info--online-email.info/

or you may contact us at the following address:

Plaza Neptuno, local #7
Via ricardo J Alfaro, Tumba Muerto
Panama Ciudad
Republica de Panama

Of course, the message includes some links to info--online-email.info that call a PHP document and send it the email account (this is enough to confirm a valid address and add it to a spam list).

If you want OpenOffice just go to the official website: http://www.openoffice.org/


Friday, October 3, 2008

More Firefox add-ons

Today I installed 2 more add-ons, both to improve my GMail accounts.

1. Better Gmail 2. This utility slightly modifies (pimps, probably) the normal look and use of Gmail: attachments get icons describing their content, the message under the pointer is highlighted, and more options are available. An excellent job by Gina Trapani from LifeHacker.com https://addons.mozilla.org/en-US/firefox/addon/6076


2. Gmail S/MIME. Talking about privacy, this add-on allows us to sign and encrypt messages, a must-have for everyone; you cannot know who's watching in the upper cloud. https://addons.mozilla.org/en-US/firefox/addon/592