Alpha parameter doesn’t work on geom_rect!!! Sort of…

The parameter alpha in the R package ggplot2 is used to express the transparency of the fill colour of the function geom_

However for the function geom_rect it might not work as aspected.

In my latest work, I tried to combine different geom function but I was stuck when all was covered when I used geom_rect.
Let’s see an example:

df = 
    x = c(rep(1, 25), rep(2, 25), rep(3, 25)),
    y = c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25)),
    classes = c(rep("A", 25), rep("B", 25), rep("C", 25)))

#  x  y classes
#1 1 45       A
#2 1  4       A
#3 1 41       A
#4 1 32       A
#5 1  8       A
36 1 14       A

If we plot the data using geom_jitter and geom_boxplot we obtain the plot:

ggplot(data = df,
       aes(x = x, y = y, colour = classes)) + 
  geom_jitter() +
  geom_boxplot(alpha = 0.2) +
Continue reading “Alpha parameter doesn’t work on geom_rect!!! Sort of…”

Ho to do a simple SVM classification in R and python

Support Vector Machine (SVM) is a supervised learning model used for data classification and regression analysis.

It is one of the main machine learning methods used in modern-day artificial intelligence, and it has spread widely in all fields of research, not least, in bioinformatics.

The SVM classification method has, in general, a good classification efficiency, and it is flexible enough to be used with a great range of data.

Languages, like R or Python, offer several libraries to compute and work with SVMs in a simple and flexible way.

Let’s see how to create a classification of the database in R and Python using some basic code.

For this example we will use the Iris dataset.

Continue reading “Ho to do a simple SVM classification in R and python”

Testing in R

To test a given function in R, I use the package ‘testthat’.

A very nice intro to the package is present here:

However, in order to write good quality code and to do it only once, you got to carry out a paradigm shift in your writing procedure:

How to write a tests-oriented program

You might have the natural tendency to write the tests after your code.

However, this is not the best approach, indeed, after that, you might need to rewrite big part of the code, to make it more ‘testable’.

In order to avoid that, you need to write your test before the program.

Certainly, this is is a bit harder to implement, especially the first times.

Therefore, to make it easier, follow these guidelines:


  • Independent files:
    • One for the program, one for the tests
  • Code stile for the program:
    • Atomicity:
      • A single R file for each task/objective.
      • Single functions for each step of your algorithm. This will help later in the test
    • A main function to be invoked. This will list all the functions (steps) of the program.
  • Tests, in and out:
    • Within the program file
      • Tests that define and check the inputs
      • Tests that define and check the outputs
    • Within the test file
      • A test with correct inputs
      • Wrong inputs
      • Exceptions
Continue reading “Testing in R”

Come fare un grafico a barre con ggplot2

Il pacchetto ggplot2 è una delle risorse più potenti per la creazione di grafici in R.

Anche se, ggplot2 ha una curva di apprendimento piuttosto alta che potrebbe scoraggiare chi inizia a usarlo, ma credetemi ne vale sicuramente la pena.

Qui voglio mostrare un paio di esempi dei grafici a barre:

# Per questi grafici abbiamo le informazioni relative a un database di topi immunizzati con due diversi antigeni OVA e CFA, 
# e sono riportati i tempi  dopo la vaccinazione. 
# Il gruppo di controllo ha il tempo zero perché quelli non sono stati immunizzati.

library(ggplot2) time_days=c(0,0,0,0,0,0,0,0,0,5,5,5,5,5,5,7,7,7,7,7,14,14,14,14,14,14,60,60,60,60,60,60,60,60,60,60,60)
db <- data.frame(time_days,antigens) # Il database come data.framse 
ggplot2::qplot( data = db, factor(time_days), fill = factor(time_days), geom = "bar" )+
 ggplot2::scale_fill_discrete(name='Time in days') +
 ggplot2::ggtitle('Group of mice per date of sacrifice') + 
 ggplot2::xlab('Time in days') +
ggplot2::ylab('Number of mice') 

Con il seguente risultato:

Con questa trama vediamo il numero di topi e il tempo in cui sono stati sacrificati.

Ora se vogliamo vedere sia il numero di topi che gli antigeni usati potremmo fare il seguente:

ggplot2::qplot( data=db, geom="bar", factor(time_days), fill=factor(antigens) ) +
 ggplot2::theme_classic() +
 ggplot2::ggtitle('Group of mice per date of sacrifice and antigens')+
 ggplot2::scale_fill_discrete(name='Antigens') +
 ggplot2::xlab('Time in days') +

Con il seguente risultato:

Labelled plot in ggplot2

ggplot2 is an amazing tool for plotting in R. One of the latest feature I found out it is the possibility to label the plot automatically. Let’s check it out:

data <- data.frame(class=c('A','B','C'), 
                   value=c(50, 30, 20))

g <- ggplot2::ggplot(data = data,
                ggplot2::aes(x = class,
                             y = value)
                ) +
  ggplot2::geom_bar(stat = "identity", 
                    ) +
                      size=5) +

ggplot2::ggsave('labelled_plot.pdf', plot=g, device = 'pdf')

This would be the result:

Very easy and fast!

How to use R to create Latex documents

At work I have been asked to create a R code that could automatic generate a Latex document.

This was great! Latex was the first language that I learnt and it has a special place in my nerd heart.

It is nice to see that is possible to work with R for Latex and create a PDF document that can change in relation to the input given to the code.

All the code is present in my GitHub repository:

Continue reading “How to use R to create Latex documents”