Kaggle Challenge: Chest X-Ray Images (Pneumonia)

General Project Overview

In this Kaggle competition, a set of 5,863 chest X-ray images (anterior-posterior) was selected from retrospective cohorts of paediatric patients, aged one to five years, in Guangzhou. All chest X-ray imaging was performed as part of the patients’ routine clinical care.

All chest radiographs were screened by two expert physicians for quality control, and all low-quality or unreadable scans were removed.

The picture below shows the three types of chest X-ray present in the database:

On the left-hand side is an image of a healthy individual with clear lungs and no areas of abnormal opacification. The middle and right-hand images show patients affected by bacterial and viral pneumonia, respectively. The latter presents a more diffuse “interstitial” pattern in both lungs, while the former typically exhibits a focal lobar consolidation, in this case in the right upper lobe (white arrows).

For this particular challenge, we are asked only to distinguish healthy chest X-rays from pneumonia-affected ones.

Continue reading “Kaggle Challenge: Chest X-Ray Images (Pneumonia)”

How Probability Calibration Works

Probability calibration is the process of calibrating an ML model to return the true likelihood of an event. This is necessary when we need the probability of the event in question rather than its classification.

Imagine that you have two models to predict rainy days, Model A and Model B. Both models have an accuracy of 0.8: for every 10 rainy days, both mislabel two.
But if we look at the probability attached to each prediction, we can see that Model A reports a probability of 80%, while Model B reports 100%.

This means that Model B is 100% sure that it will rain, even when it will not, while Model A is only 80% sure. Model B is overconfident in its predictions, while Model A is more cautious.

And it’s this level of confidence in its predictions that makes Model A more reliable than Model B; Model A is better despite the two models having the same accuracy.

Model B offers a more yes-or-no prediction, while Model A tells us the true likelihood of the event. And in real life, when we look at the weather forecast, we get the prediction and its probability, leaving us to decide if, for example, a 30% risk of rain is acceptable or not.
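The rainy-day example can be put into numbers. The sketch below is only an illustration: it assumes ten days that both models forecast as rainy, of which eight really were, and it uses the Brier score (the mean squared gap between predicted probability and outcome) as one possible measure of calibration. Neither the helper names nor the choice of metric come from the post itself.

```python
def brier_score(probabilities, outcomes):
    """Mean squared gap between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probabilities, outcomes)) / len(outcomes)


def observed_frequency(outcomes):
    """Fraction of days on which it actually rained."""
    return sum(outcomes) / len(outcomes)


# Assumption: ten days forecast as rainy; it rained on eight of them (1 = rain).
outcomes = [1] * 8 + [0] * 2

model_a = [0.8] * 10  # Model A says "80% chance of rain" every time
model_b = [1.0] * 10  # Model B says "100% chance of rain" every time

freq = observed_frequency(outcomes)
score_a = brier_score(model_a, outcomes)
score_b = brier_score(model_b, outcomes)

print(f"Observed rain frequency: {freq:.2f}")
print(f"Brier score, Model A:    {score_a:.2f}")
print(f"Brier score, Model B:    {score_b:.2f}")
```

Model A's stated probability matches the observed frequency of rain, and it also gets the lower Brier score, even though both models have the same accuracy.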

Continue reading “How Probability Calibration Works”

Logging in Python

You know when you have coded your biggest project and, every time it runs, you can barely figure out what it is doing, relying only on a series of print statements and a few strategically saved files?

Well, if that is the case, you ought to learn logging and step up your game.

With a proper logging system, you will have a consistent, ordered and more reliable way to understand your own code, to time and track its progress, and to catch bugs easily.

Let’s break down the advantages of logging:

  1. Formatting: Logging allows you to standardize every message using a format of your choosing.
  2. Time tracking: Alongside the message you can add the time when it is generated. 
  3. Compact: All messages are gathered in files, so you don’t need to scroll up continuously.
  4. Versatility: Print does not work well everywhere (e.g., for objects without a __str__ method).
  5. Flexibility: Logging lets you assign different levels of importance to your messages, so you can regulate what is shown.

With all of this, you won’t be the only one who can understand your code.

Let’s start!
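As a first taste, here is a minimal sketch of the points above using Python's standard logging module. The logger name, the message format and the build_logger helper are illustrative choices rather than anything prescribed by the post, and the messages go to an in-memory buffer only to keep the example self-contained.

```python
import io
import logging


def build_logger(name, stream, level=logging.INFO):
    """Return a logger that writes timestamped, levelled messages to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.handlers.clear()   # avoid duplicate handlers if called twice
    logger.propagate = False  # keep messages out of the root logger

    handler = logging.StreamHandler(stream)
    # One format for every message: time, level, text.
    handler.setFormatter(
        logging.Formatter("%(asctime)s | %(levelname)s | %(message)s")
    )
    logger.addHandler(handler)
    return logger


buffer = io.StringIO()
log = build_logger("demo", buffer)

log.debug("Below the INFO threshold, so not recorded")
log.info("Loading data")
log.warning("Missing values found")

print(buffer.getvalue())
```

To gather the messages in a file instead, the StreamHandler can be swapped for logging.FileHandler("run.log"), where the file name is again just a placeholder.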

Continue reading “Logging in Python”

How to do a Sankey Plot in Python

If you have spent more than five seconds on r/dataisbeautiful/, you have probably encountered a Sankey plot. Everyone uses them to track their expenses, job searches and every kind of multi-step process. Indeed, they are well suited to visualizing the progression of events and their outcomes.
And in my opinion, they look great!

Therefore, let’s see how to make one in Python:
Jupyter Notebook here

In matplotlib

Personally, in matplotlib they look awful.

An example of a Sankey plot made in matplotlib, from the official website.

The above plot is probably closer to the original concept of a Sankey plot (originally invented in 1898), but it is not something I would use in a publication.

The other solution is to use the library Plotly.

In Plotly

Therefore, without further ado:
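Plotly's Sankey trace takes a list of node labels plus three parallel lists describing the links: source index, target index and value. The sketch below is one way to prepare those lists from plain (source, target, value) triples. The build_sankey_inputs helper and the job-search numbers are made up for illustration, and the figure is only built if Plotly is installed.

```python
def build_sankey_inputs(flows):
    """Turn (source, target, value) triples into node labels and index-based links."""
    labels, index = [], {}
    sources, targets, values = [], [], []
    for source, target, value in flows:
        for name in (source, target):
            if name not in index:  # register each node the first time it appears
                index[name] = len(labels)
                labels.append(name)
        sources.append(index[source])
        targets.append(index[target])
        values.append(value)
    return labels, sources, targets, values


# Made-up job-search numbers, purely for illustration.
flows = [
    ("Applications", "No reply", 30),
    ("Applications", "Interview", 10),
    ("Interview", "Rejected", 7),
    ("Interview", "Offer", 3),
]
labels, sources, targets, values = build_sankey_inputs(flows)

try:
    import plotly.graph_objects as go
except ImportError:  # Plotly is optional for this sketch
    go = None

if go is not None:
    fig = go.Figure(
        go.Sankey(
            node=dict(label=labels),
            link=dict(source=sources, target=targets, value=values),
        )
    )
    # fig.show()  # uncomment to open the interactive plot
```

Keeping the data preparation separate from the plotting call makes it easy to check the links before drawing anything.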

Continue reading “How to do a Sankey Plot in Python”

Create a weather forecast model with ML

How to create a simple weather forecast model using ML, and how to find publicly available weather data with ERA5!

As a data scientist at Intellegens, I work on a plethora of different projects for different industries, including materials, drug design, and chemicals. For one particular project, I was in desperate need of weather data. I needed things like temperature, humidity and rainfall, given the spacetime coordinates (date, time and GPS location). And this made me fall into a rabbit hole so deep that I decided to share it with you!

Weather Data

I thought that finding an API that could give this type of information was going to be easy. I didn’t foresee weather data to be one of the most jealously kept types of data.

If you search for “free weather API”, you will see plenty of similar websites offering different services that are not actually free, and even when there is a free package, it never includes historical weather records. You really need to search hard before finding the Climate Data Store (CDS) website.

Continue reading “Create a weather forecast model with ML”

Intro 2: Our Thoughts create our Unhappiness

It is not events that disturb people, it is their judgements concerning them.

In last month’s post we saw how hard it is to retain happiness and how this concept might even be misleading; how trivial things can spoil our life; and finally how our own thought process can help us get closer to our goals and to a life worth living.

In this post we will continue our conversation and look at one of the most famous Stoic quotes:

“It is not events that disturb people, it is their judgements concerning them.”

Epictetus

This short sentence is one of the cornerstones of Stoic philosophy. Let’s see how:

Continue reading “Intro 2: Our Thoughts create our Unhappiness”

Intro 1: How to Create a Life that Flows Smoothly

It is safe to say that you – at least once in your lifetime – have experienced something that you can refer to as happiness. 

How long did it last? Not long, I imagine. Perhaps, you uneventfully transitioned to something that resembled ‘normality’, or worse, other issues came up immediately and spoiled your mood.

Even though happiness didn’t last long, it does not stop you from seeking more of it.

But what kind of happiness are you seeking? The immediate kind, fuelled by sex, drugs and rock and roll? Happiness as a reward for your hard work and sacrifice? Maybe by obtaining an object of desire, such as a new house, job, car or spouse? Or are you just “kinda okay”, happy to sit and wait?

The choice seems to be between obtaining happiness now, later, or not seeking it at all. With the first, you will be happy immediately and miserable later; with the second, you will wait in agony until your goal materialises and maybe, in the long run, you will feel happier. With the third, you are not even trying.

These are the options. Are you ready?

If you haven’t chosen to give up already, maybe there is a better option. Let’s see what ancient Greece and Rome thought about this topic! 

Continue reading “Intro 1: How to Create a Life that Flows Smoothly”

Testing in Python

After having seen how to test in R, let’s see how to do the same in Python:

Writing a test-oriented program

Good practice demands that we try to write our tests before we code the program we intend to write.

Or at least, we should try to write the code in a way that makes it easier to test in the future,
fighting our natural tendency to first write the code we desperately want to write and only then the tests.
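As a small illustration of the test-first idea, the sketch below uses the standard unittest module. The mean function is a hypothetical example, not code from the post: the tests are written first and describe the behaviour we want, and the function is then written to make them pass.

```python
import unittest


# Step 1: write the tests first; they describe the behaviour we want.
class TestMean(unittest.TestCase):
    def test_mean_of_numbers(self):
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_empty_input_raises(self):
        with self.assertRaises(ValueError):
            mean([])


# Step 2: only then write the code that makes the tests pass.
def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    if not values:
        raise ValueError("mean() needs at least one value")
    return sum(values) / len(values)
```

Saved in a file, these tests can be run with python -m unittest followed by the file name.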

To do that, follow these guidelines:

Guidelines

Continue reading “Testing in Python”

The Psychopath Test, by Jon Ronson

A Review

What is a psychopath? Should I be scared of them? How can I know if somebody is one? Am I one?

These and many others are the questions that Jon Ronson tries to answer in this book: The Psychopath Test!

This book is far from the usual philosophical books featured here, but I think it is interesting to see another aspect of the human mind. It’s interesting to see how genetics and upbringing can produce certain individuals who are completely different from the great majority of human beings.

Ronson’s interest in psychopathy started almost at random, with a strange book contained inside an anonymous parcel.

He is not the only one who has encountered this mysterious book: psychiatrists and journalists around the world have already received it.

This episode kicks off a spiral of research, interviews and paranoia that lasts for almost two years, and that is summarised in this book.

Continue reading “The Psychopath Test, by Jon Ronson”

“Stillness is the Key” by R. Holiday

Stillness is the Key is the latest book by the American author and entrepreneur Ryan Holiday.

I already wrote about one of his previous books, The Daily Stoic (2016), in my post on the best books for Stoicism.

With this book Holiday completes an ideal trilogy that began with The Obstacle is the Way (2014) and Ego is the Enemy (2016).

I found this book in the gift bag of the latest Stoicon in Athens and, despite having a huge backlog of books, I started reading it immediately. I was not particularly familiar with Holiday’s essays, as I haven’t read the other two books of this trilogy.

Continue reading ““Stillness is the Key” by R. Holiday”