
Neural net training fail

So here I am, having collected test and control samples to train a neural network and measure its predictive power.

But something’s fishy: it works well right from the get-go.

My current pet project involves prototyping a learn-to-rank engine to provide more relevant search results on HouseTrip. I use behaviour data from our users: simply put, positive events are users who enquire on homes, negative events are users who stop at the listing page. The point is to predict, for a given set of search results, which ones the current user is most likely to continue with; in other words, to serve relevant search results to users.
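For context, each session ends up as one sample in FANN's plain-text training format: a header line with the number of samples, inputs and outputs, then alternating lines of input values and output values (one output neuron per class). A toy example with 3 features instead of my real 19, and made-up values:

```
2 3 2
0.2 0.7 0.1
0 1
0.9 0.3 0.5
1 0
```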

The first test net I try has my 19 normalised inputs, 9 hidden nodes in 1 layer, and 2 outputs (“positive” and “negative”).
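For the record, wiring that up against FANN's C API only takes a few lines. This is a sketch rather than my actual script; the file names and training parameters below are placeholders:

```c
#include <fann.h>

int main(void)
{
    /* 3 layers: 19 inputs, 9 hidden neurons, 2 outputs ("positive" / "negative") */
    struct fann *ann = fann_create_standard(3, 19, 9, 2);

    fann_set_activation_function_hidden(ann, FANN_SIGMOID_SYMMETRIC);
    fann_set_activation_function_output(ann, FANN_SIGMOID_SYMMETRIC);

    /* train.data uses the plain-text format sketched above */
    fann_train_on_file(ann, "train.data",
                       1000,   /* max epochs */
                       10,     /* epochs between reports */
                       0.001); /* desired error */

    fann_save(ann, "rank.net");
    fann_destroy(ann);
    return 0;
}
```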

Upon running my very first series of training runs with FANN, I get this result:

  • 90% correct predictions;
  • 9% false negatives;
  • 1% false positives.

How in hell can it work so well?

I then continue with different training set sizes and different network layouts, with similar results.

And then, eventually, I realise that my data had way more negative samples than positive ones, because users view far more listing pages than they go on to enquire about. So the damn thing was just predicting “negative” all the time… and with roughly nine negatives for every positive, that answer was “correct” about 90% of the time given the data I fed it.

Moral of the story: when training something based on a mean error (RMSE in FANN, by default), your input needs to be as balanced and unbiased as possible.
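In my case the quick fix is to rebalance before training, for instance by keeping every positive sample and subsampling the negatives down to roughly the same count. Here is a hedged sketch using FANN's train-data struct (2.2+); the function name and the label convention are mine, and it assumes output[0] is the “positive” neuron:

```c
#include <fann.h>
#include <stdlib.h>
#include <string.h>

/* Build a roughly class-balanced copy of `data`: keep every positive
 * sample, and keep each negative with probability n_pos / n_neg.
 * Assumes output[0] > 0.5 marks a positive sample. Sketch only:
 * no error handling, the caller frees both data sets. */
struct fann_train_data *balance_train_data(struct fann_train_data *data)
{
    unsigned int i, n_pos = 0, n_neg = 0, kept = 0;

    for (i = 0; i < data->num_data; i++) {
        if (data->output[i][0] > 0.5) n_pos++; else n_neg++;
    }

    double keep_p = (double)n_pos / (double)n_neg;
    struct fann_train_data *out =
        fann_create_train(data->num_data, data->num_input, data->num_output);

    for (i = 0; i < data->num_data; i++) {
        int positive = data->output[i][0] > 0.5;
        if (!positive && ((double)rand() / RAND_MAX) >= keep_p)
            continue;
        memcpy(out->input[kept], data->input[i],
               sizeof(fann_type) * data->num_input);
        memcpy(out->output[kept], data->output[i],
               sizeof(fann_type) * data->num_output);
        kept++;
    }
    out->num_data = kept; /* only the kept samples are used for training */
    return out;
}
```

Duplicating or re-weighting the positives instead would also work, and has the advantage of not throwing data away.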

I must be rusty at this.

Duh.