Advice on training a CNN

Hello all,

I am trying to train a CNN with Caffe to detect the letters “O” and “K”.
My first CNN is not very good and I would like your advice.

I have fonts as PNGs (white letters on a black background, 28x28 px) for the letters O and K (the real letters are closer to 120x120 px, and they are dark on a “green” background).
I converted them to PNG for the positive images.
Then I followed the CMSIS-NN tutorial from the OpenMV GitHub.
I did not change any parameters in solver.prototxt.
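For the conversion step, here is a minimal sketch (assuming Pillow is available; file names are just examples, not part of the tutorial) of putting a real letter crop into the 28x28 white-on-black format the training fonts use:

```python
# Sketch: normalize a letter image to the 28x28 white-on-black format
# of the training fonts. Assumes Pillow; paths are illustrative.
from PIL import Image, ImageOps

def to_training_png(src_path, dst_path, size=(28, 28)):
    img = Image.open(src_path).convert("L")   # drop colour (green background)
    img = img.resize(size, Image.LANCZOS)     # scale ~120x120 down to 28x28
    pixels = list(img.getdata())
    # If the letter is dark on a light background, invert so it becomes
    # white on black like the training set.
    if sum(pixels) / len(pixels) > 127:
        img = ImageOps.invert(img)
    img.save(dst_path)
```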

The first results are:

  • the detection is not precise at all
  • the detection is sensitive to lighting

Is this due only to the difference in letter size/ratio or color?
What would you suggest to simplify the search for a good CNN?


Hmm, have you tried the Chars74K network training example I linked to? It has all those letters.

To shrink the network down to two letters, just remove all the example folders except the ones you want before you generate the labels in step (l). Then it will learn only those labels.
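A sketch of that pruning step in Python (the folder names below are assumptions based on the Chars74K `Fnt` layout, so double-check them locally before deleting anything):

```python
# Sketch: delete every class folder except the ones to keep, before
# generating the labels. Folder names are assumptions, verify locally.
import os
import shutil

def keep_only(root, keep):
    """Remove every sub-folder of `root` whose name is not in `keep`."""
    for name in os.listdir(root):
        path = os.path.join(root, name)
        if os.path.isdir(path) and name not in keep:
            shutil.rmtree(path)

# In the Fnt layout Sample011..Sample036 map to 'A'..'Z', which would put
# 'K' at Sample021 and 'O' at Sample025 -- verify before deleting!
# keep_only("English/Fnt", {"Sample021", "Sample025"})
```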

Note that in my example script I binarize the image since it only learns on black and white letters. The CNNs are quite stupid and literally learn nothing beyond the training examples. Don’t expect them to pull out obvious features if there are any lower level hacks they can learn to score well on your dataset.
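As a concrete illustration of that binarization step, a pure-Python sketch (a flat list of grayscale values stands in for a decoded image here):

```python
# Sketch: threshold grayscale values (0-255) to pure black/white so the
# network only ever sees the two levels it was trained on.
def binarize(pixels, threshold=128):
    return [255 if p >= threshold else 0 for p in pixels]
```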

Well, thanks,

As you said, I deleted the directories, but the network file is still around 90K whatever the number of letters, and that is too big for the M7.
I tried training with different solver.prototxt parameters to reduce the size, without success so far.
Any help on this point would be appreciated.

I don't know how to clone your models directory, so I created the files by hand and copied the text into each one (doesn't a recursive clone include that part?).

Hi, the steps on how to download the Chars74K network are pretty detailed. When you download and unpack all the files, you end up with a ton of folders, one per character. You just need to delete all the characters except the ones you want.

Yes, you are right, the explanations are very good and clear.
A simple way to reduce the size of the network file is to reduce num_output in the convolution layers of the train_test.prototxt architecture file.
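For example, in one of the convolution layers of train_test.prototxt the change would look something like this (the layer name and the original value are illustrative, not taken from the actual file):

```
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 10   # e.g. halved from 20 to shrink the model
    kernel_size: 5
    stride: 1
  }
}
```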

As you said, the CNN learns only as much as the data contains.
My feeling is that with many cases to learn (e.g. the whole alphabet), the network performs worse on each case than if it learns a single case.

I also tested net.forward, which needs an image very similar to the original (size, binarization, ratio, color).
I have not yet succeeded in using the same network, because I always get a rectangle the size of the whole image…
If you have any idea, it would be appreciated.
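One way to reproduce the training conditions at inference time, sketched in plain Python (on the camera you would do the equivalent with the image module; the function and its defaults are illustrative): centre-crop to a square, downscale to 28x28, and binarize.

```python
# Sketch: make a capture look like a training sample -- same aspect
# ratio (square crop), same size (28x28) and binarized.
def preprocess(gray, size=28, threshold=128):
    """`gray` is a 2D list of grayscale rows."""
    h, w = len(gray), len(gray[0])
    s = min(h, w)
    y0, x0 = (h - s) // 2, (w - s) // 2          # centre square crop
    out = []
    for j in range(size):
        row = []
        for i in range(size):                     # nearest-neighbour scale
            p = gray[y0 + j * s // size][x0 + i * s // size]
            row.append(255 if p >= threshold else 0)
        out.append(row)
    return out
```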


Hi, did you actually reduce the number of characters you are feeding it? You should be scoring 90% and above during training and testing for the two character classes you wanted.

Yes, I scored 100% after train, test, and quantize.
net.forward works well with an image of the same size, same object/image ratio, and binarized characters.
I don't know how to get the network working at the moment…

Um, scoring 100% is usually bad. That means the network overfit.

Mmm, if possible I'd increase the number of training examples artificially until it stops scoring 100%, i.e. feed it rotated and scaled pictures.
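A sketch of that augmentation (assuming Pillow is available; the angle and scale ranges are just illustrative starting points, not tuned values):

```python
# Sketch: generate rotated and rescaled copies of a 28x28 training image
# so the network stops memorizing the handful of original fonts.
import random
from PIL import Image

def augment(img, n=10, max_angle=15.0, scale_range=(0.8, 1.2)):
    w, h = img.size
    out = []
    for _ in range(n):
        angle = random.uniform(-max_angle, max_angle)
        scale = random.uniform(*scale_range)
        copy = img.rotate(angle, fillcolor=0)    # keep the black background
        copy = copy.resize((max(1, int(w * scale)), max(1, int(h * scale))))
        out.append(copy.resize((w, h)))          # back to the original size
    return out
```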