One of the issues I identified in previous posts was the poor quality of the original annotations: some were simply incorrect, and some pointed to images without enough differentiating information.
I spent half of this week reviewing the annotations and removing the ones that were incorrect or of very low quality. Regarding the low-resolution annotations, I noticed that the biologists could make a pretty good guess about the type of flower based on its position, the surrounding vegetation, and previous images of the same place. This produced cropped images of such poor quality that they could not be differentiated when presented without context (previous images and surrounding pixels). With this in mind, I think it is best to assign a label to an annotation if and only if one is sure it can be recognized without that context.
Something that might help increase annotation quality is a re-annotation process based on the cropped images. It would take place after the initial annotation pass: the biologists would receive a list of cropped images and re-annotate them, this time without any context. We would keep only the cropped images that receive the same label in both the initial annotation and the re-annotation.
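The agreement filter could be sketched as follows (a minimal illustration; the dict layout and the crop ids are hypothetical, not from our actual annotation software):

```python
def filter_by_agreement(initial_labels, reannotated_labels):
    """Keep only the crop ids whose labels match across both passes."""
    return [
        crop_id
        for crop_id, label in initial_labels.items()
        if reannotated_labels.get(crop_id) == label
    ]

# Toy example: three crops labeled in two passes.
initial = {"crop_01": "bud", "crop_02": "male", "crop_03": "female"}
second_pass = {"crop_01": "bud", "crop_02": "female", "crop_03": "female"}

kept = filter_by_agreement(initial, second_pass)
# crop_02 is dropped: the two passes disagree, so the crop is
# presumably not recognizable without context.
```

Crops the biologists cannot label consistently without context are exactly the ones we decided should not carry a label at all.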
As a side note, I want to state how painful it is to create and review the annotations. It is a highly monotonous and stressful process. The 2009 flower dataset contains close to 300 pictures, each with a resolution of 4000×3500 pixels, and we are searching for flowers contained in windows measured in tens of pixels. This strains the eyes and the hand handling the mouse. Note that the initial annotation took months to create and validate, and its review took half a week. I am constantly trying to modify the annotation software to make this process easier, but the right changes are sometimes not clear from an HCI perspective.
With that out of my system, we can now move to better news :). After I finished reviewing the annotations, I ran a PCA on the calculated HOGs. This time around I implemented a mirroring feature that lets me mirror the cropped images. This is something Navneet Dalal did in his HOG paper, though he only mirrored the images along the y-axis (which is logical, since you don't see a lot of people walking on their heads). For my purposes I mirrored the data along both axes, so one cropped image produces three mirrored images (quadrupling the number of data points).
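The mirroring step amounts to flipping each crop left-right, up-down, and both. A minimal NumPy sketch (the crop here is a toy array standing in for a real image):

```python
import numpy as np

def mirror_augment(crop):
    """Return the original crop plus its three mirrored versions:
    y-axis flip, x-axis flip, and both (a 180-degree rotation)."""
    return [
        crop,
        np.fliplr(crop),             # mirror along the y-axis (left-right)
        np.flipud(crop),             # mirror along the x-axis (up-down)
        np.flipud(np.fliplr(crop)),  # mirror along both axes
    ]

crop = np.arange(12).reshape(3, 4)  # toy 3x4 "image"
augmented = mirror_augment(crop)
assert len(augmented) == 4  # one crop -> four data points
```

The HOG descriptors are then computed on each of the four versions, which is what quadruples the dataset.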
The figure shows the first two principal components plotted in a 2D plane. The buds still form a very separable cluster within the data (red in the figure). The cyan and green points still overlap significantly but show a slight tendency toward opposite directions: the green (female) tends toward the upper right and the cyan (male) toward the lower left. The blue points (Salix arctica hair) are scattered throughout.
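For reference, the 2D projection can be computed with a plain SVD-based PCA. This is only a sketch with random placeholder data; in the real run, X would be the matrix of HOG descriptors stacked row-wise, with whatever dimensionality the HOG configuration produces:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 36))  # placeholder for (n_crops, n_hog_features)

X_centered = X - X.mean(axis=0)             # center each HOG feature
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
projected = X_centered @ Vt[:2].T           # scores on the first two PCs

# projected[:, 0] and projected[:, 1] are the x and y coordinates
# of the scatter plot in the figure.
```

Plotting the two columns of `projected`, colored by class label, gives exactly the kind of scatter described above.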
After these results I am more convinced that we need a better way of collecting training data. One option is to avoid the fisheye lens: many of the very low resolution annotations were on the perimeter of pictures taken with a fisheye lens, where they were highly distorted and poorly resolved. Another option is to move the camera a bit closer so we get a better view of the male flowers, to avoid confusing them with female flowers. Recall that there is a type of male flower that is very similar to the female flowers, and this is exacerbated by the poor resolution of the cropped images. Getting the camera closer might just give us the extra information needed to differentiate them, both in the annotation process and in the automated feature extraction.
Finally, I believe that the markers that we deployed this summer are going to greatly improve the annotation process and therefore give us a better training set.
Things can only get better!!! ;)