Paper: A Visual Vocabulary for Flower Classification

The idea behind this paper is to try to classify objects based on more than one differentiating feature.  The ones they used were colour, shape and texture.  Each feature follows a different process, but all end up being vectors.  At the end of the process the vectors are combined (they normalize the combination) and nearest neighbour is used for classification.  In the paper they made a comparison of the classification behaviour of the individual features and of the combined features.

The way they evaluated the behaviour of their resulting algorithm was a bit strange.  They were creating a program that returned a list of images of the same type as the test image.  Their results are mainly based on the first 5 images of this list.  So they are saying that the algorithm was able to put the right flower within the first 5 elements of the resulting list with a certain accuracy.

Colour feature calculation

They place emphasis on the use of segmented images.  They argue that the colour of the background will negatively affect the feature extraction if the flower is not segmented out of its background.  They used HSV for the colour space.  They used k-means clustering to create a vector with the calculated means.  They searched for the optimal number of clusters and found that 500 was the best value.
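To make the colour-vocabulary idea concrete, here is a minimal sketch of how I read it (not the authors' exact pipeline): cluster the HSV values of the segmented pixels with k-means and describe each image by the histogram of cluster assignments.  The `training_set` list of (image, mask) pairs and the OpenCV/scikit-learn choices are my own assumptions; only the 500 clusters come from the paper.

```python
# Rough sketch of the colour vocabulary: cluster HSV values of the segmented
# (foreground) pixels with k-means, then describe each image by the normalized
# histogram of cluster assignments. `training_set` is an assumed list of
# (image, mask) pairs; 500 clusters is the value reported in the paper.
import cv2
import numpy as np
from sklearn.cluster import KMeans

def hsv_pixels(image_bgr, mask):
    """HSV values of the pixels inside the segmentation mask."""
    hsv = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2HSV)
    return hsv[mask > 0].astype(np.float32)

train_pixels = np.vstack([hsv_pixels(img, msk) for img, msk in training_set])
vocab = KMeans(n_clusters=500, n_init=4, random_state=0).fit(train_pixels)

def colour_feature(image_bgr, mask):
    """Normalized histogram of colour-cluster assignments for one image."""
    words = vocab.predict(hsv_pixels(image_bgr, mask))
    hist = np.bincount(words, minlength=vocab.n_clusters).astype(np.float32)
    return hist / (hist.sum() + 1e-8)
```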

Shape feature calculation

They used a rotationally invariant descriptor.  They use a combination of SIFT and HOG: they compute SIFT descriptors on a regular, HOG-like grid.  They search for the optimal cell size and step size.  They obtain the vector through vector quantization.  I didn't really understand how they calculated the shape feature (I have to read more into SIFT to see if I can figure it out).  No comment is made on segmentation (I'm not sure if they used segmented or non-segmented images).  The idea here is to capture relations between the sub-sections of the flower, like overlapping petals and petal pointedness.
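My rough reading of the "SIFT on a HOG-like grid" part, sketched below: instead of detected keypoints, descriptors are computed at the nodes of a regular grid.  The grid step and patch size are placeholders, the exact SIFT constructor depends on the OpenCV version, and quantization into a vocabulary would then work as in the colour case.

```python
# Sketch of dense SIFT: descriptors computed at the nodes of a regular grid
# rather than at detected keypoints. Step and patch size are placeholders.
import cv2

def dense_sift(gray, step=10, size=10):
    sift = cv2.SIFT_create()  # cv2.xfeatures2d.SIFT_create() on older builds
    keypoints = [cv2.KeyPoint(float(x), float(y), float(size))
                 for y in range(size, gray.shape[0] - size, step)
                 for x in range(size, gray.shape[1] - size, step)]
    _, descriptors = sift.compute(gray, keypoints)
    return descriptors  # one 128-dimensional descriptor per grid node
```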

Texture Vocabulary

The idea behind the texture calculation is that some flowers have a characteristic texture in their petals.  Here they used an MR8 filter bank.  The resulting calculation is rotationally invariant because they choose the maximum response over all orientations.  They create the resulting vector by clustering the filter responses into a vocabulary and building frequency histograms over it.
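I have not looked at an actual MR8 implementation yet; the sketch below only illustrates the "maximum response over orientations" trick with a small Gabor bank, which is a simplification of the real MR8 filters (edge and bar filters at several scales plus two rotationally symmetric ones).

```python
# Simplified illustration of the rotation-invariance trick in MR8: filter the
# image at several orientations and keep, per pixel, the maximum response.
# Gabor kernels stand in here for the real MR8 edge/bar filters.
import numpy as np
from scipy.ndimage import convolve
from skimage.filters import gabor_kernel

def max_response(gray, frequency=0.2, n_orientations=6):
    responses = []
    for i in range(n_orientations):
        kernel = np.real(gabor_kernel(frequency, theta=i * np.pi / n_orientations))
        responses.append(convolve(gray.astype(np.float32), kernel))
    return np.max(np.stack(responses), axis=0)  # rotation-invariant response map
```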

This paper contains lots of helpful directions on where to look for possible solutions for our problem.  One of the things I am most positive about is the use of the HSV colour space in the detection.  (I think the Salix Arctica females might have a very distinctive HSV behaviour.)

Posted in papers, PhD

Paper: Automated flower classification over a large number of classes

The problem tackled by this paper is how a combination of features can improve classification on datasets of similar classes.  They work with a flower data set that has large between-class similarity and small within-class similarity.

Before they use the features they segment the flower in the image, using the process described in [1].  A thing to notice here is that they were able to segment elongated flowers that do not follow the general flower model described in [1].  This is encouraging, as my understanding after reading [1] was that the segmentation method would probably not work for elongated flowers.

The paper uses 4 types of features to describe the images:

  1. Colour: It’s based on the HSV colour space and k-means clustering.  The thought here is that colour is a feature that can be used to differentiate between certain types of flowers.
  2. SIFT on the foreground region:  They refer to the foreground as the part of the image which is inside the flower boundary.  These features gather information about the texture and shape of the flowers.
  3. SIFT on the foreground boundary:  It’s basically the same process as 2 but done on the boundary of the flower.
  4. Histogram of Oriented Gradients (HOG):  This is used to capture the global spatial arrangement of the plant parts.  They calculate the HOG features within the smallest bounding box of the segmented flower.

After calculating the 4 features they are combined in an SVM classifier.  The kernel is a sum of the individual feature kernels.
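A hedged sketch of the kernel combination with scikit-learn's precomputed-kernel SVM.  The paper actually learns a weight for each feature's kernel; here they are simply summed, and the chi-squared kernel and the variable names (`train_feats`, `test_feats`, `train_labels`) are my own placeholders.

```python
# Sketch: combine feature channels by summing their kernels and feed the
# precomputed kernel to an SVM. The paper learns per-channel weights; here
# the kernels are simply summed. Features are assumed to be histograms.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import chi2_kernel

def combined_kernel(feats_a, feats_b):
    """feats_a/feats_b: one (n_samples, dim) array per channel (colour, SIFT, ...)."""
    return sum(chi2_kernel(fa, fb) for fa, fb in zip(feats_a, feats_b))

K_train = combined_kernel(train_feats, train_feats)          # placeholder data
clf = SVC(kernel="precomputed").fit(K_train, train_labels)
predictions = clf.predict(combined_kernel(test_feats, train_feats))
```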

Among other things, the results show how the combination of features increases the classification accuracy.  The paper compares classification with each individual feature against the combination of all four.

For our purposes I think this methodology is the way to go.  A combination of the most relevant features is something that should result in a nice classification.

[1] Delving into the Whorl of Flower Segmentation

Posted in papers, PhD

Paper: Delving into the Whorl of Flower Segmentation

Salix Arctica Females

A lot of the assumptions in this paper make me feel that this is not the direction where we should be headed.

  1. They assume that all the pictures they analyse contain a flower.  For them, someone has already classified the pictures as containing a flower.  This puts me off using this technique a bit, as I wouldn’t know how it behaves in situations where that classification has not occurred.
  2. When describing their flower shape model they are clear to specify that it is a good approximation given “reasonable viewpoints” and “provided the deformation of the flower is not excessive”.  Not sure this applies to our data; the flowers change position drastically.
  3. The paper describes the types of flowers to which the method is applicable.  It is very clear that the flowers need to have rotational symmetry.  It does mention that there were instances where the segmentation worked with images of flowers without rotational symmetry, but they characterized this as surprising.
  4. They took their data from the Oxford Flower Dataset but omitted flowers which were very small or too sub-sampled.  This is one of the primary reasons why I think that this methodology does not fit our problem.
  5. I believe their flower shape model does not fit our flowers (not the Salix Arctica anyway).  It is centred on the shape of a petal with respect to a flower centre, where all petal vertices should intersect.  This is fine for flowers which are rotationally symmetric, but it might not be a good model for elongated flowers like the Salix Arctica.  When imaged from above, the Salix Arctica might be segmented correctly; but when imaged from the side, it presents an elongated shape that is not rotationally symmetric and might “confuse” the shape model.
I’ve read a couple of papers on flower detection and am seeing that the general assumptions of flower shape might make these approaches unusable for us.
Posted in papers, PhD

Buds as initial flower locators.

PCA of a flower's life

I had a series of pictures of a flower species that I bought in a shop. I obtained the series by programming an Android phone to take a picture every hour.  After the plant had died, I annotated all the pictures and separated the states into bud, growing, flower and dying.  This annotation was done very quickly and I did not pay much attention to detail.

After I finished the annotation, I ran Principal Component Analysis (PCA) on the HOG features from each annotation and plotted the result. As you can see, the features are all over the place, except the red ones (the buds), which are grouped towards one side of the figure.  I also observed this in the PCA that I did for the Zackenberg pictures (previous posts). I believe this happens because of the shape of the buds: they are usually round and fit nicely inside the annotation window.
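For reference, this kind of figure only takes a few lines; a minimal sketch assuming the HOG vectors are already stacked in `hog_features` and the annotation states in `states` (the colour mapping beyond bud = red is arbitrary):

```python
# Minimal sketch of the PCA figure: project the HOG vectors to 2-D and colour
# each point by its annotation state. `hog_features` (n_samples, n_dims) and
# `states` (list of strings) are assumed to be available already.
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

colours = {"bud": "red", "growing": "orange", "flower": "green", "dying": "black"}
projected = PCA(n_components=2).fit_transform(hog_features)

for state, colour in colours.items():
    idx = [i for i, s in enumerate(states) if s == state]
    plt.scatter(projected[idx, 0], projected[idx, 1], c=colour, label=state, s=8)
plt.legend()
plt.show()
```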

This might be something that we can use to detect the position of the flower.  Assuming that every flower is a bud at the beginning of its life, we can scan images for buds and pinpoint their position.  Then we can keep analysing that position once the bud has changed into a female or male flower.  This could potentially give a very accurate count of the number of flowers in the plot (without differentiating between sexes).  Remember that one of the assumptions of this project is that we are going to have a static camera taking pictures of the plot.  This means that a coordinate in the picture will always correspond to the same place in the plot.  There is still hope :)

Posted in Uncategorized

Annotations: Things to improve

Took a closer look at things that need to be improved in the annotation process.  Spent the day extracting specific examples of what went wrong and how to solve/avoid it.  Here are my findings:

Improve the annotations themselves

Improperly annotated: mislabeled as Salix Female.

I perused the annotation database again and found that some annotations were improperly labelled.  The one in the figure on the left was annotated as Salix Arctica Female, but there is not enough information to support this.  It could also be a male flower or a bud that is in a shady spot.

I have some ideas as to why this mis-classification occurs.  The main one is that the annotation process is tedious and boring; one gets tired quickly, which can lead to mistakes.  Another is that the reviewing biologists have more information than what the annotation can show; they can, for example, say that this is a female based on the leaves that sit beside it, which are not contained in the final annotation crop.

An immediate solution for this is to go through all the annotations and mark the ones that don’t contain enough features as unknown.  The best way of doing this is to look at the cropped images instead of the annotations in the plot pictures.  For future annotations, to increase reliability, we could have a third party double-checking the cropped images.

Avoid resizing inequalities

Resized vs Original cropped annotation.

The nature of our data (pictures of small elongated flowers) generates different shapes of the same thing: if the flower is vertical and the picture is taken from atop, the elongated flower will appear as a circle in the picture; on the other hand, if the flower is horizontal and taken from the same angle, it will appear as an elongated rectangle.

The elongated rectangle becomes a problem when we are calculating the HOG features because we have to normalize it into a common shape (which I chose to be square).  When I resize the original cropped annotation it modifies the pixel content (see figure): the resize algorithm adds pixels based on surrounding information, or it can collapse surrounding pixels into one.  This is actually fine if the aspect ratio of the image is maintained, but when we stretch one of the axes and leave the other unmodified (like in the figure), the calculated HOG features of the resized image will contain gradient relations that are surely not in the original image.

For the pedestrian detector [1], this was not a real issue, as all the pedestrian pictures were of people standing upright.  In other words, the elongated rectangle that they used characterized their data nicely.  They did analyse images at different scale levels, but they didn’t change the ratio nor the rotation of the training window (which was 64×128 pixels).

Note that the solution is not to change the training window shape from a square to a rectangle.  This would cause the same problem when going from the square “atop” annotations to the elongated rectangular ones.  One possible solution is to annotate the elongated images with squares only.  This would mean either covering the elongated flower with several squares or annotating just the top part of the flower.  If we were to implement the first solution we could have different types for different parts of the plant; we could have, for example, Salix_Female_midsection and Salix_Female_top labels.  The second solution is also plausible, assuming that the top of the elongated flowers is similar to the images from atop.
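A third, untested alternative would be to pad the crop to a square (replicating the border pixels) before resizing, so the aspect ratio and the gradient relations are preserved.  A minimal sketch with OpenCV; the padding mode is a placeholder and 80×80 is just the window size I have been using elsewhere:

```python
# Untested alternative: pad the crop to a square by replicating border pixels,
# then resize, so the aspect ratio (and the gradient relations) is preserved.
import cv2

def to_square(crop, out_size=80):
    h, w = crop.shape[:2]
    side = max(h, w)
    top = (side - h) // 2
    left = (side - w) // 2
    padded = cv2.copyMakeBorder(crop, top, side - h - top, left, side - w - left,
                                cv2.BORDER_REPLICATE)
    return cv2.resize(padded, (out_size, out_size), interpolation=cv2.INTER_AREA)
```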

Male / Female similarities

Similarity Between Male and Female Salix Arctica.

There is one type of Salix Arctica Female that looks similar to the Salix Arctica Male.  Even for the human expert it is sometimes hard to distinguish between them.  In the image we see a Male Salix Arctica on the left and a Female on the right. The difference between them is subtle at this resolution: the Female has more darkness in the middle while the Male has longer hair.  It is difficult to distinguish between them because they have practically the same silhouette.  This may or may not be a problem; it all depends on the technique used to classify them (SVM or neural networks). For now I would just like to document the possible pitfall and wait.

If this becomes an issue we could solve it by creating a type of flower that encompasses Salix Arctica Males and this type of Salix Arctica Females (not all Females look like this).  That would make the detection a bit strange, but it’s a way out.  Not sure what else we could do here.

[1] http://www.navneetdalal.com/files/CVPR2005_HOG.pdf?attredirects=0

Posted in annotation, PhD

Expected non-exciting initial results…

PCA of data set. (80x80 window size, no margin)

I ran Principal Component Analysis (PCA) on the HOG features that I calculated (using Navneet Dalal’s code) and the result was not spectacular.  I plotted the result and colour-coded the types of flowers (bud: red, female: green, unknownSex: blue, hair: cyan, male: magenta).  The figure suggests that the buds are easily differentiable from the rest, since they are nicely grouped to one side.  It also highlights the lack of hair types (we don’t see much cyan).  As could be expected, the unknown type is all over the place.  And finally we have the male and female types spread out in the middle of the figure.  I was expecting a bit more separation between male and female.

Even knowing that the neural network and/or the Support Vector Machine that I’ll use to classify the images could behave well despite the PCA results, I have some observations that might be relevant as we move forward.  These observations could produce some changes in the way we do the annotations.

Salix Arctica Females

  1. Not all the females are alike.  There are three different types of females, each with a specific shape.  We might get better results if we separate these into sub-types.  Note that if we do this we might not have enough data (ATM) to train and test.
  2. Looking at the cropped images I notice that we get pictures that look strange.  There are some instances that are too dark or too mangled to distinguish (at a glance).  I still don’t know the reason for this. I suspect it’s because they have lost all context (surrounding pixels).  They could also be wrongly annotated.  The flowers are very small and the camera sometimes does not capture all the features needed.
  3. We need more data.  The hair types only have 50 examples and the buds only have 217.  This is a small number compared to the 800 (approx) that the male and female have.
  4. There is a similarity between the shape of one of the types of female and the males.
    Salix Arctica Males

    This similarity is even more pronounced knowing that the HOG features are calculated using only one colour channel. If this becomes a problem we might want to use the colour dimension in the form of a hue histogram (or something of the sort; see the sketch after this list).  We could also merge the flowers that look alike into a separate type; this would still allow us to separate one type of female from the merged type.

  5. There are some annotations that contain more background than flower.  These annotations are of flowers that are in some sort of L-shape or in a diagonal position. To reduce the amount of unwanted background we could have a policy of annotating only the top of the flowers.  We could also separate the annotation into smaller squares that better capture the whole of the flower.
  6. Some flowers are elongated; in some pictures they are annotated with elongated rectangles and in others with squares (seen from above). For the HOG feature I “normalize” everything to a square.  This means that some images get distorted and might lose their differentiating features.  To avoid this we might want to do the same as in 5.
These are all things that I believe are happening or might happen.  This post serves as documentation for further reference.
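The hue histogram mentioned in point 4 could look something like this (a sketch only; the bin count is a guess, and a segmentation mask could be passed instead of None to restrict it to flower pixels):

```python
# Sketch of the hue histogram mentioned in point 4. Bin count is a guess; a
# mask could be passed instead of None to restrict it to the flower pixels.
import cv2

def hue_histogram(crop_bgr, bins=32):
    hsv = cv2.cvtColor(crop_bgr, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0], None, [bins], [0, 180])  # OpenCV hue is 0-179
    return (hist / (hist.sum() + 1e-8)).flatten()
```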



Posted in annotation, PhD

git: another way of doing submodules.

Git keeps giving us more.  Today I found another way of getting submodule-like behaviour.  It’s kind of strange and probably a bit harder than using submodules (but that might be because I’m not used to it yet).  Still, it adds to the awesome diversity that is Git.  Here is the link: http://progit.org/book/ch6-7.html

Posted in commands, git

Navneet Dalal’s HOG Descriptors

After fighting a bit with the code, I got it to compile.  The following is a list of the things I did:

  1. There were a lot of additional files in the src tree.  These files are automatically created by Autotools.  I removed *.in files, the ltmain.sh file, configure.h.in, aclocal.m4…  In general I removed everything that was auto-generated.
  2. I ran `autoupdate configure.in`
  3. Modified the configure.in by hand a little.  There were some parentheses that my Fedora 14 did not like.
  4. The code had some strangeness in the use of strlen and time.
  5. The boost validation_error call changed slightly.
You can see the full list of changes at [1].  You can further access the code from [2].
[2] http://github.com/Joelgranados/Learcode

Updates
  1. May 17:  I changed the repo on GitHub a bit.  Now you can find the original code under the master branch.  You will find all my changes in the branch called joel: https://github.com/Joelgranados/Learcode/tree/joel
  2. Jun 10: I updated to Fedora 15 and the HOG code broke (no surprise).  I blame Boost!  In any case there is a new version of the code at https://github.com/Joelgranados/Learcode/tree/joel

Posted in git, PhD

About the HOG transform in OpenCV 2.2

This post could also be titled: “Why I’ve decided not to use the HOGDescriptor from OpenCV”.  There are three main reasons why I chose not to use the HOG functionality provided in OpenCV:

  1. What is in OpenCV is not only a descriptor.  Of course the descriptors are implemented, but they are bundled together with classifiers and detectors; also bundled inside the HOG class hierarchy is a pedestrian detector. Now, don’t get me wrong, I’m not against all that being in OpenCV.  They could have separated it so that one can use a HOGDescriptor class without having to deal with design decisions driven by the other, non-descriptor elements of the class.
  2. My second reason is related to the work that brought on the HOG descriptors to begin with.  This is described in a paper by Navneet Dalal and Bill Triggs [1]. Their work is centred on the Histogram of Oriented Gradients feature descriptor: how to calculate it and how different tweaks to the descriptor’s parameters affect a binary SVM classifier.  They included the classifier merely as a way of measuring the descriptor’s worthiness.  What I’m trying to say is that HOG can live separately from the classification and detection methods.  Note that even though they relate the HOG concept to pedestrian detection, it does not mean that it only works for that.  In Navneet Dalal’s PhD thesis he also works with motorcycles and cars.
  3. My third reason (and this was the one that tipped the glass): you need CUDA [2] to run the HOG stuff.  It’s all implemented in .cu files.  I don’t have NVIDIA hardware and am not planning to get any.  I don’t need real-time object detection; in my situation I’m even willing to wait a couple of days.

Conclusion: I will have to use Navneet Dalal’s original code or write one myself (which is not THAT hard [3]).

[1] Histograms of Oriented Gradients for Human Detection
[2] http://www.nvidia.com/object/what_is_cuda_new.html
[3] http://smsoftdev-solutions.blogspot.com/2009/10/object-detection-using-opencv-ii.html
Posted in opencv, PhD

Paper: Histograms of Oriented Gradients for Human Detection.

This post lists the points of the approach that interest me.  It is NOT an exhaustive summary of the paper.

Approach:

  1. First you calculate the gradients.  They tested various ways of doing this and concluded that a simple [-1,0,1] filter was best.  After this calculation you will have a direction and a magnitude for each pixel.
  2. Divide the range of gradient directions into bins (notice that you can divide 180 or 360 degrees).  This is just a way to gather gradient directions into groups; a bin can be, for example, all the angles from 0 to 30 degrees.
  3. Divide the image into cells.  Each pixel in a cell votes into a histogram of orientations based on the angle bins from step 2.  Two really cool things to note here:
    1. You can avoid aliasing by interpolating votes between neighboring bins
    2. The magnitude of the gradient controls the weight of the pixel’s vote in the histogram
  4. Note that each cell is a histogram that contains the “amount” of all gradient directions in that cell.
  5. Create a way to group adjacent cell histograms and call it a block.  Each block (group of cells) is then “normalized”.  The paper suggests something like v / sqrt(||v||^2 + e^2), where v is the vector formed by the adjacent cell histograms of the block, ||v|| is its L2 norm and e is a small constant.
  6. Now move through the image in block steps.  Each block you create is to be “normalized”.  The way you move through the image allows cells to be in more than one block (though this is not strictly necessary).
  7. For each block in the image you will get a “normalized” vector.  All these vectors placed one after another form the HOG descriptor (see the sketch after this list).
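The steps above map almost one-to-one onto the HOG implementation in scikit-image; a minimal sketch (the parameter values are the paper's defaults as I remember them, and the file name is a placeholder):

```python
# The steps above correspond closely to scikit-image's HOG: gradients,
# per-cell orientation histograms, block grouping + normalization, and
# concatenation into one long feature vector.
from skimage import io, color
from skimage.feature import hog

image = color.rgb2gray(io.imread("crop.png"))   # placeholder file name
descriptor = hog(image,
                 orientations=9,                # step 2: angle bins over 0-180
                 pixels_per_cell=(8, 8),        # steps 3-4: per-cell histograms
                 cells_per_block=(2, 2),        # steps 5-6: overlapping blocks
                 block_norm="L2-Hys")           # step 5: block normalization
print(descriptor.shape)                         # step 7: one concatenated vector
```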
Comments:
  1. Awesome idea: they used 1239 pedestrian images.  The SVM was trained with the 1239 originals and their left-right reflections.  This is so cool on so many levels.  Of course the pedestrian is still a pedestrian in the reflected image, and this little trick gives double the information to the SVM with no additional storage overhead.
  2. They created negative training images from a database of images which did not contain any pedestrians.  They basically sampled random windows from those non-pedestrian images to create the negative training set.  They then ran the resulting classifier on the non-pedestrian images looking for false positives, and added these false positives to the training set (a sketch of this bootstrapping loop is below, after this list).
  3. A word on scale:  to make detection happen they had to move a detection window through the image and run the classifier on each ROI.  They did this for various scales of the image.  We might not have to be so strict with this, as all the flowers are going to be within a small range from the camera, whereas pedestrians can be very close or very far from it; the pedestrian range is much larger.
  4. A margin of 4 pixels was left around the training images.
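A rough sketch of the bootstrapping from comment 2 (window counts, the sliding-window stand-in and the linear SVM are placeholders; `extract_hog` is an assumed function returning a HOG vector for a 64×128 window):

```python
# Sketch of the negative bootstrapping from comment 2: train on random
# negative windows, scan the pedestrian-free images for false positives,
# and retrain with them. `extract_hog` is an assumed HOG-vector function.
import numpy as np
from sklearn.svm import LinearSVC

def random_windows(images, n_per_image, extract_hog, win=(128, 64)):
    feats = []
    for img in images:
        for _ in range(n_per_image):
            y = np.random.randint(0, img.shape[0] - win[0])
            x = np.random.randint(0, img.shape[1] - win[1])
            feats.append(extract_hog(img[y:y + win[0], x:x + win[1]]))
    return np.array(feats)

def bootstrap(pos_feats, negative_images, extract_hog):
    neg_feats = random_windows(negative_images, 10, extract_hog)
    X = np.vstack([pos_feats, neg_feats])
    y = np.hstack([np.ones(len(pos_feats)), np.zeros(len(neg_feats))])
    clf = LinearSVC(C=0.01).fit(X, y)

    # Rescan the pedestrian-free images; every detection is a false positive.
    # (Here a denser random sampling stands in for a real sliding window.)
    candidates = random_windows(negative_images, 50, extract_hog)
    hard = candidates[clf.decision_function(candidates) > 0]
    if len(hard):
        X, y = np.vstack([X, hard]), np.hstack([y, np.zeros(len(hard))])
        clf = LinearSVC(C=0.01).fit(X, y)
    return clf
```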
Posted in Uncategorized