T 1286/09 (Image classifier/INTELLECTUAL VENTURES) 11-06-2015
Method using image recomposition to improve scene classification
I. The transfer of the European patent application no. 03078317.9 from Eastman Kodak Company (legal predecessor of the current applicant) to Intellectual Ventures Fund 83 LLC (current applicant) was requested with letter dated 21 March 2013.
The registration of the transfer took effect on 21 March 2013, as confirmed by the communication of the EPO dated 2 April 2013.
In the following no distinction is made between the current applicant/appellant and its predecessor.
II. The applicant (appellant) appealed against the decision of the Examining Division to refuse European patent application no. 03078317.9.
III. In the contested decision, the Examining Division came to the conclusion that the subject-matter of claim 1 of the request filed by fax on 26 November 2008 did not involve an inventive step with regard to the following document (Article 56 EPC):
D3: EP-A2-0 800 147.
The Examining Division also identified the following prior art which had been cited during examination:
D1: WO-A-99/57622
D2: WO-A-02/059828.
IV. With the statement of grounds of appeal the appellant filed a new set of claims 1 to 7 and implicitly requested that the decision under appeal be set aside and that a patent be granted on the basis of these claims. Additionally, the appellant requested oral proceedings before the Board, should the appeal not be allowed.
V. With letter dated 17 December 2014, the appellant was summoned to oral proceedings to be held on 21 May 2015.
VI. In a communication pursuant to Article 15(1) RPBA dated 2 March 2015, the Board raised some objections under Article 84 EPC against claim 1 of the appellant's request. On the other hand, the Board concurred with the appellant that document D3 did not constitute a suitable starting point for assessing the inventive step of the present invention, as understood from the description.
VII. In reply to the Board's communication, the appellant filed with letter dated 21 April 2015 a new main request and an auxiliary request. Furthermore, the appellant asked the Board to consider, in advance of the oral proceedings, if any of the requests was potentially allowable, and to communicate its findings to the appellant in case it proved possible to cancel the oral proceedings.
VIII. In a communication sent by fax on 8 May 2015, the Board expressed the view that neither the main request nor the auxiliary request satisfied the requirements of Article 84 EPC. However, if the appellant filed a new claim overcoming all clarity objections, the Board might decide to remit the case to the department of first instance with the order to grant a patent on the basis of such claim and a description to be adapted.
IX. With letter dated 14 May 2015, the appellant replaced the main request and withdrew the previously filed auxiliary request. The request for oral proceedings was maintained in the event that the Board was unable to remit the case to the department of first instance based on the replacement main request.
X. In reply to a fax by the Board dated 18 May 2015, the appellant filed, with letter dated 18 May 2015, a new replacement main request. The conditional request for oral proceedings was maintained.
XI. The oral proceedings were then cancelled.
XII. The only claim of the appellant's sole request ("main request") reads as follows:
"A computer implemented method for improving image classification of a digital image comprising the steps of:
(a) providing an exemplar color image;
(b) systematically altering the exemplar color image to generate an expanded set of images, wherein systematically altering the exemplar color image comprises:
spatially altering the exemplar color image to generate an expanded set of spatially altered images, wherein spatially altering the exemplar color image comprises:
horizontally mirroring the exemplar color image, thereby doubling the number of images in the expanded set of images, or
systematically cropping the edges of the exemplar color image from one or more sides of the exemplar color image, thereby increasing the number of images in the expanded set of images; and/or
temporally altering the exemplar color image to generate an expanded set of temporally altered images, whereby the images in the expanded set simulate the appearance of capturing an image earlier or later in time, wherein temporally altering the exemplar color image comprises:
systematically shifting the color distribution of the exemplar color image, thereby increasing the number of images in the expanded set of images, or
systematically shifting the illuminant quality of the exemplar color image, thereby increasing the number of images in the expanded set of images; and
(c) using a semantic classifier and the expanded set of images to determine an image classification for the exemplar colour image;
wherein the expanded set of images are used to train the classifier in step (c), thereby providing an improved classifier."
XIII. In the statement of grounds the appellant argued that document D3 did not call into question the inventive step of the method of the invention. In particular, the appellant argued that the "image" in document D3 was specifically disclosed as a set of text-based bi-level (black and white) symbols. Furthermore, rather than systematically altering the primary image to generate an expanded set of images, the method of D3 separately preprocessed each of the input images to produce, not an expanded set of images, but a single binary gray level image. In other words, document D3 disclosed the processing of many images into one, whereas the present invention processed one image into many.
1. The appeal is admissible.
2. The present application relates generally to the field of digital image processing and, in particular, to a method for improving image classification by training a semantic classifier with a set of exemplar colour images, which represent "recomposed versions" of an exemplar image, in order to increase the diversity of training exemplars (cf. application as filed, page 3, lines 11 to 18).
2.1 According to the application as filed (paragraph bridging pages 1 and 2), current scene classification systems enjoy limited success on unconstrained image sets because of the incredible variety of images within most semantic classes. Exemplar-based systems should account for such variation in their training sets. However, even hundreds of exemplar images do not necessarily capture all the variability inherent in some classes. As an example of such variability, the application gives the class of sunset images which can be captured at various stages of the sunset and thus may have more or less brilliant colours and show the sun at different positions with respect to the horizon.
2.2 The gist of the present invention consists essentially in increasing the diversity of exemplar images used to train a semantic classifier by systematically altering an exemplar colour image to generate an expanded set of images with the same salient characteristics as the initial exemplar image. More specifically, an exemplar image may be altered by means of "spatial recomposition", i.e. by cropping its edges or by horizontally mirroring it. Another technique for expanding the set of exemplar images is to shift the colour distribution or to change the colour along the illuminant (i.e. red-blue) axis (cf. paragraph bridging pages 6 and 7 of the application).
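The alterations described in point 2.2 can be illustrated by the following minimal sketch. It is not taken from the application: the function names, the crop width and the shift amounts are hypothetical, and the red-blue shift is modelled simply as adding to the red channel while subtracting from the blue channel, which is an assumption made for illustration only.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Image = List[List[Pixel]]      # list of rows, each row a list of pixels


def mirror_horizontally(image: Image) -> Image:
    """Return a left-right mirrored copy of the image."""
    return [list(reversed(row)) for row in image]


def crop_edges(image: Image, left: int = 0, right: int = 0,
               top: int = 0, bottom: int = 0) -> Image:
    """Remove the given number of pixel columns or rows from each side."""
    height, width = len(image), len(image[0])
    return [row[left:width - right] for row in image[top:height - bottom]]


def shift_illuminant(image: Image, amount: int) -> Image:
    """Shift colour along a red-blue axis (assumed model, not the application's).

    A positive amount makes the image warmer, a negative amount cooler.
    """
    def clamp(value: int) -> int:
        return max(0, min(255, value))

    return [[(clamp(r + amount), g, clamp(b - amount)) for r, g, b in row]
            for row in image]


def expand_exemplar(image: Image, crop: int = 1,
                    shifts: Tuple[int, ...] = (-20, 20)) -> List[Image]:
    """Build an expanded set: original, mirror, four crops, colour shifts."""
    expanded = [image, mirror_horizontally(image)]
    for side in ("left", "right", "top", "bottom"):
        expanded.append(crop_edges(image, **{side: crop}))
    for amount in shifts:
        expanded.append(shift_illuminant(image, amount))
    return expanded
```

With the default (hypothetical) parameters, one exemplar yields eight images: the original, its mirror image, four cropped versions and two colour-shifted versions.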
3. Claim 1 of the appellant's request is directed to a computer implemented method for improving image classification of a digital image and comprises the following steps (itemized by the Board):
(a) providing an exemplar colour image;
(b) systematically altering the exemplar colour image to generate an expanded set of images,
- wherein systematically altering the exemplar colour image comprises:
i) spatially altering the exemplar colour image to generate an expanded set of spatially altered images,
wherein spatially altering the exemplar colour image comprises:
- horizontally mirroring the exemplar colour image, thereby doubling the number of images in the expanded set of images, or
- systematically cropping the edges of the exemplar colour image from one or more sides of the exemplar colour image, thereby increasing the number of images in the expanded set of images; and/or
ii) temporally altering the exemplar colour image to generate an expanded set of temporally altered images,
whereby the images in the expanded set simulate the appearance of capturing an image earlier or later in time,
wherein temporally altering the exemplar colour image comprises:
- systematically shifting the colour distribution of the exemplar colour image, thereby increasing the number of images in the expanded set of images, or
- systematically shifting the illuminant quality of the exemplar colour image, thereby increasing the number of images in the expanded set of images; and
(c) using a semantic classifier and the expanded set of images to determine an image classification for the exemplar colour image;
(d) wherein the expanded set of images are used to train the classifier in step (c), thereby providing an improved classifier.
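Steps (a) to (d) as itemized above can be sketched as follows. This is an illustration under stated assumptions, not the classifier of the application: a nearest-centroid rule over mean colours of the left and right image halves stands in for the semantic classifier, and all function names, labels and shift amounts are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Image = List[List[Pixel]]      # rows of pixels; width of at least 2 assumed


def mirror(image: Image) -> Image:
    """Horizontal mirroring, one of the spatial alterations of step (b)."""
    return [row[::-1] for row in image]


def shift_red_blue(image: Image, amount: int) -> Image:
    """Assumed stand-in for a shift along the illuminant (red-blue) axis."""
    def clamp(value: int) -> int:
        return max(0, min(255, value))
    return [[(clamp(r + amount), g, clamp(b - amount)) for r, g, b in row]
            for row in image]


def expand(image: Image) -> List[Image]:
    """Step (b): the exemplar plus three systematically altered versions."""
    return [image, mirror(image),
            shift_red_blue(image, 30), shift_red_blue(image, -30)]


def features(image: Image) -> List[float]:
    """Mean RGB of the left half followed by mean RGB of the right half."""
    half = len(image[0]) // 2
    out: List[float] = []
    for cols in (slice(0, half), slice(half, None)):
        pixels = [p for row in image for p in row[cols]]
        out.extend(sum(p[c] for p in pixels) / len(pixels) for c in range(3))
    return out


def train(exemplars: Sequence[Tuple[Image, str]]) -> Dict[str, List[float]]:
    """Step (d): train on the expanded set, here one centroid per class."""
    grouped: Dict[str, List[List[float]]] = {}
    for image, label in exemplars:          # step (a): labelled exemplars
        for variant in expand(image):
            grouped.setdefault(label, []).append(features(variant))
    return {label: [sum(col) / len(col) for col in zip(*vectors)]
            for label, vectors in grouped.items()}


def classify(model: Dict[str, List[float]], image: Image) -> str:
    """Step (c): return the label of the nearest class centroid."""
    vec = features(image)
    return min(model, key=lambda label: sum(
        (a - b) ** 2 for a, b in zip(vec, model[label])))
```

In this toy setting each labelled exemplar contributes four training vectors instead of one, which is the sense in which the expanded set increases the diversity of the training exemplars.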
Articles 123(2) and 84 EPC
4. Claim 1 of the appellant's request is essentially based on a combination of the features recited in claims 1 to 3 and 6 to 11 of the application as filed.
4.1 Original claim 1 was directed to a method which used image recomposition to generate an expanded set of images for improving both the training of a classifier and the classification of an image, as the clause "whereby the expanded set of images provides at least one of an improved classifier and an improved classification result" implies. According to claim 1 on file, the expanded set of images obtained from an exemplar image is used to train the classifier.
Image recomposition in training is described on page 5, line 26 to page 6, line 11 of the application. Figure 9 shows a diagram of the method of the invention whereby the expanded set of images can originate from a test image or from an exemplar image (see also page 13, line 23). As specified on page 13, lines 30 to 32, "[i]f the expanded set of images are exemplar images, they are used to train the classifier in a training stage 28, thereby providing an improved classifier according to the invention".
Furthermore (see ibid. page 14, lines 6 to 9), "it is also possible to apply the recomposition stage 16 in only one of the two paths shown in Figure 10 [sic - it should read 9] (i.e. either in training an improved classifier or in providing an improved image classification result, but not both)".
4.2 Hence, the Board considers that claim 1 does not contain subject-matter extending beyond the content of the application as filed (Article 123(2) EPC), and that the claim is clear and supported by the description (Article 84 EPC).
Article 56 EPC
5. Document D3 relates to document image processing and in particular to recognising images from an image source, for example, a printed document (D3, page 2, lines 9 and 10).
As pointed out in the background section of the description (D3, page 2, lines 14 to 20), a "fundamental problem in the art of automatic document image processing relates to image defects, that is, imperfections in the image as compared to the original ideal artwork used to create the image. The sources of image defects are numerous and well known. For example, the original printed document (e.g., paper document) which was the source of the image may be defective (e.g., the paper has spots of dirt, folds, or was printed from a faulty printing device). Further, when the paper document was scanned, the paper may have been skewed while being placed in the scanner, resulting in a distortion of the image. In addition, the optics of the scanning process itself can produce defects due to, for example, vibration, pixel sensor sensitivity or noise" (underlining added).
As stated on page 2, lines 29 to 32, "[o]ne known technique for improving the image classifiers' ability to handle defective image representations is through so called training. For example, automatic training is frequently employed in which certain recognition algorithms are used to train the classifier to recognize defective images. One approach of such training is to use document-image degradation models" (underlining added).
According to D3 (page 2, lines 35 to 40), known "document-image degradation models read a single ideal prototype image (e.g., a machine-printed symbol) and as prescribed by the particular model's degradation algorithm, generate a large number of pseudorandomly degraded images. These generated degraded images (i.e., examples of potential representations of defective symbols in an image) are then used to train the classifier for use in the image recognition process to recognize defective symbols. Of course, this technique requires that an image degradation model (i.e., a prediction of how the original image will degrade) is known in order to generate the degraded images for the classifier" (underlining added).
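The kind of document-image degradation model referred to in this passage can be illustrated by a minimal sketch. D3 does not specify the degradation algorithm in the quoted text, so the model below (independent pseudorandom flipping of pixels in a bi-level prototype) is purely an assumption, as are the function names and parameters.

```python
import random
from typing import List

BiLevelImage = List[List[int]]   # 0 = white pixel, 1 = black pixel


def degrade(prototype: BiLevelImage, flip_probability: float,
            rng: random.Random) -> BiLevelImage:
    """Return a copy of the prototype with each pixel flipped independently
    with the given probability (assumed degradation model)."""
    return [[1 - px if rng.random() < flip_probability else px for px in row]
            for row in prototype]


def generate_degraded_set(prototype: BiLevelImage, count: int,
                          flip_probability: float = 0.1,
                          seed: int = 0) -> List[BiLevelImage]:
    """Generate `count` pseudorandomly degraded images from one ideal
    prototype; a fixed seed makes the set reproducible."""
    rng = random.Random(seed)
    return [degrade(prototype, flip_probability, rng) for _ in range(count)]
```

The sketch starts from a single ideal prototype and produces many defective versions of the same symbol, which is the one-to-many direction of the prior art acknowledged in D3.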
As pointed out on page 3, lines 50 to 52, the method of D3 operates on bi-level images (i.e. black and white, not grey or colour). The set of input images represents images from the source on which, because of image defects and degradation, an OCR system would have difficulty in performing recognition. Furthermore, the set of degraded input images is collected in such a manner that the set captures those images from the source thought to be representative of a single symbol (page 3, lines 52 to 55). For instance, Figure 3 shows an illustrative input character set 20 representative of the printed character "A". As explained on page 3, lines 37 to 56, and shown in Figures 1 and 2, the captured input images are preprocessed prior to the application of image averaging, which produces a single binary image (see step 14 in Figure 2 of D3). Finally, some postprocessing is applied to the single binary image to produce an approximation of the ideal character, which is then sent to the image classifier for classification of the original image source (see steps 16 and 18 in Figure 2). These classifications are then used by the OCR engine 11 to produce a corresponding OCR system output 7.
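The many-to-one processing described for D3 can be sketched as follows. The passage only states that image averaging produces a single binary image; the pixel-wise mean followed by a fixed threshold used here is an assumption, and the function name and default threshold are hypothetical. The preprocessing and postprocessing steps of D3 are not modelled.

```python
from typing import List

BiLevelImage = List[List[int]]   # 0 = white pixel, 1 = black pixel


def average_and_threshold(images: List[BiLevelImage],
                          threshold: float = 0.5) -> BiLevelImage:
    """Combine equally sized bi-level images of one symbol into a single
    binary image: a pixel is black if its mean over all inputs reaches
    the threshold (assumed rule, not taken from D3)."""
    count = len(images)
    height, width = len(images[0]), len(images[0][0])
    return [[1 if sum(img[y][x] for img in images) / count >= threshold else 0
             for x in range(width)]
            for y in range(height)]
```

With the default threshold and an odd number of inputs this amounts to a pixel-wise majority vote, so isolated defects that appear in only a minority of the degraded images are suppressed in the single output image.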
5.1 Document D3 does not deal with the problem of training a colour image classifier, but with the problem of improving the recognition of an original character represented by a set of degraded bi-level images. Furthermore, it discloses the processing of many (degraded) images of a character to provide an approximation of the original character, whereas the present application teaches processing an exemplar image to generate a set of exemplar images, such that the original exemplar image and the corresponding set of exemplar images share some salient characteristics of a certain semantic class of images.
Furthermore, the use of image degradation models for the automatic training of image classifiers, referred to in D3 (see page 2, "Background of the invention"), is likewise not comparable to the present invention. In fact, the prior art acknowledged in D3 starts from a single ideal prototype image and processes it to generate a large number of pseudorandomly degraded images which train the classifier to recognise defective images of the same symbol (D3, page 2, lines 35 to 37). The present invention, however, starts from a real-world exemplar image and alters it "spatially" or "temporally", so as to produce a set of images which simulate other possible "real-world images" in the same image category.
5.2 Hence, in the Board's opinion, the teaching of document D3 cannot be regarded as a suitable starting point for the present invention.
6. As to the prior art documents D1 and D2 cited in the course of the examination, document D1 is concerned with the use of learning machines to discover knowledge from data. It relates therefore to a different field of technology and is not relevant to the present invention. Document D2 was cited by the Examining Division only as evidence that it was generally known to provide a digital representation of an image (see penultimate sentence of point 3.1 of the communication dated 28 April 2005).
7. In view of the available prior art, the Board finds that the subject-matter of claim 1 involves an inventive step within the meaning of Article 56 EPC.
7.1 In summary, the Board comes to the conclusion that the only claim of the appellant's request satisfies the requirements of the EPC. However, before a patent can be granted, the description and drawings must be adapted to the subject-matter of claim 1.
For these reasons it is decided that:
1. The decision under appeal is set aside.
2. The case is remitted to the department of first instance with the order to grant a patent on the basis of claim 1 according to the appellant's main request and a description and drawings to be adapted thereto.