T 1482/19 (Scale space approximation/TELECOM ITALIA) 11-02-2022
Download and more information:
KEYPOINT IDENTIFICATION
Inventive step - (yes)
Inventive step - technical problem derivable from the application as filed (yes)
Inventive step - long felt need (no)
Claims - functional features (yes)
Amendments - extension beyond the content of the application as filed (no)
Correction of error - obvious error
Amendment to appeal case - amendment detrimental to procedural economy (no)
Amendment to appeal case - suitability of amendment to resolve issues raised (yes)
Amendment after summons - exceptional circumstances (yes)
I. The appeal is against the decision of the Examining Division to refuse the application. For all requests, the decision was based on a lack of compliance with Article 84, Article 123(2) and Article 56 EPC. The decision cited inter alia the following documents:
D1: J Ng et al: "Steering in Scale Space to Optimally Detect Image Structures", 2004
D4: D.G. Lowe: "Distinctive Image Features from Scale-Invariant Keypoints", 2004
II. With the grounds of appeal the appellant requested that the decision of the Examining Division be set aside and that a patent be granted on the basis of the main request or of one of seven auxiliary requests, all filed with the grounds of appeal. The sixth and seventh auxiliary requests were identical to the first and second auxiliary requests refused by the Examining Division.
III. In the preliminary opinion accompanying a summons to oral proceedings the Board indicated that it tended to agree with the position of the Examining Division, that all requests on file lacked compliance with Articles 56, 84 and 123(2) EPC. However, it also stated that the third auxiliary request had the potential of defining inventive subject matter, if re-drafted in a form allowable under Articles 84 and 123(2) EPC. The Board further noted an error in the application documents, at equation 9 on page 12 of the description.
IV. In reply to the summons the appellant filed a new page 12 as a correction under Rule 139 EPC. It also filed three new requests, labeled A, B and C, based on the said third auxiliary request, replacing all previous requests. During the oral proceedings, the appellant filed a new request, labeled C - revision 3, replacing all previous requests.
V. The final request of the appellant was therefore that the decision under appeal be set aside and that a patent be granted on the basis of the set of claims according to request C - revision 3 filed during the oral proceedings, page 12 as filed before the oral proceedings, and the remainder of the application documents as originally filed.
VI. At the end of the oral proceedings, the chairman announced the Board's decision.
VII. Claim 1 of the only request on file reads as follows:
"A method for identifying keypoints in a digital image (I) comprising a set of pixels, each pixel having associated thereto a respective value of an image representative parameter, said method comprising:
- approximating (108), without explicitly computing, a filtered image (L(sigma), 102) of said digital image (I), said filtered image being obtained from said digital image (I) by means of a filtering function, said filtering function being a Laplacian of Gaussian or a Difference of Gaussians and having a filtering parameter (sigma) being the standard deviation of the Gaussian function, said approximating comprising: a) generating (102) a set of base filtered images (L(sigmai)) of said digital image (I), each base filtered image (L(sigmai)) being the digital image (I) filtered by means of a respective base filter, each base filter being a matrix the elements of which are obtained by calculating said filtering function with a respective value (sigmai) of the filtering parameter; b) for each pixel (xt,yt) of at least a subset of said set of pixels of the digital image (I), approximating (106) the filtered image (L(xt, yt, sigma)) obtained from said digital image (I) by means of said filtering function having a filtering parameter value by means of a respective approximation function based on the base filtered images, said approximation function being a function of the filtering parameter (sigma) within a predefined range (104) of values of the filtering parameter (sigma), said predefined range ranging from a lowest value of the filtering parameter to a highest value of the filtering parameter, wherein: - wherein said approximating function is a polynomial expressed as follows: FORMULA/TABLE/GRAPHIC whereinL(x, y, sigma) is the filtered image to be approximatedc(x, y) is the polynomial coefficient for sigma**(r)
wherein said polynomial coefficients cr(x, y) are obtained as a linear combination of said base filtered images according to the following equations:
FORMULA/TABLE/GRAPHIC
wherein F**(T) is calculated by solving the following equation:
SF=W**(T)
wherein:
F is the unknown which, after being calculated and transposed, gives F**(T)
S is a matrix formed by specific values of sigmai as follows:
FORMULA/TABLE/GRAPHIC
wherein sigma'i (with i varying from 1 to q) are specific values of the filtering parameter sigma
W is a matrix calculated by solving the following equation:
AW = D
wherein
A is a matrix formed by the base filters, wherein matrix A has a number of columns equal to the number of base filters, wherein each column of matrix A represents a respective base filter, wherein each column of matrix A is built by arranging in columns the columns of each base filter
D is a matrix each element of which is obtained by calculating the filtering function, for each point (x, y) and for each of said specific values sigma'j (with j varying from 1 to q)
W is the unknown which, after being calculated and transposed, gives W**(T)
- for each pixel of said at least a subset, identifying (120) such pixel as a candidate keypoint if, in such pixel, the approximation function has a local extreme (110) which is also a global extreme (116) with respect to the filtering parameter (sigma) for a value sigmam of the filtering parameter in a respective sub-range of values of the filtering parameter internal to said predefined range said respective sub-range having a lower end larger than said lowest value of the filtering parameter and an upper end lower than said highest value of the filtering parameter, wherein the difference between said lowest value and said lower end, and the difference between said highest value and said upper end, are such that only the maximum or minimum points that happen in sigmam of which the behavior of the approximation function is known in a neighborhood of sigmam that is sufficiently large are considered as candidate keypoints;
- for each pixel identified as a candidate keypoint: c) comparing (130) the value assumed by the approximation function at the value of the filtering parameter corresponding to the global extreme of the pixel with the values assumed by the approximation functions of the pixels adjacent to said pixel in the image at the values of the filtering parameters of the respective global extremes of such adjacent pixels, and d) selecting (136) such pixel based on this comparison."
The application
1. The application relates to identifying so-called keypoints in a digital image, for e.g. scene or object identification/matching.
1.1 It builds upon the known method of scale invariant feature detection (SIFT keypoints, see D4, cited in the application on page 2), which detects keypoints as extrema in the image "scale space" (D4, section 3). The image scale space is obtained by convolving (filtering) the original images with a Gaussian function of varying widths; its standard deviation defines the scale dimension of the image "scale space". The extrema are not detected in the scale space itself, but in the LoG space (Laplacian of Gaussian, second derivative with respect to scale), approximated in SIFT feature detection by the difference of two adjacent in a sequence of scale filtered images (difference of Gaussian - DoG, see D4, section 3).
1.2 The application proposes a method of approximating the scale space by a continuous function of scale, determined as a linear combination of filter responses at fixed scales sigma1 ... sigman (basis scales - see equation 1), the combination weights p(sigma) themselves being determined as a polynomial function of scale (equation 8). The determination of the polynomials entails two steps: first the basis scales and a set of scales of interest are selected (page 10, paragraph before equation 5, page 15, 1**(st) paragraph). The combination weights for expressing the filters at the scales of interest as a linear combination of the basis filters can be determined by solving a linear system (equations 4 and 5). Then, polynomials coefficients are determined to best approximate these weights as a function of scale.
1.3 Under this approximation, given an image, the responses to the basis filters are computed and treated as constants. Then, the scale space response can be approximated at each image point as a polynomial (equations 12 and 13). This allows the analytical computation of extrema points by computing the roots of the polynomial (page 16, paragraph 2: first derivative equals 0). The application proposes to select "global" extrema points, also verifying whether the local extrema are extrema in view of the end points of the scale range (page 16).
1.4 The application further proposes (page 17, first full paragraph) to select as keypoints only points in a restricted working range, where the behavior of the approximation function is known.
Rule 139 EPC
2. The Board accepts the request for correction under Rule 139 EPC. The error is obvious, at least because it is not consistent with the description in the paragraph following it, and with other equations, see e.g. equation 11. The correction is unambiguous, as correctly explained by the appellant in its letter dated 4 February 2022, and reflects what was originally intended.
The decision under appeal
3. The Examining Division refused the application on three main grounds. The first one (points 1.1 and 1.2), under Article 84 EPC, relates to the definition of the scale space in that the filtering function was not specified to be a LoG or DoG function (feature considered necessary to detect keypoints).
4. The second one (points 1.3 and 3.1) was related to the definition of the approximation function and it was found, essentially, that the application as filed only provides support for an approximation as provided in equation 12, read together with equations 5, 8 and 9 defining the polynomial coefficients and their computation. The claimed generalizations were not allowable under Article 123(2) EPC. Furthermore, the attempts at formulating these equations (in text form) led to lack of clarity and conciseness (Article 84 EPC).
5. The third one was that the claimed subject-matter, even if considering the subject matter to be limited as defined in those equations (point 2), was obvious in view of a combination of D1 with D4 (points 2.1 to 2.6).
Admittance (Article 13 RPBA 2020)
6. The current request C - version 3 was filed during oral proceedings before the Board. Though it certainly could, and should, have been filed earlier, it also remedied to all objections brought forward in the preliminary opinion of the Board, by taking guidance on point 23 therein. The Board regards this as sufficient reasons for allowability under Article 13(1) RPBA 2020.
6.1 Regarding Article 13(2) RPBA 2020, the Board finds that exceptional circumstances are established in that a request which is clearly allowable, as is the case here, brings the procedure to an end (see also T 1294/16, points 18.2 and 18.3).
Articles 123(2) EPC and 84 EPC
7. Claim 1 of the only request on file defines, in summary, a method of identifying keypoints in a digital image, comprising:
first claim feature: using an image scale space defined by a LoG or a DoG filter,
claim features a and b: defining, for each pixel, a polynomial approximation of this scale space (reciting equations 12 and 13) in a predefined range of scales, this approximation being obtained in two steps in accordance with equations 5 (approximation of the scale space as a linear combination of the base filters), 8 and 9 (approximation of the weights as polynomial functions of scale) and
last paragraph of claim feature b: identifying candidate keypoints in this scale space as global extremes of the approximation function which are found in a sub-range defined by its lower and upper ends so that the behavior of the approximation function is known in a sufficiently large neighborhood of the extremum
claim features c and d: comparing, for each pixel identified as candidate keypoint, the value of the approximation functions at said extremum with those of neighboring pixels to select the pixel or not.
8. This claim is based on original claim 1 modified in the first claim feature to define the LoG/DoG filters (see e.g. original claim 11), amended in features a and b to define the approximation used in accordance with equations 5, 8, 9, 12, 13, and further amended (last paragraph of feature b) to define the selection of candidate keypoints in a sub-range.
9. The first two amendments remedy the deficiencies noted by the Examining Division in terms of original disclosure, support and clarity (Articles 123(2) and Article 84 EPC).
10. The last amendment is based on page 17, lines 7 to 13, defining the lower and upper ends of the sub-range in respect of the highest and lowest value of the sub-range as follows:
wherein the difference between said lowest value and said lower end, and the difference between said highest value and said upper end, are such that only the maximum or minimum points that happen in sigmam of which the behavior of the approximation function is known in a neighborhood of sigmam that is sufficiently large are considered as candidate keypoints.
10.1 In the Board's view, this formulation follows unambiguously from the explanation provided in the said passage on page 17 that by restricting the selection to a sub-range, only the extrema that happen in a neighbourhood where the behavior of the approximation function are known are considered (in this way...). Though a verb is missing in the sentence on lines 12 and 13, it is clear that the discussion is about which keypoints are to be considered - see the beginning of the paragraph. Thus no objection under Article 123(2) EPC arises.
10.2 Though functional, the definition of the sub-range is clear to the skilled person, who, in the context of the application, is deemed to have expertise in applied mathematics, in particular in approximation functions. Analysing the behavior of polynomial approximations around the interpolating points and on the borders so as to appreciate the validity of the approximation (e.g. estimating the errors' order of magnitude) is a matter of routine in this context. Thus no objection under Article 84 arises either.
Inventive step (Article 56 EPC)
11. The Examining Division rejected the requests underlying their decision for a lack of inventive step starting from D1 in view of D4. These reasons are also pertinent for the current request and are discussed below.
12. Document D1 describes a method of scale space approximation which is the same as that of the application (see D1, equations 5 to 15). This was also stated in the decision at point 2 and was not denied by the appellant.
12.1 The purpose of D1 is (introduction, page 483, paragraph 2) to address the "main problem ... that exhaustive filtering with kernels over a wide range of finely-sampled scales is computationally intensive and inefficient". The proposed method, by the formulation of what are called polynomial steering functions "lends itself well to the detection of global maxima by analytically finding the roots of the derivatives of the polynomials, rather than exhaustively performing operations over multiple scales to detect maxima as in the previous aforementioned works" (section 4, paragraph 1).
12.2 D1 uses global maxima over scales (section 4) to detect the scale of image structures in order to improve low-level feature detection and extraction by spatial filters (introduction). It is said though that the other polynomial roots "provide scale information about the other local energy maxima, minima and inflection points" (conclusion, last paragraph).
13. Document D4 represents the seminal work of Lowe regarding keypoint detection with scale invariant features. The method starts (section 3) by detecting local extrema in (DoG) scale space. These local extrema are obtained by comparing the value of a point with its neighbours both in the scale and the spatial direction (adjacent points - see figure 2). This raises the question of the sampling frequency in both scale and space domains (section 3.1). D4 uses experimentation in order to determine the frequency of sampling in the scale domain and "must settle for a solution that trades off efficiency with completeness" (end of section 3.1). To reduce computation, D4 further proposes to work on octaves (where the scale doubles), i.e. to repeat the method with the same set of scales, but on reduced versions of the image for each octave (see figure 1).
14. The Examining Division reasoned (2.4) that keypoint detection was a form of low-level feature detection and that it was commonly known to the skilled person that keypoint detection requires the same identification of scale space extrema. So the skilled person would use D1 for keypoint detection.
14.1 The keypoint selection (steps c) and d)) based on the spatial comparison with 8 adjacent points would be done by the skilled person because it was commonly known to do so for keypoint detection (presumably from D4). There was no need to also compare with the neighbours in scale, because the extrema in scale were already determined.
14.2 Accordingly, the Examining Division concluded that all claim features except for the last paragraph in feature b were obvious for the skilled person over the combination of D1 and D4.
14.3 Regarding the last feature - in its previous wording, defining a sub-range separated from the ends of the predefined range by a fixed value of 0.1 - the Examining Division noted (2.4.1 and 2.4.2) that the technical effect of eliminating unstable keypoint candidates was plausible, but could only be understood with hindsight, and could not be based on the corresponding disclosure on page 17. As a consequence, the Examining Division did not accept this feature to have the alleged effect (2.4.3, last sentence).
15. The appellant disagreed with the choice of D1 as starting point (see the grounds of appeal, point 1.3.1). There was no mention of keypoint detection in D1, so using D1 for that purpose was not apparent to the skilled person starting from D1. D4 should be taken as the "closest prior art", because it addressed the same problem as the claimed invention, namely keypoint detection.
15.1 Further, the appellant argued (see the grounds of appeal, point 1.3.3) that the skilled person would not have combined D4 with D1, because D1 was about features such as edges and ridges (D1, abstract), whereas D4 stated that edges were not good candidates for including keypoints (D4, section 4.1). Moreover, D1 provided a 12th degree polynomial (see, e.g., equation 10), which - although being a polynomial - was clearly too complex to handle for keypoint identification. Even if the two documents were combined, the combined method would have resulted in computing the approximation of D1 at each considered point in D4 and comparing it with the 9-8-9-neighborhood as per D4 (figure 2).
16. The Board remarks that document D4 is commonly known in the field of image analysis. A person skilled in this art, i.e. the intended addressee of D1, would see that D1, although it mentions edges and ridges, proposes a general method of scale space approximation, which is said to be more computationally efficient than computation by discretization of the scale space. Furthermore, D1 teaches that with this approximation, for each image point, local and global extrema in scale can be computed analytically in an efficient way. So the skilled person would consider employing the method of D1 to improve the commonly known keypoint detection method of D4, which uses the standard scale space discretization.
17. That said, the Board accepts that this way of presenting the argument does not follow the established (but not mandatory) problem-solution approach, according to which the starting point would be the prior art to be improved and the objective technical problem to achieve that improvement. One does not, however, come to a different conclusion that way.
17.1 Starting from D4, the skilled person would set out to improve the efficiency of the keypoint detection method, a matter already discussed in D4, section 3.1. In doing that, the skilled person would come across D1 which discloses a solution to this problem. As explained above, even though D1 does not expressly mention keypoints, the skilled person would not have any difficulty in realising that D1 can be employed to increase efficiency of the method of D4.
17.2 Since D1 emphasises the importance of global extrema, the skilled person would also identify, for each considered point, the global extremum among the local extrema.
17.3 Following the D4 structure, this extremum will be compared with those of the neighbours. However, using the continuous polynomial scale space approximation of D1, the comparison with the upper and lower 9 scale layer neighbors (figure 2 of D4) has no meaning, as those layers are not defined. But the spatial comparison with the 8 neighbors, as carried out in D4, remains valid.
18. The appellant also asked the Board to further consider (i) that the invention responded to a long-felt need and was included in the MPEG-7 standard, and (ii) that, although D1 and D4 were published in 2004, no one had the idea the inventors had up until 2013.
18.1 Regarding the first point, the appellant has not elaborated on any specific "long-felt need" or demonstrated any failed attempts in the art to address it. For this reason alone, the appellant's argument must fail.
18.2 Regarding the second point, the Board notes that the obviousness of the claimed invention is assessed in an "objective" manner, i.e. irrespective what actually happened, by reference to what a fictitious skilled person "would have done". The Board believes that the link between D4 and D1, provided by the problem of efficiency, would have led the skilled person to consider their combination, irrespective of whether that had actually happened in the cited period.
19. Thus the Board agrees in essence with the reasoning of the Examining Division regarding the combination of D1 and D4, though indeed the natural starting point according to the problem-solution approach is D4 rather than D1, because D4 is improved by D1 and not vice versa. It follows that all the claimed features but the last in feature b are obvious in view of said combination.
20. Regarding this last feature, the Examining Division and the appellant agree that this step solves a technical problem in that it allows to eliminate spurious or unstable candidate keypoints, due to the effects of the approximation.
20.1 The Board also notes that the Examining Division did not find this feature obvious over D1 and D4 but did not accept that the alleged technical effect could be acknowledged based on the application documents as filed.
20.2 The Board accepts this as a legitimate concern but finds to the contrary. The skilled person understands from page 17 that by keeping only the points for which "the behavior of the approximation functions are known in a neighborhood ... that is sufficiently large", the points outside the validity range of the approximation are eliminated, and that these are potentially not true extrema, and hence no real keypoints, but are due to errors in the approximation. Thus the technical problem of eliminating "false" extrema is clear from the application.
21. The Board concludes that claim 1 of the sole request involves an inventive step.
Other issues
22. The Board has no objections as to the content of the dependent claims, or of the description.
For these reasons it is decided that:
The decision under appeal is set aside.
The case is remitted to the examining division with the order to grant a patent in the following version:
Description:
Pages 1-11, 13-24 as originally filed
Page 12 as filed with letter of 4 February 2022
Claims:
1-9 according to request "C" - revision 3 filed during the oral proceedings of 10 February 2022
Drawings:
Sheets 1-16 as originally filed.