T 0441/06 (Gesture recognition system/MATTSSON) 06-02-2009
Gesture recognition system
Added subject-matter (yes)
Support by the description (no)
Inventive step (no)
I. This appeal is against the decision of the examining division dispatched 2 November 2005, refusing European Patent Application No. 01 914 291.8 for the reason that the subject-matter of claims 1 to 13 did not involve an inventive step having regard to the disclosure of
D4: US 5 617 312 A and any one of
D1: US 5 616 078 A;
D2: DE 196 32 273 A;
D5: Y. Kuno et al, "Vision based human interface with user-centered frame", Proceedings of the IEEE/RSJ/GI International Conference on Intelligent Robots and Systems: Advanced Robotics and the Real World, Munich, Sept. 12 - 16, 1994; pages 2023 to 2029.
II. Notice of appeal was submitted on 12 January 2006. The appeal fee was paid on the same day. The statement setting out the grounds of appeal and claims 1 to 17 to replace the claims on file were filed on 13 March 2006.
The appellant requested that the appealed decision be set aside and that the patent be granted based on claims 1 to 17. Further, an auxiliary request for oral proceedings was made.
III. On 12 November 2008 the board issued an invitation to oral proceedings scheduled to take place on 6 February 2009 accompanied by a communication. In the communication the board expressed the preliminary view that claims 1, 14, 15 and 17 did not appear to comply with the provisions of Article 123(2) EPC, that claims 1 and 17 did not appear to be supported by the description, contravening Article 84 EPC 1973, and that the subject-matter of claims 1 and 17 did not appear to involve an inventive step having regard to the combination of any one of D1, D2 and D5 with D4 as argued in the decision under appeal or, in the alternative, the combination of
D3: US 5 594 469 A and D5.
IV. In his letter of 14 January 2009, submitted by telefax on 16 January 2009, the appellant announced that he would not attend the oral proceedings. No substantive comments or amendments in response to the communication were received.
V. Oral proceedings took place as scheduled on 6 February 2009. Neither the appellant nor his representative attended the hearing.
VI. After deliberation on the basis of the submissions and the requests of 13 March 2006 the board announced its decision.
VII. Claim 1 reads as follows:
"A gesture recognition system comprising:
detecting arrangement (53,63,83) for detecting a plurality of markers (31, 32, 42, 51, 51, 81) arranged on a movable object (30, 50, 60) and generating signals corresponding to the markers,
processing arrangement (24, 54) for processing said signals from said detecting arrangement,
computing arrangement (11) for determining positions of said markers from said signals,
a set of reference markers attached to the object and forming a reference line delimiting a first area and a second area;
a set of command markers attached to the object, said command markers having a physical property different from the reference markers, said physical property being detectable by said detecting arrangement for differentiating said command markers from said reference markers;
said detecting arrangement detecting movements of said command markers and translating said movements into gesture signals in said first area and omitting translation of said movements into gesture signals in said second area."
Claim 17 is a method claim corresponding to claim 1.
1. Admissibility
The appeal complies with the provisions of Articles 106 to 108 EPC 1973, which are applicable according to J 10/07, point 1 (see Facts and Submissions, point II above). Therefore it is admissible.
2. Non-attendance of oral proceedings
In his letter of 14 January 2009 the appellant announced that he would not attend the oral proceedings which he had requested and to which he had been duly summoned (see Facts and Submissions, points III and IV above). Nobody attended the hearing on behalf of the appellant.
Article 15(3) RPBA stipulates that the board shall not be obliged to delay any step in the proceedings, including its decision, by reason only of the absence at the oral proceedings of any party duly summoned who may then be treated as relying only on its written case.
Thus, the board was in a position to take a decision at the end of the hearing.
3. Article 123(2) EPC
Claims 1 and 17 were amended to refer to "having a physical property different from the reference markers". Contrary to the appellant's statement, this feature was not disclosed in the application as filed, since page 13, line 18 to page 14, line 3 only refers to markers having different shape and colour combinations. Although page 12, line 8 mentions sequential colour patterns and page 21, lines 29 and 30 mentions the possibility of using radio or ultrasonic markers, the more general term "physical property", which encompasses many additional physical properties such as reflectivity, temperature etc., lacks a basis in the application as filed, contravening the provisions of Article 123(2) EPC.
Claim 14 differs from claim 10 as filed in referring to a "stored array" instead of a "sorted array". The description at page 6, lines 6 to 8 also refers to a "sorted array". Claim 14 does not comply with the provisions of Article 123(2) EPC.
Claim 15 differs from claim 11 as filed in specifying that a GUI driver serves as an emulation arrangement. At page 7, lines 9 to 13 a mouse emulation and a keyboard emulation are mentioned. The more general term "emulation arrangement" was not disclosed in the application as originally filed, contravening the provisions of Article 123(2) EPC.
Since the appellant's sole request is not allowable, the appeal must be dismissed.
However, since these objections could have been overcome without affecting the substance of the claims, the board also makes the following observations.
4. Article 84 EPC 1973
The description refers to the problem of using movements to enter commands into the operating system of the computer or to control peripheral devices, see page 1, lines 9 to 12 of the application as published. The description refers to external systems which the claimed system is intended to control, see page 5, line 29 and page 8, lines 20 to 23. The description further states that the function of the system is to detect the movements of a marker and to relate the movement to a stored movement representation, which is translated into a command. To simplify the translation or to increase the command possibilities, the system uses alfa markers as a reference line, and movements are translated or omitted with respect to the reference line, e.g. depending on whether the movement takes place above or beneath the line, see page 21, lines 6 to 11. The use of the gesture signals for controlling external systems results in specific requirements for the gesture control system. As controlling external systems is not mentioned in claims 1 and 17, the claims are not supported by the description, contravening Article 84 EPC 1973.
5. Novelty and inventive step
In its communication accompanying the summons the board stated that it considered the reasoning in the decision under appeal, based on a combination of any one of D1, D2 and D5 with D4, to be convincing. In view of possible amendments of the claims intended to overcome the objections made under Articles 84 EPC 1973 and 123(2) EPC, the board added an assessment based on D3 and D5 in the communication.
However, no substantive comments or amendments in response to the communication were received. As no amendments of the claims making the combination of D3 and D5 more relevant than any of the combinations on which the decision under appeal was based were filed, the board bases its assessment of inventive step on the reasoning made by the examining division in the decision under appeal.
5.1 Claim 1
The board considers the reasoning in the decision under appeal to be convincing, in particular points 2.3 to 2.6 based on a combination of D5 and D4, for the following reasons:
D5 is a scientific report concerning vision-based human interfaces based on the recognition of hand gestures and their use for a man-machine interface, see point I, lines 1 to 5. D5 states that known methods for a man-machine interface may be classified into two categories: one uses special gloves with sensors, i.e. markers, the other employs computer vision technologies. Both technologies are said to be able to provide reliable results; however, computer vision systems need calibration, which tends to be complex, in particular if the user's position and orientation vary. The system studied in more detail in D5 overcomes this problem by using four reference points on the user's body defining a body coordinate system, with respect to which the position and orientation of a finger tip provided with markers are calculated, see points I and IV. This system thus meets the object underlying the present application, namely to provide a gesture recognition method and system which allow real-time gesture recognition without a need for complex computing resources, see page 2, lines 23 to 25.
The system comprises two cameras (Camera1 and Camera2 in Figure 2, page 2026), which constitute a detecting arrangement for detecting a plurality of markers arranged on a movable object (Hand in Figure 2) and generating signals (Video signals in Figure 2) corresponding to the markers, and various processors (Nexus, SUN and IRIS in Figure 2), which constitute a processing arrangement for processing said signals from said detecting arrangement and a computing arrangement for determining positions of said markers from said signals.
The system uses special markers, e.g. three point marks on the upper body and one on the knee, as reference points. These correspond to a set of reference markers attached to the object and forming a body coordinate system, i.e. a reference system having axes which constitute reference lines delimiting first areas and second areas in their planes. The reference points are e.g. bright balls. Further, a glove (e.g. black) with three marks (e.g. white) is used to recognise a hand motion by which the position and the orientation of an object are controlled. This corresponds to a set of command markers attached to the object, said command markers having a physical property, i.e. colour and shape, different from the reference markers, said physical property being detectable by said detecting arrangement for differentiating said command markers from said reference markers. The user of the system can move an object in the 3D computer graphics world displayed on a screen by his hand motion. The finger tip position and finger orientation are calculated with respect to the body coordinate system established by the reference points. This corresponds to detecting movements of said command markers and translating said movements into gesture signals. See points I and IV.
The subject-matter of claim 1 differs from the teaching of D5 in that gestures are translated into commands only if some pre-defined markers ("command markers") are positioned in an area above a reference line defined by two other pre-defined markers ("reference markers").
The technical effect of this feature is to discriminate between a rest position and an active position of the command issuing body part, e.g. the hands, thereby solving the problem of avoiding spurious command triggers.
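Purely by way of illustration, the distinguishing feature can be sketched in Python as a half-plane test. The sketch assumes 2D image coordinates with the y axis pointing upwards, a reference line through two reference markers, and a "first area" lying to the left of the directed line from the first to the second reference marker. The function names are hypothetical and are taken neither from the application nor from D4 or D5.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def side_of_line(ref_a: Point, ref_b: Point, p: Point) -> float:
    """Signed test: positive if p lies to the left of the directed line ref_a -> ref_b."""
    return ((ref_b[0] - ref_a[0]) * (p[1] - ref_a[1])
            - (ref_b[1] - ref_a[1]) * (p[0] - ref_a[0]))


def translate_movements(ref_a: Point, ref_b: Point,
                        positions: List[Point]) -> List[Point]:
    """Return displacements of a command marker between successive positions.

    A movement is translated only if both of its end points lie in the
    first area (assumed here: the positive side of the reference line);
    all other movements are omitted.
    """
    gestures = []
    for start, end in zip(positions, positions[1:]):
        if side_of_line(ref_a, ref_b, start) > 0 and side_of_line(ref_a, ref_b, end) > 0:
            gestures.append((end[0] - start[0], end[1] - start[1]))
    return gestures
```

With a horizontal reference line from (0, 0) to (10, 0), only the part of a track lying above the line yields a displacement; movements below the line, or crossing it, are dropped, which corresponds to the discrimination between a rest position and an active position.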
A similar effect is achieved in the computer system that enters control information by means of a video camera disclosed in D4, lying in the same technical field of gesture recognition. D4, see in particular figures 11 to 13 and column 7, line 35 to column 8, line 9, discloses a man-machine interface, translating hand gestures into a command only if they are detected within a specific "virtual area" of the field of view of the camera, whereas they are discarded if they are detected outside the virtual area, see column 7, lines 35 to 40 and lines 49 to 51. The skilled person would understand that the reference points of D5 may be used to define the virtual area, which may be considered to be the first area and the area outside the virtual area the second area. Thus, the subject-matter of claim 1 does not involve an inventive step.
The appellant argues in his statement setting out the grounds of appeal that the subject-matter of claims 1 and 17 differs inter alia from D5 at least by the feature of a set of reference markers attached to the object and forming a reference line delimiting a first area and a second area. The board notes that in the method of D5 four points are taken in the scene and other points' positions are calculated as invariant coordinates in the coordinate system with three basic vectors composed of four points, the reference points being taken on the user's body, see point I. It is common general knowledge that the vectors of a coordinate system define axes which may be interpreted as reference lines delimiting a first area and a second area in their respective plane.
The appellant further states that the first area is carefully defined by the use of reference markers, with the result that both the user and the system know exactly where the first area is situated. The board understands the definition of the first area as part of the definition of the triggering gesture, which should be easily recognized in order to have a robust system.
As to the appellant's argument that the area is dynamic and related to the object, the board notes that D5 uses a user-centered frame so that the user can control the object by hand motions matched with human intuition. Since reference points are taken on the user's body, the user can manipulate the object in the user-centered frame, which implies that the user does not need to keep his body at the initial position, see D5, page 2023, Abstract, and page 2029, Conclusion.
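The coordinate system referred to above can be illustrated by a minimal Python sketch. It assumes that the first of the four reference points serves as origin and that the three basic vectors run from it to the other three reference points; D5 may order or normalise the points differently. All names are hypothetical. The coordinates of a further point are obtained by solving a 3x3 linear system with Cramer's rule.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    """Component-wise difference a - b."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def det3(c0: Vec3, c1: Vec3, c2: Vec3) -> float:
    """Determinant of the 3x3 matrix whose columns are c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c1[2] * c2[1])
            - c1[0] * (c0[1] * c2[2] - c0[2] * c2[1])
            + c2[0] * (c0[1] * c1[2] - c0[2] * c1[1]))


def body_coordinates(refs: Sequence[Vec3], p: Vec3) -> Vec3:
    """Coordinates of p in the frame spanned by four reference points.

    refs[0] is taken as origin (an assumption of this sketch); the basic
    vectors are refs[1] - refs[0], refs[2] - refs[0] and refs[3] - refs[0].
    """
    origin = refs[0]
    e1, e2, e3 = (sub(r, origin) for r in refs[1:4])
    d = sub(p, origin)
    denom = det3(e1, e2, e3)
    if abs(denom) < 1e-12:
        raise ValueError("reference points are coplanar; no 3D frame is defined")
    return (det3(d, e2, e3) / denom,
            det3(e1, d, e3) / denom,
            det3(e1, e2, d) / denom)
```

If the reference points and the tracked point are shifted together, the computed coordinates do not change. Each coordinate axis of such a frame splits the plane it lies in into two parts, which is the reading of "reference line" adopted by the board.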
The appellant argues that the command marker being differentiated from the reference marker results in increased safety in detecting the two types of markers. However, using different kinds of markers as such is disclosed by D5 at page 2026, right column and implies making use of the difference.
As to the appellant's argument that only movements or gestures of the command markers in the first area are interpreted as gesture signals, the board notes that this does not establish a difference to the system disclosed in D4, in which the microcomputer converts the signal into a signal controlling the computer to trigger a punching action only if the representative point of the hand's image is detected in the hand detecting area, see column 7, lines 46 to 60.
Therefore, the appellant's arguments do not convince the board.
5.2 Claim 17
Similar arguments to those presented in point 5.1 with respect to claim 1 apply to the corresponding method claim 17 mutatis mutandis.
6. The further objections raised in the communication accompanying the summons still persist.
For these reasons, it is decided that:
The appeal is dismissed.