T 0344/19 18-03-2022
SYSTEMS AND METHODS FOR COMPUTERIZED INTERACTIVE SKILL TRAINING
Remittal - special reasons for remittal (no)
Inventive step - mixture of technical and non-technical features
Inventive step - sole substantive request (no)
I. The appeal is against the decision of the examining division refusing the European patent application No. 09 803 499 (published as WO 2010/014633 A1) on the ground that the claimed subject-matter of the sole request before it did not involve an inventive step within the meaning of Article 56 EPC.
II. At the end of the oral proceedings before the board, the appellant requested that the decision under appeal be set aside and that the case be remitted to the examining division for further prosecution or that a patent be granted on the basis of the claims underlying the impugned decision, i.e. claims 1 to 20 filed with the appellant's (applicant's) letter dated 3 August 2018.
III. Claim 1 of the sole substantive request is worded as follows:
A computer based method of training, comprising:
for a first challenge in a training module, wherein the first challenge is a statement or question that a user is to respond to, electronically transmitting for presentation on a user terminal:
a reading user interface configured to present via text: the first challenge;
a guideline language construct that provides a model answer to the first challenge, wherein the guideline language construct includes: a key element; and
contextual language in which the key element is embedded, wherein the key element is caused to be visually distinguished from the contextual language;
a watching user interface configured to present:
a textual representation of the key element;
an audio video presentation of a role model character audibly presenting the guideline language construct, including the key element, and/or audibly presenting the key element without the guideline language construct, wherein the role model character has lip motions at least substantially synchronized with the audio presentation;
a practice user interface, wherein the practice user interface includes an audio video presentation of a challenge character, wherein the challenge character audibly presents the first challenge, wherein the user is to verbally provide a response to the first challenge, the response including at least the key element;
a review user interface configured and arranged to include at least a textual representation of the key element;
in response to at least one user action, transmitting for presentation on the user terminal a user interface associated with a scored challenge session configured to test the user with respect to at least the first challenge, wherein the scored challenge user interface includes:
at least one character visually and audibly presenting the first challenge,
wherein the at least one character has lip motions at least substantially synchronized with the audible first challenge, and wherein the user is to audibly respond to the first challenge by at least presenting the key element;
a scoring interface, the scoring interface configured to:
detect an initiation of a verbal challenge response using a voice recognition system during the scored challenge session by the user, wherein the verbal challenge response is made after the first challenge has been presented;
identifying a speech disfluency at the beginning of the verbal challenge response;
determine how long it took the user to initiate the response relative to the presentation of the first challenge wherein the act of determining how long it took the user to initiate the response relative to the presentation of the first challenge does not include the time at which the user articulated the speech disfluency as the initiation of the response; and
generate a score related to how quickly the user initiated the challenge response;
the scoring interface further configured to receive and/or provide at least the following scoring information: how accurately and/or completely the user audibly presented the key element; and
transmitting for presentation on the user terminal at least one navigation control via which the user can provide navigational instructions that enable the user to navigate to a desired user interface.
IV. The appellant argued essentially that, although the decision to ignore disfluency in the timing of the challenge response may not be related to technical constraints, the specific implementation according to claim 1 provided a technical effect and was not obvious to the skilled person.
1. The claimed invention
The claimed invention relates to an interactive, computer-implemented training method and system.
1.1 The tasks for which the invention envisages to provide training involve personal interaction with (potential) clients, pupils, patients etc. The training should aim at preparing the trainee to respond appropriately in various situations involving other persons' reactions. An example is a doctor's discussion with a patient about their condition and possible cures. Apart from medical training, the doctor should have the ability to respond to the patient's reactions in an appropriate manner.
1.2 The claimed training method provides a setting in which a trainee is presented with a situation involving another person, represented by a computerized avatar on a display. The trainee is presented with the necessary information regarding the circumstances of the setting, the possible reactions of the "other person" and some key points their response to those reactions should include. Then the interactive session starts and the avatar presents a challenge (e.g. makes a statement) and the trainee has to respond. The trainee's response to the challenge is recorded, timed, and analysed as to its content. A scoring is provided based on the time it took the trainee to respond and the content of the response, e.g. whether it contained the key points provided in advance (see also Figures 2A to 2R of the published application).
1.3 An aspect of the evaluation of the trainee's response is that any initial speech disfluency (e.g. "ah", "um", etc.) is not considered as part of the challenge response and is not taken into account when measuring how much time elapsed until the trainee's response started (see e.g. paragraph [0387] of the published application).
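The timing rule described in point 1.3 can be illustrated with a minimal sketch. It is not taken from the application, which discloses no code. The names `Utterance`, `DISFLUENCIES` and `response_latency` are hypothetical, and the sketch assumes that a speech recogniser has already produced time-stamped text segments.

```python
from dataclasses import dataclass

# Hypothetical list of hesitation words/phrases. Paragraph [0387] of the
# application gives only "um", "ah" and "well, you see" as examples.
DISFLUENCIES = ("um", "ah", "well, you see")


@dataclass
class Utterance:
    start: float  # seconds since the start of the session
    text: str     # text assumed to come from a speech recogniser


def response_latency(challenge_end, utterances, disfluencies=DISFLUENCIES):
    """Seconds from the end of the challenge to the first substantive
    utterance, skipping initial disfluencies. Returns None if the user
    never gives a substantive response."""
    for u in sorted(utterances, key=lambda u: u.start):
        if u.start < challenge_end:
            continue  # spoken before the challenge finished
        normalised = u.text.strip().lower().strip(".,!?")
        if normalised in disfluencies:
            continue  # a disfluency does not count as the start of the response
        return u.start - challenge_end
    return None


session = [
    Utterance(10.5, "Um"),
    Utterance(11.2, "Well, you see,"),
    Utterance(12.0, "HIV used to be fatal."),
]
latency = response_latency(10.0, session)
```

In this sketch the challenge ends at 10.0 s and the two hesitation segments are skipped, so the latency is measured to the segment starting at 12.0 s.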
2. Request for remittal
2.1 Under Article 11 of the Rules of Procedure of the Boards of Appeal 2020 (RPBA 2020), the board shall not remit a case to the department that took the contested decision unless special reasons present themselves for doing so. Fundamental deficiencies in the first instance proceedings constitute, as a rule, such special reasons.
2.2 The appellant requested the remittal of the case to the examining division for further prosecution but did not put forward any reasons why the board should remit the case. It did not point to any possible fundamental deficiencies in the proceedings before the examining division, either.
2.3 The board notes that the sole contentious point relates to inventive step, which has been extensively discussed during both the examination and the appeal procedure. The board is thus in a position to take a final decision on the merits of the case and, therefore, sees no special reasons for a remittal to the examining division, especially since no procedural deficiency in the examination proceedings is apparent.
2.4 Hence, the board, exercising the power conferred by Article 111(1) EPC, rejects the appellant's request for remittal and proceeds to decide the case on its merits.
3. Inventive Step (Article 56 EPC)
3.1 In the impugned decision the examining division held that the claimed method constituted an obvious implementation of a non-technical scheme related to training using notoriously well-known technical means (see especially points 7 and 13 of the Reasons).
3.2 It is common ground that the starting point for the skilled person was a notoriously well-known general purpose computer system with recording, speech recognition and playback capabilities.
The main points of contention were whether the claimed invention produced a technical effect over the prior art and whether any such technical effect (if present) was obtained through the technical features of the claim.
3.3 The appellant referred to a "notional psychologist", by analogy with the "notional business person" used in the recent jurisprudence of the Boards of Appeal (see for example decisions T 1463/11 and T 1658/15). The board found this analogy plausible and adopted the "notional psychologist" (or better "notional pedagogue", since the claimed invention relates to a training/learning method) in the discussion.
3.4 It was common ground that the aspects of the claimed method that related to training and testing of the user related to non-technical aspects of the invention, as it was the notional pedagogue that would design the training method, determine its different aspects/steps and design the corresponding tests. Similarly, the definition of what a disfluency was and the decision to ignore any initial disfluency in the timing of the user's challenge response were also decisions of the pedagogue.
Following the established EPO case law and practice (the so called "Comvik approach"; see Case Law of the Boards of Appeal of the EPO, 9th Edition, July 2019, I.D.9.1.3 b)), they would be given to the skilled person as constraints for implementation in the general purpose, notoriously well-known computer system.
3.5 The appellant argued, however, that the implementation of these non-technical constraints according to the claimed invention produced the technical effect of a less resource-intensive processing by the computer system.
3.5.1 According to the appellant, the claimed invention was based on "voice recognition" and not on "conventional speech recognition".
The relevant passage of paragraph [0387] of the published application, to which the appellant referred, reads as follows (see the last 5 lines on page 91 and the first 3 lines on page 92 of the published application):
The server 110 and/or one or more of the terminals 102, 104 and 106, can include a voice recognition system configured to recognize and convert human speech received via a microphone or otherwise to computer understandable characters. Optionally, the server 110 and/or one or more of the terminals 102, 104, and 106, are configured to determine from such converted human speech when a user has begun speaking (e.g. providing a challenge response), and/or whether the user has provided correct answers (e.g., by comparing a challenge response from the user to reference or model challenge response and/or to corresponding key elements).
According to the appellant, the system used templates with lists of those key elements (see also Figure 2H). When the challenge response of the user was recorded, there was no need for a detailed analysis of the recorded speech, but only a comparison of the recorded sounds ("voice") with the key elements in those templates in order to determine whether the recorded response contained any of them. A conventional speech recognition system would have analysed the recorded challenge response, identified the individual words, and determined whether any of the key elements were part of this response. Compared to such a conventional system, the system of the claimed invention needed only to compare the recorded speech to the key element templates in the memory, without any detailed speech recognition analysis. This implementation was less resource intensive than the conventional one; hence, it produced a technical effect. Moreover, the specific implementation was a decision of the technically skilled person and thus a technical feature. Since there was nothing in the prior art pointing the skilled person to such an implementation, the subject-matter of claim 1 involved an inventive step.
3.6 The board is not convinced by the appellant's arguments.
3.6.1 Firstly, there is nothing in the application that points to the implementation of the speech recognition process as presented by the appellant. There is nothing about lists/templates of key elements and comparing the recorded speech to such templates in order to identify any key elements in the user's response.
The only such lists mentioned in the application relate to the possible disfluencies at the beginning of the user's challenge response. Paragraph [0387], continuing from the passage cited above, explains:
The speech recognition is configured to distinguish between substantive speech and disfluencies (e.g., "um", "ah", etc.). Thus, for example, when measuring the time from the end of a challenge until the user begins responding, the system will not identify a disfluency as the beginning of a response. Other forms of initial "hesitation" speech, such as "well, you see", are also not identified as the beginning of a response. Optionally, a file stored in system memory of words and/or phrases that if uttered before a substantive response, are not identified as the beginning of a response.
The board notes that all these features are mentioned as optional and that the system may use lists of words or phrases that are to be considered as disfluencies and thus ignored in the timing of the user's response. However, there is no corresponding indication that the recognition of any key elements in the challenge response is done in the same way. Hence, the board does not accept the appellant's interpretation of the speech/voice recognition by the claimed invention.
3.6.2 Secondly, even if it were to be accepted that such lists/templates were used in identifying key elements in the challenge response, the board is not persuaded that such an implementation would have been a decision by the technically skilled person.
3.6.3 It is considered common ground that in "conventional" speech recognition, the system parses the recorded speech and essentially compares it to pre-stored lists of words/phrases, the meaning of which is known, in order to reach a conclusion as to the content of the recorded speech. So, in the board's understanding, such "conventional" speech recognition is also implemented by using stored lists of words/expressions in order to identify the content of a recorded phrase.
3.6.4 According to the appellant, the claimed method was using stored templates with the predetermined key elements sought in a user's challenge response.
Taking the example of Figure 2H, one key element in the expected challenge response is "Old days, HIV was a death sentence". In the board's understanding, the user/trainee could express this key element in several ways, besides the one given, for example:
- In the past, people used to die from HIV.
- HIV used to be fatal.
- Until some years ago, HIV meant death.
- etc.
In the board's view, a possible template should cover all possible formulations that relate to the same underlying concepts of the key elements, i.e. that "in the past" - "HIV" - "was fatal", since a user can express these concepts in many different ways. In such a case, there would be no difference from the "conventional speech recognition" described above. Alternatively, as the appellant also explained during the oral proceedings, the notional pedagogue could choose a more restrictive approach and consider that a key element is included in the challenge response only if there is a word-for-word match with the text presented to the user during the training phase. Naturally, any approach between these two extremes is also possible, i.e. the pedagogue would decide how restrictive the evaluation of the challenge response will be.
3.6.5 The selection of which words/expressions to include in the templates used to determine whether a challenge response contains a key element would therefore be a decision of the notional pedagogue, who would determine what is to be considered as a formulation of the key element(s) sought. Hence, how long or short these templates/lists would be is not a decision of the technically skilled person but of the pedagogue. As the size of the templates would have a direct influence on how much computational effort would be needed in the speech recognition process, the board takes the view that any savings of computational effort are a direct result of the pedagogue's decision of what to include in the templates and not of a decision related to the specific implementation taken by the technically skilled person. The technically skilled person would be given the predetermined templates to implement in the same way that they would be given the list of expressions to be considered as disfluencies.
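The dependence of the computational effort on the size of the templates can be illustrated with a small sketch. It is an illustration only and does not reflect any implementation disclosed in the application. The names `normalise`, `contains_key_element`, `STRICT` and `LENIENT` are hypothetical, matching is done by plain substring comparison on text assumed to have been recognised already, and the number of comparisons serves as a crude stand-in for computational effort.

```python
def normalise(text):
    """Lower-case the text and drop commas, full stops and extra spaces."""
    return " ".join(text.lower().replace(",", " ").replace(".", " ").split())


def contains_key_element(response, accepted_formulations):
    """Return (found, comparisons): whether any accepted formulation occurs
    in the response, and how many formulations were compared."""
    haystack = normalise(response)
    comparisons = 0
    for formulation in accepted_formulations:
        comparisons += 1
        if normalise(formulation) in haystack:
            return True, comparisons
    return False, comparisons


# Word-for-word template: only the text shown to the user in Figure 2H.
STRICT = ["Old days, HIV was a death sentence"]

# Broader template: the alternative formulations listed in point 3.6.4.
LENIENT = STRICT + [
    "In the past, people used to die from HIV",
    "HIV used to be fatal",
    "Until some years ago, HIV meant death",
]

strict_result = contains_key_element("Well, HIV used to be fatal.", STRICT)
lenient_result = contains_key_element("Well, HIV used to be fatal.", LENIENT)
```

Under these assumptions the same matching routine serves both templates. Only the list handed to it changes, and with it the number of comparisons and the outcome of the evaluation.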
3.6.6 Summarising, and following the appellant's interpretation of how the speech recognition in the claimed method is carried out, the board does not see any difference between the way this is done in the claimed method and in the prior art: recorded speech is essentially compared to stored lists/templates of words/expressions to determine its content. Any difference in the required computational effort that may exist depends on the length/size of these pre-stored lists/templates. The size of such lists/templates used in the claimed method depends on non-technical aspects, such as the selection of accepted formulations of the sought key elements, a selection which is done by the notional pedagogue (non-technically skilled person). In the board's view, such a difference relates to a modification of the pedagogical (administrative) scheme underlying the claimed computer-implemented method rather than to any technical means used for implementing the method. The technical problem of saving computational effort is thus not solved but rather circumvented.
3.6.7 According to established EPO case law and practice, method steps consisting of modifications to a business scheme and aimed at circumventing a technical problem rather than solving it by technical means cannot contribute to the technical character of the subject-matter claimed (see T 0258/03, "Auction method/HITACHI", OJ 2004, 575, Headnote II and point 5.7 of the reasons).
3.7 Adhering to this case law, the board concludes that the subject-matter of claim 1 does not involve an inventive step (Articles 52(1) and 56 EPC).
4. Since the appellant's request for remittal is rejected and its sole substantive request does not meet the requirements of the EPC, the appeal must fail (Articles 97(2) and 111(1) EPC).
For these reasons it is decided that:
The appeal is dismissed.