T 2804/19 (Data transformation/ORACLE) 18-11-2021
Declarative language and visualization system for recommended data transformations and repairs
I. The appellant (applicant) appealed against the decision of the examining division refusing European patent application No. 15781486.4, published as international application WO 2016/049460.
II. The examining division decided that the subject-matter of claim 1 of the main request and of auxiliary requests 1, 2 and 3 lacked inventive step over a notoriously known "commonplace general purpose computer carrying out method steps, inputting, outputting and processing information" and that claim 1 of auxiliary requests 1 to 3 did not comply with Articles 84 and 123(2) EPC.
III. In its statement of grounds of appeal, the appellant maintained the main request and auxiliary requests 1, 2 and 3 considered in the contested decision and filed new auxiliary requests 2a and 3a.
IV. In a communication accompanying the summons to oral proceedings, the board expressed the preliminary view that the subject-matter of claim 1 of all requests lacked inventive step over a notorious general-purpose computer and in view of well-known graphical user interfaces. It also raised objections under Articles 84 and 123(2) EPC.
V. With its submissions in preparation for the oral proceedings, the appellant replaced its requests with an amended main request, amended auxiliary requests 1, 2, 2a, 3 and 3a, and new auxiliary requests 4, 4a, B, B-1, B-2, B-2a, B-3, B-3a, B-4 and B-4a.
VI. The appellant was heard on the relevant issues during oral proceedings held on 18 November 2021. At the end of the oral proceedings, the Chair announced the board's decision.
VII. The appellant's final requests were that the decision under appeal be set aside and that a patent be granted on the basis of the main request or, in the alternative, of one of the auxiliary requests 1, 2, 2a, 3, 3a, 4, 4a, B, B-1, B-2, B-2a, B-3, B-3a, B-4 and B-4a.
VIII. Claim 1 of the main request reads as follows:
"A method comprising:
identifying, by a computer system, a pattern in data from one or more data sources, wherein the pattern is defined by a type of data;
comparing the pattern to each of a plurality of entities in entity information obtained from a knowledge source, wherein the plurality of entities includes a first entity and a second entity, wherein the first entity is associated with a first pattern defined by a first type of data, and wherein the second entity is associated with a second pattern defined by a second type of data;
matching, based on comparing the pattern to each of the plurality of entities, the pattern in the data to the first pattern associated with the first entity in the entity information, wherein matching the pattern to the first pattern includes matching the type of data defined by the pattern to the first type of data defined by the first pattern;
generating one or more transformation scripts for the data based on the entity information corresponding to the first entity associated with the first pattern matching the pattern in the data, wherein a transformation script includes code executable to transform the data;
generating one or more indicators of a recommended data transform to perform on the data, where the one or more indicators correspond to the one or more transformation scripts;
causing the one or more indicators to be displayed in a graphical user interface;
receiving one or more transformation instructions based on input from the graphical user interface, the input corresponding to a selection of an indicator of the one or more indicators;
transforming the data by executing a transformation script corresponding to the indicator; and
publishing the transformed data to one or more data target sources;
wherein the step of identifying a pattern in data comprises using a set of regular expressions to determine the structure of that data, wherein the regular expressions are defined by semantic or syntactic constraints."
IX. Claim 1 of auxiliary request 1 differs from claim 1 of the main request in that the following text has been added at the end of the claim:
"wherein the entity information is obtained from the knowledge source based on the determined structure of the data."
X. Claim 1 of auxiliary request 2 differs from claim 1 of auxiliary request 1 in that the following text has been added at the end of the claim:
"wherein transforming the data comprises repairing the data."
XI. Claim 1 of auxiliary request 3 differs from claim 1 of auxiliary request 2 in that "generating one or more transformation scripts for the data based on the entity information" has been replaced with the following text:
"analysing the data by normalizing the data into a tabular format having one or more columns of data, identifying the types of data stored in the columns of the normalized data, and identifying information about how the data is stored in the columns of the normalized data;
generating metadata based on said analysis of the data;
generating one or more transformation scripts for the data based on said metadata and on the entity information".
XII. Claim 1 of auxiliary request 4 differs from claim 1 of auxiliary request 2 in that the following text has been inserted before "identifying, by a computer system":
"receiving data from one or more data sources;
normalizing the data into a tabular format having one or more columns of data;"
and in that "generating one or more transformation scripts for the data based on the entity information" has been replaced with the following text:
"analysing the normalized data by, [sic] identifying the types of data stored in the columns of the normalized data and identifying information about how the data is stored in the columns of the normalized data;
generating metadata based on said analysis of the data;
generating one or more transformation scripts for the data based on said metadata and on the entity information".
XIII. Claim 1 of auxiliary requests 2a, 3a and 4a differs from claim 1 of auxiliary requests 2, 3 and 4 in that the text "wherein the entity information is obtained from the knowledge source based on the determined structure of the data;" has been removed. In addition, in auxiliary request 4a the comma after "analysing the normalized data by" has been removed.
XIV. Claim 1 of auxiliary requests B, B-1, B-2, B-2a, B-3, B-3a, B-4 and B-4a differs from claim 1 of the main request and auxiliary requests 1, 2, 2a, 3, 3a, 4 and 4a, respectively, in that "A method comprising:" has been replaced with the following text:
"A method of data enrichment for subsequent indexing and clustering, the method comprising:".
XV. The appellant's arguments, where relevant to the decision, are discussed in detail below.
1. The application
The application relates to the "enrichment" of data.
First, data is retrieved from a data source and analysed for the presence of certain patterns (see paragraph [0059] of the published application). One such pattern could be XXX-XX-XXXX, indicating a US social security number (paragraph [0068]).
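Purely by way of illustration, pattern identification of the kind described in paragraph [0068] might be sketched as follows. This is a minimal sketch, not taken from the application: the function name `identify_pattern`, the pattern labels and the specific regular expressions are assumptions.

```python
import re

# Hypothetical pattern table (label -> regular expression); assumed for illustration,
# not taken from the application.
PATTERNS = {
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),    # XXX-XX-XXXX
    "iso_date": re.compile(r"\d{4}-\d{2}-\d{2}"),  # YYYY-MM-DD
}

def identify_pattern(values):
    """Return the label of the first pattern that every value fully matches, else None."""
    for label, regex in PATTERNS.items():
        if values and all(regex.fullmatch(v) for v in values):
            return label
    return None
```

In this sketch a column is assigned a pattern only if all of its values match, which is one possible design choice among several.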
On the basis of the matched patterns, data transform recommendations and corresponding transform scripts are generated (paragraphs [0074] and [0075]). Such a data transform may "repair" the data, for example by converting date formats (paragraph [0076]).
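A "repair" of the date-format kind mentioned in paragraph [0076] could, for example, look like the following. The sketch is hypothetical: the function name `repair_date`, the assumed source format (month/day/year) and the ISO 8601 target format are assumptions and do not come from the application.

```python
from datetime import datetime

def repair_date(value, source_format="%m/%d/%Y"):
    """Convert a date string to ISO 8601 (YYYY-MM-DD).

    The source format is an assumption; values that do not parse are returned unchanged.
    """
    try:
        return datetime.strptime(value, source_format).date().isoformat()
    except ValueError:
        return value
```

For example, `repair_date("11/18/2021")` returns `"2021-11-18"`, while a value that is not a date in the assumed format is left as it is.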
The recommendations/transform scripts are displayed to the user for selection, and the transform script selected by the user is run (paragraph [0078]).
A knowledge service may be queried for additional data with which to further "enrich" the data set, for example with location, state, population and country information if the data set includes a column of city names (paragraph [0079]).
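The enrichment step can be pictured, under stated assumptions, as a simple lookup. In the sketch below the dictionary `KNOWLEDGE` is a hypothetical in-memory stand-in for the knowledge service, the function name `enrich_rows` and the column name `city` are assumptions, and only state and country are shown.

```python
# Hypothetical in-memory stand-in for a knowledge service (assumed for illustration).
KNOWLEDGE = {
    "Boston": {"state": "Massachusetts", "country": "United States"},
    "Austin": {"state": "Texas", "country": "United States"},
}

def enrich_rows(rows, city_column="city"):
    """Return copies of the rows with any attributes known for the city added."""
    enriched = []
    for row in rows:
        extra = KNOWLEDGE.get(row.get(city_column), {})
        enriched.append({**row, **extra})  # unknown cities are left unchanged
    return enriched
```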
2. Admission of the main request and the auxiliary requests - Article 13(1) and (2) RPBA 2020
2.1 The main request and auxiliary requests 1, 2 and 3 correspond to the requests considered in the contested decision and maintained in the statement of grounds of appeal with an amendment aimed at overcoming an objection under Article 123(2) EPC raised for the first time in the board's communication.
2.2 Auxiliary requests 2a and 3a correspond to auxiliary requests 2 and 3 with an amendment removing a feature objected to under Article 123(2) EPC in the contested decision.
2.3 Auxiliary requests 4 and 4a correspond to auxiliary requests 3 and 3a with an amendment aimed at overcoming an objection under Article 123(2) EPC raised for the first time in the board's communication.
2.4 Auxiliary requests B, B-1, B-2, B-2a, B-3, B-3a, B-4 and B-4a correspond to the main request and auxiliary requests 1, 2, 2a, 3, 3a, 4 and 4a with an amendment responding to an argument that was part of the inventive-step reasoning in the board's communication.
2.5 Hence, there are circumstances and reasons that, at least prima facie, may justify the admission of each request at this stage of the appeal proceedings. Moreover, since the amendments made are relatively straightforward and pose no excessive difficulties, the board considers that it would be contrary to procedural efficiency to investigate in detail for each request whether the applicable circumstances and reasons might be insufficiently exceptional and cogent to meet the requirements of Article 13(2) RPBA 2020. Instead, the board admits all requests into the appeal proceedings.
Main request
3. Inventive step
3.1 Claim 1 relates to a method that is at least partially carried out by a computer system.
The method includes steps which identify a "pattern" in data obtained from one or more data sources using regular expressions, compare the pattern with each of a plurality of "entities" obtained from a knowledge source, and match the pattern to a first pattern associated with a first entity. The method then generates one or more transformation scripts based on information "corresponding to the first entity".
For each transformation script, an indicator is displayed in a graphical user interface for selection by the user. The transformation script corresponding to the selected indicator is executed to transform the data. The transformed data is published to one or more "data target sources".
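The sequence of steps summarised above may be easier to follow with a small sketch. It is one conceivable reading of the claim wording, not the applicant's implementation: the entity table `ENTITIES`, all function names and the use of plain Python functions as "transformation scripts" are assumptions made for illustration.

```python
import re

# Hypothetical entity information from a knowledge source. Each entity carries a
# pattern (a regular expression) and the transformations it suggests.
ENTITIES = {
    "us_phone": {
        "pattern": re.compile(r"\(\d{3}\) \d{3}-\d{4}"),
        "transforms": {"digits only": lambda v: re.sub(r"\D", "", v)},
    },
    "us_ssn": {
        "pattern": re.compile(r"\d{3}-\d{2}-\d{4}"),
        "transforms": {"mask all but last four": lambda v: "XXX-XX-" + v[-4:]},
    },
}

def match_entity(values):
    """Compare the data with each entity's pattern; return the first entity matching all values."""
    for name, info in ENTITIES.items():
        if values and all(info["pattern"].fullmatch(v) for v in values):
            return name
    return None

def recommend(values):
    """Map indicators (labels shown to the user) to scripts for the matched entity."""
    entity = match_entity(values)
    return dict(ENTITIES[entity]["transforms"]) if entity else {}

def apply_selection(values, scripts, selected_indicator):
    """Execute the script behind the indicator the user selected."""
    script = scripts[selected_indicator]
    return [script(v) for v in values]

def publish(values, target):
    """Stand-in for publishing: append the transformed values to a target list."""
    target.extend(values)
    return target
```

With `data = ["123-45-6789"]`, `recommend(data)` offers the single indicator "mask all but last four", and `apply_selection(data, recommend(data), "mask all but last four")` yields `["XXX-XX-6789"]`.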
3.2 The subject-matter of claim 1 includes the following potentially technical aspects:
(a) the display of indicators in a graphical user interface to allow the user to choose from one of a number of tasks by selecting the indicator corresponding to the task; and
(b) retrieving and outputting data from and to data sources.
3.3 As for (a), graphical user interfaces with menu systems allowing a user to select one of a number of tasks by selecting a corresponding indicator were well known in the art at the priority date of the application.
3.4 As for (b), the examining division took the view - not necessarily without merit - that the claim expressions "from one or more data sources" and "obtained from a knowledge source" do not imply, as limitations of the scope of claim 1, steps of retrieving data from one or more data sources and retrieving/obtaining the entity information from a knowledge source.
However, this question of claim interpretation is not decisive, since retrieving and outputting data from and to data sources were in any event commonplace features of computer programs.
3.5 The remaining features of claim 1 relate to the manner in which the data obtained from data sources is processed and transformed.
Since these claim features leave the kind of transformation being applied to the data completely open, no technical effect can be recognised in the result of this data processing.
Moreover, in the board's judgment these features do not contribute to any technical effect related to their implementation on a computer system. In particular, they do not reflect any technical considerations concerning the computer's internal functioning which go beyond "merely" finding a computer algorithm to carry out some procedure (see opinion G 3/08, OJ EPO 2011, 10, Reasons 13.5 and 13.5.1; and decision G 1/19, OJ EPO 2021, A77, point 112).
Nor is there any unexpected technical interaction with the above-mentioned well-known technical aspects of the claim. In particular, the display of user-selectable indicators merely allows the user to select which non-technical data transformation to apply.
3.6 The appellant submitted that the data transformation of claim 1 achieved a technical effect over its whole scope in that it inevitably improved subsequent data indexing and clustering.
However, since the claim does not define the kind of data transformation that is applied, it cannot be seen what kind of technical improvement in "subsequent data indexing and clustering" the transformation achieves over the whole claimed scope. For this reason alone, the argument does not succeed.
3.7 The appellant also submitted that the claimed method is made more efficient by the use of regular expressions, as this reduced the load on system resources.
However, a mere speed comparison with a conceivable reference method that does not use regular expressions is not a suitable criterion for distinguishing between technical and non-technical procedural steps (see decision T 1227/05, OJ EPO 2007, 574, Reasons 3.2.5).
In the board's view, the claimed use of regular expressions is based not on any technical considerations aimed at improving the speed of the program, for example relating to the internal functioning of a computer, but on considerations that are purely algorithmic in nature, i.e. relating to computer programs as such. It therefore does not contribute to a technical effect for the purpose of assessing inventive step.
3.8 Hence, the subject-matter of claim 1 of the main request lacks inventive step over a notorious general-purpose computer.
Auxiliary request 1
4. Inventive step
4.1 Claim 1 of auxiliary request 1 adds to claim 1 of the main request that the entity information is obtained from the knowledge source "based on the determined structure of the data".
4.2 The appellant argued that the added feature made the queries to the knowledge service more focused and thereby contributed to reducing system load.
In the board's view this argument cannot succeed for essentially the same reasons as given in point 3.7 above. A change to a non-technical algorithm does not make a technical contribution merely because it results in a decreased or increased or different use of system resources. Rather, such a change should be based on technical (non-algorithmic) considerations aimed at achieving a technical effect.
Moreover, since the claim leaves so many details undefined, the board cannot accept that the added feature reduces system load over the whole scope of the claim.
4.3 Hence, the subject-matter of claim 1 of auxiliary request 1 lacks inventive step (Article 56 EPC).
Auxiliary requests 2 and 2a
5. Inventive step
5.1 Claim 1 of auxiliary request 2 adds to claim 1 of auxiliary request 1 that transforming the data comprises "repairing" the data.
Claim 1 of auxiliary request 2a adds the same feature to claim 1 of the main request.
5.2 The appellant argued that repair of data enabled data indexing and clustering. If there were errors in the data, it would not be possible to correctly determine which items of data had to be stored in neighbouring regions of the hard disk. Repairing the data ensured that this was not an issue.
Since the claim defines neither the kind of repair that is performed nor the kind of indexing or clustering that subsequently is to be carried out, the board cannot agree that the added feature contributes to a technical improvement in data indexing or clustering over the whole scope of the claim. There is no reason why the (unspecified) data indexing or clustering process would work any differently, let alone more efficiently, if carried out on "repaired" data.
5.3 Hence, the subject-matter of claim 1 of auxiliary requests 2 and 2a lacks inventive step (Article 56 EPC).
Auxiliary requests 3 and 3a
6. Inventive step
6.1 Claim 1 of auxiliary request 3 adds to claim 1 of auxiliary request 2 the following features:
- analysing the data by
- normalising the data into a tabular format having one or more columns of data,
- identifying the types of data stored in the columns of the normalised data, and
- identifying information about how the data is stored in the columns of the normalised data;
- generating metadata based on said analysis of the data.
The claim further specifies that generating one or more transformation scripts is based on "said metadata" as well as on the entity information.
Claim 1 of auxiliary request 3a adds the same features to claim 1 of auxiliary request 2a.
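The added analysis steps (normalising into columns, identifying column types, generating metadata) can be illustrated by the following minimal sketch. It rests on assumptions not found in the claim: input records are dictionaries whose values are strings, missing values are represented by `None`, and the names `normalize`, `infer_type` and `generate_metadata` are hypothetical.

```python
def normalize(records):
    """Turn a list of dicts with possibly differing keys into columns of equal length."""
    columns = sorted({key for record in records for key in record})
    return {col: [record.get(col) for record in records] for col in columns}

def infer_type(values):
    """Crude type guess for a column of strings: 'integer', 'float', 'string' or 'unknown'."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"
    for label, cast in (("integer", int), ("float", float)):
        try:
            for v in present:
                cast(v)  # raises ValueError if the string is not of this type
            return label
        except (TypeError, ValueError):
            continue
    return "string"

def generate_metadata(table):
    """Record, per column, the guessed type and the number of missing values."""
    return {
        col: {"type": infer_type(vals), "missing": sum(v is None for v in vals)}
        for col, vals in table.items()
    }
```

The metadata produced here (type and number of missing values per column) is only one example of "information about how the data is stored"; the claim itself leaves this open.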
6.2 The appellant argued that data clustering was based on storing data that was likely to be accessed at the same time in the same region of a hard disk. The type of data being stored, along with other metadata, was clearly relevant to data indexing and clustering. The metadata was therefore relevant to how data was transformed.
In the board's view, the features added to claim 1 make no technical contribution. The appellant's argument that the metadata is somehow "relevant" to how data is transformed (or repaired) does not change the fact that claim 1 leaves the kind of repair and the kind of subsequent data indexing or clustering undefined. One feature being "relevant" to another feature cannot be equated with technicality.
6.3 Hence, the subject-matter of claim 1 of auxiliary requests 3 and 3a lacks inventive step (Article 56 EPC).
Auxiliary requests 4 and 4a
7. Inventive step
7.1 Claim 1 of auxiliary request 4 adds to claim 1 of auxiliary request 2 essentially the same features that were added in claim 1 of auxiliary request 3 (but arranged slightly differently to overcome an objection related to added subject-matter) and adds an explicit step of "receiving data from one or more data sources".
Claim 1 of auxiliary request 4a adds the same features to claim 1 of auxiliary request 2a.
7.2 Since the added step of "receiving data from one or more data sources" does not change the board's reasoning (see point 3.4 above), and since the same applies to the re-arrangement of the other features added to claim 1 of auxiliary requests 2 and 2a as compared with auxiliary requests 3 and 3a, the subject-matter of claim 1 of auxiliary requests 4 and 4a also lacks inventive step (Article 56 EPC).
Auxiliary requests B, B-1, B-2, B-2a, B-3, B-3a, B-4 and B-4a
8. Inventive step
8.1 Claim 1 of auxiliary requests B, B-1, B-2, B-2a, B-3, B-3a, B-4 and B-4a adds to claim 1 of the main request and auxiliary requests 1, 2, 2a, 3, 3a, 4 and 4a a feature specifying that the method is a method "of data enrichment for subsequent indexing and clustering".
8.2 Although it is debatable whether this added feature limits the claimed method to a method of data enrichment which includes a step of subsequent indexing and clustering or merely to a method of data enrichment which is suitable for subsequent indexing and clustering, for the purpose of assessing inventive step the board accepts the former, narrower reading.
8.3 It is undisputed that methods of data indexing and data clustering were well known in the art at the priority date. For the reasons given in points 3 to 7 above, the board considers that the subject-matter of claim 1 of the main request and auxiliary requests 1, 2, 2a, 3, 3a, 4 and 4a does not lead to a technical improvement in a subsequent step of indexing or clustering. Hence, the subject-matter of claim 1 of auxiliary requests B, B-1, B-2, B-2a, B-3, B-3a, B-4 and B-4a does not go beyond a mere juxtaposition of the obvious method of claim 1 of the main request and auxiliary requests 1, 2, 2a, 3, 3a, 4 and 4a and known steps of data indexing and clustering.
8.4 The subject-matter of claim 1 of auxiliary requests B, B-1, B-2, B-2a, B-3, B-3a, B-4 and B-4a therefore lacks inventive step (Article 56 EPC).
9. Conclusion
Since none of the requests on file is allowable, the appeal is to be dismissed.
For these reasons it is decided that:
The appeal is dismissed.