T 0309/10 (Archival and retrieval/MULTEX) 19-06-2013
A METHOD AND SYSTEM FOR REFERENCING, ARCHIVING AND RETRIEVING SYMBOLICALLY LINKED INFORMATION
I. The appeal is against the Examining Division's decision to refuse European patent application 00932653.9. The decision was according to the state of the file and referred to objections formulated in the communication accompanying the summons to oral proceedings: lack of clarity, and lack of inventive step in the light of document EP-A-0 434587 (D2).
II. In the statement setting out its grounds of appeal, the appellant requested that the Examining Division's decision be set aside and that a patent be granted on the basis of the newly submitted main request, or else of newly submitted first or second auxiliary requests. The appellant also requested that oral proceedings be held, if the Board were inclined to refuse any of those requests.
III. The Board arranged for oral proceedings to be held and summoned the appellant accordingly. In an accompanying communication, the Board set out its preliminary view.
IV. At the oral proceedings, the appellant stated its final requests: that the Examining Division's decision be set aside and that a patent be granted on the basis of the main request, or else of the first, second, or third auxiliary requests, all filed during the oral proceedings.
V. Claim 1 according to the main request reads as follows.
A document repository system allowing electronic storing and referencing of documents comprising:
a storage device;
a network interface;
a processor coupled to the storage device, said processor adapted to:
process a symbol to generate a master symbol (115) including applying a set of normalisation rules to the symbol;
assign a parent identifier (110) to the master symbol (115), wherein the parent identifier (110) is assignable to a plurality of master symbols (115);
store the parent identifier (110) and the master symbol (115) in a master symbol database wherein the master symbol is linked to the parent identifier (110);
store at least one document wherein the at least one document is linked to the parent identifier comprising the steps of:
generating a document identifier;
storing the document identifier and the parent identifier (110) so that the parent identifier is linked to the document identifier in a relational database; and
storing the document and the document identifier so that the document identifier is linked to the document;
receive an input symbol which contains symbol segments (120);
process the input symbol to generate a normalized symbol including:
applying a set of normalisation rules to the input symbol; and
searching a normalization table database (417) of sets of symbols each set relating to a master symbol using the symbol segments to return corresponding master symbol segments (1527);
search the master symbol database using the normalized symbol to find a matching master symbol and a parent identifier linked to the matching master symbol;
search an information element database to find a document linked to the parent identifier; and
retrieve the document linked to the parent identifier.
VI. Claim 1 according to the first auxiliary request reads identically, except for the emphasized passages below.
A document repository system ... comprising:
a processor ... adapted to:
process a symbol to generate a master symbol (115) including applying a set of normalisation rules to the symbol, the master symbol having a number of symbol segments defined by a symbol template, wherein each symbol segment comprises a text string;
assign a parent identifier ...
process the input symbol to generate a normalized symbol including:
applying a set of normalisation rules to the input symbol;
if determined that the normalized symbol contains all symbol segments of the symbol template then
searching a normalization table database ....
VII. Claim 1 according to the second auxiliary request reads identically to that according to the main request, except as shown below.
A document repository system ... comprising:
a processor ... adapted to:
process a symbol to generate a master symbol (115) including applying a set of normalisation rules to the symbol, the master symbol having a number of symbol segments defined by a symbol template, wherein each symbol segment comprises a text string, further wherein the symbol template comprises a number of symbol fields, and further wherein the master symbol is structured according the symbol template;
assign a parent identifier ...
store with a document contributor identifier a predominant use record having the same number of symbol fields as the symbol template, wherein the predominant use record includes a symbol segment most frequently submitted by a document contributor associated with the document contributor identifier, the symbol segment corresponding to a parent identifier (110);
store the parent identifier ...
store at least one document submitted by the document contributor, wherein the at least one document is linked to the parent identifier, comprising the steps of:
generating a document identifier;
VIII. Claim 1 according to the third auxiliary request reads identically to that according to the main request, except as shown below.
A document repository system ... comprising:
a processor ... adapted to:
process a symbol to generate a master symbol (115) including applying a set of normalisation rules to the symbol, the master symbol having a number of symbol segments defined by a symbol template, wherein each symbol segment comprises a text string;
assign a parent identifier ...
process the input symbol to generate a normalized symbol including:
applying a set of normalisation rules to the input symbol;
determine if the normalized symbol contains all symbol segments of the symbol template;
if the normalized symbol does not contain all symbol segments of the symbol template then retrieving client preference defaults from a client database (470) to replace missing symbol segments;
searching a normalization table database ....
IX. The appellant's arguments can be summarised as follows.
The invention made the retrieval of relevant documents easier and more accurate. It saved time and resources. A solution to a similar problem was found to involve an inventive step in T 0654/10, Searchable message storage system/J2 GLOBAL COMMUNICATIONS (not published in the OJ EPO).
The invention could not be viewed as the automation of a mental act. A human, for example a librarian, would not, or could not, maintain the information required in his head. Nor could the invention be seen as the automation of a known method, because the method was not known.
Implementation using a computer system would allow the repository to grow larger than without.
Normalisation was well known, and the skilled person understood the meaning of the term.
The use of a template, defined in claim 1 according to the auxiliary requests, allowed for filtering. It was better to obtain no results than wrong results.
The use of default values, defined in claim 1 according to the third auxiliary request, addressed the problem of allowing a machine to cope with missing data, or with data in different formats.
The background of the invention
1. The invention is concerned with the archival and retrieval of documents. The claims are broadly drafted to cover any sort of document, but the sole example provided in the application concerns financial documents. It is convenient to outline the invention in terms of that example, because the background to the invention lies in the way such documents were, according to the application, commonly archived and accessed.
2. As explained in the application (see "BACKGROUND INFORMATION"), companies issue various securities, but there are different ways of referring to them in different parts of the world. For example, "T" might refer to AT&T in the US, but to Telos in Canada. Vendors of financial information sometimes use a two-part name for securities or companies, for example T.US, IBM@GB, IB.EG. These are not used in a consistent way: IBM@GB and IB.EG might both be used to refer to the same company, T.US might refer to an AT&T security, T@US might refer to a security issued by a different company. All that creates problems with archival and retrieval. It is difficult to locate all documents, and only those documents, that relate to a particular company.
3. That is the problem the invention sets out to solve. The solution involves the use of a table of symbols in some normal form. These are called "master symbols". The normal form might, for example, use only upper-case letters, and "@" rather than any other delimiting character. The normal form corresponding to "t.us" might then be "T@US". For each master symbol in the database, there is an associated "parent identifier". The latter is a symbol that uniquely identifies a company. An example would be "T@US" associated with "AT&T". It is the function of this database to keep track of which master symbols correspond to which parent identifiers.
4. The database of master symbols is used when documents are archived and when they are retrieved. When archiving, whatever symbol a contributor indicates should be associated with a document (e.g. "ib.EG") is put into normal form (e.g. "IBM@GB"), and the document is stored with the corresponding parent identifier (e.g. "IBM"). When retrieving, a user enters a search term (e.g. "t.CA"), which is normalised (e.g. to "T@CA"), and then documents assigned the corresponding parent identifier (e.g. "Telos") are retrieved.
The core method
5. Claim 1 according to each request defines a document repository system. The system comprises a number of technical features, for example "a storage device", "a network interface", and "a processor". The bulk of the claim is the method that the processor is adapted to carry out. This method is slightly different from request to request. However, there is a core method, common to them all, and it is useful to set that out at the start.
6. The first part of the method sets up a table of master symbols: a symbol is normalised, the result being a master symbol; a parent identifier is assigned to it, and the association is stored. The same parent identifier can be assigned to more than one master symbol.
7. The second part of the method creates a database of documents, and a table that links documents with parent identifiers: a document identifier is generated; the document identifier is stored with the associated parent identifier; the document is stored together with its parent identifier.
8. The final part of the method retrieves documents from the database: a symbol is normalised; the table of master symbols is consulted; the corresponding parent identifier is retrieved; the database is searched; and a document with the parent identifier is retrieved.
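By way of illustration only, the three parts of the core method set out in points 6 to 8 might be sketched in Python as follows. The class and function names, the in-memory dictionaries, and the particular normal form (upper case, with "@" as the only delimiter, as in the example in point 3) are assumptions made for this sketch; they are not taken from the application or the claims. The sketch also omits the normalisation table, so it would not map "ib.EG" to "IBM@GB".

```python
import itertools


def normalise(symbol):
    """Assumed normal form: upper-case letters, '@' as the only delimiter."""
    return symbol.strip().upper().replace(".", "@")


class Repository:
    """Hypothetical in-memory stand-in for the claimed tables and database."""

    def __init__(self):
        self.master_symbols = {}  # master symbol -> parent identifier
        self.doc_parents = {}     # document identifier -> parent identifier
        self.documents = {}       # document identifier -> document
        self._next_id = itertools.count(1)

    def register_symbol(self, symbol, parent_id):
        # Part 1: normalise a symbol and store it with its parent identifier.
        master = normalise(symbol)
        self.master_symbols[master] = parent_id
        return master

    def archive(self, symbol, document):
        # Part 2: generate a document identifier and link it to the parent
        # identifier and to the document itself.
        parent_id = self.master_symbols[normalise(symbol)]
        doc_id = next(self._next_id)
        self.doc_parents[doc_id] = parent_id
        self.documents[doc_id] = document
        return doc_id

    def retrieve(self, symbol):
        # Part 3: normalise the input, look up the parent identifier, and
        # return every document linked to it.
        parent_id = self.master_symbols.get(normalise(symbol))
        if parent_id is None:
            return []
        return [self.documents[doc_id]
                for doc_id, linked in self.doc_parents.items()
                if linked == parent_id]
```

In this sketch, a document archived under "T.US" would be found by a search for "t@us", since both normalise to "T@US" and hence lead to the same parent identifier.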
9. The core method as just set out could well be performed without the technical aid of a computer. One readily imagines a librarian creating an index. He would not include all the variants of each entry (e.g. "IBM", "iBM", "IbM", and so on) but choose one representative form. He might provide a unique identifier for each book (or just use the ISBN), a list of authors linked to book identifiers, and a list of index entries linked to book identifiers. When a user of the library asks for books about, say, "International Business Machines", the librarian might look up "IBM", discover that the books are indexed under "IBM - COMPANY" and so find a list of the required books. There would be nothing technical in what the librarian would be doing. He would simply be a good administrator, solving the non-technical problem of storing and locating books.
10. Thus, the Board considers that the core method is not technical. According to the established jurisprudence of the Boards of Appeal, non-technical features do not contribute to inventive step, and can appear in the formulation of the technical problem, when inventive step is at issue (T 641/00 Two identities/COMVIK, OJ EPO 2003, 352). That means that the novelty or obviousness of the core method is not an issue, and that the technically skilled person can be considered as being faced with the technical task of implementing it using a computer system.
The main request
11. In the system defined by claim 1, the processor is adapted to carry out the steps of the core method, set out above.
12. The invention defined in claim 1 does not involve an inventive step (Article 56 EPC 1973) if, when faced with the technical problem of implementing the core method, the provision of a storage device, a network interface, and a suitably adapted processor would have been obvious. Here, "suitably adapted" means that the table of master symbols, the table linking document identifiers and parent identifiers, and the database of documents must be electronically stored; and it must be the processor that carries out the steps, including normalisation.
13. In the Board's view, any computer implementation must make use of storage, and whatever carries out the method steps, which the skilled person seeks to implement, can be called a processor. Thus, inevitably, there would be storage, and all the steps of the core method would be performed by a processor, which must be adapted to perform them.
14. The only options open to the skilled person would be the provision of a network interface, and the choice of storing the tables and database in electronic or magnetic storage (the claim does not specify the form of storage, but it can be taken, implicitly, as normal computer storage). In the Board's view, in choosing to implement the core method on a general purpose computer, both of those optional features would have been provided. Since general purpose computers were known for their ability to store and process data reliably and quickly, their use would have been obvious.
15. The appellant has argued that the invention makes the retrieval of relevant documents easier and more accurate. That argument bears on the non-technical problem of librarianship. The same advantage accrues to any library, regardless of its technological substrate; regardless, indeed, of whether or not there is a technological substrate at all. The Board does consider that retrieval and accuracy might, in some circumstances, be technical issues. The Board in T 0654/10 evidently considered them to be so in that case. However, the Board, in the present case, has reasons (set out above) for considering them to be non-technical in this case.
16. The argument that a librarian would not, or could not, maintain the information required in his head, so that the invention does not amount to the automation of a mental act, is not apposite; nor would it be sufficient to change the result if it were. The claimed repository need not be large. According to the claim, it is sufficient to have one master symbol, one parent identifier, and one document. In such a case, it would be entirely possible to perform the method mentally. Naturally, even the problem of librarianship would then be trivial. A librarian would normally have to deal with more than can easily be done without some assistance. He might, for example, as librarians often have done, use a card index to store associations between master symbols and parent identifiers. The method, however, remains the same, independently of any technical substrate used for storing information, whether it be pencil and paper, cardboard and ink, or digital storage.
17. The further argument that implementation using a computer system would allow the repository to grow larger than without cannot change the Board's assessment that using computers for what they are good at - storing and processing large amounts of data quickly - was an obvious measure.
18. The argument that the invention cannot be seen as an automation of a known method, because the method, in this case, was not known, fails because the novelty of non-technical methods is irrelevant (cf T 0641/00, headnote 2).
19. In conclusion, the main request cannot be allowed, because the subject matter defined by claim 1 does not involve an inventive step (Article 56 EPC 1973).
The first auxiliary request
20. In claim 1 according to this request, two features have been added. The first is that the master symbol has "a number of symbol segments defined by a symbol template, wherein each symbol segment comprises a text string", and the second that "if determined that the normalized symbol contains all symbol segments of the symbol template then" searching a normalization table database takes place, in order to find a master symbol.
21. The effect of the first of those features is that it is possible to see whether the input symbol is complete.
The second feature comprises an "if ... then" clause. If the symbol is complete, then the search is performed. However, the claim does not specify what is done if the symbol is not complete. The first problem with this formulation is that the use of the test without both the positive and negative consequents seems to have no basis in the application as filed. The second problem is that the test has no effect, either technical or non-technical, since the claim does not exclude that a search is carried out even if the input symbol is incomplete. This is, in fact, the same situation as under claim 1 of the main request.
22. The appellant's argument that the template allowed for filtering, and that it was better to obtain no results than wrong results, is, consequently, not to the point. The method carried out by the processor does not filter any results and does not prohibit a search using incomplete symbols.
23. In conclusion, the first auxiliary request cannot be allowed, at least because the subject matter defined by claim 1 does not involve an inventive step (Article 56 EPC 1973).
The second auxiliary request
24. The final clause of the addition reads "the symbol segment corresponding to a parent identifier." That does not accord with the invention as set out in the application as filed. In the only example, symbols have a two-part form, e.g. "T.US". See points 2 - 4, above. The segments in that example are "T" and "US". The corresponding parent identifier would be "AT&T". For "T.CA" the corresponding parent identifier would be "Telos". No segment alone corresponds either to AT&T or to Telos; it is the whole symbol that corresponds to a parent identifier. The application as filed does not provide a basis for a correspondence between symbol segments and parent identifiers.
25. The Board concludes that the second auxiliary request cannot be allowed due to the presence of subject matter that extends beyond the application as filed (Article 123(2) EPC).
26. The Board additionally notes that the subject matter of claim 1 seems to lack inventive step. The system defined by claim 1 does not make use of the records produced in the "predominant use record". The effect is simply that certain information about a contributor is stored. Although the storage in a computer system is technical, the indication of the meaning of the information to be stored (here: the most frequently submitted symbol segment) is not. Thus, the invention defined in claim 1 solves the problem of implementing a modified core method. Technically, some extra storage is involved, and the skilled person would find it just as obvious to provide the technical features (a storage device, a network interface, and a suitably adapted processor), as she would have done for the main request.
The third auxiliary request
27. In claim 1 according to this request, the master symbols are, as in the first auxiliary request, defined in terms of a template. In this request, too, a test for completeness is made, but additionally some action is taken if the symbol is incomplete: the missing segments are replaced by "preference defaults" that have previously been stored.
28. The appellant argued that the invention defined in this version of claim 1 addressed the problem of allowing a machine to cope with missing data, or data in different formats.
29. In addition to implementing the core method, set out above, the processor in this version of claim 1 fills in missing segments with default values. That can be for the convenience of the user: it would be enough to enter "T", and "T.US" would be understood, if the default had been set that way. The setting of default criteria of the form "assume I mean X, unless I say otherwise" is simply a common way of using language, and there is nothing technical about it. For example, a user might inform the librarian that he is looking for documents related to several symbols, and say, "They all end in '.US,' so I won't repeat that." Thus, the use of defaults simply adds to the non-technical method which the skilled person should implement. That being the case, the reasoning set out for the main request applies to this request too. The Board does note, however, that the prior specification of defaults was common practice amongst programmers.
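Purely as an illustration of the default mechanism discussed in point 29, the replacement of missing segments might be sketched in Python as follows. The function name, the representation of the template as a list of per-segment defaults, and the "." delimiter are assumptions made for this sketch and are not taken from the application.

```python
def fill_defaults(symbol, defaults, delimiter="."):
    """Fill missing or empty segments of a symbol from a list of defaults.

    `defaults` holds one entry per segment of the (assumed) template; an
    empty string means that no default is set for that segment.
    """
    segments = symbol.split(delimiter)
    filled = []
    for position, default in enumerate(defaults):
        # Use the supplied segment if there is one, otherwise the default.
        supplied = segments[position] if position < len(segments) else ""
        filled.append(supplied or default)
    return delimiter.join(filled)
```

With the defaults `["", "US"]`, the input "T" becomes "T.US", while "T.CA" is left unchanged because the user has said otherwise.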
30. In conclusion the third auxiliary request cannot be allowed, because the subject matter defined by claim 1 does not involve an inventive step (Article 56 EPC 1973).
For these reasons it is decided that:
The appeal is dismissed.