Key points
- Board 3.5.06 in the headnote: "A neural network defines a class of mathematical functions which, as such, is excluded matter. As for other "non-technical" matter, it can therefore only be considered for the assessment of inventive step when used to solve a technical problem, e.g. when trained with specific data for a specific technical task."
- Obviously, this headnote concerns computer-implemented neural networks. Claim 1 is reproduced below. It defines a software product.
- Inventive step is the critical issue.
- The Board, on the Comvik approach: "following the so-called Comvik approach (T 641/00), it is made sure that only features contributing to the technical character of the invention are considered for the assessment of, in particular, inventive step (cf. G 1/19, reasons 30 (F), cited from T 154/04). In particular, "non-technical" features, understood in this context as features which, on their own, would fall within a field excluded from patentability [under] Article 52(2) EPC (see, e.g., T 1294/16, reasons 35), can only be considered for this assessment if they contribute to solving a technical problem (see also T 1924/17, reasons 15 to 19)."
- As an aside, the Board recalls that "Even technical features may be ignored with regard to inventive step if they do not contribute towards solving a technical problem (see G 1/19, reasons 33)."
- The claimed neural network has a new structure because the claim recites that "the hierarchical neural network [is] formed by loose couplings between the nodes in accordance with a sparse parity-check matrix of a [low-density parity-check code]" (the second sketch after these key points gives a toy illustration of such loose couplings).
- The Board: "A neural network is composed of nodes, called "neurons", linked to each other by edges transmitting the output of one neuron to the input of another. Each neuron implements a parameterized mathematical function, typically a weighted addition of its inputs followed by a nonlinear operator (e.g. a threshold, [...]); the parameters are called weights."
- "In principle, it is possible, if cumbersome, to replace the inputs to each neuron by the mathematical functions implemented by the nodes of the previous layer and write down the mathematical function that the network implements as a whole, i.e. the output as a function of the input. "
- The Board: "The proposed network structure [...] defines a class of mathematical functions (see above points 7 and 8), which, as such, is excluded matter. As for other "non-technical" matter, it can therefore only be considered for the assessment of inventive step when used to solve a technical problem".
- "The Appellant argued that the proposed modification in the neural network structure, in comparison with standard fully-connected networks, would reduce the amount of resources required, in particular storage, and that this should be recognized as a technical effect, following G 1/19, reasons 85."
- "The Board notes that, while the storage and computational requirements are indeed reduced in comparison with the fully-connected network, this does not in and by itself translate to a technical effect, for the simple reason that the modified network is different and will not learn in the same way. So it requires less storage, but it does not do the same thing. For instance, a one-neuron neural network requires the least storage, but it will not be able to learn any complex data relationship. The proposed comparison is therefore incomplete, as it only focuses on the computational requirements, and insufficient to establish a technical effect."
- The Board here seems to imply that reduced storage and computational requirements without reduced performance would be a valid technical effect, allowing the claim feature of "loose couplings between the nodes in accordance with a sparse parity-check matrix" to be taken into account for inventive step.
- The Board's reasoning that a technical effect can only be acknowledged if an advantage is obtained without simultaneous disadvantages seems unusual to me. Cf., e.g., GL G-VII, 10.1: "if [a] worsening is accompanied by an unexpected technical advantage, an inventive step might be present".
- The Board concludes that "The claim as a whole specifies abstract computer-implemented mathematical operations on unspecified data, namely that of defining a class of approximating functions (the network with its structure), solving a (complex) system of (non-linear) equations to obtain the parameters of the functions (the learning of the weights), and using it to compute outputs for new inputs. Its subject matter cannot be said to solve any technical problem, and thus it does not go beyond a mathematical method, in the sense of Article 52(2) EPC, implemented on a computer." The third sketch after these key points gives a toy illustration of such "learning of the weights".
- "For the sake of completeness, the Board also notes the following: even if, as the Appellant argued, general methods for machine learning, and neural networks in particular, were to be considered as matter not excluded under Article 52(2) EPC, it would remain questionable whether the proposed loose connectivity scheme actually provides a benefit beyond the mere reduction of storage requirements, for instance a "good" trade-off between computational requirements and learning capability."
- This issue ('remains questionable') seems highly fact-specific and will likely turn on the evidence. There may also be an aspect of 'plausibility'.
- The Board: "whether a claimed invention is patentable [i.e., inventive] or not can often be decided by focusing on the technical problems it solves, and by means of which combination of features, be they technical or not, and by answering the question of whether this combination of features is obvious"
- This observation is very interesting but sheds a different light on the headnote: all (distinguishing) features can only be considered for the assessment of inventive step when used to solve a technical problem. It also seems to shed a different light on the Board's reasoning, cited above, starting with "even if, as the Appellant argued ...": whether or not general methods for machine learning, and neural networks in particular, are to be considered as matter excluded under Article 52(2) EPC is irrelevant; the only question is whether the problem of "reduction of storage requirements" (possibly: while maintaining good learning capability) is a technical problem, whether the claimed solution solves that problem, and whether it does so in an obvious way.
- As a comment, I don't know whether there is a clear hint or teaching in the prior art to reduce the memory and computational requirements of a neural network by "loose couplings between the nodes in accordance with a sparse parity-check matrix of a [low-density parity-check code]". According to Wikipedia, "in information theory, a low-density parity-check (LDPC) code is a linear error correcting code, a method of transmitting a message over a noisy transmission channel", which does not immediately seem to pertain to neural networks (duly noting that IT is not my technical field).
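First sketch. For readers less familiar with the mathematics, here is a minimal numerical sketch (my own illustration, not taken from the decision; all names and numbers are hypothetical) of the Board's description: each neuron computes "a weighted addition of its inputs followed by a nonlinear operator", and substituting each layer's inputs by the functions of the previous layer yields the single mathematical function the network implements as a whole.

```python
import numpy as np

def layer(W, b, x):
    # Each "neuron" computes a weighted addition of its inputs (one row of
    # W times x, plus a bias) followed by a nonlinear operator (here ReLU).
    return np.maximum(0.0, W @ x + b)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)  # input layer -> intermediate layer
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)  # intermediate layer -> output layer

def network(x):
    # Replacing the inputs of each neuron by the functions of the previous
    # layer gives the function the network implements as a whole.
    return layer(W2, b2, layer(W1, b1, x))

print(network(np.array([1.0, 2.0, 3.0])))  # the output as a function of the input
```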
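Second sketch. The claimed "loose couplings ... in accordance with a sparse parity-check matrix" can be pictured, in a simplified way of my own devising, as a sparse binary matrix used as a connectivity mask between two layers: weights are stored and used only where the matrix has a 1. This also shows the reduction in storage that the Appellant relied on. The matrix below is a hypothetical toy example, not an actual LDPC parity-check matrix.

```python
import numpy as np

# Hypothetical sparse binary matrix: entry (i, j) == 1 means that node j of
# one layer is coupled to node i of the next layer; 0 means no coupling.
H = np.array([[1, 0, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1],
              [1, 1, 0, 0, 1, 0],
              [0, 0, 1, 1, 1, 0]])

rng = np.random.default_rng(0)
W_full = rng.normal(size=H.shape)   # fully-connected layer: 4 x 6 = 24 weights
W_loose = W_full * H                # loosely-coupled layer: weights only where H == 1

print("fully-connected weights:", W_full.size)    # 24
print("stored (coupled) weights:", int(H.sum()))  # 12 -- the storage reduction

x = rng.normal(size=6)
y = np.maximum(0.0, W_loose @ x)    # forward pass through the loosely-coupled layer
```

As the Board points out, this saving comes at a price: the masked layer defines a different (and possibly less capable) class of functions, since it cannot represent every function the fully-connected layer can.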
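Third sketch. The "learning of the weights" that the Board equates with solving a (complex) system of (non-linear) equations to obtain the parameters can be sketched, again purely illustratively and under my own simplifying assumptions (a single linear layer, a squared-error objective, plain gradient descent), as iteratively solving for the coupled weights while leaving the uncoupled ones at zero.

```python
import numpy as np

# Toy "weight learning": gradient descent on a squared-error objective,
# updating only the weights where the (hypothetical) mask H has a 1.
H = np.array([[1, 0, 0, 1, 0, 1],
              [0, 1, 1, 0, 0, 1],
              [1, 1, 0, 0, 1, 0],
              [0, 0, 1, 1, 1, 0]], dtype=float)

rng = np.random.default_rng(1)
W = rng.normal(size=H.shape) * H   # start from a sparse weight matrix
X = rng.normal(size=(6, 100))      # toy input data
Y = rng.normal(size=(4, 6)) @ X    # toy targets from a random fully-connected map

for _ in range(500):
    E = W @ X - Y                  # prediction error on the training data
    grad = (E @ X.T) / X.shape[1]  # gradient of the (halved) mean squared error
    W -= 0.05 * grad * H           # update only the coupled weights

print("final mean squared error:", float(((W @ X - Y) ** 2).mean()))
```

The masked network typically cannot drive the error to zero on targets generated by a fully-connected map, which illustrates the Board's point that the smaller network requires less storage "but it does not do the same thing".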
Claim 1 of the main request defines:
A hierarchical neural network apparatus (1) implemented on a computer comprising
a weight learning unit (20) to learn weights between a plurality of nodes in a hierarchical neural network, the hierarchical neural network being formed by loose couplings between the nodes in accordance with a sparse parity-check matrix of an error correcting code, wherein the error correcting code is a LDPC code, spatially-coupled code or pseudo-cyclic code, and comprising an input layer, intermediate layer and output layer, each of the layers comprising nodes; and
a discriminating processor (21) to solve a classification problem or a regression problem using the hierarchical neural network whose weights between the nodes coupled are updated by weight values learned by the weight learning unit (20)
or comprising
a weight pre-learning unit (22) to learn weights between a plurality of nodes in a deep neural network, the deep neural network being formed by loose couplings between the nodes in accordance with a sparse parity-check matrix of an error correcting code, wherein the error correcting code is a LDPC code, spatially-coupled code or pseudo-cyclic code, and comprising an input layer, a plurality of intermediate layers and an output layer, each of the layers comprising nodes; and
a discriminating processor (21) to solve a classification problem or a regression problem using the deep neural network whose weights between the nodes coupled are updated by weight values learned by the weight pre-learning unit (22)
and
a weight adjuster (23) to perform supervised learning to adjust the weights learned by the weight pre-learning unit (22) by supervised learning; and wherein
the weights are learned by the weight pre-learning unit (22) by performing unsupervised learning; and
the weights between the nodes coupled are updated by weight values adjusted by the weight adjuster (23).