Knowledge can be regarded as an ingredient of the computational process of thinking, irrespective of whether the computation is rational or irrational. A rational computation yields intelligent behaviour by making the best use of knowledge, whereas unintelligent behaviour could be well understood as;
Vast knowledge has been available to us since the advent of life. However, all this knowledge is useless until it can be utilized to determine its implications and draw relevant inferences from it[2]. Evidence[3, 4] suggests that, in order to do so, the human brain structures the information. A multi-store model of the human memory system[5] was proposed as early as 1968. The model claims that a memory system provides the underlying infrastructure for rational behaviour in animals, and human intelligence relies heavily on such infrastructure. Among all such types of memory structure, Long Term Memory (LTM) is of special interest as it can store information for an unlimited period of time. LTM is logically separated into declarative and non-declarative memory, where the former stores knowledge that someone can tell others and the latter contains knowledge that someone can show by doing[6, 7]. Two different types of declarative memory, semantic memory and episodic memory, keep general world knowledge and experience gained over time, respectively. Briefly, memory includes those aspects of human knowledge which are gained through one's own observation and experience, and which change over time to revise beliefs about an object, event, or action of the world.
Research in the field of knowledge representation has been pursued since the conception of the modern computer, and there has been much development in the area over the last few decades[8]. In the context of intelligent processing, most existing knowledge representation techniques provide strong evidence towards processing of a particular language for a particular problem, but most such systems lack the significant objective, of .
Text processing usually involves performing various computations on unstructured text. Text could be processed more efficiently by first transforming it into a structured format and then feeding it in for processing. There is growing interest in knowledge representation techniques for natural language[8], with systems that are able to store information in a structured format so that it can be called knowledge.
The remainder of this paper is organized as follows: Section 2 provides the background review and motivation for the representation, followed by the desiderata in Section 3. Section 4 contains the formal structure that illustrates the fundamental intuition of the representation, followed by experiments and evaluation in Section 5. Finally, Section 6 briefly concludes the paper.
Human memory has long been studied, and attempts have been made to map it, but it is a very complex system whose exact structure has not yet been obtained. Recent research [9] has shown that human memory is not located in one particular place in the brain, but is instead a distributed structure in which different parts of the brain act in coordination with one another. Thus, an actual representation might be a large, complex network in which the nodes symbolize the various elements that join at edges to form a memory. A memory system provides the underlying infrastructure for rational behaviour in animals, and human intelligence relies heavily on such infrastructure. Briefly, memory includes those aspects of human knowledge which are gained through one's own observation and experience, and which change over time to revise beliefs about objects, events or actions of the world.
Psychological studies fundamentally classify the human memory system into two classes based on their temporal existence: (1) Short Term Memory (STM) and (2) Long Term Memory (LTM). Short Term Memory holds current observations and survives for very short periods of time, assumed to be of the order of seconds[10], whereas Long Term Memory can retain acquired knowledge for an unlimited duration[11]. LTM, which is of special interest to us, is further divided into two subcategories[5], declarative and non-declarative storage, where the former stores knowledge that someone can tell others and the latter contains knowledge that someone can show by doing[12]. Evidence suggests that declarative memory contains two different types of declarative knowledge, which are stored separately in different storage infrastructures known as semantic memory and episodic memory[11].
Semantic memory refers to the facts, information and features about objects which are used internally by the brain to determine what an object is. Semantic memory is used in the field of AI to determine the meaning of a word or fact, so that a computer system is able to understand it and perform computations on it. Episodic memory represents the chronological record of a person's experience, in which only some specific events are stored for a long interval of time[13]. Episodic memory can be explicitly accessed, and an episode can be reconstructed from it. When the personal context is shared from episodic memory, it becomes a part of the semantic memory of the person. This generally happens when some facts or information in episodic memory are repeatedly learnt over time [14, 15].
Beyond representation, it is also important to define how operations are performed on knowledge. While retrieving instances of knowledge, it is important to determine which information is relevant to the task at hand. Insignificant knowledge needs to be removed so as to avoid accumulating non-essential information. It is equally essential to reinforce important parts to make sure that the knowledge is not forgotten. Five operations that a knowledge representation technique should be able to perform in order to enable the aforesaid are listed below[16]:
1. Encoding deals with determining how new data will be transformed following the rules of the knowledge representation, such that the original structure of the knowledge is maintained.
2. Storage handles the internal storage structure for the KR.
3. Retrieval manages when and how retrieval will be triggered.
4. Forgetting tackles the removal of insignificant knowledge so as to avoid accumulating non-essential information.
5. Consolidation takes care of reinforcing important information to make sure that the knowledge is not forgotten.
The essential requirement for any system to exhibit human-like intelligence is the ability to draw conclusions from the knowledge the system already possesses. This requires the system to represent relationships between various beliefs, which includes not only inferring any new rules encountered but also updating the existing ones. In order to establish an intelligent conversation, the system must be capable of determining the feasible choices by associating them with existing patterns and then, on the basis of the feasibility of the various choices, choosing an alternative.
A human-machine interaction system can work towards this goal only when it supports the varied capabilities that are required for a system to achieve human-level intelligence [17, 18]. The desiderata for the completeness and evaluation of a cognitive agent for human-machine communication, whether direct or through an embedded process, are:
Adaptation could be bolstered if one can properly categorize knowledge and can recognize and extract rational knowledge. Therefore, a proper knowledge structure along with a satisfactory retrieval mechanism is essential.
The essential requirement for any system to exhibit human-like intelligence is the ability to draw conclusions from the knowledge the system already possesses. This requires the system to represent relationships between various beliefs, which includes not only inferring any new rules encountered but also updating the existing rules.
The system must possess the ability to encode and store the results of previous experiences, and to retrieve them later together with the inferences drawn from them at the previous stage. Additionally, the rules must be generalized in memory so that the system can learn by applying them to similar problems or to other tasks in the same domain.
The aforementioned desiderata express the abstract notion of the underlying requirements. Throughout this work we have emphasized developing a knowledge infrastructure that fulfils part of the requirements discussed above. The proposed mechanism is psychologically plausible and does not deny the possibility of other, better methodologies. The design perspective centres on human-centred functionality.
Knowledge representation is fundamentally a function which takes input from one domain and returns an output belonging to another one. In other words, it is a mapping of entities between two domains. There could be an enormous variety of structural methods through which one could perform such a mapping. Owing to a human-centric bias, this section supplies the formal description of an episodic memory based knowledge representation structure for text data.
Episodic memory in general is a network of experiences[11] gained by an individual, which is certainly different for different individuals. Since experiences are gained over time, the memory must maintain a temporal order. Multiple experiences gained over time could have a variety of interconnected contexts, which is again a crucial challenge for modelling.
Definition 1. Episodic memory is a 5-tuple consisting of a chronological sequence of episodes and the temporal relationships between them. Its components are: a unique identification associated with an individual episode; a temporal parameter recorded for each instance of an individual episode, which can further be used to establish the temporal occurrence of individual episodes; a reference to the next episode; the significance of a particular instance of an episode; and a similarity connection between instances of episodes, which is itself composed of a two-tuple. This pair of parameters defines the strength of similarity between episodes. It comes into the picture when a particular episode is linked with a number of other episodes due to context similarity.
Figure 1: Sequence of Episodes in Episodic Memory
The knowledge representation generates a graph in which the various episodes are linked in chronological order, as shown in Figure 1. In memory, it can serve to find the most recently gained knowledge about a particular context, as well as to answer a query whose context could be inferred from previous experience.
Episodic memory is composed of instances of episodes linked together depending on the strength of similarity and temporal occurrence. Among such instances, a few could be more significant than others in a particular context.
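The structure described in Definition 1 can be sketched as a small record type. This is only an illustrative sketch: the field names, the default significance of 1, and the choice to store similarity links on both episodes are assumptions made for this example and are not taken from the formal definition.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EpisodeRecord:
    """One entry of the episodic memory (field names are hypothetical)."""
    episode_id: int                  # unique identification of the episode
    timestamp: float                 # temporal parameter of this instance
    next_id: Optional[int] = None    # reference to the next episode in time
    significance: float = 1.0        # ASSUMPTION: initial significance of 1
    # similarity connection: linked episode id -> strength of similarity
    similar: Dict[int, float] = field(default_factory=dict)


class EpisodicMemory:
    """Keeps episodes in chronological order and chains each to its successor."""

    def __init__(self) -> None:
        self.records: List[EpisodeRecord] = []

    def append(self, timestamp: float) -> EpisodeRecord:
        record = EpisodeRecord(episode_id=len(self.records), timestamp=timestamp)
        if self.records:
            # chronological link from the previous episode to the new one
            self.records[-1].next_id = record.episode_id
        self.records.append(record)
        return record

    def link_similar(self, a: int, b: int, strength: float) -> None:
        # ASSUMPTION: the link is stored on both ends so either episode reaches the other
        self.records[a].similar[b] = strength
        self.records[b].similar[a] = strength
```

Appending two episodes and linking them yields a chronological chain plus one similarity edge, which is the kind of graph Figure 1 depicts.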
Definition 2. Each episode in turn consists of instances, which have smaller data chunks stored in the form of nodes. Instances are connected to previous instances on the basis of similarity. An episode from node s to node g is a sequence of nodes whose first node is s and whose last node is g. An individual episode can be formally expressed as a 5-tuple whose components are: a unique identification for individual nodes within an instance of an episode; the sequence information about the nodes, i.e. the order of occurrence of nodes within an instance of an episode; the temporal parameter of individual nodes within each instance of an episode; the reference to the next node within an episode; and the significance, or importance, of the context of a particular node. The significance is initially a fixed constant value for any newborn episode and can change according to the use or priority of the episode.
Individual episodes are further composed of a series of nodes, which are organized as a collection of sub-nodes expanded in three logical layers: primary, secondary and tertiary.
Definition 3. A node is an elementary data unit of episodic memory. It consists of vital decision-making information. A node is a 5-tuple structure. Its components include the unique identification of the node; the type of the node, which determines the relevant information the node contains; the collection of keywords associated with the node in a context; and the identification of the episode to which the current node belongs.
As nodes are elementary data units, and in line with the complexity of text representation, the three types of nodes are a logical separation of text based on the building blocks of a particular language, which makes this a language-dependent functionality.
The primary node indicates the subject of the sentence. It acts as an anchor to the underlying context, so that it can be used as a reference point whenever that context is called upon in future. The primary node is chosen from the tags such that it contains the information regarding the main subject of the instance. Table 1 contains the list of tags associated with the primary node.
Table 1: Tags for Primary Node
The secondary node works as a sub-node of the primary node and keeps information about the subject of the primary node. This helps in determining the attributes of the subject. Table 2 comprises the possible tags to identify a secondary node.
Table 2: Tags for Secondary Node
Further, the tertiary node indicates the adverbs related to the property being referred to in the secondary node. Table 3 shows the two observed tags for the tertiary node.
Table 3: Tags for Tertiary Node
4.1. Operations
Beyond the primitive structure, we present how a variety of operations work over the dynamic structure to transform informal input into a computable knowledge structure. While retrieving instances of knowledge, it is also important to determine which information is relevant to the task at hand. Insignificant knowledge must be removed so as to avoid accumulating non-essential information. At the same time, it is essential to reinforce important parts to make sure that the knowledge is not forgotten. The operations [16] that a knowledge representation technique should be able to perform in order to enable the aforesaid are:
4.1.1. Encoding
Encoding deals with determining how new data will be transformed following the rules of the knowledge representation, such that the original structure of the knowledge is maintained.
Whenever a new action sequence is observed, it must be examined whether it should be considered a new episode or an instance of the ongoing episode. We assert in general that if the time elapsed between the timestamp of the current instance and the timestamp of the last observed instance is greater than the time elapsed between the last observed instance and the first instance of that episode, then the new action sequence is considered the start of a new episode.
Definition 4. Episode determination is formulated as a Boolean-valued function. The function returns 1 if a new episode is required to be started, and 0 if the current instance continues the current episode. It takes three timestamps: the start timestamp and the last timestamp of the episode to which the last instance belongs, and the timestamp of the current instance for which the episode is to be determined. The function also depends on a time-stamp constant, a tuning factor that may be set according to the application requirement; i.e. when an application demands that a new time-stamp be created within a small time interval, a smaller value of the constant is preferable.
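The episode determination rule can be sketched as follows. The comparison of the two elapsed times follows the rule stated in the text; how the time-stamp constant enters the comparison is an assumption made for this sketch (here it scales the span of the current episode), and the names `starts_new_episode`, `t_start`, `t_last`, `t_current` and `c` are hypothetical.

```python
def starts_new_episode(t_start: float, t_last: float, t_current: float,
                       c: float = 1.0) -> int:
    """Return 1 if the current instance should open a new episode, else 0.

    ASSUMPTION: the tuning constant c multiplies the span of the current
    episode before the comparison. With c = 1 this reduces to the plain
    rule: new episode if the gap since the last instance exceeds the time
    between the first and last instances of the episode.
    """
    gap = t_current - t_last    # time since the last observed instance
    span = t_last - t_start     # time from first to last instance of the episode
    return 1 if gap > c * span else 0


# An episode that started at t=0 and was last extended at t=10:
print(starts_new_episode(0.0, 10.0, 15.0))         # gap 5 <= span 10
print(starts_new_episode(0.0, 10.0, 25.0))         # gap 15 > span 10
print(starts_new_episode(0.0, 10.0, 15.0, c=0.1))  # gap 5 > 0.1 * 10
```

Under this assumption a smaller constant makes the function open new episodes after shorter gaps, which matches the tuning behaviour described above.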
Nodes are the elementary units of the proposed episodic memory and are formed by classifying the input text according to the tags it acquires. The tag features of the input text are used to organize the nodes as primary, secondary and tertiary nodes.
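A minimal sketch of this tag-based grouping is given below. It assumes the input has already been part-of-speech tagged, and the three tag sets are illustrative Penn Treebank-style guesses (nouns and pronouns as primary, adjectives and verbs as secondary, adverbs as tertiary); they are assumptions for this example and are not the contents of Tables 1 to 3.

```python
from typing import Dict, Iterable, List, Tuple

# ASSUMPTION: illustrative tag sets, not the tag lists of Tables 1-3.
PRIMARY_TAGS = {"NN", "NNS", "NNP", "NNPS", "PRP"}
SECONDARY_TAGS = {"JJ", "JJR", "JJS", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}
TERTIARY_TAGS = {"RB", "RBR"}


def classify_tokens(tagged: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (word, tag) pairs into primary, secondary and tertiary keywords."""
    groups: Dict[str, List[str]] = {"primary": [], "secondary": [], "tertiary": []}
    for word, tag in tagged:
        if tag in PRIMARY_TAGS:
            groups["primary"].append(word)
        elif tag in SECONDARY_TAGS:
            groups["secondary"].append(word)
        elif tag in TERTIARY_TAGS:
            groups["tertiary"].append(word)
        # any other tag (determiners, prepositions, ...) is ignored in this sketch
    return groups


sentence = [("The", "DT"), ("dog", "NN"), ("runs", "VBZ"), ("very", "RB"), ("fast", "RB")]
print(classify_tokens(sentence))
```

Because the grouping depends entirely on the tag sets, swapping them for another language's tag inventory changes the node layout without touching the rest of the structure.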
4.1.2. Storage
Storage handles the internal storage structure for the KR. It handles the changes when any other operation is performed, so as to reflect the operation while retaining the structural rules.
The structure used to store an episode influences the efficiency of adding or modifying knowledge as well as of relevant retrieval. It has been experimentally established that graphs are best suited to a non-linear structure like an episodic memory[16]. Therefore, to take care of such a structure, the storage maintains an interaction graph.
Definition 5. The interaction graph is a directed graph with a two-tuple structure (E, L), where E is a finite, nonempty set of episodes and L is a set of links between pairs of episodes. Conventionally, the sets E and L represent the vertices and edges of a graph, respectively.
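One possible in-memory form of the interaction graph is an adjacency map, sketched below. The class and method names are hypothetical, and storing a weight on each link (initialized to 1, as stated for new links in the retrieval operation) is a design choice of this sketch rather than part of Definition 5.

```python
from collections import defaultdict
from typing import Dict, Set


class InteractionGraph:
    """Directed graph (E, L): E is a set of episodes, L a set of weighted links."""

    def __init__(self) -> None:
        self.episodes: Set[int] = set()
        # source episode -> {target episode: link weight}
        self.links: Dict[int, Dict[int, float]] = defaultdict(dict)

    def add_episode(self, episode_id: int) -> None:
        self.episodes.add(episode_id)

    def add_link(self, source: int, target: int, weight: float = 1.0) -> None:
        # links are only allowed between episodes that are already in E
        if source not in self.episodes or target not in self.episodes:
            raise KeyError("both endpoints must be episodes in E")
        self.links[source][target] = weight

    def successors(self, episode_id: int) -> Set[int]:
        # .get avoids creating empty entries for episodes without outgoing links
        return set(self.links.get(episode_id, {}))
```

An adjacency map keeps link insertion and lookup cheap, which suits a memory that grows one episode at a time.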
Encoded tags are used to classify the input text into three node types: primary, secondary and tertiary. The storage structure treats the primary node of the input text as the root for that node instance. Secondary nodes are directly connected to the primary node, whereas tertiary nodes are directly connected to secondary nodes. Therefore, an evolutionary model of the knowledge structure is stored in the form of nodes for each input instance. A symbolic structure is shown in Figure 2.
Figure 2: Schematic View of Nodes within an Episode
4.1.3. Retrieval
Retrieval manages when and how knowledge retrieval will be triggered. The retrieval process defines how past episodes and instances may be retrieved from storage. It involves the initiation condition, selection, and similarity determination.
Spontaneous retrieval is initiated when an episode is retrieved in order to link similar instances, whereas deliberate retrieval is initiated when the conversation demands an answer which might be present in memory. The knowledge structure backtracks and finds the first node it encounters whose primary node is the same as, or similar to, that of the current node. Here WordNet[19] plays a significant role in determining similar knowledge.
Definition 6. Whenever the structure encounters a node which satisfies this condition, it applies the function S, where i is the node on which the function is applied, and computes a value for the pair consisting of the current node and the node being compared, j. The function uses constants k, l and m such that k + l + m = 1 and k > l > m, in order to ensure that maximum priority is given to the primary node. These constants weight three terms: the inverse of the distance between the primary nodes, which is calculated by finding the similarity between them using WordNet; the inverse of the least distance between the secondary nodes; and the inverse of the least distance between the tertiary nodes. If the computed value satisfies the threshold constant, the difference between the two nodes is acceptable and they are linked; otherwise, previous nodes are searched until a similar node is found or the start of time is reached. The weight of any new link is initialized to 1 at the time the episode is linked.
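The backward search with a weighted similarity can be sketched as below. Several points are assumptions made for this example: the three terms are combined as a weighted sum, a node pair is accepted when the score is at least the threshold, the inverse distance is taken as 1 / (1 + d) to stay finite at distance zero, and the word distance is passed in as a hypothetical function `dist` (a WordNet-based distance could be supplied in its place). The values 0.6, 0.3, 0.1 and 0.5 are arbitrary choices satisfying k + l + m = 1 and k > l > m.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Sequence

Distance = Callable[[str, str], float]  # hypothetical word-distance function


@dataclass
class Node:
    primary: List[str]
    secondary: List[str] = field(default_factory=list)
    tertiary: List[str] = field(default_factory=list)


def inverse_least_distance(a: Sequence[str], b: Sequence[str], dist: Distance) -> float:
    """Inverse of the smallest pairwise distance; 0.0 if either side is empty."""
    if not a or not b:
        return 0.0
    least = min(dist(x, y) for x in a for y in b)
    return 1.0 / (1.0 + least)  # ASSUMPTION: +1 avoids division by zero


def similarity(i: Node, j: Node, dist: Distance,
               k: float = 0.6, l: float = 0.3, m: float = 0.1) -> float:
    """ASSUMPTION: weighted sum of the primary, secondary and tertiary terms."""
    return (k * inverse_least_distance(i.primary, j.primary, dist)
            + l * inverse_least_distance(i.secondary, j.secondary, dist)
            + m * inverse_least_distance(i.tertiary, j.tertiary, dist))


def find_similar(current: Node, history: Sequence[Node], dist: Distance,
                 threshold: float = 0.5) -> Optional[Node]:
    """Walk back from the most recent node; return the first acceptable match."""
    for node in reversed(history):
        if similarity(current, node, dist) >= threshold:
            return node
    return None  # reached the start of time without a match
```

With a toy distance such as `lambda x, y: 0.0 if x == y else 1.0`, two nodes that share a primary keyword score above 0.5 and are linked, while nodes with different primary keywords fall below it.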
4.1.4. Forgetting
Forgetting performs the removal of insignificant knowledge to avoid accumulating non-essential information. The organization of the episodic memory therefore changes over time[20]. Forgetting usually targets those episodes which are least used: it weakens their links and decreases the utility value of the instance over time.
Definition 7. The link weight has to decrease in order to weaken the significance of an instance. At time t, the new weight depends on the time elapsed between the target instance and the current instance, and on the utility of the target instance. The update uses the difference between the time stamps of the target instance and the current one, together with the utility of the particular instance. Further, x is the link weight constant, y is the forgetting constant and z is the utility constant.
The constants x, y, z are tuning factors whose values may be chosen according to the application. The value of x can be chosen in the range (0,1), where values close to 1 imply the slowest forgetting and values closer to 0 imply rapid forgetting. The value of y must be chosen such that the effect of the passage of time is reflected on the link; it must be kept in the range (0,1), with a greater value signifying a more rapid decrease in link weights with time. The weight of any new link is initialized to 1 at the time the episode is linked, while the utility of an instance is fixed at 1 at the time of instance creation.
Definition 8. The utility value of an instance at time t changes with respect to the elapsed time. The new utility is computed using the forgetting constant y, which is consistent with the forgetting value used for the links with respect to time.
If the weight of any link is lower than the link threshold, or the utility of an instance is lower than the utility threshold, then the link should be severed or the instance deleted, respectively. The deletion of instances and the severing of links must be done while the application is idle.
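The thresholding step can be sketched independently of how the weights and utilities were decayed. In the sketch below the function name `prune` and the dictionary layout are hypothetical, and additionally dropping links whose endpoint instance has been deleted is an assumption of this example, made so that no link points at a missing instance.

```python
from typing import Dict, Tuple

Link = Tuple[int, int]  # (source instance id, target instance id)


def prune(link_weights: Dict[Link, float],
          utilities: Dict[int, float],
          link_threshold: float,
          utility_threshold: float) -> Tuple[Dict[Link, float], Dict[int, float]]:
    """Return the links and instances that survive forgetting.

    Instances with utility below utility_threshold are deleted; links with
    weight below link_threshold are severed. ASSUMPTION: links attached to a
    deleted instance are severed as well.
    """
    kept_instances = {i: u for i, u in utilities.items() if u >= utility_threshold}
    kept_links = {
        (a, b): w
        for (a, b), w in link_weights.items()
        if w >= link_threshold and a in kept_instances and b in kept_instances
    }
    return kept_links, kept_instances


links = {(1, 2): 0.9, (2, 3): 0.05, (1, 3): 0.8}
utils = {1: 1.0, 2: 0.6, 3: 0.02}
print(prune(links, utils, link_threshold=0.1, utility_threshold=0.1))
```

Returning new dictionaries rather than mutating in place makes it straightforward to run the pass while the application is idle and swap the result in afterwards.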
4.1.5. Consolidation
Consolidation takes care of reinforcing important information to make sure that the knowledge is not forgotten while it is in use. Care must also be taken that knowledge which is used frequently becomes permanent memory after reaching a frequency threshold.
Whenever the difference is acceptable and the instances are linked, the utility value of both instances and the weight of the link are increased. The weight of the link is increased using the same parameters as are used to weaken the link during forgetting.
The utility value at time t is increased with reference to its previous value, the threshold constant and the forgetting constant. Consolidation is performed when the system is idle.
In this section, we present the experiments conducted to observe the working of each operation. At different instants of time, different input paragraphs are introduced to examine snapshots of the episodic memory coupled with the various operators. Table 4 presents a snapshot of the existing episode before the input is introduced. The creation of episodes, the instances of nodes within individual episodes and the effectiveness of the operators are observed to examine the working of the knowledge structure. First, three different paragraphs are given as input at three separate instants of time, and observations are made thereafter.
Table 4: Instance of an Episode before encoding the above said input
The instances of nodes corresponding to individual sentences are shown in Table 5. Each row in Table 5 gives the unique identification of an individual node, the fragments of input belonging to each node type, the time-stamp and the next linked node.
Table 5: Instance of Nodes after Encoding
In continuation, Table 6 presents the newly created episodes based on the acceptable time difference, with the value of the time-stamp constant set to 0.1.
Table 6: New instance of episode after encoding
Broadening of the knowledge infrastructure has been designed to work implicitly: the system learns through experience to scale up knowledge acquisition. The episodic memory has been developed in such a way that the system learns over time as more and more information is fed to it.
5.1. Learning Experience
The estimated enrichment of the episodic memory is shown graphically in Figure 3. Initially, episodes are created and linked to each other only through chronological sequence, as they rarely have anything in common. Later on, slowly but steadily, the number of links starts to rise as the system starts finding knowledge in common. After sufficient accumulation of knowledge, there is a steep growth in links, as a similar memory can be found in storage for almost every new memory. Learning of this kind eliminates redundant knowledge, which significantly reduces the dense search space to a sparse one and should improve the chance of retrieving an answer to a query.
Figure 3: Learning shown as the variation of Total Number of Links in system with the
5.2. Retrieval Time
The time to retrieve older memories changes with the accumulation of knowledge. The best case is observed when continuous knowledge on the same topic is given. The average case is observed when sufficient knowledge has been obtained and the topic on which knowledge is obtained is already present in memory. The worst case is observed when the topic is not already present, so the search for the topic must continue back to the start of time.
Figure 4: Efficiency of the System in terms of Retrieval Time
A graphical representation of the memory retrieval rate with respect to the knowledge acquired is given in Figure 4. Here, retrieval time denotes the time taken to find a similar node. It is based on the observation that, as the knowledge network grows, the system needs to go back to the start of time less and less often. This is because the system only has to find the latest instance of a topic and update the links in the episodic memory accordingly.
5.3. Question Answering
In order to evaluate the working of the episodic memory knowledge infrastructure, it was integrated with a different application. The objective was to demonstrate the functioning of retrieval.
5.3.1. Analysis of the System
First, an analysis is done in which the system is given a set of simple and complex sentences jumbled together. A set of simple questions is then posed on the simple text as well as on the complex text. Similarly, a set of complex questions is posed on both texts. Questions are asked only for cases where the answer is present in the text previously given to the system. The results are recorded in Table 7.
Table 7: Results of Question Answer Implementation for different scenarios
5.3.2. Comparison with Cleverbot
To demonstrate the capabilities of our system with respect to other artificial intelligence question answering machines, we compare our system with "Cleverbot"[21]. Cleverbot is a very popular web application developed by Rollo Carpenter. The reason for selecting Cleverbot, when so many question answering machines are available, is its unique feature of building its database by having conversations with people. At its launch it had 200 million conversations, which has now increased to 265 million. When asked a question, Cleverbot tries to match it to an exact phrase. If no exact phrase is found, it searches for keywords in the input and then retrieves the best match from the database.
Table 8: Efficiency of Our System & Cleverbot based on Correct Answers
Therefore, to compare the two, knowledge-related questions were fed to our system. As shown in Table 8, the results were found to be comparable when our system was fed appropriate data. However, our system could not answer any questions on a new topic, because it depends entirely upon its accumulated knowledge. Cleverbot performs better for an arbitrary topic. For a known topic, however, keeping in mind the vast difference between the databases of the two, the observations made were quite satisfactory.
In this paper we have presented a psychologically plausible knowledge representation infrastructure to organize text data. The intuition was to treat text as knowledge. The formal structure of an artificial episodic memory and a number of operators were defined. WordNet was used in place of a semantic memory. The proper functioning of episodes and operators was examined. Finally, an evaluation of the knowledge structure was presented to support the claim. Ways of dealing with previously unseen topics are part of ongoing research.
[10] J. E. Laird, A. Newell, P. S. Rosenbloom, Soar: An architecture for general
[11] J. R. Anderson, Act: A simple theory of complex cognition., American
[12] A. D. Baddeley, Human memory: Theory and practice, Psychology Press,
[13] T. V. Bliss, G. L. Collingridge, et al., A synaptic model of memory: long-
[14] G. McKoon, R. Ratcliff, G. S. Dell, A critical evaluation of the semantic-
[15] D. L. Greenberg, M. Verfaellie, Interdependence of episodic and seman-
[16] A. M. Nuxoll, Enhancing intelligent agents with episodic memory, Ph.D.
[17] P. Langley, J. E. Laird, S. Rogers, Cognitive architectures: Research issues
[18] S. Pushp, B. Bhardwaj, S. M. Hazarika, Cognitive decision making for
[19] G. A. Miller, Wordnet: A lexical database for english (1995).
[20] A. Nuxoll, D. Tecuci, W. C. Ho, N. Wang, Comparing forgetting algorithms
[21] R. Carpenter, Cleverbot [computer program] (2015).