Reader Comments

On the recognition of positive/negative reinforcement

Posted by alorozco53 on 09 Sep 2013 at 22:33 GMT

My inquiry concerns the way Dee Chee recognizes reinforcement from the teacher. Since the system is supposed to simulate simple (one-syllable) phoneme gathering, it must also deal with the problem of recognizing “more complex” words such as “well done”, “good”, “clever”, etc. Yet the paper doesn't really specify how this issue was treated; instead, it suggests that a different type of phoneme recognition was used only for the reinforcement scenario. I would therefore like to know whether my understanding is correct and, if it is, to point out that this might discredit the main goal of simulating a language-acquiring agent.

No competing interests declared.

RE: On the recognition of positive/negative reinforcement

caroline1 replied to alorozco53 on 13 Sep 2013 at 14:34 GMT

We investigated two hypotheses in these experiments (page 4):
1. A synthetic agent embodied in a humanoid robot can learn one-syllable word forms through interaction with a human teacher talking naturally;
2. Learning is augmented by contingent reinforcement, if the teacher makes an approving comment when a proper salient word form is uttered.

Hypothesis 1 was confirmed, as reported and discussed (page 11). A key factor is the sensitivity of learners to the statistical distribution of linguistic elements. The core of our work is not undermined by shortcomings in the reinforcement process, which was designed only to augment learning.

Regarding Hypothesis 2, as you point out, the phonemic recognition of reinforcing terms such as “well done” is based on holistic recognition of the whole term, unlike the single-syllable recognition used in the core learning. The rationale behind this approach was as follows:
- We wanted to simulate contingent interaction, which has been found to be critical in human language learning.
- Human interaction involves numerous modes of expression as well as words, e.g. prosody, facial expression, gestures. We were not able to simulate these.
- We were confined to using actual terms collected from the speech of our participants. A pattern matching process was used to detect these terms. The holistic recognition was an approximation to human reinforcement through interaction. Note that there was no attempt at negative feedback.

No competing interests declared.