Sunday 20 July 2014

ChatBot 2 - Sioni at Pandorabots

Llanllwch - Sion's home town
AIML is a common approach for ChatBots. I take a look at Pandorabots and see if it meets my needs.

Sioni the Person

I set out to build a ChatBot that would capture facts about a person such as place of birth, parents and stories from their life etc. A ChatBot that could provide much more than a simple family tree and some photos.
I need a subject to test with, so I've decided to pick myself.


To include all my personal history and stories is impractical, so I'll restrict the test to a short period of my life in a small village called Llanllwch in Carmarthenshire, Wales. I was about 3 years old when we moved there and about 7 when we left. That gives me a short, contained period of my life in a single location, with a very limited number of friends, at an age where I only remember key events and short stories - perfect.

When we moved to Llanllwch, I was a Welsh language speaker. My mother used to call me Sioni (Johnny) or Sion (John). So I've picked this name for my ChatBot. Sioni Llanllwch. However, I'll assume for the time being that Sioni speaks English. Actually he does, so that's ok. However, there will (necessarily) be a smattering of Welsh terms, names etc that featured in Sioni's life. This will make things more interesting.

Sioni at Pandorabots

There is a standard for capturing a ChatBot's knowledge base: AIML, the Artificial Intelligence Markup Language. It's a term (I now realise) that implies a broader range of capabilities than it actually covers.

There are a number of websites where you can create your own ChatBots based on AIML.

I've picked Pandorabots - it's free and easy to use.

Note: If you want to replicate my activities, then sign up and create a ChatBot.

The first thing I did was to remove all the default AIML files. I'm not sure if this is wise, but there are a lot of default files that contain information I simply would not have known as a kid!

I started using the "Train" option but this is rather laborious. Not only that but it's not a great way to capture a lot of information in a way that can be edited later.

You could edit the AIML files directly, but this requires a very good understanding of the XML syntax and the result is sure to be error prone. Also, the XML structure will simply get in the way of capturing a simple dialog.

To begin with, I've opted for uploading my own text file using the Dialog to AIML Parser. This simply takes a relatively freeform question and answer structure and converts it into AIML, which you can then load into your ChatBot.

The free form text I've used is here.
When converted to AIML, it looks like this.
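
As a rough illustration of what such a conversion involves, here is a minimal Python sketch. It is not the Pandorabots parser: the function names, the list-of-pairs input and the simple normalisation (uppercase, punctuation stripped) are my own assumptions, chosen only to show how a question and answer pair maps onto an AIML category.

```python
import re
from xml.sax.saxutils import escape


def to_pattern(question: str) -> str:
    """Normalise a question into an AIML-style pattern (assumed rules:
    punctuation replaced by spaces, whitespace collapsed, uppercased)."""
    no_punct = re.sub(r"[^\w\s]", " ", question)
    return " ".join(no_punct.split()).upper()


def to_aiml(pairs) -> str:
    """Turn (question, answer) pairs into one AIML document string."""
    lines = ['<aiml version="1.0.1">']
    for question, answer in pairs:
        lines.append("  <category>")
        lines.append(f"    <pattern>{escape(to_pattern(question))}</pattern>")
        lines.append(f"    <template>{escape(answer)}</template>")
        lines.append("  </category>")
    lines.append("</aiml>")
    return "\n".join(lines)


if __name__ == "__main__":
    dialog = [
        ("Do you have a brother?", "I have a brother called Simon."),
        ("Where did you live in Llanllwch?", "We lived in Leat House."),
    ]
    print(to_aiml(dialog))
```

Each pair becomes one category: the pattern is what the user has to type (after normalisation) and the template is what the ChatBot says back.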

Upload the file to Pandorabots by going to the AIML tab in the ChatBot's properties.
Remember to publish your ChatBot after uploading new files.

You can talk to Sioni here.

Here's an example chat I had with Sioni when I knew exactly what to ask:

Me: Did you ride a bike?
Sioni: Yes, after a while we had bikes and we used to ride up and down the road to the railway bridge and back. Sometimes we went up Allt y Cnap hill and rode down at high speed.

Me: Where did you live in Llanllwch?
Sioni: We lived in Leat House. It was a modern house designed by my father.

Me: Do you have a brother?
Sioni: I have a brother called Simon.

Me: What's your mother's name?
Sioni: Ileana Jones or Ileana Maesydd to some who knew her when she was younger.


So far so good! However, if you don't match the questions exactly, the answers are repetitive and less than engaging:

Me: Did you have a cat?
Sioni: I have no answer for that.

Me: Did you have a dog?
Sioni: I have no answer for that. 


You can include wildcards, which carry the matched part of the sentence through to the ChatBot's answer, but this isn't that useful when representing a knowledge base.

For example,
Me: "Do you like *"
Sioni: "Yes I like *"
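
To make the wildcard behaviour concrete, here is a small Python sketch of the matching step. It is an approximation under my own assumptions, not how Pandorabots is implemented: the `respond` and `match_pattern` names are made up, categories are tried in list order rather than by AIML's most-specific-match rule, and only the first captured wildcard is echoed back through `<star/>`.

```python
import re

DEFAULT_ANSWER = "I have no answer for that."


def match_pattern(pattern: str, text: str):
    """Return the tuple of words captured by '*' wildcards, or None if
    the pattern does not match the (punctuation-stripped) input."""
    cleaned = " ".join(re.sub(r"[^\w\s]", " ", text).split())
    parts = ["(.+)" if word == "*" else re.escape(word)
             for word in pattern.split()]
    found = re.fullmatch(" ".join(parts), cleaned, re.IGNORECASE)
    return found.groups() if found else None


def respond(categories, text: str) -> str:
    """Answer with the first matching template, filling in <star/>."""
    for pattern, template in categories:
        stars = match_pattern(pattern, text)
        if stars is not None:
            first_star = stars[0] if stars else ""
            return template.replace("<star/>", first_star)
    return DEFAULT_ANSWER


if __name__ == "__main__":
    categories = [
        ("DO YOU HAVE A BROTHER", "I have a brother called Simon."),
        ("DO YOU LIKE *", "Yes I like <star/>"),
    ]
    print(respond(categories, "Do you like bikes?"))
    print(respond(categories, "Did you have a cat?"))
```

The second question falls through to the default answer, which is exactly the repetitive behaviour seen in the chat above: the wildcard echoes text back, but it adds no knowledge.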

More advanced AIML

To take advantage of AIML's power, you need to leave the simple text format and jump into direct XML editing of the AIML.
In simple terms, it has patterns to match on and templates for text output. It supports recursive text matching through the <srai> tag, which redirects one matched input to another pattern. This one construct is used to create numerous synonyms for words or phrases and so provide a seemingly more intelligent agent.
A more elaborate AIML file can be found here.
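
The way <srai> chains synonyms together can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the category table, the exact-match lookup and the depth limit are my own simplifications, and a real AIML interpreter does considerably more (wildcards, priorities, other tags).

```python
import re

DEFAULT_ANSWER = "I have no answer for that."
MAX_DEPTH = 10  # assumed guard against categories that redirect in a loop

# Hypothetical categories: two of them only redirect to another pattern.
CATEGORIES = {
    "DID YOU HAVE A BIKE": "After a while we had bikes.",
    "DID YOU RIDE A BIKE": "<srai>DID YOU HAVE A BIKE</srai>",
    "DID YOU OWN A BIKE": "<srai>DID YOU RIDE A BIKE</srai>",
}

SRAI = re.compile(r"<srai>(.*?)</srai>")


def normalise(text: str) -> str:
    """Strip punctuation, collapse whitespace and uppercase the input."""
    return " ".join(re.sub(r"[^\w\s]", " ", text).split()).upper()


def respond(text: str, depth: int = 0) -> str:
    """Look up the input and recursively expand any <srai> redirects."""
    if depth > MAX_DEPTH:
        return DEFAULT_ANSWER
    template = CATEGORIES.get(normalise(text))
    if template is None:
        return DEFAULT_ANSWER
    return SRAI.sub(lambda m: respond(m.group(1), depth + 1), template)


if __name__ == "__main__":
    print(respond("did you own a bike?"))
    print(respond("Did you have a cat?"))
```

Every rephrasing still has to be written out as its own category; the redirect just saves repeating the answer text.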

Dialog example:
Me: did you have a bike?
Sioni2: After a while we had bikes and we used to ride up and down the road to the railway bridge and back. Sometimes we went up Allt y Cnap hill and rode down at high speed.

Me: Tell me about Allt y cnap hill
Sioni2: We used to watch the hunt from the top of the hill. I remember watching the fox get away. That was good to see.

Me: and where else did you ride your bike?
Sioni2: I had a bike and when I learnt to ride I rode straight in to Any May! 

Me: what was your father's name?
Sioni2: Donald Jones, or Jones the Architect. 

Me: did you have a brother?
Sioni2: I have a brother called Simon. He's a year younger than me.

Me: did you know a tramp?
Sioni2: There was a tramp who lived in Allt y cnap woods sometimes. 

Me: tell me more about the tramp
Sioni2: When I was a kid there were lots of them around. They used to go from one parish to another and get food from the vicar.


Conclusion

This is not what I'm after...

To be fair, AIML does allow a whole host of additional functionality, which amounts to more flexibility in pattern matching for synonyms etc. I could have made Sioni a lot more convincing with time and effort. Maybe that's exactly what's required.

There's no doubt that Pandorabots have also done exceedingly well at the Loebner Prize over the years - see their blog for an overview. AIML based ChatBots can be good - very good!

So what's my problem?

To begin with, AIML based systems follow a pattern matching approach and do not track context or represent knowledge in a way that can be queried in an ad-hoc manner. It's a bit like teaching a horse to count - see Clever Hans.

The other problem with this approach is that the input data needs to be highly structured. Ideally, some means of inputting free form text would be good!

There's plenty of information on the web on the pros and cons of AIML. Take a look here.

So for now, I'll leave AIML based ChatBots and take a look at what other approaches have been tried and how they differ from (completely) AIML based systems.

