Dialogflow is a natural language understanding platform that makes it easy to design and integrate a conversational user interface (definition from GCP). In essence, you design conversational sequences in the Dialogflow UI, then forward each user message to Dialogflow, which immediately responds with the next message of the conversation.
Common use cases are short-lived chatbots, for example ordering a pizza (case study) or an FAQ bot that uses knowledge bases to answer customer questions. In these cases a user is unlikely to interact with Dialogflow for an extended period of time, and so Google has decided to preserve Dialogflow sessions for only 20 minutes (source). After the 20 minutes pass, Dialogflow completely forgets that it ever had a conversation with the associated userId/sessionId. Unsurprisingly, sending a follow-up message to Dialogflow after that point will not work, because the session held all contextual information about the previous inputs and where the conversation left off. Dialogflow will answer with the “I don’t know what you mean. Could you please repeat that” fallback message, which will leave your end user confused.
A recent project of mine required me to keep a conversation going via text messages (SMS). Because such a text conversation can span anywhere from multiple days to multiple months, the 20-minute session length needed to be extended to, well…, infinity. Luckily, some reverse engineering of how Dialogflow session management works, combined with trial and error against the API, resulted in a simple and elegant solution. No messing with the confusing session, context, agent, users, and conversation APIs, and no need to process webhooks.
There are many ways to implement a session-persisting facade for Dialogflow as part of your application, but for the sake of clarity I will use a very simple service (think microservice, SOA service, hexagonal architecture domain service, whatever you fancy) to demonstrate the logic. You could easily replicate the behavior using a FaaS approach; it really isn’t that complicated :)
The client is the component that communicates with the Dialogflow API. GCP provides client implementations for a number of languages, but for the sake of this article we will stick to the REST API.
The essential endpoint is detectIntent which takes an input message, evaluates it, and returns a response.
--- REQUEST ---
POST https://dialogflow.googleapis.com/v2/projects/{my-app}/agent/sessions/1:detectIntent
{
  "queryInput": {
    "text": {
      "text": "I would like a pizza",
      "languageCode": "en"
    }
  }
}
--- RESPONSE (non-relevant fields omitted) ---
{
  "responseId": "95616542-7424-4db9-bf1f-eadd0a3837f0-b4cb2fcf",
  "queryResult": {
    "queryText": "I would like a pizza",
    "parameters": {
      "Size": "",
      "time": "",
      "Pizza_topping": [],
      "date": ""
    },
    "fulfillmentText": "What day do you want to pick up your order ?",
    "outputContexts": [
      {
        "name": "projects/{my-app}/agent/sessions/1/contexts/af57b401-9108-48ff-aae9-11cc287026ab_id_dialog_context",
        "lifespanCount": 2,
        "parameters": {
          "date": "",
          "Size": "",
          "time.original": "",
          "date.original": "",
          "time": "",
          "Size.original": "",
          "Pizza_topping.original": [],
          "Pizza_topping": []
        }
      }
    ],
    "intent": {
      "name": "projects/{my-app}/agent/intents/af57b401-9108-48ff-aae9-11cc287026ab",
      "displayName": "order.pizza"
    },
    "intentDetectionConfidence": 0.69725364,
    "languageCode": "en"
  }
}
In the response, we see the fulfillmentText, which contains the answer that must be returned to the user. We are also given all outputContexts that Dialogflow has associated with the message. Now, to make things come full circle, I will define session for you:
Session /ˈsɛʃ(ə)n/, noun: A collection of contexts used to associate the state of a single conversation with the conversation flow definition.
What I’m telling you is that the contexts are all we need to know to make up our own session.
Luckily, the same detectIntent request we used above also lets us provide this context information. When contexts are provided, Dialogflow exclusively uses them instead of the information it has associated with the session. This means that when passing context information it does not matter at all whether a follow-up message is sent within the 20-minute limit or not: Dialogflow always uses the information we provide.
--- REQUEST ---
POST https://dialogflow.googleapis.com/v2/projects/{my-app}/agent/sessions/1:detectIntent
{
  "queryInput": {
    "text": {
      "text": "I want to pick it up on Tuesday.",
      "languageCode": "en"
    }
  },
  "queryParams": {
    "contexts": [
      {
        "name": "projects/{my-app}/agent/sessions/1/contexts/af57b401-9108-48ff-aae9-11cc287026ab_id_dialog_context",
        "lifespanCount": 2,
        "parameters": {
          "date": "",
          "Size": "",
          "time.original": "",
          "date.original": "",
          "time": "",
          "Size.original": "",
          "Pizza_topping.original": [],
          "Pizza_topping": []
        }
      }
    ]
  }
}
I’ll omit the response because it has the same JSON structure as the response above, just with a different fulfillmentText and different outputContexts. Now, we only need to rinse and repeat: take the outputContexts from each response and insert them as queryParams.contexts in the next request. That’s it!
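The second request above could be assembled in Kotlin roughly as follows. This is a minimal sketch using only the JDK's `java.net.http` classes; `jsonQuote`, `buildDetectIntentBody`, `buildDetectIntentRequest` and the `accessToken` parameter are hypothetical names, not part of any Dialogflow library. It assumes the contexts are kept as raw JSON objects exactly as they arrived in outputContexts, that the session ID is URL-safe, and that you already hold an OAuth 2.0 access token for the Authorization header.

```kotlin
import java.net.URI
import java.net.http.HttpRequest

// Escapes a Kotlin string so it can be embedded as a JSON string literal.
fun jsonQuote(value: String): String {
    val sb = StringBuilder("\"")
    for (ch in value) {
        when (ch) {
            '"' -> sb.append("\\\"")
            '\\' -> sb.append("\\\\")
            '\n' -> sb.append("\\n")
            '\r' -> sb.append("\\r")
            '\t' -> sb.append("\\t")
            else -> if (ch < ' ') sb.append("\\u%04x".format(ch.code)) else sb.append(ch)
        }
    }
    return sb.append('"').toString()
}

// Builds the detectIntent body. `contextsJson` holds context objects as raw JSON,
// copied unchanged from a previous response's outputContexts (assumption: we never edit them).
fun buildDetectIntentBody(
    text: String,
    languageCode: String,
    contextsJson: List<String> = emptyList(),
): String {
    val queryInput =
        "\"queryInput\":{\"text\":{\"text\":${jsonQuote(text)},\"languageCode\":${jsonQuote(languageCode)}}}"
    // Only send queryParams when there is something to restore.
    val queryParams =
        if (contextsJson.isEmpty()) ""
        else ",\"queryParams\":{\"contexts\":[${contextsJson.joinToString(",")}]}"
    return "{$queryInput$queryParams}"
}

// Wraps the body in a POST to the v2 detectIntent endpoint used in this article.
fun buildDetectIntentRequest(
    projectId: String,
    sessionId: String,
    accessToken: String,
    body: String,
): HttpRequest =
    HttpRequest.newBuilder()
        .uri(URI.create("https://dialogflow.googleapis.com/v2/projects/$projectId/agent/sessions/$sessionId:detectIntent"))
        .header("Authorization", "Bearer $accessToken")
        .header("Content-Type", "application/json; charset=utf-8")
        .POST(HttpRequest.BodyPublishers.ofString(body))
        .build()
```

Sending the request is then a matter of passing it to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and reading fulfillmentText and outputContexts from the returned JSON with the JSON library of your choice.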
Obviously, we don’t want to hold this information in memory: the whole point is that a session can be continued after a long time, much longer than the lifetime of a single running service instance. I recommend a document-based database, as we only want to persist the contexts we get and retrieve them later, without any interest in manipulating their internal data structure.
When using the REST API directly, the transformation to JSON to be stored in a database is trivial. If you use the Dialogflow client mentioned above, the internal Protobuf structure makes serialization/deserialization a tad more complicated, but I’m sure you will manage, so I will not go into detail here.
To interact with our service we must provide an API to the other services in our system. This interface can be extremely simple, as we only require some identifier for the session and the message that is sent. Everything else can be encapsulated in our Dialogflow service.
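To make the persistence step concrete, here is a small sketch of a store that treats the contexts as an opaque JSON document. The `ContextStore` interface and `FileContextStore` class are hypothetical names, and the one-file-per-session layout is only a stand-in for the document database recommended above; the same two methods would map onto a get and an upsert by session ID.

```kotlin
import java.nio.file.Files
import java.nio.file.Path

// Hypothetical storage port: contexts are opaque JSON, keyed by our own session ID.
interface ContextStore {
    // Returns the stored JSON array of contexts, or null if the session is unknown.
    fun load(sessionId: String): String?

    // Overwrites the stored contexts with the latest outputContexts.
    fun save(sessionId: String, contextsJson: String)
}

// Stand-in for a document database: one JSON file per session.
class FileContextStore(private val directory: Path) : ContextStore {
    init {
        Files.createDirectories(directory)
    }

    private fun fileFor(sessionId: String): Path {
        // Keep IDs from escaping the directory; real IDs may need a proper encoding instead.
        require(sessionId.matches(Regex("[A-Za-z0-9_-]+"))) { "unsupported session id: $sessionId" }
        return directory.resolve("$sessionId.json")
    }

    override fun load(sessionId: String): String? {
        val file = fileFor(sessionId)
        return if (Files.exists(file)) Files.readString(file) else null
    }

    override fun save(sessionId: String, contextsJson: String) {
        Files.writeString(fileFor(sessionId), contextsJson)
    }
}
```

Because the stored value is never inspected, the contexts survive a restart of the service unchanged and can be replayed days or months later.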
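interface DialogflowService {
    // Evaluates one user message in the conversation identified by sessionId.
    // Loading and persisting the contexts happens behind this call.
    fun evaluateMessage(
        sessionId: DialogflowSessionId,
        message: String,
    ): EvaluationResponse
}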
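One possible implementation of this interface is sketched below. The shapes of `DialogflowSessionId` and `EvaluationResponse` are assumptions, since the article only names them, and `DetectIntentClient`, `DetectIntentResult` and `PersistingDialogflowService` are hypothetical helpers. A `MutableMap` stands in for the database to keep the example self-contained.

```kotlin
// Assumed shapes for the two types named in the interface.
data class DialogflowSessionId(val value: String)
data class EvaluationResponse(val fulfillmentText: String)

interface DialogflowService {
    fun evaluateMessage(sessionId: DialogflowSessionId, message: String): EvaluationResponse
}

// Hypothetical result of one detectIntent call: the reply plus the raw outputContexts JSON array.
data class DetectIntentResult(val fulfillmentText: String, val outputContextsJson: String)

// Hypothetical port wrapping the detectIntent call; contextsJson is null for a new conversation.
fun interface DetectIntentClient {
    fun detectIntent(sessionId: String, message: String, contextsJson: String?): DetectIntentResult
}

class PersistingDialogflowService(
    private val client: DetectIntentClient,
    private val contexts: MutableMap<String, String>, // swap for a database-backed store
) : DialogflowService {
    override fun evaluateMessage(sessionId: DialogflowSessionId, message: String): EvaluationResponse {
        // 1. Restore whatever contexts the previous turn left behind, however long ago.
        val stored = contexts[sessionId.value]
        // 2. Send the message together with those contexts.
        val result = client.detectIntent(sessionId.value, message, stored)
        // 3. Persist the new outputContexts for the next turn.
        contexts[sessionId.value] = result.outputContextsJson
        return EvaluationResponse(result.fulfillmentText)
    }
}
```

Keeping the HTTP call behind a small port means the rinse-and-repeat logic can be exercised with a fake client, without touching the real API.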
Testing this solution is fairly easy: just start a new conversation with one session ID. Then, mid-conversation, change the session ID but keep the same context information et voilà, Dialogflow answers as if the session were well known.
One open point remains: changing the Dialogflow conversation definition may modify the context names, as these are autogenerated. The behavior is not 100% clear to me, but the names seem to remain the same as long as no breaking change is made to the contexts. Dialogflow may provide some kind of backward compatibility, but I have not needed to significantly change my conversation definitions and so cannot give any reliable information. In essence, a changed conversation definition affects persisted contexts in the same way it affects Dialogflow’s own live sessions, so the same migration strategies should work.
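One detail to watch when running this test: the context names in the examples above embed the session ID (`.../agent/sessions/1/contexts/...`). Whether Dialogflow insists that this segment matches the session in the request URL is not something this article establishes, so the following helper is only a precaution under that assumption. `retargetContextName` is a hypothetical name; it rewrites the session segment of a stored context name before the context is replayed under a different session ID.

```kotlin
// Matches the session segment of a context name such as
// projects/<project>/agent/sessions/<session>/contexts/<context>.
val sessionSegment = Regex("(/agent/sessions/)[^/]+(/contexts/)")

// Returns the context name with its session segment replaced by newSessionId.
// Names that do not follow the expected pattern are returned unchanged.
fun retargetContextName(contextName: String, newSessionId: String): String =
    sessionSegment.replace(contextName) { match ->
        match.groupValues[1] + newSessionId + match.groupValues[2]
    }
```

For example, `retargetContextName("projects/my-app/agent/sessions/1/contexts/order", "2")` yields `projects/my-app/agent/sessions/2/contexts/order`.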
Besides the migration issue, this seems to be the way to go even though there is not much talk online about building a Dialogflow bot that should interact with the user for extended periods of time.
Don’t hesitate to leave a clap or two on this article if it helped you along on your Dialogflow journey. Have fun with Dialogflow — once integrated it’s pretty close to magic!