
· 3 min read
Yuri Santana

Dialogflow sits between the user and our application, helping us turn natural language conversations into useful, structured data.

Entities take care of extracting information and details from what the user says. They are present from the moment you create your first intent and start writing training phrases: Dialogflow will automatically identify and label some words, suggesting entities for you to match with an intent.

Having entities in place will help you train your assistant and make it more efficient for your users. They can be created manually or uploaded from a JSON or CSV file.

There are multiple types of entities:

  • System entities:

These are Dialogflow's built-in entities; they match many common types of data, such as geographic locations or dates.

@sys.date
  • Custom or developer entities:

These allow you to define your own words to trigger an intent, and you can also provide synonyms.

They come in handy when building your own assistant with specific words you want it to listen for and identify, so you can provide an accurate response to your users; a short code sketch after this list shows how such an entity could be created programmatically.

Just remember that a custom entity name can start with a letter, number, dash, or underscore.

@computer_service
  • Custom or developer composite entities: These are built from multiple custom entities linked to be triggered together.

    @os_computer[@os_device @computer_service]
  • Session entities:

They are generated for a single user-specific session, from one conversation between the agent and the user.

These entities expire automatically after 20 minutes.

  • Regexp entities:

These utilize Regular Expressions to match more specialized entities from the user.

It is important to remember that the order in which you present your regular expressions to the agent matters, because the search stops as soon as a valid match is found.
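
If you manage custom entities from code rather than the Dialogflow console, the snippet below is a minimal sketch of what that could look like with the official @google-cloud/dialogflow Node.js client; the project ID, entity values, and synonyms are placeholders, not part of this post.

const dialogflow = require("@google-cloud/dialogflow");

async function createComputerServiceEntity(projectId) {
  const client = new dialogflow.EntityTypesClient();
  const parent = client.projectAgentPath(projectId);

  // Create a custom entity type similar to @computer_service,
  // where each entry maps a reference value to its synonyms
  const [entityType] = await client.createEntityType({
    parent,
    entityType: {
      displayName: "computer_service",
      kind: "KIND_MAP",
      entities: [
        { value: "repair", synonyms: ["repair", "fix", "troubleshoot"] },
        { value: "upgrade", synonyms: ["upgrade", "update hardware"] },
      ],
    },
  });

  console.log(`Created entity type: ${entityType.name}`);
}

// Placeholder project ID; replace with your own Dialogflow project
createComputerServiceEntity("my-dialogflow-project").catch(console.error);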

Entity vs Intent

Entities will make your development time shorter and, once identified by the agent, let you provide accurate responses to the interaction at hand. They are the way you capture important data from the user. An intent, on the other hand, helps you understand what the user's request really means. It usually contains training phrases that help identify what the end-user expression is asking for, actions to be performed after the intent is identified, parameters that map to entities and dictate how data is extracted, and responses that will be returned to the end-user.
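
To make the distinction concrete, here is a rough sketch of a detect-intent call with the @google-cloud/dialogflow Node.js client; the project ID, session ID, and sample phrase are placeholders. The response carries both the matched intent and the entity values captured as parameters.

const dialogflow = require("@google-cloud/dialogflow");

async function detectIntent(projectId, sessionId, text) {
  const sessionClient = new dialogflow.SessionsClient();
  const session = sessionClient.projectAgentSessionPath(projectId, sessionId);

  const [response] = await sessionClient.detectIntent({
    session,
    queryInput: {
      text: { text, languageCode: "en-US" },
    },
  });

  const result = response.queryResult;
  // The intent tells you what the user wants to do...
  console.log(`Matched intent: ${result.intent.displayName}`);
  // ...while the parameters carry the entity values extracted from the phrase
  console.log("Parameters:", JSON.stringify(result.parameters));
}

// Placeholder IDs and phrase; replace with your own values
detectIntent("my-dialogflow-project", "session-123", "I need a repair for my laptop")
  .catch(console.error);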

Join the conversation

Fonoster is developed in the open. Here are some of the channels you can use to reach us:

Discord

GitHub discussions:

Twitter: @fonoster

· 2 min read
Yuri Santana

There are three main types of conversational interfaces. Depending on which one is most appropriate for your users, we can establish some very clear differences between them, as well as elements they have in common.

[Conceptual diagram of speech systems]

Voice applications

They let you complete tasks and control your device with spoken commands. In return, you'll get audible responses from the application.

These are the elements of a voice-only interface:

  • Voice commands
  • Audible response from assistant
  • Audible list of options
  • Descriptive options for users
  • Sometimes a button is present to help start or end the conversation

Text applications

Also known as chat applications, they allow the user to complete their task and interact with the interface by sending and receiving messages.

These are the elements of a text-only interface:

  • Text Commands
  • Written response from assistant with sound
  • Written list of options
  • Video or Images to help the users make a choice
  • Button available to make certain selections easier

Multimodal applications

They allow the user to interact with the interface by combining different features from voice and text, taking advantage of the strengths of each.

These are the elements of a multimodal interface:

  • Voice or Text Commands
  • Written or audible response from assistant
  • Audible or written list of options
  • Video or Images to help the users make a choice
  • Button available to make certain selections easier

Join the conversation

Fonoster is developed in the open. Here are some of the channels you can use to reach us:

Discord

GitHub discussions:

Twitter: @fonoster

· 4 min read
Yuri Santana

There are multiple conversation design elements, based on conversational interface design principles, that keep popping up in every assistant interaction.

These elements mirror those present in human interactions and help users accomplish the task they want to fulfill, reducing frustration and increasing satisfaction. Some of the most common ones are:

  • Welcome message, greeting and farewell

The assistant should introduce itself and state its function so the user knows what to expect from the interaction.

Example: "Hello! I'm Alex and I can help you set up an appointment."

The farewell message should inform the user of the status of the request and end the interaction.

Example: "Your appointment has been set for X. Thank you for scheduling. Goodbye."

  • Questions and feedback

Questions redirect the user and help clarify their intent when interacting with our interface. They help us collect useful information and keep the user engaged in the conversation.

Example: "Would you rather make an appointment for today or tomorrow?"

  • Feedback and confirmation messages

Feedback should be a must in every interaction with the assistant. It lets the user know whether the assistant is working on their request, got confused, or needs clarifying details. More importantly, it lets the user know the assistant understands their intent and that they're working together to address the request.

Example: "Okay, I'll schedule it for tomorrow morning."

  • Informational interactions and suggestions

Informational interactions are those that present information in a way that answers questions. They provide a general overview of the options and are often followed by a question prompting the user to pick one of the choices presented in the message.

Example: "We have the following [list of hours] available for tomorrow. Would you like to know more about the slots?"

Suggestions help the user pick an option the assistant can understand. They can also guide the user to the best option available by providing new information.

Example: "If you pick the 2pm slot it comes with a 50% discount for future appointments. Would you be interested in that?"

  • Apology statements and useful commands

Ideally, apology statements won't be necessary, but when mistakes happen or the user's intent is not clear to the assistant, it should quickly apologize and redirect the user to another available option.

Example: "Unfortunately, we don't have 4pm slots available. Would you like to have a 5pm slot? "

When the issue keeps happening, the apology statement should show understanding of the user's problem, accept responsibility, explain the issue, solve it, and express gratitude for the user's understanding.

Example: "I understand how [customer’s feelings – i.e frustrating, upsetting] this problem has been for you. I’m sorry that you’ve had to deal with [the issue]. The issue happened because we [what caused the issue]. To prevent it from happening again [what we will do]. Thank you for bearing with us through this incident. If there’s anything else I can help you with, please let me know. "

Useful commands should be available at all stages of the conversation. They direct users to what the assistant understands and how they can get there.

Example: "Would you like to continue? Yes - No "

  • Buttons and interactive elements

Buttons are visual elements that help the user to quickly pick between different options presented to them. They can be accompanied by images or extra text to aid the decision making.

Example: "It has been scheduled. What would you like to do next?

  • Button 1: Check out directions
  • Button 2: Check out parking "

Multiple interactive elements help construct a multimodal assistant. They use audio, emojis, images, videos, and more to help the user make a decision and to showcase the assistant's personality.

Example: "It has been scheduled [confirmation sound]. What would you like to do next? "

To know more about conversational structure, check out our video about it on YouTube.
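
As a rough illustration of how some of the elements above could land in code, here is a minimal sketch of a Fonoster Voice Application that plays a welcome message, a confirmation, and a farewell. It assumes a say() verb backed by a configured text-to-speech engine, which this post does not cover.

const { VoiceServer } = require("@fonoster/voice");

new VoiceServer().listen(async (req, res) => {
  await res.answer();

  // Welcome message: introduce the assistant and state its function
  await res.say("Hello! I'm Alex and I can help you set up an appointment.");

  // Feedback and confirmation: tell the user what the assistant understood
  await res.say("Okay, I'll schedule it for tomorrow morning.");

  // Farewell: report the status of the request and end the interaction
  await res.say("Your appointment has been set. Thank you for scheduling. Goodbye.");
  await res.hangup();
});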


Join the conversation

Fonoster is developed in the open. Here are some of the channels you can use to reach us:

Discord

GitHub discussions:

Twitter: @fonoster

· 3 min read
Yuri Santana

A VUI (Voice User Interface) is the ability a virtual assistant has to respond to voice commands using NLU (Natural Language Understanding), NLP (Natural Language Processing), and speech recognition technologies.

Speech is a more intuitive and natural way for humans to communicate with each other, and it also carries important information and context. This is why voice assistants have become more popular in the last year, with uses spanning home, health, entertainment, business, and many other sectors.

VUI technology is becoming more sophisticated and reliable; it is quick to adopt and leaves users with higher satisfaction levels than conventional chat or text assistants.

But what are the real advantages of Voice and Speech recognition technology?

  • Users don’t need to be trained on how to use the interface

Finding and understanding how to use new features in a system can be difficult, especially for new users. When you have many menus, dropdowns, or pieces of information to display, users can feel overwhelmed and frustrated because they don't know how to pick what they're looking for.

Voice can help users reach their goal in your product faster: by voicing a command to the assistant, they find what they're looking for immediately, which offers more flexibility than a text or visual-only interface.

  • Makes your product more accessible for the users

Accessibility is essential in this day and age. All of us experience disability at some point, whether temporary or permanent, so making your product accessible is a must.

Many groups of people rely entirely on voice features to navigate the internet, as do people who want to limit their keyboard use due to fatigue or cognitive disabilities.

Incorporating voice will help you include a large section of the population that is often overlooked, giving you and your product a competitive advantage over less accessible alternatives.

  • Boost productivity levels

Voice can provide support and assistance for customer support or task management: it gives you access to the information you need with a single voice command, taking less time than typing out a query in a text or visual-only interface. A Stanford study found that speech is three times faster than typing.

Voice also saves you from having to handle hardware to achieve your goal, for example taking out your phone to get directions from Google Maps, minimizing the risk of accidents.

  • Users will connect with your brand and product

To users, voice feels more like a human interaction, providing comfort when the VUI actually understands what they are saying and responds accurately to their intent and mood.

Voice also gives your brand a personality: it can be programmed to be humorous, kind, or friendly. These human traits, which the VUI refines over time, will make the user feel more connected to the brand.


Speech has the advantage that it can be applied to any industry, so the benefits are not limited to the tech community. Voice can significantly improve the user experience and make interaction with the product more efficient. When done correctly, it ultimately combines the best of graphical and voice interfaces for the user's benefit, reducing time and fatigue.


Join the conversation

Fonoster is developed in the open. Here are some of the channels you can use to reach us:

Discord

GitHub discussions:

Twitter: @fonoster

· 3 min read
Yuri Santana

What we’re dealing with is the technology of conversation - Harvey Sacks

Voice user interfaces (VUIs) are applications where the main form of interaction is speech. They allow users to interact with software by speaking to it.

VUI benefits

The language the user can use is far more natural than typing in chat applications, and the conversations are mostly engineered to simulate the way we would interact with another person. When done right, VUIs can eliminate the need for keypads, buttons, or graphic interfaces and reduce user frustration.

VUIs also help make your products more accessible to those faced with the significant barriers that graphic interfaces inevitably impose on those with disabilities, both individual and situational. Interacting with software using voice is a way of making sure your product is accessible to those in situations where hands and eyes might be preoccupied.

VUI popularity

The use of speech is therefore the best solution for what Dario D. Salvucci calls secondary-task interfaces in his paper "Predicting the effects of in-car interface use on driver performance: an integrated model approach": interfaces that support a primary, more critical task such as driving, where safety can be compromised when other senses get pulled in, for example checking your phone for the weather.

In the past year, 2022, we have seen that 71% of users prefer using voice for search queries instead of graphic interfaces, nearly 1 out of 3 US consumers owns at least one smart device, and more than half of all owners use their device daily. The best-known VUIs include Google Assistant, Amazon's Alexa, and Apple's Siri. Among smart devices, we have Amazon Echo, the newly released Apple HomePod, and Google Home.

The future for VUIs

This overwhelming popularity of voice interfaces makes us evaluate the impact of assistants on human-machine interactions. When users are offered the choice between a chat or a voice assistant to get support, more than half choose the latter. Thanks to natural language understanding (NLU) and natural language processing (NLP), users can communicate their frustration or doubts effectively with their voice rather than through preconceived options on a screen.

Users are now actively choosing to speak with the machine, and with the rise of AI assistants and VUI devices, we'll soon discover even more uses and ways to apply voice interfaces to our applications and web pages, along with the undeniable effects these interactions will have on business-customer and machine-user relationships.

Connect with us

Fonoster is developed in the open. Here are some of the channels you can use to reach us:

Discord

GitHub discussions:

Twitter: @fonoster

· 5 min read
Yuri Santana

A conversational user interface (CUI) is the way users interact with software through language-understanding interfaces, whether that’s text or voice.

It is formulated to emulate human interaction and that is reflected every step of the way.

Prior to starting building your conversational interface, it’s important to have several aspects pre-defined to make the development and design process more clear and direct.

Before building a conversational interface

You will need to have a clear vision on the following aspects:

  • Type of interaction: Is the user going to interact with the app using text? Using voice? Or a mix of both? This will depend entirely on your company's needs and the type of interaction your users need.
  • Goal of interaction: Is it transactional or relational? Do you want your users to buy something? This will completely shape the word and interaction design to fit those needs.
  • Domain of knowledge: Is your app going to be a generalist you can talk with about anything, or a specialist focused on your product and specific topics surrounding it? This will not only bound the development process but also help with the conversation design.
  • Who takes the initiative: Is it going to be proactive and lead the conversation, or reactive and respond only when prompted by the user?
  • Depth of conversation: Is it a single-turn or a multi-turn conversation with the user?

Steps for creating conversational interfaces

Product design

It should involve the tech side, business knowledge and what your users need.

Create a list of possible functionalities and trim it according to your already established goal of interaction, domain of knowledge, and depth of conversation. Define the features that will make it into the minimum viable product (MVP).

Conversation structure

This is the point where your team needs to start crafting the happy path your application will follow. You will also need to define the order in which information is presented to the user after a keyword identified in their input triggers a search query against your database.
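
As a loose illustration of that idea (the mapping and queries below are made up for this example), the step from a detected keyword to a database query could look something like this:

// Hypothetical mapping from detected keywords to the queries they trigger
const keywordToQuery = {
  appointment: "SELECT * FROM slots WHERE available = true",
  pricing: "SELECT name, price FROM plans"
};

function handleKeyword(keyword) {
  const query = keywordToQuery[keyword];
  if (!query) {
    // No match: fall back to a clarifying question
    return "Sorry, I didn't get that. Could you rephrase?";
  }
  // Run the query against your database and present the results
  // in the order defined by your conversation structure
  return query;
}

console.log(handleKeyword("appointment"));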

To know more about conversation structure, check out Fonoster’s video on Conversational Interface Design

Interaction design

Much like conversation structure, your team will need to design how to solve each of the interactions presented on the happy path, offering the user several options or 'paths' they can trigger in the application that will take them to their desired outcome with no friction.

For your application to learn, it must be crafted around conversational patterns based on the ideal interaction between the app and the user.

Word design

Picking the exact words to provoke actions in your users is a science by itself. That’s why it’s important to choose specific words and sounds that will make the user reach the goal we want.

We can help ourselves by asking open, closed, or yes-or-no questions. Users have an easier time responding to 'Which country would you like to visit?' than to 'Where do you want to go?'.

Personality design

This is where your team designs the aspects that define your assistant. Your team should be able to identify how the assistant will respond to specific circumstances and how it’s never going to respond.

This is usually where an avatar is created with the demographic characteristics of the assistant and the behavior is defined extensively.

Sound design

It is now time to define the sound of your assistant. Is it going to be an automated voice or a voice actor?

This also includes setting up the sound effects that will be played when opening or closing the assistant.

After building

Prototype and testing

Now that our conversational interface prototype is ready to be released to our users, it’s important to keep on listening to feedback to see which features are working and which ones need to be polished or deleted.

You can begin testing within your own team or community by reading the conversation structure out loud and noticing how people respond to certain choices or paths. Remember, the goal is to simulate human-to-human interaction. This is called analog testing.

You can also submit your conversation structure to a platform that will act as a user, allowing you to identify issues and corner cases. This is called automated testing.

Lastly, we have beta testing. This is done by taking a selected group of users and making the application available to them, so you can get feedback from your own community before releasing it to a bigger audience.

Metrics

After you have made your application available to your users, one quick way to identify if it’s working or which features are the ones they prefer is by analyzing metrics.

This will let you know whether the application met the objective the user set, help you correct interactions and questions, and show which utterances you should train your interface on.

There are many software tools for tracking the metrics of both text and voice interfaces; they should give you a clear view of your users, recurrence, which functionalities get used, and where users are abandoning your assistant.

· 2 min read
Yuri Santana

Connecting Fonoster to Dialogflow is just a few clicks away using the Fonoster Dashboard.

Trunking information you'll need:

  • VoIP provider
  • Number
  • Username
  • Password
  • Host

Set up your provider's information

Sign in to Fonoster and go to the Fonoster Project Dashboard, then select the SIP Network tab and create a new Trunk.

Here you'll need to provide this information from your provider's account:

  • Your provider's name
  • Your username
  • Your secret / password
  • Provider's Hostname or IPv4

Google Service Account key

Next, you'll need to create a new Secret in the Secrets tab and set it to the Google Service Account JSON key.

Create a new Fonoster Application

Now we are ready to create a new Application. Go to the Applications tab and create a new one.

  • Pick a name
  • Select the secret you added in the previous step
  • Pick a voice
  • Type the intent ID from your Dialogflow Agent
  • Type the project ID from your Dialogflow project
  • Hit save

Add a new number to call

Lastly, we need to add a new number we can call and trigger Dialogflow.

Create a new number from the SIP Network tab

  • Add your number from the provider
  • Add the webhook URL http://voice.fonoster:3000
  • Click save

And there you have it. You're ready to call that number and be able to interact with the AI.
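
If you would rather trigger a test call from code than dial the number by hand, a sketch like the one below, using the CallManager from the Fonoster SDK, could work; the phone numbers are placeholders, and the webhook is the same URL added in the step above.

const Fonoster = require("@fonoster/sdk");
const callManager = new Fonoster.CallManager();

// Originate a test call through the number you just configured.
// The numbers are placeholders; the webhook is the URL from the step above.
callManager
  .call({
    from: "9842753574",
    to: "17853178070",
    webhook: "http://voice.fonoster:3000"
  })
  .then(console.log)
  .catch(console.error);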

Need help?

Fonoster is developed in the open. Here are some of the channels you can use to reach us:

Discord

GitHub discussions:

Twitter: @fonoster

We look forward to hearing from you.

· 3 min read
Yuri Santana

Hear the full conversation

We had a lovely conversation with Graham, an accessibility expert who gave us some wonderful insights on how to test our contrast levels, why accessibility matters, and how to fix common mistakes. Let's look at some of the main points covered in the conversation.

Accessibility is not only important for people with disabilities; we all see and experience the benefits.

There’s this enormous field of software development that people do not know much about. That is accessibility.

How can we make low-code non-code contributions more accessible?

From an accessibility point of view, structure your headings correctly: they shouldn't be nested inside each other, and there should always be only one Heading 1 per page. Do not use headings because they're pretty; use them for their intended purpose. This is important for people using screen readers.

Use labels on forms; do not replace them with placeholders. This is an issue for people in general. For example, someone anxious might fill out the form and then want to double-check it, questioning which field is which; their only option is to delete what they typed to reveal the placeholder text and confirm they filled it out right. That is pretty inconvenient.

The second point is alt text on images: try describing the picture as if you were talking to someone on the phone. And one last thing: if there are pictures of code snippets, turn them into actual code snippets, both for accessibility and so people can copy your code faster. For example:

Some code you can copy

Why is accessibility important?

  • Accessibility not only improves your code or life, but improves other people’s as well.
  • 97.4% of websites overall have accessibility errors bad enough to stop someone with a disability from using that website.
  • 1 in 10 people have a disability and 1 in every 10 people have a disability that affects how they interact with web pages.

But we can approach this from two angles. How does this affect me as a developer?

Well, it makes you use best practices and semantic HTML, it will save you time (for example, with the button tag), and it makes your code easier to read and understand for future checks and reviews.

The other perspective is the business one. There are a billion people in the world who care about accessibility because they have family members with disabilities or are disabled themselves, so when you cater to those audiences, you can outperform your competition and reach a bigger market.

Another reason why you should care is that the things you do for those with disabilities can help those without disabilities.

Tips to make your projects more accessible

You need to know how to identify problems. There are tools that help with this, like Accessibility Insights. It'll pick up color contrast issues, wrong HTML element usage, the focus order on the page, and other useful checks.

Read about accessibility, learn about semantic HTML and best practices. Consume content that can help you improve the structure of your code and make sure to run it through accessibility checkers or make sure to apply Accessibility Guidelines specifications.


· 4 min read
Pedro Sanders

Last year, when I started assembling Team Fonoster, I published a post on Reddit that sparked a great conversation and placed Fonoster on GitHub's trending list, even though we didn't have much to show.

As a result, I had the opportunity to interview dozens of CTOs from companies worldwide and speak with several investors who were interested in the idea of an open-source stack of Programmable Telecommunications.

In the interviews, I found we need an innovative approach to a cloud-based stack for Programmable Telecommunications.

Why did we need CPaaS in the first place?

Building an application that takes advantage of the existing Telecom network has always been a difficult task compared with, for example, building a web-based application.

This is difficult because it involves a particular set of skills that is challenging to find and can get really costly.

Let's face it, no one wants to read through dozens of RFCs to program a phone call.

So, when the API era arrived along with UCaaS and CPaaS providers, it was a no-brainer to use one of those providers to deploy a solution within weeks instead of spending months only to get a simple use-case.

So what's wrong with traditional CPaaS?

There is nothing wrong with traditional CPaaS. In fact, in most cases, using a CPaaS is a great option to deploy a Telecommunications solution.

However, even though the concept of using a CPaaS to go to market quickly is fantastic, it comes at a high price for some use-cases. After all, if something goes wrong, you will have no other option but to migrate to another CPaaS or build your own solution and start again from square one.

Some companies complain about the high prices for using a CPaaS. A startup CTO once told me, “It almost feels that we are paying for a lot of features we don't need.” This is because, with a traditional CPaaS, you start on a pay-as-you-go model, but costs can quickly get out of control.

Other companies find themselves limited by their providers' features because with traditional CPaaS you have no option but to use what they have available. There is no chance for customization. And even though that's not a problem for most companies, it is a deal-breaker for technology companies.

Then you have use-cases, especially in the healthcare industry, that can't benefit from using a traditional CPaaS due to privacy concerns and local regulations.

In which of those categories does your company fall?

How can we make this better?

The primary innovation of Fonoster lies in researching and developing the means for creating a highly portable, cloud-based Programmable Telecommunications stack.

This Programmable Telecommunications stack will allow businesses to call an API to dial, answer a call, establish a video session, send SMS, etc. There won't be any concern about what servers and networks are doing with that information in the background.

Our overall approach to building Fonoster is to use existing open-source solutions that are best in their class when possible and build our own when necessary. We then integrate this individual open-source software into a cohesive set of APIs that resembles a traditional CPaaS.

For example, to start a simple Voice Application, one could write JavaScript code like the one below:

const { VoiceServer } = require("@fonoster/voice");

const serverConfig = {
  pathToFiles: `${process.cwd()}/sounds`,
};

new VoiceServer(serverConfig).listen(async (req, res) => {
  console.log(req);
  await res.answer();
  await res.play(`sound:${req.selfEndpoint}/sounds/hello-world.sln16`);
  await res.hangup();
});

Or to make a call to the telephone network, you could use the SDK and write a simple script like this:

const Fonoster = require("@fonoster/sdk");
const callManager = new Fonoster.CallManager();

callManager
  .call({
    from: "9842753574",
    to: "17853178070",
    webhook: "https://5a2d2ea5d84d.ngrok.io"
  })
  .then(console.log)
  .catch(console.error);

Want to create a reminders application? No problem, in a few easy steps you can create and deploy a Cloud Function that will run on a given Cron schedule.

First, initialize your Cloud Function with:

fonoster funcs:init

Then, edit the handler with the following code:

const Fonoster = require("@fonoster/sdk");
const callManager = new Fonoster.CallManager();

// 🚀 Let's get started
// Use fonoster funcs:deploy to send to the cloud functions
module.exports = async (request, response) => {
  await callManager.call({
    from: "9842753589",
    to: "17853178070",
    webhook: "https://5a2d2ea5d84d.ngrok.io"
  });
  return response.succeed("OK");
};

Finally, deploy to the Cloud Functions subsystem with a Cron string (the schedule below runs every five minutes).

fonoster funcs:deploy --schedule "*/5 * * * *"

You get the idea.

The Cloud Functions capability is offered by the integration with OpenFaaS (by Alex Ellis).

What's next?

Be sure to check The essentials of building Voice Applications with Fonoster to overview the Programmable Voice features available on Project Fonoster.

Star the project on Github and contact us via:

· 3 min read
Pedro Sanders

At Fonoster Inc, we want to help companies and individuals that wish to adopt Fonoster as their Programmable Telecommunications solution. To help achieve this goal, our team uses Cloud-Init for cloud instance initialization.

You can deploy Fonoster to all major public cloud providers, private cloud infrastructure, and bare-metal installations with Cloud-Init.

In this tutorial, we will also use Multipass.

Multipass is a Canonical project that offers a lightweight VM manager for Linux, Windows, and macOS. With Multipass, you can deploy Fonoster on Ubuntu in a local environment in a single command. This deployment method is by far the fastest way to get started with Fonoster.

Requirements

Before you start this tutorial, you will need the following:

  • Multipass
  • NodeJS 14+ (Use nvm if possible)
  • Fonoster command-line tool (install with npm install -g @fonoster/ctl)

Deploying to Multipass

This method will not automatically enable TLS for you

Deploy Fonoster to Multipass with the following steps. First, download the cloud-config.txt file into a local directory with:

curl https://raw.githubusercontent.com/fonoster/fonoster/main/operator/cloud-config.txt -o cloud-config.txt

Since we are running locally, we have to modify the cloud-config to discover the private IPv4 instead of the public IPv4.

Update cloud-config with:

sed -i.bak -e "s#publicv4#privatev4#g" "cloud-config.txt"

Then, from the same directory, fire up Multipass.

multipass launch --name fonoster --disk 8G --cpus 2 --mem 4G --cloud-init cloud-config.txt

You might see a "timed out waiting for initialization to complete" message, especially on a slow Internet connection. Don't worry. The process will continue in the background.

You can access your VM and continue following the installation process with:

multipass shell fonoster
tail -f /var/log/cloud-init-output.log

Once you see "Cloud init is done!", the process is complete. If everything goes well, you will be able to log in to your Fonoster deployment. To authenticate to your deployment for the first time, get your admin credentials with:

cat /opt/fonoster/config/admin_credentials

Your output will look like the one below.

{
  "accessKeyId": "admin",
  "accessKeySecret": "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
}

Next, from the host machine, obtain your VM's IP with:

multipass info fonoster

Look for the entry starting with IPv4.

Name:           fonoster
State:          Running
IPv4:           192.168.64.39
                172.17.0.1
                172.24.0.1
...

With the accessKeyId, accessKeySecret, and your VM's IP address, you can now log in using the command-line tool or access your server with the SDK.
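
As a sketch of that last step, the SDK client can be pointed at your deployment with the credentials above. The option names (endpoint, accessKeyId, accessKeySecret) follow typical Fonoster SDK usage but should be checked against the docs for your version, and the endpoint value below is just the example VM address with an assumed port.

const Fonoster = require("@fonoster/sdk");

// Connect to your own deployment using the VM's IP and the admin credentials
// from /opt/fonoster/config/admin_credentials (all values below are examples)
const callManager = new Fonoster.CallManager({
  endpoint: "192.168.64.39:50051", // assumed API port; verify for your deployment
  accessKeyId: "admin",
  accessKeySecret: "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
});

// Any SDK call now goes against your local instance, for example:
callManager
  .call({
    from: "9842753574",
    to: "17853178070",
    webhook: "https://5a2d2ea5d84d.ngrok.io"
  })
  .then(console.log)
  .catch(console.error);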

What's next?

For more deployment options, be sure to check the operator's section of Fonoster's documentation.

Star the project on Github and contact us via: