
5 questions for… Nuance – does speech recognition have a place in healthcare?

Speech recognition has been on the brink of major success for decades, or so it feels. Rather than ask a set of generic “when will it be mainstream” questions, I was keen to catch up with Martin Held, Senior Product Manager, Healthcare at Nuance, to find out how things stood in this specific and highly relevant context.

  1. How do you see the potential for speech recognition in the healthcare sector?

Right now, the most gain will be from general documentation, enabling people to dictate instead of type and get text out faster. In some areas of healthcare, things are pretty structured – you have to fill in forms electronically, with drop-down lists and so on. That’s not a primary application for speech, but for anything that requires free text there’s no comparison or alternative. Areas where handwritten notes are transferred into notes fields are a good application, and discharge notes can also be very wordy.

From a use case perspective, we’ve done analysis on how much time teams spend on documentation, and it’s huge: three quarters of medical practices spend half of their time on documentation alone. In a study at the South Tees Emergency Department, use of speech recognition reduced documentation time by 40%. In another study at Dukinfield, a smaller practice, introducing our technology enabled clinicians to see four more patients per day (about a 10% increase).

  2. What has happened over the past 5 years in terms of performance improvements and innovation?

In these scenarios, it’s a question of “can it work, can it perform” across a range of input devices. General speech recognition has improved so much that accuracy is in the upper-90% range straight out of the gate. None of our products now require training, thanks to new technology based on deep neural networks and machine learning.

In healthcare, we have also added cloud computing and changed the architecture: we put a lightweight client on the end-point machine or device, which streams audio to a back-end recognition server hosted in Microsoft Azure. We recently announced the general availability of Dragon Medical One, our cloud-based recognition service.
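To make that thin-client pattern concrete, here is a minimal sketch: the end-point device does no recognition itself, it simply streams audio out and receives text back. The websocket endpoint, message framing and chunk source below are hypothetical stand-ins, not Nuance’s actual API.

```python
import asyncio
import websockets  # pip install websockets

# Hypothetical back-end endpoint; the real protocol is not public here.
RECOGNITION_URI = "wss://recognizer.example.net/stream"

def audio_chunks():
    """Stand-in for a microphone: yields 100 ms chunks of 16 kHz, 16-bit mono PCM."""
    for _ in range(10):
        yield b"\x00" * 3200  # silence, for illustration only

async def stream_audio():
    async with websockets.connect(RECOGNITION_URI) as ws:
        for chunk in audio_chunks():
            await ws.send(chunk)   # the thin client just streams audio out
        await ws.send(b"")         # hypothetical end-of-stream marker
        print(await ws.recv())     # back end returns the recognised text

if __name__ == "__main__":
    asyncio.run(stream_audio())
```

The design point is that all the heavy lifting (acoustic models, language models) lives on the server, so the end-point can be as modest as a phone.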

Still, connectivity is a big issue, in particular for mobile situations such as a community nurse: it’s not always possible to use recognition back in the car if the mobile signal is poor, for example. We are looking at technology that could record audio, then transcribe it later.
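That record-now, transcribe-later idea amounts to a store-and-forward queue. A minimal sketch, with hypothetical connectivity-check and upload stand-ins:

```python
import pathlib
import time

QUEUE_DIR = pathlib.Path("pending_recordings")
QUEUE_DIR.mkdir(exist_ok=True)

def network_available() -> bool:
    # Placeholder: real code would probe the mobile connection.
    return True

def submit_for_transcription(audio: bytes) -> str:
    # Hypothetical upload call to the recognition back end.
    return f"<transcript of {len(audio)} bytes>"

def save_recording(audio: bytes) -> None:
    """Persist a recording locally so nothing is lost while offline."""
    (QUEUE_DIR / f"{time.time_ns()}.pcm").write_bytes(audio)

def flush_queue() -> None:
    """Submit queued recordings, oldest first, once connectivity returns."""
    for path in sorted(QUEUE_DIR.glob("*.pcm")):
        if not network_available():
            break  # give up until the next retry
        submit_for_transcription(path.read_bytes())
        path.unlink()  # delete only after a successful submit

save_recording(b"\x00" * 3200)  # e.g. captured in the car, offline
flush_queue()                   # later, when a signal is available
```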

  3. How have you addressed the privacy and risk implications?

We are certified to connect to the N3 network, allowing NHS entities to connect to us in line with their requirements around governance and privacy, for example patient confidentiality. Offering a service through the NHS N3 network requires an Information Governance Statement of Compliance and submission of the IG Toolkit through NHS Digital. This involves a relatively long and detailed certification process, covering disaster recovery, Nuance’s internal processes and practices, which employees have access, and so on.

We also offer input via the public Internet, as encryption and other technologies are now secure enough for customers to connect through these means; so, for example, we can use mobile phones as input devices. We are not trying to build mobile medical devices, as we know how difficult that is, but we are looking to replace the keyboard (which is not a medical device!).

As a matter of best practice, it is still required that the doctor sign the discharge or confirm an entry in the electronic medical record system, whether it has been typed or dictated. Generated text is always a reference, and that will need to stay the case: it will be more than five years before the computer can be seen as taking this responsibility from the doctor. Advice, similarly, can only be guidance.

  4. How do you see the market need for speech recognition maturing in healthcare?

Right now we’re still very much in an enablement situation with our customers, helping with their documentation needs. From a recognition perspective we can see the potential of moving from enablement to augmentation, making it simpler and hands-free, moving towards a virtual assistant approach for a single person. Further out, we have the potential to do that for multiple people at the same time, for example a clinician, parent and child.

We’re also looking at the coding side of things – categorising disease, treatment, length of stay and so on from patient documentation. Codes are used for multiple things: reimbursement with insurers; negotiation between GPs, primary care and secondary care about which services to provide in future; and negotiation between commissioners and trusts on payment levels. In primary care, doctors do the coding themselves, but in secondary care it’s done by a coder looking through the record after the patient is discharged. If data is incomplete or non-specific, trusts can miss out on funding. Nuance already offers Natural Language Understanding-based coding products in the US, and these are being evaluated against the specifics of the UK healthcare market.

So we want to help turn documentation into something that can be easily analysed. Our technology can not only recognise what you say; with natural language understanding it can analyse the text and match it against codes, potentially opening the door to offering prompts. For example, if a doctor diagnoses COPD, the clinician may need to ask whether the patient is a smoker, which has a consequence for the code.
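To illustrate the idea only (this is nothing like a production NLU pipeline), a toy matcher might scan free text for known terms, emit candidate codes and raise follow-up prompts. The keyword table and ICD-10 codes below are illustrative:

```python
RULES = {
    "copd": ("J44.9", "COPD, unspecified"),
    "asthma": ("J45.909", "Asthma, unspecified, uncomplicated"),
}
PROMPTS = {
    # A COPD diagnosis is coded differently depending on smoking status.
    "copd": "Is the patient a current or former smoker?",
}

def code_note(note: str):
    """Return (candidate codes, follow-up prompts) found in a free-text note."""
    text = note.lower()
    codes = [RULES[term] for term in RULES if term in text]
    prompts = [PROMPTS[term] for term in PROMPTS if term in text]
    return codes, prompts

codes, prompts = code_note("Patient presents with worsening COPD.")
print(codes)    # [('J44.9', 'COPD, unspecified')]
print(prompts)  # ['Is the patient a current or former smoker?']
```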

  5. How does Nuance see the next 5 years panning out, in terms of measuring success for speech recognition?

We believe speech recognition is ready to deliver a great deal of benefit to healthcare, improving efficiency and freeing up clinical staff. In terms of the future, we recently showed a prototype of a virtual assistant that combines a number of technologies: biometrics, complete speech control, text analysis and meaning extraction, and appropriate selection, so the machine can distinguish between a command and something I simply wanted to say.
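That “appropriate selection” step, deciding whether an utterance is a command or dictation, could be caricatured as a filter like the one below. Real systems use acoustic and semantic context; this keyword heuristic is purely illustrative:

```python
COMMAND_VERBS = {"open", "show", "insert", "delete", "sign"}

def is_command(utterance: str) -> bool:
    """Crude selection: does this utterance look like a command?"""
    words = utterance.strip().lower().split()
    return bool(words) and words[0] in COMMAND_VERBS

print(is_command("open the discharge summary"))      # True -> execute it
print(is_command("the patient reports chest pain"))  # False -> dictate as text
```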

This combination should make the reaction a lot more human; we call this conversational artificial intelligence. Another part of this is about making text-to-speech as human as possible, then combining that with cameras and microphones in the environment: pointing at something and saying “give me more information about this”, for example. That’s all longer term, but the virtual assistant and video are things we are working on.

My take: healthcare needs all the help it can get

So, does speech recognition have a place? Over the past couple of decades of use, we have learned that we generally do not like talking into thin air, and particularly not to a computer. The main change of recent years, the reduction in training time, has done little to remove this largely psychological blocker, which means that speech recognition remains in a highly useful, yet relatively limited, niche of auto-transcription.

Turning specifically to the healthcare industry, a victim of its own science-led success: it is difficult to think of an industry vertical in which staff efficiency matters more. In every geography, potential improvements to patient outcomes are stymied by a lack of funds, the symptoms of which are waiting lists, bed shortages and so on, while the sector is burdened by the weight of ever-increasing bureaucracy.

Even if speech recognition could knock one or two percentage points off the time taken to execute a clinical pathway, the overall savings could be massive. Greater efficiency also opens the door to higher potential quality, as clinicians can focus on ‘the job’ rather than the paperwork.

Looking further ahead, use of speech recognition beyond note-taking also links to the potential for improved diagnosis, through augmented decision-making, and indeed improved patient safety, as technology provides more support to what is still a highly manual industry. This will take time, but our general habits are changing as the likes of Alexa and Siri make us more comfortable talking to inanimate objects.

Overall, progress may be slow for speech recognition, particularly in healthcare, but it is heading in the right direction. One day, our lives might depend on it.

 



from Gigaom https://gigaom.com/2018/06/20/5-questions-for-nuance-does-speech-recognition-have-a-place-in-healthcare/
