Wednesday, January 3, 2018

Seeing AI: my two cents

There are a lot of reviews on YouTube of the free Microsoft Seeing AI Talking Camera app for iPhone. Instead of repeating what others have said, I just want to put in my 2 cents on each of the functions offered by this app. I will provide links for additional info at the end of this post. After reading this post and viewing the videos, I think you will agree this is one of the must-have apps if you have vision issues. This post is about Seeing AI 2.0 for iPhone. As of today, there is no Android version. I was so convinced this app would make my life easier that I bought a cheap iPhone just to use it.

Microsoft refers to the functions in the app as channels. When you open the app, the channel selection defaults to the Short Text channel. Each flick of the finger changes the channel, in the order listed below. Some of the channels are very useful and some are just good for amusement.

Short Text Channel
I think this is the most useful function of this app because it instantly reads out whatever text is placed in front of the camera. Unlike KNFB Reader, I don't need to take a picture first in order for the app to read out the text. With this channel, I can sort mail or read other short text much faster than with KNFB Reader. I can even read text on a TV screen, like the phone number in some ads or the headlines during a newscast. I tried to use it to read the caller ID display on my tabletop phone, but it didn't work. It couldn't read my LCD thermometer either, so I think this app can't read text on small LCD screens.

I found this app to be a great complementary tool to my ZoomText screen reader. The ZoomText reader just won't read some Windows screens, like the anti-virus result screen. I just point the camera at the monitor showing the unreadable screen and let Seeing AI read it to me. The app can do all of this without internet access.

I hope that in the future the app will provide a way for the user to interact with the text. For example, the app was able to read all the labels on my washing machine's buttons. Wouldn't it be nice if I could place a finger on a button and have the app say which button my finger is on?

Document Channel
Use this channel to scan and read a full-page document. As with KNFB Reader, I have to take a picture of the document before the app will read out the text. The app also provides verbal assistance to help you frame the whole document within the viewfinder. Unlike KNFB Reader, Seeing AI needs internet access for this channel to work. I guess Seeing AI sends the text image to a Microsoft cloud server for OCR processing. In my short tests, Seeing AI provided very accurate OCR results on plain text documents. It also did well with text embedded in pictures.

I am hesitant to let Seeing AI process my more sensitive documents because the app would send them to Microsoft for OCR processing. I just don't want my financial and medical records floating around somewhere on the web. Until I find out more, I will use KNFB Reader for my private documents.

Product Channel
This is the barcode scanner function. Unlike other barcode scanners, you don't need to position the barcode exactly in the scanning window. All you need to do is point the camera at a product, and the app provides audio cues to tell you whether a barcode is in view. Once the barcode is captured, the app goes out to the internet to retrieve the product info. Depending on what info you want, I think the Short Text channel can do the job faster and more easily.

Person Channel
After you take a person's photo, the app attempts to guess the person's age, facial expression, and other basic info. This channel is definitely an amusement channel because the results were so far off the mark. Just for fun, I took a photo of our wedding photo, which was taken more than 30 years ago. The app said two faces were detected: a 42 year old man looking happy and a 15 year old girl looking natural.

You don't have to be blind to have fun with this one.

Microsoft calls the following preview channels. I take that to mean these functions are not ready for prime time and the results can't be fully trusted.

Currency Channel 
This function worked very well at identifying US paper money. It accurately identified dollar bills from the front or back, and even crumpled bills. I think its best use is sorting a bunch of paper money at home or at a bank. I don't think it is practical to use this function while making a purchase. Imagine you are paying for your hot dog with a $20 bill at a food truck, and the vendor hands you back some bills and coins. How many hands do you need? One hand to hold the change, one hand to fumble with the phone, one hand to hold your white cane, and don't forget to take the hot dog, while a line of hungry people behind you wonders what you are doing.

Scene Channel
You take a photo of something and the app tells you what is in it. I am not sure of the purpose of this channel because the info provided is not enough to act on. For example, let's say you are blind and you check into a motel room. You take a photo in a random direction and the app says a room with a bed, a lamp and a desk. It doesn't give the relative positions of these three things, so you can't tell whether the lamp is to the left or to the right of the bed.

I was wondering whether it would identify stuff on a table, so I gave it a try. When I pointed it at a TV remote, it correctly identified it as such. When I pointed it at a black stapler, the app said it was probably a close view of a black car. Huh? This channel probably wasn't meant to identify stuff at close range, but it was fun to see how it messed up on many simple objects.

Color Channel
Point your camera at an object and the app tells you its color. I was able to determine whether a banana was yellow or green. Since I am also color blind, I guess I will find more uses for this channel. Some people said the results are not very accurate. Well, that's why this is a preview channel.

Handwriting Channel
I didn't expect too much from this function, but I was amazed that it read most of the handwriting on the Christmas cards correctly. It was also able to read much of the handwriting on checks, like the payee's name and the memo. Yes, I did cover the routing and account numbers on the checks for this test. Since I hardly come across handwritten stuff in my daily life, I am not sure where else I can use this function.

Light Channel
This is a light detector function. The app provides an audio cue for the brightness of your surroundings, ranging from no tone when no light is detected to a high-pitched tone for very bright light. It works, but I don't think I have much use for this channel.

In conclusion, I think this is one of the best low vision apps I have used so far. The Short Text channel makes it so much easier and faster to read any type of short text.

Additional Info

Seeing AI Overview

Seeing AI Review 1

Seeing AI Review 2