Author Topic: Speech Recognition  (Read 5101 times)

Aslamma

  • Jr. Member
  • **
  • Posts: 85
    • View Profile
Speech Recognition
« on: May 04, 2010, 11:12:29 am »
Any thoughts on this?  Or is it just too unreliable to be useful?  My past attempts with other software were not that great. 

Wondering if this could be done via the iPad, Elve, and Command Fusion though.

I can't say that I am that eager about it, since it would most likely not turn out well, but I am curious.

John Hughes

  • Administrator
  • Hero Member
  • *****
  • Posts: 2851
    • View Profile
    • Codecore Technologies
Re: Speech Recognition
« Reply #1 on: May 04, 2010, 10:19:28 pm »
Believe it or not I actually have a simple Speech Recognition driver I wrote a year or more ago using the Microsoft speech engine.  The reason I have not worked more on it is because it would give too many false positives so I didn't think it was reliable enough to continue development on using the current speech engine.
« Last Edit: May 15, 2010, 11:29:23 pm by John Hughes »
John Hughes
Codecore Technologies

nikku

  • Newbie
  • *
  • Posts: 8
    • View Profile
Re: Speech Recognition
« Reply #2 on: March 15, 2013, 11:45:13 am »
Sorry for the necrobump, but figured I'd post as I'm actually thinking of working on something similar.
I currently have a program I wrote for my current automation setup that uses the Microsoft Speech SDK along with the Kinect, and it has 0 false positives, even when a movie or music is on full blast. I was thinking of adapting this program to Elve (since I am upgrading my automation stuff this year and wanted to try out different products to see what direction I want to take ;) ). The biggest hurdle in my code planning is how I'm going to grab all the information on lighting, scenes, etc. from the server and get it into the program for the dynamic grammar generation part (at the moment I was going to use the XML communication protocol and see how far I could get with that). Not sure if this is the route you went, but any advice you can offer would be great!

iostream212

  • Sr. Member
  • ****
  • Posts: 459
    • View Profile
Re: Speech Recognition
« Reply #3 on: March 18, 2013, 11:26:43 am »
Great news that you may be taking a stab at this. Reading up over on CT, an upstart is gaining notoriety for the same type of endeavor. I would try to help, but I am currently knee deep in another driver. Would love to see this functionality in Elve though. Maybe the user could map a keyword to the desired device or scene, and that way the driver won't have to automatically try to configure it all. It does seem like a lot of work to pull all the device info and build keywords off of that. It also doesn't seem user friendly, as you probably would have to use the device's full name, and many of mine have abbreviations. Like 'turn on FR-Fan-Light' or 'Turn off LR-Chndlr'.  :)
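A rough sketch of the keyword-mapping idea, in Python for brevity. Everything here is hypothetical: the alias table, the spoken names (my guesses at what the abbreviations stand for), and the `resolve` helper are illustrations only, not part of Elve or any existing driver.

```python
import re

# Hypothetical user-defined aliases: spoken keyword -> device name as configured.
# The spoken names are guesses at what the abbreviations mean.
ALIASES = {
    "family room fan light": "FR-Fan-Light",
    "living room chandelier": "LR-Chndlr",
}

# Matches phrases such as "turn on the family room fan light".
COMMAND = re.compile(r"^turn (on|off) (?:the )?(.+)$", re.IGNORECASE)


def resolve(phrase):
    """Return (device_name, action) for a recognized phrase, or None."""
    match = COMMAND.match(phrase.strip())
    if match is None:
        return None
    action = match.group(1).lower()
    keyword = match.group(2).lower()
    device = ALIASES.get(keyword)
    if device is None:
        return None
    return (device, action)
```

With a table like this the user only says the friendly name, and the abbreviated device name never has to be spoken.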
I always wanted to be somebody. In retrospect, I think I should have been more specific.

nikku

  • Newbie
  • *
  • Posts: 8
    • View Profile
Re: Speech Recognition
« Reply #4 on: March 18, 2013, 12:18:58 pm »
Yeah, in one of my other versions, it learned commands as it went; it would scan through devices and check your sentence, then do a weighted probabilistic matching to determine the most likely desired result. It would then confirm that its guess was what you wanted, and store that data so it would recognize the command in the future. That version also allowed custom aliases to be defined for devices (which could also be learned on the fly by the system). I've read a little bit of the driver API, though I was considering trying to do this via the XML interface and have this run as an app that interfaces with Elve. Still drafting up some ideas, but haven't really had time to work on it at the moment (lots going on at my job taking up lots of time haha). I think I've tried out the program you are thinking of. It's actually really impressive, though it's still pretty early in its development and (in my opinion) still lacking some desirable features, though that can change of course. The natural language processing of that program is pretty awesome though, and I imagine it's far more advanced than what I'm doing, which is just clever tricks with Regex.
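To give a feel for the match-confirm-learn loop, here is a small Python sketch. It is not my actual implementation: it swaps in `difflib.SequenceMatcher` as a stand-in for the weighted matching, and the function names and the 0.6 threshold are made up for illustration.

```python
from difflib import SequenceMatcher


def score(phrase, candidate):
    """Similarity between 0.0 and 1.0, ignoring case."""
    return SequenceMatcher(None, phrase.lower(), candidate.lower()).ratio()


def best_match(phrase, devices, aliases, threshold=0.6):
    """Return the most likely device name, or None if nothing scores well.

    devices: list of device names as configured.
    aliases: dict mapping a learned spoken alias to a device name.
    threshold: hypothetical cut-off below which we refuse to guess.
    """
    candidates = {name: name for name in devices}
    candidates.update(aliases)  # learned aliases point at their device
    best_device, best_score = None, 0.0
    for text, device in candidates.items():
        s = score(phrase, text)
        if s > best_score:
            best_device, best_score = device, s
    if best_score >= threshold:
        return best_device
    return None


def learn(aliases, phrase, device):
    """Store a confirmed phrase so it matches exactly next time."""
    aliases[phrase.lower()] = device
```

After the user confirms a guess, calling `learn` stores the phrase, so the same words score a perfect match the next time around.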

nikku

  • Newbie
  • *
  • Posts: 8
    • View Profile
Re: Speech Recognition
« Reply #5 on: March 18, 2013, 06:31:36 pm »
Incidentally, are there any code samples anywhere for the XML communications? I tried searching around but didn't really come up with anything.

iostream212

  • Sr. Member
  • ****
  • Posts: 459
    • View Profile
Re: Speech Recognition
« Reply #6 on: March 18, 2013, 07:34:29 pm »
I can post something later. What language? Also it is not really XML. It is essentially text with tags mimicking XML.


nikku

  • Newbie
  • *
  • Posts: 8
    • View Profile
Re: Speech Recognition
« Reply #7 on: March 18, 2013, 07:48:04 pm »
Ah, that was confusing me. I was trying to use XmlWriter in VB.NET to write out the XML into a stream, so that makes it much simpler. If you have an example in VB.NET or C# that would be great, but I could make do with any language really (I'm familiar with Java and C++ as well, though I mainly code in VB.NET and some C# at the moment).
Thanks for your help! Once I get past the communication part I'll have no issues porting over my current speech recognizer :)

iostream212

  • Sr. Member
  • ****
  • Posts: 459
    • View Profile
Re: Speech Recognition
« Reply #8 on: March 22, 2013, 12:59:03 pm »
Sorry for the delay. I have been busy but have not forgotten about you. I will try to get that to you as soon as possible. I whipped up something last night, which is the same connection code I have used in my other drivers, but it was having stability issues in this context. I can send you what I have if you want to try to fix it; otherwise I will fix it, but it may take some time before I can complete it.


nikku

  • Newbie
  • *
  • Posts: 8
    • View Profile
Re: Speech Recognition
« Reply #9 on: March 22, 2013, 02:25:54 pm »
I can wait if needed, but if you want to send the code my way I could always take a stab at it.

iostream212

  • Sr. Member
  • ****
  • Posts: 459
    • View Profile
Re: Speech Recognition
« Reply #10 on: March 23, 2013, 08:58:00 pm »
I took a look at it today with fresh eyes and it wasn't a stability issue at all. It was an authentication issue. The connection was being made properly, but the server drops it after a short period of time if you have not authenticated your client, and it won't send system property updates to you either until you have passed authentication. So now I am learning about hashing...  ;D and then I should have a stable working example.

iostream212

  • Sr. Member
  • ****
  • Posts: 459
    • View Profile
Re: Speech Recognition
« Reply #11 on: March 23, 2013, 09:35:56 pm »
Ok... so I added hashing (i.e. copied and pasted hashing examples from the web. Thanks to MSDN for the hashing code, and CodeProject for the token generator) and am still not able to authenticate my client. I'll go ahead and post what I have. If you are able to fix it please let me know how. I have hit a brick wall at the moment and need to pass the torch. Plus it's late and I'd rather be watching Breaking Bad.  ::)

This is a VB.NET Windows Forms application (I did this for testing purposes. Once fixed we can easily add it to a class library application for Elve). The form contains a multiline textbox named textbox1, and two buttons, named button1 and button2. Click button1 to open a connection to the XML service. Click button2 to start the authentication process. Note I did not add a checksum: 1.) I don't know what Elve wants to see, and 2.) I was able to use the XML service with that field blank. You can also use this to write to the network stream, but as stated in the earlier post, the service will eventually kick you off if you have not authenticated.
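For anyone who would rather poke at the message format outside of VB.NET, the envelope this listing sends can be sketched in a few lines of Python. The `J9Message` wrapper and its `len`, `cksum`, `id` and `async` attributes are copied from the listing in this post; the helper names `frame` and `extract_tag` are made up, the checksum is left blank as described above, and the exact shape of the server's hello response is an assumption apart from the `<t1>` tag.

```python
def frame(payload, msg_id=1, cksum=""):
    """Wrap a payload in the J9Message envelope used by the listing.

    cksum is left empty by default because the expected checksum
    algorithm is not known at this point in the thread.
    """
    return (
        '<J9Message len="%d" cksum="%s" id="%d" async="true">%s</J9Message>'
        % (len(payload), cksum, msg_id, payload)
    )


def extract_tag(message, tag):
    """Return the text between <tag> and </tag>, or None if absent.

    Plain string searching is used because the protocol is text with
    XML-like tags rather than guaranteed well-formed XML.
    """
    open_tag = "<%s>" % tag
    close_tag = "</%s>" % tag
    start = message.find(open_tag)
    if start == -1:
        return None
    start += len(open_tag)
    end = message.find(close_tag, start)
    if end == -1:
        return None
    return message[start:end]
```

`frame("<hello/>")` produces the same text that Button2 writes to the stream, and `extract_tag(reply, "t1")` pulls out the server token the same way the two `Split` calls do.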


Code:
Imports System.IO
Imports System.Text
Imports System.Security.Cryptography

Public Class Form1
    ' So the app knows whether we are trying to authenticate
    Dim authenticating As Boolean = False
    Public networkStream As Net.Sockets.NetworkStream
    Delegate Sub SetTextCallback(ByVal text As String)

    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Connect()
    End Sub

    Private Sub StopConnection()
        ' Guard against closing a stream that was never opened
        If networkStream IsNot Nothing Then
            networkStream.Close(100)
        End If
    End Sub

    Private Sub Connect()
        Dim tcpClient As New Net.Sockets.TcpClient("Elve Server IP Address in dotted quad", 33905)
        networkStream = tcpClient.GetStream()
        Dim buffer(2048) As Byte
        networkStream.BeginRead(buffer, 0, buffer.Length, AddressOf ReadComplete, buffer)
    End Sub

    Private Function KeyGenerate() As String
        ' Generates the random token (t2) used during authentication
        Dim KeyGen As New RandomKeyGenerator
        KeyGen.KeyLetters = "abcdefghijklmnopqrstuvwxyz"
        KeyGen.KeyNumbers = "0123456789"
        KeyGen.KeyChars = 12
        Return KeyGen.Generate()
    End Function

    Private Sub ReadComplete(ByVal AR As IAsyncResult)

        If networkStream.CanRead = False Then
            Exit Sub
        End If

        Dim buffer() As Byte = DirectCast(AR.AsyncState, Byte())
        Dim bytesRead As Integer = networkStream.EndRead(AR)

        ' Zero bytes read means the server closed the connection
        If bytesRead = 0 Then
            Exit Sub
        End If

        Dim Message As String = System.Text.Encoding.ASCII.GetString(buffer, 0, bytesRead)

        ' Check whether we are attempting to authenticate
        If Message.Contains("response cmd=""hello""") AndAlso authenticating Then
            Dim randomKey() As Byte = Encoding.ASCII.GetBytes(KeyGenerate())
            Dim t2 As String = Nothing
            For i As Integer = 0 To (randomKey.Length - 1)
                t2 = t2 & Hex(randomKey(i))
            Next
            Dim strTemp() As String = Split(Message, "<t1>")
            Dim strTemp1() As String = Split(strTemp(1), "</t1>")
            Dim t1 As String = strTemp1(0)
            ' Named md5Hasher so the variable does not shadow the MD5 type
            Dim md5Hasher As MD5 = MD5.Create()
            Dim password = Program.GetMd5Hash(md5Hasher, "Your Elve Password")
            Dim hash = (t2 & password.ToUpper & t1)
            hash = Program.GetMd5Hash(md5Hasher, hash)
            Dim a = "<auth><u>Your Elve Username</u><t2>" & t2 & "</t2><h>" & hash & "</h></auth>"
            Dim b = "<J9Message len=""" & a.Length & """ cksum=""May need to add checksum here"" id=""1"" async=""true"">" & a & "</J9Message>"
            Dim asu() As Byte = System.Text.Encoding.ASCII.GetBytes(b)
            networkStream.Write(asu, 0, asu.Length)
        End If

        If Me.TextBox1.InvokeRequired Then
            Dim d As New SetTextCallback(AddressOf SetText)
            Me.Invoke(d, New Object() {Message})
        Else
            TextBox1.Text = Message & vbCrLf & TextBox1.Text
        End If

        networkStream.BeginRead(buffer, 0, buffer.Length, AddressOf ReadComplete, buffer)

    End Sub

    Private Sub SetText(ByVal text As String)
        Me.TextBox1.Text = text & vbCrLf & TextBox1.Text
    End Sub

    Private Sub Form1_Disposed(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Disposed
        StopConnection()
    End Sub

    Private Sub Button2_Click(sender As System.Object, e As System.EventArgs) Handles Button2.Click
        Dim a = "<hello/>"
        Dim b = "<J9Message len=""" & a.Length & """ cksum=""May need to add checksum here"" id=""1"" async=""true"">" & a & "</J9Message>"
        Dim asu() As Byte = System.Text.Encoding.ASCII.GetBytes(b)
        authenticating = True
        networkStream.Write(asu, 0, asu.Length)
    End Sub

    Public Class RandomKeyGenerator
        Dim Key_Letters As String
        Dim Key_Numbers As String
        Dim Key_Chars As Integer
        Dim LettersArray As Char()
        Dim NumbersArray As Char()

        Protected Friend WriteOnly Property KeyLetters() As String
            Set(ByVal Value As String)
                Key_Letters = Value
            End Set
        End Property

        Protected Friend WriteOnly Property KeyNumbers() As String
            Set(ByVal Value As String)
                Key_Numbers = Value
            End Set
        End Property

        Protected Friend WriteOnly Property KeyChars() As Integer
            Set(ByVal Value As Integer)
                Key_Chars = Value
            End Set
        End Property

        Function Generate() As String
            Dim i_key As Integer
            Dim Random1 As Single
            Dim arrIndex As Int16
            Dim sb As New StringBuilder
            Dim RandomLetter As String

            LettersArray = Key_Letters.ToCharArray
            NumbersArray = Key_Numbers.ToCharArray

            For i_key = 1 To Key_Chars
                Randomize()
                Random1 = Rnd()
                arrIndex = -1
                If (CType(Random1 * 111, Integer)) Mod 2 = 0 Then
                    Do While arrIndex < 0
                        arrIndex = _
                         Convert.ToInt16(LettersArray.GetUpperBound(0) _
                         * Random1)
                    Loop
                    RandomLetter = LettersArray(arrIndex)
                    If (CType(arrIndex * Random1 * 99, Integer)) Mod 2 <> 0 Then
                        RandomLetter = LettersArray(arrIndex).ToString
                        RandomLetter = RandomLetter.ToUpper
                    End If
                    sb.Append(RandomLetter)
                Else
                    Do While arrIndex < 0
                        arrIndex = _
                          Convert.ToInt16(NumbersArray.GetUpperBound(0) _
                          * Random1)
                    Loop
                    sb.Append(NumbersArray(arrIndex))
                End If
            Next
            Return sb.ToString
        End Function
    End Class

    Class Program
        Shared Function GetMd5Hash(ByVal md5Hash As MD5, ByVal input As String) As String
            ' Convert the input string to a byte array and compute the hash.
            Dim data As Byte() = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(input))
            ' Create a new Stringbuilder to collect the bytes
            ' and create a string.
            Dim sBuilder As New StringBuilder()
            ' Loop through each byte of the hashed data 
            ' and format each one as a hexadecimal string.
            Dim i As Integer
            For i = 0 To data.Length - 1
                sBuilder.Append(data(i).ToString("x2"))
            Next i
            ' Return the hexadecimal string.
            Return sBuilder.ToString()
        End Function 'GetMd5Hash
    End Class
End Class
« Last Edit: March 23, 2013, 11:39:51 pm by iostream212 »

iostream212

  • Sr. Member
  • ****
  • Posts: 459
    • View Profile
Re: Speech Recognition
« Reply #12 on: March 26, 2013, 10:39:10 am »
Getting closer... figured out the checksum. I tested the algorithm extensively and it was spot on, matching the checksums I was receiving from the XML service. However, even after adding that I was not able to authenticate my client. I think it's an issue with my hash function. Anyone know how to resolve it? I know some of you are programmers.

Here is what I did to test:
Get Wireshark and start a capture on the server IP and port of your Elve XML service.
Then sign into EMS. Wireshark will capture the client-server handshake done during EMS sign-in.
Extract the t1 (<t1>), t2 (<t2>), and hash (<h>) values from Wireshark.
Now hash your password (must be the same password used to log into EMS).
Hash the result of concatenating these strings: t2 + password hash + t1
See if your results match the <h> value from the handshaking session.
In my function posted earlier they do not.   ???
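The comparison in the steps above can be sketched in Python like this. It assumes MD5 with hex-encoded output, as in my VB.NET listing, and it tries the password hash in both lower and upper case because I do not know which one Elve expects (the listing upper-cases it). The function names are made up, and whether any variant matches a real capture is exactly the open question.

```python
import hashlib


def md5_hex(text):
    """MD5 of a string, as a lowercase hex digest."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def candidate_hashes(password, t1, t2):
    """Yield (label, hash) for t2 + md5(password) + t1.

    Both hex cases of the password hash are tried, since the case the
    server expects is an unknown here.
    """
    pw = md5_hex(password)
    for label, pw_hash in (("lower", pw), ("upper", pw.upper())):
        yield label, md5_hex(t2 + pw_hash + t1)


def find_match(password, t1, t2, captured_h):
    """Return the label of the variant matching the captured <h>, or None."""
    for label, h in candidate_hashes(password, t1, t2):
        if h.lower() == captured_h.lower():
            return label
    return None
```

Feed it the t1, t2 and `<h>` values from a capture: if `find_match` returns None for every variant, the server is combining the pieces in some way this sketch does not cover.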

nikku

  • Newbie
  • *
  • Posts: 8
    • View Profile
Re: Speech Recognition
« Reply #13 on: March 26, 2013, 02:44:06 pm »
Hey, sorry I haven't responded in a while, been busy at work. I have some code set up that might get me authenticated, just haven't had time to test. I'll let you know how it turns out as soon as I'm able.

iostream212

  • Sr. Member
  • ****
  • Posts: 459
    • View Profile
Re: Speech Recognition
« Reply #14 on: March 26, 2013, 05:06:26 pm »
Hey, no worries. I don't have a project that I even need to use the XML service for. It's just the principle! I hate being bested by tech stuff, and now this auth issue just got personal. For my own growth and tech curiosity I must figure out how to do this or need someone to show me; otherwise I will keep hammering away until it's solved.  ;D