Wednesday, September 6, 2017

Happy Douglas Adams September To All!

well, i'm a little more than halfway through editing a Word document (created from an Amazon Kindle ebook) of Douglas Adams's The Long Dark Tea-Time of the Soul to conform to the audiobook he himself recorded for it.

first, i had to clean up the audiobook itself a bit:
• inserting "Chapter X" (where X is a number from 1 to 35) at the beginning of each chapter (borrowing the audio mostly from his audiobook of Dirk Gently's Holistic Detective Agency and "Chapter 1" from his audiobook of So Long And Thanks For All The Fish);
• repairing a number of words which had gotten cut off halfway through (often at the end of tracks) or were otherwise incomplete — having only the 'hoke' part of "hoax" but not the 's' sound at the end, for instance;
• inserting small silences between chapters to help with the flow;
• and deleting the commercial branding of the previous (now defunct) copyright holder.

now i'm listening closely to the recording and altering the text in the Word document to match the recording more precisely.

and i'm making footnotes to document each change and/or to give the original text in the ebook that has been omitted or changed in the audiobook; this is the part that takes some time.

like i said, i see this as Douglas Adams's final edit of the book.  next, i plan to do Dirk Gently's Holistic Detective Agency and eventually move on to the five books of the Hitchhiker's Guide To The Galaxy series.

when i've finished Long Dark Tea-Time, i may work on some software which could quickly (i hope) move through the AIFF audio files and extract the audio for a given sample of text from the book — sort of an on-demand Douglas Adams voice.  it would be limited to text which appears in the book, but i'd be interested in the results; if it's successful, it could be expanded to include more books as i edit their corresponding documents to match their Douglas-Adams-read audiobooks — that's the main point of creating a text that very closely matches the recording.
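here's a minimal sketch of how the lookup side might work, assuming a word-level alignment file (one line per word: the word, the AIFF filename, and start & end times in seconds; a format invented just for illustration) and handing the actual trimming off to an external tool like sox.  producing the alignment itself is the real work, done elsewhere:

```cpp
// sketch: look up a phrase in a (hypothetical) word-level alignment
// and extract the matching audio span with an external tool.
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

struct AlignedWord {
    std::string word;   // lowercased word as spoken
    std::string file;   // which AIFF file it lives in
    double start, end;  // offsets in seconds within that file
};

std::vector<AlignedWord> loadAlignment(const std::string &path) {
    std::vector<AlignedWord> out;
    std::ifstream in(path);
    AlignedWord w;
    while (in >> w.word >> w.file >> w.start >> w.end)
        out.push_back(w);
    return out;
}

int main() {
    // "alignment.txt" is a stand-in name for the alignment data
    std::vector<AlignedWord> align = loadAlignment("alignment.txt");
    std::vector<std::string> phrase = {"long", "dark", "tea-time"};

    // naive scan for the first contiguous match of the phrase
    for (size_t i = 0; i + phrase.size() <= align.size(); ++i) {
        bool match = true;
        for (size_t j = 0; j < phrase.size(); ++j)
            if (align[i + j].word != phrase[j]) { match = false; break; }
        // require the whole phrase to sit inside one AIFF file
        if (match && align[i].file == align[i + phrase.size() - 1].file) {
            double start = align[i].start;
            double length = align[i + phrase.size() - 1].end - start;
            std::ostringstream cmd;
            cmd << "sox \"" << align[i].file << "\" phrase.aiff trim "
                << start << " " << length;
            return std::system(cmd.str().c_str());
        }
    }
    std::cerr << "phrase not found in alignment\n";
    return 1;
}
```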

Happy September!

[the above post was copied directly from a dsm32 post:
http://dsm32.blogspot.com/2017/09/happy-douglas-adams-september-to-all.html ]

and i've been having fun getting used to Douglas Adams's voice; i hope to create some kind of sound-frequency / volume envelopes from his sentences which can then be applied to syntactically identical sentences uttered by a voice synth — i'll be interested in how it comes out, especially on the CereProc Scottish 'Heather' voice!
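as a first step toward those envelopes, here's a minimal sketch of extracting a volume envelope (RMS per window) from mono PCM samples, assuming the samples have already been decoded out of the AIFF file; pitch tracking would take more machinery:

```cpp
// sketch: compute a coarse volume envelope (RMS per window) from
// mono PCM samples already decoded from an AIFF file.
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// RMS of each non-overlapping window of `windowSize` samples;
// at 44100 Hz, windowSize = 441 gives one envelope point per 10 ms
std::vector<double> rmsEnvelope(const std::vector<double> &samples,
                                size_t windowSize) {
    std::vector<double> env;
    for (size_t i = 0; i < samples.size(); i += windowSize) {
        double sum = 0.0;
        size_t n = std::min(windowSize, samples.size() - i);
        for (size_t j = 0; j < n; ++j)
            sum += samples[i + j] * samples[i + j];
        env.push_back(std::sqrt(sum / n));
    }
    return env;
}

int main() {
    // one second of a fading sine stands in for a spoken sentence
    std::vector<double> samples(44100);
    for (size_t i = 0; i < samples.size(); ++i)
        samples[i] = (1.0 - i / 44100.0) * std::sin(i * 0.05);
    for (double v : rmsEnvelope(samples, 441))
        std::printf("%.4f\n", v);
}
```

an envelope like this could then be time-stretched onto the synth's rendering of a matching sentence; whether that sounds at all like him is the experiment.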

Friday, December 2, 2016

A New Voice

thanks to CereProc / CereVoice, the AI project now has a Scottish accent, via a free voice (Heather) they provided for my academic research.

you can try out a variety of voices at their website: type some text into the window at the top of the page and select a voice.  they rock!
https://www.cereproc.com/en/

the voice is a *big* improvement over those provided with Mac OS, and i've just begun to explore articulation of Heather's voice.  (and determining the proper way to speak a text can provide some computational-emotional info for that text.)

more progress on storing input & output in text files; indeed, much of the speech is performed by passing temporary text files to the CereProc software (via Python).

progress!

Monday, July 20, 2015

Text Interpretation


• input a text; then interpret the text (w context), because interpretation gives meaning to text.  but don’t worry too much about the "accuracy" of interpretation — make a first approximation to use (later) and check consistency w other texts.

• the reason to store text (and all texts read are stored) is to go back later and re-interpret the text — the meaning will change over time!  the first interpretation of a text (and the second, and the third, and the …) is not ultimately useful except as a starting place.  deeper understanding comes through comparing interpretations made at different times.  (a rough sketch of a data structure for this follows these notes.)

• over time, the meaning of a text evolves for each reader.
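here's a minimal sketch of one way to store a text alongside its dated interpretations; every name in it is provisional:

```cpp
// sketch: a stored text keeps every interpretation ever made of it,
// each stamped with when it was made, so later passes can be compared.
#include <ctime>
#include <iostream>
#include <string>
#include <vector>

struct Interpretation {
    std::time_t when;     // when this reading was made
    std::string meaning;  // first-approximation gloss, to be revisited
};

struct StoredText {
    std::string text;                      // the text itself, kept verbatim
    std::vector<Interpretation> readings;  // every pass, oldest first

    void reinterpret(const std::string &meaning) {
        readings.push_back({std::time(nullptr), meaning});
    }
};

int main() {
    StoredText t{"The ships hung in the sky in much the same way that bricks don't.", {}};
    t.reinterpret("a joke about flying");           // first approximation
    t.reinterpret("an inversion of expectations");  // a later re-reading
    for (const auto &r : t.readings)  // time_t prints as seconds on most systems
        std::cout << r.when << ": " << r.meaning << "\n";
}
```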

Friday, July 17, 2015

Listen-Respond Loop

have been shaping an approach for the main (foreground) program and the A.I.'s listen-respond loop (a rough sketch in code follows the list):

(in addition to the emotion, memory & association, natural language comprehension & expression, and virtual body programs, all running in the background)

1) Input (aka Listen or Hear or something like that, for internal or external text)
2) Feel (get emotional response from emotion program, influence of memory & association)
3) Think (use “soft” truth-functional logic w suggestions from association & memory to derive possibilities)
4) Respond (put it all together and fashion a response, or lack thereof; store in memory)
5) Repeat
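and here's that loop as a rough C++ sketch, with every hard part stubbed out; the function names are placeholders of mine, not settled design:

```cpp
// sketch: the foreground listen-respond loop, with the background
// programs (emotion, memory, language, body) stubbed as free functions.
#include <iostream>
#include <optional>
#include <string>

struct Feeling { std::string label; double intensity; };

std::string listen() {                 // 1) Input: internal or external text
    std::string line;
    std::getline(std::cin, line);
    return line;
}

Feeling feel(const std::string &in) {  // 2) Feel: emotion program, stubbed
    return {"curiosity", 0.5};
}

// 3) Think: the "soft" truth-functional logic would go here;
//    returning nullopt means no response seemed warranted
std::optional<std::string> think(const std::string &in, const Feeling &f) {
    if (in.empty()) return std::nullopt;
    return "i heard (feeling " + f.label + "): " + in;
}

void respond(const std::optional<std::string> &r) {
    if (r) std::cout << *r << "\n";    // 4) Respond, or stay silent;
}                                      //    storing to memory is not shown

int main() {
    while (true) {                     // 5) Repeat
        std::string in = listen();
        if (in == "quit") break;
        respond(think(in, feel(in)));
    }
}
```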


let me know what you think.

Monday, June 22, 2015

Text File I/O, Computational Emotions, & Data Structures of Consciousness

lots of preliminary work this week: I/O streams (and how to label, manage, & coördinate them), deciding which emotions to model with feedback loops in the computational consciousness, and beginning to come up with appropriate data structures for various aspects of thought, memory, and association (all affected to some degree by emotion).

i'm not trying to be entirely faithful to the details of human neurobiology, but the broad strokes of what we understand of that field will be most helpful in assembling a system that can achieve some level of understanding of both our world (through text) and herself.

i want my AI to be able to "name & claim" her emotional responses to what she reads; i don't want her to read (except at first) in a vacuum.  the more she reads, the more context she'll have for what she reads next.  (and everything she reads will be stored in some variation of text file; her memories and associative matrices will also be in some sort of n-dimensional text conglomeration.)

joy, sadness, surprise, love, anger, fear, desire, disgust, & contempt — that's a rough list of primary human emotions.  any suggestions for others to add?  then the question is learning how to express these in words (and to experience these from words), and deciding how much to drive thought by emotion — more as time goes on, as emotion can be more trusted.
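here's a minimal sketch of how that list might be represented, with a simple decay toward rest standing in for the feedback model (all the numbers are placeholders):

```cpp
// sketch: the primary emotions as a fixed set, each with an intensity
// that input can push up and that decays back toward rest over time.
#include <array>
#include <cstdio>

enum Emotion { Joy, Sadness, Surprise, Love, Anger, Fear, Desire,
               Disgust, Contempt, NumEmotions };

const char *emotionName(int e) {
    static const char *names[] = {"joy", "sadness", "surprise",
                                  "love", "anger", "fear", "desire",
                                  "disgust", "contempt"};
    return names[e];
}

struct EmotionalState {
    std::array<double, NumEmotions> intensity{};  // all start at rest (0)

    void stimulate(Emotion e, double amount) { intensity[e] += amount; }

    void decay(double rate = 0.9) {  // crude feedback toward rest
        for (double &v : intensity) v *= rate;
    }
};

int main() {
    EmotionalState s;
    s.stimulate(Surprise, 0.8);  // something unexpected was just read
    s.stimulate(Joy, 0.3);
    for (int step = 0; step < 3; ++step) {
        s.decay();
        std::printf("step %d: %s %.2f, %s %.2f\n", step,
                    emotionName(Surprise), s.intensity[Surprise],
                    emotionName(Joy), s.intensity[Joy]);
    }
}
```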

and then there's the question of how to integrate memory storage & recall into a roughly logical framework, mediated by emotions and associations and some degree of randomness (or more complex associations which might appear random).
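one way to sketch that mediation: score each stored memory for recall by mixing a base relevance with emotional weight, associative strength, and a pinch of noise (the weights here are pure guesses):

```cpp
// sketch: rank candidate memories for recall, letting emotion,
// association, and a little randomness perturb a base relevance score.
#include <algorithm>
#include <cstdio>
#include <random>
#include <string>
#include <utility>
#include <vector>

struct Memory {
    std::string text;
    double relevance;    // from the logical framework (0..1)
    double emotion;      // emotional weight attached at storage (0..1)
    double association;  // associative strength to the current topic (0..1)
};

double recallScore(const Memory &m, std::mt19937 &rng) {
    std::uniform_real_distribution<double> noise(0.0, 0.1);
    // the weights are placeholders; tuning them is the actual question
    return 0.5 * m.relevance + 0.3 * m.emotion +
           0.2 * m.association + noise(rng);
}

int main() {
    std::mt19937 rng(42);
    std::vector<Memory> mems = {
        {"a hot cup of tea", 0.4, 0.9, 0.7},
        {"a dressing gown",  0.6, 0.2, 0.5},
        {"the number 42",    0.3, 0.5, 0.9},
    };

    // score once, then sort by score, highest first
    std::vector<std::pair<double, std::string>> scored;
    for (const auto &m : mems)
        scored.push_back({recallScore(m, rng), m.text});
    std::sort(scored.rbegin(), scored.rend());
    for (const auto &p : scored)
        std::printf("%.2f  %s\n", p.first, p.second.c_str());
}
```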

a lot of speech is answering questions about the self and about one's own thoughts.  add the facility to develop increasingly sophisticated models of the (apparent) thoughts and motivations of others (who the AI will meet through text), and we're well on our way.

more than enough to think about this week.  (this is just the beginning!)

Monday, June 8, 2015

Text-To-Speech from UNIX (and then from C++ on UNIX)

Figured out how to do text-to-speech from the UNIX command line (via the built-in voice synthesizer) using the "say" command, plus:
• how to choose a voice (Vicki sounded best to me) from the command line
• how to use the "say" command from within a C++ program, using the system() function
• how to speak the contents of a text file from within a C++ program (which may be the interim solution, until i can figure out how to speak, for instance, the contents of a string variable — see the sketch below)
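here's a small sketch of all three pieces together; the "say" command and its -v (voice) and -f (file) flags are the real built-ins, and routing a string variable through a temporary text file is one way to sidestep shell-quoting (the temp-file name is arbitrary):

```cpp
// sketch: Mac OS text-to-speech from C++ via the built-in `say` command.
#include <cstdio>
#include <cstdlib>
#include <fstream>
#include <string>

// speak a text file directly: say -v Vicki -f file.txt
void speakFile(const std::string &path) {
    std::system(("say -v Vicki -f \"" + path + "\"").c_str());
}

// speak a string variable by routing it through a temporary text file,
// which avoids having to shell-escape the string itself
void speakString(const std::string &text) {
    const std::string tmp = "/tmp/ai_speech.txt";
    std::ofstream(tmp) << text;
    speakFile(tmp);
    std::remove(tmp.c_str());
}

int main() {
    speakString("Hello.  I am an A.I. with a borrowed voice.");
}
```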

After all, an AI has to have a voice!  (Or at least it would be nice, if it's not too difficult.)  Doing speech recognition is another thing, but not out of the realm of possibility.

And the new computer is *so* much faster!  Yay ! ! !

Monday, June 1, 2015

first steps in C++ for the text AI

thinking of creating a Word class, a Phrase class, a Sentence class, a Paragraph class, a Section class, and perhaps a Book class so that my AI can process (or ingest :-) text.  will i need a Character class?  we'll see...
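a first guess at those classes, just enough to show the containment; all the details are provisional:

```cpp
// sketch: text containers nested the way prose is nested;
// each level owns the level below, and a Word is the atom (for now).
#include <iostream>
#include <string>
#include <vector>

struct Word      { std::string text; };
struct Phrase    { std::vector<Word> words; };
struct Sentence  { std::vector<Phrase> phrases; };
struct Paragraph { std::vector<Sentence> sentences; };
struct Section   { std::vector<Paragraph> paragraphs; };
struct Book      { std::string title; std::vector<Section> sections; };

int main() {
    // ingesting two words the long way, bottom up
    Phrase ph{{Word{"Chapter"}, Word{"One"}}};
    Sentence s{{ph}};
    Paragraph p{{s}};
    Section sec{{p}};
    Book b{"The Long Dark Tea-Time of the Soul", {sec}};
    std::cout << b.title << ": " << b.sections.size() << " section(s)\n";
}
```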

don't just accept a text (except to put it in memory initially); question it again and again!  (a rough sketch of the loop appears after the list below.)

• store input in memory, translate into i-language (and store that too)
• recall possible relevant memories into workspace
• process all in workspace & integrate into (temporary) conclusion(s)
• if no new input, then go back over and take different analytical paths
• or add new input to workspace analysis & start again
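and a compressed sketch of that loop, with the i-language translation and the workspace analysis stubbed out; every name here is provisional:

```cpp
// sketch: store input (and its i-language translation), recall
// related memories into a workspace, and keep re-processing it.
#include <iostream>
#include <string>
#include <vector>

std::vector<std::string> memory;     // everything ever read, verbatim
std::vector<std::string> workspace;  // what's currently being thought about

// placeholder: the real translation into an internal language is the hard part
std::string toILanguage(const std::string &text) { return "[i] " + text; }

// placeholder relevance test: plain substring match against stored memory
void recallRelevant(const std::string &cue) {
    for (const auto &m : memory)
        if (m.find(cue) != std::string::npos) workspace.push_back(m);
}

// placeholder: a different "analytical path" just reports differently
std::string processWorkspace(int path) {
    return "tentative conclusion #" + std::to_string(path) + " from " +
           std::to_string(workspace.size()) + " workspace items";
}

int main() {
    std::string input;
    int path = 0;
    while (std::getline(std::cin, input) && input != "quit") {
        recallRelevant(input);                 // recall before storing input
        memory.push_back(input);               // store raw input...
        memory.push_back(toILanguage(input));  // ...and its translation
        workspace.push_back(input);
        std::cout << processWorkspace(path++) << "\n";
    }
}
```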