Month: February 2025

  • Speech Recognition Engine for Controlling RixBot

    Finally, I managed to find some spare time today to connect the audio system to a Raspberry Pi 4, which includes a simple speaker output and a MAX9814 electret mic input module. The purpose of this setup is to allow the voice-recognition software (VOSK) to be installed on the Raspberry Pi and respond to simple prompts. It took a huge amount of tweaking (and a lot of head scratching, cups of tea and chocolate) to finally resolve all the conflicts, but it seems to work, and the great thing about it is that it works offline.
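    VOSK hands back each recognised utterance as a small JSON string, so the glue between the recogniser and the robot can be very simple. Here is a minimal sketch of how spoken prompts might be mapped to commands; the phrases and command names are my own assumptions for illustration, not the actual RixBot command set:

    ```python
    import json

    # Hypothetical phrase-to-command table (an assumption for
    # illustration; the real RixBot command set may differ).
    COMMANDS = {
        "move forward": "FORWARD",
        "move back": "BACK",
        "turn left": "LEFT",
        "turn right": "RIGHT",
        "stop": "STOP",
    }

    def command_from_result(result_json):
        """Map a VOSK result (e.g. '{"text": "turn left"}') to a command.

        VOSK's KaldiRecognizer.Result() returns JSON with a "text"
        field containing the recognised utterance.
        """
        text = json.loads(result_json).get("text", "").strip().lower()
        return COMMANDS.get(text)

    print(command_from_result('{"text": "turn left"}'))  # LEFT
    ```

    On the Pi itself, a lookup like this would sit inside the audio loop, feeding each recognised result on to the motor-control code.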

    The screenshot below shows a simple test that illustrates the software responding to voice prompts after I spoke several simple commands into the microphone:

    I was impressed that the mic module picked up commands from 2 metres away without making any mistakes.

    The next stage of this work will be to attach the motor controller to the Raspberry Pi and install the whole module into the RixBot body. The body of the RixBot is based upon a RealRobots ‘Cybot’ body, which was originally sold as a kit through a weekly publication (see below). This is an excellent basis for a robot, as it already has geared motors, battery compartments and places for light and ultrasonic sensors built into it.

    Once the Raspberry Pi has been mounted inside the robot body, and the motor controller installed, then it is ready for testing at Rix with our co-researchers.

  • Is Artificial Intelligence (AI) Inclusive? Workshop at Rix (12/02/2025)

    Kate introducing the Generative AI workshop.

    This is the second Generative AI workshop that we have hosted, aimed at finding out more about the accessibility and usability of Generative AI software. The workshop developed ideas that were explored during our first workshop, presented as part of ‘Creative inclusive research explorations of intelligent technologies/Artificial Intelligence (AI) at the University of Ravensbourne MeCCSA Symposium’.

    At the Symposium, delegates worked in diverse inclusive teams and were given the chance to try out different AI software engines such as Stable Diffusion, in this instance hosted by NightCafé. To give participants the chance to experience the AI software, we set no specific goal and simply asked people to experiment and enjoy using their imagination to see what could be created. For this second workshop, our co-researchers were given the task of generating images for a specific theme – health and wellbeing.

    Researcher and co-researchers entering textual prompts to an AI engine.

    Our co-researchers were all very keen and excited to learn and work with AI and see what they could create for themselves. We worked in teams of people with and without a learning difference and disability (LDD). Along with gaining a better understanding of the inclusivity of AI software, a related focus was on a future where AI has a prominent role, and on ensuring that our diverse team is included in the development of that AI. Our workshop discussions started to showcase the potential advantages for people with LDD, which is particularly relevant in light of the conversations regarding the ethics of AI and access to the services that it provides.

    Co-researcher entering a text prompt.

    The participants at the workshop session were given the theme of Health and Wellbeing as we age, with the specific focus on healthy food, exercise and maintaining good mental health. They were asked to try and capture these three threads by generating images (including cartoon sequences) using the AI software and by providing the prompts, either through speech, text or by uploading a sketch they had drawn.

    One of the images representing health and wellbeing, generated by the NightCafé AI engine.

    Being able to talk to the AI seemed to work well for some who found it difficult to type, although some found that the dictation software (built into the MacBook) misunderstood what they wanted to say, possibly because the speech was not clear enough. Some of this may have been due to the unique way they formed sentences, which could perhaps be improved upon with an AI transcriber that learns the idiosyncrasies of the speaker and transcribes their intentions more accurately. We haven’t yet found such software, so there is potential for a project that addresses this issue.

    We discussed how AI could aid inclusion in many ways, including translating ideas, editing first thoughts, and creating imagery to help share ideas and emotions. As an augmentation tool, generative AI engines are relevant in many areas, particularly ideation, despite the AI misunderstanding prompts, creating odd versions of reality, or simply lacking originality. No doubt, as the software (and hardware) improves, so too will its ability to pre-empt what we really want to do and offer its own solutions.

    The almost conversational approach to the AI was highlighted by one co-researcher who, on seeing the images the software had produced of healthy food, commented that the portions were huge – as illustrated below.

    Gigantic portions generated by AI engine.
    Image produced using the prompt "Create images for cartoon strip of somebody having a healthy meal". Note large portion sizes.

    Another member of the team suggested that we need to provide clear prompts for the AI software – instructing it to create an image with smaller portion sizes. This was a common theme with the software: whichever method is used to generate the images – voice, text or image upload – you need to be quite clear and detailed about what it is that you require. Misunderstood words or keywords out of context could result in strange images being generated.

    Everyone seemed delighted with the image results produced from interacting with the AI, despite some of the unusual and often unintentional consequences of the AI software, such as missing or additional limbs, or odd shaped physical objects (e.g. the dumbbells below).

    Image generated using the prompt "Eating chicken curry with weights"

    Keeping prompts simple and clear seemed to create straightforward, if a little ordinary, images. Even an obscure prompt such as “Generates an image of not eating pizza, but eating, healthy and small portions with lots of vegetables” produced something realistic, though not particularly out of the ordinary.

    Image generated using the prompt "Generates an image of not eating pizza, but eating, healthy and small portions with lots of vegetables".

    Some of the participants attempted to generate cartoon strips to illustrate health and wellbeing, but this required quite a lot of effort, and unless very clear and detailed instructions were provided the results were rather simplistic, as illustrated below. As with many of the other images generated, these tended to depict white, physically fit young people. To be really inclusive and include images of people with disabilities, you would need to ask for this specifically.

    Image generated using the prompt "Create a cartoon strip with someone doing arm exercises". Note the missing legs and additional limbs.

    It was later suggested, in a discussion following the workshop, that visual prompts such as physical or digital flashcards could assist in giving participants the cues they were looking for when generating the images they wanted to produce. To a small extent, some iconic prompts (image thumbnails) were provided by the NightCafé interface for selecting the styles of image that could be generated.

  • Talking Ruler (AKA RixTalk)

    At a recent visit to the Google Accessibility Discovery Centre with colleagues from Rix Inclusive Research, amongst the many accessible items on display we saw a ruler that is available for people with sight impairments (they called it a ‘Braille Ruler’). The RNIB also sell an identical ruler to the one we saw at Google, but they call it a ‘tactile ruler’. Both have cutaways every 5mm to help you make measurements, and raised numbers and lines at intervals of 5mm. Here is the one from the Google Accessibility Centre:

    Here is the same ruler from RNIB (looks 3D printed to me):

    Not being content with a static ruler, I thought it might be interesting to enhance this a little and add some speech to the ruler. So I created an alternative version. The prototype below is the work in progress, which combines the tactile ruler with speech output using (for the prototype at least) an Arduino Uno, with an Adafruit MP3 player for the speech output:

    The RixTalk Ruler has similar notches every 5mm, but larger notches every 10mm to help distinguish between them. The ruler is also only 15cm long, as it was easier to print the prototypes this way (I had to produce many versions before I got the dimensions correct). Inside each ‘V’ slot is a contact wire, each of which connects to an analog port on an Arduino Uno. The contact wires are visible in the side view below:

    The pencil (shown above) has a 3D-printed cap that makes contact with the graphite core of the pencil and is connected to the 5V port of the Arduino Uno. When a slot in the ruler is touched with the pencil, the circuit to the Uno is completed and the measurement is spoken through the MP3 player and speaker. The prototype uses wires to the Arduino (just five of the thirty slots, for testing purposes), which is a bit awkward, but necessary to test the idea out. However, it could be made into a single unit: battery operated, with an integral speaker and a Bluetooth connection to the pencil. That will be the next development if we decide to continue with it.
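    The slot-reading logic is simple enough to sketch in Python (the threshold value and example readings are my own assumptions for illustration; the Uno's 10-bit ADC reads 0–1023, with the pencil pulling a touched slot towards 5V):

    ```python
    SLOT_PITCH_MM = 5        # one 'V' slot every 5 mm along the ruler
    CONTACT_THRESHOLD = 512  # mid-scale on the Uno's 10-bit ADC (assumption)

    def touched_slot(readings):
        """Return the index of the first slot whose analog reading shows
        pencil contact (pulled towards 5V), or None if nothing is touched."""
        for i, value in enumerate(readings):
            if value >= CONTACT_THRESHOLD:
                return i
        return None

    def measurement_mm(slot_index):
        """Measurement spoken for a slot: slot 0 sits at 5 mm, slot 1 at 10 mm, and so on."""
        return (slot_index + 1) * SLOT_PITCH_MM

    # Example: with slot 2 touched, the MP3 player would speak "15 millimetres".
    readings = [3, 10, 890, 5, 7]
    slot = touched_slot(readings)
    if slot is not None:
        print(measurement_mm(slot))  # 15
    ```

    On the prototype the equivalent loop runs on the Uno, polling the five wired analog ports and triggering the matching track on the Adafruit MP3 player.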

    Video of the prototype is below:

    The next stage of development is to embed the electronics inside the ruler itself, with an integrated battery and speaker, and to remove the need for wires by using a Bluetooth connection to the pencil. We have also looked at the prospect of adding a thin base to the ruler so that, in addition to having the slots at 5mm intervals, you can also rule straight lines. I’ll add an image of the new version as soon as it’s printed.