Work package 2: Dissemination and Exploitation
Over the entire project period the HaH project has seen increasing market interest, especially from service providers. Since service providers drive the interest of additional STB manufacturers and other manufacturers along the supply chain, this interest has been fostered through news blasts, updated website content and other marketing activities, such as outreach at face-to-face meetings in year 2 and year 3.
To make the project results more useful to the industry, the business cases were refined and redefined during a second business opportunities workshop held in Madrid at the TID premises on 2008-10-27 and 2008-10-28. As a result, the focus is now on the “HIC on RG” business case, with the initial “HIC on STB” business case as a fallback. In addition, the consortium defined the project's unique selling points (USPs).
The project results face two main competitors, one addressing the assistive domain and the other the home entertainment domain. Nonetheless, the combination of both domains is unique.
Between December and February 2009 the HaH website was further updated and refreshed with a new look and feel, new links, and additional information on publications and events. Besides the publication of the second newsletter during that time frame, the standardisation efforts were intensified. As a result, project members are working on a contribution to the OSGi specification, and the submission of an RFP is under way.
Regarding the business cases, the world economic crisis has also started to affect the project: the telecom operators began to delay their decisions on their next-generation RGs. The project members therefore decided at their next meeting to contact two potential RG suppliers instead of waiting for a suggestion from the telecom operators (see the description of WP9 for details).
Despite these industry delays, public interest in the project increased further and showed that the project has struck a chord. Two TV reports were recorded and broadcast (German NDR, German O1), and a four-minute radio interview was recorded by the German Deutschlandfunk (DLF) and broadcast in its regular section Computer and Communication. Additionally, the results of the project are being presented to the public in a special exhibition in Oldenburg, as Oldenburg is currently the “city of science 2009” in Germany. This exhibition started in April and lasts until fall. The project itself has been highly visible in the Oldenburg region, as the opening of the “city of science” exposition and also the “OFFIS day” were reported in the local newspaper. Most reactions of (potential) users have been very positive, and the question often arose whether the system can already be purchased. This again shows consumer interest in such a hearing support solution.
New dissemination and exploitation activities by partners, such as the outreach to STB manufacturers or presentations and articles, have been collected and posted on the website in order to spread the news. It is now confirmed that the HaH project members will demonstrate a showcase at the NEM exhibition and also give a presentation at the NEM event. The coming weeks will be used to reach out personally to the primary contacts and to let the other established industry contacts know about the opportunity to see the final results of the project in Saint Malo, France, in September. A follow-up with interested contacts is being planned.
In general, it is clear that the project results resonate with the public and will be used for future exploitation activities. However, due to the financial crisis and the longer lifecycles of RGs, a full deployment of the business results cannot be promised for the next two years.
Work package 5: SynFace Enhancements and Adaptations
Work in WP5 during the reporting period has focused on research, development, integration and evaluation of SynFace technology. Language support has been widened, not only by developing a German version of the system, but also by researching cost-effective methods of adding support for new languages. This has been achieved by using existing phonetic recognizers and mapping their output onto the phoneme sets of new languages, which appears to be a promising technique. It achieves nearly the performance of a system trained on a full database (thousands of speakers) with only 30 minutes of speech from the new language. In HaH, SynFace has access to broadband audio, for which we have developed new versions of the SynFace recognizer with improved performance. We have also carried out experiments with noisy speech (which can occur, for example, in television audio). We have conducted several experiments to improve non-verbal signalling in the SynFace talking head. We have developed an innovative and computationally efficient technique for real-time detection of prominence from acoustics, which can drive gestures in the face (nods, eyebrow movements, etc.), and have shown that adding these movements significantly increases intelligibility.
We have carried out a large number of evaluations during the period. Diagnostic tests were performed after development of the German SynFace version and showed a clear increase in sentence perception when SynFace was used, compared to an audio-only condition. The Swedish broadband version of SynFace showed a significant benefit for the SynFace condition when compared to audio only. Large-scale user tests with hard-of-hearing users across three test sites have also been carried out during the period. These experiments show some SynFace benefits, in particular for speech in babble noise and for particular users, but on average SynFace did not show a clear advantage for the hearing-impaired subjects. A second user test was carried out in which hearing-impaired users were asked to listen to an audio book with and without SynFace support. A sentence intelligibility test was carried out before and after the session, comparing SynFace to audio only and to natural video. Users benefited more from SynFace (over audio only) after the audio book session than before, but both tests showed a significant benefit.
Do non-verbal gestures help? New evaluation setup: using eye tracking to study the subjects' gaze behaviour.
Eye-tracking data represented as gaze plots and heat maps
Work package 6: Global SASP Strategies
The noise reduction algorithms and the automatic sound classifier, which were reported in the preceding newsletters, have been integrated into the final HIC platform. The multi-centre user evaluation of the noise reduction algorithms, whose first results were reported in the last newsletter, has been completed. The overall results largely confirm the first results: noise reduction had mostly no effect, or even a somewhat detrimental effect, on speech intelligibility and listening effort, yet subjects still showed clear preferences for some processed conditions, especially for a novel stereo noise reduction algorithm. In only one of the four conditions tested, which contained artificial, speech-simulating noise, was the processed version consistently disliked.
Implementation, optimisation and evaluation of an automatic online detection algorithm for commercial breaks have been completed. Evaluation results show a high correct detection rate of about 94% and a moderate false alarm rate of about 85%. Supplementing the main feature analyser, i.e. a logo detector, with additional feature indicators such as audio level (changes), the occurrence of black frames and changes of aspect ratio (4:3 / 16:9) did not improve the performance further.
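The report does not state how the two rates were computed. As a minimal sketch, assuming per-segment binary labels and the common definitions (detection rate = hits / actual breaks, false alarm rate = false alarms / actual non-breaks), the figures could be derived as follows. The function name and the example labels are hypothetical and not taken from the project.

```python
def detection_rates(truth, predicted):
    """Return (detection_rate, false_alarm_rate) for per-segment labels.

    truth, predicted: equal-length sequences of booleans, where True
    means "this segment belongs to a commercial break".
    Assumed definitions (not confirmed by the report):
      detection rate   = hits / (hits + misses)
      false alarm rate = false alarms / (false alarms + correct rejections)
    """
    pairs = list(zip(truth, predicted))
    hits = sum(1 for t, p in pairs if t and p)
    misses = sum(1 for t, p in pairs if t and not p)
    false_alarms = sum(1 for t, p in pairs if not t and p)
    rejections = sum(1 for t, p in pairs if not t and not p)
    if hits + misses == 0 or false_alarms + rejections == 0:
        raise ValueError("need at least one break and one non-break segment")
    return hits / (hits + misses), false_alarms / (false_alarms + rejections)


# Hypothetical toy labels: 4 break segments followed by 6 programme segments.
truth = [True] * 4 + [False] * 6
predicted = [True, True, True, False, True, False, False, False, False, False]
detection_rate, false_alarm_rate = detection_rates(truth, predicted)
```

Other conventions exist, for example counting false alarms relative to all detections rather than to all non-break segments, and they can give very different percentages for the same detector.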
Development and implementation of an adaptive filter that allows communication with other persons while watching TV with a headset have been completed successfully for mono TV audio. Moreover, a novel approach to overcome the notorious "stereo problem" of adaptive filters has been researched, but it did not achieve satisfactory results.
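The report does not describe the filter design. As an illustration only, the sketch below uses a normalised LMS (NLMS) filter, a standard adaptive filter that is not confirmed to be the project's method: the known mono TV signal serves as reference, and its adaptively filtered copy is subtracted from a microphone signal so that other sounds, such as a partner's speech, remain. All names, the tap count, the step size and the simulated room path are assumptions.

```python
import random


def nlms_cancel(reference, microphone, taps=8, mu=0.5, eps=1e-8):
    """Remove the reference (TV audio) from the microphone signal.

    Returns (residual, weights): the microphone signal minus the
    estimated TV component, and the final filter coefficients.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # recent reference samples, newest first
    residual = []
    for x, d in zip(reference, microphone):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        # Normalise the step by the reference power in the filter window.
        power = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / power
                   for w, h in zip(weights, history)]
        residual.append(error)
    return residual, weights


def simulate(n=4000, seed=0):
    """Toy signals: white-noise 'TV audio' through a hypothetical room path."""
    rng = random.Random(seed)
    tv = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    room = [0.6, 0.3, -0.1]  # assumed loudspeaker-to-microphone response
    mic = [sum(room[k] * tv[i - k] for k in range(len(room)) if i - k >= 0)
           for i in range(n)]
    return tv, mic


tv, mic = simulate()
residual, weights = nlms_cancel(tv, mic)
```

In this toy setup the microphone contains only TV sound, so the residual decays towards zero as the filter learns the room path. With two correlated stereo channels as references, the coefficients are no longer uniquely determined, which is the usual meaning of the "stereo problem".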
Work package 7: Individual SASP Strategies
The first part of the final user studies started in December 2009. In December, speech recognition tests were performed with 15 older test subjects (age between 60 and 73), and in January with 10 younger subjects (age between 15 and 67). Additionally, a paired comparison of subjective speech intelligibility vs. sound quality and a listening effort scaling were performed.
This iSASP evaluation study gave some clear results. The test persons rated the usability of the self-fitting system (questionnaire) favourably. In particular, the ease of handling the system and the intuitiveness of the procedure are good results for a non-assisted self-fitting tool.
With respect to sound quality (paired comparison), the chosen compression algorithms are clearly preferred over the unprocessed version, but no single algorithm is clearly preferred over the others.
In terms of speech perception, the algorithms show no significant effects but clear trends. Averaged over all subjects, the speech reception thresholds (SRTs) for the compressed OLSA stimuli are all better than the SRT for the identity-processed OLSA sentences, for both the younger and the older group. 66% of the subjects improved their SRTs in every condition and in both studies.
Furthermore, long-term user studies were performed. The study was intended to provide information about how the fitting develops over an observation period, which was set to three consecutive days. Three working hypotheses were to be examined.
The study made it possible to observe two different levels of customization over time: on the one hand the development within one 90-minute appointment (first and second fitting, training effects), and on the other hand the progress over the three days. After the first day, all subjects performed the fitting much faster and more confidently.
The subjects' preferences regarding sound quality differed from day to day. The consistency of the resulting parameter sets depends strongly on the individual subject. A trend towards a best fitting was not noticeable. The best SRTs were mostly reached with the first untrained fitting of a day, that is, with these first untrained parameter sets.
Habituation or training time does not influence the SRT positively: after the training, fewer subjects showed an improved SRT relative to the identity-processed audio than without the training. There is no indication that a compression parameter set becomes more useful or leads to better results the longer it is used.
In summary, the results confirm that a naive approach by the subjects does not lead to a bad fit; on the contrary. On the first day some of the subjects achieved an improvement in SRTs, but the results still show a certain spread and subject dependence. However, already with the third and fourth fittings on the second day nearly all subjects achieved better SRTs, and the spread of the results is small. The acceptance of the compressed audio is as positive as the performance of the fittings with respect to speech intelligibility. The full results are described in detail in D9.2.
Work package 9: Integration
The HIC platform for the final user tests has been defined. It is based on the initial platform (a powerful STB) in order to overcome the technical problems that would otherwise prevent a full demonstration of all components developed in the HaH project.
To prepare this decision, a survey was performed examining both STB and RG technology. Regarding current STB technology, we contacted an STB provider (DiscVision), who provided a sample STB for our evaluation. It showed clearly that the system lacks several features that are necessary for use in the HaH scenario: there is no real-time patched kernel available (this might cause “cracks” in the audio), there is currently no real graphics card installed (this means no SynFace), and, worst of all, there is no access to the A/V stream, as it is fed directly through hardware components and software running on the CPU cannot access it. Nevertheless, a new STB with an internal DSP is still being investigated; the DSP might then be used for the audio processing, which would make it possible to manipulate the A/V stream.
Furthermore, a requirement specification has been discussed with a possible RG manufacturer. The results from the company “inAccess” are quite promising, except that the version available will support neither partner communication nor SynFace. Partner communication will not be feasible due to the latency between the STB (if it has a microphone input) and the RG. SynFace will not be able to run on the RG, as there is no hardware acceleration available for OpenGL. Nevertheless, the project continues to work towards a showcase demonstrating the gSASP and iSASP on a sample RG.
To reduce the technical integration work, an integration meeting took place in Oldenburg in May. The GUI and the controller were integrated with the device manager and the alarm manager on the PCs prepared for the user studies. The SMS integration was not finished during this meeting, but it was completed afterwards.
The final user studies, which are performed at Viataal and OFFIS, were prepared. The third user tests at KTH are split into two sections, one on the helpfulness of non-verbal gestures and the other on long-term usage. The user tests for the complete final system were intended to give subjective answers about the usability of the system. The full results are described in detail in D9.2 and show that, overall, the system was ranked very high in subjective acceptance.
Additionally, the consortium followed the business case thread. The decision was taken to purchase one sample of InAccess’ residential gateway and to build a first mock-up for a feasibility test.