Automatic Subtitle Generation for Sound in Videos
Högskolan Väst, Institutionen för ekonomi och it.
2009 (English). Independent thesis, Advanced level (degree of Master (One Year)), 10 credits / 15 HE credits. Student thesis
Abstract [en]

The last ten years have seen the emergence of video content of every kind, and the appearance of websites dedicated to it has increased its importance to the public. At the same time, some people are deaf and cannot always understand such videos because no text transcription is available. Solutions are therefore needed to make these media artefacts accessible to as many people as possible. Several software tools offer utilities for creating subtitles for videos, but all of them require extensive participation from the user. Hence, a more automated approach is envisaged. This thesis report describes a way to generate standards-compliant subtitles using speech recognition. The process is divided into three parts. The first separates the audio from the video and, if necessary, converts the audio into a suitable format. The second recognises the speech contained in the audio. The final stage generates a subtitle file from the recognition results of the previous step. Directions for implementation are proposed for each of the three modules. The experimental results were not satisfactory, and adjustments are needed in further work. Parallelised decoding, the use of well-trained models, and punctuation insertion are among the improvements still to be made.
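The final stage described in the abstract, turning timed recognition results into a subtitle file, can be illustrated with a minimal sketch. The abstract does not name the subtitle standard or the data structures used, so the SubRip (.srt) layout, the `Utterance` class, and the function names below are assumptions made purely for illustration; the thesis itself works with Java and Sphinx-4, whereas this sketch uses Python for brevity.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """One recognised utterance (hypothetical structure); times in seconds."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as a SubRip timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(utterances: list[Utterance]) -> str:
    """Render utterances as numbered SubRip blocks separated by blank lines."""
    blocks = []
    for index, u in enumerate(utterances, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(u.start)} --> {srt_timestamp(u.end)}\n"
            f"{u.text}\n"
        )
    return "\n".join(blocks)
```

Calling `to_srt` on a list of utterances returns text that could be written to a `.srt` file. How the start and end times are obtained from the recogniser is outside the scope of this sketch.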

Place, publisher, year, edition, pages
2009. p. 30
Keywords [en]
Audio extraction, Java media framework, speech recognition, acoustic model, language model, subtitle generation, Sphinx-4
HSV category
Identifiers
URN: urn:nbn:se:hv:diva-1784
OAI: oai:DiVA.org:hv-1784
DiVA, id: diva2:241802
Presentation
(English)
Uppsök
Technology
Supervisor
Examiner
Available from: 2009-10-05. Created: 2009-10-05. Last updated: 2018-01-13. Bibliographically approved.

Open Access in DiVA

Full text (795 kB), 36434 downloads
File information
File: FULLTEXT01.pdf
File size: 795 kB
Checksum (SHA-512): 683eca5b3095654fcb71d5eaebcbfbcb6f957eaa1bc3775c95822225232ee51976035a2a5f44c48ed94b6327736091a678dca26c84788b6679af92fc5f0aa7e0
Type: fulltext
Mimetype: application/pdf

Total: 36435 downloads
The number of downloads is the sum of all downloads of all full texts. It may include, for example, earlier versions that are no longer available.

Total: 1856 hits