Python Forum
Youtube video transcripts
I have been testing a Python API that can fetch transcripts from YouTube videos: https://github.com/jdepoix/youtube-transcript-api

Here are two test scripts based on the documentation.

#!/usr/bin/python

from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'nTg6Rqlz6ts'
# retrieve the available transcripts
transcript_list = YouTubeTranscriptApi.get_transcript(video_id)

# iterate over all available transcripts
for transcript in transcript_list:

    # the Transcript object provides metadata properties
    print(
        transcript.video_id,
        transcript.language,
        transcript.language_code,
        # whether it has been manually created or generated by YouTube
        transcript.is_generated,
        # whether this transcript can be translated or not
        transcript.is_translatable,
        # a list of languages the transcript can be translated to
        transcript.translation_languages,
    )

    # fetch the actual transcript data
    print(transcript.fetch())

    # translating the transcript will return another transcript object
    print(transcript.translate('en').fetch())

# you can also directly filter for the language you are looking for, using the transcript list
transcript = transcript_list.find_transcript(['de', 'en'])  

# or just filter for manually created transcripts  
transcript = transcript_list.find_manually_created_transcript(['de', 'en'])  

# or automatically generated ones  
transcript = transcript_list.find_generated_transcript(['de', 'en'])
When I run it, I get this error message:

$ python3 test6.py
Traceback (most recent call last):
  File "test6.py", line 14, in <module>
    transcript.video_id,
AttributeError: 'dict' object has no attribute 'video_id'

The second script is very basic:

#!/usr/bin/python

from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'nTg6Rqlz6ts'
transcript_list = YouTubeTranscriptApi.get_transcript(video_id)

print(transcript_list[0])
print(transcript_list[1])
print(transcript_list[2])
and it returns:

$ python3 test7.py
{'duration': 3.04, 'text': '[Music]', 'start': 0.82}
{'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36}
{'duration': 7.249, 'text': 'seen as nothing more than the escape', 'start': 12.53}

1. Can you please advise what is causing the error in the first script?
2. How can the second script be modified to search the transcript for specific keywords?
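
On the first question, a likely cause, judging from the second script's output: get_transcript returns a plain list of dicts with 'text', 'start' and 'duration' keys, so each item in the loop is a dict and has no video_id attribute. The loop in the documentation appears to be written for the object returned by YouTubeTranscriptApi.list_transcripts(video_id); that call name is taken from the library's README at the time and is an assumption here, since it may differ in other versions. The sketch below reproduces the same error without any network access, using one entry copied from the output above.

```python
# One entry copied from the second script's output: a plain dict.
entry = {'duration': 3.04, 'text': '[Music]', 'start': 0.82}

# Values in a dict are read by key.
print(entry['text'], entry['start'], entry['duration'])

# Attribute access on a dict fails the same way as in the traceback.
try:
    entry.video_id
except AttributeError as exc:
    error_message = str(exc)
    print(error_message)
```

If that is the cause, the first script would need either key access on the dicts or a call that returns transcript objects rather than dicts.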

There is a search example at https://stackoverflow.com/questions/5428...ata-api-v3 that searches on the video title and works okay. I am only referencing it because it may help with adding a transcript search to the second script.
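
For the second question, one possible approach is a minimal sketch like the one below, assuming the transcript keeps the list-of-dicts shape shown in the output above. The function name search_transcript and the sample list are hypothetical, made up for illustration; in the real script the list returned by get_transcript would be passed in instead of sample.

```python
def search_transcript(entries, keywords):
    """Return the entries whose text contains any keyword (case-insensitive)."""
    wanted = [keyword.lower() for keyword in keywords]
    return [
        entry for entry in entries
        if any(keyword in entry['text'].lower() for keyword in wanted)
    ]


# Sample data copied from the second script's output.
sample = [
    {'duration': 3.04, 'text': '[Music]', 'start': 0.82},
    {'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36},
    {'duration': 7.249, 'text': 'seen as nothing more than the escape', 'start': 12.53},
]

hits = search_transcript(sample, ['salvation', 'escape'])
for entry in hits:
    # Show the start time in seconds next to the matching line.
    print(f"{entry['start']:8.2f}s  {entry['text']}")
```

This is a plain substring match on each entry, so a phrase that is split across two entries will not be found. Joining neighbouring entries before searching would be one way around that.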
Messages In This Thread
Youtube video transcripts - by jehoshua - Feb-15-2020, 05:33 AM
RE: Youtube video transcripts - by DeaD_EyE - Feb-15-2020, 10:03 AM
RE: Youtube video transcripts - by jehoshua - Feb-15-2020, 09:55 PM
