 Youtube video transcripts
#1
I have been testing a Python API that can fetch transcripts from YouTube videos: https://github.com/jdepoix/youtube-transcript-api

Some test scripts based on the documentation.

#!/usr/bin/python

from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'nTg6Rqlz6ts'
# retrieve the available transcripts
transcript_list = YouTubeTranscriptApi.get_transcript(video_id)

# iterate over all available transcripts
for transcript in transcript_list:

    # the Transcript object provides metadata properties
    print(
        transcript.video_id,
        transcript.language,
        transcript.language_code,
        # whether it has been manually created or generated by YouTube
        transcript.is_generated,
        # whether this transcript can be translated or not
        transcript.is_translatable,
        # a list of languages the transcript can be translated to
        transcript.translation_languages,
    )

    # fetch the actual transcript data
    print(transcript.fetch())

    # translating the transcript will return another transcript object
    print(transcript.translate('en').fetch())

# you can also directly filter for the language you are looking for, using the transcript list
transcript = transcript_list.find_transcript(['de', 'en'])  

# or just filter for manually created transcripts  
transcript = transcript_list.find_manually_created_transcript(['de', 'en'])  

# or automatically generated ones  
transcript = transcript_list.find_generated_transcript(['de', 'en'])
When I run it, I get this error message:

$ python3 test6.py
Traceback (most recent call last):
  File "test6.py", line 14, in <module>
    transcript.video_id,
AttributeError: 'dict' object has no attribute 'video_id'

The second script is very basic:

#!/usr/bin/python

from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'nTg6Rqlz6ts'
transcript_list = YouTubeTranscriptApi.get_transcript(video_id)

print(transcript_list[0])
print(transcript_list[1])
print(transcript_list[2])
and returns:

$ python3 test7.py
{'duration': 3.04, 'text': '[Music]', 'start': 0.82}
{'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36}
{'duration': 7.249, 'text': 'seen as nothing more than the escape', 'start': 12.53}

1. Can you please advise what is causing the error in the first script?
2. How can the second script be modified to search the transcript for specific keywords?

There is a search example at https://stackoverflow.com/questions/5428...ata-api-v3 . It searches on the video title and works okay. I mention it because it may help when adding a transcript search to the second script.
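For question 2, a minimal sketch of a keyword search, assuming the transcript keeps the shape shown in the test7.py output (a list of dicts with 'text', 'start' and 'duration' keys). The function name search_transcript is made up for this illustration and is not part of the library; the sample list stands in for the return value of YouTubeTranscriptApi.get_transcript(video_id).

```python
# Sample entries copied from the test7.py output above; in the real script
# this list would come from YouTubeTranscriptApi.get_transcript(video_id).
transcript_list = [
    {'duration': 3.04, 'text': '[Music]', 'start': 0.82},
    {'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36},
    {'duration': 7.249, 'text': 'seen as nothing more than the escape', 'start': 12.53},
]


def search_transcript(entries, keywords):
    """Return the entries whose text contains any keyword (case-insensitive).

    Hypothetical helper for illustration, not part of youtube_transcript_api.
    """
    lowered = [keyword.lower() for keyword in keywords]
    return [
        entry for entry in entries
        if any(keyword in entry['text'].lower() for keyword in lowered)
    ]


# Print the start time (in seconds) and the text of every matching entry.
for hit in search_transcript(transcript_list, ['salvation', 'escape']):
    print(f"{hit['start']:8.2f}s  {hit['text']}")
```

This is a plain substring match, so a keyword such as 'art' would also match 'start'. Matching whole words only would need a regular expression or splitting the text into words first.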
#2
The key video_id is not in transcript.

Replace line 14:
transcript.video_id, with video_id,

Then try again.
If another error occurs, you know that this attribute is not provided by the current transcript element either.

By the way, video_id is always the same for all transcripts and translations of one video.
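A short sketch of why the traceback appears, using one entry copied from the test7.py output. get_transcript() returns plain dicts, so values are read by key and none of the Transcript attributes exist on them. As far as I can tell, the documentation example the first script is based on builds its transcript_list with YouTubeTranscriptApi.list_transcripts(video_id) rather than get_transcript(video_id); treat that as an assumption and check it against the README of the installed version.

```python
# One entry copied from the test7.py output; get_transcript() returns a list of these.
entry = {'duration': 3.04, 'text': '[Music]', 'start': 0.82}

print(type(entry).__name__)   # the entry is a plain dict
print(sorted(entry))          # the only keys it provides

# Dict values are read by key ...
print(entry['text'], entry['start'], entry['duration'])

# ... while attribute access raises the error seen in the traceback.
try:
    entry.video_id
except AttributeError as error:
    message = str(error)
    print(message)
```

Printing type(...) on a return value like this is a quick way to see whether a call handed back dicts or richer objects.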
#3
(Feb-15-2020, 10:03 AM)DeaD_EyE Wrote: The key video_id is not in transcript.

Replace line 14:
transcript.video_id, with video_id,

Then try again.
If another error occurs, you know that this attribute is not provided by the current transcript element either.

Thanks for your reply. Yes, that fixed it, but I also had to remove all of the other attributes. Possibly I need to check that the package actually provides those attributes. I have been adding extra code to the other script so that it prints the data and also writes it to a file, which helps me understand what is coming back.

#!/usr/bin/python

from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'My3QHayBBfQ'
transcript_list = YouTubeTranscriptApi.get_transcript(video_id)

print(transcript_list[0])
print(transcript_list[1])
print(transcript_list[2])

with open('your_file.txt', 'w') as f:
    for item in transcript_list:
        f.write("%s\n" % item)
Is there a debugger for Python? I'm used to stepping through code and examining variables, etc.
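On the debugging question: Python ships with a debugger, pdb, in the standard library. Running python3 -m pdb test7.py starts a script under it, and on Python 3.7 or later a breakpoint() call pauses at that line, where n steps to the next line, s steps into a call, p name prints a variable and c continues. The sketch below shows where such a call could go; it is commented out so the example runs without stopping. The helper summarize and the sample data are made up for illustration.

```python
from pprint import pformat

# Sample entries copied from the test7.py output in the first post.
sample = [
    {'duration': 3.04, 'text': '[Music]', 'start': 0.82},
    {'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36},
]


def summarize(entries, limit=2):
    """Hypothetical helper: describe what a transcript-like list contains."""
    return {
        'type': type(entries).__name__,
        'length': len(entries),
        'entry_type': type(entries[0]).__name__ if entries else None,
        'first_entries': entries[:limit],
    }


# Uncomment the next line to pause here and inspect `sample` interactively.
# breakpoint()
print(pformat(summarize(sample)))
```

Most editors and IDEs with Python support offer the same stepping and variable inspection through a graphical debugger.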