Feb-15-2020, 05:33 AM
I have been testing a Python API that can fetch transcripts from YouTube videos: https://github.com/jdepoix/youtube-transcript-api
Some test scripts based on the documentation.
#!/usr/bin/python

from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'nTg6Rqlz6ts'

# retrieve the available transcripts
transcript_list = YouTubeTranscriptApi.get_transcript(video_id)

# iterate over all available transcripts
for transcript in transcript_list:
    # the Transcript object provides metadata properties
    print(
        transcript.video_id,
        transcript.language,
        transcript.language_code,
        # whether it has been manually created or generated by YouTube
        transcript.is_generated,
        # whether this transcript can be translated or not
        transcript.is_translatable,
        # a list of languages the transcript can be translated to
        transcript.translation_languages,
    )

    # fetch the actual transcript data
    print(transcript.fetch())

    # translating the transcript will return another transcript object
    print(transcript.translate('en').fetch())

# you can also directly filter for the language you are looking for, using the transcript list
transcript = transcript_list.find_transcript(['de', 'en'])

# or just filter for manually created transcripts
transcript = transcript_list.find_manually_created_transcript(['de', 'en'])

# or automatically generated ones
transcript = transcript_list.find_generated_transcript(['de', 'en'])

When I run it I get this error message:
$ python3 test6.py
Traceback (most recent call last):
  File "test6.py", line 14, in <module>
    transcript.video_id,
AttributeError: 'dict' object has no attribute 'video_id'
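A note on what the traceback seems to say: the loop variable is a plain dict, not an object with metadata attributes. The output of the second script suggests that get_transcript() returns a list of dicts with 'text', 'start' and 'duration' keys, so transcript.video_id has nothing to look up. Below is a minimal sketch that reproduces this without any network call, using entries copied from that output. The helper name describe_entry is made up for illustration. The comment about list_transcripts() is an assumption about the library version installed at the time and is not tested here.

```python
# Entries copied from the output of the second script. Assumption: this is
# the shape of what get_transcript() returned in the first script as well.
transcript_list = [
    {'duration': 3.04, 'text': '[Music]', 'start': 0.82},
    {'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36},
]


def describe_entry(entry):
    """Hypothetical helper: format one transcript entry using dict keys."""
    return f"{entry['start']:.2f}s (+{entry['duration']:.2f}s): {entry['text']}"


for entry in transcript_list:
    try:
        # Attribute access is what the first script does; a dict has no such attribute.
        entry.video_id
    except AttributeError as exc:
        print("attribute access fails:", exc)
    # Key access works on the same entry.
    print(describe_entry(entry))

# Assumption (untested here): the metadata loop in the documentation appears to
# be written for the object returned by YouTubeTranscriptApi.list_transcripts(video_id),
# not for the list returned by get_transcript(video_id). Check the README of the
# installed version before relying on this.
```

If that assumption holds, the first script mixes two different return types: it fetches with one call and then iterates as if it had used the other.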
The second script is very basic:

#!/usr/bin/python

from youtube_transcript_api import YouTubeTranscriptApi

video_id = 'nTg6Rqlz6ts'

transcript_list = YouTubeTranscriptApi.get_transcript(video_id)

print(transcript_list[0])
print(transcript_list[1])
print(transcript_list[2])

and returns:
$ python3 test7.py
{'duration': 3.04, 'text': '[Music]', 'start': 0.82}
{'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36}
{'duration': 7.249, 'text': 'seen as nothing more than the escape', 'start': 12.53}
1. Can you please advise what is causing the error in the first script?
2. How can the second script be modified to search through the transcript for specific keywords?
There is a search example at https://stackoverflow.com/questions/5428...ata-api-v3 . It searches on the video title and works okay. I am referencing it in case it helps with adding a transcript search to the second script.
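For the keyword question, here is one possible approach as a rough sketch, not a definitive solution. It assumes the transcript is a list of dicts with 'text' and 'start' keys, as in the output of the second script. The function name search_transcript and the sample keywords are made up for illustration. The sample list stands in for the result of get_transcript(video_id) so the snippet runs without a network call.

```python
def search_transcript(entries, keywords):
    """Return the entries whose text contains any of the keywords.

    Matching is a case-insensitive substring test on each entry's 'text'.
    """
    lowered = [keyword.lower() for keyword in keywords]
    return [
        entry for entry in entries
        if any(keyword in entry['text'].lower() for keyword in lowered)
    ]


# Stand-in for transcript_list = YouTubeTranscriptApi.get_transcript(video_id);
# the three entries are copied from the output shown above.
sample = [
    {'duration': 3.04, 'text': '[Music]', 'start': 0.82},
    {'duration': 8.17, 'text': 'salvation is undoing salvation can be', 'start': 7.36},
    {'duration': 7.249, 'text': 'seen as nothing more than the escape', 'start': 12.53},
]

# Print the start time (in seconds) and the text of every matching entry.
for hit in search_transcript(sample, ['salvation', 'escape']):
    print(f"{hit['start']:.2f}s  {hit['text']}")
```

Because each entry is searched on its own, a phrase that is split across two consecutive entries will not match. Joining neighbouring texts before searching would be one way around that, at the cost of less precise start times.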