Scraping from multiple URLs to print in a single line.

jb89 · (This post was last modified: Jan-28-2020, 11:49 PM by Larz60+.)

Hi there,

Please forgive me if I have trouble explaining myself, I'm quite new to Python.

Basically, I've been tasked with scraping some information from an Atlassian site: the plans in a project, plus the repositories, variables and stages of each plan. I've managed to do this, but each segment prints on its own line because I'm pulling data from 4 different URLs one at a time.

The output looks like this:
"Plans"
"Repos"
"Variables"
"Stages"

I've been asked to make it print like this:
"Plans","Repos","Variables","Stages"

My code is below, thank you.

import requests
from bs4 import BeautifulSoup
 
params = {
    'X-Atlassian-Token': 'no-check',
    'Accept':'application/json',
    'Content-Type':'application/x-www-form-urlencoded'
}
 
# List of Plans in a project.
r = requests.get(url='https://xxxx/project/viewProject.action?projectKey=xxxx', params=params, auth=('xxxx', 'xxxx'), verify=False)
soup = BeautifulSoup(r.text, 'html.parser')
for td in soup.findAll('td'):
    if td.get_attribute_list(key='class')[0] == 'build':
        print(td.text)
 
# List of Repos in a plan.
r = requests.get(url='https://xxxx/chain/admin/config/editChainRepository.action?buildKey=xxxx', params=params, auth=('xxxx', 'xxxx'), verify=False)
soup = BeautifulSoup(r.text, 'html.parser')
for h3 in soup.findAll('h3'):
    if h3.get_attribute_list(key='class')[0] == 'item-title':
        print(h3.text)
 
# List of Variables in a plan.
r = requests.get(url='https://xxxx/chain/admin/config/configureChainVariables.action?buildKey=xxxx', params=params, auth=('xxxx', 'xxxx'), verify=False)
soup = BeautifulSoup(r.text, 'html.parser')
for td in soup.findAll('td'):
    if td.get_attribute_list(key='class')[0] == 'variable-key':
        print(td.text)
 
# List of Stages in a plan.
r = requests.get(url='https://xxxx/chain/admin/config/defaultStages.action?buildKey=xxxx', params=params, auth=('xxxx', 'xxxx'), verify=False)
soup = BeautifulSoup(r.text, 'html.parser')
for span in soup.findAll('span'):
    if span.get_attribute_list(key='class')[0] == 'stage-name':
        print('\t'+span.text)

**Larz60+** · Jan-28-2020, 11:51 PM

One way to do this is to create an empty string before scraping:
combined_text = ''
Then append each new word to it as each site is scraped:
combined_text = f"{combined_text} {new_word}"
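As a minimal sketch of that idea, assuming `new_word` stands for the text of each matched element (`td.text`, `h3.text`, and so on), the string can be built up in a loop. The `scraped` list is a hypothetical stand-in for the values the four requests would yield, so the snippet runs without network access:

```python
# Hypothetical stand-in for the text scraped from each page;
# in the real script these would be td.text, h3.text, span.text, etc.
scraped = ["Plans", "Repos", "Variables", "Stages"]

combined_text = ''
for new_word in scraped:
    # Reuse the existing value on the right-hand side so that
    # earlier words are kept instead of being overwritten.
    combined_text = f"{combined_text} {new_word}"

# Remove the leading space left by the first iteration.
combined_text = combined_text.strip()
print(combined_text)
```

The key point is that `combined_text` appears on both sides of the assignment; assigning only the newest value would discard everything collected so far.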

jb89 · Jan-29-2020, 12:22 AM

Thank you Larz60+

Do you mean something like the code below? I'm unsure what to use for "new_word".

Thank you

combined_text = ''
 
# List of Plans in a project.
r = requests.get(url='https://xxxx/project/viewProject.action?projectKey=xxxx', params=params, auth=('xxxx', 'xxxx'), verify=False)
soup = BeautifulSoup(r.text, 'html.parser')
for td in soup.findAll('td'):
    if td.get_attribute_list(key='class')[0] == 'build':
        print(td.text)
        combined_text = f"{td.text}"
 
# List of Repos in a plan.
r = requests.get(url='https://xxxx/chain/admin/config/editChainRepository.action?buildKey=xxxx', params=params, auth=('xxxx', 'xxxx'), verify=False)
soup = BeautifulSoup(r.text, 'html.parser')
for h3 in soup.findAll('h3'):
    if h3.get_attribute_list(key='class')[0] == 'item-title':
        print(h3.text)
        combined_text = f"{td.text} {h3.text}"

jb89 · Jan-29-2020, 02:05 AM

I put each request into its own function (plans, repos, variables and stages) and tried the following; however, it still prints the same output.

combine_text = f'{plans()}{repos()}{variables()}{stages()}'
print(combine_text)
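The function bodies are not shown in the thread, so this is only a guess at the cause: if each function still calls `print()` internally and returns nothing, the output appears line by line as before and the f-string only receives `None`. A small sketch of the difference, using hypothetical stand-in functions that return fixed strings instead of scraping:

```python
def plans_printing():
    # Prints immediately on its own line; the function itself returns None.
    print("Plans")

def plans():
    # Returns the text so the caller decides how and when to print it.
    return "Plans"

def repos():
    return "Repos"

def variables():
    return "Variables"

def stages():
    return "Stages"

# With the printing version, the f-string only ever sees None.
printed_version = f'{plans_printing()}'

# With the returning versions, the pieces end up on a single line.
combine_text = f'"{plans()}","{repos()}","{variables()}","{stages()}"'
print(combine_text)
```

In the real functions, the `return` value would be the scraped text (or a joined list of it) rather than a fixed string.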

**perfringo** · Jan-29-2020, 06:12 AM

Another option is to append the values to a list and print that list:

>>> lst = ["Plans","Repos","Variables","Stages"]
>>> print(*lst, sep=', ')
Plans, Repos, Variables, Stages
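If the quoted, comma-separated form from the original question is needed, the standard library `csv` module can write the list as one row. This is a sketch under assumptions: the four small lists are stand-ins for the text collected from each page, and `collect` is a hypothetical helper, not something from the thread.

```python
import csv
import io

def collect(*groups):
    """Flatten several lists of scraped strings into one row (hypothetical helper)."""
    row = []
    for group in groups:
        # strip() drops stray whitespace such as the leading tab on stage names.
        row.extend(text.strip() for text in group)
    return row

# Stand-in values; the real lists would be filled inside the scraping loops.
plans = ["Plans"]
repos = ["Repos"]
variables = ["Variables"]
stages = ["\tStages"]

row = collect(plans, repos, variables, stages)

# QUOTE_ALL wraps every field in double quotes.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n").writerow(row)
line = buffer.getvalue()
print(line, end="")
```

Writing to `sys.stdout` instead of a `StringIO` buffer would print the row directly; the buffer is used here only so the result can be inspected as a string.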
