Write data to two columns with Python CSV writer

Posted on December 8, 2020

I am scraping data from a website and want to write it to two different columns, but all the data ends up in a single column.

This is the code:

from bs4 import BeautifulSoup
from requests_html import HTMLSession
import csv

s = HTMLSession()
url = f'https://everymac.com/systems/apple/iphone/index-iphone-specs.html'
list_data = []
r = s.get(url)
r.html.render(sleep=1)
soup = BeautifulSoup(r.html.html, 'html.parser')
file = open('OutPut.csv', 'w')
writer = csv.writer(file)
writer.writerow(['Product Name', 'Specification'])

products = soup.select('#contentcenter_specs_externalnav_2 a')
specs = soup.select('#contentcenter_specs_internalnav_2 td')
for item in products:
    a = item.text
    print(a)    # want to write this in column 'Product Name'
    for i in specs:
        b = i.text
        print(b)   # want to write this in column 'Specification'

        writer.writerow([a, b])
file.close()

How can I do that? It would be great if you could help me with this.




Heya,

You’re currently using a nested loop, so for each product you write one row per specification. With N products and M specifications that produces N × M rows, and each product name is repeated M times. That’s probably not what you want.
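To see the effect on a small scale, here is a minimal sketch that uses made-up lists in place of the scraped elements. The names `nested_rows`, `products`, and `specs` and their values are placeholders for illustration, not data from the site.

```python
# Placeholder data standing in for the scraped product names and specs.
products = ["Phone A", "Phone B", "Phone C"]
specs = ["Spec 1", "Spec 2", "Spec 3"]


def nested_rows(products, specs):
    """Mimic the nested loop: build one row per (product, spec) pair."""
    rows = []
    for product in products:
        for spec in specs:
            rows.append([product, spec])
    return rows


rows = nested_rows(products, specs)
print(len(rows))   # 3 products x 3 specs = 9 rows
print(rows[:3])    # "Phone A" is paired with every spec in turn
```

Every product name appears once per specification, which is the duplication you are seeing in the CSV.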

It seems that you’re trying to associate each product with a specific specification. If there’s a 1:1 correspondence between products and specifications, you should only need one loop. However, to do that, you need to make sure that the order of products in the products list matches the order of specifications in the specs list. If this isn’t the case, you may need to adjust your selectors or the way you’re scraping the data.

Assuming that each product is associated with a single specification, you could do something like this:

from bs4 import BeautifulSoup
from requests_html import HTMLSession
import csv

s = HTMLSession()
url = 'https://everymac.com/systems/apple/iphone/index-iphone-specs.html'
r = s.get(url)
r.html.render(sleep=1)
soup = BeautifulSoup(r.html.html, 'html.parser')

products = soup.select('#contentcenter_specs_externalnav_2 a')
specs = soup.select('#contentcenter_specs_internalnav_2 td')

with open('OutPut.csv', 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Product Name', 'Specification'])
    
    # Walk both lists in step: one product and one spec per row.
    for product, spec in zip(products, specs):
        a = product.text
        print(a)    # goes in the 'Product Name' column
        b = spec.text
        print(b)    # goes in the 'Specification' column
        writer.writerow([a, b])

In the above code, I use the zip() function to iterate over both products and specs at the same time. This assumes that each item in products corresponds to the item at the same index in specs. Note that zip() stops at the end of the shorter list, so any unmatched items are silently dropped. If the two lists don’t line up, you’ll need to adjust your selectors or scraping logic accordingly.
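If you aren’t sure the two lists have the same length, one option is to pad the shorter one so that nothing gets dropped and the gaps are visible in the output. The sketch below uses `itertools.zip_longest` from the standard library. The helper name `pair_rows` and the sample values are assumptions for illustration only.

```python
from itertools import zip_longest


def pair_rows(names, values, fill=""):
    """Pair names with values by position, padding the shorter list with `fill`."""
    return [list(pair) for pair in zip_longest(names, values, fillvalue=fill)]


# Placeholder data: three names but only two values.
names = ["Phone A", "Phone B", "Phone C"]
values = ["Spec 1", "Spec 2"]

print(len(list(zip(names, values))))  # plain zip() keeps only 2 pairs
print(pair_rows(names, values))       # zip_longest keeps all 3, last value is ""
```

An empty cell in the CSV is then a hint that the selectors returned a different number of products and specifications.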

Also, note that I’ve added newline='' to open(). The csv module writes its own line endings (\r\n by default), and without newline='' the file object can translate them a second time on some platforms, notably Windows. That shows up as an extra blank line between rows when you open the CSV file in certain programs.
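You can check the line endings the csv module produces by writing to an in-memory buffer instead of a file. This is only a small sketch: the helper name `csv_text` and the sample rows are made up for the example.

```python
import csv
import io


def csv_text(rows):
    """Write rows with csv.writer into an in-memory buffer and return the raw text."""
    buffer = io.StringIO(newline="")  # newline="" means no newline translation
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


text = csv_text([["Product Name", "Specification"], ["Phone A", "Spec 1"]])
print(repr(text))  # each row already ends with \r\n
```

Because each row already ends in \r\n, any further newline translation by the file object would add a second carriage return, which is what newline='' prevents.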
