I built a small web scraping application in Ruby that scrapes data from a website and stores it in a CSV file. The scraping and storing both work, but I can't get the CSV file into a table layout with two columns and multiple rows. The file should have a name column and a price column, with one row per product holding that product's name and price. This is my code:
require 'open-uri'
require 'nokogiri'
require 'httparty'
require 'byebug'
require 'csv'
def whey_scrapper
  company = 'Body+%26+fit'
  url = "https://www.bodyenfitshop.nl/eiwittenwhey/whey-proteine/?limit=81&manufacturer=#{company}"
  unparsed_page = open(url).read
  parsed_page = Nokogiri::HTML(unparsed_page)
  product_names = parsed_page.css('div.product-primary')
  name = Array.new
  product_names.each do |product_name|
    name << product_name.css('h2.product-name').text
  end
  product_prices = parsed_page.css('div.price-box')
  price = Array.new
  product_prices.each do |product_price|
    price << product_price.css('span.price').text
  end
  headers = ["name", "price"]
  item = [name, price]
  CSV.open('data/wheyprotein.csv', 'w', :col_sep => "\t|", :headers => true) do |csv|
    csv << headers
    item.each {|row| csv << row }
  end
  byebug
end
whey_scrapper
I append a row to the CSV on each iteration over item, but the file is still unstructured and messy.
This is what my CSV file looks like:
name |price
-----------------
"
Whey Perfection Body & fit
" |"
Whey Perfection® bestseller box Body & fit
" |"
Whey Perfection - Special Series Body & fit
" |"
Isolaat Perfection Body & fit
" |"
Perfect Protein Body & fit
" |"
Whey Isolaat XP Body & fit
" |"
Micellar Casein Perfection Body & fit
" |"
Low Calorie Meal Body & fit
" |"
Whey Breakfast Body & fit
" |"
Whey Perfection - Flavour Box Body & fit
" |"
Protein Breakfast Body & fit
" |"
Whey Perfection Summer Box Body & fit
" |"
Puur Whey Body & fit
" |"
Whey Isolaat Crispy Body & fit
" |"
Vegan Protein voordeel Body & fit vegan
" |"
Whey Perfection Winter Box Body & fit
" |"
Sports Breakfast Body & fit
"
€ 7,90 |€ 9,90 |€ 11,90 |€ 17,90 |€ 31,90 |€ 18,90 |€ 12,90 |€ 6,90 |€ 6,90 |€ 10,90 |€ 15,90 |€ 9,90 |€ 26,90 |€ 6,90 |€ 24,90 |€ 9,90 |€ 20,90
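A likely cause, judging from the output above: item = [name, price] holds only two elements (the whole array of names and the whole array of prices), so the loop writes two very long rows instead of one row per product. The quoted, multi-line names appear to come from the newlines that .text keeps around each name. A minimal sketch of one possible fix pairs the two arrays with Array#zip and trims each value with String#strip. The helper names build_rows and rows_to_csv and the sample data are hypothetical and not part of the original script; the "\t|" separator is simply carried over from it.

```ruby
require 'csv'

# Pair each name with the price at the same index, trimming the
# surrounding whitespace and newlines that Nokogiri's .text leaves in place.
def build_rows(names, prices)
  names.map(&:strip).zip(prices.map(&:strip))
end

# Render a header line followed by one line per [name, price] pair.
# The "\t|" separator is taken from the question; "," or "\t" is more usual.
def rows_to_csv(rows, col_sep: "\t|")
  CSV.generate(col_sep: col_sep) do |csv|
    csv << ["name", "price"]
    rows.each { |row| csv << row }
  end
end

# Hypothetical sample data standing in for the scraped name and price arrays.
names  = ["\n  Whey Perfection Body & fit\n", "\n  Puur Whey Body & fit\n"]
prices = ["€ 7,90", "€ 26,90"]

puts rows_to_csv(build_rows(names, prices))
```

In the original script the same idea would amount to replacing item = [name, price] with name.zip(price) after stripping each value. Note that zip assumes both arrays have the same length and order; if a product can lack a price, extracting the name and the price from the same product node in a single loop may be more robust, though that depends on page markup not shown here.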