Here is what I have so far, but I can't get the plain text out of the div.
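from BeautifulSoup import BeautifulSoup
import sys
import urllib2

# Fetch the page and collect every title div.
html = urllib2.urlopen("https://example/com/").read()
get = BeautifulSoup(html).findAll('div', {'class': 'h4 entry-title'})
for i in get:
    print i  # prints the whole tag, not just its text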
How can I scrape data from this HTML, please? I only need the color names and the paragraphs.
<div class="h4 entry-title">
<a href="https://example/com/01/">RED</a>
</div>
<p>
I am paragraph red
</p>
<div class="h4 entry-title">
<a href="https://example.com/02/">WHITE</a>
</div>
<p>
I am paragraph white
</p>
<div class="h4 entry-title">
<a href="https://example.com/03/">PINK</a>
</div>
<p>
I am paragraph pink
</p>
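For markup shaped like the sample above, one possible approach is to pair each title div with the paragraph that follows it. The sketch below is only an illustration under a few assumptions: it targets Python 3 rather than the Python 2 code above, it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra packages, and the names SAMPLE_HTML, EntryParser and extract_entries are made up for this example. With BeautifulSoup 4, the same div-then-paragraph walk could be written with get_text() and find_next_sibling('p').

```python
from html.parser import HTMLParser

# Sample markup mirroring the structure shown in the question.
SAMPLE_HTML = """
<div class="h4 entry-title">
<a href="https://example.com/01/">RED</a>
</div>
<p>
I am paragraph red
</p>
<div class="h4 entry-title">
<a href="https://example.com/02/">WHITE</a>
</div>
<p>
I am paragraph white
</p>
<div class="h4 entry-title">
<a href="https://example.com/03/">PINK</a>
</div>
<p>
I am paragraph pink
</p>
"""


class EntryParser(HTMLParser):
    """Collect (name, description) pairs: a title div followed by a <p>."""

    def __init__(self):
        super().__init__()
        self.entries = []    # finished (name, description) tuples
        self._target = None  # "name", "desc", or None when not capturing
        self._name = None    # title still waiting for its paragraph
        self._buffer = []    # text chunks of the element being read

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "div" and "entry-title" in classes:
            self._target, self._buffer = "name", []
        elif tag == "p" and self._name is not None and self._target is None:
            self._target, self._buffer = "desc", []

    def handle_data(self, data):
        if self._target is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        # Collapse newlines and repeated spaces into single spaces.
        text = " ".join("".join(self._buffer).split())
        if tag == "div" and self._target == "name":
            self._name, self._target = text, None
        elif tag == "p" and self._target == "desc":
            self.entries.append((self._name, text))
            self._name, self._target = None, None


def extract_entries(html):
    """Return a list of (name, description) tuples found in html."""
    parser = EntryParser()
    parser.feed(html)
    parser.close()
    return parser.entries


if __name__ == "__main__":
    for name, description in extract_entries(SAMPLE_HTML):
        print(name, description)
```

To run it against a live page, the HTML string would come from the downloaded response body instead of SAMPLE_HTML.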
My Questions:
- How can I scrape data from this HTML? I need only the link text and the paragraph.
Output I need in console:
RED I am paragraph red WHITE I am paragraph white PINK I am paragraph pink
- How can I import this set of data into a SQL file automatically?
The database table (name, description) I want as output:
name: RED,WHITE,PINK description: I am paragraph RED, I am paragraph WHITE, I am paragraph PINK
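For the second question, one option is to load the pairs into SQLite and then dump the database as SQL text. This is a rough sketch rather than a definitive answer: it assumes Python 3 and the standard sqlite3 module, the table name colors is my own choice, and ROWS is a hard-coded stand-in for whatever the scraping step produces.

```python
import sqlite3

# Hypothetical rows in the (name, description) shape the question asks for.
ROWS = [
    ("RED", "I am paragraph red"),
    ("WHITE", "I am paragraph white"),
    ("PINK", "I am paragraph pink"),
]


def build_database(rows):
    """Create an in-memory SQLite table and insert the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE colors (name TEXT, description TEXT)")
    # Placeholders let sqlite3 handle quoting of the scraped text.
    conn.executemany(
        "INSERT INTO colors (name, description) VALUES (?, ?)", rows
    )
    conn.commit()
    return conn


def dump_sql(conn):
    """Return the database contents as SQL statements, one per line."""
    return "\n".join(conn.iterdump())


if __name__ == "__main__":
    connection = build_database(ROWS)
    print(dump_sql(connection))
    connection.close()
```

Writing the string returned by dump_sql to a file would give a .sql script. Passing a file path instead of ":memory:" to sqlite3.connect would keep the data in a database file instead.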