import requests
content = requests.get(urlpage)
requests.get returns a Response object; content.text holds the whole HTML of a static page.
import requests
from bs4 import BeautifulSoup
content = requests.get(urlpage)
soup = BeautifulSoup(content.content,'html.parser')
table = soup.find('table',{'class':'wikitable sortable'})
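BeautifulSoup parses the fetched HTML so elements such as tables can be looked up by tag and class. This approach is used to scrape static pages.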
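A minimal sketch of turning the table found above into rows of text, assuming the page's HTML is already available as a string (for example from requests.get(urlpage).text). The helper name table_rows and the sample HTML are made up for illustration and are not part of the original notes.

```python
from bs4 import BeautifulSoup


def table_rows(html, table_class="wikitable sortable"):
    """Return the rows of the first matching table as lists of cell strings.

    table_rows is a hypothetical helper; table_class defaults to the class
    string used in the notes above.
    """
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find("table", {"class": table_class})
    if table is None:
        # No table with that class string on the page.
        return []
    rows = []
    for tr in table.find_all("tr"):
        # Collect header (th) and data (td) cells in document order.
        cells = [cell.get_text(strip=True) for cell in tr.find_all(["th", "td"])]
        if cells:
            rows.append(cells)
    return rows


# Invented sample input so the sketch runs without network access.
sample_html = """
<table class="wikitable sortable">
  <tr><th>Country</th><th>Capital</th></tr>
  <tr><td>France</td><td>Paris</td></tr>
  <tr><td>Japan</td><td>Tokyo</td></tr>
</table>
"""
rows = table_rows(sample_html)
print(rows)
```

Returning an empty list when the table is missing keeps the caller from hitting an AttributeError on None, which is what a bare soup.find(...) followed by .find_all(...) would raise.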
Selenium can be used to automate web browser interaction from Python. It requires a browser and the matching driver to be installed.
sudo apt-get install firefox
sudo apt-get install firefox-geckodriver
pip install selenium
Terminal:
python3 bing_scraper.py --search 'living room' --limit 100 --download --chromedriver /home/yui/Downloads/chromedriver_linux64/chromedriver
sudo apt install ruby-full
gem install wp2txt
wp2txt -i file_name
mkdir -p ~/.kaggle && mv ~/Downloads/kaggle.json ~/.kaggle/
kaggle datasets download -d sorour/95cloud-cloud-segmentation-on-satellite-images