We can define web scraping as using a program to extract data from the web by pulling page content directly, without an API (Application Programming Interface).
Many sites have APIs you can connect to and use to pull data from, such as the Twitter API. This is great! But sometimes you need data from a site that doesn’t have an API.
Really, anywhere you think it would be appropriate to gather data. For example:
Some people have built web scrapers to look for jobs and find apartments.
Companies may search for email or contact information.
Competitive analysis on a rival company: what prices do they have?
Realtors may scrape housing listings.
Understanding sentiment and word usage in reviews.
Anytime you want data!
Read the site’s terms of service and robots.txt file.
https://towardsdatascience.com/ethics-in-web-scraping-b96b18136f01
1.) You should check a site’s terms and conditions before you scrape it.
2.) Space out your requests so you don’t overload the site’s server; overloading it could get you blocked.
3.) Scrapers break over time: web pages change their layout frequently, so you’ll more than likely have to rewrite your code.
4.) Web pages are usually inconsistent; more than likely you’ll have to clean up the data after scraping it.
5.) Every web page and situation is different, so you’ll have to spend time configuring your scraper.
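Rule 2 above (spacing out requests) can be sketched with a simple delay between fetches. This is a minimal illustration: the helper name fetch_politely and the one-second default are our own invention, and the stand-in fetch function below would be swapped for a real requests.get call in practice.

```python
import time

def fetch_politely(urls, fetch, delay_seconds=1.0):
    """Call fetch(url) for each URL, pausing between calls
    so we don't overload the site's server."""
    results = []
    for url in urls:
        results.append(fetch(url))
        time.sleep(delay_seconds)  # space out requests (rule 2 above)
    return results

# Usage with a stand-in fetch function (use requests.get for real pages)
pages = fetch_politely(["page1", "page2"],
                       fetch=lambda u: f"html of {u}",
                       delay_seconds=0.01)
```

A fixed delay is the simplest approach; a site’s robots.txt may also specify a Crawl-delay you should respect.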
Go to a web page, right click, and select Inspect Element; you should now see a pop-up or frame showing the HTML of the page. Every “web-scraping job” is going to be unique, because almost every website is unique.
HTML stands for Hypertext Markup Language, and every website on the internet uses it to display information. Even the Jupyter notebook system uses it to display this information in your browser. If you right click on a website and select “View Page Source” you can see the raw HTML of the page. This is what Python will be parsing to extract information. Let’s take a look at a simple webpage’s HTML:
<!DOCTYPE html>
<html>
<head>
<title>Title on Browser Tab</title>
</head>
<body>
<h1> Website Header </h1>
<p> Some Paragraph </p>
</body>
</html>
Let’s break down these components.
1. <!DOCTYPE html> Every HTML document starts with this type declaration, letting the browser know it's an HTML file.
2. The component blocks of the HTML document are placed between <html> and </html>.
3. Meta data and script connections (like a link to a CSS file or a JS file) are often placed in the <head> block.
4. The <title> tag block defines the title of the webpage (it's what shows up in the tab of a website you're visiting).
5. Between the <body> and </body> tags are the blocks that will be visible to the site visitor.
6. Headings are defined by the <h1> through <h6> tags, where the number represents the size of the heading.
7. Paragraphs are defined by the <p> tag; this is essentially just normal text on the website.
8. <header>, <main>, <footer> denote which part of the page elements belong to.
9. <a href=""></a> creates a hyperlink, activating a link in the page.
10. <ul>, <ol> create unordered and ordered lists.
11. <li> contains the items in a list.
12. <br> inserts a single line break.
13. <table> for tables, <tr> for table rows, and <td> for table cells.
Self-closing Tags: most HTML tags require an opening and a closing tag. There are a few, however, that do not:
1. <img src=""> creates an image in the page
2. <br> creates a break in the content
3. <input type=""> creates an input field
4. <hr> creates a horizontal line in the page
IDs, Classes
IDs and classes are very similar. These are used to target specific elements.
1. <h1 id="profile-header"></h1>
2. <h1 class="subject-header"></h1>
IDs should only be used once on a page. IDs can also be used to bring the user to a specific part of the page: your-site/#profile-picture will load the page near the profile picture.
Classes can be used multiple times on a page.
CSS stands for Cascading Style Sheets; this is what gives “style” to a website, including colors and fonts, and even some animations! CSS uses attributes such as id or class to connect an HTML element to a CSS rule, such as a particular color. An id is a unique identifier for an HTML tag and must be unique within the HTML document: basically a single-use connection. A class defines a general style that can then be linked to multiple HTML tags. Basically, if you only want a single HTML tag to be red, you would use an id; if you wanted several HTML tags/blocks to be red, you would create a class in your CSS document and then link it to each of those blocks.
Here are three of the most popular approaches to web scraping:
1- Sending an HTTP request (ordinarily via Requests) to a webpage and then parsing the returned HTML (ordinarily using BeautifulSoup) to access the desired information.
Typical Use Case: A standard web scraping problem; refer to the case study.
2- Using tools ordinarily built for automated software testing, primarily Selenium, to access a website's content programmatically.
Typical Use Case: Websites that use JavaScript or are otherwise not directly accessible through static HTML.
3- Scrapy, which can be thought of as more of a general web scraping framework; it can be used to build spiders and scrape data from various websites while minimizing repetition.
Typical Use Case: Scraping Amazon reviews.
While you could scrape data using other programming languages as well, Python is commonly used due to its easy syntax and the large variety of scraping libraries available for it.
Note: Since the standard combination of Requests + BeautifulSoup is generally the most flexible and easiest to pick up, we will use it here. The tools above are not mutually exclusive; you might, for example, get some HTML text with Scrapy or Selenium and then parse it with BeautifulSoup.
To install the libraries needed for the examples below, go to your command line and use conda install (if you are using the Anaconda distribution) or pip install (for other Python distributions).
1.) requests: used to visit a URL and fetch a webpage. Install by typing pip install requests or conda install requests (for the Anaconda distribution of Python) in your command prompt.
2.) BeautifulSoup: used to parse HTML and extract the information we need from a web page. Install by typing pip install beautifulsoup4 or conda install beautifulsoup4 (for the Anaconda distribution of Python) in your command prompt.
3.) lxml: an XML toolkit that is a Pythonic binding for the C libraries libxml2 and libxslt. It combines the speed and XML feature completeness of these libraries with the simplicity of a native Python API, largely compatible with the well-known ElementTree API. Install by typing pip install lxml or conda install lxml (for the Anaconda distribution of Python) in your command prompt.
# pip install requests beautifulsoup4
# conda install requests beautifulsoup4
import requests
import bs4
To grab the title of a page, you can use the HTML block with the title tag.
For this task we will use www.example.com, a website specifically made to serve as an example domain.
Requests will allow us to load a webpage into Python so that we can parse and manipulate it.
# Use the requests library to grab the page
# Note, this may fail if you have a firewall blocking Python/Jupyter
# Note sometimes you need to run this twice if it fails the first time
res = requests.get("http://www.example.com")
This object is a requests.models.Response object and it actually contains the information from the website, for example:
type(res)
requests.models.Response
print(res.text)
<!doctype html>
<html>
<head>
<title>Example Domain</title>
<meta charset="utf-8" />
<meta http-equiv="Content-type" content="text/html; charset=utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<style type="text/css">
body {
background-color: #f0f0f2;
margin: 0;
padding: 0;
font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}
div {
width: 600px;
margin: 5em auto;
padding: 2em;
background-color: #fdfdff;
border-radius: 0.5em;
box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
}
a:link, a:visited {
color: #38488f;
text-decoration: none;
}
@media (max-width: 700px) {
div {
margin: 0 auto;
width: auto;
}
}
</style>
</head>
<body>
<div>
<h1>Example Domain</h1>
<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>
<p><a href="https://www.iana.org/domains/example">More information...</a></p>
</div>
</body>
</html>
To analyze the extracted page we’ll use BeautifulSoup.
Technically we could use our own custom script to look for items in the string of res.text but the BeautifulSoup library already has lots of built-in tools and methods to grab information from a string of this nature (basically an HTML file).
Beautiful Soup is a Python library for parsing data out of HTML and XML files. It is useful for navigating, searching, and modifying the parse tree. The major concept with Beautiful Soup is that it allows you to access elements of your page by following the CSS structures, such as grabbing all links, all headers, specific classes, or more. It is a powerful library.
Once we grab elements, Python makes it easy to write the elements or relevant components of the elements into other files, such as a CSV, that can be stored in a database or opened in other software.
First, we have to turn the website code into a Python object. We have already imported the Beautiful Soup library, so we can start calling some of its methods. Replace print(res.text) with the following; this turns the text into a Python object named soup.
An important note: you need to specify the parser that Beautiful Soup uses on your text. This is done in the second argument of the BeautifulSoup function. The default is the built-in Python parser, which we can call using html.parser
You can also use lxml or html5lib. This is nicely described in the documentation.
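For example, the same snippet can be parsed by naming any installed parser in that second argument. This sketch uses only the built-in "html.parser" so it runs without lxml or html5lib installed; the HTML string is made up for illustration.

```python
import bs4

html_doc = "<html><head><title>Demo</title></head><body><p>Hi</p></body></html>"

# The second argument selects the parser: "html.parser" ships with Python,
# while "lxml" and "html5lib" are optional third-party installs.
soup = bs4.BeautifulSoup(html_doc, "html.parser")
print(soup.title.string)  # Demo
```

The parsers differ mainly in speed and in how forgiving they are of malformed HTML, so for real pages the choice can occasionally change the resulting tree.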
Using the Beautiful Soup prettify() function, we can print the page to see the code in a readable, nicely indented form.
#Using BeautifulSoup you can create a "soup" object that contains all the "ingredients" of the webpage.
soup = bs4.BeautifulSoup(res.text,"lxml")
print(soup.prettify())
<!DOCTYPE html>
<html>
<head>
<title>
Example Domain
</title>
<meta charset="utf-8"/>
<meta content="text/html; charset=utf-8" http-equiv="Content-type"/>
<meta content="width=device-width, initial-scale=1" name="viewport"/>
<style type="text/css">
body {
background-color: #f0f0f2;
margin: 0;
padding: 0;
font-family: -apple-system, system-ui, BlinkMacSystemFont, "Segoe UI", "Open Sans", "Helvetica Neue", Helvetica, Arial, sans-serif;
}
div {
width: 600px;
margin: 5em auto;
padding: 2em;
background-color: #fdfdff;
border-radius: 0.5em;
box-shadow: 2px 3px 7px 2px rgba(0,0,0,0.02);
}
a:link, a:visited {
color: #38488f;
text-decoration: none;
}
@media (max-width: 700px) {
div {
margin: 0 auto;
width: auto;
}
}
</style>
</head>
<body>
<div>
<h1>
Example Domain
</h1>
<p>
This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.
</p>
<p>
<a href="https://www.iana.org/domains/example">
More information...
</a>
</p>
</div>
</body>
</html>
Beautiful Soup allows us to navigate the data structure. We called our Beautiful Soup object soup, so we can run the Beautiful Soup functions on this object.
# Access the title element
print(soup.title)
<title>Example Domain</title>
# Access the content of the title element
print(soup.title.string)
Example Domain
# Access data in the first 'p' tag
print(soup.p)
<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>
Use the .select() method to grab elements. We are looking for the ‘title’ tag, so we will pass in ‘title’:
select(‘css selector’) –> List of Tags
soup.select('title')
[<title>Example Domain</title>]
soup.select("head > title")
[<title>Example Domain</title>]
type(soup.select('title'))
bs4.element.ResultSet
Notice what is returned here: it's actually a list containing all the title elements (along with their tags). You can use indexing or even looping to grab the elements from the list. Since each item in this list is still a specialized Tag object, we can use method calls to grab just the text.
p_tag = soup.select('p')
len(p_tag)
2
p_tag[0]
<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>
p_tag[1]
<p><a href="https://www.iana.org/domains/example">More information...</a></p>
type(p_tag[0])
bs4.element.Tag
p_tag[0].getText()
'This domain is for use in illustrative examples in documents. You may use this\n domain in literature without prior coordination or asking for permission.'
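To pull the text out of every Tag in a ResultSet at once, you can combine .select() with a list comprehension. A small self-contained sketch (the HTML string here is made up, so nothing needs to be fetched over the network):

```python
import bs4

html_doc = """
<html><body>
<p>First paragraph.</p>
<p>Second paragraph.</p>
</body></html>
"""

soup = bs4.BeautifulSoup(html_doc, "html.parser")
# .getText() on each Tag in the ResultSet returns a plain string
paragraphs = [p.getText() for p in soup.select('p')]
print(paragraphs)  # ['First paragraph.', 'Second paragraph.']
```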
find_all(tags, keyword_args, attrs={‘attr’: ‘value’}) –> List of Tags
The find_all() method scans the entire document looking for results, but sometimes you only want to find one result. If you know a document only has one <body> tag, it’s a waste of time to scan the entire document looking for more. Rather than passing in limit=1 every time you call find_all, you can use the find() method.
find(tags, keyword_args, attrs={‘attr’: ‘value’}) –> Tag
soup.find_all("p")
#soup("p")
[<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>,
<p><a href="https://www.iana.org/domains/example">More information...</a></p>]
soup.find_all('p', limit=1)
#soup.find('p')
[<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>]
soup.find("body").find("p")
<p>This domain is for use in illustrative examples in documents. You may use this
domain in literature without prior coordination or asking for permission.</p>
soup.find("head").find("title")
<title>Example Domain</title>
We choose for this task to grab all the section headings of the Wikipedia article on Web scraping from this URL: https://en.wikipedia.org/wiki/Web_scraping
# First get the request
res = requests.get('https://en.wikipedia.org/wiki/Web_scraping')
# Create a soup from request
soup = bs4.BeautifulSoup(res.text,"lxml")
Now it's time to figure out what we are actually looking for. Inspect the element on the page to see that the section headers have the class “mw-headline”. Because this is a class and not a plain tag, we need to use CSS selector syntax. In this case:
Syntax to pass to the .select() method | Match Results
---|---
soup.select('div') | All elements with the 'div' tag
soup.select('#some_id') | The HTML element containing the id attribute of 'some_id'
soup.select('.some_class') | All the HTML elements with the CSS class 'some_class'
soup.select('div span') | Any elements named span within a div element
soup.select('div > span') | Any elements named span directly within a div element, with nothing in between
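These selector syntaxes can be tried out on a small made-up snippet before pointing them at a real page (the ids, classes, and HTML below are invented for illustration):

```python
import bs4

html_doc = """
<html><body>
<div id="main">
  <span class="note">inside div</span>
</div>
<span class="note">outside div</span>
</body></html>
"""

soup = bs4.BeautifulSoup(html_doc, "html.parser")
print(len(soup.select('span')))        # all <span> elements -> 2
print(len(soup.select('#main')))       # the element with id="main" -> 1
print(len(soup.select('.note')))       # all elements with class "note" -> 2
print(len(soup.select('div span')))    # <span> anywhere inside a <div> -> 1
print(len(soup.select('div > span')))  # <span> directly inside a <div> -> 1
```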
soup.select(".mw-headline")
[<span class="mw-headline" id="History">History</span>,
<span class="mw-headline" id="Techniques">Techniques</span>,
<span class="mw-headline" id="Human_copy-and-paste">Human copy-and-paste</span>,
<span class="mw-headline" id="Text_pattern_matching">Text pattern matching</span>,
<span class="mw-headline" id="HTTP_programming">HTTP programming</span>,
<span class="mw-headline" id="HTML_parsing">HTML parsing</span>,
<span class="mw-headline" id="DOM_parsing">DOM parsing</span>,
<span class="mw-headline" id="Vertical_aggregation">Vertical aggregation</span>,
<span class="mw-headline" id="Semantic_annotation_recognizing">Semantic annotation recognizing</span>,
<span class="mw-headline" id="Computer_vision_web-page_analysis">Computer vision web-page analysis</span>,
<span class="mw-headline" id="Software">Software</span>,
<span class="mw-headline" id="Legal_issues">Legal issues</span>,
<span class="mw-headline" id="United_States">United States</span>,
<span class="mw-headline" id="The_EU">The EU</span>,
<span class="mw-headline" id="Australia">Australia</span>,
<span class="mw-headline" id="India">India</span>,
<span class="mw-headline" id="Methods_to_prevent_web_scraping">Methods to prevent web scraping</span>,
<span class="mw-headline" id="See_also">See also</span>,
<span class="mw-headline" id="References">References</span>]
for item in soup.select(".mw-headline"):
print(item.text)
History
Techniques
Human copy-and-paste
Text pattern matching
HTTP programming
HTML parsing
DOM parsing
Vertical aggregation
Semantic annotation recognizing
Computer vision web-page analysis
Software
Legal issues
United States
The EU
Australia
India
Methods to prevent web scraping
See also
References
We choose for this task to grab the images on the Wikipedia page for the Mona Lisa: https://en.wikipedia.org/wiki/Mona_Lisa
res = requests.get("https://en.wikipedia.org/wiki/Mona_Lisa")
# example 2:
#res = requests.get("https://en.wikipedia.org/wiki/Cicada_3301")
soup = bs4.BeautifulSoup(res.text,'lxml')
image_info = soup.select('.thumbimage')
image_info
[<img alt="" class="thumbimage" data-file-height="750" data-file-width="2500" decoding="async" height="114" src="//upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Mona_Lisa_margin_scribble.jpg/380px-Mona_Lisa_margin_scribble.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Mona_Lisa_margin_scribble.jpg/570px-Mona_Lisa_margin_scribble.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Mona_Lisa_margin_scribble.jpg/760px-Mona_Lisa_margin_scribble.jpg 2x" width="380"/>,
<img alt="" class="thumbimage" data-file-height="600" data-file-width="410" decoding="async" height="249" src="//upload.wikimedia.org/wikipedia/commons/thumb/3/36/Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28right_landscape%29.jpg/170px-Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28right_landscape%29.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/3/36/Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28right_landscape%29.jpg/255px-Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28right_landscape%29.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/3/36/Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28right_landscape%29.jpg/340px-Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28right_landscape%29.jpg 2x" width="170"/>,
<img alt="" class="thumbimage" data-file-height="569" data-file-width="758" decoding="async" height="165" src="//upload.wikimedia.org/wikipedia/commons/thumb/6/64/Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg/220px-Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/6/64/Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg/330px-Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/6/64/Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg/440px-Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg 2x" width="220"/>,
<img alt="" class="thumbimage" data-file-height="1000" data-file-width="717" decoding="async" height="307" src="//upload.wikimedia.org/wikipedia/commons/thumb/3/33/Raffaello_Sanzio_-_Portrait_of_a_Woman_-_WGA18948.jpg/220px-Raffaello_Sanzio_-_Portrait_of_a_Woman_-_WGA18948.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/3/33/Raffaello_Sanzio_-_Portrait_of_a_Woman_-_WGA18948.jpg/330px-Raffaello_Sanzio_-_Portrait_of_a_Woman_-_WGA18948.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/3/33/Raffaello_Sanzio_-_Portrait_of_a_Woman_-_WGA18948.jpg/440px-Raffaello_Sanzio_-_Portrait_of_a_Woman_-_WGA18948.jpg 2x" width="220"/>,
<div class="thumbimage"><a class="image" href="/wiki/File:Mona_Lisa_stolen-1911.jpg"><img alt="" data-file-height="887" data-file-width="640" decoding="async" height="277" src="//upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Mona_Lisa_stolen-1911.jpg/200px-Mona_Lisa_stolen-1911.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Mona_Lisa_stolen-1911.jpg/300px-Mona_Lisa_stolen-1911.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Mona_Lisa_stolen-1911.jpg/400px-Mona_Lisa_stolen-1911.jpg 2x" width="200"/></a></div>,
<div class="thumbimage"><a class="image" href="/wiki/File:Mona_Lisa_Found,_La_Joconde_est_Retrouv%C3%A9e,_Le_Petit_Parisien,_Num%C3%A9ro_13559,_13_December_1913.jpg"><img alt="" data-file-height="11805" data-file-width="8533" decoding="async" height="277" src="//upload.wikimedia.org/wikipedia/commons/thumb/1/15/Mona_Lisa_Found%2C_La_Joconde_est_Retrouv%C3%A9e%2C_Le_Petit_Parisien%2C_Num%C3%A9ro_13559%2C_13_December_1913.jpg/200px-Mona_Lisa_Found%2C_La_Joconde_est_Retrouv%C3%A9e%2C_Le_Petit_Parisien%2C_Num%C3%A9ro_13559%2C_13_December_1913.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/1/15/Mona_Lisa_Found%2C_La_Joconde_est_Retrouv%C3%A9e%2C_Le_Petit_Parisien%2C_Num%C3%A9ro_13559%2C_13_December_1913.jpg/300px-Mona_Lisa_Found%2C_La_Joconde_est_Retrouv%C3%A9e%2C_Le_Petit_Parisien%2C_Num%C3%A9ro_13559%2C_13_December_1913.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/1/15/Mona_Lisa_Found%2C_La_Joconde_est_Retrouv%C3%A9e%2C_Le_Petit_Parisien%2C_Num%C3%A9ro_13559%2C_13_December_1913.jpg/400px-Mona_Lisa_Found%2C_La_Joconde_est_Retrouv%C3%A9e%2C_Le_Petit_Parisien%2C_Num%C3%A9ro_13559%2C_13_December_1913.jpg 2x" width="200"/></a></div>,
<div class="thumbimage"><a class="image" href="/wiki/File:Monalisa_uffizi_1913.jpg"><img alt="" data-file-height="1600" data-file-width="1252" decoding="async" height="256" src="//upload.wikimedia.org/wikipedia/commons/thumb/9/93/Monalisa_uffizi_1913.jpg/200px-Monalisa_uffizi_1913.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/9/93/Monalisa_uffizi_1913.jpg/300px-Monalisa_uffizi_1913.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/9/93/Monalisa_uffizi_1913.jpg/400px-Monalisa_uffizi_1913.jpg 2x" width="200"/></a></div>,
<div class="thumbimage"><a class="image" href="/wiki/File:After_photo_for_the_return_of_Gioconda_at_the_Louvre_Museum_1914.jpg"><img alt="" data-file-height="2850" data-file-width="4000" decoding="async" height="143" src="//upload.wikimedia.org/wikipedia/commons/thumb/e/e5/After_photo_for_the_return_of_Gioconda_at_the_Louvre_Museum_1914.jpg/200px-After_photo_for_the_return_of_Gioconda_at_the_Louvre_Museum_1914.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/e/e5/After_photo_for_the_return_of_Gioconda_at_the_Louvre_Museum_1914.jpg/300px-After_photo_for_the_return_of_Gioconda_at_the_Louvre_Museum_1914.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/e/e5/After_photo_for_the_return_of_Gioconda_at_the_Louvre_Museum_1914.jpg/400px-After_photo_for_the_return_of_Gioconda_at_the_Louvre_Museum_1914.jpg 2x" width="200"/></a></div>,
<img alt="" class="thumbimage" data-file-height="2304" data-file-width="3072" decoding="async" height="165" src="//upload.wikimedia.org/wikipedia/commons/thumb/0/0d/MonaLisaShield.jpg/220px-MonaLisaShield.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/0/0d/MonaLisaShield.jpg/330px-MonaLisaShield.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/0/0d/MonaLisaShield.jpg/440px-MonaLisaShield.jpg 2x" width="220"/>,
<img alt="" class="thumbimage" data-file-height="3240" data-file-width="5760" decoding="async" height="169" src="//upload.wikimedia.org/wikipedia/commons/thumb/2/25/Crowd_looking_at_the_Mona_Lisa_at_the_Louvre.jpg/300px-Crowd_looking_at_the_Mona_Lisa_at_the_Louvre.jpg" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/2/25/Crowd_looking_at_the_Mona_Lisa_at_the_Louvre.jpg/450px-Crowd_looking_at_the_Mona_Lisa_at_the_Louvre.jpg 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/2/25/Crowd_looking_at_the_Mona_Lisa_at_the_Louvre.jpg/600px-Crowd_looking_at_the_Mona_Lisa_at_the_Louvre.jpg 2x" width="300"/>,
<img alt="" class="thumbimage" data-file-height="2305" data-file-width="2598" decoding="async" height="195" src="//upload.wikimedia.org/wikipedia/commons/thumb/d/da/JFK%2C_Marie-Madeleine_Lioux%2C_Andr%C3%A9_Malraux%2C_Jackie%2C_L.B._Johnson%2C_unveiling_Mona_Lisa_at_National_Gallery_of_Art.png/220px-JFK%2C_Marie-Madeleine_Lioux%2C_Andr%C3%A9_Malraux%2C_Jackie%2C_L.B._Johnson%2C_unveiling_Mona_Lisa_at_National_Gallery_of_Art.png" srcset="//upload.wikimedia.org/wikipedia/commons/thumb/d/da/JFK%2C_Marie-Madeleine_Lioux%2C_Andr%C3%A9_Malraux%2C_Jackie%2C_L.B._Johnson%2C_unveiling_Mona_Lisa_at_National_Gallery_of_Art.png/330px-JFK%2C_Marie-Madeleine_Lioux%2C_Andr%C3%A9_Malraux%2C_Jackie%2C_L.B._Johnson%2C_unveiling_Mona_Lisa_at_National_Gallery_of_Art.png 1.5x, //upload.wikimedia.org/wikipedia/commons/thumb/d/da/JFK%2C_Marie-Madeleine_Lioux%2C_Andr%C3%A9_Malraux%2C_Jackie%2C_L.B._Johnson%2C_unveiling_Mona_Lisa_at_National_Gallery_of_Art.png/440px-JFK%2C_Marie-Madeleine_Lioux%2C_Andr%C3%A9_Malraux%2C_Jackie%2C_L.B._Johnson%2C_unveiling_Mona_Lisa_at_National_Gallery_of_Art.png 2x" width="220"/>]
len(image_info)
11
image = image_info[2]
type(image)
bs4.element.Tag
You can make dictionary-like calls for parts of the Tag. In this case, we are interested in the src, or “source”, of the image, which should be its own .jpg or .png link:
image['src']
'//upload.wikimedia.org/wikipedia/commons/thumb/6/64/Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg/220px-Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg'
Now that you have the actual src link, you can grab the image with requests.get along with the .content attribute. Note how we had to add http:// before the link; if you don't do this, requests will complain (but it gives you a pretty descriptive error message).
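The src we grabbed starts with // (a protocol-relative URL), which is why a scheme has to be prepended before handing it to requests. A tiny sketch of that step; the helper name normalize_src is our own, and we default to https here even though the text above uses http:

```python
def normalize_src(src, scheme="https"):
    """Prepend a scheme to a protocol-relative URL (one starting
    with //) so requests.get() accepts it; leave full URLs alone."""
    if src.startswith("//"):
        return f"{scheme}:{src}"
    return src

print(normalize_src("//upload.wikimedia.org/example.jpg"))
# https://upload.wikimedia.org/example.jpg
```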
# image_link = requests.get('http://upload.wikimedia.org/wikipedia/commons/thumb/e/e8/Mona_Lisa_margin_scribble.jpg/380px-Mona_Lisa_margin_scribble.jpg')
image_link = requests.get('http://upload.wikimedia.org/wikipedia/commons/thumb/6/64/Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg/220px-Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_%28dite_La_Joconde%29_-_Louvre_779_-_Detail_%28hands%29.jpg')
# The raw content (it's a binary file, meaning we will need to use binary read/write methods for saving it)
image_link.content
b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x00H\x00H\x00\x00\xff\xfe\x00\xa1File source: https://commons.wikimedia.org/wiki/File:Leonardo_di_ser_Piero_da_Vinci_-_Portrait_de_Mona_Lisa_(dite_La_Joconde)_-_Louvre_779_-_Detail_(hands).jpg ...'
(long binary output truncated)
"A\x92kSn\x02\xac+\x13\xb0\xaf>h\xe9Ga.\x8dQ\xb0\x8a\x8ap\x02v\x13\xce\xb16\xf0RU$DoIZ\xd7\xaaQ\x91\xd2bj(\xa3\xac\x94\x82\x90dm\x14\xb4\xdc\xa1\xa5\xc7\x94\x19\xe5Y\xed\xa5\xc4\xe1FF`\x99\xae7\x1df\xec\xb6\\`\x99\x19\x8d\xe7\x14\xe3\x0bt\r\xd6\xcfP--\xefP|P\x16\x9385\xc2\xe3\xdc.\xee\xd9\xa3\xe0E\xd3 {\x8eeC\xb0<\xc5y\xce\t\xed\x88a\xef\x02\xe8B\x81\x89>\xb5\xeeZ\xe3,\xde0\x9f8T\x9egz\xb9c\x9e\'\xb4\nQ\x9fG\xc7\xf8\x9e\x97\x10\xef\x87\xc3\xd3\xe5\x94\xeaCpA\xaf>\xac\x1d\nJ\xa7\xa4\x11;\xd7\xd6x\xe5\x97\x80\xf2\xee\xecT\x12\xea\xa0\xad\xb3:\x1c\x8fN}\rr\x97\xc4\\R\xd2\x94\xa5\x92\x81\x9f\n\xe1!@v\x07q^\x86/-\xa5\xa8\xdf\xf2re\xf1\xed\xdbg\x8d\xb2\xe0<J\xe9\x80\xf3V\xda\x18\'O\x88\xea\xc2\x13\'\xebZ\x7f\xe9\x9e+\xa4\xa9\r\xb0\xf0H\x92\x19}+;t\x15\xeal/\xadm\xdc#S\xf6\xae*%\x0e\'\xc4eC\xaco\xcf\x04mL\xe2\x8a\xb4)K\xac\x9f\x10\x11\x1a\xda\'Z\x15\xce\t\xf7\x87c\xf3\xa8~n^U_\xd3\x14|hQ\xe1\x13b\xfa\\(\xb8mL\x1ca\xc4\x90k\xad\xc3\xf8e\x92\xd5\xf9\xda\xddP\xce\x82J>\x9c\xfeu\xdaE\xe2\x9c@b\xf1-]0\xe0\x01\xb7\xce\xe7\xb1<\x8f(5\x8d\xd6\xedX.xn9l\xa0cJ\xfc\xe89\xe6\x0f\xbbN^L\xe7\xae\xbfb\xa3\x82+e\xdc\xf0\xce\x12\xda\n\x1c\xe1\x8bB\xf4\xcc\xa5\xe2\x85}pk!\xe1\xdc=AB\xde\xe9\xebe(\x00\x1b\xbaoP\xdb>a\x91]F\x16\xb6\x18\x1a\xc2\x1d\xb7V\x9c-Z\x9b&v\x9d\xd2O\xca\x95{j\x12\x99i\'N\x91,\xb9\x9eG)<\xc7z\xca\x19d\xb4\xe4\xff\x00\xcf\xfd4x\xe3\xdd\x18\x1c\xe0\xae\xb4\xd1[\x96\x8d\xbe\xda\x88\xd2\xbbU\x85\x01\xf2\xc8\xf9Vtp\x958%\x8f\x18$`\x856d\x1f\x85=\t\x00\xa5v\x8e)\x0b\xe91\xa6\xb5\xdb=t\x1b\x82\xa77\xea+\xa63\x9a\xf7\xf7\xfd\x998\xc5\xf6\x8f4\xf0%\xc04\xd0)\x00\x99\x1b\xf6\xad\x0e\'\xce4\xccm4*\x04aD\x1eu\xe9Y\xc5Fr\x81\x00\x8f*\x85XJ\xc4A\x91\xd6\x9eH\xc1\x1b\x8aI\x90|\xbf:C\xaa\x03QI\xd8\xc7\xda\xaf\xc6\xe9\xbf0y\xd0\x92 dR\xcb\x84\x93\x9d\xf7\xc5\x14M\x9b,\xed\xee8\x85\xd8\xb7\xb2ao<\xbd\x90\x81\xb7rv\x03\xb9\xc5z\xee\x17\xecm\xb2Z\xf1x\xcb\xebR\xbf\xf5\xb1\x84zj\xdc\xfc)\xde\xcc\xa9\x9e\x1b\xec\xfbKC 
\xbe\xf8\xf1\x16\xb3\xb9&c\xe4)\xab\xbdr\xe1i\x0e\xa4\xc90\x80\x06s\xb0\x15\xe7f\xcf6\xdca\xa4vc\xc3\x14\xae[;<.\xd2\xd1\x95\xf8\x166\xcc\xdb\x83\x00\x94&TD\xf5\xc9\xae\xbf\x10\xb2l2@ \xa8\xf2\x8a\xc7\xc2\xec\x95\xc3\x92\xb7\xae\x88\xfcK\x83N\x94\x99\x08\x1d=k\xa8\xd2?\x16\x92bD\xc4\xf4\xaf>OwgJJ\x8f\x1a\x14\xabW\x16\xc2\xcf?)\xe4F\xf5\xa9\x97\xc1"\x15\x19\xad\x9cn\xc5\x0e6\xa4\xec\xb4\xfb\xaa\xe9^k\xc4r\xdd\xef\r\xe2d`(\x0fz\xb6IdV&\xdcOE\xe2\xc2r@;V\x86\xc2\xc2fdMqZ\xb9\x84\xe4\x93\xe9[l\xf8\x82P\xbd* \x81\x18\'\xefY\xca\x0f\xd1jH\x0b\xbb\xf7,]\x01\xd5@Q\xc1\'\x06\xbb\xfc*\xe5\x97\xd0\x9f\x10\x82T\x04\x8d\xf1@\xf5\x9d\xaf\x15\xb4,>5\x05\xf9\x92F\xe3\xbdx\xcb\xc6\xaf}\x99\xb9\x1e)[\x96J>U\xf3O\xad\x11\x82\xc8\xa9i\x83\x97\r\xbe\x8e\xbf\xb5\x9e\xc9\xb5z\xb5\xdd\xf0\xe4\xa1.\xe7R#\x0b\xfe\rx\xabg\xef\xb8[\xea\r\xeb\xd2\x93\x05\xa5n+\xe8\\+\x8c\xb5t\x84\x92\xa0A\xc69W7\x8f\xd8#\x88iv\xdc\x04\xdcD\xe3\x1a\x87J\xd7\x1eYG\xe4\x9fDK\x1a\x7f4\x0e}\xb7\x18G\x10d\x02\xa5j\x88)<\x8f\xa5\x1b\xadk@[f\x15\xb63\xfe\x1a\xf3\x97\x10\x9b\xa9RKn\'\x00\xa4g\xe3\xd6\xb6Y\xf1b\xc2\xc3Wa \xc6\x17\x92\x0f\xafJ\xd7\xe1{\x88\xa3\x95=H\xdbs\xa1\xd4\xa8\xa4C`y\xd0\x0c\xe9\xee9\x81\\\xe6\xd6m/\x13\x9dM\x13\x83;\x83O\xbd\xbdd\xb8\xc2\xd8RW\xa8\xe9X\x03\x04\x1a\xc8\xfe\x85\x95\x86\xcf\x98\x12\x7f\xccT\xc6\x15\xa7\xd3\x14\x9aOF\x95\x84\xb6\x958\x94%\xc6\xdc\xca\xdb\xe5;H\xefM(\xfcK`\x0c\xa8\xa4\xe8t\x8fx\x0f\xd2\xa1\xd6\xb1\x0b\x95%\x84\xe9VD\x88\xdf\x9c\xd3\xecn\xdbqJ\x81\x1a\x81\xd6\x8d\xa7\xb8\xefJP\x92V\ni\xb1-\xa9\xeb]~\x01PA\xf7\x9a9\xf8\x8a\xd5k\xc65 6\xb4\x80 \r1\x8d\xa2\x87\x88%!\x96\x9ch\xc9\x12& \xfa\xd6\x05 \\\x02\xb4\x90\x1c\x03"\xa9F3V\xd0\x9c\x9c]#C\xcd\xeauJ\x00\'\x7f.\xd9\x9a\xd3h\x16[2\x9dY\xde\x05s\x92\xb5)\xb8\n\xf3Q\xdb:\xea[\x80\xb2\x9c\xff\x00TV\xb1\x8b\xe8\x875\xd9\x85\xc5I\x06 
\x01\x03;R\x88J\x8c\x92d\xf2\xedOR5\xa8y\x93=\xcd\x0e\x84\x82L\xa4\xfdk\xbe\xceQ\x1apy\xe6\x81I\x9c\xf2<\xcf:t\x89\x04\x89\xdf\xde\x1bR\xd5\xc8c\x19\xc5\n\xc4\xc4)\xb3\x06)E11[4\x81\xcc\xaaF\xd4\x0c\xa2\\\x1a\x84\x8e\xd5V.\'\xb4t\x0b{\x0bVD\x0f\r\xa4\x88\xeb\x8f\xe6\xbb\\\x1e\xd9<9\xbf\xc6\xdd\xc7\xe2\xd6!\t;\xb2\x9f\xfe\x8f\xdb\x1dk\x88n\xd2\xed\xd5\xa3\xf8\xd2\x08"v\x91\xb7\xd6\xb4^\xde\xa9\xd1.(\xcfB$o^<\x94\x9e\x8fKH\xe99x\xe5\xd5\xd2Zl\xfb\xca\t\x1f\xe7\xa5zf\x1f\r\xb2Yo\x00\xe0\x93\xfa\xab\xcap4\x96\x18r\xed\xc8\xd4\xb1\x08\x93\xb0\xe6k\xbfd\xf7\x85l\x1f|\x1dk2\x01\xe9\xcb\xf9\xac2%t\x8a\x88<@\x06\x90\xa2\xa5\x1c\x9cu\x9a\xf2\xfca\x9d`\x93\x00\x9d\xa2\xb7q~"VF\x92LL\x91\\\'\xae\xca\x94\xa0\xac\x99\x81Z\xe2\x83N\xc9\x9bTU\xb5\xee\x87B\x1d!$\x9c\x11\xb1\xadW\xec\x07\x9a\xf1\x1aV\x97\x01\xc4W\x06\xe9\xd0\x01\'\x04\xcd\r\x97\x15[_\x96\xf4\x94\x83\x01S\xf7\xae\x97\x89\xbf\x9a&\x1c\xd2\xd3=\x0f\x05\xe3k\xb7\xb8K7J\x08_#\xc9_\xe7J\xf6N=o\xc4\xed\x0bO\xc2\x90\xa4\xe67\xaf\x99\xdd\xb8\xcd\xca\x94G\xbd\xca\x9bm\xc5\x9f\xb3G\x86\xe2\xd4\xa4\xe0\x02\x0eb\xb1\xc9\x83\x96\xe3\xa6k\x0c\xb5\xa9\x17{h\xe7\x07\xbe&\xcdz\xed\xf9\x89\x8d5\xd9\xe0\xbcQ\xa7@q\n\x13\xbew\x06\xb9\x86\xf1\x17\x89$\xac\x10w\xaes\xcc\xf8\nS\xb6\xaa)^\xd8\xd8\xd6\x9c9\xaa\x9fb\xe5\xc1\xdcz=\x17\x15\xe1\xed\xf11\xe2[\x90\x1fFw\xc2\xbdk\xcb)^\x1d\xe3\x8c\xbe\x060\xa0k\xb1\xc1x\xe2L\xa1\xd1\xa1\xd4\x88P\xeb\xcej\xbd\xa3J.\xad\x7f\x16\x8d\x01\xc4\x89\xd49\xf6\xa9\x82\x94\x1f\x19\x0eiMs\x8fg%\xa6\xd2\xd3\x89\xc6\xc4\x91Hip\xa2\xa0\xb9\xf8\n\x04\xdc~R\xbc\xdec\x81\xdazP4A\x10\x0e\xc2\xba\x12{\xb3\x9a\xd5\xaa6-\xe4\xa6\xd0\x85\x18!D|\xe96\xee\x1dAD\xe4\xe7\x1b\xd2\xdep\x90\x90\x0fs=\xea\x90\xb8\\\x9d\xc55\x1d\t\xcbf\xf2\xf9Rbv\xf2\x89\xa5\x05\x10AI\x03\xd0\xd6w\x1d;U\x05\x90\xa1\x06\x92\xc7@\xe7f\xd4iQ\x0bL\x05\x010v\xa8\xd8%>\xf7>\xd5\x909\x07\x14\xc4<\x98\xde~\x15J4\x17b\x1d\x1aT5\x02\x93\xca\x82b2\xb8\x89\xad\xce\xb6@\x1a\xc8\xc9\xacj@V\xa1 
\xf75\xd5h\xc9\xa1&y\xa8\xfd\xea$\xa8\xf2\xcfJxm\xb4\xb7\x89\xd5\xeb@\x99\x93\x18\x00oE\x8a\x8aI\xc2\xa4\xccr\x14H^\x82\x92> Uh\x00\x99\xe7\xc8\xd3\x14\x9d8\xf2\xcf\xda\x95\xfa\x1d\x1a\xac\xdfZY\xd2\x080\xac\x05\r\xbaWm/!\xff\x00\x01\xbdQ d\xf2\x15\xe5\xcb\x86`V\x96oT\xdbh\x05#R63\xcb\xa1\xae|\xb8\xbd\xa3|y=3\xe9<44\xb5-\xf7\xd2E\xb38J@\xc2\xd49z\n\xe5{C\xc6\x14\xf2\x95\xa4\x94\x8c\xe2yV{\xcb\xe5\xff\x00\xa7[\xb6\xd0\x80\x1bN\xc6 \x11;|k\xca\xf1\x1b\x85\xb8R\x06\xa0y\xcdqa\xc2\xe4\xed\x9d\x19rqZ4\xae\xf2\xe5\xc4\xcc\xa4\x0eR)\x05\x0bY:\x9dP\x9c\xf9k+\x0e\x1d\nA\x11\x07\x11F\xa7T\x93\xef\x08\x8a\xee\xe3ZG#\x95\xf65v\xc9Q>u\x9fSE\xf8di\xcc\x1a\xc6\xa7U\x82\x16\x98\xe9V\\Y@%b\x0f\xad6\xa5\xf5\x0b\x8f\xd0p\xb6\x00\xf9U\xa7\xd0\xd3\x91j\xee\x82u\x83\xcf5\x88\x15L\x95c\xd2\x9d\xad`\x10\x1c#\x11\x81K\x8b\x0eH\xb3h\xe2\x17\xa9\x0e)*\xed\x8a2\xe3\xd9\n\xd2\xb1\xd4VB\xa5\x19\x95\xa8\xfa\x8a\xa2\xa8L\x12I\xec)q\xbe\xc4\xa5]\x03p\xcb\x8b_\x89\xa8\x0209\x1f\x9dG\x1d\xbaS%\xb5\xa9E?:\xadC\x96\xa9\xa1\x91\x89\x04\xd5\x93\x7fA\x01KA\x00\xcf\xca\x98\x95\x94\x0c\x8aa\xc8\xc2s\xd6ie#~~\xb5Zd\x8bS\xa4\xaaH\x93F\x1c)V \xc1\xdf\xadE$\x9d\x8e;\x8a\x18\x00\x99\x83T\xa865.\t\x15<@w\xe9\xce\x95 \x18\x1bU\x95\x0e\x94q\x00\xd4\xb3"\x06\xf4\xc0\xe1\x8cVe\x938\xc51\x12\x13\x8aj!f\xfb\x85\xa9Zt\x99I\xe9HJ\x14d\xa8\xc04\xd2\xa0L\x92\r\tR\xb1\x9c\x13\xb9\x14\xe8vPH\x05D\x9c\xf4\x8a0!$\xf6\xa5\xca\xb7P=\xb9S\n\xe0G\xea\xa4\xd0 \n$\xc6>uN()[\xfc\xa8\x1cs\x04\xce\xfb\x11T\x94I\x93 w\xa2\xab\xb1\x96\x9f*\x86b:\xd1Fd\x88\x1dj\'\xcagL\xe3sB\t3\'\xbcT\xbd\x8d#\xbe\xbb\xb0\xed\xb7\x97\x04\x00 
\xc6yW\x15\xc5\x85#\xcd2\x0cE\x0f\x8a\xa4yF\xc6\x97\xbc\x92`\xfa\xd6Q\x87\x12\xe5;\x18\x95\x94\xacd\x12y\xf5\xa2L\xa8\x90S\xab\xb8\xac\xee($\x8d\'\x1d\xaa\xc3\x82\'\'\x9e\xf5n$rF\x9d\x11\x8d\n\x07\xd6\x88#L\xc8?:\xce\x97\n\x91\xa9)\x03H\xd4u9\x13\xe9\xde\x94\\\x9fu?Y\xa2\x98Y\xb4\x8c\r\x85\x02\xca\xa2%=\xb23X\xcb\xbc\xb4\xe3\xb94\nyG\x90\xa3\x8b\x13\x926\xc1\x991\x11\xd4U\x18&Al\x01\x1c\xeb\x10q@\xec\x98=\xa8T\xea\x89\xdc|\xaa\xb8\x13\xcd\x1b`\x01\xef\xa2*\x95\xa7?\x98\t\xec&\xb1kV\x91&\x89*T\xce\xb3\x8eS\x14p\x0ec\x94\xa4\xc6\x14\xb3\xe8*(\xa7"\x17\xf11I\xd7\'\xbd\x16 JE:\xa1]\x85\x89\x11\xb8\xdeU@glO\xa5\x10 r\x02\x94\xac\x13B@\\\xcf?\xda\xae\x04oK\x9eTi%D\x04\x8c\x9cGZ\xa0,\xc1<\xf7\xa7\xa0yp>\xb5\x98\xd3\x12\xa0\x06~\xf4\xc0\xe9)\x07IPH5 \xe9\x90q\x1c\xc5YN\x9c\x10H\xeb\xca\x85\xc5jO)\x18\xa94\x16V\x01\xe7\x8a\x00\xa9I\x04\xe6h\x96\x04dA\xd8\xf7\xaam\xb5/a\xb5\x1d\x08\xa4\xc4\x9c\x01D\x01\x9cL\x9aj[I\xf7\x94>\x19\x8a\x12\x12\x91\xe4\x07\x18\xa9lt\x0e\xa1"d\xe2\r\x03\xcb\tp\xc4|(\x94\x92\xa5\x1ds\xcb\x7f\xe2\x94\xe2\n\xcc\x8d\xe8@\xd8J:\x921\x9d\xf5N\x05\x00L\xa7&}(T\x14\x12yP\xb4\xb3\xa7I\x14\xf8\xd1-\x96S\x03\x03\xe3T\x07\x94g\xff\x00\xcabgs\xb9\xa1Y\xc9\x03\'\xb5\x16 \xc0QH\xd3\xf4\x9a\x04\x82\xe2`\';\xe3\xb55\x9dI%)\x04\xabh\xac\xcb\x9dK\x00\xe9$b\xa4\xb0\x0eN\xdf:\xa1"sJ2\x0eD\x9abU\x1c\xe0\x1a\xd3\xa3+,\x9d\xf7\x14\'\x9e\xf4K\x1a\xa7I\xcfZ\x12\x82\x13\x91\x93\xb5\x16*\x05Ga\xce\xad\'\'4\x1aHsJf;\x8a\x84\x11\xebLC\'0\x0eh\x86\xff\x00zWI=\xc54\x13\x00\x93\xce\x93\x1a-@\xe9\xdf\x9e\xf4\xb3\x95S\x15:v\x10M*L\x9aQ\x1b,\x11\xdc\x8a\xb4\x91\xd2q\x031\x9e\xb4\x12g"\xac+\xadP\x06\t\xf9T\xd5\xa7\x10\rP\xce\xfb\xd1\x1fC\xf0\xa43\xac\x95\x18JD\x02bM\x0b\xa9*T\xa9S\x02v\xa9R\x99\xa3,\xb7\xa9^c8\xa60\xd0q\xc1\x04\xa5:\xa0\r\xe2\xa5J\x96\x08%7\x8c\x109\xe0U~\x1fN\xb9\\\xe9=*T\xac\xc6\x8c\xe6d\x9aR\x89\xd7\x02\x06zT\xa9T\xbb%\x80\xa4\x11\xa8j\xdcd\xd0\x04i"\x0e\xf5*U\x1044J\x0f\x98\xfc\xa8\xc2 
\xe9\x10<\xb2q\xbdJ\x95\x9b.$i\x10\x82\xa9\xc91Y\x9eA\x92A\x18\x9eU*Q\x1e\xc7/\xcagmE*\x92\x12\xad\xb9ST\x0c\xef\xda\xa5J\xd5\xa3\x18\xb0t\x9c\xe7\xe9Q)\x9c\x93R\xa5 \x10\xa4\xfet\x08\x023Z\x12\x80N\x98\x11\x15*Q \x88*h\x11\xd0\x8c\xd3m\xd9\nN\xf9\x1f\xcdJ\x952z\x1a\xecqjB\x91\xab\x00N\xdd\xeb\n\x9b\x12fMJ\x94\xa0T\x91\x0bP\xbfz\xabGz\x95+Bh4"N\xf5yN\x02\x8dJ\x941\x9f\xff\xd9'
Let’s write this content to a file. Note the ‘wb’ argument, which tells Python to open the file for writing in binary mode:
f = open('my_file1.jpg', 'wb')  # 'wb' = write in binary mode
f.write(image_link.content)
f.close()
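An equivalent, slightly safer pattern uses a context manager, which closes the file automatically even if something goes wrong mid-write. This sketch substitutes stand-in bytes for `image_link.content` so it runs on its own:

```python
# Stand-in for image_link.content (the name used above); in real use
# you would have: image_bytes = requests.get(image_url).content
image_bytes = b'\xff\xd8\xff\xe0' + b'fake jpeg payload'

# 'wb' opens the file for binary writing; the with-block
# closes the file for us automatically
with open('my_file1.jpg', 'wb') as f:
    f.write(image_bytes)

# read it back in binary mode ('rb') to confirm the bytes round-trip
with open('my_file1.jpg', 'rb') as f:
    assert f.read() == image_bytes
```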
Now we can display this file right here in the notebook as markdown using:
<img src="my_file1.jpg">
Just write the above line in a new markdown cell and it will display the image we just downloaded!
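If you would rather display the image from a code cell instead of a markdown cell, IPython’s display utilities can render it directly. This is a small sketch; the placeholder bytes at the top just make the snippet self-contained, since in the notebook ‘my_file1.jpg’ is the file we already downloaded:

```python
from IPython.display import Image

# Stand-in file so the snippet runs on its own; in the notebook,
# 'my_file1.jpg' is the image we saved above
with open('my_file1.jpg', 'wb') as f:
    f.write(b'\xff\xd8\xff\xe0')  # minimal JPEG-like header bytes

# As the last line of a notebook cell, this renders the image inline
Image(filename='my_file1.jpg')
```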