{"id":6523,"date":"2023-10-18T14:47:43","date_gmt":"2023-10-18T14:47:43","guid":{"rendered":"https:\/\/royadata.io\/blog\/?p=6523"},"modified":"2023-10-18T14:47:43","modified_gmt":"2023-10-18T14:47:43","slug":"get-financial-data-from-yahoo-finance-with-python","status":"publish","type":"post","link":"http:\/\/royadata.io\/blog\/get-financial-data-from-yahoo-finance-with-python\/","title":{"rendered":"How to Get Financial Data from Yahoo Finance with Python (in 4 Simple Steps)"},"content":{"rendered":"<blockquote>\n<p>Are you interested in algorithm trading and you need to scrape financial data from Yahoo Finance? Stick on this page to discover how you can use Python and its associated library to extract financial data from Yahoo Finance \u2013 code inclusive.<\/p>\n<\/blockquote>\n<p><picture class=\"aligncenter size-full wp-image-12420 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python-300x167.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python-768x426.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20555'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20555'%3E%3C\/svg%3E\" alt=\"Financial Data from Yahoo Finance with Python\" width=\"1000\" height=\"555\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python.jpg 1000w, 
https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python-300x167.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python-768x426.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-12420\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python-300x167.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python-768x426.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python.jpg\" alt=\"Financial Data from Yahoo Finance with Python\" width=\"1000\" height=\"555\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python-300x167.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Financial-Data-from-Yahoo-Finance-with-Python-768x426.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/p>\n<p>In the financial and investment market, professionals do not just depend on their guts and emotions as that has proven not to yield desirable results in the long run. Instead, they depend on data to decide which asset to buy and which to sell. Data-driven investment has yielded more gains than gut-driven investment and as such, those that have taken the step have introduced the idea to others. 
The Yahoo Finance web service is one of the services that provide financial data. Investment companies and individuals can use this data to make decisions manually, or extract it in bulk and feed it to a trading bot that makes decisions on their behalf based on predefined trading models.<\/p>\n<p>Yahoo Finance has an API that you can use to fetch financial data. However, this API is fragile, limiting, and does not provide all of the data you need. This is why there are many unofficial APIs for Yahoo Finance. In this article, we will show you how to develop a web scraper that extracts financial data from Yahoo Finance using Python.<\/p>\n<hr\/>\n<h2 id=\"overview-of-web-scraper-to-be-developed\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Overview_of_Web_scraper_to_Be_Developed\"><\/span><strong>Overview of the Web Scraper to Be Developed<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<div class=\"su-youtube su-u-responsive-media-yes\">\n<iframe loading=\"lazy\" width=\"600\" height=\"400\" src=\"https:\/\/www.youtube.com\/embed\/-3lqUHeZs_0\" frameborder=\"0\" allowfullscreen allow=\"autoplay; encrypted-media; picture-in-picture\" title=\"YouTube video\"><\/iframe>\n<\/div>\n<p>A web scraper is a computer program developed to automate the process of collecting data from the Internet. This is usually done aggressively, sending many requests within a short period of time, which makes it possible to collect data across thousands of web pages within a few minutes.<\/p>\n<p>In this guide, we will develop a web scraper that collects the summary of financial data for a stock listed on Yahoo Finance. The class accepts the ticker of the stock as an argument and pulls the summary of its data for you.<\/p>\n<hr\/>\n<h2 id=\"requirement-for-coding-and-running-the-script\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Requirement_for_Coding_and_Running_the_Script\"><\/span><strong>Requirements for Coding and Running the Script<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>We will use the Python programming language and its associated third-party libraries to code the Yahoo Finance scraper.<\/p>\n<hr\/>\n<h3 id=\"python\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Python\"><\/span><strong>Python<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p><img decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Python-language.jpg\" alt=\"Python language\" width=\"1000\" height=\"559\" \/><\/p>\n<p>You need to have Python installed on your PC. Many operating systems come with Python pre-installed, but that version is Python 2, which is no longer in active development; you will need to install the newer Python 3. Installing Python 3 is easy and only requires a few steps. Mac users can <a href=\"http:\/\/docs.python-guide.org\/en\/latest\/starting\/install3\/osx\/\"  rel=\"noopener noreferrer nofollow\">visit the official download page<\/a> to download and install Python. 
If you are a Linux user, <a href=\"http:\/\/docs.python-guide.org\/en\/latest\/starting\/install3\/linux\/\"  rel=\"noopener noreferrer nofollow\">click here<\/a>. For Windows users, follow the steps highlighted below.<\/p>\n<ul>\n<li><a href=\"https:\/\/www.python.org\/downloads\/windows\/\"  rel=\"noopener noreferrer nofollow\">Visit the Windows download page<\/a> and download the latest version of Python.<\/li>\n<li>Launch the installer and follow the prompts.<\/li>\n<li>Check the \u201cAdd Python 3.x to PATH\u201d option.<\/li>\n<li>Click on \u201cCustomize Installation\u201d.<\/li>\n<li>Make sure Pip is checked; if it is not, check it.<\/li>\n<\/ul>\n<hr\/>\n<h3 id=\"pip\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"PIP\"><\/span><strong>PIP<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<div class=\"su-youtube su-u-responsive-media-yes\">\n<iframe loading=\"lazy\" width=\"600\" height=\"400\" src=\"https:\/\/www.youtube.com\/embed\/sidzL1XoHCw\" frameborder=\"0\" allowfullscreen allow=\"autoplay; encrypted-media; picture-in-picture\" title=\"YouTube video\"><\/iframe>\n<\/div>\n<p>Pip is a package management system used to install and manage software packages written in Python. 
Many packages can be found in the Python Package Index (PyPI). If you followed the steps above as a Windows user, you already have pip installed. Linux and Mac users can read the official documentation to learn how to install pip.<\/p>\n<p>With this package installed, all you need to install the other packages is a pip command of the form <strong>pip install &lt;package-name&gt;<\/strong>.<\/p>\n<hr\/>\n<h3 id=\"requests\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Requests\"><\/span><strong>Requests<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<div class=\"su-youtube su-u-responsive-media-yes\">\n<iframe loading=\"lazy\" width=\"600\" height=\"400\" src=\"https:\/\/www.youtube.com\/embed\/iv-Uc8d3tDs\" frameborder=\"0\" allowfullscreen allow=\"autoplay; encrypted-media; picture-in-picture\" title=\"YouTube video\"><\/iframe>\n<\/div>\n<p>The Requests library is a third-party Python library developed to make sending HTTP requests easier. It is built on top of the lower-level urllib package, which is difficult to use directly. Run the pip command below in Command Prompt if you are a Windows user, or in Terminal if you are a Mac user.<\/p>\n<pre><strong>pip install requests<\/strong><\/pre>\n<p>We will use Requests to download the full web page that contains the data. 
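<\/p>
<p>Before wiring Requests into the scraper class, it helps to see it in isolation. The sketch below is a minimal illustration \u2013 the browser-style User-Agent header is our own assumption, not part of the original script \u2013 and it only builds the request without sending it, so you can check the final URL and headers first:<\/p>

```python
import requests

# The same ticker URL pattern used by the scraper in this guide.
url = "https://finance.yahoo.com/quote/TSLA?p=TSLA"

# A browser-like User-Agent (an assumption, not from the original script)
# reduces the chance of being served a blocked or consent page.
session = requests.Session()
session.headers.update({"User-Agent": "Mozilla/5.0"})

# prepare_request() builds the final request without sending it,
# so you can inspect exactly what would go over the wire.
prepared = session.prepare_request(requests.Request("GET", url))
print(prepared.url)
print(prepared.headers["User-Agent"])
```

<p>Calling session.send(prepared) \u2013 or simply session.get(url) \u2013 would actually download the page.<\/p>
<p>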
Requests is easy to use and is best for accessing data from web pages that do not depend on JavaScript to render content. <a href=\"https:\/\/docs.python-requests.org\/en\/latest\/user\/quickstart\/\">Read the Quickstart guide for Requests here<\/a>.<\/p>\n<hr\/>\n<h3 id=\"beautifulsoup\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Beautifulsoup\"><\/span><strong>Beautifulsoup<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<div class=\"su-youtube su-u-responsive-media-yes\">\n<iframe loading=\"lazy\" width=\"600\" height=\"400\" src=\"https:\/\/www.youtube.com\/embed\/87Gx3U0BDlo\" frameborder=\"0\" allowfullscreen allow=\"autoplay; encrypted-media; picture-in-picture\" title=\"YouTube video\"><\/iframe>\n<\/div>\n<p>Requests only downloads the content of a page; with the help of Beautifulsoup, you can extract the required data from that content. Beautifulsoup is not a parser itself \u2013 it uses a parser under the hood and makes traversing the document and reaching the data of interest easy, since using the underlying parsing libraries directly can be difficult. 
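<\/p>
<p>As a quick, self-contained illustration \u2013 the HTML snippet here is made up, not Yahoo\u2019s real markup \u2013 this is how Beautifulsoup\u2019s find() locates an element by tag name and attribute value, the exact pattern the scraper uses later:<\/p>

```python
from bs4 import BeautifulSoup

# A made-up snippet mimicking the kind of elements the scraper targets.
html = '<div><span data-reactid="31">242.68</span><span data-reactid="43">240.50</span></div>'
soup = BeautifulSoup(html, "html.parser")

# find() returns the first element matching the tag name and attributes;
# .text extracts the element's visible string content.
price = soup.find("span", {"data-reactid": "31"}).text
previous_close = soup.find("span", {"data-reactid": "43"}).text
print(price, previous_close)
```

<p>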
Below is the pip command to run in Command Prompt or Terminal to install Beautifulsoup.<\/p>\n<pre><strong>pip install beautifulsoup4<\/strong><\/pre>\n<p><a href=\"https:\/\/www.crummy.com\/software\/BeautifulSoup\/bs4\/doc\/\"  rel=\"noopener noreferrer nofollow\">Click here to read Beautifulsoup\u2019s official guide<\/a>.<\/p>\n<hr\/>\n<p>Related,<\/p>\n<ul>\n<li><a href=\"https:\/\/royadata.io\/blog\/scrapy-vs-selenium-vs-beautifulsoup-for-web-scraping\/\">Scrapy Vs. Beautifulsoup Vs. Selenium for Web Scraping<\/a><\/li>\n<li><a href=\"https:\/\/royadata.io\/blog\/web-scraping-with-python\/\">Python Web Scraping Libraries and Framework<\/a><\/li>\n<\/ul>\n<hr\/>\n<h2 id=\"step-1-create-a-web-scraper-class\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Step_1_Create_a_Web_Scraper_Class\"><\/span><strong>Step 1: Create a Web Scraper Class<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>We want to use the object-oriented approach to programming, so we will house all of the logic and methods in a class named YahooFiScraper. This class has two custom methods \u2013 the _scrape_data() method and the get_data() method. _scrape_data() is named with the leading-underscore convention to mark it as private. 
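<\/p>
<p>Note that the leading underscore is only a naming convention \u2013 Python does not actually enforce privacy. A minimal sketch of the idea (the class and method names here are illustrative, not from the scraper):<\/p>

```python
# A single leading underscore signals "internal use only"; nothing stops
# outside code from calling it, but tools and readers treat it as private.
class Example:
    def _helper(self):
        # Intended to be called only from inside the class.
        return "internal"

    def public(self):
        # The public method delegates to the private helper.
        return self._helper().upper()

print(Example().public())
```

<p>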
So, the only method exposed to other classes and methods\/functions is the get_data() method.<\/p>\n<pre>import requests\nfrom bs4 import BeautifulSoup\n\n\nclass YahooFiScraper:\n\n    def __init__(self, ticker):\n        self.url = \"http:\/\/finance.yahoo.com\/quote\/{0}?p={1}\".format(ticker, ticker)\n        self.name = \"\"\n        self.current_price = \"\"\n        self.market_cap = \"\"\n        self.previous_close = \"\"\n        self.open = \"\"\n        self.bid = \"\"\n        self.fifty2_weeks_range = \"\"\n        self.volume = \"\"\n        self.average_volume = \"\"\n        self.beta = \"\"\n        self.pe_ratio = \"\"\n        self.eps = \"\"\n        self.earning_date = \"\"<\/pre>\n<p>From the above, you can see the constructor. The self.url variable holds the URL of the page; the {0} and {1} placeholders are filled with the ticker expected as a parameter when initializing the class. 
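<\/p>
<p>To make the URL template concrete, here is how that format() call expands for the TSLA ticker used later in this guide:<\/p>

```python
# The {0} and {1} placeholders are both filled with the same ticker string.
url_template = "http://finance.yahoo.com/quote/{0}?p={1}"
print(url_template.format("TSLA", "TSLA"))
```

<p>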
The other variables will hold the information once it is scraped \u2013 their names are quite descriptive.<\/p>\n<hr\/>\n<h2 id=\"step-2-inspect-page-source-for-element-of-interest\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Step_2_Inspect_Page_Source_for_Element_of_Interest\"><\/span><strong>Step 2: Inspect Page Source for Elements of Interest<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p><img decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Inspect-Page-Source-for-Element-of-Interest.jpg\" alt=\"Inspect Page Source for Element of Interest\" width=\"1000\" height=\"479\" \/><\/p>\n<p>Let\u2019s go to the page of interest, <a href=\"https:\/\/finance.yahoo.com\/quote\/TSLA?p=TSLA\"  rel=\"noopener noreferrer nofollow\">https:\/\/finance.yahoo.com\/quote\/TSLA?p=TSLA<\/a> \u2013 this is the page for Tesla \u2013 and look at the elements holding the data we are interested in. We will examine the HTML of the page, looking for the elements wrapping the data and for unique attributes that can be used to reach them. Open the page in your browser and view the page source. 
The exact process for doing that depends on the browser you are using.<\/p>\n<p>Fortunately for us, all of the data of interest is wrapped in elements carrying a data-reactid attribute, each with a unique number as its value.<\/p>\n<hr\/>\n<h2 id=\"step-3-code-_scrape_data-method\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Step_3_Code_Scrape_data_Method\"><\/span><strong>Step 3: Code the _scrape_data() Method<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>With the page source inspected and the location of each piece of data known, it is time to code the method that scrapes the data. Below is the method for scraping the financial data on the page.<\/p>\n<pre>def _scrape_data(self):\n    x = requests.get(self.url).text\n    soup = BeautifulSoup(x, \"html.parser\")\n    self.name = soup.find(\"h1\", {\"data-reactid\": \"7\"}).text\n    self.current_price = soup.find(\"span\", {\"data-reactid\": \"31\"}).text\n    self.market_cap = soup.find(\"span\", {\"data-reactid\": \"84\"}).text\n    self.previous_close = soup.find(\"span\", {\"data-reactid\": \"43\"}).text\n    self.open = soup.find(\"span\", {\"data-reactid\": \"48\"}).text\n    self.bid = soup.find(\"span\", {\"data-reactid\": \"53\"}).text\n    self.fifty2_weeks_range = soup.find(\"span\", {\"data-reactid\": \"66\"}).text\n    self.volume = soup.find(\"span\", {\"data-reactid\": \"71\"}).text\n    self.average_volume = soup.find(\"span\", {\"data-reactid\": \"76\"}).text\n    self.beta = soup.find(\"span\", {\"data-reactid\": \"89\"}).text\n    self.pe_ratio = soup.find(\"span\", {\"data-reactid\": \"94\"}).text\n    self.eps = soup.find(\"span\", {\"data-reactid\": \"99\"}).text\n    self.earning_date = soup.find(\"td\", {\"data-reactid\": \"103\"}).text<\/pre>\n<hr\/>\n<h2 id=\"step-4-code-get_data-method\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Step_4_Code_get_data_Method\"><\/span><strong>Step 4: Code the get_data() Method<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>With the above, the scraping logic is complete. However, we need an easy way to return the data for the stock being scraped. This method does that: all it does is return the data in a dictionary so you can use dictionary methods to get the relevant values. Below is the code for this method.<\/p>\n<pre>def get_data(self):\n    return {\n        \"name\": self.name,\n        \"price\": self.current_price,\n        \"marketcap\": self.market_cap,\n        \"previous_close\": self.previous_close,\n        \"open\": self.open,\n        \"bid\": self.bid,\n        \"52_weeks_range\": self.fifty2_weeks_range,\n        \"beta\": self.beta,\n        \"pe_ratio\": self.pe_ratio,\n        \"eps\": self.eps,\n        \"earning_date\": self.earning_date\n    }<\/pre>\n<hr\/>\n<h2 id=\"full-web-scraper-code-for-scraping-yahoo-finance\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Full_Web_Scraper_Code_for_Scraping_Yahoo_Finance\"><\/span><strong>Full Web Scraper Code 
for Scraping Yahoo Finance<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<pre>import requests\nfrom bs4 import BeautifulSoup\n\n\nclass YahooFiScraper:\n\n    def __init__(self, ticker):\n        self.url = \"http:\/\/finance.yahoo.com\/quote\/{0}?p={1}\".format(ticker, ticker)\n        self.name = \"\"\n        self.current_price = \"\"\n        self.market_cap = \"\"\n        self.previous_close = \"\"\n        self.open = \"\"\n        self.bid = \"\"\n        self.fifty2_weeks_range = \"\"\n        self.volume = \"\"\n        self.average_volume = \"\"\n        self.beta = \"\"\n        self.pe_ratio = \"\"\n        self.eps = \"\"\n        self.earning_date = \"\"\n        self._scrape_data()\n\n    def _scrape_data(self):\n        x = requests.get(self.url).text\n        soup = BeautifulSoup(x, \"html.parser\")\n        self.name = soup.find(\"h1\", {\"data-reactid\": \"7\"}).text\n        self.current_price = soup.find(\"span\", {\"data-reactid\": \"31\"}).text\n        self.market_cap = soup.find(\"span\", {\"data-reactid\": \"84\"}).text\n        self.previous_close = soup.find(\"span\", {\"data-reactid\": \"43\"}).text\n        self.open = soup.find(\"span\", {\"data-reactid\": \"48\"}).text\n        self.bid = soup.find(\"span\", {\"data-reactid\": \"53\"}).text\n        self.fifty2_weeks_range = soup.find(\"span\", {\"data-reactid\": \"66\"}).text\n        self.volume = soup.find(\"span\", {\"data-reactid\": \"71\"}).text\n        self.average_volume = soup.find(\"span\", {\"data-reactid\": \"76\"}).text\n        self.beta = soup.find(\"span\", {\"data-reactid\": \"89\"}).text\n        self.pe_ratio = soup.find(\"span\", {\"data-reactid\": \"94\"}).text\n        self.eps = soup.find(\"span\", {\"data-reactid\": \"99\"}).text\n        self.earning_date = soup.find(\"td\", {\"data-reactid\": \"103\"}).text\n\n    def get_data(self):\n        return {\n            \"name\": self.name,\n            \"price\": self.current_price,\n            \"marketcap\": self.market_cap,\n            \"previous_close\": self.previous_close,\n            \"open\": self.open,\n            \"bid\": self.bid,\n            \"52_weeks_range\": self.fifty2_weeks_range,\n            \"beta\": self.beta,\n            \"pe_ratio\": self.pe_ratio,\n            \"eps\": self.eps,\n            \"earning_date\": self.earning_date\n        }\n\n\nc = YahooFiScraper(\"TSLA\")\nprint(c.get_data())<\/pre>\n<hr\/>\n<h2 id=\"conclusion\" class=\"ftwp-heading\" style=\"text-align: center;\"><strong>Conclusion<\/strong><\/h2>\n<p>To conclude this guide, I need to mention that the code above is only a proof of concept. You cannot use it to scrape data for many stocks before getting blocked. If you take a look at the code, you will also see that exceptions are not handled, so it cannot serve as a production-level script. As someone interested in coding a Yahoo Finance scraper, you should build on this and make it robust and block-proof.<\/p>\n<hr\/>\n<ul>\n<li><a href=\"https:\/\/royadata.io\/blog\/how-to-scrape-data-from-website-to-excel\/\">How to Scrape Data from Website to Excel<\/a><\/li>\n<li><a href=\"https:\/\/royadata.io\/blog\/css-selector-cheat-sheet\/\">CSS Selector Cheat Sheet for Web Scraping in Python<\/a><\/li>\n<li><a href=\"https:\/\/royadata.io\/blog\/curl\/\">Curl 101: What It Is &#038; How to Use Curl for Web Scraping<\/a><\/li>\n<li><a href=\"https:\/\/royadata.io\/blog\/selenium-web-scraping-python\/\">Web Scraping Using Selenium and Python: The Step-By-Step Guide for Beginners<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Are you interested in algorithmic trading and need to scrape financial data from Yahoo Finance? Stay on this page to discover how you can use Python and its associated libraries to extract financial data from Yahoo Finance \u2013 code included. 
In the financial and investment market, professionals do not just depend on their guts &#8230; <a title=\"How to Get Financial Data from Yahoo Finance with Python (in 4 Simple Steps)\" class=\"read-more\" href=\"http:\/\/royadata.io\/blog\/get-financial-data-from-yahoo-finance-with-python\/\" aria-label=\"More on How to Get Financial Data from Yahoo Finance with Python (in 4 Simple Steps)\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":699,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/posts\/6523"}],"collection":[{"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/comments?post=6523"}],"version-history":[{"count":0,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/posts\/6523\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/media\/699"}],"wp:attachment":[{"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/media?parent=6523"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/categories?post=6523"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/tags?post=6523"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}