{"id":6163,"date":"2023-10-18T14:47:43","date_gmt":"2023-10-18T14:47:43","guid":{"rendered":"https:\/\/royadata.io\/blog\/?p=6163"},"modified":"2023-10-18T14:47:43","modified_gmt":"2023-10-18T14:47:43","slug":"octoparse-tutorials","status":"publish","type":"post","link":"http:\/\/royadata.io\/blog\/octoparse-tutorials\/","title":{"rendered":"Octoparse Tutorials 2022: How to Use the Octoparse [Step By Step]"},"content":{"rendered":"<blockquote>\n<p>Are you looking forward to using the Octoparse scraping tool for collecting public data on the Internet? Then you are on the right page as we would be providing you a step-by-step guide on how to scrape data using Octoparse.<\/p>\n<\/blockquote>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17823 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials-300x167.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials-768x426.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20555'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20555'%3E%3C\/svg%3E\" alt=\"Octoparse Tutorials\" width=\"1000\" height=\"555\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials-300x167.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials-768x426.jpg 768w\" 
data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17823\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials-300x167.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials-768x426.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials.jpg\" alt=\"Octoparse Tutorials\" width=\"1000\" height=\"555\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials-300x167.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Tutorials-768x426.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p>In the past, <a href=\"https:\/\/royadata.io\/blog\/web-scraping\/\">web scraping<\/a> required you to write code yourself or find someone who could build a web scraper to collect the data you were interested in. This is no longer the case, as a good number of web scrapers have been developed for people with no coding knowledge.<\/p>\n<p>These web scrapers range from visual web scrapers such as Octoparse, ParseHub, and WebHarvy to tools that provide structured data without you carrying out any data identification task, such as <a href=\"https:\/\/royadata.io\/blog\/data-collector\/\">Bright Data\u2019s Data Collector<\/a>.<\/p>\n<p>Our focus is on the Octoparse web scraper, which is a visual web scraping tool. 
In theory, Octoparse is an easy-to-use web scraper: all you have to do is use the point-and-click interface to select some of the data of interest, and the tool will automatically identify similar elements on the page.<\/p>\n<p>In practice, if you have never learned how to use it, you might find it difficult, especially if you are not a technical person. In this article, we provide a tutorial on how to use Octoparse to collect data from the Internet. Before doing so, let\u2019s take a look at an overview of Octoparse.<\/p>\n<hr\/>\n<h2 id=\"overview-of-octoparse\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Overview_of_Octoparse\"><\/span><strong>Overview of Octoparse<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The Octoparse tool claims to make web scraping easy for everyone regardless of coding skills. With this web scraper, you will not need to write a single line of code to extract data from web pages on the Internet. All you need is the ability to use a mouse, which is what the point-and-click interface requires.<\/p>\n<p>With Octoparse, you can scrape data from all kinds of sites, including JavaScript-heavy pages. 
Being <a href=\"https:\/\/royadata.io\/blog\/web-scraping-tools\/\">a web scraper<\/a> meant for non-technical people, most of the complexities have been hidden.<\/p>\n<p>However, you will need to provide <a href=\"https:\/\/royadata.io\/blog\/best-proxy-server\/\">proxy servers<\/a> to avoid getting blocked.\u00a0 You can download scraped data in a good number of file formats such as CSV, Excel, JSON, and databases.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer nofollow\"><picture class=\"aligncenter size-full wp-image-17811 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview-300x124.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview-768x316.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20412'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20412'%3E%3C\/svg%3E\" alt=\"Octoparse Overview\" width=\"1000\" height=\"412\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview-300x124.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview-768x316.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17811\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview.jpg.webp 
1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview-300x124.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview-768x316.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview.jpg\" alt=\"Octoparse Overview\" width=\"1000\" height=\"412\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview-300x124.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Overview-768x316.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p>Octoparse is available as a desktop application for Windows and Mac. The service also offers a cloud scraping solution that you can use without installing any software. 
The cloud option even lets you schedule scraping tasks and have the results delivered to you at set times.<\/p>\n<hr\/>\n<h3 id=\"latest-version-of-octoparse\" class=\"ftwp-heading\"><span class=\"ez-toc-section\" id=\"Latest_Version_of_Octoparse\"><\/span>Latest Version of Octoparse<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>In February 2022, version 8.5 of the Octoparse software was released, and Octoparse is now available on Mac as well.<\/p>\n<blockquote>\n<p dir=\"auto\" data-pm-slice=\"1 3 []\"><strong>What&#8217;s new in Octoparse 8.5?<\/strong><\/p>\n<ul>\n<li>\n<p dir=\"auto\">Live logs for troubleshooting local runs<\/p>\n<\/li>\n<li>\n<p dir=\"auto\">Boost Mode for up to 3X faster local runs<\/p>\n<\/li>\n<li>\n<p dir=\"auto\">Auto-backup local data to the Cloud<\/p>\n<\/li>\n<li>\n<p dir=\"auto\">Manage your tasks with batch actions<\/p>\n<\/li>\n<\/ul>\n<p>You can learn more <a href=\"https:\/\/www.octoparse.com\/blog\/octoparse-85-empowers-the-local-web-scraping?AgentCode=303\"  rel=\"noopener noreferrer\">here<\/a>, or watch this video for the details:<\/p>\n<\/blockquote>\n<div class=\"su-youtube su-u-responsive-media-yes\">\n<div class=\"perfmatters-lazy-youtube\" data-src=\"https:\/\/www.youtube.com\/embed\/nVycXF3np1o\" data-id=\"nVycXF3np1o\" data-query onclick=\"if (!window.__cfRLUnblockHandlers) return false; perfmattersLazyLoadYouTube(this);\" data-cf-modified-70f8771c4de7d6d240c93da3->\n<div><img loading=\"lazy\" decoding=\"async\" class=\"perfmatters-lazy\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%20480%20360%3E%3C\/svg%3E\" data-src=\"https:\/\/i.ytimg.com\/vi\/nVycXF3np1o\/hqdefault.jpg\" alt=\"YouTube video\" width=\"480\" height=\"360\" data-pin-nopin=\"true\"><\/p>\n<div class=\"play\"><\/div>\n<\/div>\n<\/div>\n<p><noscript><iframe loading=\"lazy\" width=\"600\" height=\"400\" src=\"https:\/\/www.youtube.com\/embed\/nVycXF3np1o?\" frameborder=\"0\" 
allowfullscreen allow=\"autoplay; encrypted-media; picture-in-picture\" title=\"\"><\/iframe><\/noscript><\/div>\n<hr\/>\n<p>Octoparse is one of the best web scrapers out there if you do not have coding skills. Octoparse also offers a professional data service for those who do not want to be involved in the web scraping workflow. All they need to do is describe the data they are interested in, and the Octoparse professionals will scrape and deliver it for them.<\/p>\n<hr\/>\n<h2 id=\"octoparse-interface\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Octoparse_Interface\"><\/span><strong>Octoparse Interface<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>The Octoparse web scraper is a GUI application. That is, it provides a user interface through which you access its features. This interface is quite simple and easy to get used to, even without technical knowledge.<\/p>\n<p>In this section of the article, we describe the user interface and let you know what to expect if you intend to use this web scraper. This section includes screenshots of the UI.<\/p>\n<p>Once you log into an account, you will see the main interface, which is divided into two sections. The left-hand side is the sidebar, which contains the menu. You can also call it the navigation area. The area labeled the home screen is the main page.<\/p>\n<ul>\n<li>\n<h3 id=\"sidebar-menu\" class=\"ftwp-heading\"><span class=\"ez-toc-section\" id=\"Sidebar_Menu\"><\/span><strong>Sidebar Menu<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<\/li>\n<\/ul>\n<p>The sidebar menu is the section where all of the navigation options are available. From the main page, you can see that it holds the main navigation: dashboard, quick filters, recent tasks, team collaboration, data service, and contact us. 
A click on each of the options would lead to a change in the main interface.<\/p>\n<p><picture class=\"aligncenter size-full wp-image-17836 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5.png.webp 1412w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-300x165.png.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-1024x564.png.webp 1024w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-768x423.png.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201412%20778'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1412px) 100vw, 1412px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201412%20778'%3E%3C\/svg%3E\" alt=\"Octoparse 8.5\" width=\"1412\" height=\"778\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5.png\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5.png 1412w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-300x165.png 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-1024x564.png 1024w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-768x423.png 768w\" data-sizes=\"(max-width: 1412px) 100vw, 1412px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17836\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5.png.webp 1412w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-300x165.png.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-1024x564.png.webp 1024w, 
https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-768x423.png.webp 768w\" sizes=\"(max-width: 1412px) 100vw, 1412px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5.png\" alt=\"Octoparse 8.5\" width=\"1412\" height=\"778\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5.png 1412w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-300x165.png 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-1024x564.png 1024w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-8.5-768x423.png 768w\" sizes=\"(max-width: 1412px) 100vw, 1412px\"\/>\n<\/picture>\n<\/noscript><\/p>\n<p>The sidebar stays the same in most cases. Only in a few cases, such as the workspace page shown below, does the content of the sidebar change as well.<\/p>\n<ul>\n<li>\n<h3 id=\"home-screen\" class=\"ftwp-heading\"><span class=\"ez-toc-section\" id=\"Home_Screen\"><\/span><strong>Home Screen <\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<\/li>\n<\/ul>\n<p>The home screen, as shown in the first screenshot, is the main area of the application. Octoparse calls this section the workspace because the main tasks, including pointing and clicking, are done here. The sidebar in most cases is just for navigation.<\/p>\n<p>When you start creating a new scraping task, this section changes, and the navigation section changes to reflect that. With a new task started, the sidebar becomes the workflow area. 
Aside from the workflow section, three other interface areas exist: data preview, browser, and smart tips.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17810 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen-300x167.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen-768x427.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20556'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20556'%3E%3C\/svg%3E\" alt=\"Octoparse Homescreen\" width=\"1000\" height=\"556\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen-300x167.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen-768x427.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17810\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen-300x167.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen-768x427.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen.jpg\" alt=\"Octoparse Homescreen\" width=\"1000\" height=\"556\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen-300x167.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Homescreen-768x427.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Workflow: <\/strong>The workflow section shows a flow chart of all the actions you take in the built-in browser to reach the data to be scraped. The tool follows this workflow to collect the data you are interested in.<\/p>\n<p><strong>In-browser:<\/strong> Remember that at the beginning of the article we stated that Octoparse is a visual web scraper? It comes with its own browser, which you use to open the page that holds the data you are interested in. It is in this browser that you point and click on the required data; the browser is the point-and-click interface mentioned earlier.<\/p>\n<p><strong>Data Preview:<\/strong> This section shows the data you are scraping. Without it, you might not notice that you are scraping the wrong data. You can hide this section if you wish.<\/p>\n<p><strong>Smart Tips:<\/strong> This UI element is located at the top right of the page. The Smart Tips section comes with a toggle button for hiding it if you do not want it. 
This element is optional, and you won\u2019t need it once you know how to use the tool.<\/p>\n<hr\/>\n<h2 id=\"step-by-step-guide-on-how-to-use-the-octoparse\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"Step_By_Step_Guide_on_How_to_Use_the_Octoparse\"><\/span><strong>Step By Step Guide on\u00a0 How to Use the Octoparse<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<div class=\"su-youtube su-u-responsive-media-yes\">\n<div class=\"perfmatters-lazy-youtube\" data-src=\"https:\/\/www.youtube.com\/embed\/Qmo1zObV5b0\" data-id=\"Qmo1zObV5b0\" data-query onclick=\"if (!window.__cfRLUnblockHandlers) return false; perfmattersLazyLoadYouTube(this);\" data-cf-modified-70f8771c4de7d6d240c93da3->\n<div><img loading=\"lazy\" decoding=\"async\" class=\"perfmatters-lazy\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%20480%20360%3E%3C\/svg%3E\" data-src=\"https:\/\/i.ytimg.com\/vi\/Qmo1zObV5b0\/hqdefault.jpg\" alt=\"YouTube video\" width=\"480\" height=\"360\" data-pin-nopin=\"true\"><\/p>\n<div class=\"play\"><\/div>\n<\/div>\n<\/div>\n<p><noscript><iframe loading=\"lazy\" width=\"600\" height=\"400\" src=\"https:\/\/www.youtube.com\/embed\/Qmo1zObV5b0?\" frameborder=\"0\" allowfullscreen allow=\"autoplay; encrypted-media; picture-in-picture\" title=\"\"><\/iframe><\/noscript><\/div>\n<p>The Octoparse web scraper can be used to collect different kinds of data, ranging from job listings to price information and even emails on forums, among other things.<\/p>\n<p>While the process of using the software to scrape each kind of data might vary a bit, the fundamentals remain the same. 
All you need to do is use the point-and-click interface provided to identify important data points.<\/p>\n<p>Interestingly, you might not even need to do that manually for some data points, as prebuilt templates are available that you can simply select and use in order to save time and effort. Follow the steps below to learn how to use the Octoparse tool.<\/p>\n<p><strong>Step 1:<\/strong> Download the Octoparse desktop client. Make sure you download it from the <a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer nofollow\">official website of Octoparse<\/a> to avoid downloading a malicious copy from third parties. The client is available for Windows and Mac. If you are using an operating system other than these two, you will need to run it on a virtual machine.<\/p>\n<p><strong>Step 2:<\/strong> The download comes packed in a zip file, so you will need to unzip it to find the installer. Before installing it, you should temporarily turn off any anti-virus software you have running.<\/p>\n<p>This is because Octoparse has some features that make it look like a virus or malware to anti-virus software, so you need to pause the anti-virus to get it installed. With the anti-virus out of the way, you can then install the software.<\/p>\n<p><strong>Step 3:<\/strong> Launch the Octoparse application and provide your authentication details. If you enter the right information, you will be greeted with the home screen and sidebar. 
There are basically two methods of scraping data using the Octoparse web scraper \u2014 template mode and advanced mode.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17816 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode-300x56.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode-768x144.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20188'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20188'%3E%3C\/svg%3E\" alt=\"Octoparse Template and Advance Mode\" width=\"1000\" height=\"188\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode-300x56.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode-768x144.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17816\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode-300x56.jpg.webp 300w, 
https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode-768x144.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode.jpg\" alt=\"Octoparse Template and Advance Mode\" width=\"1000\" height=\"188\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode-300x56.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Template-and-Advance-Mode-768x144.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Template Mode: <\/strong>Octoparse has prebuilt templates for popular websites on the Internet, which you can use to collect data from them right away.<\/p>\n<p><strong>Advanced Mode: <\/strong>If you have a special need not covered by the template mode, you can customize the scraper to collect the specific data you need. This is the mode you use for websites not supported by the template mode.<\/p>\n<p><strong>Step 4:<\/strong> Let\u2019s start a new task using the advanced mode. For the advanced mode, all you need to do is provide the URL of the site you want to collect data from. 
As you enter the URL in the URL input field, Octoparse will automatically detect the website.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17814 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine-300x145.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine-768x371.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20483'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20483'%3E%3C\/svg%3E\" alt=\"Octoparse Search Engine\" width=\"1000\" height=\"483\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine-300x145.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine-768x371.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17814\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine-300x145.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine-768x371.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" 
decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine.jpg\" alt=\"Octoparse Search Engine\" width=\"1000\" height=\"483\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine-300x145.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Search-Engine-768x371.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Step 5:<\/strong> Once the website loads, you are taken to an in-browser software. As you navigate the website, the actions you take are added to the workflow section of the page \u2014 the in-browser tool occupies the main area of the page.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17817 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow-300x98.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow-768x250.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20325'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20325'%3E%3C\/svg%3E\" alt=\"Octoparse Workflow\" width=\"1000\" height=\"325\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow.jpg 1000w, 
https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow-300x98.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow-768x250.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17817\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow-300x98.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow-768x250.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow.jpg\" alt=\"Octoparse Workflow\" width=\"1000\" height=\"325\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow-300x98.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Workflow-768x250.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Step 6:<\/strong> The tool automatically identifies important data points on a page. If the data you are interested in is not highlighted, then you will need to do that yourself using the point and click tool. 
As you click on the data, similar elements are highlighted too.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17808 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview-300x59.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview-768x152.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20198'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%20198'%3E%3C\/svg%3E\" alt=\"Octoparse Data Preview\" width=\"1000\" height=\"198\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview-300x59.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview-768x152.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17808\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview-300x59.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview-768x152.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview.jpg\" alt=\"Octoparse Data Preview\" width=\"1000\" height=\"198\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview-300x59.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Data-Preview-768x152.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Step 7:<\/strong> Once you are done, you can check the preview section which is directly below the in-browser tool. From the preview section, you can remove any column or rows you are not interested in. If you are satisfied with the preview, then check the workflow to make sure it is correct to avoid getting undesirable results.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17812 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-300x11.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-768x28.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%2037'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%2037'%3E%3C\/svg%3E\" alt=\"Octoparse Save and Run option\" width=\"1000\" height=\"37\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg\" 
data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-300x11.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-768x28.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17812\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-300x11.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-768x28.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg\" alt=\"Octoparse Save and Run option\" width=\"1000\" height=\"37\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-300x11.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-768x28.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Step 8:<\/strong> Click on \u201cSave\u201d and then \u201cRun\u201d. You have the option of running the task on your machine or in the cloud. For now, we will use our own computer, so go with the first option. 
You then choose the format in which you want the data downloaded.<\/p>\n<hr\/>\n<h2 id=\"how-to-setup-proxies-for-octoparse\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"How_to_Setup_Proxies_for_Octoparse\"><\/span><strong>How to Setup Proxies for Octoparse<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<div class=\"su-youtube su-u-responsive-media-yes\">\n<div class=\"perfmatters-lazy-youtube\" data-src=\"https:\/\/www.youtube.com\/embed\/dt6jhAM2NLc\" data-id=\"dt6jhAM2NLc\" data-query onclick=\"if (!window.__cfRLUnblockHandlers) return false; perfmattersLazyLoadYouTube(this);\" data-cf-modified-70f8771c4de7d6d240c93da3->\n<div><img loading=\"lazy\" decoding=\"async\" class=\"perfmatters-lazy\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%20480%20360%3E%3C\/svg%3E\" data-src=\"https:\/\/i.ytimg.com\/vi\/dt6jhAM2NLc\/hqdefault.jpg\" alt=\"YouTube video\" width=\"480\" height=\"360\" data-pin-nopin=\"true\"><\/p>\n<div class=\"play\"><\/div>\n<\/div>\n<\/div>\n<p><noscript><iframe loading=\"lazy\" width=\"600\" height=\"400\" src=\"https:\/\/www.youtube.com\/embed\/dt6jhAM2NLc?\" frameborder=\"0\" allowfullscreen allow=\"autoplay; encrypted-media; picture-in-picture\" title=\"\"><\/iframe><\/noscript><\/div>\n<p>In the walkthrough above, we didn\u2019t set up proxies. That can work when the tool only has to send a few requests, but a website is likely to block your IP address once it receives too many requests from it. For this reason, you need proxies to anonymize your requests whenever you send many of them.<\/p>\n<p>Follow the steps below to configure proxies for Octoparse.<\/p>\n<p><strong>Step 1:<\/strong> The first step is to get your hands on high-quality proxies. 
You can purchase good proxies from <a href=\"###brightdata\/\"  rel=\"noopener noreferrer nofollow\">Bright Data<\/a>, <a href=\"###smartproxy\/\"  rel=\"noopener noreferrer nofollow\">Smartproxy<\/a>, or <a href=\"###soax\/\"  rel=\"noopener noreferrer nofollow\">Soax<\/a>. There are many other providers out there. Regardless of the provider you choose, what you require are a proxy address (host) and port.<\/p>\n<p>You will also require a username and password for authentication except if the service supports IP authentication. With the <a href=\"https:\/\/royadata.io\/blog\/how-to-find-my-proxy-server-address\/\">proxy address<\/a>, <a href=\"https:\/\/royadata.io\/blog\/proxy-port\/\">port<\/a>, username, and password, you can move to the next step.<\/p>\n<p><strong>Step 2:<\/strong> Create your task as described above. When you are done creating the task, do not click the \u201cSave\u201d or \u201cRun\u201d button yet.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17812 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-300x11.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-768x28.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%2037'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%201000%2037'%3E%3C\/svg%3E\" alt=\"Octoparse Save and Run option\" width=\"1000\" height=\"37\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg\" 
data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-300x11.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-768x28.jpg 768w\" data-sizes=\"(max-width: 1000px) 100vw, 1000px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17812\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg.webp 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-300x11.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-768x28.jpg.webp 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg\" alt=\"Octoparse Save and Run option\" width=\"1000\" height=\"37\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option.jpg 1000w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-300x11.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Save-and-Run-option-768x28.jpg 768w\" sizes=\"(max-width: 1000px) 100vw, 1000px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Step 3:<\/strong> Instead, click on the \u201cSettings\u201d icon, and the settings interface will pop up.<\/p>\n<p><strong>Step 4:<\/strong> Scroll down to the Anti-blocking settings and check the \u201cUse Proxies\u201d option.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17809 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" 
data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option.jpg.webp 821w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option-300x216.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option-768x553.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%20821%20591'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 821px) 100vw, 821px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%20821%20591'%3E%3C\/svg%3E\" alt=\"Octoparse Generated IP address option\" width=\"821\" height=\"591\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option.jpg 821w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option-300x216.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option-768x553.jpg 768w\" data-sizes=\"(max-width: 821px) 100vw, 821px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17809\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option.jpg.webp 821w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option-300x216.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option-768x553.jpg.webp 768w\" sizes=\"(max-width: 821px) 100vw, 821px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option.jpg\" alt=\"Octoparse Generated IP 
address option\" width=\"821\" height=\"591\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option.jpg 821w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option-300x216.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Generated-IP-address-option-768x553.jpg 768w\" sizes=\"(max-width: 821px) 100vw, 821px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Step 5:<\/strong> This will open another interface for entering the list of proxies. You can set the interval of IP rotation from this interface. If you bought your proxies from <a href=\"###brightdata\/\"  rel=\"noopener noreferrer nofollow\">Bright Data<\/a>, <a href=\"###smartproxy\/\"  rel=\"noopener noreferrer nofollow\">Smartproxy<\/a>, or <a href=\"###soax\/\"  rel=\"noopener noreferrer nofollow\">Soax<\/a>, you can generate a list of proxies and paste them into the \u201cIP Proxies\u201d field.<\/p>\n<p>However, because these are rotating proxies, you can use just the proxy address and port pair generated earlier, and the proxy service will take care of the <a href=\"https:\/\/royadata.io\/blog\/ip-rotation\/\">IP rotation<\/a> on your behalf.<\/p>\n<p><strong>Step 6:<\/strong> Click on the \u201cOK\u201d button, then go back to the top of the page and click on \u201cSave\u201d and then \u201cRun\u201d. 
Your scraping task will now run through different IP addresses, which makes it possible to send many requests with less risk of getting detected and blocked.<\/p>\n<hr\/>\n<h2 id=\"how-to-schedule-web-scraping-task-using-octoparse\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"How_to_Schedule_Web_Scraping_Task_Using_Octoparse\"><\/span><strong>How to Schedule Web Scraping Task Using Octoparse<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<div class=\"su-youtube su-u-responsive-media-yes\">\n<div 
class=\"perfmatters-lazy-youtube\" data-src=\"https:\/\/www.youtube.com\/embed\/mhEupoQtbv8\" data-id=\"mhEupoQtbv8\" data-query onclick=\"if (!window.__cfRLUnblockHandlers) return false; perfmattersLazyLoadYouTube(this);\" data-cf-modified-70f8771c4de7d6d240c93da3->\n<div><img loading=\"lazy\" decoding=\"async\" class=\"perfmatters-lazy\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%20480%20360%3E%3C\/svg%3E\" data-src=\"https:\/\/i.ytimg.com\/vi\/mhEupoQtbv8\/hqdefault.jpg\" alt=\"YouTube video\" width=\"480\" height=\"360\" data-pin-nopin=\"true\"><\/p>\n<div class=\"play\"><\/div>\n<\/div>\n<\/div>\n<p><noscript><iframe loading=\"lazy\" width=\"600\" height=\"400\" src=\"https:\/\/www.youtube.com\/embed\/mhEupoQtbv8?\" frameborder=\"0\" allowfullscreen allow=\"autoplay; encrypted-media; picture-in-picture\" title=\"\"><\/iframe><\/noscript><\/div>\n<p>One feature you will come to like in Octoparse is its support for scheduling scraping tasks. If there is data that you want to collect periodically, you can create the task once and have it run on its own.<\/p>\n<p>However, this option is only available to paid users, as scheduled scraping only works in the cloud. Follow the steps below to schedule a scraping task.<\/p>\n<p><strong>Step 1:<\/strong> Make sure you have a paid plan before attempting to schedule a scraping task. Octoparse does have a free plan, but it does not give you access to the cloud platform, which is what supports scheduled scraping.<\/p>\n<p><strong>Step 2:<\/strong> Create a scraping task as described above. Click on the \u201c<strong>Save<\/strong>\u201d button at the top left corner of the page and then click on the \u201c<strong>Run<\/strong>\u201d button. 
Three run options are presented: run local extraction, cloud extraction, and set a schedule.<\/p>\n<p><a href=\"http:\/\/agent.octoparse.com\/ws\/303\"  rel=\"noopener noreferrer\"><picture class=\"aligncenter size-full wp-image-17813 perfmatters-lazy\" loading=\"lazy\"><source type=\"image\/webp\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction.jpg.webp 818w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction-300x186.jpg.webp 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction-768x476.jpg.webp 768w\" srcset=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%20818%20507'%3E%3C\/svg%3E\" data-sizes=\"(max-width: 818px) 100vw, 818px\" \/><img decoding=\"async\" src=\"data:image\/svg+xml,%3Csvg%20xmlns='http:\/\/www.w3.org\/2000\/svg'%20viewBox='0%200%20818%20507'%3E%3C\/svg%3E\" alt=\"Octoparse Schedule Cloud Extraction\" width=\"818\" height=\"507\" data-src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction.jpg\" data-srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction.jpg 818w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction-300x186.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction-768x476.jpg 768w\" data-sizes=\"(max-width: 818px) 100vw, 818px\" loading=\"lazy\" \/>\n<\/picture>\n<noscript><picture class=\"aligncenter size-full wp-image-17813\"><source type=\"image\/webp\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction.jpg.webp 818w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction-300x186.jpg.webp 300w, 
https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction-768x476.jpg.webp 768w\" sizes=\"(max-width: 818px) 100vw, 818px\"\/><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction.jpg\" alt=\"Octoparse Schedule Cloud Extraction\" width=\"818\" height=\"507\" srcset=\"https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction.jpg 818w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction-300x186.jpg 300w, https:\/\/royadata.io\/blog\/wp-content\/uploads\/2023\/10\/Octoparse-Schedule-Cloud-Extraction-768x476.jpg 768w\" sizes=\"(max-width: 818px) 100vw, 818px\"\/>\n<\/picture>\n<\/noscript><\/a><\/p>\n<p><strong>Step 3:<\/strong> Click on the \u201c<strong>Set a Schedule<\/strong>\u201d button and the interface for configuring the scheduled scraping will show in the main area of the page.<\/p>\n<p><strong>Step 4:<\/strong> The first thing to set is the frequency: once, weekly, monthly, or interval. The interval option gives you more control if the frequency you need is not once, weekly, or monthly.<\/p>\n<p><strong>Step 5:<\/strong> Next, choose the days of the week on which you want the task to run.<\/p>\n<p><strong>Step 6:<\/strong> In the \u201c<strong>Run at<\/strong>\u201d section, choose the time at which you want the task to run on the days you chose above.<\/p>\n<p><strong>Step 7:<\/strong> When you are done, save the settings. You have now set up a scraping task that will run periodically. 
You do not even have to keep your computer on, as the scraping is done from the cloud.<\/p>\n<hr\/>\n<h2 id=\"faqs-about-octoparse\" class=\"ftwp-heading\" style=\"text-align: center;\"><span class=\"ez-toc-section\" id=\"FAQs_About_Octoparse\"><\/span><strong>FAQs About Octoparse<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<h3 id=\"q-is-octoparse-free-to-use\" class=\"ftwp-heading\"><span class=\"ez-toc-section\" id=\"Q_Is_Octoparse_Free_to_Use\"><\/span><strong>Q. Is Octoparse Free to Use?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Octoparse is a commercial product, so it is meant to generate revenue for its developers. However, it does support free use. If you visit the homepage of the web scraper, you will see that you can use it for free for 14 days.<\/p>\n<p>A look at the pricing page also reveals a free plan with many limitations. The free plan is still enough for a lot of people, who will not need to opt in for a paid plan, although paid plans unlock features such as cloud extraction and scheduling.<\/p>\n<h3 id=\"q-can-octoparse-scrape-modern-websites\" class=\"ftwp-heading\"><span class=\"ez-toc-section\" id=\"Q_Can_Octoparse_Scrape_Modern_Websites\"><\/span><strong>Q. Can Octoparse Scrape Modern Websites?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Yes. Octoparse is built for the modern web and can scrape data from JavaScript-heavy web pages, including Ajax-loaded pages, pages with infinite scrolling, and many others.<\/p>\n<p>The major requirement for scraping is that the process is repeatable and predictable; once that is met, the data of interest can be reached. 
However, if you are dealing with complex data or websites, the process can be messy at times and you will need to spend some time learning.<\/p>\n<h3 id=\"q-is-proxy-usage-a-must-for-octoparse\" class=\"ftwp-heading\"><span class=\"ez-toc-section\" id=\"Q_Is_Proxy_Usage_a_Must_for_Octoparse\"><\/span><strong>Q. Is Proxy Usage a Must for Octoparse?<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Octoparse supports proxies and IP rotation but does not make them compulsory. That does not mean you can do without proxies in most cases.<\/p>\n<p>You can only avoid proxies if you scrape data from just a few pages. If you need to send many requests, proxies are non-negotiable, as websites will block your IP address if they receive too many requests from it.<\/p>\n<pre style=\"text-align: center;\"><strong>Conclusion<\/strong><\/pre>\n<p>Octoparse is one of the popular options for code-free web scraping, and a good number of businesses rely on it to gather data from public sources online. While it was developed to make web scraping easy, that does not mean it is easy to use at first glance.<\/p>\n<p>You still need guidance and time to master the skills required to use the tool at a professional level. With the guide provided above, you have the basic skills required to use Octoparse for web scraping.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Are you looking forward to using the Octoparse scraping tool for collecting public data on the Internet? Then you are on the right page as we would be providing you a step-by-step guide on how to scrape data using Octoparse. 
In the past, web scraping requires you to directly write code or get someone that &#8230; <a title=\"Octoparse Tutorials 2022: How to Use the Octoparse [Step By Step]\" class=\"read-more\" href=\"http:\/\/royadata.io\/blog\/octoparse-tutorials\/\" aria-label=\"More on Octoparse Tutorials 2022: How to Use the Octoparse [Step By Step]\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":350,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/posts\/6163"}],"collection":[{"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/comments?post=6163"}],"version-history":[{"count":0,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/posts\/6163\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/media\/350"}],"wp:attachment":[{"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/media?parent=6163"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/categories?post=6163"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/royadata.io\/blog\/wp-json\/wp\/v2\/tags?post=6163"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}