Web scraping, or automatic data extraction, can be an incredibly valuable tool for individuals and businesses alike. While web scraping can be done manually, it quickly turns into a tedious task. To speed up the process, users can turn to a web scraping tool instead, such as the one offered by Octoparse. The company recently launched a new version (8.4) of its software, which brings a number of improvements. In this article, we take a closer look at what Octoparse 8.4 brings to the table.
Note: this is a sponsored article and was made possible by Octoparse. The actual contents and opinions are the sole views of the author who maintains editorial independence even when a post is sponsored.
Getting to Know Octoparse 8.4
Octoparse is a simple-to-use web scraping tool that is rich in features. It comes with a series of convenient templates that allow users to start web scraping immediately without much effort. Since Octoparse doesn’t require any coding knowledge, anyone can go ahead and use the data-mining software.
There is, however, a considerable learning curve to consider if you want to use this program to the fullest of its abilities. Fortunately, Octoparse puts a wide library of tutorials at your disposal so that you can quickly learn how to perform various tasks.
Octoparse 8.4 is available for Windows (7, 8, 10) and macOS (10.10 and above) users on the official website. If you are on Windows XP or a 32-bit system, you will have to download the older Octoparse 7.3.0 version.
What Can You Do with Octoparse 8.4?
With Octoparse, you can extract all kinds of data, including product data from major e-commerce websites such as Amazon, eBay, Target, Walmart and more. In addition, the tool can target major social media websites, such as Facebook, Twitter, Instagram, YouTube, etc., to grab posts, comments, images and more.
You’ll find a series of templates targeting these very websites as you open Octoparse 8.4. For instance, the Facebook template is designed to scrape comments for each post from a Facebook account page. To give it a go, all you have to do is hit the blue “Try it” button.
Moreover, Octoparse can help you track hotel prices, ratings and reviews on websites such as Booking or TripAdvisor, or create a specific database by scraping info from websites such as Yellow Pages, Yelp, Crunchbase and more.
Once the web scraping process is complete, Octoparse users can export the results into various formats, including Excel, HTML, TXT and CSV, or to databases such as MySQL, SQL Server and Oracle.
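To give a sense of what a CSV export of scraped rows looks like, here is a minimal Python sketch. It is not how Octoparse produces its exports; the function name `rows_to_csv` and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io


def rows_to_csv(rows, fieldnames):
    """Serialize a list of dict rows to CSV text (hypothetical helper)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample rows standing in for scraped results.
scraped_rows = [
    {"title": "Sample Hotel", "rating": "4.5", "price": "120"},
    {"title": "Another Hotel, Downtown", "rating": "4.1", "price": "95"},
]

csv_text = rows_to_csv(scraped_rows, ["title", "rating", "price"])
print(csv_text)
```

Note that the second title contains a comma, so the CSV writer wraps it in quotes to keep the columns aligned when the file is opened in a spreadsheet.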
Working with Advanced Mode
Templates aside, Octoparse allows you to scrape data off any website, and it’s quite straightforward to set up an operation. The new version has a new layout that switches the workflow from the left to the right. There’s also an advanced settings area sitting in the corner, making it easier for users to define the actions they want.
Overall, the interface is roomier and feels like you have plenty of space to breathe. Even so, we recommend using a larger monitor when working in Octoparse. Despite the update, the experience still feels a bit cramped on a standard laptop.
In Advanced mode, you’ll need to paste a relevant URL into the application.
Next, the program will automatically load the page and extract what it considers to be relevant information. The results show up in the lower part of the display. You can remove the fields you’re not interested in, just by clicking the three dots, then selecting the option to “Delete.”
The latest version uses WebView technology for its built-in browser, which helps keep pages from freezing. Our testing didn’t turn up any annoying page-freezing issues.
Keep Your Eyes on the Tips
If you follow the instructions above, Octoparse will extract data only from the current page. If you want the program to extract data from all the pages, you’ll need to create a pagination loop. The first step towards doing so is to create a workflow. Click the button to begin.
The suggestion box will now bring up a number of options. Select “Click on a Load More button,” then scroll down to the bottom of the page until you find the “Next page” button or something similar. Click on it and hit the “Confirm” button.
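For readers curious about what a pagination loop does conceptually, the Python sketch below walks through a small in-memory set of pages, collecting items and following the "next page" pointer until there is none left. It is only an illustration of the idea, not Octoparse’s internals; `PAGES`, `fetch_page` and `scrape_all_pages` are hypothetical names, and a real scraper would load actual web pages instead of reading from a dictionary.

```python
# Hypothetical in-memory "site": each page lists its items and the next page.
PAGES = {
    "page-1": {"items": ["item A", "item B"], "next": "page-2"},
    "page-2": {"items": ["item C"], "next": "page-3"},
    "page-3": {"items": ["item D", "item E"], "next": None},
}


def fetch_page(page_id):
    """Stand-in for loading a page; a real scraper would request a URL here."""
    return PAGES[page_id]


def scrape_all_pages(start_page, max_pages=100):
    """Collect items from every page by following 'next' until it runs out."""
    collected = []
    visited = set()
    page_id = start_page
    # Stop when there is no next page, when a page repeats, or at the page cap.
    while page_id is not None and page_id not in visited and len(visited) < max_pages:
        visited.add(page_id)
        page = fetch_page(page_id)
        collected.extend(page["items"])
        page_id = page["next"]
    return collected


all_items = scrape_all_pages("page-1")
print(all_items)
```

The `visited` set and the `max_pages` cap are there so the loop cannot run forever if a site’s "next" link ever points back to a page that was already scraped.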
If you need more data than what Octoparse originally picked up, you can create a second element that will select every item in the list and grab the data you want.
To begin, go to an item on the list and click it, then select the “Click URL” option from the Tips menu.
The dedicated page of the item will now load. Click the relevant fields, and they will show below. You can edit them if you would like.
Run the Task
When you’re finally satisfied with the outline of the task you’ve created, it’s time to run it on your device (Local) or schedule it. It’s also possible to run it in the Cloud, but that option is only available to those on a paid plan.
The process of scraping everything doesn’t take too long, and when it’s done, you can immediately click the “Export Data” button and choose your preferred format from there.
Octoparse is quite complex, and you can achieve more with it than just setting up simple tasks. For example, you can refine the data you’ve extracted. With the RegEx Tool in the toolbox, you can clean the data, for instance by replacing text.
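To illustrate the kind of cleanup a regular expression can do, here is a short Python sketch using the standard `re` module. It does not reproduce Octoparse’s RegEx Tool; the helper names `clean_price` and `normalize_whitespace` and the sample strings are hypothetical.

```python
import re


def clean_price(raw):
    """Drop everything except digits and the decimal point, then convert."""
    digits = re.sub(r"[^\d.]", "", raw)
    return float(digits)


def normalize_whitespace(text):
    """Collapse runs of spaces, tabs and newlines into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


# Made-up examples of messy scraped values.
price = clean_price("$1,299.00 ")
review = normalize_whitespace("  Great \n product,\t would buy again ")
print(price)
print(review)
```

The first pattern replaces every character that is not a digit or a dot with nothing, which is the same "replace text" idea described above applied to a price field.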
We should also note that with version 8.4, Octoparse has joined forces with Zapier, and this integration means that users can now use the web scraping service in combination with thousands of apps, such as Google Drive, Google Sheets, Slack and others.
To start integrating workflows, you’ll need to access Zapier on your device. Then click on the “Create Zap” button on the right side of the display. We wanted to set up a Zap that could replace Google Drive files with new documents processed in Octoparse.
To set up a trigger, you’ll need to use the search bar to find and select Octoparse. Connect with your Octoparse account and start setting up the trigger. Choose the target Octoparse task, which you can search by ID, then set your ideal Task status. Finding the task ID is a bit tricky when you’re doing it for the first time. Fortunately, the documentation has you covered, so you can quickly figure it out. (Tip: you need to run the task in the cloud.)
Next up, you’ll need to select the action app, which in this example is Google Docs.
In this section you will have to define several parameters. The most important one is the Action event, so make sure you choose a suitable option. After that, you’ll have to specify more details regarding the action in the “Set up action” fields.
The process proved quite seamless the next time around when we tried creating a new Zap. It just takes a little bit of getting used to. It might also require you to do a bit of reading. Fortunately, both Zapier and Octoparse offer their own library of tutorials, so you won’t be forced to invest a large amount of time into research.
Get Octoparse Now
You can try Octoparse for free, which is perfect for those who are looking to undertake a few simple projects. Sign up with an account to get started. However, to get access to the full set of features you’ll need to upgrade to one of the three paid plans:
- Standard Plan: $75/month
- Professional Plan: $209/month
- Enterprise Plan: customized features available on demand
While there are many things you can do in the free version, the paid versions bring advanced options. These include access to a larger number of crawlers, scheduled extractions, concurrent cloud extractions, auto IP rotation, API access, email support and more.
If you’re curious about Octoparse, you can get the free tier first and see how well it caters to your needs. The latest version is available for download on the official website right now.