How to get started


Before getting started you need:
– a ScrapeAll account (demo or paid subscription)
– the Chrome browser
– the ScrapeAll extension installed

Open your Chrome browser and install the ScrapeAll extension from the Chrome Web Store.
After installation, open the extension popup menu and log in with your ScrapeAll account credentials.

Once the extension is authenticated, a small button will appear on every website you visit.
To create a new scraping project for a target website, click the “Scrape” button, fill in the form fields and create the project:

That’s all. Your scraping popup should now look like this:

Datasources are pages that contain links to target pages (where the useful data lives). There are three types of datasources:

1. Auto Discovery Datasource: Useful for navigating through all links found on a page and discovering data automatically when no structured list is available, for example recovering related articles directly from an article page.
2. Listing Datasource: Used to scrape a list of target pages, such as a paginated store category or blog category listing. The advantage of this datasource is automatic pagination exploration.
3. Single Page Datasource: Used to recover data from a single page when no navigation is required, for example a currency web service that updates its content every 6 hours.

Scraping for every datasource can be automated and executed at custom time intervals.
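
ScrapeAll itself requires no code, but purely as an illustration, the three datasource types and their scheduling can be pictured with a sketch like the one below. The class, field names and URLs are assumptions made for this example, not part of the product:

```python
from dataclasses import dataclass

# Purely illustrative model of a datasource; ScrapeAll is configured through
# its UI and does not expose any structure like this.
@dataclass
class Datasource:
    kind: str              # "auto_discovery", "listing" or "single_page"
    start_url: str         # page where the target links (or the data) live
    interval_minutes: int  # how often the automated scraping run executes

datasources = [
    Datasource("listing", "https://example-store.com/category/shoes", 60),
    Datasource("single_page", "https://example-rates.com/currency", 360),
]
```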

Now we will create a listing datasource in order to collect new products from multiple pages:


There are a few things to note here:

– after a datasource is created, its type cannot be changed without deleting the datasource
– if you have multiple types of data, each data type must have its own datasource
– scraped data results are owned by the datasource, not by the project

Target Links and Pagination Mapping

To map products from a page, click the ‘+ add mapping‘ button under the TARGET LINKS section. Then hover over a product title and you should see the matching element highlighted. When the correct element is identified, click it and the mapping will be added to the list (you can have multiple mappings).
Pagination also needs to be mapped if there are multiple pages of products to scrape. Use the same procedure as above to map the pagination element.
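
Purely as a conceptual illustration (in ScrapeAll you create mappings by clicking elements, and the selectors below are assumptions, not the extension’s internals), a target-link mapping and a pagination mapping identify page elements much like CSS selectors would:

```python
# Illustrative only: in ScrapeAll, mappings are created by clicking elements
# in the page; conceptually each one identifies elements like a CSS selector.
target_link_mappings = [
    {"name": "product title link", "selector": "ul.product-grid li h2 a"},
]
pagination_mapping = {"name": "next page link", "selector": "nav.pagination a.next"}
```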

A correct mapping will look like this:


Click “Next” and follow the instructions in the next section.

The data model is the last required mapping step. It identifies which text sections will be recovered on each scraping execution.
Each mapped text position on the page becomes a field that will be available after the execution.
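
Purely as an example of the idea (the field names and values below are made up for illustration), a single record produced from a data model with a few mapped fields might come back shaped roughly like this:

```python
# Hypothetical example of one scraped record; the real field names are
# whatever you define while mapping the data model in the extension.
scraped_record = {
    "title": "Trail Running Shoe X200",
    "price": "79.99 USD",
    "description": "Lightweight trail shoe with a reinforced sole.",
    "source_url": "https://example-store.com/products/x200",
}
```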

To start data model mapping, click the “Next” button and follow the instructions:

Then map important page content:


Click the “Create data model” button and inspect the mapped data:

That’s all!
Now we can configure scraping automation by following the next steps.

  • Go to the SCRAPING CONSOLE and log in with the credentials from the first steps.
     Click the Projects button in the left menu and all created projects will appear. For this tutorial, we have the “Scrapa” project created in the earlier steps:


Click the “Details” button to inspect this project’s datasource and scraping profile:

As we can see, the datasource execution profile is not yet configured.
An execution profile defines the automated scraping process: how many pages will be scraped, how many pagination levels are parsed, the number of credits available for this execution profile, and the scheduling options (it can be configured to run every n minutes, hours or days).
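
As a rough illustration only (the real values are set in the console’s form, and the key names below are assumptions), an execution profile for a listing datasource conceptually captures settings like these:

```python
# Hypothetical, conceptual view of an execution profile; configure the real
# values through the ScrapeAll console, not through code.
execution_profile = {
    "max_pages": 50,                # pages scraped per run
    "max_pagination_depth": 5,      # pagination pages followed per run
    "credits": 500,                 # credits reserved for this profile
    "schedule": {"every": 6, "unit": "hours"},  # run every 6 hours
}
```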

The execution profile configuration looks like this:

** Each type of datasource (listing / auto discovery / single page) has a different set of options available in the execution profile configuration.

After datasource creation and configuration we can start our automated scraping process.

In the screens below, for any available datasource you can launch scraping using the ‘Start’ button and then view the scraped data using the ‘Data’ button:

Now the data can be exported as JSON or CSV.
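
Once exported, the file can be consumed by any tool that reads JSON or CSV. For example, assuming a hypothetical export named products.json containing a JSON array of records with title and price fields (the file name and field names are assumptions), it could be loaded in Python like this:

```python
import json

# Load a hypothetical ScrapeAll JSON export; adjust the file name and the
# field names to match your own data model.
with open("products.json", encoding="utf-8") as f:
    records = json.load(f)

for record in records:
    print(record.get("title"), "-", record.get("price"))
```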


We offer a demo plan with 3000 credits to test our solution. After that, a premium plan can be purchased from this page.

For small projects we recommend our small plan. A medium plan is also available for medium-sized projects, and if you have a custom use case you can contact us to discuss a custom plan.

Our implementation requires no coding skills; data can be mapped very easily using our Chrome browser extension.
We can also help with custom use cases in order to better fit your business requirements.

There are new features on the way! Do you have a suggestion?

Please contact us to discuss more.