Scraping & crawling TP: Home – Support Portal content?

Hi,

I’m looking into scraping, and possibly crawling, content from an authenticated web page and all its sub-pages, to access all the information inside it.

I just need to confirm whether this is allowed, and how authentication should be handled programmatically (e.g. API key, session token, etc.).

Anyone tried this?

Thanks!

Hi @ASOS08,

The built-in scrapers currently do not support authentication. However, you have a couple of options:

  1. If the platform offers APIs, you can use an HTTP Request block or create a Custom Function to make API calls with your queries
  2. Alternatively, you can create a Custom Function to call third-party scrapers that support authentication
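
As a rough illustration of option 1, here is a minimal Python sketch of an API call authenticated with a token. The base URL, path, and bearer-token scheme are assumptions for the example, not details of any specific platform, so check the service's own API docs for the real values. The same request shape (URL, `Authorization` header) is what you would enter in an HTTP Request block or reproduce in a Custom Function.

```python
import json
import urllib.parse
import urllib.request


def build_api_request(base_url, path, token, params=None):
    """Build (but do not send) a GET request that authenticates with a bearer token."""
    query = "?" + urllib.parse.urlencode(params) if params else ""
    url = base_url.rstrip("/") + "/" + path.lstrip("/") + query
    return urllib.request.Request(
        url,
        headers={
            # Assumed scheme: many APIs use "Bearer <token>", but some expect
            # a custom header or a query parameter instead.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
        method="GET",
    )


def fetch_json(request, timeout=30):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `build_api_request("https://support.example.com/api", "articles", "YOUR_TOKEN", {"page": 1})` prepares the call, and `fetch_json(...)` sends it. Both the host and the `articles` path are placeholders.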

Hope this helps!

Hi Alex, can you show me some examples of third-party scrapers that I can use?

Hi @Alex_MindStudio, I’ve got this problem as well. In this “hot summer”, with all the massive improvements in MindStudio, do you think there is a solution for this now?

Thanks! :wink:

Hi @Fjmtrigo,

The best approach is to use the APIs of the service you’re trying to fetch data from. If no APIs are available, you can try third-party scrapers that integrate through API, Zapier, or Make.com. Each has its pros and cons, so make sure they support the platform you need to scrape.

From there, you can use blocks like HTTP Request or Custom Function to call the scraper from within your agent.
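
To give an idea of what calling a scraper through its API might look like, here is a hedged Python sketch. The endpoint `https://api.scraper.example/v1/scrape` and the `url` / `headers` payload fields are hypothetical and do not belong to any particular service. Each scraper defines its own endpoint, payload, and way of passing login details, so treat this only as the general shape of the request.

```python
import json
import urllib.request

# Hypothetical endpoint, for illustration only; use your scraper's real URL.
SCRAPER_ENDPOINT = "https://api.scraper.example/v1/scrape"


def build_scrape_request(target_url, scraper_api_key, session_cookie=None):
    """Build (but do not send) a POST request asking a scraper to fetch target_url."""
    payload = {"url": target_url}
    if session_cookie:
        # Assumed field: some services let you forward headers (such as a
        # session cookie) to the target site; the field name will vary.
        payload["headers"] = {"Cookie": session_cookie}
    return urllib.request.Request(
        SCRAPER_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {scraper_api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Note that there are two separate credentials here: the key that authenticates you to the scraper service, and the cookie or token that authenticates the scraper to the site being scraped.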

Hi @Alex_MindStudio, do you think that Firecrawl or Apify might do the job?

Thanks

Hi @Fjmtrigo,

They could, but I’d suggest testing them first to make sure they can scrape the platform you need. After that, you can set up a Custom Function to trigger them through the API as part of your Agent’s Automations.
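
One simple thing to check while testing is whether the scraper actually got past the login, or just returned the login page. Here is a small Python sketch of that check. It is only a heuristic, based on the assumption that the login page contains a password field and the real content does not, so adjust it to the site you are scraping.

```python
import re


def looks_like_login_page(html):
    """Heuristic: True if the HTML contains a password input field."""
    pattern = r"<input[^>]*type=[\"']?password"
    return re.search(pattern, html, re.IGNORECASE) is not None
```

If this returns `True` for the scraper's output, the authentication most likely did not work and the content should not be passed on to the rest of the workflow.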

Hi Fernando,

Did you figure out your scraper? Did you use Firecrawl or Apify? I heard Apify costs more…

Hi @regina.nickles, not yet, still exploring it… Firecrawl scrapes and crawls nicely, but it seems that they had to put a pause on the authentication feature (which is why it’s no longer advertised on their site).

I’ll try Apify, or some other tool that I haven’t figured out yet.

Have you tried something else?

Hi Fernando! I’m looking at Apify too. They’ve got a ton of web scrapers. I guess I’ll just go with the ones that have the best recommendations.
Then I’ll try the HTTP Request block in MindStudio and see what happens… :crossed_fingers: Fingers crossed. I will ping you when I get it working.
