Description
Overview
The “Scrape Trustpilot Reviews with DeepSeek, Analyze Sentiment with OpenAI” workflow automates the extraction and sentiment analysis of customer reviews from Trustpilot. It targets businesses and analysts who need structured insights from online reviews: it scrapes review pages, extracts detailed review data, and categorizes sentiment using AI models.
The workflow starts from a manual trigger, uses DeepSeek’s AI-powered information extractor to parse review content, and uses OpenAI’s language model to classify sentiment as positive, neutral, or negative.
Key Benefits
- Automates review scraping and data extraction from Trustpilot pages using a no-code integration pipeline.
- Applies AI-driven information extraction to obtain structured review fields accurately from HTML content.
- Performs sentiment analysis with OpenAI to classify reviews into positive, neutral, or negative categories.
- Stores extracted and enriched data reliably in Google Sheets for subsequent analysis or reporting.
Product Overview
This orchestration pipeline begins with a manual trigger node that sets parameters such as the Trustpilot company identifier and the maximum number of pages to scrape. The workflow sends HTTP GET requests to Trustpilot review pages sorted by recency, extracting review URLs from the HTML response.
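As a rough illustration of the pagination step, the sketch below builds one listing URL per page. It is not the workflow's actual node configuration: the URL pattern (`/review/<company_id>` with `sort` and `page` query parameters) and the helper name `build_page_urls` are assumptions made for the example.

```python
from urllib.parse import urlencode

# Assumed URL pattern for a company's public review listing (not confirmed by the workflow).
BASE_URL = "https://www.trustpilot.com/review/{company_id}"


def build_page_urls(company_id: str, max_pages: int) -> list[str]:
    """Return one listing URL per page, sorted by recency."""
    urls = []
    for page in range(1, max_pages + 1):
        query = urlencode({"sort": "recency", "page": page})
        urls.append(f"{BASE_URL.format(company_id=company_id)}?{query}")
    return urls


if __name__ == "__main__":
    # "example.com" is a placeholder company identifier.
    for url in build_page_urls("example.com", 2):
        print(url)
```

In n8n the same effect is achieved by the HTTP Request node's query parameters; the loop here only makes the role of the maximum-pages setting explicit.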
Each review URL is processed individually, limiting the number of reviews handled at once to three by default. The system checks Google Sheets for existing records to avoid duplicate processing. If a review is new, it fetches the full review page and extracts the review HTML content.
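The limit-then-deduplicate logic described above might look like the following sketch. The function name `select_new_reviews` and the idea of comparing by URL are assumptions for illustration; the workflow itself performs the lookup with a Google Sheets node.

```python
from typing import Iterable


def select_new_reviews(
    candidate_urls: Iterable[str],
    existing_urls: Iterable[str],
    limit: int = 3,
) -> list[str]:
    """Apply the per-run limit first, then skip URLs that are already recorded."""
    seen = set(existing_urls)                 # URLs already present in the sheet
    limited = list(candidate_urls)[:limit]    # default of three reviews per run
    return [url for url in limited if url not in seen]
```

Because the limit is applied before the duplicate check, a run can process fewer than three reviews when some of the first three are already in the sheet.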
DeepSeek’s AI information extractor parses the review HTML into seven attributes: author name, rating, review date, title, text, reviewer country code, and total reviews by the user. OpenAI then performs sentiment analysis on the extracted review text, classifying it as positive, neutral, or negative. The final structured data, including sentiment, is appended or updated in a Google Sheets document.
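One way to picture the extracted record is as a small typed structure. The sketch below is illustrative only: the field names are hypothetical, since the workflow defines its own attribute names inside the Information Extractor node.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class ExtractedReview:
    """Seven extracted attributes plus the sentiment added afterwards."""

    author_name: str
    rating: int                       # expected range: 1-5
    review_date: str                  # kept as text; the date format is not specified
    title: str
    text: str
    country_code: str
    total_reviews: int                # total reviews written by this reviewer
    sentiment: Optional[str] = None   # filled in by the sentiment step

    def __post_init__(self) -> None:
        if not 1 <= self.rating <= 5:
            raise ValueError(f"rating out of range: {self.rating}")


if __name__ == "__main__":
    # All values below are made-up sample data.
    review = ExtractedReview("Jane D.", 4, "2024-01-01", "Good", "Fast delivery.", "US", 3)
    print(asdict(review))
```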
Error handling follows default n8n behavior with no explicit retry or backoff configured. The workflow uses OAuth2 credentials for Google Sheets access, and API keys for DeepSeek and OpenAI integrations. No persistent storage beyond Google Sheets is used.
Features and Outcomes
Core Automation
This no-code integration ingests review URLs from Trustpilot, applies AI-based extraction and sentiment classification, and conditionally processes new reviews. The workflow uses nodes such as HTTP Request, HTML Extract, and LangChain-based AI nodes.
- Single-pass evaluation of each review URL with conditional branching for new data only.
- Automated extraction of structured review metadata from raw HTML content.
- Integrated sentiment categorization into positive, neutral, or negative classes.
Integrations and Intake
The workflow integrates multiple APIs and services to facilitate review scraping and analysis. OAuth2 authentication secures Google Sheets access. Trustpilot pages are accessed via HTTP GET requests with pagination parameters.
- Trustpilot HTTP requests for paginated review scraping sorted by recency.
- DeepSeek AI for structured information extraction from HTML review content.
- OpenAI language model for sentiment analysis of extracted review text.
Outputs and Consumption
Extracted review data and sentiment results are written asynchronously to a Google Sheets document. The data schema includes review ID, URL, author, date, rating, title, text, sentiment, location, and reviewer statistics.
- Google Sheets rows updated or appended with enriched review metadata.
- Sentiment classification included as a discrete field for filtering or analysis.
- Data output structured for compatibility with spreadsheet-based downstream workflows.
Workflow — End-to-End Execution
Step 1: Trigger
The workflow begins with a manual trigger node initiating the process. Upon activation, it sets parameters including the Trustpilot company ID and maximum pages to scrape, establishing the scope of review collection.
Step 2: Processing
HTTP requests retrieve review pages sorted by recency. The HTML Extract node parses these pages to extract review URLs. The Split Out node separates each URL for individual processing. A Limit node restricts processing to three reviews per run. Each review is then checked against Google Sheets to determine whether it is new or already recorded.
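For readers who want to see what the URL-extraction step amounts to outside n8n, here is a minimal sketch using only the Python standard library. The `/reviews/` path prefix and the names `ReviewLinkParser` and `extract_review_links` are assumptions; the actual selector used by the HTML Extract node is not stated here.

```python
from html.parser import HTMLParser


class ReviewLinkParser(HTMLParser):
    """Collect unique hrefs that look like links to individual reviews."""

    def __init__(self, prefix: str = "/reviews/") -> None:
        super().__init__()
        self.prefix = prefix              # assumed path prefix for review pages
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        # Keep first occurrence only, preserving page order.
        if href and href.startswith(self.prefix) and href not in self.links:
            self.links.append(href)


def extract_review_links(html: str) -> list[str]:
    """Return review links found in a listing page, in document order."""
    parser = ReviewLinkParser()
    parser.feed(html)
    return parser.links
```

Each returned link then becomes one item, which is what the Split Out node does before the limit and the duplicate check are applied.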
Step 3: Analysis
For new reviews, the workflow fetches the full review page and extracts the main review content. The DeepSeek AI node extracts structured fields such as author name, rating (1-5), review date, title, text, reviewer country code, and total reviews by the author. Following this, OpenAI’s sentiment analysis categorizes the review text as Positive, Neutral, or Negative.
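If the sentiment label is consumed downstream, it can help to confirm that it is one of the three expected categories. The sketch below is an optional safeguard rather than part of the workflow; `normalize_sentiment` is a hypothetical helper.

```python
ALLOWED_SENTIMENTS = ("Positive", "Neutral", "Negative")


def normalize_sentiment(raw_label: str) -> str:
    """Trim and case-normalize a label, rejecting anything outside the three categories."""
    cleaned = raw_label.strip().rstrip(".").capitalize()
    if cleaned not in ALLOWED_SENTIMENTS:
        raise ValueError(f"Unexpected sentiment label: {raw_label!r}")
    return cleaned
```

Rejecting unknown labels early keeps the sentiment column in the sheet limited to the three values used for filtering.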
Step 4: Delivery
Extracted and analyzed data are appended or updated in a Google Sheets document. The output fields include review metadata alongside the sentiment category. The workflow completes without persistence beyond Google Sheets and does not include explicit error retries.
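The append-or-update behavior can be sketched as an upsert keyed on a unique column. This is an in-memory illustration under assumptions: the key name `review_id` is hypothetical, and the real workflow delegates the operation to the Google Sheets node.

```python
def upsert_row(rows: list[dict], new_row: dict, key: str = "review_id") -> str:
    """Update the row with a matching key, or append the row if none matches."""
    for index, row in enumerate(rows):
        if row.get(key) == new_row[key]:
            rows[index] = {**row, **new_row}   # new values overwrite old ones
            return "updated"
    rows.append(dict(new_row))
    return "appended"
```

Calling it twice with the same key leaves a single row holding the latest values, which is why re-running the workflow does not create duplicate entries.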
Use Cases
Scenario 1
A company wants to monitor customer sentiment trends on Trustpilot but lacks resources for manual data collection. This automation workflow scrapes recent reviews, extracts detailed data, and classifies sentiment, enabling structured analysis without manual intervention.
Scenario 2
Data analysts require a reliable pipeline to aggregate and enrich Trustpilot reviews with sentiment labels. The workflow supports on-demand analysis by programmatically fetching reviews, extracting key fields, and performing sentiment classification for downstream reporting.
Scenario 3
Marketing teams need to identify negative reviews for rapid response. This orchestration pipeline automates review scraping and sentiment tagging, delivering categorized data to Google Sheets for easy filtering and prioritization of negative feedback.
How to use
To operate this workflow, import it into n8n and configure the OAuth2 credentials for Google Sheets, plus API keys for DeepSeek and OpenAI. Replace the placeholder company ID with the target company's Trustpilot slug and set the desired maximum number of review pages. Trigger the workflow manually to start scraping, extraction, sentiment analysis, and the sheet update. Results appear in the configured Google Sheets document, structured for easy review and further processing.
Comparison — Manual Process vs. Automation Workflow
| Attribute | Manual/Alternative | This Workflow |
|---|---|---|
| Steps required | Multiple manual steps: browsing, copying, pasting, sentiment tagging | Single automated pipeline with conditional processing and AI extraction |
| Consistency | Variable, prone to human error and incomplete data capture | Same extraction schema and sentiment categories applied to every review by AI models |
| Scalability | Limited by manual labor capacity and time constraints | Scales with API limits and configured page scrape maximums |
| Maintenance | High, requires frequent manual updates and validation | Medium, requires credential updates and occasional parameter tuning |
Technical Specifications
| Specification | Details |
|---|---|
| Environment | n8n automation platform |
| Tools / APIs | Trustpilot HTTP endpoints, DeepSeek AI, OpenAI, Google Sheets API |
| Execution Model | On-demand via manual trigger, asynchronous API calls |
| Input Formats | HTTP HTML pages from Trustpilot, JSON for parameters |
| Output Formats | Structured rows in Google Sheets with JSON-based sentiment data |
| Data Handling | Transient processing, no local persistence; updates Google Sheets |
| Known Constraints | Dependent on Trustpilot page structure and external API availability |
| Credentials | OAuth2 for Google Sheets; API keys for DeepSeek and OpenAI |
Implementation Requirements
- Valid OAuth2 credentials configured for Google Sheets API access.
- API keys for DeepSeek and OpenAI configured and linked in n8n credentials.
- Correct Trustpilot company identifier provided and max page limit set.
Configuration & Validation
- Set the company_id parameter to the target Trustpilot company slug.
- Verify OAuth2 and API key credentials are authorized and active in n8n.
- Run the workflow manually and confirm new review entries appear in Google Sheets with expected fields and sentiment labels.
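A quick way to carry out the last check is to export a few rows and test them for missing fields. The sketch below assumes hypothetical column names based on the schema described earlier; adjust them to match the headers in your sheet.

```python
# Hypothetical column names; the actual sheet headers may differ.
EXPECTED_COLUMNS = (
    "review_id", "url", "author", "date", "rating",
    "title", "text", "sentiment", "location", "total_reviews",
)
ALLOWED_SENTIMENTS = ("Positive", "Neutral", "Negative")


def find_row_problems(row: dict) -> list[str]:
    """Return human-readable problems for one exported sheet row."""
    problems = [
        f"missing {column}"
        for column in EXPECTED_COLUMNS
        if row.get(column) in (None, "")
    ]
    sentiment = row.get("sentiment")
    if sentiment not in (None, "") and sentiment not in ALLOWED_SENTIMENTS:
        problems.append(f"unexpected sentiment: {sentiment}")
    return problems
```

An empty list means the row has every expected field and a recognized sentiment label.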
Data Provenance
- Triggered manually via the “When clicking ‘Test workflow’” node.
- Uses “Get reviews” HTTP Request node to scrape Trustpilot review pages.
- Processes reviews through “Information Extractor” (DeepSeek) and “Sentiment Analysis” (OpenAI) nodes.
FAQ
How is the Scrape Trustpilot Reviews automation workflow triggered?
The workflow is initiated manually via a trigger node designed for on-demand execution. Parameters such as company ID and maximum pages are set before scraping.
Which tools or models does the orchestration pipeline use?
It integrates DeepSeek’s AI for extracting structured review information and OpenAI’s language model for classifying review sentiment.
What does the response look like for client consumption?
Data is output as structured rows in Google Sheets, including review metadata and sentiment labels for easy filtering and analysis.
Is any data persisted by the workflow?
Data persistence occurs only in the Google Sheets document; the workflow processes data transiently without local storage.
How are errors handled in this integration flow?
No explicit error retry or backoff mechanisms are configured; the workflow relies on default platform error handling.
Conclusion
This workflow automates the extraction and sentiment analysis of Trustpilot reviews, delivering structured, AI-enriched data to Google Sheets for analysis. A check against existing rows prevents duplicate entries. The approach depends on the availability and stability of external services, including Trustpilot, DeepSeek, and OpenAI. It supports ongoing review monitoring and sentiment insights without manual data handling or specialized programming.







